Which GitHub Metrics Predict Startup Fundraising?
Four GitHub-observable patterns precede fundraise announcements by 3-6 weeks: commit-velocity surge, contributor growth, infrastructure buildout, and repo-creation bursts. Validated against 219 confirmed fundraises in a public SSRN preprint.
Across the 219-startup validation panel published in the GitDealFlow SSRN preprint, four public GitHub patterns showed reproducible lead times of 3-6 weeks before announced fundraises.
Signal 1 — Commit-velocity surge. Commits per day in the org's most-active public repository rising 50% or more over a 14-day rolling window relative to the preceding 14-day window. This is the single strongest individual signal in the panel and the cleanest to compute; most other signals correlate with it.
Signal 2 — Contributor growth. Unique-contributor count in the same repo rising 30% or more over the same 14-day window. This typically reflects fresh engineering hires being onboarded: code-review patterns and first-contribution timing both point to team expansion rather than burst-mode activity from the existing team. Strongly correlated with the commit-velocity surge, but it adds incremental signal when commit volume is already high.
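Both window-ratio signals reduce to the same computation: compare a trailing 14-day count against the preceding 14 days. A minimal sketch, assuming daily event timestamps are already collected (the function name and the zero-baseline rule are ours, not from the preprint):

```python
from datetime import date, timedelta

def window_surge(event_dates, as_of, threshold, window_days=14):
    """Compare the trailing window's event count to the preceding window's.
    Returns True when recent / prior >= threshold."""
    recent = sum(1 for d in event_dates
                 if as_of - timedelta(days=window_days) < d <= as_of)
    prior = sum(1 for d in event_dates
                if as_of - timedelta(days=2 * window_days) < d
                <= as_of - timedelta(days=window_days))
    if prior == 0:
        # Assumption: any activity from a fully quiet baseline counts as a surge.
        return recent > 0
    return recent / prior >= threshold

today = date(2025, 6, 1)
# 18 commits in the last 14 days vs. 10 in the 14 days before: a 1.8x surge.
commit_dates = [today - timedelta(days=i % 14) for i in range(18)] \
             + [today - timedelta(days=14 + i % 14) for i in range(10)]
print(window_surge(commit_dates, today, threshold=1.5))  # → True
```

For Signal 2, pass one date per unique contributor's first activity in each window instead of raw commits, with `threshold=1.3` for the 30% rise.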
Signal 3 — Infrastructure-shape commits. Commits introducing Dockerfiles, Kubernetes manifests, Terraform, CI configuration, observability tooling (Prometheus, OpenTelemetry, Datadog wiring), or feature-flag scaffolding, appearing at unusually high volume. This is the "preparing to scale beyond prototype" signal: teams tend to harden infrastructure shortly before a round closes, in anticipation of a public launch.
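Detecting infrastructure-shape commits comes down to matching changed file paths against a pattern list. The patterns below are illustrative guesses at what such a list might contain; the open-source classifier repo defines the actual one:

```python
import re

# Hypothetical patterns for "infrastructure-shape" files (our guesses,
# not the classifier's real list).
INFRA_PATTERNS = [
    r"(^|/)Dockerfile$",
    r"\.tf$",                                    # Terraform
    r"(^|/)(k8s|kubernetes|helm)/",              # Kubernetes manifests / charts
    r"(^|/)\.github/workflows/",                 # CI configuration
    r"(prometheus|opentelemetry|otel|datadog)",  # observability wiring
]

def infra_commit_fraction(commit_file_paths):
    """Fraction of commits that touch at least one infrastructure-shape file.
    `commit_file_paths` is a list of commits, each a list of changed paths."""
    def is_infra(paths):
        return any(re.search(p, f, re.IGNORECASE)
                   for f in paths for p in INFRA_PATTERNS)
    if not commit_file_paths:
        return 0.0
    return sum(is_infra(paths) for paths in commit_file_paths) / len(commit_file_paths)

commits = [
    ["src/app.py"],
    ["Dockerfile", "infra/main.tf"],
    [".github/workflows/ci.yml"],
    ["README.md"],
]
print(infra_commit_fraction(commits))  # → 0.5
```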
Signal 4 — Repository-creation bursts. A single org spinning up 3+ new public repositories in a 30-day window. Often the precursor to a public product launch tied to the fundraise announcement (separate repos for marketing site, docs, SDK, examples, etc.).
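The burst check is a rolling-window count over repo creation dates. A minimal sketch, assuming the creation dates have already been fetched from the org's public profile:

```python
from datetime import date, timedelta

def repo_creation_burst(created, min_repos=3, window_days=30):
    """True if any rolling `window_days` window contains `min_repos`
    or more newly created public repositories."""
    dates = sorted(created)
    for i, start in enumerate(dates):
        end = start + timedelta(days=window_days)
        if sum(1 for d in dates[i:] if d <= end) >= min_repos:
            return True
    return False

# Launch-shaped burst: marketing site, docs, and SDK repos within three weeks.
created = [date(2025, 5, 1), date(2025, 5, 10), date(2025, 5, 20)]
print(repo_creation_burst(created))  # → True
```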
Combining the signals. Each signal in isolation is noisy — false positives are common because OSS rhythms naturally include hackathons, conference deadlines, and quarterly planning bursts. The strongest predictive lift in the panel comes from requiring 2+ signals to fire concurrently within the same 14-day window. The full classifier is open-source at github.com/kindrat86/gitdealflow-signal-classifier; the dataset for replication is on Zenodo at doi.org/10.5281/zenodo.19650920 under CC BY 4.0.
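The 2-of-4 concurrency rule is just a count over per-window booleans. A minimal sketch (the signal names are ours; the real feature definitions live in the classifier repo):

```python
def composite_flag(signals, min_concurrent=2):
    """Flag an org for a 14-day window only when at least `min_concurrent`
    of its per-window signals fire together."""
    return sum(signals.values()) >= min_concurrent

week = {"velocity_surge": True, "contributor_growth": True,
        "infra_commits": False, "repo_burst": False}
print(composite_flag(week))  # → True
```

Requiring concurrency is what filters out hackathon and conference-deadline bursts, which typically light up only one signal at a time.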
Frequently asked questions
Are these signals reliable for non-technical startups?
No. The methodology only applies to startups with public GitHub activity. Consumer brands, services businesses, and most healthcare/biotech do not show up in the signal set.
What is the false positive rate?
On the 219-startup panel, precision at the top decile is roughly 65%: of the top 10% of orgs flagged in any given week, ~65% had a fundraise announcement within 12 weeks. The remaining ~35% either did not raise at all or raised outside the 12-week observation window.
Can private repositories spoil the signal?
Yes, partially. A startup that does most of its work in private repos will be under-represented in commit-velocity and contributor signals. The methodology accounts for this by weighting the public-repo signal against the org's total public footprint, but it cannot recover signal from genuinely private development.
How is this different from just watching GitHub stars?
Stars measure attention, not engineering investment. A repo can spike to 10K stars from a single Hacker News post without any underlying team expansion or shipping acceleration. Commit-velocity and contributor signals measure sustained engineering investment, which is what actually predicts a fundraise.