How We Measure Startup Engineering Acceleration
VC Deal Flow Signal uses publicly available GitHub data to identify startups showing unusual engineering momentum. This page explains exactly how we source, process, and rank that data, so investors can evaluate the signal quality before acting on it.
Data Sources
The GitHub REST API (v3) is our primary data source. We query the search/repositories endpoint to discover active startup organizations across 20 sector-specific topic clusters (e.g., machine-learning, fintech, cybersecurity). For each organization, we then pull data for its most active public repository from the stats/commit_activity and contributors endpoints.
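As a rough illustration of the three endpoints named above, the sketch below only builds request URLs and makes no network calls. The `topic:` search qualifier, the `sort` and `per_page` values, and the helper names are assumptions made for this example, not a description of the production pipeline.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.github.com"


def search_url(topic: str, per_page: int = 100) -> str:
    """Build a search/repositories URL for one topic cluster (assumed query shape)."""
    query = urlencode({"q": f"topic:{topic}", "sort": "updated", "per_page": per_page})
    return f"{API_ROOT}/search/repositories?{query}"


def commit_activity_url(owner: str, repo: str) -> str:
    """Weekly commit totals for the past year of a single repository."""
    return f"{API_ROOT}/repos/{owner}/{repo}/stats/commit_activity"


def contributors_url(owner: str, repo: str) -> str:
    """Contributor list for a single repository."""
    return f"{API_ROOT}/repos/{owner}/{repo}/contributors"
```

Both per-repository endpoints take an owner and a repository name, which is why the metrics below are computed on one repository per organization.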
Filtering: We exclude large tech companies (Google, Microsoft, Meta, etc.), major open-source foundations, and organizations with patterns inconsistent with venture-backed startups. The goal is to surface companies in the pre-seed through Series B range.
Geography is derived from the GitHub organization profile location field, mapped to broad regions (US, UK, EU, APAC, Canada, LATAM, MENA).
Core Metrics
Commit Velocity (14-day)
The total number of commits to an organization's most active public repository over a rolling 14-day window. GitHub's commit_activity endpoint returns weekly commit totals covering 52 weeks of history; we sum two consecutive weekly totals to produce the 14-day figure.
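A minimal sketch of the 14-day sum, assuming the input is the list returned by stats/commit_activity, where each entry carries a `week` Unix timestamp and a `total` commit count. Choosing the two most recent weeks is an assumption of this example, and the function name is hypothetical.

```python
def commit_velocity_14d(weeks: list) -> int:
    """Sum the 'total' field of the two most recent weekly buckets."""
    if len(weeks) < 2:
        raise ValueError("need at least two weeks of history")
    # Sort by week start so the result does not depend on input order.
    ordered = sorted(weeks, key=lambda w: w["week"])
    return sum(w["total"] for w in ordered[-2:])
```

For example, weekly totals of 5, 10, and 30 (oldest first) give a 14-day velocity of 40.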
Commit Velocity Change
The percentage change in commit velocity relative to the preceding 14-day window, computed as (current - previous) / previous × 100. A startup with 40 commits this period and 20 commits in the previous period shows a +100% velocity change. This is the primary ranking signal because it measures acceleration rather than absolute volume.
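The calculation can be sketched as follows. Returning `None` when the previous window has zero commits is an assumption made here to avoid a division by zero; the source does not say how that case is handled.

```python
from typing import Optional


def velocity_change_pct(current: int, previous: int) -> Optional[float]:
    """Percentage change between two consecutive 14-day commit counts."""
    if previous == 0:
        # Assumption: an empty baseline yields no meaningful percentage.
        return None
    return (current - previous) / previous * 100.0
```

With the figures from the text, `velocity_change_pct(40, 20)` evaluates to `100.0`.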
Contributor Count & Growth
The number of unique contributors to the organization's most active repository. Growth is estimated indirectly, by comparing commit volume over the most recent 6 weeks with the preceding 6-week period, so it is a proxy rather than a direct count of new contributors. A rising contributor count often signals team expansion, which is a leading indicator of funding or product-market fit.
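A sketch of the 6-week comparison, assuming `weekly_totals` is a list of weekly commit counts ordered oldest to newest. The function name and the `None` result for an empty baseline are assumptions of this example.

```python
from typing import Optional


def contributor_growth_pct(weekly_totals: list) -> Optional[float]:
    """Compare the last 6 weeks of commit volume with the 6 weeks before them."""
    if len(weekly_totals) < 12:
        raise ValueError("need at least 12 weeks of history")
    recent = sum(weekly_totals[-6:])
    prior = sum(weekly_totals[-12:-6])
    if prior == 0:
        # Assumption: no baseline activity means growth is undefined.
        return None
    return (recent - prior) / prior * 100.0
```

Six weeks at 10 commits followed by six weeks at 15 commits gives 60 versus 90 commits, or 50% growth.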
New Repositories
The count of public repositories created by the organization in the last 30 days. A burst of new repos often signals infrastructure buildout, new product lines, or framework migrations.
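One way to count recent repositories, assuming each repository's `created_at` value is available as a UTC timestamp string in the form GitHub returns (for example `2024-06-15T12:00:00Z`). The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def count_new_repos(created_at: list, now: datetime, window_days: int = 30) -> int:
    """Count repositories created within the last `window_days` days."""
    cutoff = now - timedelta(days=window_days)
    count = 0
    for stamp in created_at:
        created = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        if cutoff <= created <= now:
            count += 1
    return count
```

Passing `now` explicitly keeps the function deterministic, which makes weekly reruns easier to reproduce.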
Signal Classification
Each startup is assigned one of four signal types based on which metric is driving the acceleration:
- Engineering hiring burst — contributor growth rate exceeds 50%. The team is scaling rapidly.
- Infrastructure buildout — 3 or more new repositories in 30 days. The company is expanding its technical surface area.
- Deploy frequency spike — commit velocity has increased 150% or more. The team is shipping at an unusually high rate.
- Framework migration — general acceleration that doesn't fit the above categories, often indicating a technology stack transition.
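The four rules above can be sketched as a simple cascade. The thresholds come from the list; the order in which they are checked when several apply at once is an assumption of this example (it follows the order of the list), as are the function and parameter names.

```python
def classify_signal(contributor_growth_pct: float,
                    new_repos_30d: int,
                    velocity_change_pct: float) -> str:
    """Assign one signal type; earlier rules win when several thresholds are met."""
    if contributor_growth_pct > 50:      # "exceeds 50%"
        return "Engineering hiring burst"
    if new_repos_30d >= 3:               # "3 or more new repositories"
        return "Infrastructure buildout"
    if velocity_change_pct >= 150:       # "increased 150% or more"
        return "Deploy frequency spike"
    # Fallback for acceleration that matches none of the rules above.
    return "Framework migration"
```

For instance, a startup with 20% contributor growth, one new repository, and a 200% velocity change would be labeled "Deploy frequency spike" under this ordering.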
Stage Estimation
We estimate startup stage from contributor count as a rough proxy for team size: Pre-seed (1–7 contributors), Seed (8–19), Series A/B (20–49), Growth (50+). This is an approximation — not all contributors are employees, and not all employees contribute to public repos.
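The bucket boundaries above translate directly into a lookup. This is a sketch of the stated ranges only; the function name is hypothetical, and rejecting counts below 1 is an assumption, since the ranges start at one contributor.

```python
def estimate_stage(contributors: int) -> str:
    """Map a contributor count to the rough stage buckets described above."""
    if contributors < 1:
        raise ValueError("contributor count must be at least 1")
    if contributors <= 7:
        return "Pre-seed"
    if contributors <= 19:
        return "Seed"
    if contributors <= 49:
        return "Series A/B"
    return "Growth"
```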
Update Frequency
Data is refreshed weekly (Monday mornings). The pipeline queries GitHub for the latest 52 weeks of commit history, recalculates all metrics, regenerates sector rankings, and rebuilds the site. Each sector page shows rankings for the current quarter and up to four previous quarters.
Known Limitations
Private repos are invisible. Some startups keep all or most code in private repositories. Our signal only covers public engineering activity.
Commit volume is not code quality. High commit velocity can reflect rapid feature development, but also refactoring, documentation, or CI/CD noise. We mitigate this by measuring change from baseline rather than absolute counts.
Not investment advice. Engineering acceleration is a leading indicator of traction, not a guarantee of success. Always conduct your own due diligence before making investment decisions.