GitHub Due Diligence for VCs — A Public-Data Checklist
A repeatable GitHub due-diligence checklist for venture investors: commit velocity, contributor graph, repository topology, dependency footprint, and engineering-team signal — using only public data.
Every VC with a data operation pulls a target's GitHub activity before the partner meeting. A repeatable, public-data-only checklist — what we call the 5-signal GitHub DD pass — takes about 20 minutes per company and produces a defensible diligence note.
Signal 1 — Commit velocity. Total commits to the most-active public repository over a rolling 14-day window. Compare the trailing window to the prior window: a >100% acceleration is a *deploy-frequency-spike* signal and historically precedes announcements. A flat-line is fine for late-stage; a decline at early-stage is a yellow flag worth diligencing further.
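The window comparison above can be sketched in a few lines of Python. This is a minimal illustration, not the GitDealFlow implementation: the function names `velocity_change` and `is_spike` are hypothetical, and the commit timestamps are assumed to have been collected already (for example from the public commit history of the most active repository).

```python
from datetime import datetime, timedelta, timezone


def velocity_change(commit_times, now, window_days=14):
    """Compare the trailing window with the window immediately before it.

    Returns (trailing_count, prior_count, pct_change). pct_change is None
    when the prior window has no commits, since the ratio is undefined.
    """
    window = timedelta(days=window_days)
    trailing_start = now - window
    prior_start = now - 2 * window
    trailing = sum(1 for t in commit_times if trailing_start < t <= now)
    prior = sum(1 for t in commit_times if prior_start < t <= trailing_start)
    if prior == 0:
        return trailing, prior, None
    return trailing, prior, 100.0 * (trailing - prior) / prior


def is_spike(pct_change, threshold=100.0):
    """True when acceleration exceeds the >100% threshold from the text."""
    return pct_change is not None and pct_change > threshold


# Example with made-up timestamps: 3 commits in the prior window, 9 in the trailing one.
now = datetime(2024, 1, 29, tzinfo=timezone.utc)
prior_commits = [datetime(2024, 1, d, tzinfo=timezone.utc) for d in (5, 8, 12)]
trailing_commits = [datetime(2024, 1, d, tzinfo=timezone.utc) for d in range(16, 25)]
result = velocity_change(prior_commits + trailing_commits, now)
```

Returning `None` for an empty prior window is a design choice in this sketch: a repository that went from zero to some commits is better reviewed by hand than reported as an infinite acceleration.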
Signal 2 — Contributor graph. Unique contributors over the same window. Bus factor: if one author accounts for 80% or more of commits, you have key-person risk. Growth >50% week-over-week is an *engineering-hiring-burst* signal. Run git shortlog -sne in a local clone of the public repository, or use the GitHub Insights tab.
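As a rough sketch of the bus-factor check, the snippet below parses the text that git shortlog -sne prints (a commit count, whitespace, then the author name and email) and computes the top author's share. The helper name `top_author_share` and the sample authors are hypothetical. To restrict the count to the 14-day window, the command can be run with a date filter such as --since="14 days ago".

```python
def top_author_share(shortlog_text):
    """Parse `git shortlog -sne` output and return (top_author, share).

    Each non-empty line is expected to look like:
        "    80\tAlice Example <alice@example.com>"
    Returns (None, 0.0) when there are no commits.
    """
    counts = []
    for line in shortlog_text.splitlines():
        line = line.strip()
        if not line:
            continue
        count_str, author = line.split(None, 1)
        counts.append((int(count_str), author))
    total = sum(count for count, _ in counts)
    if total == 0:
        return None, 0.0
    top_count, top_author = max(counts)
    return top_author, top_count / total


# Made-up sample output for illustration only.
sample = (
    "    80\tAlice Example <alice@example.com>\n"
    "    15\tBob Example <bob@example.com>\n"
    "     5\tCarol Example <carol@example.com>\n"
)
author, share = top_author_share(sample)
key_person_risk = share >= 0.8  # the 80% threshold from the text
```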
Signal 3 — Repository topology. New repos created in the trailing 30 days. 3+ new repos signals *infrastructure buildout* — typically a platform play, often precedes a product expansion or fundraise. Single-repo orgs with no recent creation are mature/stable; not a red flag, just a different stage.
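The repo-creation count can be sketched as follows, assuming the organization's public repositories have already been fetched as a list of dictionaries carrying a name and an ISO 8601 `created_at` timestamp (the shape GitHub's REST API uses for repository objects). The function names and the sample data are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(ts):
    """Parse an ISO 8601 timestamp such as '2024-01-20T12:00:00Z'."""
    # Replace the trailing 'Z' so this also works on Python versions
    # whose fromisoformat() does not accept it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def new_repos(repos, now, days=30):
    """Return names of repos created within the trailing `days` days."""
    cutoff = now - timedelta(days=days)
    return [r["name"] for r in repos if parse_ts(r["created_at"]) >= cutoff]


def is_buildout(repos, now, min_new=3):
    """True when at least `min_new` repos were created in the window (3+ in the text)."""
    return len(new_repos(repos, now)) >= min_new


# Made-up repositories for illustration only.
now = datetime(2024, 2, 1, tzinfo=timezone.utc)
repos = [
    {"name": "api-gateway", "created_at": "2024-01-20T12:00:00Z"},
    {"name": "sdk-python", "created_at": "2024-01-25T09:30:00Z"},
    {"name": "infra", "created_at": "2024-01-10T08:00:00Z"},
    {"name": "website", "created_at": "2023-06-01T00:00:00Z"},
]
recent = new_repos(repos, now)
```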
Signal 4 — Dependency footprint. Pull package.json/pyproject.toml/go.mod. Check (a) license-incompatible dependencies (GPL leaking into a commercial codebase), (b) security advisories in transitive deps via npm audit / pip-audit, (c) the depth of the dependency tree (a 4-month-old startup with 200 transitive deps may have rushed). This catches engineering-quality issues that don't show up in pitch decks.
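Two of the dependency checks lend themselves to a small script. The sketch below counts direct dependencies declared in a package.json and flags GPL-family licenses in a name-to-license mapping. That mapping is an assumed input (it would come from whatever license-inventory tool you use), the license strings are assumed to be SPDX-style identifiers, and the prefix list is an illustrative choice rather than a legal determination. The function names are hypothetical.

```python
import json

# Illustrative assumption: treat GPL and AGPL identifiers as needing review.
# LGPL identifiers do not match these prefixes and are not flagged here.
COPYLEFT_PREFIXES = ("GPL-", "AGPL-")


def count_direct_deps(package_json_text):
    """Count entries under dependencies and devDependencies in a package.json."""
    manifest = json.loads(package_json_text)
    runtime = manifest.get("dependencies", {})
    dev = manifest.get("devDependencies", {})
    return len(runtime) + len(dev)


def flag_copyleft(licenses):
    """Return sorted package names whose license starts with a flagged prefix."""
    return sorted(
        name
        for name, lic in licenses.items()
        if lic.upper().startswith(COPYLEFT_PREFIXES)
    )


# Made-up inputs for illustration only.
manifest_text = (
    '{"dependencies": {"a": "1.0.0", "b": "2.0.0"},'
    ' "devDependencies": {"c": "3.0.0"}}'
)
licenses = {
    "left-pad": "MIT",
    "foo": "GPL-3.0-only",
    "bar": "AGPL-3.0-only",
    "baz": "LGPL-3.0-only",
}
direct = count_direct_deps(manifest_text)
flagged = flag_copyleft(licenses)
```

Security advisories are better left to npm audit or pip-audit as the text suggests; this sketch only covers the license and footprint checks.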
Signal 5 — Founder's commit pattern. Find the founder's GitHub user. Check what they're shipping personally vs. delegating. A founder who hasn't committed in six months is a red flag at pre-seed; at Series-A it's expected. Check star history — what they're starring is a leading indicator of their thinking.
The GitDealFlow MCP server returns Signals 1, 2, and 3 in a single get_startup_signal call against any tracked org. Signals 4 and 5 are manual but take 5 minutes each. Total: 20-minute repeatable diligence pass.
Frequently asked questions
How long does a 5-signal GitHub DD pass take?
Approximately 20 minutes per target if you use the GitDealFlow MCP server for the first three signals (one call) and inspect dependencies + founder commits manually for the last two.
What's the most predictive single signal?
Commit-velocity change (the percentage delta vs. the prior 14-day window). Top-quintile changes preceded fundraises by a median of 41 days at 70.3% precision in our historical panel.
Is this legal? Is using public GitHub data allowed?
Yes. All five signals use public GitHub API endpoints with default rate limits. No scraping of private repos, no terms-of-service violations. The GitDealFlow dataset is CC-BY-4.0 licensed.
What if the target's repos are private?
Then this checklist doesn't help directly — but the *absence* of public GitHub activity in a developer-tools or AI-infrastructure target is itself a signal worth diligencing. Most pre-Series-B technical companies have at least one public repo.