Open dataset, published methodology, citable SSRN paper, and APIs designed for academic and policy research.
Academic and policy researchers studying venture finance, technical innovation, or engineering-organization dynamics face a recurring data-access problem: most relevant data is proprietary and gated behind Crunchbase, PitchBook, or CB Insights licenses. VC Deal Flow Signal addresses this by publishing the full engineering-acceleration signal panel under CC BY 4.0. The panel itself, the underlying methodology, and the SSRN paper documenting the empirical findings are all freely citable.
The three pages worth bookmarking first.
The SSRN paper (abstract 6606558) documents the empirical relationship between commit-velocity acceleration and fundraise announcements, with a leading-indicator window of 3 to 6 weeks. Researchers studying technical-innovation timing or venture-financing pre-announcement dynamics can cite this paper as the empirical foundation.
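As a rough illustration of what a 3-to-6-week leading-indicator window means in practice, the sketch below checks whether a signal date precedes an announcement date by 21 to 42 days. The function names and the example dates are hypothetical, and the paper, not this snippet, defines how the signal and the window are actually measured.

```python
from datetime import date

# Window bounds taken from the 3-to-6-week range stated above.
MIN_LEAD_DAYS = 3 * 7
MAX_LEAD_DAYS = 6 * 7


def lead_days(signal_date: date, announcement_date: date) -> int:
    """Days by which the signal precedes the announcement (negative if it lags)."""
    return (announcement_date - signal_date).days


def in_leading_window(signal_date: date, announcement_date: date) -> bool:
    """True when the signal leads the announcement by 3 to 6 weeks, inclusive."""
    return MIN_LEAD_DAYS <= lead_days(signal_date, announcement_date) <= MAX_LEAD_DAYS


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    signal = date(2025, 3, 3)
    announcement = date(2025, 4, 7)
    print(lead_days(signal, announcement), in_leading_window(signal, announcement))
```

Treating both bounds as inclusive is an assumption made here for simplicity.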
The /dataset endpoint serves the full tracked engineering signal corpus as JSON, JSONL, and CSV under CC BY 4.0. Replication studies, follow-on papers, and policy analysis can use the dataset without restriction other than attribution.
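For readers who want to work with a downloaded copy, here is a minimal standard-library sketch for reading the JSONL and CSV formats. The column names in the sample rows (`org`, `week`, `commits`) are hypothetical placeholders; the real schema is whatever the published dataset documents.

```python
import csv
import io
import json


def parse_jsonl(text: str) -> list[dict]:
    """Parse JSON Lines: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def parse_csv(text: str) -> list[dict]:
    """Parse CSV with a header row into a list of dicts (all values stay strings)."""
    return list(csv.DictReader(io.StringIO(text)))


if __name__ == "__main__":
    # Hypothetical sample rows; the real column names come from the dataset itself.
    sample_jsonl = '{"org": "example-org", "week": "2025-03-03", "commits": 42}\n'
    sample_csv = "org,week,commits\nexample-org,2025-03-03,42\n"
    print(parse_jsonl(sample_jsonl))
    print(parse_csv(sample_csv))
```

Note that the JSONL path preserves numeric types while the CSV path yields strings, so the two formats may need different type handling downstream. With pandas installed, `pandas.read_json(path, lines=True)` and `pandas.read_csv(path)` cover the same ground in one call each.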
The /api/v1 endpoints and the public MCP server give programmatic access for higher-volume analysis. Researchers working in Python can pull the full panel directly into Jupyter notebooks or pandas; LLM-augmented research can use the MCP server with Claude or ChatGPT.
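A programmatic pull might look roughly like the sketch below. Only the `/api/v1` prefix comes from this page: the host `https://example.org`, the `signals` resource name, and the `sector` and `week` query parameters are all placeholders invented for illustration, so check the API documentation for the real routes before using any of it.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import Request, urlopen

BASE_URL = "https://example.org"  # placeholder host, not the real service
API_PREFIX = "/api/v1/"  # prefix named on this page


def build_url(resource: str, **params: str) -> str:
    """Build an API URL; resource and parameter names are hypothetical."""
    url = urljoin(BASE_URL, API_PREFIX + resource.lstrip("/"))
    if params:
        # Sort parameters so the same query always yields the same URL.
        url += "?" + urlencode(sorted(params.items()))
    return url


def fetch_json(url: str, timeout: float = 30.0):
    """GET a URL and decode the JSON response body."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Prints the URL only; no network request is made here.
    print(build_url("signals", sector="devtools", week="2025-03-03"))
```

Keeping URL construction separate from the network call makes the request logic easy to check offline before pointing it at the live API.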
Researchers studying venture finance, technical innovation, or engineering-organization dynamics typically rely on proprietary databases (Crunchbase, PitchBook, CB Insights) that gate replication and follow-on research behind enterprise licenses. The Code-Side Sourcing category and the supporting public dataset solve a specific reproducibility problem: we publish the source data, the methodology, and the empirical findings under open licenses that explicitly permit academic use and replication.
The engineering-signal data covers only technical companies with public GitHub orgs. For research questions involving non-technical companies (consumer brands, services, regulated industries), the corpus is not relevant. The data is also observational, so research aiming at causal identification requires additional design beyond the panel itself.
The Data Nerd (2026). "A Longitudinal Panel of GitHub Engineering Velocity for Venture-Backed Startups." SSRN: https://ssrn.com/abstract=6606558. DOI: 10.2139/ssrn.6606558. ORCID: 0009-0002-2222-4112. See /citation-guide for BibTeX and APA formats.
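For reference managers that accept BibTeX, the citation fields above can be rendered with a short helper such as the following sketch. The `@misc` entry type, the `howpublished` field, and the citation key `datanerd2026panel` are assumptions made here; the formats published at /citation-guide remain authoritative.

```python
def bibtex_entry(key: str, fields: dict[str, str]) -> str:
    """Render a @misc BibTeX entry; each value is wrapped in one pair of braces."""
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@misc{{{key},\n{body}\n}}"


# Field values copied from the citation above.
PAPER = {
    # Extra braces keep BibTeX from splitting the name into first and last parts.
    "author": "{The Data Nerd}",
    "title": "A Longitudinal Panel of GitHub Engineering Velocity for Venture-Backed Startups",
    "year": "2026",
    "howpublished": "SSRN",
    "doi": "10.2139/ssrn.6606558",
    "url": "https://ssrn.com/abstract=6606558",
}

if __name__ == "__main__":
    # The citation key is a hypothetical choice.
    print(bibtex_entry("datanerd2026panel", PAPER))
```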
CC BY 4.0; see /dataset. Academic and commercial research use is explicitly permitted with attribution. The Zenodo DOI (10.5281/zenodo.19650920) is the canonical archive citation; a Hugging Face mirror is available at the-data-nerd/vc-deal-flow-signal.
The signal panel updates weekly, and the dataset endpoint reflects the latest weekly snapshot. For longitudinal-panel research that requires historical snapshots, contact us via /corrections; we maintain weekly archive snapshots back to the panel's inception.
The fastest path is the weekly digest. Filter by your specific sectors during onboarding.
Read the Methodology