Data Infrastructure · Startup idea
Snowflake and BigQuery overserve everyone with under 100 GB of data. The DuckDB-on-S3 stack is the right architecture; what's missing is the product layer.
Why now
MotherDuck, DuckLake, and the broader DuckDB ecosystem proved the architecture works. The next layer is the operator-friendly UI: query in a notebook, materialize to S3, no Snowflake bill.
The idea you could build today
Hosted DuckDB on top of S3 / R2. Notebook UI for ad-hoc queries. CDC ingest from Postgres / MySQL. Per-GB-stored pricing under $0.01.
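A minimal sketch of the "query in a notebook, materialize to S3" step, assuming the `duckdb` Python package and its `httpfs` extension. The bucket name, paths, and helper names are hypothetical, and credential setup (for S3 or an R2 endpoint) is left out. The SQL builder runs on its own; only `run_materialize` needs DuckDB installed.

```python
def materialize_sql(select_sql: str, dest_uri: str) -> str:
    """Wrap an ad-hoc SELECT in a DuckDB COPY statement that writes Parquet."""
    # Drop a trailing semicolon so the query can sit inside COPY (...).
    inner = select_sql.rstrip().rstrip(";")
    # Double any single quotes so the URI stays a valid SQL string literal.
    safe_uri = dest_uri.replace("'", "''")
    return f"COPY ({inner}) TO '{safe_uri}' (FORMAT PARQUET)"


def run_materialize(select_sql: str, dest_uri: str) -> None:
    """Execute the COPY with DuckDB (requires `pip install duckdb` and credentials)."""
    import duckdb  # imported lazily so the SQL builder works without the package

    con = duckdb.connect()       # in-memory database
    con.execute("INSTALL httpfs")
    con.execute("LOAD httpfs")   # enables s3:// reads and writes
    con.execute(materialize_sql(select_sql, dest_uri))


if __name__ == "__main__":
    # Hypothetical bucket and layout, for illustration only.
    query = (
        "SELECT customer_id, sum(amount) AS total "
        "FROM read_parquet('s3://example-bucket/orders/*.parquet') "
        "GROUP BY customer_id"
    )
    print(materialize_sql(query, "s3://example-bucket/marts/customer_totals.parquet"))
```

Keeping the SQL construction separate from execution is one possible design choice: a notebook UI could preview the statement before running it against the bucket.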
Build stack
The three repos already trying
Framework migration · +92% 14-day velocity Δ · 5 contributors
A collection of tools for OpenAPI specifications. (NOTE: This organization is not affiliated with OpenAPI Initiative (OA
Framework migration · +44% 14-day velocity Δ · 100 contributors
An orchestration platform for the development, production, and observation of data assets.
Framework migration · +13% 14-day velocity Δ · 100 contributors
Matched against the current-period startup signal panel (data-infrastructure, developer-tools). Rankings shift weekly as the underlying GitHub activity moves.
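One way a "14-day velocity Δ" could be computed, as a sketch only: the panel's actual methodology is not given here. This version assumes the figure is the percent change in commit count between the latest 14-day window and the 14 days before it. The function name and inputs are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional, Sequence


def velocity_delta(
    commit_dates: Sequence[date], as_of: date, window_days: int = 14
) -> Optional[float]:
    """Percent change in commits: latest window vs. the window before it."""
    current_start = as_of - timedelta(days=window_days)
    previous_start = as_of - timedelta(days=2 * window_days)
    # Windows are half-open: (start, end], so no commit is counted twice.
    current = sum(1 for d in commit_dates if current_start < d <= as_of)
    previous = sum(1 for d in commit_dates if previous_start < d <= current_start)
    if previous == 0:
        return None  # no baseline, so a percentage is undefined
    return 100.0 * (current - previous) / previous


if __name__ == "__main__":
    # Made-up activity: 25 commits in the earlier window, 48 in the latest.
    commits = [date(2026, 5, 1)] * 25 + [date(2026, 5, 10)] * 48
    print(velocity_delta(commits, as_of=date(2026, 5, 18)))  # 92.0
```

Returning `None` when the earlier window is empty avoids reporting an infinite or misleading jump for brand-new repos.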
The seed-round pattern hiding in the trendline
The seed-stage tells are DuckDB-class repos whose commit velocity is concentrated in "hosted control plane" or "S3 sink" modules.
MotherDuck already does this, and it is the leader. The opportunity is the OSS-first, self-hostable MotherDuck alternative for the long tail.
Use the signal, not just the idea
The repos above re-rank automatically as commit velocity, contributor growth, and new-repo creation move. Want the data feed for this idea wired into your own stack? The MCP server exposes every signal as a tool any agent host can query.
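For a sense of what querying such a tool looks like on the wire, here is a sketch of a Model Context Protocol `tools/call` request, which is a JSON-RPC 2.0 message. The tool name `get_repo_signals` and its arguments are hypothetical; the real tool names would come from the server's own `tools/list` response.

```python
import json


def tools_call_request(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request that invokes one MCP tool."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool_name, "arguments": arguments},
        }
    )


if __name__ == "__main__":
    # Hypothetical tool and argument names, for illustration only.
    print(tools_call_request(1, "get_repo_signals", {"topic": "data-infrastructure"}))
```

In practice an agent host or an MCP client library would send this over the server's transport; the sketch only shows the message shape.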
Updated 2026-05-18. The framing is editorial; the “three repos already trying” slot is generated from the live signal panel. Anonymity rule: we name public GitHub orgs, never individual founders or stealth teams.