Data Infrastructure · Startup idea
Fivetran is for batch. The product that does streaming CDC from Postgres / MySQL to a warehouse with sub-minute latency, without a PhD to operate, wins the operational analytics market.
Why now
Debezium + Kafka is technically correct and operationally painful. ClickHouse + Materialize made streaming queryable. The remaining work is the operator-facing UI that turns the whole stack into a five-minute setup.
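To make the "technically correct" part concrete, here is a minimal sketch of what consuming a CDC stream involves. It assumes a simplified Debezium-style envelope with "op", "before", and "after" fields. Real Debezium events also carry schema and source metadata, which this leaves out. The apply_change helper, the in-memory replica, and the "id" key column are hypothetical names for illustration only.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def apply_change(table: Dict[Any, Row], event: Dict[str, Any], key: str = "id") -> None:
    """Apply one simplified Debezium-style change event to an in-memory replica.

    Assumes op codes "c" (create), "r" (snapshot read), "u" (update), "d" (delete).
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        # Creates, snapshot reads, and updates are all upserts keyed on the primary key.
        after = event["after"]
        table[after[key]] = after
    elif op == "d":
        # Deletes carry the old row in "before"; removing a missing key is a no-op,
        # which keeps replays idempotent.
        before = event["before"]
        table.pop(before[key], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")


replica: Dict[Any, Row] = {}
events = [
    {"op": "c", "before": None, "after": {"id": 1, "status": "new"}},
    {"op": "u", "before": {"id": 1, "status": "new"}, "after": {"id": 1, "status": "paid"}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "new"}},
    {"op": "d", "before": {"id": 2, "status": "new"}, "after": None},
]
for e in events:
    apply_change(replica, e)

print(replica)
```

The apply logic itself is small. The operational pain sits around it: connectors, offsets, schema changes, and backfills.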
The idea you could build today
One-command install. Connect to source DB via CDC. Stream to ClickHouse / BigQuery / Snowflake. Transforms in SQL. UI that shows lag, throughput, and per-table state. Per-GB pricing under Fivetran.
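As a rough sketch of the state such a UI might track, the snippet below keeps per-table counters for lag, throughput, and bytes moved (the last of which could feed per-GB metering). Every name here (TableState, PipelineMonitor, record) is hypothetical, and lag is assumed to be the gap between the source commit time and the time the event was applied downstream.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TableState:
    events: int = 0
    bytes_in: int = 0          # running total, usable for per-GB metering
    first_applied_ms: int = 0  # when the first event for this table was applied
    last_applied_ms: int = 0   # when the most recent event was applied
    lag_ms: int = 0            # apply time minus source commit time, latest event


@dataclass
class PipelineMonitor:
    tables: Dict[str, TableState] = field(default_factory=dict)

    def record(self, table: str, commit_ms: int, applied_ms: int, size_bytes: int) -> None:
        """Update per-table state after one change event has been applied."""
        st = self.tables.setdefault(table, TableState())
        if st.events == 0:
            st.first_applied_ms = applied_ms
        st.events += 1
        st.bytes_in += size_bytes
        st.last_applied_ms = applied_ms
        st.lag_ms = max(0, applied_ms - commit_ms)

    def throughput(self, table: str) -> float:
        """Rough average events per second between first and last applied event."""
        st = self.tables[table]
        elapsed_s = (st.last_applied_ms - st.first_applied_ms) / 1000
        return st.events / elapsed_s if elapsed_s > 0 else 0.0


mon = PipelineMonitor()
mon.record("orders", commit_ms=1_000, applied_ms=1_400, size_bytes=200)
mon.record("orders", commit_ms=2_000, applied_ms=3_400, size_bytes=300)
mon.record("users", commit_ms=2_500, applied_ms=2_600, size_bytes=100)

for name, st in mon.tables.items():
    print(name, st.lag_ms, st.bytes_in, mon.throughput(name))
```

A production version would likely use sliding windows rather than lifetime averages, but the shape of the per-table record is the part the operator sees.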
The three repos already trying
Repo 1 · Distributed open source platform for change data capture · Framework migration · 14-day velocity Δ: -22% · 100 contributors
Repo 2 · 14-day velocity Δ: -62% · 100 contributors
Repo 3 · Framework migration · 14-day velocity Δ: +92% · 5 contributors
Matched against the current-period startup signal panel (data-infrastructure, developer-tools). Rankings shift weekly as the underlying GitHub activity moves. Read the methodology.
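The methodology behind the 14-day velocity Δ is not spelled out here. One plausible reading, offered purely as an assumption, is the percentage change in commit count between the latest 14-day window and the 14 days before it. The velocity_delta function and the sample dates below are hypothetical and use synthetic data, not the panel's numbers.

```python
from datetime import date, timedelta
from typing import Iterable, Optional


def velocity_delta(commit_dates: Iterable[date], as_of: date,
                   window_days: int = 14) -> Optional[float]:
    """Percent change in commits: latest window vs. the window before it.

    Assumed definition, not the panel's documented method.
    Returns None when the previous window has no commits (change is undefined).
    """
    current_start = as_of - timedelta(days=window_days)
    previous_start = as_of - timedelta(days=2 * window_days)
    current = previous = 0
    for d in commit_dates:
        if current_start < d <= as_of:
            current += 1
        elif previous_start < d <= current_start:
            previous += 1
    if previous == 0:
        return None
    return (current - previous) * 100.0 / previous


# Synthetic example: 10 commits in the earlier window, 15 in the latest one.
as_of = date(2026, 5, 18)
commits = [date(2026, 4, 25)] * 10 + [date(2026, 5, 10)] * 15
delta = velocity_delta(commits, as_of)
print(delta)
```

A small contributor base makes this number swing hard, so the delta is best read alongside the contributor count.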
The seed-round pattern hiding in the trendline
Streaming-ETL open-source repos whose commit velocity clusters around "single-binary" or "managed-control-plane" milestones are the seed-stage tells.
The incumbent, Fivetran, is batch-shaped and expensive. The ops-friendly, streaming-CDC version is the wedge.
Use the signal, not just the idea
The repos above re-rank automatically as commit velocity, contributor growth, and new-repo creation move. Want the data feed for this idea wired into your own stack? The MCP server exposes every signal as a tool any agent host can query.
Updated 2026-05-18. The framing is editorial; the “three repos already trying” slot is generated from the live signal panel. Anonymity rule: we name public GitHub orgs, never individual founders or stealth teams.