Character bible (under the anonymity rule)
I'm The Data Nerd. I won't tell you my real name and that's on purpose — the methodology is the protagonist, I'm just the storyteller. Engineer for fifteen years, angel-checker since deal #5. The whole product rests on whether the signal is real, not on whether you find me charismatic.
Anonymous engineer-investor. Wrote the SSRN methodology paper. Refuses to do podcasts.
00 · Disclosure
Behind the handle is one real engineer-investor. The handle exists because the methodology has to stand on whether the signal is real, not on whether the person delivering it is charismatic. You will never see my face. You will never hear my real voice. You will never see my real name on this site.
What you will hear: a synthetic voice (Cartesia) reading every page narration, every YouTube short, every monthly founder talk. The same synthetic voice runs across YouTube, email-audio, and every page narration, so the brand has rhythm even without a person. There is no founder voice. The methodology is real. The voice is a writing convention.
What you will read: the same handle (“The Data Nerd”) signed at the bottom of every email, every page, every video script. Voice fragmentation breaks the character; one handle, every surface, forever. The methodology paper on SSRN uses my real name behind a Sipiteno Ltd. corporate veil; everything on this site uses the handle.
Before you read on
This is the character bible, so it gets engineering-heavy below — merge graphs, regressions, contributor de-duplication, the lot. That’s the engine. You only ever see the translated read. The creed of the people this is built for: “We move on the engineering signal before the round — without reading a line of code.”
So when a section gets technical, that’s the machinery talking, not a quiz. Where the jargon runs thickest you’ll see an “In plain English:” line that says the same thing the way a corp-dev partner would say it across a table.
01 · Identity archetype
The four archetypes a founder character can occupy are Leader, Adventurer, Reluctant Hero, and Reporter. Leaders front-load conviction; Adventurers front-load risk; Reluctant Heroes front-load reluctance; Reporters front-load curiosity. I'm the fourth one with a touch of the third — a Reluctant Reporter. The first time I noticed the seabird flock I thought I was seeing a coincidence. The fifth time I checked I knew I owed the fishermen the warning. The product on this site is the warning, formalised.
In contrast: Not a Leader (I won't take the stage), not an Adventurer (I won't sell risk as romance), not a pure Reluctant Hero (I'm not waiting to be drafted). I'm a Reporter who can't unsee the pattern, and a Reluctant Hero about whether the discovery is mine to publish.
02 · The tribe
“We move on the engineering signal before the round — without reading a line of code.”
The reader who nods through the polarity is a First Mover — a solo angel, scout, seed fund, corp-dev or PE operator who evaluates companies for a living but doesn't read code and doesn't want to. The handle the reader earns is 'first mover': the investor who reaches the founder before the round, on a signal someone else translated into plain English. The product is built around that identity. The pricing is built around that identity. Every page on this site is built around that identity. If the label feels off — if you'd rather pull up the merge graph and run the regression yourself — that's diagnostic, and the product is probably wrong for you.
03 · Polarity (8 positions)
Eight polarities. If you nod through all eight, you’re my reader. If even one feels wrong, please save your money — this is the wrong product for you and that’s honest.
For: Public data is more valuable than private data.
Against: Edge from access.
Renaissance Technologies launched its Medallion fund in 1988 on data anyone could buy: Reuters quotes, SEC filings, OPRA ticks. Medallion compounded ~39% net for thirty years. The data wasn't edge. The lens was. Same logic on GitHub.
For: Code is more honest than copy.
Against: The deck is the company.
A pitch deck is a marketing artifact written for the next round. A merge graph is the company's actual behaviour, updated daily.
In plain English: the deck is what a company says about itself for the next fundraise; what its engineers actually ship every day is the truer story. You watch the work, not the pitch — and you never have to open the work yourself.
For: Anonymity is a credibility signal.
Against: Cult of personality.
If the signal needs a charismatic founder to land, the signal isn't strong enough. If we're right, the data carries the argument.
For: €9.97/mo is a feature, not a price ceiling.
Against: Six-figure data subscriptions for six-person funds.
We'd rather have a thousand readers who tell five friends than a hundred enterprise contracts. The founding price is locked forever.
For: Methodology before metrics.
Against: Black-box scores.
Every number on this site links to the formula that produced it. The /methodology page is the moat. If you can reproduce the regression, you can audit the claim. If we hide the formula we deserve to be ignored.
For: False positives published in the same email as the wins.
Against: Curated case-study reels.
Every Tuesday digest names at least one signal that fired wrong the prior week, with the post-mortem inline. A vendor who never publishes a miss is a vendor with no calibration discipline. The /scorecard page is permanent and includes every miss.
For: Async over live.
Against: Discovery-call theatre.
Two daily reply batches. No calendar links below the Sharp Tier. A long written email beats a 30-minute call you scheduled to qualify yourself. If the question can be answered in writing, the call wastes the buyer's hour and the founder's anonymity at the same time.
For: Distribution is a moat. Friction is the leak.
Against: Walled-garden datasets.
Every public surface has a markdown mirror at /md. Every page has an agent-card endpoint. The MCP server installs in one line. The OpenAPI spec is at a stable URL. We pay the cost of redundant discoverability so the reader, the agent, and the LLM all find us through whichever path fits them.
04 · Six parables
Every core claim needs a parable that makes it feel obvious. These are the six I rotate through emails, videos, and the Sunday digest. Each one is also hyperlinkable at /parables so you can send a single one to a friend without sending them this whole bible.
Parable 1 · The Unknown Lighthouse Keeper
A lighthouse keeper notices a particular flock of seabirds arrive a week before every storm. He doesn't know why. He only knows that when the birds arrive, ships should already be in harbour. The fishermen who follow him stop losing boats. The ones who say 'birds aren't weather data' keep losing them.
Lesson: Engineering acceleration is the seabird flock. It doesn't prove the storm. It precedes it reliably enough that ignoring it is the expensive choice.
Parable 2 · The Loud Engine
Two cars start a race. One is silent at the line. The other idles loud, builds revs, the driver checks his mirrors, the passenger fastens her belt. The silent car may win — but the loud one is doing every observable thing a car about to launch does.
Lesson: Code is the engine of a startup. When the engine is visibly louder for two weeks running, the launch usually follows. We aren't reading the future. We're reading the things that always happen right before the future arrives.
Parable 3 · The Letter the Postman Already Read
Imagine the postman could read every letter in his bag. The richest man in town wouldn't pay him for tomorrow's letters — those aren't in the bag yet. He'd pay him for today's letters delivered three days early.
Lesson: GitHub already wrote the letters. Crunchbase reads them on the day they land. We open them in transit. Everyone else gets the same mail we do — they just get it the week after the founder posted on LinkedIn.
Parable 4 · The Sunday Email I Never Sent
The Sunday before the $4M Series A I should have been in, I drafted a three-line email to the founder. 'Saw your settlement-layer commits. The way you're handling the FX edge case is the kind of thing your competitors will copy in eighteen months.' I read it back. Decided I hadn't earned the right. Closed the laptop. Three weeks later the deck went out and the round closed inside a week.
Lesson: The email I didn't send cost me a position I'd already done the work to deserve. Now I send the email. The product on this site is a system that decides which Sunday emails are worth sending — so I never have to ask whether I've earned the right again.
Parable 5 · The Reader Who Told Me I Was Wrong
Six weeks into the public beta a Series B associate replied to a Tuesday digest with two lines. 'You flagged orgname. Their commit velocity tripled because they migrated a monorepo. There was no acceleration. Just a re-org.' She was right. The model had no signal for monorepo migration events. We added it the next Sunday — false positive rate dropped from 7% to 4% on the back of one reader's reply.
Lesson: Every methodology is wrong somewhere. The cheap move is to deny it. The expensive move — and the one that compounds — is to publish the limit before the reader finds it. We publish ours at /methodology. The reader who corrects us is the reader who matters most.
In plain English: the team looked like it had suddenly tripled its output — but it was just two teams’ work getting merged into one place, not real new momentum. A reader who knew the company caught it; we fixed the blind spot, and our false-alarm rate dropped.
Parable 6 · The Tuesday I Broke the Regression
On a Tuesday in February I refactored the velocity computation 'just to clean it up.' Pushed at 9pm. Wednesday morning the digest went out with three orgs ranked at the top that did not belong there — a hackathon, a bot-heavy security tool, a vendor's documentation repo. Thirty subscribers replied. I rolled back, ran the panel against the prior week's truth set, found the off-by-one in the contributor-deduplication step, shipped the fix Thursday at 3am, posted the post-mortem at /uptime Friday morning.
Lesson: The methodology is more interesting than the wins. When something breaks, the post-mortem goes public the same week. The regression code, the truth set, the fix commit — all linkable, all CC BY 4.0. That's the whole reason the price is €9.97/mo and not €9,970.
In plain English: I tinkered with how the scoring works, broke it, and Wednesday’s list put three companies on top that didn’t belong there. Readers told me; I traced it to one small counting bug, fixed it, and posted exactly what went wrong in public. You only ever saw the corrected list and the post-mortem.
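For readers who do want a look at the machinery: the contributor-deduplication step named above can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the production code. The function names (normalize_identity, is_bot, unique_contributors), the rule of keying on the lowercased email, and the "[bot]" suffix check are all hypothetical. The only outside fact it leans on is GitHub's noreply address format (ID+login@users.noreply.github.com). It does not reproduce the off-by-one from that Tuesday.

```python
def normalize_identity(name: str, email: str) -> str:
    """Collapse one commit author into a comparison key (hypothetical rule)."""
    email = email.strip().lower()
    # GitHub noreply addresses look like "12345+login@users.noreply.github.com"
    # (older ones omit the numeric prefix); keep only the login part.
    if email.endswith("@users.noreply.github.com"):
        local = email.split("@", 1)[0]
        return local.split("+", 1)[-1]
    # Fall back to the lowercased name when no email was recorded.
    return email or name.strip().lower()


def is_bot(name: str) -> bool:
    """Assumption: automation accounts carry the '[bot]' suffix in the author name."""
    return name.strip().lower().endswith("[bot]")


def unique_contributors(commits):
    """commits: iterable of (author_name, author_email) pairs.

    Returns the set of distinct human contributor keys.
    """
    seen = set()
    for name, email in commits:
        if is_bot(name):
            continue  # bots would otherwise inflate the head count
        seen.add(normalize_identity(name, email))
    return seen


if __name__ == "__main__":
    sample = [
        ("Ada", "Ada@Example.com"),
        ("ada", "ada@example.com "),  # same person, different casing
        ("dependabot[bot]", "49699333+dependabot[bot]@users.noreply.github.com"),
        ("Lin", "123+lin@users.noreply.github.com"),
        ("Lin", "lin@users.noreply.github.com"),
    ]
    print(len(unique_contributors(sample)))
```

In plain English: output per engineer only means something if one engineer with two email addresses counts once and the robots do not count at all. Get that count wrong and a bot-heavy repo looks like a team on fire.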
05 · Three flaws, on purpose
A founder character without visible flaws reads as a brochure. These are real. If they’re dealbreakers, I’m the wrong vendor.
Flaw 1 — Slow to reply.
Email replies happen in two daily batches, never inside the hour. No LinkedIn DMs at all. If you need a vendor on Slack at 11pm, I'm not it.
Flaw 2 — Won't do calls before you've subscribed.
Sharp Tier funds get one quarterly call, included. Insider Circle gets the monthly group briefing. Below that, everything is async and written. I'd rather write you a long email than waste your hour on a discovery call you didn't need.
Flaw 3 — No video, podcasts, or named publication.
Anonymity is non-negotiable. If your firm requires named attribution on every paper or photo on every LinkedIn post, I'm the wrong vendor. The handle is what lets me say uncomfortable things about how the consensus deal-flow industry works.
06 · Seven voice rules
A consistent founder voice compounds across surfaces. These are the seven rules every page, email, and video script gets reviewed against. They’re published so the reader can audit drift.
Rule 1 — Specific over general.
Never say 'a startup'. Say 'a three-founder fintech with one repo'. Never say 'a fund'. Say 'the partner at [redacted] who DM'd me about the fintech the morning after the announcement'. Specific scales; general dies.
Rule 2 — Translate, don't dump — plain business English over code jargon.
We're talking to Marcus: a dealmaker who evaluates companies but doesn't read code, and whose fear is looking non-technical in a technical room. The Data Nerd is an engineer, but he writes for a non-coder — he reaches for the plain-English image a corp-dev partner would use ('they're shipping far more than usual,' 'the team doubled overnight,' 'they're building the thing competitors will copy in a year'), never a merge graph or a regression coefficient as the load-bearing explanation. An occasional code metaphor is fine as flavour; it can never be the thing the reader has to decode. 'Synergize the funnel' is still banned — so is anything that makes the reader feel he should already know what a commit graph is.
Rule 3 — Number, then claim. Never claim, then number.
Wrong: 'GitHub data is the most leading signal — we ran a panel of 219 startups.' Right: '219 startups, five quarters, median 31-day lead time. That's why we say GitHub data is the most leading signal.' Numbers up front earn the claim.
Rule 4 — Admit what we don't know in the same breath.
Every claim has a limit. Naming it before the reader does is the cheapest credibility move there is. 'False positive rate is 4%. We're trying to get it to 2% by Q4 but right now it's 4%.' Better than 'industry-leading accuracy'.
Rule 5 — No hype words.
Banned: 'unlock', 'leverage', 'game-changer', 'revolutionary', 'AI-powered' (without an actual model name), 'cutting-edge', 'next-generation'. If the sentence would survive the word being deleted, delete the word.
Rule 6 — Cliffhanger at the end of every email.
The P.S. previews tomorrow's email or the next chapter. The reader closes the browser still curious. That's the entire job of email #N — get them to open email #N+1.
Rule 7 — Never sign anyone else's name.
Every byline is 'The Data Nerd' or unsigned. There is no second persona. There is no team handle. Voice fragmentation breaks the character; one handle, every surface, forever.
07 · Catchphrases
Founder voices have 5–7 verbal tells the audience can quote. These are mine. If you’ve been reading for more than two months and one of these doesn’t sound familiar, the voice has drifted and someone owes you a refund.
“Trust the math, not me.”
“The methodology is the protagonist. I'm the storyteller.”
“Public data, private lens.”
“The deck lags the code by 21 to 47 days.”
“Code is more honest than copy.”
“If we're right, the data carries the argument.”
“Read the methodology before you trust the metric.”
08 · Where you’ll meet me
Email drip
Where: Days 0–180 in the welcome + daily-story sequences
How: Every email signs as The Data Nerd. P.S. previews the next.
YouTube
Where: Acceleration Watch (weekly), State-of-the-Engine talk (monthly), Data Nerd Brief (weekly)
How: Cartesia synthetic voice — the same voice across every video. No real-voice cameo, ever.
Manifesto + Origin + Founder pages
Where: /manifesto, /origin, /about/founder, /story
How: Long-form character: backstory, polarity, parables, flaws.
Page narrations
Where: /walkthrough, /predicted, /state-of-github audio companions
How: Synthetic voice reads the page. Disclosure on every player.
Weekly Sunday digest
Where: Free Acceleration Watch, every Sunday 06:00 UTC
How: Five startups, sector-tagged, signed The Data Nerd. The rhythm is the relationship.
Reply-to inbox
Where: signal@gitdealflow.com
How: Same handle. Two daily reply batches. No call-scheduling links.
09 · Live status — what I’m doing this week
The Sunday digest covers broadcast cadence; the monthly State of GitHub address covers the long-form. The /now page is the in-between — five fields, five minutes, every Monday. The cadence IS the character.
Open the /now page →
10 · Twelve months from now
The character has to project a future or it’s a static pose. These are the five public commits the narrator will be graded against on 2027-05-09. Either kept or admitted-broken-with-reason. No third option.
Commit 1 — Still anonymous.
No founder face, no real voice, no real-name media tour. If a podcast audience grows by 100K through breaking the rule, the rule still holds. The whole product rests on whether this commitment is kept; the day it breaks is the day the methodology has to compete with personality, and it loses.
Commit 2 — Twelve State-of-Engine addresses on the record.
One per month, every month, May 2026 → April 2027. Each one with a falsifiable prediction graded the following month. Twelve in a row is the cadence proof — eleven is a project, twelve is a practice.
Commit 3 — /scorecard published with at least 80 weekly picks graded.
Twelve months of Sunday digests gives ~52 grading windows by May 2027; at 4–5 picks per Sunday that is roughly 208–260 picks, of which at least 80 will have been graded. Hit/Miss/Pending public, no curation. If the published precision drops below 60% across the panel, the price drops with it: the credibility chain has to hold in both directions.
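The 60% rule above can be written down as a small Python sketch so the reader can see what is being promised. The 0.60 floor and the Hit/Miss/Pending labels come from the commitment itself; everything else is an assumption for illustration. In particular, the function names (published_precision, price_should_drop) are hypothetical, and treating precision as hits divided by hits plus misses, with Pending picks left out until they resolve, is one reasonable reading rather than the published formula.

```python
from collections import Counter

PRECISION_FLOOR = 0.60  # threshold stated in Commit 3


def published_precision(grades):
    """grades: iterable of 'hit', 'miss' or 'pending' labels, one per pick.

    Assumption: pending picks are excluded until they resolve.
    Returns None when nothing has been graded yet.
    """
    counts = Counter(g.strip().lower() for g in grades)
    graded = counts["hit"] + counts["miss"]
    if graded == 0:
        return None
    return counts["hit"] / graded


def price_should_drop(grades, floor=PRECISION_FLOOR):
    """True when graded precision has fallen below the floor."""
    precision = published_precision(grades)
    return precision is not None and precision < floor


if __name__ == "__main__":
    # Made-up scorecard for illustration: 80 graded picks, 10 still pending.
    scorecard = ["hit"] * 50 + ["miss"] * 30 + ["pending"] * 10
    print(published_precision(scorecard), price_should_drop(scorecard))
```

In plain English: count the picks that have been graded, see what share of them were right, and if that share slips under six in ten, the price comes down. Picks still waiting on an outcome sit out of the sum in this sketch.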
Commit 4 — One additional methodology author on the SSRN paper.
Co-author named — credit shared. Not because the work needs help (it doesn't) but because a methodology that lives in one anonymous head is one regression-rewrite away from breaking. A second name on the next preprint version is a continuity commitment to the buyer.
Commit 5 — Insider Circle at 200 paid members or the price drops.
Founding-member rate locked at €97/mo until 200 active subscribers, then a 60-day notice and a public price hike. The cohort closes when the math closes. Members who joined early stay at the locked rate forever.
Five breakout startups every Sunday — the engineering signal translated into plain English, 21 to 47 days before the deck circulates. No code-reading, no card.
11 · Where to go next
Three doors, by commitment level:
Character framing drawn from direct-response sales canon. Implemented under the anonymity rule (manifesto pillar #4) — handle, synthetic voice, methodology glyph, no real face/name/voice.