Waitlist is not demand: 3 false signals founders mistake for intent

Validation Methods


By Gregor The Builder · Apr 21, 2026 · 11 min read

Five Reddit threads in five days, same shape each time. A founder collects what looks like interest (27 early-access signups here, 2,500 visitors there, 15 years of "great idea" feedback from friends in another), ships the product, and watches the Stripe dashboard flatline. Every thread ends with the same question: "what went wrong with my marketing?" It's the wrong question.

The signal was wrong before the launch. Waitlists, social likes, and "I'd love to try it" comments aren't demand. They're curiosity receipts. Most founders don't have a framework for telling the two apart, so they read encouragement as evidence and build the wrong thing. Below are the three false signals, the mechanism behind each, and a weekend protocol for reading real intent before you write code.

Key Takeaways

  • Waitlist-to-paid conversion on cold traffic sits at 2-8% (Getwaitlist, 2024). Waitlists measure willingness to give an email, not willingness to pay.
  • Friend and family feedback is selection-biased. Family-and-friends-funded founders produced 53% fewer patents than professionally-backed ones (Zaccaria, Journal of Corporate Finance, 2023).
  • Stated intent predicts behavior weakly. A meta-analysis of 47 experiments found d+ = 0.36 (Webb and Sheeran, 2006).
  • The 11-point Juster purchase-probability scale with verbal anchors outperforms simple Likert intent scales at predicting adoption (Brennan, Massey Marketing Bulletin, 1994).
  • You can triage 10 concepts in a weekend with synthetic panels, then interview survivors, then A/B test the winner.

Why most startup failure is a demand problem, not an execution problem

Roughly 43% of failed venture-backed startups cite product-market-fit failure as the primary cause of shutdown (CB Insights, 2026). The much-quoted "ran out of cash" shows up in around 70% of postmortems, but that's a symptom. You run out of cash because no one buys at a price that covers the build. Demand is the upstream problem.

When conversion is zero, the instinct is to blame distribution. Retarget the ads, hire a growth contractor, rewrite the landing page. That instinct skips the first question: is anyone, anywhere, paying this price for this thing? If the answer's no, the marketing budget goes to waste. You can't out-distribute a product people don't want.

Skipping that question isn't free. MVP builds today run $40,000-$150,000 across reputable dev shops (SpdLoad, 2025), or 3-6 months of solo-founder time. That's the real price of a false positive on a waitlist. You pay it before anyone has to pay you.

Back to those five Reddit threads. All of them share one feature. The founder shipped based on a signal that didn't cost the respondent anything. An email costs nothing. A thumbs-up costs nothing. A "yeah I'd try it" over coffee costs nothing. The thing you're asking them to do later costs $29 a month. That gap between "no cost" and "$29" is where your pipeline dies.

Citation capsule. Around 43% of failed venture-backed startups name product-market-fit failure as their primary shutdown reason, across 385 identifiable-cause postmortems since 2023 (CB Insights, 2026). Cash runs out because demand didn't materialize. That's a validation problem, not a marketing problem, and it's answerable before you build.

Read the full product validation procedure if you want the end-to-end flow this post sits inside.

False signal #1: the curiosity signup

On most pre-launch products, waitlists convert to paid at 2-8% of cold traffic. The ~20% number operators like to quote is reserved for strong-PMF cases (Getwaitlist, 2024). Median SaaS visit-to-signup on a landing page is just 3.8% (Unbounce, 2024). Stack the two and the funnel bleeds out fast.

A waitlist measures one thing cleanly: willingness to hand over an email for a future maybe. That action costs near zero. Buying your product later doesn't. A signal collected at zero cost can't predict a decision made at real cost. That's the mechanism.

What the downstream funnel actually does

Even the committed-enough-to-try signal barely survives the trial gate. Median trial-to-paid across SaaS runs 24.8% (SaaS benchmarks 2024-2025, ChartMogul). Those are people who already installed something, logged in, and used it. A waitlist signup hasn't done any of that. If a quarter of tried-it users convert, what's the rate for people who only typed an email?
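To make the gap concrete, here's a back-of-the-envelope version of that stacked funnel using the medians cited above. The 1,000-visitor starting point is an illustrative assumption, and the trial comparison imagines the same signups starting a trial instead of joining a waitlist; treat it as a sketch, not a benchmark.

# Back-of-the-envelope funnel math using the medians cited in this post.
# The 1,000-visitor starting point is an illustrative assumption.
visitors = 1_000
visit_to_signup = 0.038                                   # median SaaS landing-page signup rate (Unbounce, 2024)
waitlist_to_paid_low, waitlist_to_paid_high = 0.02, 0.08  # cold-traffic waitlist-to-paid range (Getwaitlist, 2024)
trial_to_paid = 0.248                                     # median SaaS trial-to-paid (ChartMogul, 2024-2025)

signups = visitors * visit_to_signup                      # ~38 emails collected
paid_low = signups * waitlist_to_paid_low                 # ~0.8 paying customers
paid_high = signups * waitlist_to_paid_high               # ~3 paying customers
paid_if_trialing = signups * trial_to_paid                # ~9.4 customers, if those same 38 had started a trial

print(f"waitlist route: {paid_low:.1f}-{paid_high:.1f} paying customers")
print(f"trial route:    {paid_if_trialing:.1f} paying customers")

The exact figures aren't the point. The point is that each zero-cost step multiplies away most of the list.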

[Image: a line of people outside a shop with a "coming soon" sign, while only two or three are actually buying inside.]

Citation capsule. Waitlist conversion to paid runs 2-8% on cold traffic (Getwaitlist, 2024), while median SaaS trial-to-paid sits at 24.8% (SaaS benchmarks 2024-2025, ChartMogul). The cost-of-entry gap is the mechanism. Email is free to give. Dollars aren't. A zero-cost signal can't predict a real-cost decision.

See how to measure purchase intent properly for the scales that actually track buying behavior.

False signal #2: friends, network, and the people who already like you

Family-and-friends-funded founders produced 53% fewer patents than professionally-backed ones (Zaccaria, Journal of Corporate Finance, 2023). The finding is about capital, but the signal is about feedback. People who already like you don't pressure-test your idea. They applaud it.

"I showed 15 friends and they all loved it" isn't evidence. It's the opposite. Friends are selection-biased, they're almost never your target customer, and the social contract around showing ideas rewards politeness over accuracy. Nobody at a dinner table says "I wouldn't pay for that." They say "sounds great, send me the link." The link gets starred, not clicked.

The mechanism is the protocol, not the person

It's not a character problem. It's a protocol problem. You asked the wrong people a question with no stakes attached. A real answer needs three things your network can't give you: a respondent in the actual target segment, a measurement scale that separates "interested" from "would pay," and enough friction in the asking that respondents pause before answering.

Your sister-in-law isn't going to pause. She's going to say it sounds great because she loves you. That's a good thing. It's just not data.

[Image: a founder showing a phone mockup to family at the dinner table while target-audience strangers walk past the window.]

Citation capsule. A 2023 study in the Journal of Corporate Finance found family-and-friends-funded founders produced 53% fewer patents than professionally-backed ones (Zaccaria, 2023). The mechanism generalizes to feedback, too. Friendly capital and friendly feedback both soften pressure. Softer pressure produces weaker ideas and weaker signals.

False signal #3: soft-commit language and the intent-behavior gap

A meta-analysis of 47 experimental studies found that even a medium-to-large shift in stated intent produces only a small-to-medium shift in actual behavior, with an average effect size of d+ = 0.36 (Webb and Sheeran, Psychological Bulletin, 2006). In plain terms, the gap between "I'd buy this" and "I bought this" is structural, not anecdotal. It's the rule, not the exception.

Most validation advice blames execution when a product fails. The diagnosis is usually one layer deeper. The founder didn't collect demand, they collected language. "I'd love to try it" and "I'll buy it" are linguistically similar and predictively opposite. If your pre-launch evidence is a pile of soft-commit quotes, you didn't measure intent. You measured niceness.

The asking protocol changes the answer

The Juster purchase-probability scale (11 points with verbal anchors from "certain, practically sure" to "no chance, almost no chance") outperforms simple Likert intent scales at predicting adoption (Brennan, Massey Marketing Bulletin, 1994). On fast-moving goods, Juster scale predictions track actual purchases within 5-8 percentage points. Same respondents, different scale, better prediction.

There's a second effect underneath it. Chandon, Morwitz and Reinartz found that the act of asking about intent itself raises the intent-behavior correlation by 58% in surveyed groups versus matched non-surveyed controls (Chandon et al., Journal of Marketing, 2005). They call it self-generated validity. The measurement distorts the thing measured.

So asking matters twice over. Once because different scales produce different predictive power. Once because the act of scoring pins the respondent to a position they'd otherwise drift past. If you're going to ask, ask on a scale that's been validated.
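As a sketch of how a scored protocol turns answers into a demand estimate: the snippet below uses a common reading of the Juster scale, where each point maps to a rough purchase probability (0 is about 1 in 100, 1 through 9 are that many chances in 10, 10 is about 99 in 100). The responses are made up for illustration, and the mapping itself is an assumption to check against the scale wording you actually field.

# Minimal sketch: turning 0-10 Juster answers into an expected purchase rate.
# The probability mapping below is an assumption based on the scale's usual
# verbal anchors; the responses are made up for illustration.

def juster_probability(score: int) -> float:
    """Map a 0-10 Juster answer to an approximate purchase probability."""
    if score == 0:
        return 0.01   # "no chance, almost no chance"
    if score == 10:
        return 0.99   # "certain, practically sure"
    return score / 10

responses = [2, 7, 0, 5, 9, 3, 1, 6, 0, 4]   # hypothetical answers from one segment
expected_rate = sum(juster_probability(r) for r in responses) / len(responses)

print(f"expected purchase rate: {expected_rate:.0%}")   # 37% for this made-up sample

Averaging mapped probabilities is what makes the 11-point format useful: a pile of "probably" answers becomes a number you can compare across concepts instead of a vibe.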

Citation capsule. Webb and Sheeran's 2006 meta-analysis of 47 intent-behavior experiments produced an average effect size of d+ = 0.36 (Psychological Bulletin, 2006). The Juster 11-point scale outperforms simple Likert intent scales by a clear margin on adoption prediction (Brennan, Massey Marketing Bulletin, 1994). The asking protocol matters more than the asking itself.

What a real demand signal actually looks like

Opt-out trials (card required up front) convert to paid at 48.8% versus 18.2% for opt-in trials (First Page Sage, 2025). Same product, same audience, wildly different conversion. The difference is stakes. A real demand signal has stakes in the asking, a measurement protocol with known predictive power, and a respondent who's actually in the target segment.

Three ways to add stakes without building

Three cheap options exist before you write code. A paid pre-order is the highest-stake, lowest-volume read. A card-required trial on a landing page sits in the middle. A scored purchase-intent study on a structured panel with defined segments gives you the highest volume at the lowest per-answer cost. Pick by your speed and budget, not by tradition.

Any of the three can kill a concept before you commit build time. That's the job. A pre-build protocol isn't about proving success. It's about disqualifying failure cheaply, at the step where disqualification costs hours instead of months. If your idea survives a $5 synthetic panel, a $200 card-required landing test, and a handful of interviews, you've earned the right to build.

Citation capsule. Opt-out trials convert at 48.8% to paid, opt-in trials at 18.2% (First Page Sage, 2025). Stakes in the asking sharpen the signal by more than 2x. The same principle scales down: paid pre-orders, card-required trials, and scored intent panels all outperform zero-cost signups because they force a small real decision up front.

Compare across nine methods in the 2026 market research methods decision matrix.

A weekend pre-build protocol for reading real intent

You can run a full pre-build intent read in a weekend and a half. Step one, write 3-5 concept variants of your idea. Step two, define 2-3 target segments. Step three, run scored purchase-intent questions across each variant × segment combination on a validated scale like Juster or a 5-point purchase-intent scale with anchored wording. Step four, interview the top-scoring survivors. Step five, A/B test the winner's landing page.
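Here's what step three can look like mechanically, as a minimal sketch. The concepts, segments, and dummy scores are invented for illustration, get_panel_scores is a placeholder for whatever synthetic panel or survey tool you actually use, and keeping the top three cells is an arbitrary cutoff, not a rule.

# Sketch of the variant × segment triage in step three. `get_panel_scores` is a
# placeholder for your panel or survey tool; everything else is illustrative.
from itertools import product
from statistics import mean
import random

concepts = ["concept_a", "concept_b", "concept_c"]    # your 3-5 variants
segments = ["busy_parents", "freelance_designers"]    # your 2-3 target segments

def juster_probability(score: int) -> float:
    # Same mapping as the Juster sketch above.
    return 0.01 if score == 0 else 0.99 if score == 10 else score / 10

def get_panel_scores(concept: str, segment: str) -> list[int]:
    # Placeholder: swap in real 0-10 answers from your panel or survey tool.
    return [random.randint(0, 10) for _ in range(50)]

results = {
    (c, s): mean(juster_probability(x) for x in get_panel_scores(c, s))
    for c, s in product(concepts, segments)
}

# Rank the cells and keep only the strongest for recruited interviews (step four).
survivors = sorted(results.items(), key=lambda kv: kv[1], reverse=True)[:3]
for (concept, segment), rate in survivors:
    print(f"{concept} x {segment}: expected purchase rate {rate:.0%}")

The output is a shortlist, not a verdict. The interviews and the A/B test in steps four and five are what turn a shortlist into a decision.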

Each step is cheap on its own. Synthetic panels run in hours at $5-$100 per concept. Recruited interviews take about a week if you pay for recruiting. Landing-page A/B tests take days once traffic's flowing. You're trading a weekend plus a follow-up week for the price of a single month of wasted dev time.

Where synthetic fits (and where it doesn't)

Generative-agent simulation research built from interviews with 1,000+ participants found that LLM-simulated consumer panels replicate about 85% of the participants' own test-retest accuracy on survey answers (Park et al., 2024). That's triage-grade, not courtroom-grade. A 2025 mega-study of 19 experiments across 164 outcomes reported an average twin-human correlation near 0.2 (arXiv 2509.19088, 2025), and synthetic answers tend to compress variance relative to human panels. Synthetic is useful for triage, not final answers.

That caveat is why the protocol has five steps instead of one. Synthetic reads kill the obvious losers and rank the maybes. Recruited interviews tell you why the survivors scored well. A/B tests confirm the winning concept converts real traffic. Each method covers a weakness of the one above it. Read more on synthetic consumer research and where it fits.

Citation capsule. Generative-agent research showed LLM consumer twins replicate ~85% of human test-retest accuracy (Park et al., 2024), while a 2025 digital-twin mega-study reported an average twin-human correlation near 0.2 across 164 outcomes (arXiv 2509.19088, 2025). Synthetic panels are a legitimate first step for triage. They aren't a replacement for recruited humans on the decisions that matter.

Frequently asked questions

Is a big waitlist proof of demand?

No. Waitlist signups cost a user an email. Buying your product costs them money. Waitlist-to-paid conversion runs 2-8% on cold traffic (Getwaitlist, 2024), and only strong-PMF cases hit the ~20% number operators like to quote. A big list is a curiosity receipt, not a pre-sold pipeline.

What's the difference between a Likert and a Juster purchase-intent scale?

A 5-point Likert asks "how likely are you to buy" on scale points with no probability anchors. A Juster scale uses 11 points with verbal probability anchors from "certain, practically sure" to "no chance, almost no chance." Juster outperforms Likert by a clear margin on adoption prediction (Brennan, Massey Marketing Bulletin, 1994).

How do I know if feedback from friends is reliable?

You don't, and you can't fix it. Run the idea past strangers in your target segment using a scored protocol. Treat friend feedback as a gut-check, not a signal. Family-and-friends-funded founders produced 53% fewer patents than professionally-backed ones (Zaccaria, Journal of Corporate Finance, 2023). Friendly environments produce softer outputs.

Can AI-simulated consumer panels replace real market research?

No. A 2025 digital-twin mega-study found average twin-human correlation near 0.2 across 164 outcomes (arXiv 2509.19088, 2025). Synthetic panels are useful for triage and upstream concept testing, not for final go/no-go decisions. Recruited interviews, A/B tests, and human panels stay in the stack below them.

The cost of a false positive

The expensive mistake isn't running the wrong validation method. It's trusting the cheapest possible signal (an email, a "sounds great," a thumbs-up) and then spending $40,000-$150,000 (SpdLoad, 2025) or six months of your life acting on it. That's the price of treating curiosity as intent.

Waitlists measure willingness to give an email. Friends measure willingness to be kind. Soft-commit language measures willingness to be polite. None of those measure willingness to pay. The good news is that willingness to pay is measurable before you build. A validated intent scale, a small stake in the asking, and a respondent in the real target segment are enough to separate ideas worth building from ones that only sound good at dinner.

If you want to see what a scored synthetic intent read looks like on a real concept, that's the step of the protocol that will.it.sell handles. Use it first, then interview the survivors, then A/B test the winner. Don't skip the steps that come after. Start with the validation procedure and work backward to the step you need first.

Gregor is building will.it.sell, a pre-revenue synthetic consumer research tool for B2C product teams. All stats in this post are linked to their primary sources.

Stop guessing. Start knowing.

Your first product validation is free. Get your report in minutes.

