Flow's transcription is the best in the category. That isn't the problem. The problem is what happens to your words after they're transcribed. Some vanish, some get quietly rewritten, and all of it gets less reliable the moment you start paying. Three cracks, one promise, and they're already showing up in your ratings.
↳ Eight words land. The ninth never reaches the cursor. No error, no undo, no trace. Just gone.
Curated and early reviews love you. Organic, recent ones don't. That gap is the sound of trust draining out after the trial ends, and it's widening: a sample of recent App Store reviews already averages closer to 4.1. The rest of this is why.
Transcription lands. Delivery doesn't. Words get lost in the last six inches between the model and your cursor, and the tell is that you've engineered an entire recovery layer for it. You don't build a Scratchpad that pops up when pasting fails, or a shortcut called "paste my last transcript," unless words go missing often.
Your help center, in its own words: there's an article titled "How to Fix Accidentally Using Your Password as Your Name." When delivery lands in the wrong field, Flow doesn't just lose text, it leaks it.
↳ docs: fix-text-not-pasting
This is the sharp one, because it's your flagship feature. The AI cleanup doesn't just remove the ums. Sometimes it edits you. Better grammar, softer meaning. For Slack it's fine. For an exec email or a product doc, where the exact wording is the artifact, a tool that rewrites your intent isn't polishing, it's distorting.
The nuance that makes it worse: for LLM prompts the polish is wasted (models handle fragments fine), and for high-stakes writing it's harmful. The same feature is either pointless or dangerous depending on context, and the user gets no control over which.
The trial is flawless. Then a month in, accuracy dips, "taking longer than usual" starts appearing, dictations hang. This is the pattern that craters the organic reviews, and it lands at the worst possible moment: right after someone has committed to a year.
It already has a name. A Medium piece months back called it "the Wispr Flow trust gap," and public status trackers logged repeated dictation-latency incidents this spring, including multi-day runs in late May and early June. The bot telling a paying customer to check their Wi-Fi is its own trust event.
Trust is the whole product. The magic is not having to check.
A "just talk" tool works only from the second a user stops re-reading what it wrote. That's the entire value. Every word that vanishes, every meaning that gets softened, every slow day after they paid teaches them to start checking again, and once they're checking, the magic is dead.
That's the mechanism behind the 2.7. By month two, enough small betrayals have stacked up that the user no longer trusts Flow with anything that matters, exactly when they're locked into a year of it.
And the model won't save you. The same open transcription models are already free and local in a dozen competitors. The only durable moat is being the tool people trust with their precise words. Right now, that's the thing slipping.
Your words. Exactly.
Every time. For as long as you pay.
The Scratchpad, "paste last transcript," the before-and-after compare in Polish settings, Android's auto-copy on failure. The components already exist. They're just buried in troubleshooting and framed as "here's how to recover when we fail." Surface them, default them, guarantee them, name them.
Make the safety net the default on every platform, not a help article. Verify the text actually landed instead of firing and forgetting. You already do exactly this on Android. Make it universal.
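Here's roughly what "verify, don't fire and forget" could look like. This is a minimal sketch, not Flow's code: `deliver`, `Scratchpad`, and `FakeField` are hypothetical names, and it assumes the app can read the target field back after inserting and that the text is appended at the end of the field.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Scratchpad:
    """Hypothetical always-on safety net that keeps undelivered transcripts."""
    saved: List[str] = field(default_factory=list)

    def save(self, text: str) -> None:
        self.saved.append(text)


def deliver(
    text: str,
    insert: Callable[[str], None],
    read_back: Callable[[], str],
    scratchpad: Scratchpad,
) -> bool:
    """Insert text, then confirm every word landed; keep a copy if it did not.

    Assumption: the insert appends at the end of the field, so a successful
    delivery means the field now equals its old contents plus the transcript.
    """
    before = read_back()
    try:
        insert(text)
    except Exception:
        # The paste itself failed: never drop the transcript silently.
        scratchpad.save(text)
        return False

    if read_back() == before + text:
        return True

    # Partial or missing delivery: save the full transcript for recovery.
    scratchpad.save(text)
    return False


class FakeField:
    """Stand-in for a text field; can simulate a dropped last word."""

    def __init__(self, drop_last_word: bool = False) -> None:
        self.value = ""
        self.drop_last_word = drop_last_word

    def insert(self, text: str) -> None:
        if self.drop_last_word:
            text = text.rsplit(" ", 1)[0]
        self.value += text

    def read(self) -> str:
        return self.value
```

The design choice that matters is the order: the transcript counts as delivered only after the read-back matches, and any mismatch routes the full text to the safety net by default instead of waiting for the user to notice.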
Promote the before/after you already built, and add a faithful mode so users pick fidelity by context. Dictate "We should not ship this," and it stays exactly that.
Close the trial-to-paid gap so paid never feels slower than the trial. Be transparent about incidents, and put a human on the support loop. Reliability is a promise, not a setting.
"The dictation tool that never loses your words, never changes your meaning, and never gets worse" is the most ownable claim in a category racing on model quality. It's also the only thing that pulls the 4.8 and the 2.7 back together.
I'm CJ. I use Flow every single day, and I co-founded a consumer iOS app that went from idea to a live App Store launch in a 7-day build sprint, owning product and design end to end. Your whole brand is a promise, so I went and pressure-tested it against your docs, your ratings, and your users. This isn't a bug list. It's a positioning bet you can ship.
This is the kind of problem I'd want to be in the room for. So I built the room a starting point.