The thesis
Every restaurant discovery app has the same problem. The recommendations are either algorithmic noise or crowd-sourced mediocrity. Google hands you 4.3 stars and "Great food!" from two hundred strangers. Yelp ranks by who complained loudest. TikTok surfaces whatever a 19-year-old filmed last Tuesday.
None of them tell you why a place matters. None of them say: skip the pasta, order the lamb shoulder, sit at the bar if you're solo, and don't bother on weekends because the wait kills the vibe. None of them treat a $12 taco cart and a $200 omakase with the same editorial seriousness.
That's the gap. BonVivant is what we built to close it — a discovery platform where the AI doesn't just find places. It has opinions about them, and it can defend them.
What "having taste" actually means in software
It's easy to write "AI with taste" on a landing page. It's harder to define what that means in a database.
For us it meant rejecting the default move — stuffing venue data into a prompt at query time and hoping for the best. We tried that. The results were grammatically correct and spiritually empty. Every restaurant sounded like every other restaurant. "A charming spot with great atmosphere" is not taste. It's the absence of taste, dressed up in adjectives.
So we inverted the problem. Instead of generating opinions on demand, we generate them carefully, ahead of time, in a pipeline that treats every venue the way a good editor would: research it, decide what's worth saying, write the piece, then have someone else read it and tell you if it's any good.
The pipeline
Every venue we cover passes through ten stages before it ever shows up in search.
We start with the raw stuff — photos, hours, addresses, the boring scaffolding. Then we classify: what kind of place is this, who's it for, how loud is the room, how expensive, what's the crowd. Then we write — generating editorial content in a consistent voice, with a specific point of view, including dish-level recommendations and the practical stuff (parking, wait times, when to go, when not to).
Then we judge.
The judge is the interesting part. After the writer drafts a piece, a separate AI reads it cold and scores it on four axes — specificity, accuracy, voice, usefulness — and rejects anything generic, factually shaky, or off-tone. Rejected pieces don't just get thrown out. The judge's specific feedback gets fed back into the writer for the next attempt. The writer learns, in a small way, from the editor.
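The loop is easier to see in code. What follows is a minimal sketch, not our production pipeline. The names (`Verdict`, `write_with_feedback`, `stub_writer`, `stub_judge`), the 1-5 scale, the pass threshold, and the three-attempt cap are all assumptions made for illustration, and the two stubs stand in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# The four axes named in the post.
AXES = ("specificity", "accuracy", "voice", "usefulness")


@dataclass
class Verdict:
    scores: dict      # axis name -> score on an assumed 1-5 scale
    feedback: str     # the judge's specific notes for the writer

    def approved(self, threshold: int = 4) -> bool:
        # Assumed rule: every axis has to clear the threshold.
        return all(self.scores[axis] >= threshold for axis in AXES)


def write_with_feedback(venue: dict,
                        writer: Callable[[dict, list], str],
                        judge: Callable[[dict, str], Verdict],
                        max_attempts: int = 3) -> Optional[str]:
    """Draft, judge, and retry with the judge's notes until approved."""
    notes = []  # judge feedback accumulated across attempts
    for _ in range(max_attempts):
        draft = writer(venue, notes)
        verdict = judge(venue, draft)  # the judge sees only venue + draft
        if verdict.approved():
            return draft
        notes.append(verdict.feedback)  # rejected: coach the next attempt
    return None  # nothing publishable within the attempt budget


# Hypothetical stand-ins for the writer and judge models.
def stub_writer(venue, notes):
    if not notes:
        return f"{venue['name']} is a charming spot with great atmosphere."
    return f"At {venue['name']}, skip the pasta and order the {venue['best_dish']}."


def stub_judge(venue, draft):
    specific = venue["best_dish"] in draft
    scores = {axis: 5 for axis in AXES}
    scores["specificity"] = 5 if specific else 2
    feedback = "" if specific else "Too generic: name a dish worth ordering."
    return Verdict(scores, feedback)


venue = {"name": "Example Bistro", "best_dish": "lamb shoulder"}
final = write_with_feedback(venue, stub_writer, stub_judge)
print(final)
```

With these stubs, the first draft is the generic one, the judge rejects it on specificity, and the second draft passes because the writer has seen the note. The design choice worth noticing is that a rejection returns text, not just a boolean.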
That feedback loop turned out to matter more than any individual model choice. Our first pass had an editorial rejection rate of 85% — most of what got written was mush. With the feedback loop in place, it's down to 38%, and the failures are now interesting failures instead of lazy ones.
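For a rough sense of what those two numbers mean in drafts per published piece, here is a back-of-the-envelope calculation. It rests on a simplifying assumption that does not strictly hold for a feedback loop: that each attempt is rejected independently with the same probability. Treat the output as an order-of-magnitude illustration, not a measurement; `expected_attempts` is a name made up for this example.

```python
def expected_attempts(rejection_rate: float) -> float:
    """Mean drafts per approved piece under a geometric model.

    Assumes every attempt is rejected independently with the same
    probability, which is a simplification.
    """
    return 1.0 / (1.0 - rejection_rate)


before = expected_attempts(0.85)  # rejection rate on the first pass
after = expected_attempts(0.38)   # rejection rate with the feedback loop

print(f"before: {before:.2f} drafts per approved piece")
print(f"after:  {after:.2f} drafts per approved piece")
```

Under that assumption, the figure drops from roughly 6.7 drafts per approved piece to roughly 1.6.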
Why offline beats real-time
The conventional wisdom in AI products is to do as much as possible at query time. User asks a question, you retrieve relevant documents, you generate a response, you serve it. That's retrieval-augmented generation, or RAG.
We went the other way. The expensive, thoughtful work happens once, in advance, per venue. The lookup at query time is fast, cheap, and grounded — the model isn't improvising recommendations, it's referencing editorial content that's already been written, judged, and approved.
The tradeoff is obvious. It's slower to add a new city, and every venue costs more to onboard than it would in a pure-RAG setup. The payoff is that no recommendation in the user-facing chat is invented on the spot. When BonVivant tells you to order the lamb shoulder, it's because someone (well, something) already decided that was the right call, and another something agreed.
This is the pattern we keep coming back to: do the slow, careful thinking offline. Serve the answer fast. Almost everything else in the system is downstream of that decision.
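Here is the shape of that split as a minimal sketch. It assumes an in-memory dictionary where a real system would use a database, and every name in it (`Editorial`, `EditorialStore`, `publish`, `lookup`, `answer`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Editorial:
    venue_id: str
    city: str
    summary: str        # text that was written and judged offline
    order_this: tuple   # dish-level recommendations


class EditorialStore:
    """Holds only approved pieces; a dict stands in for a database."""

    def __init__(self):
        self._by_id = {}

    def publish(self, piece: Editorial) -> None:
        # Offline step: called once per venue, after the judge approves.
        self._by_id[piece.venue_id] = piece

    def lookup(self, venue_id: str) -> Optional[Editorial]:
        # Query-time step: a plain key lookup, no generation involved.
        return self._by_id.get(venue_id)


def answer(store: EditorialStore, venue_id: str) -> str:
    piece = store.lookup(venue_id)
    if piece is None:
        # Nothing approved for this venue, so say so rather than improvise.
        return "We haven't covered that place yet."
    return f"{piece.summary} Order: {', '.join(piece.order_this)}."


store = EditorialStore()
store.publish(Editorial("v1", "San Diego", "Sit at the bar if you're solo.",
                        ("lamb shoulder",)))
print(answer(store, "v1"))
print(answer(store, "unknown"))
```

The query path can only return what the offline path has already published, which is the property the whole design is built around. In a chat product the stored piece would be handed to the model as grounding rather than printed verbatim, but the boundary is the same.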
What this blog is for
This is where we'll write about the engineering. Not "we used framework X" tutorials — there are enough of those — but the decisions that turned out to matter. What we tried, what failed, what surprised us, and the patterns we think are worth stealing.
Coming up:
- LLM-as-judge with feedback loops. How we got the rejection rate from 85% down to 38% by treating the judge as a writing coach, not a gatekeeper.
- Pre-computed enrichment. Why we do the expensive AI work offline instead of at query time, and when you shouldn't.
- Multi-city from day one. Designing a data model that scales to new cities without code changes — and the parts of that we got wrong the first time.
- Editorial voice as a system prompt. Teaching a model to have opinions without sounding like a Yelp reviewer.
We're a small team building in the open. If you're working on AI content systems, discovery products, or just like reading about how things actually get built — pull up a chair.
BonVivant launches first in San Diego. The Journal is where the editorial lives; this is where the engineering does.