Flagship · 2024 – 2025 · SynergyBoat client
Text-to-SQL engine
Lets non-technical operators answer business questions without writing SQL, live in production across three databases.
§ 01 The problem
A non-technical operations team needed to answer business questions that lived across three databases — PostgreSQL for users and billing, MySQL for a legacy product system, and MongoDB for event logs. Every question cost an engineer’s day. The founder needed that engineer back.
The obvious answer — train operators to write SQL — had failed twice before. The subtle answer — a text-to-SQL engine that was actually trustworthy across three dialects — was what we built.
§ 02 What shipped
A text-to-SQL engine that auto-discovers schemas across all three databases, classifies user intent against a small catalogue of query shapes, resolves ambiguous values (e.g., “last week” vs. an ISO date range), and generates optimized queries that stay inside an 8K-token context budget.
Live in production. Operators now run their own reports. The engineering team reclaimed the hours previously lost to ad-hoc queries.
§ 03 How it works
Four stages. Intent classification maps a natural-language request to a small
set of query shapes — analytics-style aggregates, lookup by entity, time-series
comparisons, or “I don’t know, ask me more.” Schema retrieval uses pre-computed
digests so the prompt never carries a full schema dump. Value resolution
handles ambiguity before the model generates — “customers from last month”
becomes concrete dates, “premium users” becomes a resolved filter. Query
generation happens last, with the context budget enforced upstream so the model
never has to decide what to drop.
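The value-resolution stage can be sketched as a small function that turns relative date phrases into concrete ranges before the model ever sees the request. This is a minimal illustration, not the production resolver; the function name and the two handled phrases are assumptions for the example.

```python
from datetime import date, timedelta

def resolve_date_phrase(phrase: str, today: date) -> tuple[date, date]:
    """Map a relative date phrase to a concrete (start, end) range.

    Hypothetical sketch: the real resolver handles many more shapes
    (quarters, fiscal years, "premium users", entity aliases, ...).
    """
    if phrase == "last week":
        # Monday through Sunday of the previous ISO week.
        start = today - timedelta(days=today.weekday() + 7)
        return start, start + timedelta(days=6)
    if phrase == "last month":
        first_of_this_month = today.replace(day=1)
        end = first_of_this_month - timedelta(days=1)  # last day of prior month
        return end.replace(day=1), end
    raise ValueError(f"unresolved phrase: {phrase!r}")
```

Because resolution happens before generation, the SQL prompt only ever contains ISO dates, never ambiguous natural language.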
The key insight: every stage is a budgeting decision. Tokens are money and correctness simultaneously. Designing for the budget first made everything downstream simpler.
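Budget-first design can be made concrete with a sketch of prompt assembly: fixed allowances come off the top, and schema digests are admitted in relevance order until the remainder is spent. The constants, reserved splits, and word-count tokenizer here are illustrative stand-ins, not the real system's numbers.

```python
# Hypothetical sketch of budget-first prompt assembly. Only the 8K total
# is from the case study; the reserved splits are invented for illustration.
BUDGET = 8_000
RESERVED = {"system": 400, "question": 600, "answer": 1_500}

def tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly one token per word.
    return len(text.split())

def fit_digests(digests: list[str]) -> list[str]:
    """Admit pre-ranked schema digests until the remaining budget runs out."""
    remaining = BUDGET - sum(RESERVED.values())
    kept = []
    for digest in digests:  # digests arrive ranked by relevance to the query
        cost = tokens(digest)
        if cost > remaining:
            break  # enforced upstream: the model never sees an over-budget prompt
        kept.append(digest)
        remaining -= cost
    return kept
```

The point of the sketch is the ordering: trimming happens before generation, so the model is never asked to decide what to drop.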
§ 04 What I’d do differently
Caching the schema digests earlier would have saved weeks. The first version hit the databases on every query; observability eventually made it obvious, but the fix was a one-afternoon change that should have been there from day one.
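The fix amounts to a small TTL cache in front of schema introspection, so routine queries never touch the databases. A minimal sketch, with illustrative names and an assumed one-hour default:

```python
import time

class DigestCache:
    """Cache schema digests per database with a time-to-live.

    Hypothetical sketch of the one-afternoon fix described above.
    """

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, db_name: str, compute) -> str:
        now = time.monotonic()
        hit = self._store.get(db_name)
        if hit and now - hit[0] < self.ttl:
            return hit[1]       # fresh digest: no database round-trip
        digest = compute()      # expensive: introspects the live schema
        self._store[db_name] = (now, digest)
        return digest
```

Schemas change on deploys, not per query, so even a coarse TTL removes nearly all of the introspection traffic.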
Intent classification was also too ambitious early. The first version tried to handle open-ended requests; the working version has a clear “ask me more” branch that punts when confidence is low. That branch is the single biggest accuracy improvement we made.
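The “ask me more” branch is structurally simple: below a confidence floor, the engine returns a clarifying question instead of SQL. The threshold, names, and return shape below are assumptions for illustration.

```python
# Hypothetical sketch of the low-confidence punt. The floor value is
# illustrative, not the tuned production threshold.
CONFIDENCE_FLOOR = 0.7

def route(intent: str, confidence: float) -> dict:
    """Route a classified intent: generate a query, or ask for clarification."""
    if confidence < CONFIDENCE_FLOOR:
        return {
            "action": "clarify",
            "message": "Could you narrow that down, e.g. a date range or an entity?",
        }
    return {"action": "generate", "shape": intent}
```

Refusing to guess is what made the branch an accuracy win: every punted request is one the generator would otherwise have answered wrongly with confidence.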