Flagship · 2024 – 2025 · SynergyBoat client
Text-to-SQL engine
Lets non-technical operators answer business questions without writing SQL, live in production across three databases.
§ 01 The problem
A non-technical operations team needed to answer business questions that lived across three databases — PostgreSQL for users and billing, MySQL for a legacy product system, and MongoDB for event logs. Every question cost an engineer’s day. The founder needed that engineer back.
The obvious answer — train operators to write SQL — had failed twice before. The subtle answer — a text-to-SQL engine that was actually trustworthy across three dialects — was what we built.
§ 02 What shipped
A text-to-SQL engine that auto-discovers schemas across all three databases, classifies user intent against a small catalogue of query shapes, resolves ambiguous values (e.g., “last week” vs. an ISO date range), and generates optimized queries that stay inside an 8K-token context budget.
Live in production. Operators now run their own reports. The engineering team reclaimed the hours previously lost to ad-hoc queries.
§ 03 How it works
Four stages. Intent classification maps a natural-language request to a small
set of query shapes — analytics-style aggregates, lookup by entity, time-series
comparisons, or “I don’t know, ask me more.” Schema retrieval uses pre-computed
digests so the prompt never carries a full schema dump. Value resolution
handles ambiguity before the model generates — “customers from last month”
becomes concrete dates, “premium users” becomes a resolved filter. Query
generation happens last, with the context budget enforced upstream so the model
never has to decide what to drop.
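The value-resolution stage can be sketched as a small function that turns relative date phrases into concrete ranges before the model ever sees the request. This is a minimal illustration, not the production resolver; the function name and the two handled phrases are assumptions for the example.

```python
from datetime import date, timedelta

def resolve_date_phrase(phrase: str, today: date) -> tuple[date, date]:
    """Map a relative date phrase to a concrete (start, end) range.

    Hypothetical sketch: the real resolver handles many more shapes
    (quarters, fiscal years, "premium users", entity aliases, ...).
    """
    if phrase == "last week":
        # Monday through Sunday of the previous ISO week.
        start = today - timedelta(days=today.weekday() + 7)
        return start, start + timedelta(days=6)
    if phrase == "last month":
        first_of_this_month = today.replace(day=1)
        end = first_of_this_month - timedelta(days=1)  # last day of prior month
        return end.replace(day=1), end
    raise ValueError(f"unresolved phrase: {phrase!r}")
```

Because resolution happens before generation, the SQL prompt only ever contains ISO dates, never ambiguous natural language.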
The key insight: every stage is a budgeting decision. Tokens are money and correctness simultaneously. Designing for the budget first made everything downstream simpler.
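Budget-first design can be made concrete with a sketch of prompt assembly: fixed allowances come off the top, and schema digests are admitted in relevance order until the remainder is spent. The constants, reserved splits, and word-count tokenizer here are illustrative stand-ins, not the real system's numbers.

```python
# Hypothetical sketch of budget-first prompt assembly. Only the 8K total
# is from the case study; the reserved splits are invented for illustration.
BUDGET = 8_000
RESERVED = {"system": 400, "question": 600, "answer": 1_500}

def tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly one token per word.
    return len(text.split())

def fit_digests(digests: list[str]) -> list[str]:
    """Admit pre-ranked schema digests until the remaining budget runs out."""
    remaining = BUDGET - sum(RESERVED.values())
    kept = []
    for digest in digests:  # digests arrive ranked by relevance to the query
        cost = tokens(digest)
        if cost > remaining:
            break  # enforced upstream: the model never sees an over-budget prompt
        kept.append(digest)
        remaining -= cost
    return kept
```

The point of the sketch is the ordering: trimming happens before generation, so the model is never asked to decide what to drop.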
§ 04 What I’d do differently
Caching the schema digests earlier would have saved weeks. The first version hit the databases on every query; observability eventually made it obvious, but the fix was a one-afternoon change that should have been there from day one.
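The fix amounts to a small TTL cache in front of schema introspection, so routine queries never touch the databases. A minimal sketch, with illustrative names and an assumed one-hour default:

```python
import time

class DigestCache:
    """Cache schema digests per database with a time-to-live.

    Hypothetical sketch of the one-afternoon fix described above.
    """

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, db_name: str, compute) -> str:
        now = time.monotonic()
        hit = self._store.get(db_name)
        if hit and now - hit[0] < self.ttl:
            return hit[1]       # fresh digest: no database round-trip
        digest = compute()      # expensive: introspects the live schema
        self._store[db_name] = (now, digest)
        return digest
```

Schemas change on deploys, not per query, so even a coarse TTL removes nearly all of the introspection traffic.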
Intent classification was also too ambitious early. The first version tried to handle open-ended requests; the working version has a clear “ask me more” branch that punts when confidence is low. That branch is the single biggest accuracy improvement we made.
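The “ask me more” branch is structurally simple: below a confidence floor, the engine returns a clarifying question instead of SQL. The threshold, names, and return shape below are assumptions for illustration.

```python
# Hypothetical sketch of the low-confidence punt. The floor value is
# illustrative, not the tuned production threshold.
CONFIDENCE_FLOOR = 0.7

def route(intent: str, confidence: float) -> dict:
    """Route a classified intent: generate a query, or ask for clarification."""
    if confidence < CONFIDENCE_FLOOR:
        return {
            "action": "clarify",
            "message": "Could you narrow that down, e.g. a date range or an entity?",
        }
    return {"action": "generate", "shape": intent}
```

Refusing to guess is what made the branch an accuracy win: every punted request is one the generator would otherwise have answered wrongly with confidence.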