Replacing Our SQL Parser
Our SQL parser was a third-party product we paid through the nose for. It ran on a modern toolchain (ANTLR), and it covered the dialects we needed (Snowflake, BigQuery, Fabric, Databricks). It was comprehensive, but it was slow, and most importantly it wasn't ours to change.
The initial POC was a rewrite I did as a midweek crunch: two days of long, late nights producing a parser that supported the same four dialects, plus the new custom annotation layer the product depends on. The POC ran 3–20× faster than the production parser on real customer SQL.
[Chart: parser throughput, relative to the production parser]
We never shipped this version.
What was wrong with the original parser?
The grammar wasn't ours. Although it fundamentally supported the four dialects we shipped against, we weren't able to customize it to our needs, and that was starting to bite us.
A bit of explanation. A new internal initiative was pushing a migration from YAML-defined metadata to SQL-defined metadata. SQL is the "native language" of our users, and we needed to move to a world in which it was natural to review and edit the underlying data model, as well as enable easy AI authoring. A Coalesce competitor, dbt, does something conceptually similar: they have the SQL of their node, plus a frontmatter that defines the node's metadata. It's also important to understand that part of the reason Coalesce uses YAML-defined metadata is that the SQL executed at table create and table "run" time (DDL and DML respectively) is a generated SQL string. Templates take the node metadata and generate the SQL.
Naturally, that's at odds with giving users the ability to edit the SQL itself, so a midway point was needed. The answer was to use something "SQL-like" to define the metadata. That is, you write a SQL select query, and from that we infer the metadata. Where a column is selected from an upstream table without a transform, its datatype can be inferred; where a transform is applied, the datatype can be specified. And so on.
What we end up with is an input that looks like SQL, feels like SQL, but isn't what executes. We also need a way of describing additional attributes for a given column, such as whether its values should be tested after the run, or whether the column is the business key. All per template, and all without writing both a DDL and a DML.
The answer, we decided, was "annotations". Columns could be followed by arbitrary annotations defined by the template, such as @businessKey. These annotations can take parameters, such as @test('notNull'), matching what our YAML could already support.
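To make that concrete, here is a rough sketch of the kind of input a user might write. The column names and annotation names are illustrative, not the product's actual template vocabulary, and it's wrapped in a TypeScript string only so the later sketches can consume it.

```typescript
// Hypothetical annotated, SQL-like node definition. Annotation names
// (@businessKey, @test) are illustrative; the real set is defined per template.
const nodeSource = `
  SELECT
    CUSTOMER_ID @businessKey @test('notNull'),
    ORDER_TOTAL,
    CREATED_AT
  FROM ORDERS
`;
```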
Finally, this meant we needed to solve the parser. We needed a way to take our SQL-like string, and turn it back into our structured data model (and vice versa).
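As a sketch of what that contract implies (the names and shapes here are my assumptions, not the product's real interfaces), the parser needs to be a two-way street between the annotated SQL and the node model:

```typescript
// Hypothetical shapes for the round trip: annotated SQL -> node model -> annotated SQL.
interface ColumnModel {
  name: string;
  dataType?: string;                          // inferred from upstream, or specified on a transform
  annotations: { name: string; args: string[] }[];
}

interface NodeModel {
  source: string;                             // the upstream table or node reference
  columns: ColumnModel[];
}

interface NodeSqlParser {
  parse(annotatedSql: string): NodeModel;     // SQL-like string -> structured metadata
  render(model: NodeModel): string;           // structured metadata -> SQL-like string
}
```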
The two-day rewrite
The rules I gave myself were simple: match the existing interface so we could swap the parsers behind a flag, and meet or exceed the existing parser's syntax support.
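In practice "swap behind a flag" looked something like the following sketch; the flag name and class names are hypothetical, but the point is that both parsers satisfy one interface.

```typescript
// A rough sketch of swapping implementations behind a feature flag.
interface SqlParser {
  parse(sql: string): unknown;
}

class ThirdPartyParserAdapter implements SqlParser {
  parse(sql: string) { /* wraps the licensed ANTLR-based parser */ return {}; }
}

class HandRolledParser implements SqlParser {
  parse(sql: string) { /* the new POC parser */ return {}; }
}

function createParser(flags: { newSqlParser: boolean }): SqlParser {
  // Same interface on both sides: callers never know which parser they got,
  // and rollback is just flipping the flag.
  return flags.newSqlParser ? new HandRolledParser() : new ThirdPartyParserAdapter();
}
```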
I covered the four dialects the product ships against, plus the layer of annotated SQL the product adds on top. The secret sauce is in how it was achieved.
/loop, /goal, and Ralph Wiggum
At the time, Claude didn't yet ship with /loop, nor Codex with /goal, but the precedent had recently been set. Ralph Wiggum was a technique in which a bash script repeatedly called Claude with the same prompt. The only requirement was that the prompt had an explicit exit condition. In our case, that exit condition was 100% green tests.
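The loop itself is almost embarrassingly small. Here's a minimal sketch in TypeScript rather than bash; the prompt, test command, and CLI invocation are assumptions about the setup, not a transcript of what I ran.

```typescript
// "Ralph Wiggum": call the agent with the same prompt until the exit
// condition (a fully green test suite) holds.
import { execSync } from "node:child_process";

const prompt = "Make every test under parser/tests pass. Do not modify the tests.";

while (true) {
  execSync(`claude -p ${JSON.stringify(prompt)}`, { stdio: "inherit" });
  try {
    execSync("npm test", { stdio: "inherit" }); // exit condition: 100% green
    break;
  } catch {
    // Still red: go around again with the exact same prompt.
  }
}
```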
The workflow that proved what I needed it to prove was simple: pull every SQL example we could find and build a test suite against it using the existing parser.
Therefore:
- Download the corpus of production SQL. Every single line of SQL.
- Run the corpus through the existing parser, using its output as the expected result (see the sketch after this list).
- Hand the test suite to Claude alongside documentation of the dialect grammars and the test suites for those grammars (to catch syntax our corpus didn't exercise), and tell Claude to make all the tests green.
- Loop until none were red.
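Step two is the interesting one: the old parser's output becomes the snapshot the new parser has to reproduce. A hedged sketch, assuming a corpus directory and an importable handle on the licensed parser:

```typescript
// Snapshot every corpus file through the existing parser; its output is the
// "expect" for the new parser. Paths and the oldParser import are assumptions.
import { readdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { oldParser } from "./third-party-parser";

const corpusDir = "corpus/sql";
for (const file of readdirSync(corpusDir)) {
  const sql = readFileSync(join(corpusDir, file), "utf8");
  const expected = oldParser.parse(sql);      // ground truth for the rewrite
  writeFileSync(join("corpus/expected", `${file}.json`), JSON.stringify(expected, null, 2));
}
```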
In reality, the longest part of shipping this POC was deciding on and producing the corpus. Beyond that, my POC also introduced syntax highlighting for our dialect in the Monaco editor, intellisense suggestions for available annotations, shortcuts to open nodes from node references, column datatype tooltips, and more.
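For a flavour of the Monaco side, completion for annotations is roughly a standard completion provider keyed on @. The language id and annotation list below are hypothetical; the Monaco calls are the ordinary public API.

```typescript
// Offer the template's annotations as completions when the user types "@".
import * as monaco from "monaco-editor";

const annotations = ["businessKey", "test", "changeTracking"]; // illustrative names

monaco.languages.registerCompletionItemProvider("coalesce-sql", {
  triggerCharacters: ["@"],
  provideCompletionItems(model, position) {
    const word = model.getWordUntilPosition(position);
    const range = new monaco.Range(
      position.lineNumber, word.startColumn,
      position.lineNumber, word.endColumn,
    );
    return {
      suggestions: annotations.map((name) => ({
        label: `@${name}`,
        kind: monaco.languages.CompletionItemKind.Function,
        insertText: name,
        range,
      })),
    };
  },
});
```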
Leveraging the corpus
The corpus was critical for more than just producing the original grammar; it was also key for performance testing. We could assess, across all of our real SQL, where the old parser was getting hung up and where our new parser was behind, and solve for that. Once again, Claude could be looped until it was reliably faster than the existing parser for all production use cases.
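The performance harness falls out of the same corpus. A rough sketch, assuming both parsers are importable and share a parse method:

```typescript
// Time both parsers over every corpus file and surface where the new one is
// still behind; those files become the next target for the loop.
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";
import { performance } from "node:perf_hooks";
import { oldParser } from "./third-party-parser";
import { newParser } from "./hand-rolled-parser";

const corpusDir = "corpus/sql";
for (const file of readdirSync(corpusDir)) {
  const sql = readFileSync(join(corpusDir, file), "utf8");

  const t0 = performance.now();
  oldParser.parse(sql);
  const t1 = performance.now();
  newParser.parse(sql);
  const t2 = performance.now();

  if (t2 - t1 > t1 - t0) {
    console.log(`${file}: new ${(t2 - t1).toFixed(1)}ms vs old ${(t1 - t0).toFixed(1)}ms`);
  }
}
```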
The corpus outlived the POC. Whatever ships now is held to the same standard.
In retrospect
The team that owns the production parser kept my architecture, the same stack, and the same test suite. They wrote their own parser rules, and skipped the Monaco integrations.
I wish they'd shipped my POC. The re-engineering wasn't free: the rebuild took a little longer than the POC did, yet the POC was clean enough to ship. I recognise that isn't the usual case for a POC, including most of mine; POCs are usually messy enough that "we'll just clean it up" is wishful thinking. We were fortunate that my POC only needed a little more love and a feature flag before shipping, but ultimately I was pulled onto other work before I could productionise it.
Key takeaway
When a paid third-party dependency lives in the hot path of your product, owning the rules is sometimes cheaper than the license fee, and can be critical to the business. Building a business against a technology you don't own, or don't control, is terrifying. You remain entirely at their mercy. If AI had not had the rise it did, Coalesce would never have spent the resources to replace the parser. In its foundations, Coalesce was built entirely against Snowflake. If Snowflake chose not to support our integration, we were done. If Snowflake chose to build our product internally, we were done. To some extent, they did.
I'm almost always in favour of using a library over engineering a solution oneself, and that's a little at odds with how AI is used these days. I think it comes down to one key thing: if you need to home roll today because a suitable library isn't available, then home roll. AI has made that easier than ever. Otherwise, keep sticking with libraries. We needed our own parser because we wanted custom syntax. If it had just been performance, we might have simply continued to complain instead.