Cursor Model ROI

Cursor exposes a growing matrix of models: frontier names, mid-tier workhorses, fast cheap options, and vendor-specific cache pricing that only shows up once you read the footnotes. The official table is accurate, but it is not optimized for the question I keep asking: if I am going to burn tokens on this repo today, which default actually makes sense?

List price per million tokens is a start, but it is not the whole picture. Some models are expensive on paper and cheap in practice because your prompt stack hits cache reads. Others look affordable until you notice output pricing or Max Mode multipliers. Arena-style benchmarks are an imperfect proxy for coding quality, yet they are still a useful sanity check when two models look similar on price.

Cursor Model ROI is a small React app that puts those dimensions beside each other in one sortable grid so you can compare without juggling three browser tabs. The app's one-line pitch: what you actually get for the token — pricing, arena signals where we have them, speed tiers, and cache economics spelled out rather than hand-waved.

What you see in the UI

The main surface is a comparison table. Each row is a Cursor-supported model bundle: published input and output rates, optional cache write and read lines, and joined metadata from other JSON snapshots the refresh job maintains. You can sort by the columns that matter for your question — sometimes that is raw output price, sometimes it is a derived score that folds cache in, sometimes it is "how fast does Cursor classify this thing."
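One such derived score, a blended price that folds cache reads in, can be sketched like this. Everything here is illustrative: the rates, the cache hit rate, and the field names are assumptions for the example, not Cursor's actual export schema.

```typescript
interface ModelPricing {
  input: number;      // USD per 1M fresh input tokens (illustrative)
  cachedRead: number; // USD per 1M cache-read tokens (illustrative)
  output: number;     // USD per 1M output tokens (illustrative)
}

// Blended USD per 1M tokens for a request mix, given the fraction of
// input tokens expected to hit the prompt cache and the share of the
// mix that is output tokens.
function blendedPrice(
  p: ModelPricing,
  cacheHitRate: number,
  outputShare: number,
): number {
  const inputShare = 1 - outputShare;
  const effectiveInput =
    p.input * (1 - cacheHitRate) + p.cachedRead * cacheHitRate;
  return effectiveInput * inputShare + p.output * outputShare;
}

// Made-up flagship rates with a steep cache-read discount:
const flagship: ModelPricing = { input: 15, cachedRead: 1.5, output: 25 };

blendedPrice(flagship, 0, 0.2);   // no cache hits: 17.00 per 1M
blendedPrice(flagship, 0.9, 0.2); // 90% cached input: about 7.28 per 1M
```

The point of a score like this is exactly the "expensive on paper, cheap in practice" effect: the same model more than halves in blended cost once most of its input lands on cache reads.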

Opening a row launches a detail drawer with the full breakdown and notes pulled from the underlying export (promotions, context window limits, doc caveats). Above the table, filters trim the list when you only care about a provider or a price band. None of this replaces reading your own traces, but it does replace guessing from memory when someone asks "why are we still defaulting to X?"
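The filters behave like AND-combined predicates, so each control only narrows the list. A minimal sketch, with a made-up row shape rather than the app's real schema:

```typescript
interface Row {
  model: string;
  provider: string;
  outputPrice: number; // USD per 1M output tokens
}

interface Filters {
  provider?: string;  // keep only this provider when set
  maxOutput?: number; // keep only rows at or under this price when set
}

function applyFilters(rows: Row[], f: Filters): Row[] {
  return rows.filter(
    (r) =>
      (f.provider === undefined || r.provider === f.provider) &&
      (f.maxOutput === undefined || r.outputPrice <= f.maxOutput),
  );
}

// Illustrative rows; prices echo the snapshot below.
const rows: Row[] = [
  { model: "Composer 2", provider: "Cursor", outputPrice: 2.5 },
  { model: "Claude 4.5 Haiku", provider: "Anthropic", outputPrice: 5 },
  { model: "Claude 4.7 Opus", provider: "Anthropic", outputPrice: 25 },
];

applyFilters(rows, { provider: "Anthropic", maxOutput: 10 }); // → Haiku only
```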

Output price (USD per 1M output tokens)

Composer 2          $2.50
Claude 4.5 Haiku    $5.00
GPT-5.4             $15.00
Claude 4.7 Opus     $25.00

Static snapshot for shape only — values match a Cursor pricing export in the app repo at the time I wrote this. Promotions, Max Mode multipliers, and doc updates move constantly; the live tool is what you should trust for picking a default model today.

The chart above is only there to show why a single number is misleading: the spread between a budget routing model and a flagship is an order of magnitude on output alone. The live site adds arena and speed context on top so the comparison is less one-dimensional than bars alone.

Where the data comes from

The app does not scrape private usage from your machine. It ships with JSON snapshots produced by a refresh script in the same package — Cursor's published model page, plus whatever auxiliary sources the script merges for arena rankings, speed labels, cache provider facts, and short-lived promotions. A GitHub Actions workflow runs that script on a schedule and on changes to the app, then deploys if the diff is non-empty.
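The merge step amounts to a join on model id across the snapshots, with fields left undefined when an auxiliary source has no entry. This is a sketch of the shape, not the real refresh script; the snapshot names and keys are assumptions.

```typescript
interface Pricing {
  id: string;     // model identifier used as the join key (assumed)
  output: number; // USD per 1M output tokens
}

interface Merged extends Pricing {
  arenaRank?: number; // absent when the arena snapshot lacks the model
  speedTier?: string; // absent when the speed snapshot lacks the model
}

// Join auxiliary snapshots onto the pricing table by model id.
function mergeSnapshots(
  pricing: Pricing[],
  arena: Record<string, number>,
  speed: Record<string, string>,
): Merged[] {
  return pricing.map((p) => ({
    ...p,
    arenaRank: arena[p.id],
    speedTier: speed[p.id],
  }));
}
```

Writing the output back as JSON in the repo is what makes the "deploy only if the diff is non-empty" check in the workflow cheap: a content-identical refresh produces no diff and no deploy.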

That means the site can be a day behind a sudden doc edit, but it also means every number is inspectable in the repo. There is a methodology panel in the UI for how composite sorts are defined, and a footer that states how stale the bundle is so you are not comparing today's intent against last month's prices by accident.
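The staleness footer only needs the bundle's generated-at timestamp; a minimal sketch, assuming the snapshot records one:

```typescript
// Whole days between the bundle's generated-at timestamp and now.
// The timestamp field itself is an assumption about the snapshot format.
function staleDays(generatedAt: string, now: Date = new Date()): number {
  const ms = now.getTime() - new Date(generatedAt).getTime();
  return Math.floor(ms / (24 * 60 * 60 * 1000));
}

staleDays("2025-01-01T00:00:00Z", new Date("2025-01-03T12:00:00Z")); // → 2
```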

Stack and shipping

It follows the same deployment pattern as the other reb.nz tools: Vite, TypeScript, Tailwind, and AWS Amplify wired up with Pulumi in the monorepo. Analytics are minimal — enough to know whether anyone uses the filters — and there is no account system because there is nothing to protect except public rate cards.

If the table saves you one wrong default model pick per month, it has already earned its keep. If it starts arguments on Slack about whether arena scores mean anything, that is fine too; at least those arguments will cite the same cells.

Try it at cursor-roi.reb.nz.