I use Claude Code all day, every day. It's my primary tool. I design and direct — Claude writes the code. That's been the deal for a while now.

Until today I was on Claude Max at $200/month. And I used to pace myself. Some days I'd hit limits and have to stop. I was rationing a $200 plan.

Today I dropped to the $100 plan. I've been watching my usage for the past month, and I never came close to the limits. I don't pace myself anymore. I just go.

What Changed

Two things: DevPlan and AMP.

Before I built these tools, a typical session looked like this:

  • Spend 10 minutes pasting in context from the last session
  • Burn half the context window on "where was I?"
  • Go down a dead end, back up, try again
  • Repeat something I already tried two sessions ago
  • Hit a wall and start a new conversation, losing everything

Every one of those is wasted tokens. Wasted time. Wasted money.

What It Looks Like Now

AMP and Nellie mean every session starts in 30 seconds. Checkpoints restore where I left off. Lessons compound — if I hit a gotcha once, I never hit it again. No more context pasting. No more repeated mistakes.

DevPlan means I'm not exploring blind. The work is structured before execution starts. Plans are validated. Haiku handles the execution, Sonnet verifies it. When something breaks, the lesson feeds back into the system. Next time it doesn't break.

The result: every token goes toward actual work. Not recovery. Not exploration. Not repeating myself.

The Math

This isn't about using a cheaper model or making fewer calls. I'm using the same model. I'm using it more intensely than before. I build all day without pacing myself.

The difference is waste. I eliminated most of it.

  • No context reset overhead — Nellie handles it
  • No unstructured exploration — DevPlan handles it
  • No repeated mistakes — lessons handle it
  • No dead ends — validated plans handle it

Cut the waste and $100 goes further than $200 used to.

The Point

Most people trying to reduce AI costs look at the model. Use something cheaper. Make fewer calls. Compress your prompts.

I went the other direction. Same model, same intensity, better workflow. The tokens I was burning on context recovery and dead ends just — disappeared.

I was rationing a $200 plan. Now I never hit the limits on $100.

The bottleneck was never compute. It was workflow.