I Tested Ponytail: When It Helps And When It Doesn't

It's one text file. No code, nothing to install, just instructions. And right now it has more than 40,000 stars on GitHub. People say it makes Claude Code write up to 90% less code. Same job, 90% less. So I gave Claude the same task with it and without it, counted every line and every token, and what I found is more interesting than the hype. Part of it is real. One of the things most people say about it is just not true.

What Ponytail is

Ponytail is a Claude Code skill, which is to say a text file. Inside it is a set of rules Claude uses when it needs them. This one tells Claude to act like a lazy senior developer: before it writes any code, it walks through a short list of questions.

Does this need to exist? If not, skip it.
Does the standard library already do this? If yes, use it.
Is it a native platform feature? If yes, use it.
Is there a dependency that already does it? If yes, use it.
Can it be done in one line? Then do it in one line.

If the feature is needed and none of those shortcuts applies, it writes the minimum that meets the requirement. That's the whole idea: stop the model from building more than you asked for.
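The checklist reads like a short decision ladder, so here is a minimal sketch of it in Python. This is not Ponytail's code (the skill is plain English instructions). The `Feature` class and the `decide` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """Hypothetical description of a requested feature (not part of Ponytail)."""
    needed: bool = True
    in_stdlib: bool = False
    native_platform: bool = False
    existing_dependency: bool = False
    one_liner: bool = False


def decide(feature: Feature) -> str:
    """Return the first action the checklist allows, in the article's order."""
    if not feature.needed:
        return "skip it"
    if feature.in_stdlib:
        return "use the standard library"
    if feature.native_platform:
        return "use the native platform feature"
    if feature.existing_dependency:
        return "use the existing dependency"
    if feature.one_liner:
        return "write it in one line"
    # Nothing cheaper applies, so write code, but only the minimum.
    return "write the minimum that meets the requirement"


print(decide(Feature(in_stdlib=True)))  # prints "use the standard library"
```

The order matters: each question is cheaper than the one after it, and new code is the last resort.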

Installing it

Installing takes two steps, both documented in the GitHub repo. First you add the marketplace it's published to:

/plugin marketplace add DietrichGebert/ponytail

Then you install the plugin itself and pick a scope. I installed it at the local scope so it only affects this one project:

/plugin install ponytail@ponytail

After that, either restart the session or run /reload-plugins to load it into the current one. You can confirm it's on by checking .claude/settings.local.json for the enabled plugin. That's it.
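If you'd rather script that check than open the file by hand, here is a rough sketch. It makes no assumption about the exact layout of the settings file: it only looks for the plugin id from the install command anywhere in the parsed JSON, and treats an explicit `false` next to it as "not enabled". The function names are my own, not part of Claude Code.

```python
import json
from pathlib import Path

PLUGIN_ID = "ponytail@ponytail"  # taken from the install command above


def mentions_plugin(node, plugin_id: str = PLUGIN_ID) -> bool:
    """Recursively search parsed JSON for the plugin id in a key or string value."""
    if isinstance(node, dict):
        for key, value in node.items():
            # A key naming the plugin counts unless it is explicitly set to false.
            if plugin_id in str(key) and value is not False:
                return True
            if mentions_plugin(value, plugin_id):
                return True
        return False
    if isinstance(node, list):
        return any(mentions_plugin(item, plugin_id) for item in node)
    return isinstance(node, str) and plugin_id in node


def plugin_listed(settings_path: Path, plugin_id: str = PLUGIN_ID) -> bool:
    """Return True if the settings file exists and mentions the plugin."""
    if not settings_path.is_file():
        return False
    data = json.loads(settings_path.read_text(encoding="utf-8"))
    return mentions_plugin(data, plugin_id)


if __name__ == "__main__":
    # Run from the project root; the path is the one named in the article.
    print(plugin_listed(Path(".claude/settings.local.json")))
```

This is a sanity check, not proof that the skill is loaded in a running session.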

The test

I set up two copies of the same repo, a small FastAPI app with a list of items. One had Ponytail, the other didn't. Same task, same model on both sides (Haiku 4.5, because that's what the repo's own tests used), and I measured two things: the tokens it cost, and the lines of code it wrote. Then I did it four times with four different tasks.

Task 1: add an admin dashboard

This is the one that sells the tool. With Ponytail, Claude added two endpoints to a single file and finished in 27 seconds. Without it, Claude built a whole static folder, an index.html, extra endpoints I never asked for, and took 2 minutes 4 seconds.

The line count: 101 lines with Ponytail, 464 without, which is about 78% fewer. The Ponytail run also used fewer output tokens, 3.6K against 8.4K. On a big, vague task, the tool does exactly what it promises.
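To put those numbers next to the "up to 90%" claim, here is the arithmetic as a small Python sketch. The figures are the ones from my run. The helper name `percent_reduction` is just for illustration.

```python
def percent_reduction(with_tool: float, without_tool: float) -> float:
    """Percentage saved by the run with the tool, relative to the run without it."""
    return (without_tool - with_tool) / without_tool * 100


# Task 1 (admin dashboard), numbers from the article.
lines_saved = percent_reduction(101, 464)   # lines of code
tokens_saved = percent_reduction(3.6, 8.4)  # output tokens, in thousands

print(f"lines: {lines_saved:.1f}% fewer")           # lines: 78.2% fewer
print(f"output tokens: {tokens_saved:.1f}% fewer")  # output tokens: 57.1% fewer
```

So on this one task the saving was real but short of 90%, and the token saving was smaller than the line saving.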

Tasks 2, 3 and 4: where it flips

Then I ran three smaller, clearer tasks: add user authentication, export items as CSV, and add rate limiting. The pattern reversed.

Task	Lines with Ponytail	Lines without
Admin dashboard	101	464
User auth	40	30
CSV export	15	11

On user auth, the Ponytail run wrote more code (40 lines vs 30), ran slower (2 minutes vs 32 seconds), and used more output tokens (6K vs 2.3K). CSV export was basically a tie, a simple task with no real difference. Rate limiting went the same way as auth: the version without Ponytail finished faster, and the Ponytail run used more than double the output tokens.
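The flip is easier to see as ratios. This sketch uses only the numbers reported above. Rate limiting is left out because I didn't list its line counts. The names `line_counts` and `describe` are made up for the example.

```python
# (lines with Ponytail, lines without), taken from the table above.
line_counts = {
    "Admin dashboard": (101, 464),
    "User auth": (40, 30),
    "CSV export": (15, 11),
}


def describe(with_tool: int, without_tool: int) -> str:
    """Summarize one task as a ratio plus the absolute line difference."""
    ratio = with_tool / without_tool
    diff = with_tool - without_tool
    if diff < 0:
        verdict = "fewer"
    elif diff > 0:
        verdict = "more"
    else:
        verdict = "same"
    return f"{ratio:.2f}x the lines ({abs(diff)} {verdict} with Ponytail)"


for task, (with_tool, without_tool) in line_counts.items():
    print(f"{task}: {describe(with_tool, without_tool)}")

# Output tokens on user auth: 6K with Ponytail vs 2.3K without.
auth_token_ratio = 6 / 2.3
print(f"User auth output tokens: {auth_token_ratio:.2f}x with Ponytail")
```

The absolute difference is there on purpose: CSV export looks like a big ratio, but it is only four lines, which is why I call it a tie.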

So the "it makes Claude cheaper" claim didn't hold up. On the one big open-ended task it saved tokens. On the small, well-scoped ones it cost the same or more.

So should you install it?

Honestly, yes, but for one reason, and it isn't the token bill. Ponytail isn't magic and it isn't a cost saver. It's a way to stop the AI from doing too much. When you hand it a big request like "build me a dashboard, add login," it keeps the model from over-engineering. That makes it great for prototypes, and great when you're new and can't always tell whether the AI is overbuilding.

But if your prompts are already small and clear, you won't notice much, and you might even pay a little more. Know what it's for, and treat it as a nice thing to have, not a switch that cuts your usage. If cutting your usage is the actual goal, the habits in "How I Cut Claude Tokens By 80%" move the needle a lot more.

The real lesson

Here's the part I want you to remember. This viral tool, the one with more than 40,000 stars, is a simple text file. You don't have to wait for someone to go viral on GitHub to get this. You can write your own skill in about five minutes: a few rules in plain English, and you ask Claude to create it for you.

I did it on camera. I opened a fresh session and asked Claude to create a local skill that just says "don't do too much." It made the SKILL.md, and that was the whole thing. No reason to over-engineer it. That's the point of the channel: anyone can build with AI, and I mean anyone.
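If you'd rather create the file yourself than ask Claude, here is a rough sketch that writes one. It assumes a project skill lives at `.claude/skills/<name>/SKILL.md` and starts with `name` and `description` front matter. That's my understanding of the convention, so check the current Claude Code docs before relying on it. The skill name, description and rules below are made-up examples, not Ponytail's text.

```python
import tempfile
from pathlib import Path

# Assumed layout: front matter with name and description, then plain-English rules.
SKILL_TEMPLATE = """---
name: {name}
description: {description}
---

{rules}
"""


def write_skill(project_root: Path, name: str, description: str, rules: list[str]) -> Path:
    """Write a minimal skill file under the (assumed) project skills folder."""
    skill_dir = project_root / ".claude" / "skills" / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    body = "\n".join(f"- {rule}" for rule in rules)
    skill_file.write_text(
        SKILL_TEMPLATE.format(name=name, description=description, rules=body),
        encoding="utf-8",
    )
    return skill_file


if __name__ == "__main__":
    # Demo in a throwaway folder so nothing in a real project is touched.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_skill(
            Path(tmp),
            name="keep-it-small",  # hypothetical skill name
            description="Use when writing code, to avoid building more than was asked.",
            rules=["Don't do too much.", "Write the minimum that meets the requirement."],
        )
        print(path.read_text(encoding="utf-8"))
```

To use it for real, pass your project folder as `project_root` and then start a new session so the skill gets picked up.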
