Improve using-superpowers skill description and conciseness #459

Open
fernandezbaptiste wants to merge 1 commit into obra:main from fernandezbaptiste:improve-using-superpowers-skill

fernandezbaptiste commented Feb 12, 2026

Hey @obra, thanks for publishing superpowers. Appreciate you sharing your workflows, and kudos on soon hitting 50k stars! I just starred it. Side note, I shared one of your blog posts in the newsletter I take care of (perhaps you've heard of the ai native dev newsletter?) - kudos for the fantastic content.

was running your skills through some evals and noticed a few things on using-superpowers that were pretty quick to improve (moving from ~61% to ~100% agent performance):

  • frontmatter description conflicts with other skills, since its trigger covers all conversations. expanded it with specific actions

  • the red flags repeat "check skills first" in different words. trimmed them to the 5 most distinct patterns

  • added an output section so the agent knows exactly what to produce after skill resolution.

these were easy changes to bring the skill in line with what performs well against Anthropic's best practices. honest disclosure, I work at a company where we build tooling around this. not a pitch, just fixes that were straightforward to make.

you've got 41 skills here. if you want to do it yourself, the evals are free and open to run: link. otherwise, happy to make the improvements for you.

- Expand frontmatter description with capability statement and trigger clauses
- Condense EXTREMELY-IMPORTANT block (rule already stated in The Rule section)
- Trim redundant Red Flags entries that repeat the same point
- Add Output section specifying what the skill produces

obra (Owner) commented Feb 12, 2026

Can you tell me a little bit about the evals you did? "61% to 100%" is an interesting number, and some of the changes you made run counter to everything I have seen testing skills.

fernandezbaptiste (Author) commented Feb 13, 2026

Absolutely - 61% to 100% means the skill now conforms better to structural best practices.

The review eval is generated with Opus 4.6 and judges the skill against Anthropic's guidelines for skill design (e.g. frontmatter specificity, trigger clarity, output sections, etc.).

We also run scenario-based task evals to measure real agent performance across tasks, comparing results with and without your skill to quantify the true perf delta. To do that, you need to claim your skill on tessl and run the eval.
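
For illustration only, here is a rough sketch of what a with/without comparison like that could look like. It is not tessl's implementation: the `ScenarioResult` type, the scenario names, and the pass/fail values are all hypothetical placeholders, and a real task eval would likely use richer scoring than a boolean per scenario.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ScenarioResult:
    """Hypothetical record: one task scenario run with and without the skill."""
    scenario: str
    passed_without_skill: bool
    passed_with_skill: bool


def pass_rate(flags: Iterable[bool]) -> float:
    """Fraction of True values; 0.0 for an empty input."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


def skill_delta(results: List[ScenarioResult]) -> float:
    """Pass rate with the skill minus pass rate without it."""
    baseline = pass_rate(r.passed_without_skill for r in results)
    with_skill = pass_rate(r.passed_with_skill for r in results)
    return with_skill - baseline


# Made-up example data, not real eval output.
results = [
    ScenarioResult("plan-a-refactor", False, True),
    ScenarioResult("fix-failing-test", True, True),
    ScenarioResult("write-migration", False, False),
    ScenarioResult("add-endpoint", False, True),
]

print(f"delta: {skill_delta(results):+.0%}")  # prints "delta: +50%"
```

Note that this measures something different from the review score above: the delta depends on how the agent actually performs on tasks, not on how closely the skill file matches structural guidelines.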

If you're curious to poke at the review evals for, say, the "executing-plans" skill, paste this into Claude - takes ~1 min to run:

claude
1. Install the tessl CLI: npm i -g @tessl/cli, then log in
2. tessl i github:obra/superpowers --skill executing-plans and run the review eval on it
3. Improve skill based on eval feedback, re-run review and explain new score vs. original

fernandezbaptiste (Author) commented

Hi - just following up on the above.
