Security research

When a coding skill tells the AI to leave vulnerabilities alone

24 April 2026 · 5 min read

An adoption-risk analysis of the andrej-karpathy-skills Claude Code plugin, focused on the behavioral guidelines it injects.

Executive summary

andrej-karpathy-skills is one of the most-installed Claude Code skills. It is a set of coding guidelines, distilled from Andrej Karpathy’s advice, that get injected as system-level instructions into every session. Structurally it is about as safe as a plugin gets: no hooks, no MCP server, no subprocesses, no bundled binaries. It is pure text. And the text is the problem.

Reasonable rules, bad in aggregate

We ran Oplane against the skill and the supply chain checked out. What it flagged was the usage context: three of the injected guidelines are each sensible on their own and together build a quiet wall around insecure code. Quoted verbatim from the skill:

“Don’t improve adjacent code, comments, or formatting”
“Match existing style, even if you’d do it differently”
“No features beyond what was asked”

Now put them together against a real codebase. The agent is editing a function and notices SQL injection in the function right next to it. The first rule says do not touch adjacent code. If it does write something nearby, the second rule says match the existing (insecure) style. And the third rule lets it reclassify “add input validation” as a feature nobody asked for. Each rule is defensible. The sum tells a capable security reviewer to look away.
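To make the scenario concrete, here is a minimal sketch of the kind of adjacent function the agent might notice, next to the parameterized version a reviewer would normally suggest. The table, the function names, and the use of Python's sqlite3 module are all assumptions for illustration; nothing here comes from the skill or from a real codebase.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable (illustrative): user input is concatenated into the SQL text,
    # so a crafted value can change the meaning of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the driver binds the value separately from the SQL text,
    # so the input is only ever treated as data.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def make_demo_db():
    # Hypothetical in-memory database with two users, for demonstration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "x' OR '1'='1"
    print(find_user_unsafe(conn, payload))  # the injected OR clause matches every row
    print(find_user_safe(conn, payload))    # no user has that literal name, so no rows
```

Under the three rules as written, an agent asked to edit a neighboring function has a defensible reason to leave find_user_unsafe alone, and to copy its string-concatenation style if it writes a new query nearby.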

What Oplane flagged

Security vulnerabilities are always in scope for remediation, even when adjacent to the task (Critical)
Never match existing style when it is insecure: use parameterized queries, strong crypto, and a CSPRNG regardless (High)
Carve security hardening out of "no features beyond what was asked" (Critical)
Flag security-relevant dead code (unused endpoints, legacy auth) for review (Medium)
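The CSPRNG recommendation is the easiest of these to picture in code. The sketch below assumes a hypothetical password-reset token generator; the function names are invented for illustration. It contrasts Python's general-purpose random module, which is not designed for security use, with the standard-library secrets module.

```python
import random
import secrets


def reset_token_weak(seed):
    # Insecure pattern (illustrative): a general-purpose PRNG with a guessable
    # seed produces the same "secret" for anyone who can guess the seed.
    rng = random.Random(seed)
    return "".join(rng.choice("0123456789abcdef") for _ in range(32))


def reset_token_strong():
    # secrets draws from the operating system's CSPRNG.
    # token_hex(16) returns 16 random bytes encoded as 32 hex characters.
    return secrets.token_hex(16)


if __name__ == "__main__":
    # Same seed, same token: the weak version is fully reproducible.
    print(reset_token_weak(1234) == reset_token_weak(1234))
    print(reset_token_strong())
```

If the surrounding code already uses the weak pattern, "match existing style" points the agent toward reset_token_weak, which is the behavior the second finding is meant to rule out.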

The one-line fix

None of this means forking the skill or arguing with its philosophy, which is sound. It needs one carve-out, added to the skill or to your own CLAUDE.md: “Security vulnerabilities are always in scope, regardless of whether they are adjacent to the current task.” That single line restores the reviewer instinct the guidelines quietly remove, and leaves everything else about the skill intact.
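For teams that want to apply the carve-out across several repositories, one possible approach is a small script that appends the line to a CLAUDE.md file only if it is not already there. This is a sketch under assumptions: the helper name ensure_carve_out is hypothetical, and where in the file the line should live is a choice for each team.

```python
from pathlib import Path

# The carve-out sentence, quoted from the article.
CARVE_OUT = (
    "Security vulnerabilities are always in scope, "
    "regardless of whether they are adjacent to the current task."
)


def ensure_carve_out(path):
    """Append the carve-out to the file at `path` unless it is already present.

    Returns True if the file was changed, False if the line was already there.
    """
    path = Path(path)
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    if CARVE_OUT in existing:
        return False
    # Keep the new line on its own line if the file lacks a trailing newline.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    path.write_text(existing + separator + CARVE_OUT + "\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    changed = ensure_carve_out("CLAUDE.md")
    print("added carve-out" if changed else "carve-out already present")
```

Running it twice leaves a single copy of the line, so it is safe to wire into a repository setup step.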

The broader point is the uncomfortable one. Every skill you install is a set of instructions you didn’t write, running with the same authority as the prompts you do. We have learned to review code dependencies. Instruction dependencies (skills, system prompts, agent guidelines) shape what the model builds just as directly, and almost nobody reviews them. This one is benign and well intentioned, which is exactly why it is a good example: the gap was not malice, it was a side effect nobody modeled.

Review the instructions you didn't write

Oplane threat-models the skills, prompts, and agent guidelines shaping your code, not just the code itself.
