
Your AI Has Amnesia
You wrote the rules. Your AI followed them. Then it quietly stopped.
The Brief
This article explains why AI coding assistants forget instructions during long conversations, grounded in the 'Lost in the Middle' research on long-context degradation. It introduces Claude Enforcer, an open-source tool that addresses context drift through on-demand skills, shell-script hooks, and isolated subprocesses.
- Why does my AI coding assistant forget instructions?
- Language models weight the beginning and end of their input most heavily. As conversations grow, early instructions drift into the middle of the context window, where research shows they get deprioritized. Degradation can begin when the context is only 20 to 40 percent full.
- What is context drift in AI?
- Context drift occurs when an AI stops reliably following instructions loaded at the start of a conversation. The instructions remain in context but get overlooked as newer messages push them into positions the model attends to less. It is rooted in the 'Lost in the Middle' phenomenon documented by Stanford researchers in 2023.
- What is Claude Enforcer?
- Claude Enforcer is an open-source, MIT-licensed tool for Claude Code that enforces rules despite context drift. It uses on-demand skills, shell-script hooks that block actions regardless of model memory, and isolated subprocesses that evaluate with fresh context. It installs with one command.
- How do hooks prevent AI instruction drift?
- Hooks are shell scripts that run before the AI acts, outside the model's context window. They intercept actions and check them against rules programmatically. Because hooks operate externally, they enforce rules regardless of what the model remembers or has forgotten during a long conversation.
I spent a Saturday morning writing rules for my AI coding assistant. Careful ones. Don't touch the production database. Always run tests before committing. Use this specific account ID, not that one. I tested them. They worked. By Monday afternoon, it had ignored half of them.
I didn't do anything wrong. The architecture did.
Researchers at Stanford published a paper in 2023 called "Lost in the Middle" that explains why.1 They found that language models weight the beginning and end of their input most heavily. Information in the middle gets overlooked. Not deleted. Not corrupted. Just quietly deprioritized, the way a long meeting makes you forget what was said at the 30-minute mark even though you were paying attention the whole time.
Some layers hold. Others fade.
The fascinating part? Later research suggests the degradation starts earlier than you'd think. Not when the context window is full, but when it's only 20 to 40 percent occupied.2 Your carefully written rules are sitting right where the model stops looking first.
Developers who use Claude Code, Anthropic's command-line AI assistant, run into this constantly. One developer tracked his skill activation rates over a month of intensive use. His finding was sobering. About 50 percent. A coin toss.3 The instructions were loaded. The model just drifted past them.
The Briefing Room Problem
Claude Code reads a file called CLAUDE.md at the start of every conversation. Think of it as a briefing room. Your project's architecture, your coding conventions, your "never do this" rules. The problem is that briefing rooms fade. Every message you send, every file you read, every tool call you make pushes those initial instructions further into the middle of the context. The very place the research says gets ignored.
The instinct is to write better instructions. More specific. More emphatic. But the problem isn't clarity. It's physics. Long context degrades attention the way a long hallway dims light. You can make the bulb brighter, but the hallway is still long.
What Holds
This got me thinking about what does persist. Not in the conversation, but outside it. I started building what became Claude Enforcer, an open-source tool that layers enforcement around the drift problem.4
The first layer is on-demand skills. Instead of a 500-line briefing that fades, you keep a lean set of essentials and invoke specialized instructions when you need them. Type /deploy when deploying. The context stays relevant because it arrives fresh.
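The /deploy example can be seeded as a small on-demand instruction file. The sketch below follows Claude Code's custom slash-command convention of a markdown file under .claude/commands/; treat the exact path as an assumption, and the rules themselves as illustrative placeholders:

```shell
# Sketch: seed a lean on-demand instruction file for /deploy.
# The .claude/commands/ path follows Claude Code's slash-command
# convention (assumption); the rules below are illustrative.
mkdir -p .claude/commands
cat > .claude/commands/deploy.md <<'EOF'
Deployment rules:
1. Run the full test suite first; abort on any failure.
2. Use the staging account ID from .deploy-config, never production.
3. Tag the release before pushing.
EOF
echo "wrote .claude/commands/deploy.md"
```

Nothing in that file occupies the context until /deploy is typed, which is the point: the instructions arrive at the end of the window, where the model attends most, instead of fading in the middle.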
The second layer lives entirely outside the model's context. Shell scripts that run before the AI acts. A hook doesn't care what the model remembers. If Claude tries to edit your .env file, the hook blocks it. Every time. Regardless of what happened in the conversation above.
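A minimal sketch of such a hook, written against the contract Claude Code documents for PreToolUse hooks. Treat the payload shape and the blocking exit code as assumptions here; the protected-path patterns are illustrative, and the crude sed extraction stands in for jq to keep the sketch dependency-free:

```shell
#!/bin/sh
# PreToolUse hook sketch. Assumed contract: the pending tool call
# arrives as JSON on stdin, and exit code 2 blocks it, with stderr
# fed back to the model as the reason.

# extract_path: pull file_path out of the tool-call JSON.
# Crude sed extraction for the sketch; a real hook would use jq.
extract_path() {
  printf '%s' "$1" |
    sed -n 's/.*"file_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# check_edit: the hook's verdict for one tool call.
# Returns 2 (block) for protected paths, 0 (allow) otherwise.
check_edit() {
  path=$(extract_path "$1")
  case "$path" in
    *.env|*secrets*)
      echo "Blocked: $path is protected" >&2
      return 2 ;;
  esac
  return 0
}

# A real entry point would be: payload=$(cat); check_edit "$payload"; exit $?
# Demonstration with a sample payload instead:
verdict=0
check_edit '{"tool_name":"Edit","tool_input":{"file_path":"/app/.env"}}' || verdict=$?
echo "verdict: $verdict"   # 2 means blocked, whatever the model recalls
```

The check runs in a separate process with no access to the conversation, so a thousand intervening messages change nothing about the verdict.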
The third layer is the one that surprised me. You can spawn a subprocess that starts with clean context and reads your rules directly, uncontaminated by the long conversation that caused the drift. It's the difference between asking a colleague who's been in the meeting all day versus pulling in someone fresh.
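A sketch of what that fresh-context check can look like, assuming the claude CLI's non-interactive -p (print) mode; the helper name and prompt wording are illustrative:

```shell
#!/bin/sh
# build_audit_prompt: compose a prompt that hands a fresh subprocess
# the rules file directly, so the review starts from clean context
# rather than a drifted conversation. Helper name and prompt wording
# are illustrative.
build_audit_prompt() {
  rules_file="$1"
  printf 'Apply only these rules:\n%s\n\nAudit the staged diff for violations. Answer PASS or FAIL with reasons.\n' \
    "$(cat "$rules_file")"
}

# A fresh-context review would then pipe the staged diff into a new
# process, e.g. (claude CLI's -p print mode assumed):
#   git diff --staged | claude -p "$(build_audit_prompt CLAUDE.md)"
```

Because the subprocess never saw the long conversation, the rules sit at the top of a short, fresh context instead of the ignored middle of a long one.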
The tools that hold are the ones mounted outside the conversation.
The whole thing installs with one command and runs a first audit in about thirty seconds. It tells you where your instructions are drifting and what to do about it.
The rules you wrote were good. They just need a place to live where amnesia can't reach them.
References
1. Liu, N.F., et al. (2023). "Lost in the Middle: How Language Models Use Long Contexts." Transactions of the Association for Computational Linguistics. arXiv.
2. Veseli, B., et al. (2025). "Context Window Utilization and Performance Degradation in Large Language Models." Referenced in practitioner analyses of context quality thresholds.
3. Spence, S. (2025). "Claude Code Skills Activation Rates." scottspence.com.