
Context Management

Modern models have much larger context windows than they used to (1M tokens yay! 🎉). That helps, but it does not remove the problem.

The real issue is not just whether a coding session can fit the context. It is whether the context is clean, relevant, and easy to reason about. In practice, agents do best when they get exactly what they need for the task at hand - and as little else as possible.

When a session gets too large, too noisy, or too mixed, quality drops. You may see agents:

  • Forget important requirements mid-implementation
  • Drift from the plan
  • Copy inconsistent patterns from earlier discussion
  • Anchor on old assumptions that are no longer true
  • Miss edge cases that were already discussed

This can happen even with very large context windows. Bigger windows push the hard limit further out, but they do not fix low-quality context.

Context bloat is not only about token count. It is often caused by stale, ambiguous, or irrelevant information lingering in the conversation for too long.

Common examples:

  • A CLAUDE.md file becomes a dumping ground for rules that do not matter for the current task
  • A long debugging session accumulates several failed hypotheses that keep biasing the model
  • Large logs, stack traces, generated files, and broad file reads get pulled in “just in case”
  • Research sessions include too many raw docs, web search results, or pasted references
  • Planning, implementation, debugging, and review all happen in one long thread
  • Requirements change mid-session, but the old requirements are still sitting in context
  • Multiple designs are explored without clearly choosing one
  • You discover an unrelated bug while building a feature and try to solve both in the same session

The result is that the model can no longer tell which constraints are current, central, and authoritative.

Context quality matters more than context size.

The goal is not to fill the window. The goal is to give the agent the minimum high-signal context needed to solve the current problem well.

ACT is designed around a simple idea: turn one long, messy conversation into a sequence of short, focused sessions connected by files.

Instead of keeping planning, research, implementation, and review inside one ever-growing chat, ACT produces artifacts - specs, plans, and captured insights - that become the input to the next step.

```
/new
/act:workflow:spec "add user authentication"
# Asks questions, creates a detailed specification with user flows and edge cases

/new
/act:workflow:refine-spec ai_specs/auth-spec.md
# Roasts the spec for gaps, wrong assumptions, and codebase misalignment

/new
/act:workflow:plan ai_specs/auth-spec.md
# Creates a phased implementation plan

/new
/act:workflow:work ai_specs/auth-plan.md
# Executes the plan phase by phase, with commits and a PR
```

That workflow gives you the best of both worlds - durable artifacts and a fresh context at every step:

  • Each phase starts with a clear context
  • The important knowledge is preserved in files
  • The noisy conversation that produced those files can be discarded
  • The agent stays focused on one mode of work at a time

This is exactly why some tools now offer explicit context-reset features. Claude Code, for example, lets you clear the conversation and continue working once a plan is ready. That works well because the valuable output is the plan itself, not the full exploratory path that led to it.

The spec file is the source of truth for requirements. During /act:workflow:work, the agent reads the spec alongside the plan, so requirements do not depend on chat history.
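As a sketch of that role, a spec file could look something like this. The file name, sections, and contents below are purely illustrative - they are not ACT's actual output format:

```shell
# Illustrative only: a minimal spec file that can serve as the
# source of truth for requirements across sessions.
mkdir -p ai_specs
cat > ai_specs/auth-spec.md <<'EOF'
# Spec: user authentication

## Requirements
- Email/password login; sessions expire after 30 days

## User flows
- Register -> verify email -> log in

## Edge cases
- Five wrong passwords lock the account for 15 minutes
EOF
```

Because the requirements live in the file, a fresh session can pick them up without any of the chat that produced them.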

ACT breaks implementation into phases. Each phase has a focused goal, a limited set of files, and a verification step before moving on.
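A plan in that shape might look roughly like this. The file name, phase breakdown, and file paths are illustrative, not ACT's actual output:

```shell
# Illustrative only: a phased plan where each phase has a focused goal,
# a limited set of files, and a verification step before moving on.
mkdir -p ai_specs
cat > ai_specs/auth-plan.md <<'EOF'
## Phase 1: user model and storage
Files: lib/models/user.dart, lib/db/users.dart
Verify: unit tests for the user model pass

## Phase 2: login screen
Files: lib/screens/login.dart
Verify: widget tests pass; manual login succeeds

## Phase 3: session handling
Files: lib/auth/session.dart
Verify: integration test covers expiry and refresh
EOF
```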

This avoids the slow context drift that happens when a feature grows inside one endless session.

3. Loading the right knowledge automatically

Users usually do not need to manually load extra guidance before implementation. During /act:workflow:work, ACT automatically uses the flutter-development skill to bring in the relevant Flutter patterns and guidelines.

That matters for context management because the session stays focused. The agent gets the guidance it needs for the task without forcing you to paste in lots of docs, rules, or reference material by hand.

If deeper research is needed, the best approach is usually to do that in a separate session and write the distilled findings to a file that can be referenced later as needed. Today that is still a manual workflow, but ACT will support this pattern better over time.
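A manual version of that pattern uses the same session-per-task rhythm as the workflow above. The research file name here is hypothetical:

```
/new
# Research session: compare libraries, read docs, explore trade-offs.
# Then ask the agent to write only the distilled conclusions to a file, e.g.:
#   "Summarize the decision and its constraints into ai_specs/auth-research.md"

/new
/act:workflow:spec "add user authentication"
# Point the agent at ai_specs/auth-research.md instead of re-pasting raw docs
```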

A spec that tries to cover too much will produce a plan that is too large and too diffuse.

  • Instead of “add user management”, split it into smaller specs like registration, profile editing, and permissions
  • Aim for plans that fit into 3-5 phases

If you need lots of web research or docs review, do that in a dedicated session. Capture the distilled findings in a file, then start a fresh implementation session from that summary.

If you uncover a bug halfway through a feature, do not automatically fold it into the same thread. Often the better move is to handle it separately, or even in another branch or worktree, so the agent is not trying to solve two problems at once.
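With git, one way to park the bug without polluting the feature session is a separate worktree. This must be run inside a repository, and the paths and branch names are just examples:

```shell
# Create a second checkout on its own branch, so a separate
# agent session can work on the bug fix in isolation.
git worktree add ../bugfix-worktree -b fix/unrelated-bug

# Fix and commit the bug in ../bugfix-worktree in a fresh session,
# then continue the feature here with its context untouched.
git worktree list                      # main checkout plus the new worktree

# Once the fix is merged, clean up.
git worktree remove ../bugfix-worktree
```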

Once the spec is done, start fresh for refinement. Once the plan is done, start fresh for implementation. Keep the artifact, not the whole conversation.