
Insights Blog

Context Engineering: The Layer That Makes Spec-Driven Development Actually Work


Context engineering for AI is the discipline that determines whether your spec-driven development workflow produces reliable code or expensive rework. If you read our last post on spec-driven development, you saw the methodology gaining serious momentum: write a structured spec, hand it to an agent, get better results. But there’s a gap in that story that most SDD content glosses over.

Specs describe what to build. They don’t describe why your system works the way it does.

And that gap is where most AI coding agent failures actually live.


Where Specs Hit Their Ceiling

Picture this: your team writes a clean spec for a new payment processing endpoint. The spec covers the functional requirements, the edge cases, the API contract. It’s well-structured, testable, specific. Everything a good spec should be.

Your coding agent reads it and produces code that passes every requirement in the spec. Technically correct. Functionally complete.

And architecturally wrong.

The agent built a custom auth layer because it didn’t know your team uses shared middleware for authentication. It created a new database connection pool because it wasn’t aware of the existing connection manager. It implemented synchronous webhook processing because nothing in the spec mentioned that your system is event-driven by convention.

The spec told the agent what to build. It didn’t tell the agent how your system works, what patterns to follow, or what already exists.

This is not an edge case. This is the norm for complex tasks. A developer who’s been on your team for six months knows all of this implicitly. An AI agent knows none of it unless you explicitly provide that context.

The question isn’t whether your specs are good enough. The question is whether the context surrounding those specs is complete enough to produce code that actually fits your system.


What Context Engineering Actually Is

Context engineering is the strategic discipline of designing, managing, and delivering the right information to AI systems so they produce reliable, accurate output. It’s not a new prompting technique. It’s the infrastructure layer that sits beneath every spec, every prompt, and every agent interaction.

Andrej Karpathy framed it clearly: context engineering is “the delicate art and science of filling the context window with just the right information for the next step.” Shopify CEO Tobi Lütke made a similar push, arguing that “context engineering” better describes the skill of giving the model everything it needs to solve a task than “prompt engineering” ever did.

By early 2026, the shift is well established. Gartner has published formal definitions. GitHub repos dedicated to the discipline have thousands of stars. Faros, LangChain, and others have published detailed developer guides. It’s no longer a concept being debated. It’s a discipline being practiced.

Here’s the core distinction that matters for development teams:

Prompt engineering is about phrasing. You optimize a single input to improve a single output. It’s tactical, session-scoped, and focused on the question you’re asking right now.

Context engineering is about environment. You design the entire information ecosystem the model operates inside: architecture docs, coding conventions, service maps, dependency graphs, team standards, historical decisions. It’s strategic, persistent, and focused on everything the agent needs to know before you even ask a question.

Spec-driven development is a methodology that sits between them. The spec structures your intent. Context engineering ensures the agent has the knowledge to execute that intent correctly within your specific system.

The relationship is hierarchical. Context engineering is the discipline. Spec-driven development is one methodology that applies it. Prompt engineering is a tactic used within both. Conflating them causes real problems in how teams invest their time.


The Four Layers of Context

Not all context is equal. Effective context engineering requires thinking in layers, because different types of information serve different purposes in an agent’s reasoning.

Layer 1: Task context. This is what the spec provides. What needs to be built, why, and how success will be measured. Most teams handle this layer reasonably well, especially once they adopt SDD. It answers the question: “What are we doing?”

Layer 2: Codebase context. What architecture exists, how services connect, what patterns are established, where shared utilities live. This is the layer most teams neglect entirely. It answers the question: “What already exists that the agent needs to know about?”

An agent without codebase context will reinvent solutions to problems your team solved two years ago. It will create new utility functions that duplicate existing ones. It will choose architectural patterns that conflict with your established approach. Not because it’s incapable, but because it genuinely doesn’t know what’s already there.

Layer 3: Team context. How your team does things. Coding conventions, naming standards, review expectations, testing patterns, commit message formats. This is the “tribal knowledge” that experienced developers carry and new developers absorb over months of immersion.

AI agents don’t have months. They have a context window. If your team conventions aren’t documented and delivered to the agent, they don’t exist for that agent.

Layer 4: Temporal context. What’s changing. Migrations in progress, deprecated services, planned architectural shifts, recently introduced patterns that haven’t been fully adopted yet. This is the most overlooked layer and arguably the most dangerous to miss.

An agent that doesn’t know you’re migrating from REST to GraphQL will happily build new REST endpoints. An agent that doesn’t know a particular library is deprecated will use it confidently. Temporal context prevents agents from building things that are technically correct today but wrong by next sprint.

All four layers are required for complex work. Most teams only provide Layer 1. That’s why agents produce code that works in isolation but fails in context.
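To make the four layers concrete, here is a minimal sketch of a context payload modeled as a data structure, with a check for the usual failure mode (only Layer 1 populated). All names here are illustrative assumptions, not part of any real tool:

```python
# Hypothetical sketch: the four context layers as a single payload an
# agent would receive before generating code. Field names are illustrative.
from dataclasses import dataclass, field


@dataclass
class ContextPayload:
    task: str                                          # Layer 1: the spec itself
    codebase: list[str] = field(default_factory=list)  # Layer 2: architecture docs, service maps
    team: list[str] = field(default_factory=list)      # Layer 3: conventions, standards
    temporal: list[str] = field(default_factory=list)  # Layer 4: migrations, deprecations

    def missing_layers(self) -> list[str]:
        """Name the layers left empty, since a spec alone covers only Layer 1."""
        layers = {"codebase": self.codebase, "team": self.team, "temporal": self.temporal}
        return [name for name, docs in layers.items() if not docs]


payload = ContextPayload(task="Build the payment processing endpoint per spec")
print(payload.missing_layers())  # → ['codebase', 'team', 'temporal']
```

A payload like this makes the gap visible: a team that only writes specs ships agents three empty layers every time.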


Why This Can’t Be Solved With a Bigger Context Window

A common reaction: “Context windows are getting bigger. Won’t this problem solve itself?”

No. And here’s why.

Models now advertise 1 million, even 2 million token context windows. That sounds like enough to dump your entire codebase into a prompt and let the agent figure it out.

In practice, it doesn’t work. Research has documented the “lost-in-the-middle” problem, where models struggle to use information buried deep in a long context. Relevance degrades with volume. Costs scale linearly with context size. And raw codebase dumps don’t tell the agent what matters. They just provide more noise for the signal to get lost in.

Context engineering is the opposite of “dump everything in.” It’s curation. It’s deciding what information is relevant to this specific task, in this specific part of the codebase, for this specific team’s conventions. It’s the difference between handing someone a filing cabinet and handing them the three documents they need.

A well-engineered context payload for a complex task might include: the task spec, the architecture patterns doc for the affected service, the team’s coding conventions, a dependency map showing how the modified service connects to others, and a note about any in-progress migrations that affect the area. That’s five focused documents, not a codebase dump.
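That curation step can be sketched as a small selection function: pick only the documents tagged as relevant to the affected service, in priority order, under an explicit size budget. The document names, tags, and budget below are all assumptions for illustration:

```python
# Illustrative curation sketch: select relevant docs, don't dump the library.


def curate(library: dict[str, str], tags: dict[str, set[str]],
           service: str, budget_chars: int) -> list[str]:
    """Return doc names tagged for `service` (or 'all'), in priority order,
    stopping once the character budget would be exceeded."""
    selected: list[str] = []
    used = 0
    for name, text in library.items():  # insertion order = priority order
        if service in tags.get(name, set()) or "all" in tags.get(name, set()):
            if used + len(text) > budget_chars:
                break
            selected.append(name)
            used += len(text)
    return selected


# Hypothetical context library for a payments task.
library = {
    "spec.md": "...",
    "payments-architecture.md": "...",
    "conventions.md": "...",
    "billing-architecture.md": "...",
}
tags = {
    "spec.md": {"payments"},
    "payments-architecture.md": {"payments"},
    "conventions.md": {"all"},
    "billing-architecture.md": {"billing"},
}
print(curate(library, tags, "payments", budget_chars=10_000))
# → ['spec.md', 'payments-architecture.md', 'conventions.md']
```

The billing doc never makes it in, not because it’s bad documentation, but because it isn’t relevant to this task. That filtering is the whole point.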

The discipline is in knowing what to include and what to leave out.


Context Engineering in Practice

So what does this look like for a development team adopting SDD?

Architecture decision records. Documents that explain not just what architectural choices were made, but why. When an agent knows that your team chose event-driven architecture because of specific scale requirements, it won’t default to synchronous patterns. The “why” behind decisions is the highest-leverage context you can provide.

Coding convention files. Machine-readable (or at least agent-readable) documents that describe how your team writes code. Naming patterns, file structure conventions, testing expectations, error handling standards. These should live alongside your code, referenced by agents before they generate anything.

Service-level context docs. For each major service or module, a document that describes: what it does, how it connects to other services, what shared utilities it uses, and what patterns it follows. Think of it as the onboarding doc you wish you’d had when you joined the team, written for an agent instead of a person.

Constraint maps. Explicit documentation of what agents should not do. Don’t create new database connections (use the pool). Don’t build custom auth (use the middleware). Don’t introduce new dependencies without checking the approved list. Negative constraints are just as valuable as positive guidance.

Temporal context markers. Simple, maintained docs that flag what’s in flux. “We’re migrating auth from service A to service B. All new code should use service B. Existing references to service A will be migrated in Q3.” This prevents agents from building on foundations that are being removed.

None of this is exotic. Most experienced teams have fragments of this knowledge scattered across wikis, Slack threads, and the heads of senior developers. Context engineering is the practice of making it explicit, structured, and deliverable to AI agents.
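One way to make a constraint map more than a document is to pair each forbidden pattern with the guidance an agent or reviewer should see instead. Here’s a minimal sketch; the patterns, the `psycopg2` example, and the messages are all hypothetical stand-ins for whatever your team actually forbids:

```python
# Hedged sketch: a constraint map as executable checks. Each entry pairs
# a forbidden code pattern with the positive guidance to surface instead.
import re

CONSTRAINTS = [
    (r"psycopg2\.connect", "Don't create new database connections; use the shared pool."),
    (r"def\s+authenticate", "Don't build custom auth; use the shared middleware."),
]


def check_constraints(code: str) -> list[str]:
    """Return the guidance message for every constraint the code violates."""
    return [msg for pattern, msg in CONSTRAINTS if re.search(pattern, code)]


print(check_constraints("conn = psycopg2.connect(dsn)"))
# → ["Don't create new database connections; use the shared pool."]
```

Run before review (or fed back to the agent as context), a check like this turns tribal knowledge into something an agent can’t silently ignore.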


The Relationship Between Context Engineering and SDD

Here’s the framing that clarifies everything: spec-driven development is only as good as the context engineering behind it.

A spec without context engineering is a set of instructions given to someone who doesn’t know your system. They might build something that meets every stated requirement and still breaks everything around it.

A spec with context engineering is a set of instructions given to someone who has been thoroughly briefed on your architecture, your conventions, your constraints, and your direction. They build something that fits.

SDD provides the what. Context engineering provides the why and the how. Together, they give an agent everything it needs to produce code that isn’t just correct in isolation but correct within your system.

This is also where the conversation shifts from individual developer productivity to team capability. A solo developer can do passable context engineering by keeping architecture docs and convention files updated. But when five developers are working with five different agents across different parts of a codebase, the context problem multiplies.

Whose architecture docs are authoritative? Whose conventions win when there’s a conflict? How does developer A know that developer B’s agent was given context about a migration that affects both of their work?

Context engineering at team scale requires shared infrastructure, not just individual discipline. It requires persistent context that lives beyond a single session, visibility into what context agents are receiving across the team, and governance over which context documents are authoritative.

That’s the problem we’ll dig into in the next post in this series. Because the real unlock for context engineering isn’t doing it well as an individual. It’s doing it collaboratively as a team.


Getting Started With Context Engineering

If your team is practicing SDD (or starting to), here’s how to layer in context engineering without boiling the ocean.

Start with architecture decision records. Pick the five most important architectural decisions in your system. Document each one in a single page: what was decided, why, and what the implications are for new code. This alone dramatically improves agent output.

Write a conventions file. One document that covers how your team writes code. Naming patterns, file structure, testing expectations, error handling. Keep it under two pages. Update it when conventions change.

Create service context docs for your most active areas. You don’t need to document every service on day one. Start with the two or three areas where agents are working most frequently. Build context docs for those first and expand over time.

Add temporal context markers. If there’s a migration in progress, a deprecated service, or a pattern that’s being phased out, document it. A single sentence can prevent an agent from building on the wrong foundation.
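Temporal markers rot fastest of the four steps above, so it helps to give each one a review date and flag the stale ones. A minimal sketch, with hypothetical dates and entries:

```python
# Illustrative sketch: temporal context markers as dated entries, with a
# staleness check so finished migrations stop steering agents.
from datetime import date

MARKERS = [
    {"note": "Migrating auth from service A to service B; new code uses B.",
     "review_by": date(2026, 9, 30)},
    {"note": "REST endpoints deprecated in favor of GraphQL.",
     "review_by": date(2026, 3, 1)},
]


def stale_markers(markers: list[dict], today: date) -> list[str]:
    """Return the notes whose review date has passed and need re-confirming."""
    return [m["note"] for m in markers if m["review_by"] < today]


print(stale_markers(MARKERS, today=date(2026, 6, 1)))
# → ['REST endpoints deprecated in favor of GraphQL.']
```

A stale marker is worse than no marker: an agent told a migration is “in progress” a year after it finished will keep writing transitional code.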

The investment is small. A senior developer can build the initial context library in a day or two. The payoff is felt immediately in agent output quality and in the reduction of “technically correct but architecturally wrong” code that eats up review cycles.

Spec-driven development gives you a methodology. Context engineering gives you the foundation that makes it trustworthy. Together, they’re the beginning of a real system. But to scale it beyond individual practice, you need one more piece: a way to do this collaboratively.

That’s what we’ll cover next.


This is Part 2 of a series on spec-driven development and the planning infrastructure that makes AI coding agents work. Next up: why solo specs fail at team scale, and what collaborative spec-driven development looks like in practice.
