Context Design
Orchestrating the Cognitive Supply Chain

The industry has been obsessed with "Prompt Engineering" for a while now—the art of phrasing a question to get a better answer. It is a good starting point when you are beginning to explore AI. However, prompting is like a spark: it can initiate a fire, but it will not sustain it.
Recently, the industry shifted toward massive token windows that can hold your entire business documentation in context. This created the need for systems that can handle that volume of information and, consequently, surface what is actually relevant. It is the classic "needle in a haystack" problem, and it's getting worse. AI can get lost in this sea of information and start hallucinating: making up facts or citing sources that don't exist.
If we want AI to move from a novelty to a robust infrastructure, we have to stop talking about prompts and large token windows and start talking about Context Design. It is the classic quality-over-quantity problem, just at a cognitive level.
From Content to Context
For decades, digital product design has been about Content Delivery. We build containers, we fill them with data, and we present them to the user. The user is the one responsible for synthesis. Think about Google Search: you get a list of results and you have to synthesize them yourself. Even the modern AI features that summarize results often fail to capture the full picture.
In the era of Agentic AI, the paradigm shifts. The AI is now the primary consumer of data. But AI, like a human, suffers from cognitive overload. If you flood the context window with noise, hallucinations increase and reasoning quality drops. You no longer look for entries, but for answers. You no longer look for information, but for insights. You no longer expect a "Search" engine, but a "Reasoning" engine. Think of how many people use AI as a glorified Google Search (Perplexity, I am looking at you).
Context Design is the intentional architecture of the information ecosystem that an AI operates within. It is the shift from providing all the data to providing the perfect signal at the perfect moment.
The Knowledge Refinery
For the past decade, Big Data made us greedy, encouraging us to keep every bit of information even when we don't need it. We expected AI to handle all that mess for us, but we didn't anticipate that it would drown in it. The harsh truth is that as the data grows, retrieval precision (and the reasoning built on top of it) tends to degrade. The solution is not to build bigger models, but to build better systems for handling the data, raising the quality of what enters the context window.
We shouldn't just "fetch" data, we should refine it.
I first ran into this insight while working with SLMs (Small Language Models) and their limited context windows. I had to find a way to reduce the token count without reducing the quality of the input - condensing the most relevant information so the SLM had enough context to give a high-quality answer.
See the difference? Prompt Engineering is a starting point - it improves the quality of the input, usually by adding framing, examples, or constraints. Experienced practitioners already compress and filter context intuitively. Context Design formalizes that instinct into a repeatable discipline. The concern shifts from the single prompt to the full information pipeline that feeds it. It's about maximizing the Signal-to-Noise Ratio (SNR) of the context window.
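To make this concrete, here is a minimal sketch of what that compression loop can look like. It is illustrative, not the actual system I built: the whitespace token count and keyword-overlap scoring are crude stand-ins for a real tokenizer and a real relevance model.
```python
# Budget-aware context packing - an illustrative sketch, not production code.
# Assumptions: whitespace splitting approximates token counts, and keyword
# overlap approximates semantic relevance.

def relevance(chunk: str, query: str) -> float:
    """Crude relevance proxy: share of query words that appear in the chunk."""
    query_words = set(query.lower().split())
    chunk_words = set(chunk.lower().split())
    return len(query_words & chunk_words) / max(len(query_words), 1)

def pack_context(chunks: list[str], query: str, token_budget: int) -> str:
    """Keep the highest-signal chunks that fit the SLM's token budget."""
    ranked = sorted(chunks, key=lambda c: relevance(c, query), reverse=True)
    selected, used = [], 0
    for chunk in ranked:
        cost = len(chunk.split())  # stand-in for a real token count
        if used + cost <= token_budget:
            selected.append(chunk)
            used += cost
    return "\n\n".join(selected)
```
The point isn't the scoring function; it's that the budget forces an explicit ranking of signal before anything reaches the model.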
You can already find several tools that try to solve this problem by squeezing the most out of the context window. But I noticed a fundamental gap: as good as most of these tools are, they are still single-purpose. The AI agent remains in charge of orchestrating the data between them, which keeps filling the context window with noise. I wanted a system that orchestrates the data flow between tools so the AI doesn't have to.
I conceptualize this as a Knowledge Supply Chain. Raw data is extracted, transported, and then processed through a "Refinery" before it ever touches the AI's reasoning engine. The approach is inspired by the 50-year-old Unix piping design, which lets developers chain simple tools into powerful systems. The chain has four stages (a minimal sketch follows the list):
- Ingestion: Converting heterogeneous data (PDFs, Logs, Code) into a unified, searchable format.
- The Sieve: Heuristic filtering to remove structural noise (timestamps, headers, metadata).
- The Sift: Semantic pruning to eliminate natural language redundancy while preserving core reasoning paths.
- Injection: Delivering the high-SNR payload into the context window.
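To ground the metaphor, here is that chain as a Unix-style pipe in Python. These stage functions are simplified stand-ins of my own invention, not the actual Context-Pipe or Semantic-Sift code: each one takes text in and hands refined text to the next.
```python
import re
from functools import reduce

def ingest(raw: str) -> str:
    """Ingestion: normalize heterogeneous input into plain, searchable text."""
    return raw.replace("\r\n", "\n").strip()

def sieve(text: str) -> str:
    """The Sieve: heuristic filtering of structural noise (e.g. log timestamps)."""
    timestamp = re.compile(r"^\[?\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}\S*\]?\s*")
    return "\n".join(timestamp.sub("", line) for line in text.splitlines())

def sift(text: str) -> str:
    """The Sift: semantic pruning; here, a naive exact-duplicate filter."""
    seen, kept = set(), []
    for line in text.splitlines():
        key = line.strip().lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(line)
    return "\n".join(kept)

def inject(text: str) -> str:
    """Injection: wrap the refined payload for the model's context window."""
    return f"<context>\n{text}\n</context>"

def refine(raw: str) -> str:
    """Unix-style chaining: each stage pipes its output into the next."""
    return reduce(lambda data, stage: stage(data), [ingest, sieve, sift, inject], raw)
```
Each stage stays single-purpose, just like a Unix tool; the orchestration lives in the pipeline, not in the AI agent.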
The Expectation Effect for Machines
In my previous writings, I explored The Expectation Effect - the idea that humans do not experience products objectively. We experience them through the lens of our biological "Prediction Engine", constantly comparing reality against our cached priors. When a user is frustrated, it's often because of a gap between their mental model and the product's behavior.
With Agentic AI, we encounter a striking parallel that is rooted in actual cognitive science. In neuroscience, the theory of Predictive Coding (championed by researchers like Anil Seth) posits that human perception is essentially a "controlled hallucination". Our brains constantly predict sensory input based on past experiences ("priors"), and we only update our models when reality throws a "prediction error". When sensory input is weak or noisy, our brains fall back entirely on those priors, causing us to literally see or believe things that aren't there.
Large Language Models exhibit the same functional pattern. The resemblance isn't in the blueprint - autoregressive next-token prediction emerged from statistical efficiency, not neuroscience. But the behavioral fingerprint is the same. At their core, LLMs are Next-Token Prediction engines. Like the human brain, they are optimized for structural coherence rather than objective truth. When an AI "hallucinates", it isn't a random glitch. The causes are layered - a weak signal in the context window, reinforcement fine-tuning that rewards confidence over accuracy, or a gap between the training data and reality. But the failure mode is always the same: the model falls back on its learned priors rather than grounded evidence. It confidently predicts what should come next, even when it's factually wrong.
Context Design is the Expectation Effect applied to machines.
By intentionally shaping the context window, you are actively defining the AI's "Map of Reality". You aren't just giving it data, you are setting its expectations for what constitutes a valid, high-fidelity answer. If you flood the context with noisy, unrefined data, you "prime" the AI for failure, triggering the very hallucinations we try to avoid. Conversely, when you orchestrate a high-SNR environment, you create a self-fulfilling prophecy of quality. Managing the context is, ultimately, managing the expected outcome.
Systems, not Patches
When we build with a "System over Patch" mindset, we don't fix hallucinations by adding more prompts. We fix them by hardening the context pipeline. We ensure that the AI isn't just "guessing" based on a generic prompt, but "reasoning" based on a high-fidelity environment.
If you want to get more into the technical details of how to build these systems, I recommend you check out my Context Design page. Or go directly to the source code: Context-Pipe or Semantic-Sift.
Context Design is how we build AI that is predictable, reliable, and most importantly, useful.
Thank you for sticking with me, and I hope you enjoyed it.