What a model can't know about your user
The intelligence people praise in an AI feature is usually the context engineering you did before the model got involved.
Most teams describe their AI feature as "the model figures out what the user needs." It doesn't. The model receives whatever you decided to hand it, in whatever shape you chose, and produces a plausible continuation. The part everyone calls intelligence — the answer that feels like it understood the person — is almost always the assembly work you did in the half-second before the call: which facts you pulled, which you left out, how you framed the question. The model is the last step, not the thinking.
A model knows the distribution of language. It does not know that this user is on their third failed payment, that they churned once and came back, that it's 2am their time and they've reopened the same screen four times. None of that lives in the weights. It lives in your database, your event stream, your product's memory of the person. Whether it reaches the prompt is an engineering decision — and it's the decision that determines the quality of the output far more than the choice of model does.
The retrieval is the product
Swap a frontier model for a smaller one and a well-built feature degrades gracefully. Strip the context and even the best model produces confident, generic mush. That asymmetry tells you where the value actually sits.
Consider a support assistant. The unimpressive version answers from general knowledge: here's how refunds usually work. The version people call magic knows this is an enterprise account, that the charge in question is disputed, that the customer's contract has a custom SLA, and that a human already replied yesterday. The model didn't get smarter between those two versions. The context did.
- What the user is doing right now: the screen, the selection, the half-finished action.
- What they've done before: the history that makes "again" mean something.
- What's true about their account: entitlements, limits, state that changes the right answer.
- What you must never say: the rows, fields and records this person is not allowed to see.
Each of those is a query you write, a join you get right, a permission you enforce. The model contributes none of it. It just renders the consequences of choices you already made.
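One way to make those four kinds of context explicit is to give them a type, so that what reaches the prompt is a visible decision rather than an accident of string concatenation. This is a minimal sketch, not a prescribed design: `TurnContext`, `render_prompt`, and every field name are hypothetical, and the prompt layout is only an illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TurnContext:
    """The four kinds of context from the list, as one explicit argument.

    All names here are illustrative assumptions, not a real API.
    """
    current_activity: dict   # screen, selection, half-finished action
    history: list[str]       # prior events that give "again" a meaning
    account_state: dict      # entitlements, limits, billing state
    redacted_fields: set[str] = field(default_factory=set)  # never rendered


def render_prompt(question: str, ctx: TurnContext) -> str:
    """Render the context as prompt text, dropping redacted account fields."""
    visible = {
        key: value
        for key, value in ctx.account_state.items()
        if key not in ctx.redacted_fields
    }
    lines = [
        f"Current screen: {ctx.current_activity.get('screen', 'unknown')}",
        f"Recent history: {'; '.join(ctx.history) or 'none'}",
        f"Account: {visible}",
        f"Question: {question}",
    ]
    return "\n".join(lines)
```

The model never sees `TurnContext`; it sees only the string that `render_prompt` returns. Anything left out of that string does not exist as far as the model is concerned.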
The model is a function. The context is the argument. You spend almost all your judgment choosing the argument.
Garbage context, confident output
The failure mode is specific and it's worth naming, because it doesn't look like a failure. Feed a model thin or wrong context and it won't stall or ask for help. It will fill the gap with the most probable-sounding answer and deliver it in the same even, assured tone it uses when it's right. The fluency is constant. The grounding is not.
So the work shifts upstream, to where it always belonged. Did you fetch the right records? Is the user's current state actually represented, or a stale snapshot from login? When two sources disagree — the CRM says active, billing says delinquent — which one wins, and does the prompt know there was a conflict at all? Did you pass a number as a number, or stringify it into something the model will quietly round? These are ordinary backend questions. They are also, now, the questions that decide whether your AI feels intelligent.
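Those questions translate into small, boring functions. The sketch below assumes billing is the authoritative source for payment state and that a snapshot older than five minutes counts as stale. Both are illustrative policy choices, and all the function names are hypothetical.

```python
import json
from datetime import datetime, timedelta


def is_stale(fetched_at: datetime, now: datetime,
             max_age: timedelta = timedelta(minutes=5)) -> bool:
    """True when a snapshot is older than max_age (threshold is an assumption)."""
    return now - fetched_at > max_age


def reconcile_status(crm_status: str, billing_status: str) -> dict:
    """Pick a winner between two sources and record whether they disagreed.

    Assumption for this sketch: billing wins for payment state.
    """
    return {
        "status": billing_status,
        "source": "billing",
        "conflict": crm_status != billing_status,
        "other_sources": {"crm": crm_status},
    }


def serialize_facts(facts: dict) -> str:
    """Serialize with JSON so numbers stay numbers, not pre-formatted strings."""
    return json.dumps(facts, sort_keys=True)
```

Keeping the `conflict` flag in the serialized facts lets the prompt say that the sources disagreed, instead of silently presenting one of them as the truth.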
This is why "let's just give the model more context" is rarely the answer. More raw context buries the signal and invites the model to latch onto whatever's nearest the question. The craft is selection: the three facts that matter for this turn, not the forty that might. Curation, ranking, recency, relevance — the same discipline as building a good search result, pointed at a different consumer.
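Selection can be as plain as a scoring function and a cutoff. The sketch below assumes each candidate fact already carries a relevance score from whatever retriever you use, and discounts it by age with an exponential half-life. The `Fact` type, the seven-day half-life, and the default of three facts are illustrative assumptions, not recommendations.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Fact:
    text: str
    relevance: float       # 0..1, assumed to come from your retriever
    observed_at: datetime  # when the fact was last known to be true


def select_facts(facts, now, k=3, half_life=timedelta(days=7)):
    """Return the k facts with the highest relevance, discounted by age."""
    def score(fact):
        age_seconds = max((now - fact.observed_at).total_seconds(), 0.0)
        decay = 0.5 ** (age_seconds / half_life.total_seconds())
        return fact.relevance * decay

    return sorted(facts, key=score, reverse=True)[:k]
```

With this scoring, a highly relevant fact from two weeks ago can lose to a moderately relevant one from today, which is usually the trade you want for a question about what the user is doing now.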
Where the design and the engineering meet
Deciding what the model gets to know is a product decision wearing an engineering costume. The boundary you draw around the context is the boundary of the feature's behavior — and of its safety.
Permissions are the sharpest version of this. A model has no concept of authorization; it will happily reason over any row you put in front of it. If a teammate's salary, another tenant's data, or a deleted record reaches the prompt, the model will use it, fluently, and you've built a leak with a friendly voice. The access-control layer that decides what context is even eligible isn't a constraint on the AI feature. It is the AI feature, as much as the prompt is.
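In code, that means the permission check runs on the candidate records before any of them can be rendered into a prompt. This is a minimal deny-by-default sketch: the `Record` shape and the `"tenant"` / `"private"` visibility values are hypothetical, standing in for whatever access-control model your product already has.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    tenant_id: str
    owner_id: str
    visibility: str  # "tenant" or "private" (hypothetical policy values)
    deleted: bool
    body: str


def eligible_context(records, viewer_id, viewer_tenant):
    """Keep only records the viewer may see; anything unrecognized is dropped."""
    allowed = []
    for record in records:
        # Deleted rows and other tenants' rows are never eligible.
        if record.deleted or record.tenant_id != viewer_tenant:
            continue
        if record.visibility == "tenant":
            allowed.append(record)
        elif record.visibility == "private" and record.owner_id == viewer_id:
            allowed.append(record)
        # Any other visibility value falls through and is excluded.
    return allowed
```

The filter sits between retrieval and prompt assembly, so the model cannot be talked into revealing a row it never received.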
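The same logic applies to a confidence gate, a check that runs before generation and asks whether the retrieved evidence is strong enough to answer from. When the evidence is thin, the honest move is to route to a human or say less, not to let the model improvise.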
The interesting work happens before generate().
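A minimal sketch of that shape follows. The `answer` function and its `retrieve` and `generate` parameters are hypothetical callables you would supply, not a real library API, and counting facts is only a crude stand-in for a real confidence signal.

```python
def answer(question, user, retrieve, generate, min_evidence=2):
    """Scope, retrieve, guard, gate, then generate.

    `retrieve(question, scope)` and `generate(question, facts)` are
    caller-supplied callables; their signatures are assumptions of this sketch.
    """
    # 1. Scope: pin the request to this tenant and this user.
    scope = {"tenant_id": user["tenant_id"], "user_id": user["id"]}

    # 2. Retrieval: fetch only facts that are eligible within that scope.
    facts = retrieve(question, scope)

    # 3. Empty-context guard: with nothing to ground on, do not call the model.
    if not facts:
        return {"route": "human", "reason": "no_context", "text": None}

    # 4. Confidence route: thin evidence goes to a person instead.
    if len(facts) < min_evidence:
        return {"route": "human", "reason": "thin_evidence", "text": None}

    # 5. The model: one line.
    text = generate(question, facts)
    return {"route": "model", "reason": None, "text": text}
```

Calling it with stubs, for example `answer("Why was I charged?", {"id": "u1", "tenant_id": "t1"}, retrieve=my_retriever, generate=my_model_call)`, returns a dictionary whose `route` tells the caller whether the model answered or the turn was handed off.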
Look at what surrounds the generation call in a well-built feature: scope, retrieval, an empty-context guard, a confidence route. The model occupies one line. The product lives in the other four.
So when a feature feels like it read the user's mind, resist crediting the model. Ask what was in the prompt, who decided it belonged there, and which query made it true. That's where the intelligence was authored — by someone who understood both the user and the system well enough to hand the model exactly the right small thing to say.
