Yesterday’s Gospel: Why Retrieval Should Age Out
Key Takeaways
- Pure semantic relevance does not encode time. A retrieval system without decay will surface old “we always” content alongside last week’s revision, equally weighted.
- Half-life decay is a multiplier on the relevance score based on the chunk’s last-modified timestamp.
- A 90-day half-life means content from 90 days ago has half the weight of new content. Configurable per-category.
- Decisions, project facts, and session logs benefit from short half-lives (180 days or less). Lessons, entities, and reference material benefit from long half-lives (a year or more).
- One weight, one config, large quality lift on retrieval. The brain stops misleading the agent with stale gospel.
Last quarter's "we always do it this way" was last quarter's truth. The quarter before, somebody else owned that decision and wrote it down differently. Six months from now, the team will revise it again. A retrieval system that does not know any of this will happily surface the year-old "always" alongside last week's revision, equally weighted, and let the agent decide which to use.
The agent will choose wrong, sometimes. Most of those wrong choices will be invisible until they ship as a PR that conflicts with a decision the team made three Tuesdays ago.
The fix is half-life decay: a numeric weight that makes recent content surface ahead of older content of similar surface relevance. Configurable. Per-category. Cheap.
What pure semantic retrieval misses
Vector search is good at finding semantically similar chunks. It is not good at knowing which of those chunks the team currently believes. If the codebase has a six-month-old design doc that says "we use Approach A" and a session log from last week that says "we replaced Approach A with Approach B for these reasons", a query about the approach will surface both. They are both relevant. Without temporal weighting, both are returned at full weight, and the agent has to figure out which one is current.
The agent will sometimes get this right (especially if both chunks contain dates). It will sometimes get it wrong, in ways that are hard to spot in PR review. When it gets it wrong, it is because it trusted the older content.
The fix is not to delete the old content. The old content is useful: it tells the team what was tried, why it was abandoned, what the alternatives were when the decision got made. The fix is to weight the old content lower at retrieval time so it does not crowd out the new.
Half-life decay, in one paragraph
Every chunk in the brain has a last-modified timestamp. At retrieval time, the relevance score gets multiplied by a decay factor:
decay = 0.5 ^ (age_days / half_life_days)
A chunk modified today has decay = 1.0. A chunk modified 90 days ago, with a 90-day half-life, has decay = 0.5. A year-old chunk with the same half-life has decay = ~0.06. Multiply the existing semantic relevance score by the decay and re-rank.
That is the entire mechanism. One multiplication, one config knob, no new infrastructure. The relevance graph reshapes so that recent content surfaces ahead of older content of similar topic relevance.
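A minimal sketch of that multiplication in Python, assuming each chunk carries a timezone-aware last-modified timestamp. The function names `decay_factor` and `decayed_score` are illustrative, not part of any particular retrieval library.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def decay_factor(last_modified: datetime, half_life_days: float,
                 now: Optional[datetime] = None) -> float:
    """Return 0.5 ** (age_days / half_life_days) for one chunk."""
    now = now or datetime.now(timezone.utc)
    # Clamp at zero so a timestamp slightly in the future does not boost the score.
    age_days = max((now - last_modified).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def decayed_score(relevance: float, last_modified: datetime,
                  half_life_days: float, now: Optional[datetime] = None) -> float:
    """Multiply the semantic relevance score by the decay factor."""
    return relevance * decay_factor(last_modified, half_life_days, now)


# Usage: a chunk last modified 90 days ago, scored with a 90-day half-life.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
ninety_days_ago = now - timedelta(days=90)
print(decayed_score(0.8, ninety_days_ago, half_life_days=90, now=now))
```

Passing `now` explicitly keeps the calculation reproducible in tests; in production it would default to the current time.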
Per-category half-lives
A flat half-life across all content is wrong. Reference material (RFCs, formal specs, regulatory documents) does not become obsolete at the same rate as session logs. Different content types want different half-lives.
We configure ours like this:
| Category | Half-life | Reasoning |
|---|---|---|
| Decisions (ADRs, design choices) | 180 days | Architectural decisions usually stay current for months and are revisited rarely. |
| Lessons (gotchas, post-mortems) | 365 days | Once captured, a real bug pattern stays relevant for a year or longer. |
| Facts (per-project state) | 90 days | Project state changes fast; old endpoints get retired, new ones land. |
| Sessions (raw conversation) | 60 days | Most conversation is about today's work. Old chats have low ongoing value. |
| Entities (people, products, vendors) | 730 days | Entity relationships move slowly. The CTO is the same CTO. |
These numbers are a starting point. The right values for any team depend on how fast the team's truth changes. The point is the shape: short half-lives for fast-moving content, long half-lives for slow-moving content.
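One way to express the table as configuration, sketched in Python. The category keys and the fallback value for uncategorized chunks are assumptions made for this example; only the half-life numbers come from the table.

```python
# Half-lives in days, copied from the table above.
# The category keys are illustrative names, not a fixed schema.
HALF_LIFE_DAYS = {
    "decisions": 180,
    "lessons": 365,
    "facts": 90,
    "sessions": 60,
    "entities": 730,
}

# Assumed fallback for chunks whose category is missing or unknown.
DEFAULT_HALF_LIFE_DAYS = 90


def half_life_for(category: str) -> float:
    """Look up the half-life for a category, falling back to the default."""
    return HALF_LIFE_DAYS.get(category, DEFAULT_HALF_LIFE_DAYS)


def decay_for(category: str, age_days: float) -> float:
    """Decay factor for a chunk of the given category and age."""
    return 0.5 ** (age_days / half_life_for(category))


# Usage: the same 180-day-old timestamp decays differently per category.
for category in HALF_LIFE_DAYS:
    print(category, round(decay_for(category, 180), 3))
```

Keeping the values in one mapping means tuning a half-life is a one-line change, and the shape (short for fast-moving content, long for slow-moving content) stays visible at a glance.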
What this changes at retrieval time
Without decay, a query about the team's auth pattern returns:
- Old design doc from when the system launched (1.0 score).
- Recent session log explaining the auth refactor (1.0 score).
- Mid-life design review from six months ago (1.0 score).
The agent sees three voices, equally weighted, and has to decide. It will sometimes pick the loudest signal in the training-data sense, which is often the oldest doc.
With decay (90-day half-life on facts), the same query returns:
- Recent session log (1.0 score).
- Mid-life design review (~0.25 score).
- Old design doc (~0.06 score).
The agent gets the right answer first. The other two are still in retrieval (the old context is preserved if it is needed) but they are not crowding out the current truth.
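The re-ranking can be sketched with three hypothetical chunks. The similarity scores and ages below are made up for illustration; the old design doc is deliberately given the highest raw similarity to show that decay still moves the recent session log to the top.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    name: str
    similarity: float  # raw semantic score, made up for this example
    age_days: float    # days since last modification, assumed


def rerank(chunks, half_life_days):
    """Sort chunks by similarity * 0.5 ** (age / half-life), best first."""
    def final_score(chunk):
        return chunk.similarity * 0.5 ** (chunk.age_days / half_life_days)
    return sorted(chunks, key=final_score, reverse=True)


candidates = [
    Chunk("old design doc", 0.82, 365),
    Chunk("design review", 0.80, 180),
    Chunk("recent session log", 0.78, 3),
]

# With a 90-day half-life the order flips relative to raw similarity.
ranked = rerank(candidates, half_life_days=90)
for chunk in ranked:
    print(chunk.name)
```

All three chunks stay in the result list. Only their order changes, which is what keeps the older context available when the agent needs it.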
A useful side effect: garbage management
Decay also degrades the visibility of stale content automatically. Old session logs that nobody has touched in months stop surfacing in retrieval, without anyone having to delete them. The brain keeps the history (deletion is a separate decision) but does not let the history pollute the current conversation.
Most knowledge bases need active garbage collection. With decay, a lot of garbage management happens for free, as a function of the chunk's last-modified timestamp.
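To get a feel for how quickly content fades, the decay formula can be inverted to ask how old a chunk must be before its multiplier drops below some level. The threshold here is purely illustrative: whether a chunk actually stops surfacing depends on what else competes for the top results, not on a fixed cutoff.

```python
import math


def days_until_below(half_life_days: float, threshold: float) -> float:
    """Age in days at which 0.5 ** (age / half_life) equals the threshold.

    Solves threshold = 0.5 ** (age / half_life) for age.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    return half_life_days * math.log2(1.0 / threshold)


# Usage: how long until a session log (60-day half-life) keeps only
# one eighth of its original weight? Three half-lives.
print(days_until_below(60, 0.125))
```

The same function shows why long half-lives keep reference-style content visible: with a 730-day half-life, a chunk takes six years to fall to one eighth of its weight.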
When decay is wrong
Three cases where flat (no decay) or near-flat retrieval is the right call:
- Reference docs: regulatory text, formal specs, vendor contracts. These need to be retrievable at full strength regardless of age.
- Historical research: when the agent is explicitly asking "what did we use to do", the agent wants the older content back.
- Single-snapshot domains: some topics genuinely do not change (the date a company was founded, the specific text of a contract).
Decay should be configurable per-query as well as per-category. Most queries get the default decay; some queries explicitly ask for the historical view and get the flat retrieval.
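A sketch of a per-query override, assuming the caller can flag a query as historical. The `historical` parameter and the convention that a half-life of `None` means "no decay" are assumptions for this example, not an established API.

```python
from typing import Optional


def effective_decay(age_days: float, half_life_days: Optional[float]) -> float:
    """Decay factor; a half-life of None means flat retrieval (no decay)."""
    if half_life_days is None:
        return 1.0
    return 0.5 ** (age_days / half_life_days)


def score(relevance: float, age_days: float,
          category_half_life: Optional[float],
          historical: bool = False) -> float:
    """Apply the category's decay unless the query asks for the historical view."""
    half_life = None if historical else category_half_life
    return relevance * effective_decay(age_days, half_life)


# Default query: a year-old chunk with a 90-day half-life is heavily discounted.
print(score(0.8, 365, 90))
# Historical query: the same chunk comes back at full strength.
print(score(0.8, 365, 90, historical=True))
```

The same `None` convention covers reference documents: giving that category no half-life keeps it at full strength regardless of age.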
Take the Next Step
If your team's RAG layer keeps surfacing stale content alongside current decisions, the fix is half-life decay. One weight, one config. We help teams instrument it cleanly with per-category half-lives that match how fast the team's truth actually changes. Get in touch if you want a fresh pair of eyes on yours.