Our work · Cross-Session Memory Engine

Most AI products fake personalization. We built the version that actually remembers.

A memory engine that learns about each user across sessions — and treats a user-stated commitment differently from a single-pass observation. Auditable, decay-aware, and visible to the user, not just hidden in a database somewhere upstream of the prompt.

Technology showcase · Agntic Consulting · Memory architecture
Alex Marchetti · user · profile_id 7c2a · last seen 4m ago
Entries: 23 · Confirmed: 9 · Inferred: 14
Confirmed: Always cite sources inline with bracket numbers. (category · response_style · stated 14 days ago · 7)
Confirmed: Use JavaScript instead of TypeScript for all code samples. (category · tooling · stated 5 weeks ago · 3)
Inferred: Prefers concise answers, under ~120 words when possible. (category · response_style · observed across 6 sessions · 6)
Inferred: Working on a procurement-automation project for a mid-market client. (category · context · reinforced 3 times · 3)
Inferred · new: Reads documentation in the morning, codes in the afternoon. (category · pattern · 1 observation, held back from prompt · 1)
Association graph: 2 clusters · 1 isolate

The profile is a live view, not a transcript. Confirmed entries are stated, immortal. Inferred entries earn their place by repetition — clustered facts reinforce each other; isolates fade.

Most AI products fake personalization by re-reading the last few turns of chat history. The agent sounds personal for the length of a session and forgets you the moment a new one starts. We build the version that actually accumulates context over months — that distinguishes what the user told it from what it merely guessed, lets the user see and edit what's stored, and refuses to pretend it knows things it doesn't.

01 · Two classes of memory
What the user said vs. what the agent guessed

Every entry in the profile carries a source. Confirmed entries come from explicit user statements — "from now on, always cite sources" — and are exempt from decay, period. Inferred entries come from session analysis and require at least two reinforcements before the agent treats them as established enough to surface in its prompt. A single-pass observation can be written, but it sits below the surface until reality confirms it again. The two classes never blur: an inferred update can never silently overwrite a confirmed entry — the store returns a conflict and waits for human approval.

Confirmed · User-stated

Stated directly. Never decays.

Comes from explicit directives — "from now on", "always", "never". Treated as immortal until the user revises or revokes it.
Decay: Exempt
Surfaced at: First write
Overwrites: Inferred conflicts
Label: [confirmed]
"From now on, always cite sources inline." → written to profile, exempt from every decay sweep.
Inferred · Pattern-derived

Observed. Earns its place.

Derived from session analysis. Single observations are held below the surface. It takes at least 2 reinforcements before the agent treats the fact as known.
Decay: Active
Surfaced at: reinforcement ≥ 2
Overwrites: Never confirmed
Label: [reinforced]
"User seems to prefer terse answers." → written with reinforcement=1, withheld from prompt until reality agrees.

The store enforces the boundary. Inferred entries can be reinforced, decayed, or archived. Confirmed entries sit above all of that — and an inferred write that conflicts with one returns an error, not a silent overwrite.
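As a rough illustration of that boundary, here is a minimal Python sketch. The names `ProfileStore`, `Entry`, and `ConflictError`, the idea of keying entries by a `key` string, and the rule that a differing inferred value restarts the count are all assumptions made for this example, not the engine's actual API.

```python
from dataclasses import dataclass


class ConflictError(Exception):
    """Raised when an inferred write collides with a confirmed entry."""


@dataclass
class Entry:
    key: str              # hypothetical slot identifier, e.g. "code_language"
    text: str
    source: str           # "confirmed" or "inferred"
    reinforcement: int = 1


class ProfileStore:
    SURFACE_MIN = 2  # inferred entries need reinforcement >= 2 to reach the prompt

    def __init__(self):
        self._entries = {}

    def write_confirmed(self, key, text):
        # A user statement replaces whatever was stored under this key.
        self._entries[key] = Entry(key, text, "confirmed")

    def write_inferred(self, key, text):
        existing = self._entries.get(key)
        if existing is None:
            # First observation: stored, but held below the surface.
            self._entries[key] = Entry(key, text, "inferred")
        elif existing.source == "confirmed":
            if existing.text != text:
                # Never overwrite silently; surface the conflict for human approval.
                raise ConflictError(f"{key!r} is confirmed; needs human approval")
        elif existing.text == text:
            existing.reinforcement += 1
        else:
            # Assumption: a different inferred value restarts the count.
            self._entries[key] = Entry(key, text, "inferred")

    def surfaced(self):
        """Entries eligible for the prompt, confirmed first."""
        entries = list(self._entries.values())
        confirmed = [e for e in entries if e.source == "confirmed"]
        inferred = [
            e for e in entries
            if e.source == "inferred" and e.reinforcement >= self.SURFACE_MIN
        ]
        return confirmed + inferred
```

In this sketch a single `write_inferred` call leaves `surfaced()` empty, and a second matching call brings the entry into the prompt. An inferred write that contradicts a confirmed entry raises `ConflictError` and changes nothing.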

02 · Commitments vs. venting
What turns into a memory, and what doesn't

The difference between "I'm done with TypeScript" and "From now on, use JavaScript instead of TypeScript for all my code" is the difference between venting and a directive. The first is frustration; the second is a commitment. Most products treat them the same and end up with prompts full of half-meant complaints. A fast-tier LLM watches every turn for the directive shape — explicit cues like from now on, always, never — and rejects hypotheticals, frustration, and idle musing. Only committed directives become confirmed memories.

Directive: "From now on, use JavaScript instead of TypeScript for all my code." Explicit, forward-looking, committed. Triggers a confirmed write. → Confirmed
Rule: Write a confirmed entry only when the message has directive shape (\bfrom now on\b, always, never, remember that) and the fast-tier LLM classifies it as a clear, forward-looking commitment.

The filter is what makes the memory useful. Without it the profile fills with hypotheticals, frustration, and out-of-context one-liners that pollute every future prompt.
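The two-part rule can be sketched in Python as follows. The cue list comes from the rule above, but the exact production pattern is not shown, so `DIRECTIVE_CUES` is an approximation. The `classify` callable stands in for the fast-tier LLM, and both the label `"commitment"` and the choice to run the regex before the model call are assumptions for illustration.

```python
import re

# Approximation of the directive cues named in the rule above.
DIRECTIVE_CUES = re.compile(
    r"\b(from now on|always|never|remember that)\b", re.IGNORECASE
)


def has_directive_shape(message):
    """Cheap lexical check for an explicit directive cue."""
    return DIRECTIVE_CUES.search(message) is not None


def should_write_confirmed(message, classify):
    """Return True only if both the cue check and the classifier agree.

    `classify` is a placeholder for the fast-tier LLM call: it takes the
    message text and returns a label string.
    """
    if not has_directive_shape(message):
        return False  # no cue, so no model call is needed
    return classify(message) == "commitment"


def stub_classifier(message):
    # Toy stand-in for the LLM so the example runs offline. A real
    # classifier would judge intent rather than match a prefix.
    hypothetical = message.lower().startswith(("what if", "maybe", "i wish"))
    return "hypothetical" if hypothetical else "commitment"
```

With the stub, "From now on, use JavaScript instead of TypeScript for all my code." passes both checks. "I'm done with TypeScript" fails the cue check, and "What if I never used TypeScript again?" has a cue but is rejected by the classifier.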

03 · The association graph
Facts that cluster reinforce each other

Profile entries don't sit in a flat list. They live in an association graph that links facts sharing a category, tags, or vocabulary. Tightly clustered facts reinforce each other: a recently retrieved entry lifts the relevance score of everything it connects to. Isolated facts, like the one-liner you said six months ago that never came up again, drift below the relevance threshold and get archived. The decay score is a weighted composite of confidence, recency, graph connectivity, and reinforcement count; entries scoring below 0.2 are archived, and entries below 0.05 are deleted. Confirmed entries are explicitly excluded from the sweep, every time.

[Figure: association graph of the user profile. Legend: Clustered, Isolate. Cluster labels: RESPONSE STYLE, TOOLING. Two nodes are labeled "decaying".]
Tight cluster · score ↑ 0.78
  • Always cite sources inline
  • Prefers concise answers
  • Wants reasoning shown
  • No marketing-style hype
Isolated · score ↓ 0.12
  • Mentioned a hiking trip once
  • Asked about Reykjavík weather

Cluster connectivity is one input of four. The decay score weights confidence, recency, graph connections, and reinforcement count. Below 0.2 the entry archives; below 0.05 it deletes. Confirmed entries are skipped entirely.
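To make the composite concrete, here is a minimal Python sketch of the score and the sweep. The two thresholds (0.2 and 0.05) come from the text. Everything else is an assumption chosen for illustration: the equal weights, the 30-day half-life for recency, the saturation point of 5 for connectivity and reinforcement, and the `Entry` field names.

```python
from dataclasses import dataclass

ARCHIVE_BELOW = 0.2   # from the text
DELETE_BELOW = 0.05   # from the text

# Assumed values: the source gives neither weights nor normalisers.
WEIGHTS = {
    "confidence": 0.25,
    "recency": 0.25,
    "connectivity": 0.25,
    "reinforcement": 0.25,
}
HALF_LIFE_DAYS = 30.0
SATURATION = 5


@dataclass
class Entry:
    source: str                   # "confirmed" or "inferred"
    confidence: float             # 0.0 to 1.0
    days_since_reinforced: float
    graph_degree: int             # number of linked entries
    reinforcement: int


def decay_score(entry):
    """Weighted composite in [0, 1]; higher means healthier."""
    recency = 0.5 ** (entry.days_since_reinforced / HALF_LIFE_DAYS)
    connectivity = min(entry.graph_degree / SATURATION, 1.0)
    reinforcement = min(entry.reinforcement / SATURATION, 1.0)
    return (
        WEIGHTS["confidence"] * entry.confidence
        + WEIGHTS["recency"] * recency
        + WEIGHTS["connectivity"] * connectivity
        + WEIGHTS["reinforcement"] * reinforcement
    )


def sweep_action(entry):
    """Return 'keep', 'archive', or 'delete' for one entry."""
    if entry.source == "confirmed":
        return "keep"  # confirmed entries are skipped entirely
    score = decay_score(entry)
    if score < DELETE_BELOW:
        return "delete"
    if score < ARCHIVE_BELOW:
        return "archive"
    return "keep"
```

Under these assumed weights, an unconnected inferred entry with one observation and no activity for six months lands in the archive band, while a well-connected, recently retrieved entry stays comfortably above it.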

04 · Cold-start honesty
The product refuses to pretend

The most damaging thing a personalized agent can do is fake familiarity on the first session. When the profile has fewer than five entries, the retriever returns an empty result — explicitly, not a half-confident guess. The system prompt receives a fallback string that tells the agent to ask clarifying questions instead of acting as if it remembers things it does not. It's the same instinct as the confidence tag on retrieval: the architecture knows when it doesn't know, and says so.

Profile entries: 3 / 5 · cold start
[Figure: entry-count scale from 1 to 7, with zones labeled "empty retrieval" and "profile injected".]
Below 5: Retriever returns the cold-start fallback. "No user profile available yet. Ask clarifying questions as needed."
5+: Profile is injected into the system prompt. Confirmed entries first; inferred entries with reinforcement ≥ 2 follow as [reinforced].
Steady: Retrieval becomes a reinforcement event. Surfaced entries get their last_reinforced bumped; frequent retrievals survive decay.

An empty retrieval is a feature. Five-entry threshold, hard-coded fallback string. The agent doesn't roleplay knowing you when it doesn't.
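The cold-start gate and the retrieval-as-reinforcement step might look roughly like this in Python. The threshold, the fallback string, the ordering, and the `[confirmed]` / `[reinforced]` labels come from the text. The function name `build_profile_block`, the dict-shaped entries, and returning the fallback when nothing is eligible to surface are assumptions for this sketch.

```python
from datetime import datetime, timezone

COLD_START_MIN = 5
FALLBACK = "No user profile available yet. Ask clarifying questions as needed."


def build_profile_block(entries, now=None):
    """Return the text injected into the system prompt.

    Each entry is a dict with the keys "text", "source",
    "reinforcement", and "last_reinforced" (hypothetical shape).
    """
    now = now or datetime.now(timezone.utc)

    # Cold start: too little is known, so say so instead of guessing.
    if len(entries) < COLD_START_MIN:
        return FALLBACK

    confirmed = [e for e in entries if e["source"] == "confirmed"]
    reinforced = [
        e for e in entries
        if e["source"] == "inferred" and e["reinforcement"] >= 2
    ]

    lines = []
    for label, group in (("confirmed", confirmed), ("reinforced", reinforced)):
        for entry in group:
            # Retrieval counts as a reinforcement event.
            entry["last_reinforced"] = now
            lines.append(f"[{label}] {entry['text']}")

    # Assumption: if nothing is eligible to surface, fall back as well.
    return "\n".join(lines) or FALLBACK
```

Single-observation entries count toward the five-entry threshold here but are never written into the block, so they stay held back until they are reinforced.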

05 · User-facing controls
Auditable, not magic

The user sees exactly what the agent remembers. Inferred entries can be confirmed (promote to immortal), dismissed (delete the suggestion), or edited. Any entry can be force-forgotten. The whole profile can be re-derived from raw interaction logs — a fresh pass that rebuilds the inferred portion from history without bias from the existing entries. The whole thing can be exported as JSON. There are ten profile endpoints exposed for this; nothing about the memory layer is hidden from the person it's modeling.

Your profile · 23 entries
Always cite sources inline. · Confirmed · stated 14 days ago · Actions: Edit, Forget
Prefers concise answers, under ~120 words. · Inferred · reinforced 6 times · Actions: Confirm, Dismiss
Reads documentation in the morning, codes in the afternoon. · Inferred · 1 observation · held below surface · Actions: Confirm, Dismiss
Working on a procurement-automation project. · Inferred · reinforced 3 times · Actions: Confirm, Dismiss
Profile API · /profile/*
GET /profile · All entries
GET /profile/suggestions · Inferred awaiting review
GET /profile/graph · Association graph
GET /profile/stats · Counts & cadence
GET /profile/export · Full JSON dump
POST /profile/confirm/:id · Promote inferred
POST /profile/dismiss/:id · Reject suggestion
POST /profile/forget/:id · Force-forget entry
POST /profile/re-derive · Rebuild inferred from logs
DELETE /profile · Wipe all user memory

Memory is a surface, not a black box. The endpoints exist because enterprise buyers ask for them — review, audit, export, force-forget — and most consulting shops do not deliver them.
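As an illustration of the behaviour behind a few of these endpoints, here is a small in-memory Python sketch. It is not the service's implementation: the `Profile` class, its method names, the dict-shaped entries, and the assumption that every inferred entry counts as a suggestion awaiting review are all hypothetical.

```python
import json


class Profile:
    """In-memory stand-in for the operations behind /profile/*."""

    def __init__(self, entries):
        # Copy each entry so callers' dicts are not mutated.
        self.entries = {e["id"]: dict(e) for e in entries}

    def suggestions(self):
        # Mirrors GET /profile/suggestions (assumed: all inferred entries).
        return [e for e in self.entries.values() if e["source"] == "inferred"]

    def confirm(self, entry_id):
        # Mirrors POST /profile/confirm/:id, promoting inferred to confirmed.
        entry = self.entries[entry_id]
        if entry["source"] != "inferred":
            raise ValueError("only inferred entries can be confirmed")
        entry["source"] = "confirmed"
        return entry

    def dismiss(self, entry_id):
        # Mirrors POST /profile/dismiss/:id, rejecting a suggestion.
        if self.entries[entry_id]["source"] != "inferred":
            raise ValueError("only inferred entries can be dismissed")
        del self.entries[entry_id]

    def forget(self, entry_id):
        # Mirrors POST /profile/forget/:id; works on any entry.
        del self.entries[entry_id]

    def export(self):
        # Mirrors GET /profile/export, a full JSON dump.
        return json.dumps(list(self.entries.values()), indent=2)

    def wipe(self):
        # Mirrors DELETE /profile.
        self.entries.clear()
```

The point of the sketch is the asymmetry: confirm and dismiss apply only to inferred entries, while forget applies to anything, including confirmed entries the user stated themselves.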

06 · Why we built it this way
The memory layer enterprise buyers ask for

Most teams ship "memory" by stuffing the last N turns of chat history into the next prompt and calling it personalization. That works for a session. It does not work for a relationship. Once a customer is six weeks in and expects the agent to know them, the cracks show: the model forgets stated preferences, invents ones the user never said, and has no way to tell the user what's stored. We built this engine because every serious deployment we've shipped needed it — separation of confirmed from inferred, a filter that rejects venting, a graph that decays isolated facts, a cold-start that admits ignorance, and a panel where the user can see and edit everything.

None of these pieces are exotic on their own. The work is in stacking them so the agent stops being a goldfish in a polite tone of voice and starts being something a customer can actually live with for a year.

Need an agent that remembers correctly?

We build memory layers for clients where the failure mode "the model invented something the user never said" is unacceptable — regulated industries, internal-tools teams, anywhere personalization has to be auditable.

Book a discovery call