Oxagen Docs

How memory works inside an Oxagen workspace — the four layers (semantic, episodic, telemetry, working set), the master toggle, what each layer retains, and how the chat path reads them on every turn.

Overview

A workspace is not just a graph — it is the unit of memory for every agent that touches it. Memory in Oxagen is workspace-scoped, layered, and opt-in. One toggle (workspace.config.memory_enabled) controls whether the workspace records and recalls anything beyond the active conversation window.

There are four layers, each with a different lifetime, store, and read path. They are independent: disabling memory pauses capture into the episodic and telemetry layers but leaves the active conversation untouched, and the ontology graph itself is always available because it is the workspace.

Layer	Lifetime	Store	What it holds
Semantic	Permanent (until soft-deleted)	Neo4j + Postgres mirror	The workspace ontology — typed nodes and edges. The thing the agent reasons over.
Episodic	Permanent (summary only)	Postgres `app.memory_episodes`	Rolled-up summaries of older conversational turns. Source raw turns are hard-deleted on consolidation.
Telemetry	Permanent	Postgres `agent_memory.*` + Neo4j `_mem:*`	Per-action audit, grouped sequences, promoted patterns, benchmark trends.
Working set	TTL (default 1h, refreshed on read)	Redis	Volatile, per-conversation scratchpad — planner decisions, recent retrievals, intermediate tool outputs.

The project-wide rule that governs all four: no raw user data. Episodic consolidation deletes source messages once a summary exists. Telemetry records action types and outcomes, not payload bodies. The working set is volatile by construction. The ontology stores typed entities and embeddings, never raw documents.

Enabling memory for a workspace

The master toggle lives at workspace.config.memory_enabled (a JSONB field on org.workspaces.config). When false, the workspace still has its ontology graph — semantic memory is structural, not optional — but durable capture into the episodic and telemetry layers is paused.

Via the app

Open the workspace.
Settings → Memory → Enable memory.

The setting round-trips through PUT /v1/workspaces/{workspace_id}/settings under the memory.enabled key. The server bridges that into workspace.config["memory_enabled"] so every read path (is_memory_enabled(workspace)) sees the same value.
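The bridge described above can be pictured with a minimal sketch. It assumes the settings payload and the workspace config are plain dictionaries; `apply_settings` and `memory_enabled_from_config` are hypothetical names for illustration, not the server's actual functions, and treating a missing key as "off" is an assumption based on memory being opt-in.

```python
from typing import Any, Dict


def apply_settings(config: Dict[str, Any], settings: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the workspace config with memory.enabled bridged in."""
    updated = dict(config)
    memory = settings.get("memory") or {}
    if "enabled" in memory:
        # The settings key memory.enabled maps onto config["memory_enabled"].
        updated["memory_enabled"] = bool(memory["enabled"])
    return updated


def memory_enabled_from_config(config: Dict[str, Any]) -> bool:
    # Assumption: a missing key means memory is off, since memory is opt-in.
    return bool(config.get("memory_enabled", False))


config = apply_settings({"theme": "dark"}, {"memory": {"enabled": True}})
print(memory_enabled_from_config(config))
```

Keeping a single stored key means every read path checks one value regardless of which surface wrote it.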

Via MCP

workspace.enable_memory
workspace.disable_memory

The bearer token resolves the workspace; no arguments needed.

Via REST

POST /v1/workspaces/{workspace_id}/memory/enable
POST /v1/workspaces/{workspace_id}/memory/disable

These endpoints are internal (not in /openapi.json) and gate on workspace membership.
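As a rough illustration, the requests for the toggle endpoints and the settings route could be built like this with the standard library. The host name is a placeholder, and sending the credential as a bearer header on REST calls is an assumption; only the paths, methods, and the `{"memory": {"enabled": ...}}` body come from this page. The sketch builds the requests without sending them.

```python
import json
import urllib.request

BASE_URL = "https://api.example.invalid"  # hypothetical host, replace with your deployment


def toggle_request(workspace_id: str, enable: bool, token: str) -> urllib.request.Request:
    """Build the POST request for the enable/disable endpoint."""
    action = "enable" if enable else "disable"
    return urllib.request.Request(
        f"{BASE_URL}/v1/workspaces/{workspace_id}/memory/{action}",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )


def settings_request(workspace_id: str, enabled: bool, token: str) -> urllib.request.Request:
    """Build the PUT request that sets memory.enabled through the settings route."""
    body = json.dumps({"memory": {"enabled": enabled}}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/workspaces/{workspace_id}/settings",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


req = toggle_request("ws_123", enable=True, token="example-token")
print(req.get_method(), req.full_url)
```

Passing either request to `urllib.request.urlopen` would send it; error handling is omitted here.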

Layer 1 — Semantic memory (the ontology)

The workspace knowledge graph is the semantic memory layer. Every node and edge is workspace-scoped, RLS-enforced at the database, and written through the same provider so vector search and graph traversal agree on what exists.

The chat path embeds this layer into every turn through the ontology tools — hybrid search picks the most relevant nodes, the prompt-builder injects them into context, and the response links back to the node IDs that supported it.

Semantic memory is always live for a workspace. The memory_enabled toggle does not turn it off. To hide a node, soft-delete it; to remove a slice of the graph, use the connection-delete flow.

See Ontology for the full data model, hybrid-search behavior, and the credit-metered prompt-run audit.

Layer 2 — Episodic memory

Episodic memory holds summaries of older conversation turns so the chat path can recall context that has scrolled out of the active window without reading raw transcripts.

Data model

One row per consolidated chunk in app.memory_episodes:

Column	Purpose
`conversation_id`	The conversation the episode belongs to. Cascade-deletes with the parent conversation.
`summary`	LLM-generated third-person paragraph covering the consolidated turns.
`source_message_ids`	UUIDs of the source `app.messages` rows. The source rows are hard-deleted when the episode is inserted, so this column is a forensics breadcrumb, not a live foreign key.
`turn_count`	How many source messages folded into this episode.
`workspace_id`	RLS scope. Inherited from `WorkspaceScopedBase`.

The summary prompt explicitly prefers types and roles over names: it writes "the user" (rather than, say, "Mac") and "the assistant", and mentions named entities only when the conversation is meaningfully about them.

Read path — live

On every chat turn, _get_episodic_context reads the five most recent episodes for the active conversation and prepends them to the system prompt as:

Earlier conversation summaries (most recent last):
- (12 turns) <summary>
- (8 turns) <summary>
- ...

Read failures (missing table, transient DB error) are swallowed and logged at WARNING — episodic recall is best-effort and never breaks a chat turn. If the workspace has no episodes yet, the block is empty and nothing is injected.
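The shape of the injected block can be sketched as a small formatter. This is an illustration only: the `Episode` dataclass and `format_episodic_block` are hypothetical stand-ins, and the sketch assumes the caller passes episodes ordered oldest first.

```python
from dataclasses import dataclass
from typing import List

HEADER = "Earlier conversation summaries (most recent last):"


@dataclass
class Episode:
    turn_count: int
    summary: str


def format_episodic_block(episodes: List[Episode], limit: int = 5) -> str:
    """Format the `limit` most recent episodes; return "" when there are none."""
    recent = episodes[-limit:] if limit > 0 else []
    if not recent:
        # Nothing to inject: the caller leaves the system prompt unchanged.
        return ""
    lines = [f"- ({ep.turn_count} turns) {ep.summary}" for ep in recent]
    return "\n".join([HEADER, *lines])


print(format_episodic_block([
    Episode(12, "The user asked how ingestion works."),
    Episode(8, "The assistant walked through connector setup."),
]))
```

Returning an empty string for the no-episodes case lets the caller skip injection with a simple truthiness check.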

Write path — consolidation

Consolidation rolls up older raw messages into one episode and hard-deletes the sources. summarize_old_turns(session, conversation_id, retain_last=20, max_age_days=30) does the work:

Picks messages older than max_age_days that fall outside the retain_last most-recent rows.
Asks the orchestration router (kind="qa") to write a single-paragraph summary using the workspace's configured model.
Inserts a MemoryEpisode row.
Hard-deletes the source app.messages rows.

The function is idempotent: consolidated rows are gone, so a re-run finds no new candidates among them. Tails shorter than two messages are skipped.
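The selection rule in the first step, plus the short-tail skip, can be sketched over an in-memory list. This is not the real `summarize_old_turns`: the `Message` dataclass and `pick_consolidation_candidates` are hypothetical, and there is no database session or LLM call here. The sketch only shows which rows would qualify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class Message:
    id: str
    created_at: datetime


def pick_consolidation_candidates(
    messages: List[Message],
    now: datetime,
    retain_last: int = 20,
    max_age_days: int = 30,
) -> List[Message]:
    """Return messages eligible for roll-up, oldest first."""
    ordered = sorted(messages, key=lambda m: m.created_at)
    # Protect the retain_last most recent rows regardless of age.
    older = ordered[:-retain_last] if retain_last > 0 else ordered
    cutoff = now - timedelta(days=max_age_days)
    candidates = [m for m in older if m.created_at < cutoff]
    # Tails shorter than two messages are not worth summarising.
    return candidates if len(candidates) >= 2 else []


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
history = [Message(str(i), now - timedelta(days=60 - i)) for i in range(30)]
print([m.id for m in pick_consolidation_candidates(history, now)])
```

Because consolidated rows are deleted, running the same selection on what remains yields no candidates, which is the idempotence property described above.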

How episodes are produced. Callers invoke the consolidator function directly — the worker does not run it on a schedule. Recall reads episodes when they exist and returns the empty string when they do not, so a workspace with no consolidator calls still gets a well-formed recall response.

Lifecycle

Episodes are soft-deletable (the is_deleted column inherited from SQLTableBase) and cascade-delete with the parent conversation. Deleting a conversation removes both the remaining raw messages and every episode summarising it.

Layer 3 — Agent telemetry

Telemetry captures what agents did in the workspace — every tool call, every mutation, every dead end — and promotes recurring outcomes into patterns that are injected back into the agent system prompt before the next action.

This is the layer documented in detail under Agent Memory. The short version:

agent_memory.agent_actions — one row per tool call, with action type, target node type, outcome, and error code (no payloads).
agent_memory.agent_sequences — actions grouped by session.
agent_memory.agent_patterns — promoted when ≥5 sequences share the same (action_type, node_type, error_code) at ≥60% confidence.
agent_memory.agent_benchmarks — nightly metric values per workspace.
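The promotion threshold can be illustrated with a small sketch. This page does not define how confidence is computed, so the code assumes one plausible reading: the share of sequences touching a given (action_type, node_type) that ended with the same error_code. `SequenceOutcome` and `promote_patterns` are hypothetical names, and the real promotion logic may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Optional, Tuple

PatternKey = Tuple[str, str, Optional[str]]


class SequenceOutcome(NamedTuple):
    sequence_id: str
    action_type: str
    node_type: str
    error_code: Optional[str]


def promote_patterns(
    outcomes: Iterable[SequenceOutcome],
    min_sequences: int = 5,
    min_confidence: float = 0.6,
) -> Dict[PatternKey, float]:
    """Return {pattern key: confidence} for keys that clear both thresholds."""
    totals = defaultdict(set)  # (action_type, node_type) -> sequence ids
    hits = defaultdict(set)    # (action_type, node_type, error_code) -> sequence ids
    for o in outcomes:
        totals[(o.action_type, o.node_type)].add(o.sequence_id)
        hits[(o.action_type, o.node_type, o.error_code)].add(o.sequence_id)

    promoted: Dict[PatternKey, float] = {}
    for key, sequences in hits.items():
        # Assumed definition of confidence; see the note above.
        confidence = len(sequences) / len(totals[key[:2]])
        if len(sequences) >= min_sequences and confidence >= min_confidence:
            promoted[key] = confidence
    return promoted


rows = [SequenceOutcome(f"s{i}", "update", "Person", "E_CONFLICT") for i in range(6)]
rows += [SequenceOutcome(f"s{i}", "update", "Person", None) for i in range(6, 8)]
print(promote_patterns(rows))
```

Counting distinct sequence IDs rather than raw actions keeps one noisy session from promoting a pattern on its own.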

Each Postgres row has a Neo4j twin under the _mem:* label space, so telemetry travels alongside the ontology in the same graph and an operator can hop from any domain node to its full agent history in one edge.

Telemetry is the layer where the memory_enabled toggle has the most behavioural impact — it is the only layer that also changes how the agent runs (via pre-execution context injection). Disabling memory stops capture and stops injection; existing rows remain queryable for audit.

The legacy module is named oxagen.domains.agent_memory; the oxagen.domains.memory.__init__ docstring marks it for a rename to agent_telemetry in a follow-up. Treat agent telemetry as the canonical name for this layer; the existing /docs/agent-memory page covers the full surface.

Layer 4 — Working set

The working set is the volatile, per-conversation scratchpad. It holds whatever the planner needs to remember inside a single conversation but does not need to persist beyond it: the last planner decision, the most recent retrieval result, intermediate tool outputs.

Layout

oxa:ws:<workspace_id>:<conversation_id>:<slot>

Values are JSON-encoded. Slot names are caller-defined; the API takes any string. Suggested conventions: planner_decision, recent_retrieval, intermediate_results.
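A minimal sketch of the key layout follows. The helper names are hypothetical, and the sketch assumes workspace and conversation IDs contain no colons (UUIDs, for example), so the segments stay unambiguous.

```python
import json
from typing import Any

PREFIX = "oxa:ws"


def slot_key(workspace_id: str, conversation_id: str, slot: str) -> str:
    """Build the Redis key for one working-set slot."""
    return f"{PREFIX}:{workspace_id}:{conversation_id}:{slot}"


def conversation_match(workspace_id: str, conversation_id: str) -> str:
    """Glob pattern covering every slot of one conversation."""
    return f"{PREFIX}:{workspace_id}:{conversation_id}:*"


def encode_value(value: Any) -> str:
    # Values are stored JSON-encoded.
    return json.dumps(value)


def decode_value(raw: str) -> Any:
    return json.loads(raw)


key = slot_key("ws-1", "conv-9", "planner_decision")
print(key, encode_value({"next_tool": "hybrid_search"}))
```

Putting the workspace ID first in the key is what lets key namespacing act as the isolation boundary for this layer.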

Behaviour

TTL. Default one hour. Every read refreshes the TTL, so a read doubles as a heartbeat that keeps the slot alive.
Eviction. A slot quietly disappears once its TTL expires. Readers get None and must reconstruct.
Clear. clear_working(workspace_id, conversation_id) wipes every slot for one conversation in a single SCAN+DELETE pass; called when the conversation ends.
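The three behaviours above can be modelled without Redis. The class below is an in-memory stand-in written for illustration, not the `oxagen.domains.memory.working_set` module: it swaps Redis for a dictionary and takes an injectable clock so expiry can be exercised deterministically. The `write` and `read` method names are assumptions.

```python
import json
import time
from typing import Any, Callable, Optional


class InMemoryWorkingSet:
    """Dictionary-backed model of the working-set TTL semantics."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._slots = {}  # key -> (json payload, expires_at)

    @staticmethod
    def _key(workspace_id: str, conversation_id: str, slot: str) -> str:
        return f"oxa:ws:{workspace_id}:{conversation_id}:{slot}"

    def write(self, workspace_id: str, conversation_id: str, slot: str, value: Any) -> None:
        key = self._key(workspace_id, conversation_id, slot)
        self._slots[key] = (json.dumps(value), self._clock() + self._ttl)

    def read(self, workspace_id: str, conversation_id: str, slot: str) -> Optional[Any]:
        key = self._key(workspace_id, conversation_id, slot)
        entry = self._slots.get(key)
        if entry is None:
            return None
        payload, expires_at = entry
        now = self._clock()
        if now >= expires_at:
            # Eviction: the slot disappears and the reader must reconstruct.
            del self._slots[key]
            return None
        # Heartbeat: a successful read pushes the expiry out by a full TTL.
        self._slots[key] = (payload, now + self._ttl)
        return json.loads(payload)

    def clear_working(self, workspace_id: str, conversation_id: str) -> int:
        """Drop every slot for one conversation; return how many were removed."""
        prefix = f"oxa:ws:{workspace_id}:{conversation_id}:"
        doomed = [k for k in self._slots if k.startswith(prefix)]
        for k in doomed:
            del self._slots[k]
        return len(doomed)
```

A caller that reads a slot more often than once per TTL keeps it alive indefinitely; a conversation that goes quiet lets its slots expire on their own.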

Maturity note

The working-set module ships in oxagen.domains.memory.working_set and has Redis-coupled tests. It is a primitive callers opt into directly — the chat path reads from the semantic and episodic layers and does not consult the working set on its own.

How a chat turn uses memory

Each turn assembles its prompt from multiple layers. The flow:

User message arrives
    │
    ▼
Check workspace.config.memory_enabled
    │
    ▼  (always)
Hybrid-search the ontology — semantic memory hits
    │
    ▼  (always, best-effort)
recall_recent_episodes() — episodic summaries
    │
    ▼  (only if memory_enabled)
Inject promoted telemetry patterns referencing the target nodes
    │
    ▼
Compose system prompt:
  [episodic summaries] + [ontology context] + [pattern context]
    │
    ▼
Run model, stream response
    │
    ▼  (only if memory_enabled)
Capture action(s) into agent_memory.agent_actions
    │
    ▼  (eventual, when consolidation runs)
Old turns rolled into a new MemoryEpisode; source messages deleted

What each layer contributes:

Semantic answers "what does the workspace know about this?"
Episodic answers "what has this conversation already covered?"
Telemetry answers "what tends to go wrong when an agent does this?"
Working set answers "what was I just thinking?" (within one conversation, for one planner)

The turn never blocks on a memory layer. Failures in episodic recall, pattern lookup, or the working set are logged and skipped — the chat path always has the active conversation messages plus the ontology to fall back on.
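The non-blocking behaviour can be sketched as a wrapper that logs and skips a failing layer. The function names and the callable-per-layer design are assumptions made for illustration; the real chat path is not shown on this page. Only the ordering (episodic, ontology, patterns) and the `memory_enabled` gate on pattern injection follow the flow above.

```python
import logging
from typing import Callable

logger = logging.getLogger("memory_sketch")


def best_effort(name: str, fetch: Callable[[], str]) -> str:
    """Run one optional memory layer; on failure, log a warning and return ""."""
    try:
        return fetch()
    except Exception:
        logger.warning("memory layer %s failed; skipping", name, exc_info=True)
        return ""


def compose_memory_context(
    memory_enabled: bool,
    episodic: Callable[[], str],
    ontology: Callable[[], str],
    patterns: Callable[[], str],
) -> str:
    """Join the layer outputs in prompt order, omitting empty blocks."""
    parts = [best_effort("episodic", episodic), ontology()]
    if memory_enabled:
        # Pattern injection only happens when the workspace has memory enabled.
        parts.append(best_effort("patterns", patterns))
    return "\n\n".join(p for p in parts if p)


def failing_layer() -> str:
    raise RuntimeError("simulated transient DB error")


print(compose_memory_context(True, failing_layer, lambda: "[ontology context]", failing_layer))
```

Wrapping each optional layer separately means one failing lookup costs only its own block rather than the whole turn.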

Privacy and data retention

Workspace memory follows the project-wide rule: no raw user data persists past the active window.

Layer	What persists	What does not
Semantic	Typed entities, edges, embeddings, location references	Raw documents, raw email bodies, file contents
Episodic	LLM summary, message UUIDs, turn count	Source message bodies (hard-deleted on consolidation)
Telemetry	Action type, target node type, outcome, error code, latency	Tool arguments, response bodies, user-supplied strings
Working set	JSON values the caller chooses to write	Anything past the TTL — values evaporate on expiry

Cross-workspace memory is explicitly out of scope. Patterns, episodes, and working-set slots never leave the workspace they were written in. RLS enforces this at the database level for the Postgres layers; key namespacing enforces it for Redis.

REST and MCP surface

Toggle

POST   /v1/workspaces/{workspace_id}/memory/enable
POST   /v1/workspaces/{workspace_id}/memory/disable
PUT    /v1/workspaces/{workspace_id}/settings   # body: {"memory": {"enabled": true}}

workspace.enable_memory
workspace.disable_memory

Telemetry (agent memory)

See Agent Memory — MCP tools reference for the full surface. Highlights:

memory.search          # query patterns and sequences
memory.context         # semantic search over sequence intent_summary
memory.annotate        # annotate or suppress a pattern
memory.metric_create   # define a custom metric
memory.benchmark_get   # benchmark trend for a metric

Episodic and working set

The episodic and working-set layers do not currently expose dedicated REST or MCP tools — they are internal primitives the chat path uses. Conversation deletion (DELETE /v1/conversations/{id}) cascades through app.memory_episodes. Episode rows are visible via the admin tooling for incident response.

Disabling memory

Per-workspace:

workspace.disable_memory

Disabling stops new telemetry capture and stops pre-execution pattern injection. Existing telemetry rows, episodes, and ontology data are preserved and remain queryable. Re-enabling resumes capture from the point of re-enable; nothing back-fills.

To delete memory:

Conversation-scoped — delete the conversation. Episodes cascade.
Workspace-scoped — soft-delete or hard-delete the workspace. Every layer is workspace-scoped, so the data goes with it.
Pattern-scoped — memory.annotate with suppressed: true hides a telemetry pattern from injection without deleting the row.

Agent Memory — the telemetry layer in detail.
Ontology — the semantic layer, the data model, and the credit-metered prompt-run audit.
Security — RLS, workspace isolation, and the cross-workspace boundary.

Workspace Memory