Ontology (Knowledge Graph)
How Oxagen's self-evolving knowledge graph works — nodes, edges, types, hybrid search, importance scoring, merge queue, prompt-run audit, and the credit model that meters every call.
Overview
The Oxagen ontology is a living knowledge graph that maps every meaningful thing in your digital life — people, transactions, documents, meetings, photos — and the relationships between them. It evolves automatically as new data arrives.
Oxagen talks about the graph with three primitives:
- Node — a single thing (person, project, meeting note, receipt).
- Edge — a typed, directional relationship between two nodes.
- Type — the catalog entry for a node/edge type, auto-maintained from the live graph.
Every node and edge is workspace-scoped. Row-Level Security enforces isolation at the database layer, so a query can never see another workspace's rows even if the API forgets to filter.
Nodes
Every piece of data becomes a node — one entry in the graph.
Node: "Family Budget 2025"
├── type: family_budget
├── name: "Family Budget 2025"
├── properties: { owner: "Mac", year: 2025, ... }
├── importance_score: 0.78
├── reference_count: 12
├── connection_id: null # null for prompt-box entries
├── workspace_id: <uuid>
└── embedding: <512-dim vector> # computed async by the worker

Node types are free-form strings, not a fixed enum. The refiner picks the
most specific type it can: family_budget instead of spreadsheet,
vet_appointment instead of meeting. New types register themselves in the
ontology.types catalog on first sight.
Edges
Nodes are connected by edges — typed, directional relationships.
| Edge type | Example |
|---|---|
| predecessor_of | Budget 2025 → Budget 2026 |
| receipt_for | Receipt photo → Bank transaction |
| from_account | Transaction → Bank account |
| paid_to | Transaction → Merchant |
| attended_by | Meeting → Person |
Edges are discovered automatically by the refiner, and the reasoning + source evidence is stored on the row for transparency.
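To make the node and edge shapes concrete, here is a minimal in-memory sketch in Python. It is not Oxagen's storage model: the field names source_id, target_id, and reasoning are assumptions chosen for illustration, and the embedding is omitted.

```python
from dataclasses import dataclass, field
from typing import Any, Optional
from uuid import UUID, uuid4


@dataclass
class Node:
    workspace_id: UUID                      # every node is workspace-scoped
    type: str                               # free-form string, e.g. "family_budget"
    name: str
    properties: dict[str, Any] = field(default_factory=dict)
    importance_score: float = 0.0
    reference_count: int = 0
    connection_id: Optional[UUID] = None    # None for prompt-box entries
    id: UUID = field(default_factory=uuid4)


@dataclass
class Edge:
    workspace_id: UUID
    type: str                               # e.g. "receipt_for"
    source_id: UUID                         # hypothetical name: edge runs source -> target
    target_id: UUID                         # hypothetical name
    reasoning: str = ""                     # hypothetical name for the stored rationale
    id: UUID = field(default_factory=uuid4)


workspace = uuid4()
receipt = Node(workspace, "receipt", "Receipt photo")
txn = Node(workspace, "bank_transaction", "Bank transaction")
edge = Edge(workspace, "receipt_for", receipt.id, txn.id,
            reasoning="Amount and date on the receipt match the transaction.")
```

Direction matters: receipt_for points from the receipt to the transaction, matching the arrow in the table above.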
The prompt box — update / query / both
POST /v1/ontology/prompt is the day-1 demo entry point. You send a single
text blob. The system classifies intent into one of three buckets and runs the
matching pipeline:
- update — the Markdown/Text refiner extracts nodes + edges and persists them. The response echoes what landed.
- query — the Question Answerer runs hybrid retrieval against the workspace ontology and returns a narrative with node citations.
- both — runs update first, then a query over the freshly-updated graph.
Every call also returns an X-Correlation-Id header that groups logs, analytics
events, and audit rows for the same request.
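The three intent buckets amount to a small dispatch rule. The sketch below illustrates only the ordering guarantee (update before query when the intent is both); run_prompt and the two callables are hypothetical stand-ins, not the real refiner or answerer.

```python
from typing import Callable


def run_prompt(intent: str, text: str,
               update: Callable[[str], dict],
               query: Callable[[str], dict]) -> dict:
    """Run the pipeline(s) matching a classified intent."""
    if intent not in ("update", "query", "both"):
        raise ValueError(f"unknown intent: {intent!r}")
    result = {"intent": intent}
    if intent in ("update", "both"):
        result["update"] = update(text)   # persist nodes and edges first
    if intent in ("query", "both"):
        result["query"] = query(text)     # then answer over the updated graph
    return result


# Stub pipelines that record the order in which they were called.
calls = []
fake_update = lambda text: calls.append("update") or {"nodes": 1, "edges": 0}
fake_query = lambda text: calls.append("query") or {"answer": "stub"}

outcome = run_prompt("both", "Met Sarah about the seed round.", fake_update, fake_query)
```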
Hybrid search
POST /v1/ontology/search returns ranked nodes using one of three retrieval
strategies:
| mode | What it does |
|---|---|
| vector | Embed the query + run pgvector HNSW cosine similarity only. |
| structural | Pure SQL: filter by type, rank by recency + importance_score. |
| hybrid | Run both, merge by node id, combine scores with configured weights. |
The combined score for hybrid mode is:
score = vector_weight * vector_score + structural_weight * structural_score

When search.importance_ranking is true (the default), the combined score is
multiplied by each node's importance_score so high-signal nodes surface above
raw text matches.
Weights live in WorkspaceOntologyConfig.search:
| Field | Default | Notes |
|---|---|---|
| retrieval_mode | hybrid | Default mode when the caller omits one. |
| vector_weight | 0.6 | Applied to the embedding similarity. |
| structural_weight | 0.4 | Applied to recency + importance. |
| min_similarity_threshold | 0.7 | Lower bound on vector similarity. |
| default_top_k | 20 | Fallback when the caller omits top_k. |
| importance_ranking | true | Multiplies the combined score. |
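The combined score is simple enough to check by hand. This sketch restates the formula with the default weights from the table; hybrid_score is an illustrative name, not a function from the Oxagen codebase.

```python
def hybrid_score(vector_score: float,
                 structural_score: float,
                 importance_score: float,
                 vector_weight: float = 0.6,
                 structural_weight: float = 0.4,
                 importance_ranking: bool = True) -> float:
    """Weighted sum of the two sides, optionally scaled by node importance."""
    combined = vector_weight * vector_score + structural_weight * structural_score
    if importance_ranking:
        combined *= importance_score
    return combined


# 0.6 * 0.8 + 0.4 * 0.5 = 0.68, then 0.68 * 0.9 = 0.612
ranked = hybrid_score(vector_score=0.8, structural_score=0.5, importance_score=0.9)
unranked = hybrid_score(0.8, 0.5, 0.9, importance_ranking=False)
```

With importance ranking on, a strong text match on a low-importance node can fall below a weaker match on a high-importance one, which is the stated intent.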
Structural candidates are ranked by a recency score (7-day half-life) blended
50/50 with importance_score. Vector candidates are converted from rank order
to a [0, 1] score so the two sides are comparable.
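A rough sketch of how the two sides could be normalized before they are combined. The 7-day half-life and the 50/50 blend come from the text above; the linear rank-to-score mapping is an assumption, since the exact conversion is not specified here.

```python
HALF_LIFE_DAYS = 7.0


def recency_score(age_days: float) -> float:
    """Exponential decay that halves every HALF_LIFE_DAYS."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def structural_score(age_days: float, importance_score: float) -> float:
    """Recency blended 50/50 with the node's importance_score."""
    return 0.5 * recency_score(age_days) + 0.5 * importance_score


def rank_to_score(rank: int, total: int) -> float:
    """Map a 0-based rank to (0, 1]; the best hit scores 1.0.

    Assumption: a linear mapping. The real conversion may differ.
    """
    if not 0 <= rank < total:
        raise ValueError("rank must satisfy 0 <= rank < total")
    return (total - rank) / total


week_old = structural_score(age_days=7, importance_score=0.8)   # 0.25 + 0.40
```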
Example:
curl -X POST https://api.oxagen.ai/v1/ontology/search \
-H "Authorization: Bearer <jwt>" \
-H "Content-Type: application/json" \
-d '{
"q": "seed round conversations last month",
"top_k": 10,
"mode": "hybrid",
"filters": { "type": ["person", "meeting"] }
}'

Response:
{
"results": [
{
"node": { "id": "...", "type": "person", "name": "Sarah Chen", ... },
"score": 0.63,
"components": {
"vector_score": 0.92,
"structural_score": 0.41,
"importance_multiplier": 0.88
}
}
],
"mode": "hybrid",
"query_plan": {
"mode": "hybrid",
"top_k": 10,
"filters": { "type": ["person", "meeting"] },
"vector_weight": 0.6,
"structural_weight": 0.4,
"importance_ranking": true,
"result_count": 1
},
"credit_receipt": {
"ledger_id": "...",
"kind": "ontology.search.hybrid",
"amount": 1
}
}

The components map is the ranking's receipt: each per-source score is
preserved so the UI (or a debugger) can answer "why did this node rank?"
Importance model
Every node has an importance_score in [0.0, 1.0] that drives search
ranking, briefing priority, and digest surfacing. The score combines a
baseline, a reference term with optional recency shaping, and a global decay
applied nightly:
importance_score = (base_salience
+ reference_count
* boost_per_reference
* exp(-recency_lambda * age_days))
* exp(-base_decay_lambda * age_days)

When recency_influences_importance is false, the
exp(-recency_lambda * age_days) factor drops out of the reference term, so new
and old references contribute equally. The two age_days values differ: the
reference term measures age from the node's last_touch_at, while the global
decay measures it from created_at.
- base_salience: type-derived priority (a family_budget outranks a loose note by default). Assigned by the refiner when the node is created.
- reference_count: how often other nodes, answers, and workers cite this node. Incremented each time reference_node fires. Multiplied by boost_per_reference (an additive bump per reference, not a weight in a weighted sum).
- Recency shaping: when enabled, a reference's contribution decays with recency_lambda over the days since the node was last touched.
- Global decay: every night the whole score is multiplied by exp(-base_decay_lambda * age_days) so cold nodes fade even if nothing else changes.
The knobs come from WorkspaceOntologyConfig.importance:
| Field | Default | Meaning |
|---|---|---|
| recency_influences_importance | true | Include the recency term. |
| recency_lambda | 0.05 | Decay rate on the reference term per day. |
| boost_per_reference | 0.1 | Additive bump per reference_node call. |
| base_decay_lambda | 0.01 | Daily exponential decay applied nightly. |
| max_importance | 1.0 | Hard ceiling. |
| min_importance | 0.0 | Floor below which decay stops. |
More references and recent touches = higher score. The nightly importance decay task runs in Celery Beat so stale nodes naturally fade.
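The formula and defaults above translate directly into a few lines of Python. This is a sketch for checking numbers by hand, not the worker's implementation: ImportanceConfig mirrors the documented fields, while the function name, the two separate age parameters, and the final clamp to [min_importance, max_importance] are my reading of the text.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportanceConfig:
    recency_influences_importance: bool = True
    recency_lambda: float = 0.05
    boost_per_reference: float = 0.1
    base_decay_lambda: float = 0.01
    max_importance: float = 1.0
    min_importance: float = 0.0


def importance_score(base_salience: float,
                     reference_count: int,
                     days_since_touch: float,     # age from last_touch_at
                     days_since_created: float,   # age from created_at
                     cfg: ImportanceConfig = ImportanceConfig()) -> float:
    reference_term = reference_count * cfg.boost_per_reference
    if cfg.recency_influences_importance:
        reference_term *= math.exp(-cfg.recency_lambda * days_since_touch)
    raw = (base_salience + reference_term) * math.exp(
        -cfg.base_decay_lambda * days_since_created)
    # Assumed clamp: keep the result inside the configured floor and ceiling.
    return min(cfg.max_importance, max(cfg.min_importance, raw))


# A 30-day-old node, last touched 10 days ago, cited three times.
score = importance_score(base_salience=0.5, reference_count=3,
                         days_since_touch=10, days_since_created=30)
```

For these inputs the reference term is 0.3 * exp(-0.5), roughly 0.18, and the global decay factor is exp(-0.3), roughly 0.74, so the score lands near 0.51.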
Person resolution and the merge queue
New person nodes can collide with existing ones (same email, same
normalized name). Rather than silently merging, the PersonResolver worker
records a merge candidate row and lets either the auto-merge pipeline or a
human decide:
- On a node.created event for a person-type node, the resolver searches the workspace for an existing person matching by email (preferred) or by normalized name.
- It scores the match:
  - email + normalized name → 1.00
  - email only → 0.95
  - normalized name only → 0.75
- It upserts a NodeMergeCandidate row with status="pending" and the list of match reasons (email_match, normalized_name_match).
- If the workspace has resolver.auto_merge_enabled = true and the score meets resolver.merge_confidence_threshold (default 0.9), the resolver performs the merge immediately, flips the candidate to status="applied", and writes a NodeMergeAction(action="merged", performed_by=null) audit row (null = system).
- Otherwise the candidate stays pending, visible via GET /v1/ontology/merges.
Resolver knobs live in WorkspaceOntologyConfig.resolver:
| Field | Default | Notes |
|---|---|---|
| person_resolver_enabled | true | Toggles the worker entirely. |
| organization_resolver_enabled | false | Not shipped yet, reserved. |
| merge_confidence_threshold | 0.9 | Min score for auto-merge. |
| auto_merge_enabled | false | Human approval required by default. |
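The scoring and auto-merge decision described above can be sketched as a pure function. The scores, reason strings, and defaults come from this section; match_score and resolve are hypothetical names, and the returned dict only approximates a NodeMergeCandidate row.

```python
from typing import Optional


def match_score(email_match: bool, name_match: bool) -> Optional[float]:
    """Deterministic confidence for a candidate pair, or None when nothing matches."""
    if email_match and name_match:
        return 1.00
    if email_match:
        return 0.95
    if name_match:
        return 0.75
    return None


def resolve(email_match: bool, name_match: bool,
            auto_merge_enabled: bool = False,
            merge_confidence_threshold: float = 0.9) -> Optional[dict]:
    score = match_score(email_match, name_match)
    if score is None:
        return None                               # no candidate row at all
    reasons = []
    if email_match:
        reasons.append("email_match")
    if name_match:
        reasons.append("normalized_name_match")
    auto = auto_merge_enabled and score >= merge_confidence_threshold
    return {"score": score, "reasons": reasons,
            "status": "applied" if auto else "pending"}


default_case = resolve(email_match=True, name_match=False)
auto_case = resolve(email_match=True, name_match=False, auto_merge_enabled=True)
```

With the default threshold of 0.9, a name-only match (0.75) always waits for a human, even when auto-merge is switched on.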
Approve or reject a merge
Reviewers act on a pending candidate via two endpoints:
# List pending candidates in the caller's workspace, newest first.
curl https://api.oxagen.ai/v1/ontology/merges \
-H "Authorization: Bearer <jwt>"
# Approve — runs the merge, writes NodeMergeAction(performed_by=<user>).
curl -X POST https://api.oxagen.ai/v1/ontology/merges/<id>/approve \
-H "Authorization: Bearer <jwt>" \
-H "Content-Type: application/json" \
-d '{ "notes": "Same person, confirmed by user." }'
# Reject — no merge, writes NodeMergeAction(action="rejected", ...).
curl -X POST https://api.oxagen.ai/v1/ontology/merges/<id>/reject \
-H "Authorization: Bearer <jwt>" \
-H "Content-Type: application/json" \
-d '{ "notes": "Different people who share a name." }'

Approval is synchronous: the merge runs in the same transaction that flips the
candidate to applied. Rejected candidates stay in history (soft-deleted
semantics) so repeated resolver runs don't resurrect them.
Prompt-run audit
Every call to POST /v1/ontology/prompt writes exactly one
ontology.ontology_prompt_runs row — on success and on failure. The
audit row is lightweight and workspace-scoped, so you can answer questions like
"how many prompts failed with InsufficientCreditsError in the last hour?"
without joining application logs.
What's stored:
| Column | Notes |
|---|---|
| prompt_kind | Refiner family, e.g. markdown_text. Free-form string. |
| source_hash | SHA-256 hex digest of the raw prompt. 64 chars. |
| intent | update, query, or both. |
| status | ok or error. |
| node_count / edge_count | Refiner output counts on success, 0 otherwise. |
| credits_used | Net credits charged. Zero after refund on failure. |
| linked_credit_ledger_id | FK to the initial billing.credit_ledger reservation. |
| summary | ≤500-char human-readable outcome. Never the raw prompt. |
| error_code | exc.__class__.__name__ when status="error". |
| duration_ms | End-to-end latency. |
| correlation_id | UUID echoed back in X-Correlation-Id. |
Why no raw prompt
The audit row never contains your prompt text. We store:
- a SHA-256 hash — enough to detect replays and dedupe ingestion, not enough to reconstruct content; and
- an optional ≤500-character summary generated from counts and the answerer's narrative fragment.
This is a trust and privacy choice. Oxagen's mission is a personal knowledge graph you actually trust — storing raw prompts to "help with debugging" would leak the most sensitive surface of the product. Hash + summary lets engineers diagnose incidents without ever seeing the content.
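As an illustration of the hash-plus-summary approach, the sketch below derives both audit fields with the standard library. The UTF-8 encoding, the summary wording, and the function names are assumptions; only the SHA-256 digest and the 500-character cap come from the text.

```python
import hashlib

MAX_SUMMARY_CHARS = 500


def source_hash(prompt: str) -> str:
    """64-character SHA-256 hex digest. Assumes the prompt is hashed as UTF-8."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def audit_summary(node_count: int, edge_count: int, narrative: str = "") -> str:
    """Outcome summary built from counts plus an optional narrative fragment."""
    text = f"{node_count} nodes, {edge_count} edges."
    if narrative:
        text += " " + narrative
    return text[:MAX_SUMMARY_CHARS]           # hard cap, never the raw prompt


digest = source_hash("Paid the vet $120 on Tuesday.")
summary = audit_summary(2, 1)
```

Identical prompts produce identical digests, which is what makes replay detection and dedupe possible without storing the text.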
Types catalog
The ontology.types table is an auto-maintained catalog of every (name, kind) tuple that currently appears on live nodes and edges in a workspace.
Each row carries:
- name / kind: the type identifier and whether it describes nodes or edges.
- display_name / description: optional UI labels.
- row_count: number of live rows of this type.
- last_seen_at: MAX(created_at) across live rows.
- common_property_keys: the top-N (default 10) most frequent JSONB property keys on live rows of this type, ordered by frequency.
- properties: optional JSONB for admin-authored schema.
Daily refresh
A Celery Beat job (fan_out_type_stats_refresh, 04:10 UTC daily) fans out
recompute_workspace_types(workspace_id) per workspace. Each per-workspace
task:
- Aggregates row_count and last_seen_at per (type, kind) from live nodes and edges.
- Scans each type's properties JSONB with jsonb_object_keys to compute the top-N most frequent keys (common_property_keys).
- Upserts the active row via ON CONFLICT DO UPDATE.
- Soft-deletes stale type rows whose (name, kind) no longer appears on any live node or edge, so the catalog never keeps names of types you've stopped using.
The task is internal, free (zero credits), and emits a best-effort
ontology_events row with event_type="type_stats_refresh" for
observability.
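The per-workspace refresh is an aggregation followed by a set difference. The real task runs in SQL; this in-memory Python sketch only mirrors the logic, and the row shape (dicts with type, kind, created_at, properties) and function names are assumptions.

```python
from collections import Counter
from datetime import datetime


def recompute_types(live_rows: list[dict], top_n: int = 10) -> dict:
    """Aggregate row_count, last_seen_at and top-N property keys per (type, kind)."""
    stats: dict = {}
    for row in live_rows:
        key = (row["type"], row["kind"])
        entry = stats.setdefault(key, {"row_count": 0,
                                       "last_seen_at": row["created_at"],
                                       "keys": Counter()})
        entry["row_count"] += 1
        entry["last_seen_at"] = max(entry["last_seen_at"], row["created_at"])
        entry["keys"].update(row["properties"].keys())   # count each key once per row
    return {
        key: {"row_count": e["row_count"],
              "last_seen_at": e["last_seen_at"],
              "common_property_keys": [k for k, _ in e["keys"].most_common(top_n)]}
        for key, e in stats.items()
    }


def stale_types(catalog_keys: set, fresh: dict) -> set:
    """Catalog entries that no longer appear on any live row (to be soft-deleted)."""
    return set(catalog_keys) - set(fresh)


rows = [
    {"type": "family_budget", "kind": "node", "created_at": datetime(2025, 1, 1),
     "properties": {"owner": "Mac", "year": 2025}},
    {"type": "family_budget", "kind": "node", "created_at": datetime(2025, 2, 1),
     "properties": {"owner": "Mac"}},
]
fresh = recompute_types(rows)
stale = stale_types({("family_budget", "node"), ("spreadsheet", "node")}, fresh)
```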
Credits — the metering model
Every ontology operation that spends LLM or embedding budget is credit-gated
through a single primitive, check_and_reserve_credits. The AlloyDB table
billing.credit_ledger is the source of truth; a denormalized mirror streams to
ClickHouse oxagen_events.credit_events for analytics.
Pricing table
| Kind | Cost | When it fires |
|---|---|---|
| ontology.prompt.update | 5 | Refiner LLM call from /v1/ontology/prompt. |
| ontology.prompt.query | 3 | Answerer LLM call from /v1/ontology/prompt. |
| ontology.prompt.both | 7 | Update + query in one turn. |
| ontology.search.hybrid | 1 | /v1/ontology/search (any mode). |
| ontology.search.vector | 1 | Reserved for vector-only search callers. |
| worker.embed_node | 1 | Async embedding of a single node. |
| worker.recompute_type | 0 | Type-stats refresh. Free. |
| worker.person_resolve | 0 | Deterministic resolver pass. Free. |
Costs are hard-coded in
packages/oxagen/oxagen/domains/billing/credits.py. Pricing changes are
deliberate code commits — a DB row can never silently double a customer's bill.
How the gate works
- The route classifies the request (e.g. the prompt's intent → .update, .query, or .both).
- It calls check_and_reserve_credits(session, tenant_id=..., kind=...).
- If balance < cost, the call raises InsufficientCreditsError before any LLM or DB mutation work happens. FastAPI maps it to HTTP 402 with a structured body.
- On success, a negative ledger row is written and the work runs.
- On pipeline failure, the route writes a matching positive refund row keyed to the same kind, linked back via linked_event_id.
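The reserve-then-refund flow can be modeled with an in-memory ledger. This is a toy stand-in, not the real check_and_reserve_credits: the Ledger class, run_gated, and the row layout are hypothetical, and only a subset of the pricing table is included.

```python
from dataclasses import dataclass, field

COSTS = {"ontology.prompt.update": 5, "ontology.prompt.query": 3,
         "ontology.prompt.both": 7, "ontology.search.hybrid": 1}


class InsufficientCreditsError(Exception):
    def __init__(self, kind: str, required: int, available: int):
        super().__init__(f"{kind} needs {required} credits, {available} available")
        self.kind, self.required, self.available = kind, required, available


@dataclass
class Ledger:
    rows: list = field(default_factory=list)

    @property
    def balance(self) -> int:
        return sum(row["amount"] for row in self.rows)

    def reserve(self, kind: str) -> dict:
        cost = COSTS[kind]
        if self.balance < cost:                      # fail before any work runs
            raise InsufficientCreditsError(kind, cost, self.balance)
        row = {"id": len(self.rows) + 1, "kind": kind,
               "amount": -cost, "linked_event_id": None}
        self.rows.append(row)
        return row

    def refund(self, reservation: dict) -> dict:
        row = {"id": len(self.rows) + 1, "kind": reservation["kind"],
               "amount": -reservation["amount"],     # positive, cancels the charge
               "linked_event_id": reservation["id"]}
        self.rows.append(row)
        return row


def run_gated(ledger: Ledger, kind: str, work):
    reservation = ledger.reserve(kind)
    try:
        return work()
    except Exception:
        ledger.refund(reservation)                   # net charge returns to zero
        raise
```

Because both the charge and the refund are append-only rows, the ledger keeps a full history of the failed attempt.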
402 response shape
{
"detail": {
"error": "insufficient_credits",
"required": 5,
"available": 2,
"kind": "ontology.prompt.update"
}
}

Clients should surface this to the end user: the payload states exactly which operation was attempted, how many credits it required, and how many are available. Current balance, ledger history, and plan management live in the dashboard at app.oxagen.ai.
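On the client side, handling the 402 can be as small as the sketch below. It assumes the response body has already been parsed into a dict with the shape shown above; describe_insufficient_credits is a hypothetical helper and the message wording is illustrative.

```python
from typing import Optional


def describe_insufficient_credits(status_code: int, body: dict) -> Optional[str]:
    """Turn a 402 insufficient_credits payload into a user-facing message."""
    if status_code != 402:
        return None
    detail = body.get("detail", {})
    if detail.get("error") != "insufficient_credits":
        return None
    shortfall = detail["required"] - detail["available"]
    return (f"Not enough credits for {detail['kind']}: needs {detail['required']}, "
            f"you have {detail['available']} ({shortfall} short).")


message = describe_insufficient_credits(402, {
    "detail": {"error": "insufficient_credits", "required": 5,
               "available": 2, "kind": "ontology.prompt.update"}
})
```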