Question Answering
Query your workspace ontology in natural language. The question-answerer agent runs hybrid retrieval over your typed knowledge graph and returns a grounded, citation-backed narrative.
The question-answerer agent answers natural-language questions against your workspace's ontology. It is designed not to fabricate: every claim in the response should be grounded in a node your workspace contains, and each node is cited by UUID so you can trace the answer back to its source.
How it works
When you ask a question, the agent runs a four-step pipeline:
- Embed: your question is embedded using the workspace's configured provider (default: OpenAI text-embedding-3-small).
- Retrieve: candidate nodes are pulled via hybrid search. A vector pass finds semantically similar nodes; a text-match pass ensures exact-name hits are not ranked out. The two result sets are merged and deduplicated.
- Compose: the candidates are passed to the default-tier LLM with a grounding-only system prompt. The model writes a short narrative citing specific nodes by UUID. It is instructed to say "I don't have anything about that" rather than fabricate.
- Parse: cited UUIDs are extracted and matched back to node records. Each match becomes a NodeCitation with the node's type, name, and a short excerpt.
The agent also fires node.referenced events for every cited node, which feeds the importance algorithm and causes frequently-cited nodes to surface higher in future retrievals.
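The merge-and-deduplicate part of the Retrieve step can be sketched as follows. This is a minimal illustration, not the agent's actual implementation: the Candidate class, its field names, and the "keep the higher score" rule are all assumptions, and it presumes both passes produce scores on a comparable scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # Hypothetical record shape, chosen for illustration only.
    node_id: str
    name: str
    score: float


def merge_candidates(vector_hits, text_hits, top_k=20):
    """Merge two result lists, keeping the best-scoring copy of each node."""
    best = {}
    for hit in list(vector_hits) + list(text_hits):
        current = best.get(hit.node_id)
        if current is None or hit.score > current.score:
            best[hit.node_id] = hit
    # Highest score first; ties are broken by node_id for a stable order.
    ranked = sorted(best.values(), key=lambda c: (-c.score, c.node_id))
    return ranked[:top_k]


vector_hits = [Candidate("n1", "Auth Service", 0.82), Candidate("n2", "API Gateway", 0.74)]
text_hits = [Candidate("n1", "Auth Service", 1.0), Candidate("n3", "Auth Tokens", 0.6)]
merged = merge_candidates(vector_hits, text_hits)
print([c.node_id for c in merged])  # ['n1', 'n2', 'n3']
```

The exact-name hit for n1 appears in both lists but survives only once, with its higher text-match score, which is the behaviour the text-match pass exists to protect.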
Web app
Open any workspace and type a question into the chat input. The agent detects query intent automatically — you do not need to prefix your message.
Trace panel
While the agent runs you will see a live trace panel above the response:
✓ Embedding prompt 42ms
✓ Retrieving candidates 88ms
⟳ Calling LLM...
Once the answer arrives, the trace collapses to a summary line:
Traced in 1.4s — 18 candidates · 3 citations ↓ [Download JSON]
Click ↓ to expand the full trace. Toggle Show verbose to see candidate node names and scores, token counts, and the raw LLM response.
Downloading the trace
Click Download JSON in the expanded trace panel. The download is assembled from the data already in your browser — no second network request is made. The file is named trace-{correlation_id}.json and contains the complete pipeline trace including all candidate nodes, timings, and the raw LLM output.
Citations
Citations appear below the narrative as a list of nodes. Each citation shows the node type, name, and the excerpt the agent used to justify it. Click any citation to open the node detail view in the ontology explorer.
API
Use POST /v1/ontology/prompt with force_intent: "query" to run a pure question against your workspace:
curl -X POST https://api.oxagen.ai/v1/ontology/prompt \
  -H "Authorization: Bearer $OXAGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What infrastructure components depend on the auth service?",
    "force_intent": "query"
  }'
Response:
{
  "intent": "query",
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "answer": {
    "narrative": "The API gateway [node:abc123] routes all traffic through the auth service [node:def456]. The worker service [node:ghi789] also depends on auth for job-level permission checks.",
    "citations": [
      {
        "node_id": "abc123",
        "type": "Service",
        "name": "API Gateway",
        "excerpt": "API Gateway"
      },
      {
        "node_id": "def456",
        "type": "Service",
        "name": "Auth Service",
        "excerpt": "Auth Service"
      },
      {
        "node_id": "ghi789",
        "type": "Service",
        "name": "Worker Service",
        "excerpt": "Worker Service"
      }
    ],
    "candidate_count": 14,
    "retrieval_mode": "hybrid"
  },
  "nodes_created": [],
  "edges_created": [],
  "dropped_edges": 0
}
Omit force_intent and the agent classifies intent automatically: a message phrased as a question routes to the query pipeline; a block of notes routes to the update pipeline.
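The same request can be assembled from Python using only the standard library. This is a sketch rather than an official client: the helper name build_query_request is made up for illustration, and the request is built but not sent so the shape can be inspected first. The optional top_k field is the one described under "Controlling candidate depth".

```python
import json
import urllib.request

API_URL = "https://api.oxagen.ai/v1/ontology/prompt"


def build_query_request(prompt, api_key, top_k=None):
    """Build (but do not send) a POST request that forces the query intent."""
    body = {"prompt": prompt, "force_intent": "query"}
    if top_k is not None:
        body["top_k"] = top_k
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_query_request(
    "What infrastructure components depend on the auth service?",
    api_key="YOUR_API_KEY",  # placeholder, replace with a real key
)
# To send it:
#   with urllib.request.urlopen(request) as resp:
#       answer = json.load(resp)["answer"]
```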
For real-time chain-of-thought streaming use POST /v1/ontology/prompt/stream, which emits Server-Sent Events as each pipeline step completes.
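A client consuming the stream needs to split it into individual events. The sketch below parses the generic Server-Sent Events wire format (event: and data: fields, with a blank line ending each event). The event name "step" and the payload fields in the sample are assumptions made for illustration; the actual event names and payloads emitted by the endpoint are not documented here.

```python
import json


def iter_sse_events(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event_name, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event_name, "\n".join(data_lines)
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event_name = value
            elif field == "data":
                data_lines.append(value)


# Hypothetical stream contents, for illustration only.
sample = [
    "event: step\n",
    'data: {"step": "embed", "ms": 42}\n',
    "\n",
    ": keep-alive\n",
    "event: step\n",
    'data: {"step": "retrieve", "ms": 88}\n',
    "\n",
]
events = [(name, json.loads(data)) for name, data in iter_sse_events(sample)]
```

In practice the lines would come from iterating over a streaming HTTP response body decoded as UTF-8 rather than from a list.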
MCP
The ontology.ask tool runs the question-answerer from any MCP-connected agent:
ontology.ask
  question: "Which nodes are connected to the payment service?"
The tool returns the narrative and citations in structured form. Citations include node_id, type, name, and excerpt, so your agent can follow up with ontology.get_node to inspect any cited node in detail.
Retrieval modes
The workspace search.retrieval_mode config controls how candidates are gathered. Three modes are available:
| Mode | Behaviour |
|---|---|
| hybrid (default) | Vector similarity pass merged with BM25 text-match pass. Best recall for most workspaces. |
| vector | Embedding similarity only. Useful when your workspace has dense, uniform node names. |
| structural | Graph-traversal pass anchored on the highest-importance nodes. Useful for deep dependency questions. |
Change the mode via the workspace settings page or via POST /v1/workspaces/{id}/config:
{ "search": { "retrieval_mode": "structural" } }
The retrieval_mode value is echoed in every answer response so you can confirm which path ran.
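If you change the mode from a script, validating the value locally catches typos before the request is made. A minimal sketch, assuming only the three modes listed in the table; the helper name retrieval_mode_config is hypothetical.

```python
import json

RETRIEVAL_MODES = ("hybrid", "vector", "structural")


def retrieval_mode_config(mode):
    """Return the config body for the given retrieval mode, or raise on a typo."""
    if mode not in RETRIEVAL_MODES:
        raise ValueError(f"unknown retrieval mode {mode!r}; expected one of {RETRIEVAL_MODES}")
    return {"search": {"retrieval_mode": mode}}


payload = json.dumps(retrieval_mode_config("structural"))
print(payload)  # {"search": {"retrieval_mode": "structural"}}
```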
Interpreting citations
A citation means the agent found a node in your ontology that it judged relevant to the question. Every claim in the narrative should carry a citation, because the agent is prompted to claim only what the candidates support.
If the narrative contains [node:UUID] references that do not match any returned citation, the UUID was in the raw LLM response but not matched to a candidate record. This is rare and indicates a UUID hallucination — treat those claims as unverified.
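This check is easy to automate on the client side. The sketch below scans the narrative for [node:...] references and reports any whose ID is missing from the citations list. It assumes the response shape shown in the API section; the function name unmatched_references is made up for illustration.

```python
import re

# Matches references of the form [node:<id>] and captures the id.
NODE_REF = re.compile(r"\[node:([^\]\s]+)\]")


def unmatched_references(answer):
    """Return node IDs referenced in the narrative but absent from citations."""
    cited = {c["node_id"] for c in answer.get("citations", [])}
    seen, missing = set(), []
    for node_id in NODE_REF.findall(answer.get("narrative", "")):
        # Preserve first-seen order and drop repeats.
        if node_id not in cited and node_id not in seen:
            seen.add(node_id)
            missing.append(node_id)
    return missing


answer = {
    "narrative": (
        "The API gateway [node:abc123] routes traffic through the auth "
        "service [node:def456] and a cache [node:zzz999]."
    ),
    "citations": [
        {"node_id": "abc123", "type": "Service", "name": "API Gateway", "excerpt": "API Gateway"},
        {"node_id": "def456", "type": "Service", "name": "Auth Service", "excerpt": "Auth Service"},
    ],
}
print(unmatched_references(answer))  # ['zzz999']
```

An empty list means every reference in the narrative resolved to a returned citation.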
Tips
What makes a good question
- Specific entity names work best: "What does the payment service depend on?" outperforms "Tell me about dependencies."
- Relationship questions are well-suited to hybrid retrieval: "Who owns the analytics pipeline?" or "Which services write to the events table?"
- Multi-hop questions work if your graph has edge coverage: "What does the auth service's upstream infrastructure look like?"
When the answer says "I don't have anything about that"
The workspace ontology does not contain relevant nodes for your question. Options:
- Ingest more data — paste notes, connect a data source, or run a GitHub ingestion to populate the relevant area of the graph.
- Rephrase using node names that appear in the ontology explorer.
- Check candidate_count in the API response: a zero means retrieval returned nothing, not that the LLM gave up.
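The last check can be folded into a small helper that separates "retrieval found nothing" from "candidates were found but none supported an answer". A sketch assuming the response shape shown in the API section; the function name and the returned labels are hypothetical.

```python
def diagnose_answer(response):
    """Classify an API response by where the pipeline came up empty."""
    answer = response.get("answer", {})
    if answer.get("candidate_count", 0) == 0:
        # Retrieval returned nothing: ingest more data or rephrase with node names.
        return "no_candidates"
    if not answer.get("citations"):
        # Candidates existed, but the LLM cited none of them.
        return "no_support"
    return "grounded"


empty = {"answer": {"narrative": "I don't have anything about that", "citations": [], "candidate_count": 0}}
print(diagnose_answer(empty))  # no_candidates
```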
Controlling candidate depth
Pass top_k in the API request to override the workspace default (20):
{ "prompt": "...", "force_intent": "query", "top_k": 40 }
Higher values improve recall on sparse workspaces at the cost of a larger LLM context window and slightly higher latency.