Oxagen Docs

Question Answering

Query your workspace ontology in natural language. The question-answerer agent runs hybrid retrieval over your typed knowledge graph and returns a grounded, citation-backed narrative.

The question-answerer agent answers natural-language questions against your workspace's ontology. It is built to stay grounded: the model is instructed to make only claims supported by nodes your workspace contains, and each cited node is identified by UUID so you can trace the answer back to its source.


How it works

When you ask a question, the agent runs a four-step pipeline:

  1. Embed — your question is embedded using the workspace's configured provider (default: OpenAI text-embedding-3-small).
  2. Retrieve — candidate nodes are pulled via hybrid search: a vector pass finds semantically similar nodes; a text-match pass ensures exact-name hits are not ranked out. The two result sets are merged and deduplicated.
  3. Compose — the candidates are passed to the default-tier LLM with a grounding-only system prompt. The model writes a short narrative citing specific nodes by UUID. It is instructed to say "I don't have anything about that" rather than fabricate.
  4. Parse — cited UUIDs are extracted and matched back to node records. Each match becomes a NodeCitation with the node's type, name, and a short excerpt.
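The merge-and-deduplicate part of the Retrieve step can be sketched as follows. This is a minimal illustration, not the agent's actual implementation: the Candidate type, the score field, and the "highest score wins" rule are assumptions made for the example, and a real system would need to normalise vector and text-match scores before comparing them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    node_id: str
    name: str
    score: float  # hypothetical relevance score; higher is better


def merge_candidates(vector_hits, text_hits):
    """Merge two result lists, keeping one entry per node_id.

    When a node appears in both passes, the higher-scoring entry wins.
    That rule is an assumption made for this sketch.
    """
    best = {}
    for hit in list(vector_hits) + list(text_hits):
        current = best.get(hit.node_id)
        if current is None or hit.score > current.score:
            best[hit.node_id] = hit
    # Highest score first; node_id breaks ties so the order is deterministic.
    return sorted(best.values(), key=lambda c: (-c.score, c.node_id))


vector_hits = [
    Candidate("n1", "Auth Service", 0.82),
    Candidate("n2", "API Gateway", 0.74),
]
text_hits = [
    Candidate("n1", "Auth Service", 1.00),  # exact-name hit from the text pass
    Candidate("n3", "Auth Tokens Table", 0.60),
]
merged = merge_candidates(vector_hits, text_hits)
print([c.node_id for c in merged])
```

The node n1 appears in both passes but only once in the merged list, which is the property the Retrieve step describes.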

The agent also fires a node.referenced event for every cited node. These events feed the importance algorithm, so frequently cited nodes surface higher in future retrievals.


Web app

Open any workspace and type a question into the chat input. The agent detects query intent automatically — you do not need to prefix your message.

Trace panel

While the agent runs you will see a live trace panel above the response:

✓ Embedding prompt          42ms
✓ Retrieving candidates     88ms
⟳ Calling LLM...

Once the answer arrives, the trace collapses to a summary line:

Traced in 1.4s — 18 candidates · 3 citations  ↓ [Download JSON]

Click to expand the full trace. Toggle Show verbose to see candidate node names and scores, token counts, and the raw LLM response.

Downloading the trace

Click Download JSON in the expanded trace panel. The download is assembled from the data already in your browser — no second network request is made. The file is named trace-{correlation_id}.json and contains the complete pipeline trace including all candidate nodes, timings, and the raw LLM output.

Citations

Citations appear below the narrative as a list of nodes. Each citation shows the node type, name, and the excerpt the agent used to justify it. Click any citation to open the node detail view in the ontology explorer.


API

Use POST /v1/ontology/prompt with force_intent: "query" to run a pure question against your workspace:

curl -X POST https://api.oxagen.ai/v1/ontology/prompt \
  -H "Authorization: Bearer $OXAGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What infrastructure components depend on the auth service?",
    "force_intent": "query"
  }'

Response:

{
  "intent": "query",
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "answer": {
    "narrative": "The API gateway [node:abc123] routes all traffic through the auth service [node:def456]. The worker service [node:ghi789] also depends on auth for job-level permission checks.",
    "citations": [
      {
        "node_id": "abc123",
        "type": "Service",
        "name": "API Gateway",
        "excerpt": "API Gateway"
      },
      {
        "node_id": "def456",
        "type": "Service",
        "name": "Auth Service",
        "excerpt": "Auth Service"
      },
      {
        "node_id": "ghi789",
        "type": "Service",
        "name": "Worker Service",
        "excerpt": "Worker Service"
      }
    ],
    "candidate_count": 14,
    "retrieval_mode": "hybrid"
  },
  "nodes_created": [],
  "edges_created": [],
  "dropped_edges": 0
}

Omit force_intent and the agent classifies intent automatically: a message phrased as a question routes to the query pipeline, and a block of notes routes to the update pipeline.
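The same request can be assembled from Python with only the standard library. The sketch below builds the request without sending it; the helper names build_prompt_request and citation_names are illustrative, and the assumption that a non-query response may lack an answer object is ours, not something the API reference states.

```python
import json
import urllib.request

API_URL = "https://api.oxagen.ai/v1/ontology/prompt"


def build_prompt_request(prompt, api_key, force_intent=None):
    """Build (but do not send) a POST request for the prompt endpoint."""
    body = {"prompt": prompt}
    if force_intent is not None:
        # Leave the field out entirely to let the agent classify intent.
        body["force_intent"] = force_intent
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def citation_names(response):
    """Return the names of cited nodes from a parsed response body."""
    answer = response.get("answer") or {}  # assumed absent for non-query intents
    return [c["name"] for c in answer.get("citations", [])]


request = build_prompt_request(
    "What infrastructure components depend on the auth service?",
    api_key="YOUR_API_KEY",
    force_intent="query",
)
# To send it: response = json.load(urllib.request.urlopen(request))
```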

For real-time chain-of-thought streaming, use POST /v1/ontology/prompt/stream, which emits Server-Sent Events as each pipeline step completes.
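A client consuming the stream has to split it into individual events. The sketch below is a small parser for the standard Server-Sent Events line format (event: and data: fields, blank line as terminator). The event name "step" and the JSON payload fields in the sample are hypothetical, since the event schema is not described here.

```python
import json


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # the format allows one optional leading space
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical sample stream; real event names and payloads may differ.
sample = [
    "event: step",
    'data: {"name": "embed", "ms": 42}',
    "",
    ": keep-alive",
    "event: step",
    'data: {"name": "retrieve", "ms": 88}',
    "",
]
events = [(name, json.loads(payload)) for name, payload in parse_sse(sample)]
print(events)
```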


MCP

The ontology.ask tool runs the question-answerer from any MCP-connected agent:

ontology.ask
  question: "Which nodes are connected to the payment service?"

The tool returns the narrative and citations in structured form. Citations include node_id, type, name, and excerpt — your agent can follow up with ontology.get_node to inspect any cited node in detail.


Retrieval modes

The workspace search.retrieval_mode config controls how candidates are gathered. Three modes are available:

Mode               Behaviour
hybrid (default)   Vector similarity pass merged with BM25 text-match pass. Best recall for most workspaces.
vector             Embedding similarity only. Useful when your workspace has dense, uniform node names.
structural         Graph-traversal pass anchored on the highest-importance nodes. Useful for deep dependency questions.

Change the mode via the workspace settings page or via POST /v1/workspaces/{id}/config:

{ "search": { "retrieval_mode": "structural" } }

The retrieval_mode value is echoed in every answer response so you can confirm which path ran.
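A small sketch of both halves of this workflow: building the config body for one of the three documented modes, and checking the echoed value in an answer. The helper names are illustrative, and rejecting unknown modes client-side is a choice made for the example rather than documented API behaviour.

```python
import json

RETRIEVAL_MODES = ("hybrid", "vector", "structural")


def retrieval_config_body(mode):
    """Return the JSON body for POST /v1/workspaces/{id}/config."""
    if mode not in RETRIEVAL_MODES:
        raise ValueError(f"unknown retrieval mode: {mode!r}")
    return json.dumps({"search": {"retrieval_mode": mode}})


def ran_with_mode(response, expected_mode):
    """Check the retrieval_mode echoed in a parsed answer response."""
    answer = response.get("answer") or {}
    return answer.get("retrieval_mode") == expected_mode


body = retrieval_config_body("structural")
print(body)
print(ran_with_mode({"answer": {"retrieval_mode": "hybrid"}}, "structural"))
```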


Interpreting citations

A citation means the agent found a node in your ontology that it judged relevant to the question. Every claim in the narrative should carry a citation, because the agent is prompted to claim only what the candidates support.

If the narrative contains [node:UUID] references that do not match any returned citation, the UUID was in the raw LLM response but not matched to a candidate record. This is rare and indicates a UUID hallucination — treat those claims as unverified.
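This check is easy to automate on the client side. The sketch below scans a narrative for [node:...] markers, in the form shown in the sample response, and reports any ID with no matching entry in citations. The function name and the regular expression are illustrative assumptions.

```python
import re

# Matches markers such as [node:abc123]; the ID is anything up to the bracket.
NODE_REF = re.compile(r"\[node:([^\]\s]+)\]")


def unverified_node_ids(answer):
    """Return IDs referenced in the narrative but missing from citations."""
    cited = {c["node_id"] for c in answer.get("citations", [])}
    seen, unmatched = set(), []
    for node_id in NODE_REF.findall(answer.get("narrative", "")):
        if node_id not in cited and node_id not in seen:
            seen.add(node_id)
            unmatched.append(node_id)
    return unmatched


answer = {
    "narrative": "The API gateway [node:abc123] calls billing [node:zzz999].",
    "citations": [{"node_id": "abc123", "type": "Service", "name": "API Gateway"}],
}
print(unverified_node_ids(answer))
```

Any ID this returns marks a claim to treat as unverified.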


Tips

What makes a good question

  • Specific entity names work best: "What does the payment service depend on?" outperforms "Tell me about dependencies."
  • Relationship questions are well-suited to hybrid retrieval: "Who owns the analytics pipeline?" or "Which services write to the events table?"
  • Multi-hop questions work if your graph has edge coverage: "What does the auth service's upstream infrastructure look like?"

When the answer says "I don't have anything about that"

The workspace ontology does not contain relevant nodes for your question. Options:

  1. Ingest more data — paste notes, connect a data source, or run a GitHub ingestion to populate the relevant area of the graph.
  2. Rephrase using node names that appear in the ontology explorer.
  3. Check candidate_count in the API response. A value of zero means retrieval returned nothing, not that the LLM gave up.
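The third option can be turned into a quick triage helper. This is a sketch under the assumption that candidate_count and citations sit under answer, as in the sample response; the returned labels are made up for illustration.

```python
def diagnose_empty_answer(response):
    """Classify why a parsed answer response came back empty."""
    answer = response.get("answer") or {}
    if answer.get("candidate_count", 0) == 0:
        # Retrieval found nothing: ingest more data or rephrase.
        return "no-candidates"
    if not answer.get("citations"):
        # Candidates were retrieved, but the model cited none of them.
        return "candidates-not-cited"
    return "grounded"


print(diagnose_empty_answer({"answer": {"candidate_count": 0, "citations": []}}))
```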

Controlling candidate depth

Pass top_k in the API request to override the workspace default (20):

{ "prompt": "...", "force_intent": "query", "top_k": 40 }

Higher values improve recall on sparse workspaces at the cost of a larger LLM context window and slightly higher latency.
