Artifact storage
Where Oxagen stores everything the agent produces — the unified app.documents table, the source / provider / generated_by_run_id discriminators, and the documents browser that answers "what did the agent ship last week?".
Every artifact the agent generates — Google Doc, Google Sheet,
Google Slides deck, PDF — lands in the same app.documents table
that holds user-uploaded files and in-product markdown notes.
There is no agent-specific document store, no parallel
agent_artifacts table, no hidden bucket. The agent's output is a
first-class workspace document.
The unified app.documents row
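Three columns disambiguate the agent surface (source, generated_by_run_id, and provider); the remaining three locate the vendor copy and record lineage: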
| Column | Values | Meaning |
|---|---|---|
| source | user_upload · agent_generated · external_import | Where the document came from. Constrained by a CHECK so SQL consumers can rely on the closed value set. |
| generated_by_run_id | UUID | The agent run that produced this document. Populated when source='agent_generated'. |
| provider | google · microsoft · local · connector-specific | The external system responsible for the binary. local means the bytes live in object storage; the rest mean a vendor copy. |
| external_id | string | Stable id for the vendor object (e.g. Google Drive file id). |
| external_url | string | Viewer URL: the link that opens the live doc in the vendor's UI. |
| parent_document_id | UUID | Self-FK to the document this row was derived from. A docs.export_pdf PDF points at the source Doc; an external sync revision points at the previous revision. |
Beyond these, the row carries the same fields a user upload
carries — label, name, extension, mime_type,
storage_path, preview_pdf_path, page_count,
file_size_bytes, ingestion_status. Generation does not skip
ingestion: an agent-authored doc is indexed into the workspace
graph the same way an uploaded doc is, so an agent reading
"every doc that mentions Acme" sees its own output alongside
human uploads.
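The closed value set on source can be illustrated with a small, self-contained sketch. It uses an in-memory SQLite table as a stand-in for app.documents; the table definition, the open_db and insert_document helpers, and the trimmed column list are assumptions made for illustration, not the real Oxagen DDL.

```python
import sqlite3
import uuid

# Hypothetical, trimmed-down stand-in for app.documents (SQLite, not the real DDL).
SCHEMA = """
CREATE TABLE documents (
    id                  TEXT PRIMARY KEY,
    workspace_id        TEXT NOT NULL,
    name                TEXT NOT NULL,
    source              TEXT NOT NULL
        CHECK (source IN ('user_upload', 'agent_generated', 'external_import')),
    generated_by_run_id TEXT,
    provider            TEXT NOT NULL,
    external_id         TEXT,
    external_url        TEXT,
    parent_document_id  TEXT REFERENCES documents(id)
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database holding the sketch table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def insert_document(conn, *, workspace_id, name, source, provider,
                    generated_by_run_id=None, parent_document_id=None) -> str:
    """Insert one row and return its id; the CHECK rejects unknown sources."""
    doc_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO documents (id, workspace_id, name, source, provider,"
        " generated_by_run_id, parent_document_id)"
        " VALUES (?, ?, ?, ?, ?, ?, ?)",
        (doc_id, workspace_id, name, source, provider,
         generated_by_run_id, parent_document_id),
    )
    return doc_id
```

Inserting a row with a source outside the three listed values raises sqlite3.IntegrityError in this sketch, which is the property the CHECK is meant to give SQL consumers.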
What gets stamped on each format
| Capability | kind | extension | mime_type | provider |
|---|---|---|---|---|
| docs.create_from_spec | uploaded | .gdoc | application/vnd.google-apps.document | google |
| sheets.create_from_spec | uploaded | .gsheet | application/vnd.google-apps.spreadsheet | google |
| slides.create_from_spec | uploaded | .gslides | application/vnd.google-apps.presentation | google |
| docs.export_pdf / sheets.export_pdf / slides.export_pdf | uploaded | .pdf | application/pdf | google (re-export from Drive) |
| pdf.convert | uploaded | .pdf | application/pdf | local (Gotenberg) |
source is always agent_generated. generated_by_run_id
always points at the run that authored it.
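The stamping rules above can be read as a lookup from capability to row fields. The sketch below encodes the table as a Python mapping; the Stamp class, the STAMPS dictionary, and the stamp_fields helper are hypothetical names introduced here, and only the values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamp:
    extension: str
    mime_type: str
    provider: str


_PDF_FROM_DRIVE = Stamp(".pdf", "application/pdf", "google")

# Values copied from the table above; the structure itself is an assumption.
STAMPS = {
    "docs.create_from_spec": Stamp(
        ".gdoc", "application/vnd.google-apps.document", "google"),
    "sheets.create_from_spec": Stamp(
        ".gsheet", "application/vnd.google-apps.spreadsheet", "google"),
    "slides.create_from_spec": Stamp(
        ".gslides", "application/vnd.google-apps.presentation", "google"),
    "docs.export_pdf": _PDF_FROM_DRIVE,
    "sheets.export_pdf": _PDF_FROM_DRIVE,
    "slides.export_pdf": _PDF_FROM_DRIVE,
    "pdf.convert": Stamp(".pdf", "application/pdf", "local"),
}


def stamp_fields(capability: str, run_id: str) -> dict:
    """Return the column values stamped on a row produced by `capability`."""
    stamp = STAMPS[capability]  # KeyError for capabilities not in the table
    return {
        "kind": "uploaded",
        "extension": stamp.extension,
        "mime_type": stamp.mime_type,
        "provider": stamp.provider,
        "source": "agent_generated",      # always, per the text above
        "generated_by_run_id": run_id,    # the run that authored the row
    }
```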
Provenance chain
The parent_document_id self-FK is the chain agents traverse to
answer "what was this PDF rendered from?":
```text
Drive Doc (source='agent_generated', generated_by_run_id=R1)
↑ parent_document_id
PDF export (source='agent_generated', generated_by_run_id=R1)
↑ parent_document_id
LibreOffice-converted PDF of a user upload (source='agent_generated', generated_by_run_id=R2)
```

The chain is workspace-scoped: a follow-up parent_document_id
hop never leaves the workspace. Re-rendering an artifact reuses
the chain (same parent, new sibling row) so audit reconstructs
"who exported this, when, with which run".
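A minimal sketch of the traversal, assuming rows are available as plain dictionaries keyed by document id. The provenance_chain function and the row shape are hypothetical; the workspace check mirrors the rule that a parent_document_id hop never leaves the workspace.

```python
from typing import Dict, List, Optional


def provenance_chain(rows: Dict[str, dict], doc_id: str) -> List[str]:
    """Return [doc_id, parent, grandparent, ...] by following parent_document_id.

    `rows` maps document id -> row dict with at least `workspace_id` and
    `parent_document_id` (None for a root document).
    """
    workspace_id = rows[doc_id]["workspace_id"]
    chain: List[str] = []
    seen = set()
    current: Optional[str] = doc_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in provenance chain at {current}")
        row = rows[current]
        if row["workspace_id"] != workspace_id:
            raise ValueError(f"document {current} is outside the workspace")
        seen.add(current)
        chain.append(current)
        current = row["parent_document_id"]
    return chain
```

In this sketch, a re-rendered export is simply another row whose parent_document_id points at the same source document, so both siblings resolve to the same root.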
The documents browser
The dashboard's documents browser reads app.documents directly.
The agent-output view applies one filter:
```sql
SELECT *
FROM app.documents
WHERE workspace_id = $1
  AND source = 'agent_generated'
  AND is_deleted = false
ORDER BY created_at DESC
LIMIT 50;
```

A composite index
(workspace_id, source, created_at DESC) supports the dominant
pagination pattern without an extra sort. The same index powers
the "agent output, this week" query on the workspace overview
card.
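The filter and the index can be tried end to end against SQLite. This is a sketch under stated assumptions: the table is a trimmed stand-in for app.documents, the index name and the agent_output helper are hypothetical, and created_at is stored as ISO-8601 text so that string order matches time order. The optional since argument approximates the "this week" variant.

```python
import sqlite3

# Hypothetical, trimmed-down stand-in for app.documents (SQLite, not the real DDL).
SCHEMA = """
CREATE TABLE documents (
    id           TEXT PRIMARY KEY,
    workspace_id TEXT NOT NULL,
    name         TEXT NOT NULL,
    source       TEXT NOT NULL,
    is_deleted   INTEGER NOT NULL DEFAULT 0,
    created_at   TEXT NOT NULL  -- ISO-8601 UTC, so text order matches time order
);
CREATE INDEX documents_ws_source_created
    ON documents (workspace_id, source, created_at DESC);
"""


def agent_output(conn, workspace_id, since=None, limit=50):
    """Newest-first agent-generated documents for one workspace."""
    sql = (
        "SELECT id, name, created_at FROM documents "
        "WHERE workspace_id = ? AND source = 'agent_generated' "
        "AND is_deleted = 0 "
    )
    params = [workspace_id]
    if since is not None:
        # Lower bound for views such as "agent output, this week".
        sql += "AND created_at >= ? "
        params.append(since)
    sql += "ORDER BY created_at DESC LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()
```

Because the index leads with the two equality columns and ends with created_at in descending order, a planner can typically read matching rows already sorted, which is the point the paragraph above makes.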
Tags
Workspace tags (app.document_tags joined via
app.document_tag_links) attach to agent-authored documents the
same way they attach to uploads. The generation tools auto-apply
three tags when present in the workspace tag dictionary:
| Tag | Value |
|---|---|
| agent | <agent_slug>: the named agent that produced the doc, when set. |
| kind | doc · sheet · slides · pdf |
| run | <run_id>: back-reference to the agent run. |
Tags missing from the dictionary are silently skipped — the agent never invents a tag. Workspace admins manage the dictionary at Settings → Document tags.
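The skip-if-missing rule can be sketched as a pure function. The auto_tags name and its signature are assumptions for illustration; the dictionary is modelled as a set of tag names defined for the workspace.

```python
from typing import Dict, Optional, Set


def auto_tags(dictionary: Set[str], *, kind: str, run_id: str,
              agent_slug: Optional[str] = None) -> Dict[str, str]:
    """Return the tags to apply, limited to names the workspace has defined."""
    candidates = {"kind": kind, "run": run_id}
    if agent_slug is not None:
        # The agent tag only exists when the run has a named agent.
        candidates["agent"] = agent_slug
    # Tags missing from the dictionary are skipped, never created.
    return {tag: value for tag, value in candidates.items() if tag in dictionary}
```

For a workspace whose dictionary defines only kind and run, a PDF produced by a named agent would receive those two tags and no agent tag.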
Linked nodes
The ingestion pipeline writes a document node to the workspace
ontology for every row and joins it to extracted entities via
mentions edges. app.document_node_links keeps the
Postgres ↔ Neo4j bridge — an agent calling
ontology.list_nodes { type: 'document' } gets every
agent-authored doc alongside user uploads, queryable by the
same mentions graph traversals.
Soft delete and retention
Documents follow the standard soft-delete contract from
WorkspaceScopedBase — is_deleted = true plus deleted_at and
deleted_by_id. A soft-deleted agent artifact stops appearing in
the documents browser and stops contributing to ingestion, but
the row remains queryable from the audit chain. Hard delete is a
workspace-admin-only operation; the audit.event row recording
the delete is not deletable (the audit schema revokes
DELETE from the oxagen role).
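A minimal sketch of the soft-delete contract, with rows modelled as dictionaries. The soft_delete and browser_view helpers are hypothetical; only the three field names come from the text above.

```python
from datetime import datetime, timezone
from typing import List, Optional


def soft_delete(row: dict, deleted_by_id: str,
                now: Optional[datetime] = None) -> dict:
    """Return a copy of `row` stamped with the soft-delete fields."""
    stamped = dict(row)
    stamped["is_deleted"] = True
    stamped["deleted_at"] = now or datetime.now(timezone.utc)
    stamped["deleted_by_id"] = deleted_by_id
    return stamped


def browser_view(rows: List[dict]) -> List[dict]:
    """Rows the documents browser would show: everything not soft-deleted."""
    return [row for row in rows if not row.get("is_deleted", False)]
```

The row is kept rather than removed, so an audit query that ignores is_deleted can still reach it.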
Storage targets
The binary location depends on the provider:
- provider = 'google': the canonical bytes live in Google Drive. storage_path is null; external_url is the viewer URL. The PDF export of a Google artifact is also stored locally (via the export endpoint's byte stream).
- provider = 'local': the bytes live in object storage at storage_path (gs://oxagen-documents/...). Used by pdf.convert and by user uploads.
- provider = 'microsoft': same as Google but the file lives in the Microsoft Graph drive item. Used by external imports from the Microsoft 365 connector.
Both paths, vendor-hosted and local, share the same app.documents surface so an agent
reading the documents browser never has to branch on provider.
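One way a consumer might locate a document without branching on provider is to prefer storage_path when it is set and fall back to external_url otherwise. The locate helper below is a hypothetical sketch of that idea, not a documented Oxagen API.

```python
from typing import Optional


def locate(row: dict) -> Optional[str]:
    """Best available location for a document row.

    Prefers the object-storage path when present (local bytes, or a locally
    stored export) and otherwise falls back to the vendor viewer URL.
    """
    return row.get("storage_path") or row.get("external_url")
```

Under this assumption a Drive-only Doc resolves to its viewer URL, while a converted PDF resolves to its object-storage path.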
Audit
Each generation, conversion, share, and delete writes an
audit.event row chained to the workspace's audit stream. See
Events, triggers, and audits.