Oxagen Docs

PDF generation

How Oxagen agents render documents to PDF — vendor-specific exports for generated Google Workspace artifacts, plus a generic LibreOffice-backed converter for everything else in the workspace document store.

The PDF surface is the agent's serialisation channel. Anything in app.documents that has a native office format (.docx, .pptx, .xlsx, .odt, .gdoc / .gsheet / .gslides) can be rendered to PDF and re-stored as a sibling document. The agent gets one deterministic byte stream regardless of source vendor.

There are two paths into PDF:

  1. Vendor-native exports for artifacts the agent itself just produced in Google Workspace — docs.export_pdf, sheets.export_pdf, slides.export_pdf. The Google API produces the PDF directly; no LibreOffice in the loop.
  2. Generic LibreOffice conversion for everything else — user-uploaded .docx / .pptx / .xlsx / .odt and externally-imported documents. pdf.convert routes through a Gotenberg service that drives a headless LibreOffice, the same converter that powers the universal document preview drawer in the web app.
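The choice between the two paths can be sketched as a small lookup. This is an illustrative helper, not part of the Oxagen API: the function name, the `agent_generated_in_google` flag, and the extension-to-capability mapping are assumptions inferred from the capability names.

```python
# Hypothetical mapping from Google Workspace extensions to the vendor-native
# export capabilities. The pairing is an assumption made for illustration.
VENDOR_EXPORTS = {
    ".gdoc": "docs.export_pdf",
    ".gsheet": "sheets.export_pdf",
    ".gslides": "slides.export_pdf",
}


def choose_capability(extension: str, agent_generated_in_google: bool) -> str:
    """Return the capability name to call for a given source document."""
    ext = extension.lower()
    # Vendor-native export applies only to artifacts the agent itself
    # produced in Google Workspace.
    if agent_generated_in_google and ext in VENDOR_EXPORTS:
        return VENDOR_EXPORTS[ext]
    # Everything else goes through the generic LibreOffice-backed converter.
    return "pdf.convert"


print(choose_capability(".gslides", True))
print(choose_capability(".docx", False))
```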

Capability surface

Capability         Credits  What it does
docs.export_pdf    3        Render a Google Doc the agent generated to PDF using the Google Drive export endpoint. Returns PDF bytes; stored as a sibling row with parent_document_id pointing at the source doc.
sheets.export_pdf  3        Same shape for a Google Sheet.
slides.export_pdf  3        Same shape for a Google Slides deck.
pdf.convert        2        Convert any office-format document already in app.documents to PDF via the LibreOffice-backed Gotenberg converter.

pdf.convert is the generic path — it accepts any document the LibreOffice filter understands (.docx, .pptx, .xlsx, .odt, .ods, .odp, .rtf, .txt, plus .gdoc / .gsheet / .gslides via an upstream Google export). The vendor-specific *.export_pdf capabilities are preferred for vendor-native artifacts because they preserve fidelity better than a LibreOffice round-trip.
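For budgeting a run, the credit costs listed for each capability can be summed ahead of time. The sketch below copies the four costs from the capability table; the `estimate_credits` helper itself is hypothetical and assumes each call is billed once at the listed rate.

```python
# Credit costs per capability, as listed in the capability table.
CREDITS = {
    "docs.export_pdf": 3,
    "sheets.export_pdf": 3,
    "slides.export_pdf": 3,
    "pdf.convert": 2,
}


def estimate_credits(calls: list[str]) -> int:
    """Sum the listed credit cost of a planned sequence of capability calls."""
    unknown = set(calls) - set(CREDITS)
    if unknown:
        raise ValueError(f"unknown capabilities: {sorted(unknown)}")
    return sum(CREDITS[name] for name in calls)


planned = ["docs.export_pdf", "pdf.convert", "pdf.convert"]
print(estimate_credits(planned))
```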

Gotenberg routing

pdf.convert posts the source bytes to a Gotenberg deployment configured via the GOTENBERG_URL secret. The converter is a thin HTTP wrapper around Gotenberg's /forms/libreoffice/convert route. Concretely:

  • Per-call timeout defaults to 120 seconds (override with GOTENBERG_TIMEOUT on the API service).
  • The provider uses pypdf to read the page count from the returned PDF; the count is stamped onto the resulting app.documents.page_count.
  • A missing GOTENBERG_URL returns a 503 pdf_converter_unavailable rather than a vendor-flavoured 500. This is the same sentinel the document-preview path catches to render a graceful "preview unavailable" badge.
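The behaviour in the list above can be approximated with the standard library alone. This is a minimal sketch, not Oxagen's converter: it assumes GOTENBERG_URL and GOTENBERG_TIMEOUT are read from environment variables, and the `ConverterUnavailable` class and both function names are hypothetical. The multipart field name `files` is what Gotenberg's LibreOffice route expects.

```python
import os
import urllib.request
import uuid
from collections.abc import Mapping

DEFAULT_TIMEOUT_SECONDS = 120.0


class ConverterUnavailable(Exception):
    """Hypothetical error mirroring the 503 pdf_converter_unavailable sentinel."""

    status = 503
    code = "pdf_converter_unavailable"


def converter_settings(env: Mapping[str, str] = os.environ) -> tuple[str, float]:
    """Resolve the convert URL and per-call timeout from configuration."""
    base_url = env.get("GOTENBERG_URL", "").strip()
    if not base_url:
        raise ConverterUnavailable("GOTENBERG_URL is not configured")
    timeout = float(env.get("GOTENBERG_TIMEOUT", DEFAULT_TIMEOUT_SECONDS))
    return base_url.rstrip("/") + "/forms/libreoffice/convert", timeout


def build_convert_request(url: str, filename: str, data: bytes) -> urllib.request.Request:
    """Wrap the source bytes in a multipart/form-data POST for Gotenberg."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url,
        data=head + data + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Sending the request would then be `urllib.request.urlopen(request, timeout=timeout).read()`, which returns the PDF bytes on success.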

The conversion pipeline never persists raw source bytes outside the workspace document store. Gotenberg holds the source for the duration of one conversion request and then drops it.

Calling it

curl -X POST https://api.oxagen.ai/v1/cap/pdf.convert \
  -H "Authorization: Bearer $OXAGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: $(uuidgen)" \
  -d '{
        "document_id": "8e6c…",
        "store_as_sibling": true
      }'

store_as_sibling: true (the default) writes the resulting PDF back to app.documents with parent_document_id pointing at the source. Set it to false to get the bytes back without storing.
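The same call can be built from Python with only the standard library. This sketch mirrors the curl example's URL, headers, and body; the function name is hypothetical, and the response format is not described here, so the sketch stops at building the request.

```python
import json
import urllib.request
import uuid

API_URL = "https://api.oxagen.ai/v1/cap/pdf.convert"


def build_pdf_convert_request(
    api_key: str, document_id: str, store_as_sibling: bool = True
) -> urllib.request.Request:
    """Build the POST request for pdf.convert without sending it."""
    payload = {"document_id": document_id, "store_as_sibling": store_as_sibling}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # A fresh key per logical request, like $(uuidgen) in the curl call.
            "Idempotency-Key": str(uuid.uuid4()),
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it; the API key would typically come from the OXAGEN_API_KEY environment variable used in the curl example.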

Storage and provenance

Each generated PDF lands in app.documents:

Column               Value
kind                 uploaded
extension            .pdf
mime_type            application/pdf
source               agent_generated
provider             local for pdf.convert, google for *.export_pdf
parent_document_id   The source document id
page_count           Populated from pypdf
generated_by_run_id  The agent run that produced the PDF

The documents browser, the preview drawer, and the ingestion pipeline all read this same row, so there is no PDF-specific code path. See Artifact storage.
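As a rough model of the row described above, the fixed and per-conversion columns can be written as a dataclass. The class and helper names are hypothetical, and the real app.documents schema likely has more columns; only the values listed in the table are taken from this page. `count_pages` uses pypdf's `PdfReader`, which is a third-party dependency.

```python
from dataclasses import asdict, dataclass
from io import BytesIO


@dataclass(frozen=True)
class PdfDocumentRow:
    """Illustrative subset of an app.documents row for a generated PDF."""

    parent_document_id: str
    generated_by_run_id: str
    provider: str
    page_count: int
    kind: str = "uploaded"
    extension: str = ".pdf"
    mime_type: str = "application/pdf"
    source: str = "agent_generated"


def provider_for(capability: str) -> str:
    """Map a capability name to the provider value stored on the row."""
    if capability == "pdf.convert":
        return "local"
    if capability.endswith(".export_pdf"):
        return "google"
    raise ValueError(f"not a PDF capability: {capability}")


def count_pages(pdf_bytes: bytes) -> int:
    """Read the page count with pypdf (imported lazily; requires pypdf)."""
    from pypdf import PdfReader

    return len(PdfReader(BytesIO(pdf_bytes)).pages)


row = PdfDocumentRow(
    parent_document_id="source-doc-id",  # placeholder id
    generated_by_run_id="run-id",  # placeholder id
    provider=provider_for("pdf.convert"),
    page_count=3,  # would come from count_pages(pdf_bytes)
)
print(asdict(row))
```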

Permissions

*.export_pdf capabilities require the workspace's Google Workspace connection (same as the underlying generator). pdf.convert only requires the workspace itself — Gotenberg runs server-side, not against a third-party vendor.

Audit

Each conversion writes an audit.event row with action='document.pdf_exported' (or 'document.pdf_converted' for the generic path), chained to the workspace's audit stream. See Events, triggers, and audits.


