Zoom Meetings
Ingest Zoom meetings, attendees, and recorded transcripts into the workspace knowledge graph. The connector infers action items, deliverables, decisions, problems, and opportunities as typed nodes and wires them to the people they belong to.
Meetings + transcripts → action items, decisions, deliverables.
The Zoom connector turns post-meeting cleanup into structured graph data. Every past meeting the authenticated host attended becomes a meeting node, every participant becomes a person node connected to that meeting, and — when cloud recording is enabled and a VTT transcript is available — an LLM extraction pass mines the transcript for action items, deliverables, decisions, problems, opportunities, and open-ended concepts. Each inferred item lands as its own typed node, edged back to the source meeting and (when the owner is on the participant list) to the person responsible.
The end result is a graph your agents can traverse in one hop: "What did Alice commit to in last week's meetings?", "Every customer pain we heard in Q2", "All deliverables that mention the Acme contract" — each is a single MATCH against the ontology.
What gets ingested
| Source | Node type | Edge type | Direction |
|---|---|---|---|
| Past meeting | meeting | — | — |
| Host (from meeting payload) | person | hosted / hosted_by | Person → Meeting + reverse |
| Participant (from past-meeting participants API) | person | had_participant / attended | Meeting → Person + reverse |
| Inferred action item | action_item | produced_action_item | Meeting → Action item |
| Inferred deliverable | deliverable | produced_deliverable | Meeting → Deliverable |
| Inferred decision | decision | produced_decision | Meeting → Decision |
| Inferred problem | problem | surfaced_problem | Meeting → Problem |
| Inferred opportunity | opportunity | surfaced_opportunity | Meeting → Opportunity |
| Inferred concept | concept | discussed_concept | Meeting → Concept |
| Action item / deliverable owner (when match exists) | person | assigned_to | Action item / deliverable → Person |
Every meeting node carries the raw Zoom UUID, topic, start and end times, duration, host email, join URL, and (when ingested) the cleaned transcript text plus a 2–4 sentence meeting summary. Every inferred-insight node carries the model's title, a 1–2 sentence grounded paraphrase, the claimed owner email, and a 0.0–1.0 confidence score.
How a single meeting flows through the graph
Consider a real sales call:
Acme weekly sync, Tuesday 2pm. Host: mac@oxagen.ai. Attendees: alice@oxagen.ai, bob@acme.com, carol@acme.com. Cloud recording on.

Transcript excerpt: "Bob: We're blocked on getting the SSO config approved by our security team. That's our biggest pain right now. Mac: I'll send over our SOC 2 report by Friday so you can fast-track approval. Alice: I'll draft the technical onboarding deck for the security review. Carol: We can probably commit to 100 seats if onboarding goes smoothly. Mac: We'll go with the Pro plan then. Bob: Sounds good."
After this meeting syncs, the workspace graph contains:
(meeting "zoom:abc==", topic="Acme weekly sync", start="...")
├─[:hosted_by]──> (person "mac@oxagen.ai")
├─[:had_participant]──> (person "alice@oxagen.ai")
├─[:had_participant]──> (person "bob@acme.com")
├─[:had_participant]──> (person "carol@acme.com")
│
├─[:produced_action_item]──> (action_item "Send SOC 2 report to Acme by Friday")
│ └─[:assigned_to]──> (person "mac@oxagen.ai")
├─[:produced_action_item]──> (action_item "Draft technical onboarding deck")
│ └─[:assigned_to]──> (person "alice@oxagen.ai")
├─[:produced_deliverable]──> (deliverable "Technical onboarding deck")
├─[:produced_decision]──> (decision "Go with Pro plan for Acme")
├─[:surfaced_problem]──> (problem "SSO approval blocked at Acme security")
├─[:surfaced_opportunity]──> (opportunity "Acme commits to ~100 seats")
├─[:discussed_concept]──> (concept "SOC 2 compliance")
├─[:discussed_concept]──> (concept "SSO onboarding")
└─[:discussed_concept]──> (concept "Pro plan pricing")

Every one of those nodes is queryable, embeddable, and edgeable. The alice@oxagen.ai node is the same node that Gmail, Calendar, Meet, and Contacts write to, so cross-source traversals "just work."
Real use cases
Sales: weekly customer-pain digest
"What problems did customers raise in the last 7 days, grouped by account?"
The agent runs:
MATCH (m:meeting)-[:surfaced_problem]->(p:problem)
WHERE m.start_time >= datetime() - duration({days: 7})
MATCH (m)-[:had_participant]->(person:person)
WHERE person.email ENDS WITH '@acme.com'
OR person.email ENDS WITH '@globex.com'
RETURN person.email AS account_contact,
p.title AS problem,
p.description AS context,
m.start_time AS heard_at
ORDER BY m.start_time DESC

The same problem nodes are then natural edge targets for competing_with, solved_by_feature, or builds edges written from other surfaces, for example Linear issues that name a feature, or GitHub PR descriptions that reference the same problem text.
Operations: who owes what, by Friday
"Show me every open action item assigned to me from last week's meetings."
MATCH (m:meeting)-[:produced_action_item]->(a:action_item)
-[:assigned_to]->(p:person {email: 'mac@oxagen.ai'})
WHERE m.start_time >= datetime() - duration({days: 7})
RETURN a.title, a.description, m.topic, m.start_time
ORDER BY m.start_time DESC

Because assigned_to only fires when the LLM-extracted owner email matches an existing person node, this query never returns hallucinated owners. The connector deliberately does not mint a person node from a model-extracted email: it looks the person up against the participant list (and the rest of your workspace's people corpus) and skips attribution when there's no match. The owner_email property on the action item preserves the model's raw claim for downstream review.
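The attribution rule described above can be sketched in a few lines. This is an illustration only, not the connector's code: the `Insight` class, the `attribute_owner` function, and the lower-casing of emails before comparison are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Insight:
    """Hypothetical stand-in for an action item or deliverable node."""
    title: str
    owner_email: Optional[str] = None   # raw model claim, always preserved
    assigned_to: Optional[str] = None   # set only when a known person matches


def attribute_owner(insight: Insight, known_people: Set[str]) -> bool:
    """Attach an owner only if the claimed email matches a known person.

    Returns True when an assigned_to edge would be emitted.
    Assumption: emails are compared case-insensitively.
    """
    claimed = (insight.owner_email or "").strip().lower()
    if claimed and claimed in known_people:
        insight.assigned_to = claimed
        return True
    # No match: owner_email stays on the insight for review, nothing is created.
    return False
```

With a people set of `{"mac@oxagen.ai", "alice@oxagen.ai"}`, an insight whose claimed owner is `ghost@example.com` keeps its `owner_email` but never gets an `assigned_to` value, which is why the query above cannot surface an invented owner.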
Product: opportunities by theme
"Surface every expansion / partnership / new-use-case opportunity mentioned across all meetings this quarter, clustered by topic."
The opportunity nodes carry both their own embedding and a discussed_concept neighborhood from the same meeting, so the agent can group them with k-NN against the concept embeddings and write back a clustered_with edge as a derived view.
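As a rough illustration of that grouping step, the sketch below assigns each opportunity to its single nearest concept by cosine similarity (k-NN with k = 1). The function names and the toy two-dimensional vectors in the usage note are assumptions for the example; real embeddings would come from the graph and have many more dimensions.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_by_nearest_concept(
    opportunities: Dict[str, Sequence[float]],
    concepts: Dict[str, Sequence[float]],
) -> Dict[str, List[str]]:
    """Map each concept title to the opportunity titles closest to it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for title, vector in opportunities.items():
        nearest = max(concepts, key=lambda name: cosine(vector, concepts[name]))
        groups[nearest].append(title)
    return dict(groups)
```

For example, with toy vectors `{"Acme commits to ~100 seats": [0.9, 0.1]}` for opportunities and `{"Pro plan pricing": [1.0, 0.0], "Co-marketing": [0.0, 1.0]}` for concepts, the opportunity lands under "Pro plan pricing". Each resulting group is a candidate set for a clustered_with edge.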
Engineering: tech-debt and risk surface
"Every problem that came up in our last 30 days of engineering syncs, and who raised it."
The connector is topic-neutral: it surfaces customer pain, market threats, blockers, broken systems, tech debt, and architectural risks under the same problem type. The system prompt nudges the model toward business / product / sales / engineering use cases but explicitly leaves room for research, classroom, medical, legal, and personal contexts via the open-ended concept category.
Customer success: the "what happened in this account" timeline
MATCH (org:concept {title: 'Acme'})<-[:discussed_concept]-(m:meeting)
OPTIONAL MATCH (m)-[:produced_action_item]->(a:action_item)
OPTIONAL MATCH (m)-[:surfaced_problem]->(p:problem)
OPTIONAL MATCH (m)-[:produced_decision]->(d:decision)
RETURN m.start_time, m.topic, m.meeting_summary,
collect(DISTINCT a.title) AS actions,
collect(DISTINCT p.title) AS problems,
collect(DISTINCT d.title) AS decisions
ORDER BY m.start_time DESC

One query, one timeline: Acme-relevant meetings only, with every action item, problem, and decision those meetings produced, including the meetings where Acme was discussed but no one from Acme attended.
How invites and attendance are wired
Zoom's API model and the connector's response to it:
| Zoom field | Where it comes from | Where it lands in the graph |
|---|---|---|
| `meeting.uuid` | /users/me/meetings?type=past (per-occurrence id, distinct from meeting_id which is shared across a recurring series) | meeting.name = "zoom:{uuid}", properties.zoom_uuid |
| `meeting.host_email` | Same payload | (person)-[:hosted]->(meeting) and reverse |
| `meeting.start_time` / `end_time` / `duration` | Same payload | meeting.properties.start_time, end_time, duration_minutes |
| `meeting.topic` | Same payload | meeting.properties.topic plus a display_name decorated with the start timestamp so recurring meetings with identical topics render as distinguishable cards |
| Participants list | /past_meetings/{uuid}/participants (paginated) | (meeting)-[:had_participant]->(person) + reverse (person)-[:attended]->(meeting), with display_name and join_time stamped on the edge |
People-resolution uses the same helper Gmail / Calendar / Meet use (upsert_person_by_email), so a Zoom attendee whose email also appears in your inbox or calendar resolves to the same person node — no duplicates, no reconciliation step. Phone-only / anonymous callers (no email) fall back to display-name keying so the relationship still lands; they just won't dedupe across surfaces.
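The keying rule in the paragraph above (email first, display name as a fallback) might look like the sketch below. The `person_key` function and its `email:` / `name:` prefixes are hypothetical; the internals of `upsert_person_by_email` are not shown in this document.

```python
from typing import Optional


def person_key(email: Optional[str], display_name: str) -> str:
    """Return a dedupe key for a meeting participant.

    Assumption: emails are normalised by trimming and lower-casing,
    and display names by collapsing whitespace and case-folding.
    """
    if email and email.strip():
        # Email-keyed: the same key any other connector would produce.
        return "email:" + email.strip().lower()
    # Phone-only / anonymous caller: fall back to the display name.
    return "name:" + " ".join(display_name.split()).casefold()
```

An email-keyed participant resolves to the same key regardless of which surface saw them, while a name-keyed caller only matches other records with the same display name, which is why those callers don't dedupe across surfaces.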
How transcripts and insights are inferred
When the host had cloud recording enabled and Zoom finished processing the VTT transcript, the connector:
- Fetches the transcript: `GET /meetings/{uuid}/recordings`, filters `recording_files` for `file_type == TRANSCRIPT`, downloads the `.VTT` file with the bearer token in the `Authorization` header (never in the URL; see security notes below).
- Parses VTT: strips the `WEBVTT` header, cue numbers, and timestamp lines; merges consecutive cues from the same speaker so half-sentences don't shred the downstream LLM context. The result is a clean `Speaker: line` transcript text.
- Persists the transcript: stored on `meeting.properties.transcript` (capped at the `max_transcript_chars` setting, default 120,000 chars) plus a `transcript_char_count` count. The transcript URL expires after Zoom deletes the recording, so persisting the parsed text locally is what keeps your graph queryable beyond Zoom's retention window.
- Runs an LLM extraction pass: FAST-tier model (Haiku-class by default; Sonnet/Opus is overkill for structured paraphrase + classification). The system prompt instructs the model to be precise, terse, and grounded: never invent participants, projects, or commitments. The model returns one JSON object with six insight arrays (`action_items`, `deliverables`, `decisions`, `problems`, `opportunities`, `concepts`) plus a `meeting_summary` string.
- Wires the insights: each insight becomes a typed node, find-or-create'd by a deterministic content slug (BLAKE2b hash of `kind:title.lower()`) so re-running the sync on the same transcript collapses to the same nodes instead of duplicating. The corresponding `produced_*` / `surfaced_*` / `discussed_*` edge from the meeting fires last. For action items and deliverables, an `assigned_to` edge is emitted only when the model-extracted owner email matches an existing `person` node; hallucinated emails are logged as `zoom.owner_unattributed` and the edge is skipped.
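The VTT parsing step can be sketched as follows. This is a minimal illustration under stated assumptions, not the connector's parser: it assumes each cue's text starts with a `Name:` prefix, and a cue with no prefix is treated as a continuation of the previous speaker. A cue whose text happens to contain an early colon (a clock time, say) would be misread as a speaker label.

```python
import re
from typing import List, Optional, Tuple

# Cue timing line, e.g. "00:00:01.000 --> 00:00:04.000"
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3}\s+-->\s+\d{2}:\d{2}:\d{2}\.\d{3}")
# Assumed cue shape: "Speaker Name: spoken text"
SPEAKER = re.compile(r"^([^:]{1,80}):\s*(.*)$")


def parse_vtt(vtt: str) -> str:
    """Strip VTT scaffolding and merge consecutive cues by the same speaker."""
    turns: List[Tuple[Optional[str], str]] = []
    for raw in vtt.splitlines():
        line = raw.strip()
        # Skip blank lines, the WEBVTT header, cue numbers, and timing lines.
        if not line or line.startswith("WEBVTT") or line.isdigit() or TIMESTAMP.match(line):
            continue
        match = SPEAKER.match(line)
        speaker, text = (match.group(1).strip(), match.group(2)) if match else (None, line)
        if turns and (speaker is None or turns[-1][0] == speaker):
            # Same speaker (or an unlabelled continuation): extend the last turn.
            prev_speaker, prev_text = turns[-1]
            turns[-1] = (prev_speaker, (prev_text + " " + text).strip())
        else:
            turns.append((speaker, text))
    return "\n".join(f"{s}: {t}" if s else t for s, t in turns)
```

Two consecutive cues from Bob come out as a single `Bob: ...` line, so the extraction model sees whole sentences rather than fragments split at cue boundaries.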
The system prompt deliberately leans toward business use cases (sales calls produce problem / opportunity nodes more often than research calls do), but the open-ended concept category catches everything else. In testing, transcripts from research interviews, classroom debates, medical consultations, and architecture reviews all produced high-signal concept nodes; they just don't necessarily generate action items.
Settings
| Key | Type | Default | Description |
|---|---|---|---|
| `backfill_days` | number | 90 | How far back to scan on first sync. Subsequent syncs are incremental from last_synced_at. |
| `ingest_transcripts` | boolean | true | Master switch for the transcript pipeline. When false, meetings + participants still ingest; transcripts are skipped entirely. |
| `infer_insights` | boolean | true | When false, the transcript is still stored on the meeting node as a property but the LLM extraction pass is skipped (saves credits; useful for archival-only setups). |
| `min_transcript_chars` | number | 200 | Transcripts shorter than this skip the LLM pass. Standups and 1:1s often fall below this; they still produce meeting + participant nodes, just no insight nodes. |
| `max_transcript_chars` | number | 120,000 | Transcripts longer than this are truncated before extraction to keep prompt cost bounded. The cap is in characters, not tokens, because that's what's predictable from the source. |
Edit these per-connection in Settings → Connections → Zoom → Settings.
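To make the interaction between these settings concrete, here is a small sketch of how they might gate the transcript pipeline. The `ZoomSettings` class and `plan_transcript` function are hypothetical names, and the order of the checks (truncate first, then compare against the minimum) is an assumption; the defaults mirror the table above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ZoomSettings:
    """Per-connection settings with the documented defaults."""
    backfill_days: int = 90
    ingest_transcripts: bool = True
    infer_insights: bool = True
    min_transcript_chars: int = 200
    max_transcript_chars: int = 120_000


def plan_transcript(transcript: str, settings: ZoomSettings) -> Tuple[Optional[str], bool]:
    """Return (text_to_store, run_llm_pass) for one parsed transcript."""
    if not settings.ingest_transcripts:
        # Master switch off: nothing stored, nothing extracted.
        return None, False
    # Cap by characters before anything else touches the text.
    stored = transcript[: settings.max_transcript_chars]
    # Extraction needs the switch on and enough text to be worth a call.
    run_llm = settings.infer_insights and len(stored) >= settings.min_transcript_chars
    return stored, run_llm
```

Under these assumptions a 150-character standup transcript is stored but not sent for extraction, and a 130,000-character transcript is stored and extracted from its first 120,000 characters.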
OAuth scopes
The Zoom Marketplace app requests read-only scopes only:
| Scope | What it's for |
|---|---|
| `meeting:read:list_past_instances` | List the authenticated host's past meeting occurrences |
| `meeting:read:past_meeting` | Pull individual meeting metadata |
| `meeting:read:list_past_participants` | Pull the participants list per occurrence |
| `cloud_recording:read:list_user_recordings` | Find the user's cloud recordings |
| `cloud_recording:read:recording` | Download the transcript file |
| `user:read:user` | Resolve the connecting user's email for nickname purposes |
No write scopes. No scheduling scopes. The connector cannot create, modify, or delete Zoom meetings.
Sync semantics
- Incremental cutoff: `last_synced_at` is the lower bound on subsequent runs; the first sync uses the `backfill_days` window from the manifest.
- Mid-pagination failure handling: if a list call fails part-way through (rate limit, transient 5xx, token revocation), the connection's `last_synced_at` is left unchanged so the next run retries from the same cutoff. No silent gaps.
- Token rotation propagation: Zoom rotates the refresh token on every use. A 401 mid-sync triggers a refresh-and-retry; the rotated credentials propagate forward through participant paging into the transcript download so a single rotation doesn't drop the transcript.
- Idempotency: meeting dedup uses `zoom:{uuid}` as the canonical name (UUID is per-occurrence, so recurring meetings get distinct nodes). Inferred-insight dedup uses a content slug, so re-running the sync on the same transcript converges instead of duplicating.
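The two idempotency keys can be sketched with the standard library. The function names and the 16-byte digest size are assumptions for illustration; the document only states that the slug is a BLAKE2b hash of the insight kind and lower-cased title, and that meetings are named `zoom:{uuid}`.

```python
import hashlib


def meeting_name(zoom_uuid: str) -> str:
    """Canonical meeting name: one node per occurrence UUID."""
    return f"zoom:{zoom_uuid}"


def insight_slug(kind: str, title: str, digest_size: int = 16) -> str:
    """Deterministic slug for an inferred insight.

    Hashes "kind:title" with the title lower-cased, so the same insight
    extracted on a re-run maps to the same node. digest_size is an
    assumed value, not one taken from the connector.
    """
    payload = f"{kind}:{title.lower()}".encode("utf-8")
    return hashlib.blake2b(payload, digest_size=digest_size).hexdigest()
```

Because the kind is part of the hashed payload, an action item and a deliverable with the same title get different slugs, while a title that differs only in letter case collapses to the same one.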
Security notes
- The bearer token is passed in the `Authorization` header on every Zoom API call and on the transcript download, never embedded in the URL query string. This avoids leaking the token into Nginx / Cloudflare / proxy access logs and `Referer` headers.
- Credentials are encrypted at rest with AES-GCM, scoped to the workspace's encryption key. Disconnecting the connector revokes the grant at Zoom and tears down the local row; opting into Purge nodes also hard-deletes every node and edge ingested from that connection.
- The connector never mints a `person` node from an LLM-extracted email. Owner attribution for action items and deliverables is a lookup against the existing People in the workspace; unmatched emails are preserved as a property on the insight node but do not create new entities.
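The header-only token rule can be illustrated with the standard library's `urllib.request`. The function name and the download URL in the usage note are placeholders, not values from the connector; the sketch only builds the request and does not send it.

```python
import urllib.request


def build_transcript_request(download_url: str, access_token: str) -> urllib.request.Request:
    """Build a GET request that carries the token in a header only.

    The URL is passed through untouched, so the token never appears in a
    query string that a proxy or access log could record.
    """
    return urllib.request.Request(
        download_url,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )
```

Calling `build_transcript_request("https://example.com/rec/download/abc", token)` yields a request whose `full_url` is exactly the URL passed in and whose `Authorization` header holds the bearer token.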