Gmail
Ingest email subjects, bodies, and attachments into the workspace knowledge graph. People, companies, and topics mentioned across your inbox become typed nodes your AI agents can traverse.
Inbox → typed people, threads, attachments, and topics.
The Gmail connector turns your inbox into structured graph data. Every message inside the configured backfill window becomes an email node. Senders and recipients resolve to person nodes, deduplicated by lowercased email address across every Google connector. Attachments between the minimum and maximum size thresholds are parsed (PDF, DOCX, plain text) and fed to the same LLM extractor that processes the body, so the entities they mention also land as typed nodes.
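The deduplication rule for person nodes can be illustrated with a small sketch. This is not the connector's actual code: `person_key` and `PersonIndex` are hypothetical names, and the in-memory dictionary only stands in for the graph. It shows how two differently formatted addresses from two connectors would collapse onto one person node when the key is the lowercased address.

```python
from dataclasses import dataclass, field
from email.utils import parseaddr


def person_key(raw_address: str) -> str:
    """Return the lowercased bare address used as the dedup key (hypothetical helper)."""
    _display_name, address = parseaddr(raw_address)
    return address.strip().lower()


@dataclass
class PersonIndex:
    """Toy in-memory stand-in for the graph's person nodes (an assumption)."""
    nodes: dict[str, dict] = field(default_factory=dict)

    def resolve(self, raw_address: str, source: str) -> dict:
        # Reuse the existing node when the lowercased address is already known.
        key = person_key(raw_address)
        node = self.nodes.setdefault(key, {"email": key, "sources": set()})
        node["sources"].add(source)
        return node


index = PersonIndex()
from_gmail = index.resolve("Alice Smith <Alice@Example.com>", source="gmail")
from_calendar = index.resolve("alice@example.com", source="calendar")
```

Both calls return the same node object, so evidence from Gmail and Calendar accumulates on a single person.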
What gets ingested
| Source | Node type | Edge type | Direction |
|---|---|---|---|
| Email message | email | — | — |
| Sender | person | from / reverse sent | Email → Person |
| Recipient (To / Cc) | person | to / reverse received | Email → Person |
| Attachment | attachment | has_attachment | Email → Attachment |
| Entity extracted from body or attachment text | varies (person / organization / concept / date / task) | mentions | Email → Entity |
Subjects, bodies, and parsed attachment text feed the LLM extractor — people, companies, dates, projects, and topics referenced across your inbox become first-class nodes your agents can query.
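As a rough illustration of the mapping in the table above, the sketch below turns one already-parsed message into nodes and `Email → X` edges. The input dictionary shape, the ID format, and the function name `message_to_graph` are assumptions made for this example; the reverse edges (`sent`, `received`) and node deduplication are left out to keep it short.

```python
def message_to_graph(message: dict) -> tuple[list[dict], list[tuple[str, str, str]]]:
    """Map one parsed message to (nodes, edges); edges are (source, type, target)."""
    email_id = f"email:{message['id']}"
    nodes = [{"id": email_id, "type": "email", "subject": message["subject"]}]
    edges: list[tuple[str, str, str]] = []

    def add(node_id: str, node_type: str, edge_type: str) -> None:
        # Every edge in the table points from the email to the other node.
        nodes.append({"id": node_id, "type": node_type})
        edges.append((email_id, edge_type, node_id))

    add(f"person:{message['from'].lower()}", "person", "from")
    for address in message.get("to", []) + message.get("cc", []):
        add(f"person:{address.lower()}", "person", "to")
    for filename in message.get("attachments", []):
        add(f"attachment:{message['id']}/{filename}", "attachment", "has_attachment")
    # Entities would come from the LLM extractor; here they are supplied by hand.
    for entity_type, name in message.get("entities", []):
        add(f"{entity_type}:{name}", entity_type, "mentions")
    return nodes, edges


nodes, edges = message_to_graph({
    "id": "m1",
    "subject": "Q3 report",
    "from": "Alice@Example.com",
    "to": ["bob@example.com"],
    "cc": [],
    "attachments": ["report.pdf"],
    "entities": [("organization", "Acme")],
})
```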
Real use cases
- Account intelligence: `MATCH (org:organization {name: 'Acme'})<-[:mentions]-(e:email)` returns every thread that talked about an account, including ones you weren't directly on.
- Inbox-aware action items: combined with the Zoom or Calendar connectors, a `person` node bridges email + meetings, so "Alice agreed to send the report in email and confirmed in our meeting" is one entity, two pieces of evidence.
- Find that thread: semantic search over the body embeddings retrieves emails by meaning, not just by subject keyword.
Settings
| Key | Type | Default | Description |
|---|---|---|---|
| `backfill_days` | number | 90 | How many days of inbox history to scan on the first sync. |
| `extract_body` | boolean | true | When false, only the email container is stored, with no entity extraction from the body. |
| `extract_attachments` | boolean | true | When false, attachments are stored as metadata-only nodes. |
| `min_attachment_bytes` | number | 51,200 (50 KB) | Attachments smaller than this skip extraction, which avoids parsing tracking-pixel-sized images. |
| `max_attachment_bytes` | number | 20,000,000 (20 MB) | Attachments larger than this are stored as metadata-only nodes, which avoids runaway LLM costs on huge decks. |
| `max_message_bytes` | number | 1,000,000 (1 MB) | Bodies larger than this are truncated before extraction. |
| `exclude_labels` | string[] | ["SPAM", "TRASH"] | Gmail labels to skip. Common additions: Promotions, Updates. |
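To make the interaction between these settings concrete, here is a minimal sketch of how the thresholds might be applied. The `GmailSettings` dataclass and the three helper functions are hypothetical and only mirror the keys and defaults in the table. Exact-match label comparison and the treatment of sizes exactly equal to a threshold are assumptions, not documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GmailSettings:
    """Defaults copied from the settings table; the class itself is illustrative."""
    backfill_days: int = 90
    extract_body: bool = True
    extract_attachments: bool = True
    min_attachment_bytes: int = 51_200
    max_attachment_bytes: int = 20_000_000
    max_message_bytes: int = 1_000_000
    exclude_labels: tuple[str, ...] = ("SPAM", "TRASH")


def should_skip_message(labels: list[str], settings: GmailSettings) -> bool:
    # Assumes labels are compared by exact string match.
    return any(label in settings.exclude_labels for label in labels)


def attachment_action(size_bytes: int, settings: GmailSettings) -> str:
    if not settings.extract_attachments:
        return "metadata_only"
    if size_bytes < settings.min_attachment_bytes:
        return "skip_extraction"   # too small to be worth parsing
    if size_bytes > settings.max_attachment_bytes:
        return "metadata_only"     # too large to send to the extractor
    return "extract"


def body_for_extraction(body: bytes, settings: GmailSettings) -> Optional[bytes]:
    if not settings.extract_body:
        return None
    return body[: settings.max_message_bytes]  # truncate oversized bodies


defaults = GmailSettings()
strict = GmailSettings(exclude_labels=("SPAM", "TRASH", "Promotions", "Updates"))
```

With the defaults, a 1 KB tracking image skips extraction, a 2 MB PDF is extracted, and a 25 MB deck is kept as a metadata-only node.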
OAuth scopes
- `gmail.readonly`: read messages and metadata; no send, no modify, no delete
- `userinfo.email`: resolve the connecting user's email for nickname purposes
Tokens are encrypted at rest and refresh automatically. Revoking the grant at Google revokes our copy.
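For reference, the two short scope names correspond to full Google scope URLs. The sketch below shows what a consent URL requesting exactly these scopes could look like using Google's standard OAuth 2.0 authorization endpoint. The client ID, redirect URI, and `build_consent_url` helper are placeholders for illustration; the connector's real authorization flow is not described on this page.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Full scope URLs for the two scopes listed above.
GMAIL_CONNECTOR_SCOPES = [
    "https://www.googleapis.com/auth/gmail.readonly",
    "https://www.googleapis.com/auth/userinfo.email",
]


def build_consent_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build a Google consent URL for the read-only scopes (hypothetical helper)."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": " ".join(GMAIL_CONNECTOR_SCOPES),  # scopes are space-delimited
        "access_type": "offline",  # asks Google for a refresh token
        "state": state,
    }
    return "https://accounts.google.com/o/oauth2/v2/auth?" + urlencode(params)


# Placeholder values, not real credentials.
url = build_consent_url(
    client_id="YOUR_CLIENT_ID.apps.googleusercontent.com",
    redirect_uri="https://example.com/oauth/callback",
    state="opaque-csrf-token",
)
requested = parse_qs(urlparse(url).query)["scope"][0].split(" ")
```

Because neither scope grants write access, a token issued for this URL cannot send, modify, or delete mail.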