Meeting summaries that actually land in Salesforce, Slack, and the rest of your stack — with citations.
We build automated summary and action-item pipelines on Claude, GPT-4, and Whisper that respect privacy boundaries and route output to the systems your team already uses. No standalone 'AI notes' tab nobody opens.
Most 'AI meeting notes' tools produce summaries nobody reads and action items nobody trusts.
The pattern repeats. A team buys a notetaker, it dumps 800-word recaps into a folder, action items lack speaker attribution, and within six weeks reps stop opening them. Worse, the tool stores recordings on a third-party server with retention policies your security team never reviewed. Sales leaders still hand-type next steps into Salesforce. Researchers still re-watch interview recordings to find the quote they remember. The AI part works fine — the integration, attribution, and trust layer is what was never built.
- ▸ Generic summaries with no speaker attribution, so reps can't tell who committed to what or when in the call.
- ▸ Action items dumped into a separate dashboard instead of pushed to Salesforce Tasks or Slack threads where work happens.
- ▸ Recordings stored on vendor infrastructure with retention defaults that conflict with HIPAA, CUI, or basic enterprise policy.
- ▸ No audit trail when the LLM hallucinates a commitment or misattributes a decision — and it will, occasionally, do both.
Build the pipeline like an engineer, not a demo.
- STEP-01
Capture and diarize cleanly
Pull recordings from Zoom, Gong, or Teams via webhook on call end. Run diarization (pyannote or AssemblyAI) before transcription so speaker turns survive into the LLM context. Store raw audio in your S3 bucket, not the vendor's. (See the capture sketch after these steps.)
- STEP-02
Chunk with structure, not tokens
Split transcripts by speaker turn and topic shift, not arbitrary token windows. A 60-minute sales call becomes 8-12 semantic chunks with timestamps and speaker IDs preserved. This is what makes attribution actually work in the final summary. (See the chunking sketch after these steps.)
- STEP-03
Two-pass extraction with citations
First pass: per-chunk extraction of decisions, action items, objections, next steps. Second pass: dedupe and consolidate across chunks. Every action item carries a speaker attribution and a timestamp citation back to the transcript so reps can verify before pushing to Salesforce.
- STEP-04
Route to the systems people use
Action items land in Salesforce as Tasks linked to the Opportunity. Summary posts to the deal's Slack channel. Customer interview notes sync to Notion or Dovetail with tags. Nobody opens a separate 'AI tool' — the output shows up where work already happens.
- STEP-05
Privacy controls that survive audit
PII redaction before LLM call, configurable retention windows, per-room consent flags, and an audit log of every prompt and completion. For regulated calls (HIPAA, ITAR, CUI) we route to Bedrock or Azure OpenAI in your tenant — recordings never leave your VPC.
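What step 1 looks like in practice: a minimal sketch assuming a Zoom recording.completed webhook and AssemblyAI for diarized transcription. The handler name, bucket, and payload wiring are illustrative, and webhook verification plus download auth are elided.

import boto3
import requests
import assemblyai as aai

s3 = boto3.client("s3")
aai.settings.api_key = "..."  # from your secrets manager, never source

def on_recording_completed(event: dict) -> list[dict]:
    # Zoom's recording.completed payload carries signed download URLs
    meeting = event["payload"]["object"]
    audio = requests.get(
        meeting["recording_files"][0]["download_url"], timeout=120
    ).content  # download auth elided

    # Raw audio lands in your bucket, not the vendor's
    key = f"recordings/{meeting['uuid']}.m4a"
    s3.put_object(Bucket="your-meetings-bucket", Key=key, Body=audio)

    # Diarize during transcription so speaker turns survive into LLM context
    local_path = f"/tmp/{meeting['uuid']}.m4a"
    with open(local_path, "wb") as f:
        f.write(audio)
    transcript = aai.Transcriber().transcribe(
        local_path, aai.TranscriptionConfig(speaker_labels=True)
    )
    # Speaker labels and timestamps ride along to the chunking step
    return [
        {"speaker": u.speaker, "start_s": u.start / 1000,
         "end_s": u.end / 1000, "text": u.text}
        for u in transcript.utterances
    ]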
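And step 2, chunking with structure: a sketch that splits on speaker turns, starting a new chunk at long silences (a cheap stand-in for a real topic-shift detector) or when a chunk grows too large. Tune gap_s and max_turns so a 60-minute call lands in the 8-12 chunk range.

from dataclasses import dataclass, field

@dataclass
class Chunk:
    start_s: float
    utterances: list[dict] = field(default_factory=list)

def chunk_by_structure(utterances: list[dict], gap_s: float = 20.0,
                       max_turns: int = 40) -> list[Chunk]:
    """Split on speaker-turn boundaries; start a new chunk at a long
    silence (crude topic-shift proxy) or when a chunk gets too big.
    Speaker IDs and timestamps ride along untouched."""
    if not utterances:
        return []
    chunks = [Chunk(start_s=utterances[0]["start_s"])]
    prev_end = utterances[0]["start_s"]
    for u in utterances:
        topic_shift = u["start_s"] - prev_end > gap_s
        too_big = len(chunks[-1].utterances) >= max_turns
        if chunks[-1].utterances and (topic_shift or too_big):
            chunks.append(Chunk(start_s=u["start_s"]))
        chunks[-1].utterances.append(u)
        prev_end = u["end_s"]
    return chunks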
import anthropic
from pydantic import BaseModel, Field
from typing import Literal

class ActionItem(BaseModel):
    owner: str = Field(description="Speaker name from diarization")
    task: str = Field(description="Specific, verifiable action")
    due: str | None = Field(description="ISO date if mentioned, else null")
    citation_ts: float = Field(description="Seconds into recording")
    confidence: Literal["high", "medium", "low"]

class MeetingSummary(BaseModel):
    tldr: str = Field(max_length=600)
    decisions: list[str]
    action_items: list[ActionItem]
    open_questions: list[str]
    risks: list[str]

client = anthropic.Anthropic()

# Force structured output — no free-form 'here's a summary' prose.
# SYSTEM_PROMPT, diarized_transcript, queue_for_human_review,
# push_to_salesforce_task, and ctx come from the surrounding pipeline.
response = client.messages.create(
    model="claude-sonnet-4",
    max_tokens=4096,  # the Messages API requires an explicit cap
    system=SYSTEM_PROMPT,  # includes attribution rules
    messages=[{"role": "user", "content": diarized_transcript}],
    tools=[{"name": "emit_summary", "input_schema": MeetingSummary.model_json_schema()}],
    tool_choice={"type": "tool", "name": "emit_summary"},
)
summary = MeetingSummary.model_validate(response.content[0].input)

# Every action item is now traceable back to a timestamp before it touches Salesforce
for item in summary.action_items:
    if item.confidence == "low":
        queue_for_human_review(item)
    else:
        push_to_salesforce_task(item, opportunity_id=ctx.opp_id)

Structured output with per-item citations is the difference between a summary you trust and one your reps quietly stop reading.
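And here's push_to_salesforce_task sketched with simple_salesforce; the mapping below uses standard Task fields (Subject, WhatId, ActivityDate), and agreeing on your org's actual field mapping is exactly the rollout conversation described in the FAQ.

from simple_salesforce import Salesforce

sf = Salesforce(username="...", password="...", security_token="...")

def push_to_salesforce_task(item: ActionItem, opportunity_id: str) -> None:
    """Create a Task on the Opportunity, citation included, so a rep can
    verify the commitment before acting on it."""
    sf.Task.create({
        "Subject": item.task,
        "WhatId": opportunity_id,  # links the Task to the Opportunity
        "ActivityDate": item.due,  # null is fine; Salesforce leaves it blank
        "Description": (
            f"Owner on call: {item.owner} | "
            f"cited at {item.citation_ts:.0f}s into the recording"
        ),
        "Status": "Not Started",
    })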
Field FAQ.
→ How accurate are the summaries, really?
On clean audio with good diarization, action item extraction lands in the 85-95% precision range with structured output and a citation-required prompt. The failure mode isn't usually hallucination — it's missing items when speakers talk over each other or use vague language like 'we should probably look at that.' We tune the prompt to flag low-confidence items for human review rather than silently dropping or inventing them.
→ Can speaker attribution actually work, or is it always 'Speaker 1 said'?
It works if you do the diarization step properly before sending to the LLM. We map diarized speaker labels to real names using calendar invite attendees plus a short voice enrollment for recurring participants. For one-off calls with unknown speakers, we keep the generic labels but tag them with role inferred from context (host, prospect, technical). Skipping diarization and asking the LLM to figure out who said what from the transcript alone does not work reliably.
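A sketch of that mapping step, assuming you've already extracted one voice embedding per diarized label (pyannote's embedding model is one way) and keep enrolled embeddings for recurring participants:

import numpy as np

def name_speakers(label_embeddings: dict[str, np.ndarray],
                  enrolled: dict[str, np.ndarray],
                  threshold: float = 0.7) -> dict[str, str]:
    """Map diarized labels ('A', 'B', ...) to real names by cosine
    similarity against enrolled voiceprints; below threshold, keep the
    generic label rather than guess."""
    names = {}
    for label, emb in label_embeddings.items():
        best_name, best_sim = None, threshold
        for name, ref in enrolled.items():
            sim = float(np.dot(emb, ref) /
                        (np.linalg.norm(emb) * np.linalg.norm(ref)))
            if sim > best_sim:
                best_name, best_sim = name, sim
        names[label] = best_name or f"Speaker {label}"
    return names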
→ Where does the summary actually end up?
Wherever your team already works. Typical destinations: Salesforce Tasks and Opportunity activity timeline for sales calls, Slack channel post for the deal or project, Notion or Confluence page for internal meetings, Dovetail or a research repo for customer interviews. We also push to Linear or Jira when action items are engineering work. The integration layer is usually a few hundred lines of TypeScript per destination — nothing exotic.
→ What about privacy and sensitive recordings?
Three controls matter. First, where the model runs — for regulated workloads we use Bedrock, Azure OpenAI, or self-hosted models inside your VPC so audio and transcripts never cross a tenant boundary. Second, PII redaction before the LLM call using Presidio or a similar tool. Third, retention — raw recordings purge on a schedule you set, and the audit log captures every prompt, completion, and downstream write. This is also what makes it defensible for federal and healthcare clients.
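The redaction control, sketched with Presidio as named above; the entity list is a policy choice, and speaker names are deliberately left alone because attribution needs them:

from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

def redact(transcript: str) -> str:
    """Strip PII before any LLM call. Phone numbers, emails, SSNs, and
    card numbers are replaced with entity-type placeholders."""
    findings = analyzer.analyze(
        text=transcript,
        language="en",
        entities=["PHONE_NUMBER", "EMAIL_ADDRESS", "US_SSN", "CREDIT_CARD"],
    )
    return anonymizer.anonymize(text=transcript, analyzer_results=findings).text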
→ Does this work for federal or CUI environments?
Yes. As an SDVOSB we deploy this pattern in GovCloud and Azure Government using FedRAMP-authorized model endpoints (Bedrock in GovCloud, Azure OpenAI Gov). Transcripts and recordings stay inside the boundary, the audit log meets standard logging requirements, and we can wire the pipeline into approved collaboration tools. We have shipped variants of this for contracting officer interviews and internal program reviews where recording handling is non-negotiable.
→ How long does a typical rollout take?
For a single source (say, Zoom) and two destinations (Salesforce + Slack) with standard privacy controls, three to five weeks end to end including a two-week pilot with one team. Adding sources or destinations is usually a week each. The long pole is almost never the AI part — it's getting OAuth scopes approved in your Salesforce org and agreeing on what fields action items map to.
→ What does it cost to run per meeting?
Model costs for a 60-minute meeting run roughly $0.15 to $0.60 depending on model choice (Claude Sonnet vs Haiku, GPT-4o vs mini) and whether you're doing single-pass or two-pass extraction. Transcription via AssemblyAI or Whisper adds another $0.30-$0.60. Call it $0.45-$1.20 per meeting all-in; at 1,000 meetings a month that's roughly $450-$1,200 in API spend — trivial compared to rep time saved, but worth modeling before you commit to a model tier.
→ Why not just use the summary feature built into Zoom or Gong?
Use them if they're sufficient. They're fine for a generic recap. They fall short when you need (a) action items that actually land in Salesforce with the right field mapping, (b) custom extraction logic for your sales methodology or research framework, (c) summaries that respect your privacy and retention policies, or (d) outputs feeding downstream automation. The build-vs-buy line is usually drawn at integration depth and policy control.
→ Can it handle customer interviews differently from sales calls?
Yes, and it should. Sales calls extract MEDDIC or BANT signals, objections, and next-step commitments. Customer interviews extract pain points, jobs-to-be-done quotes with verbatim attribution, and feature requests tagged by theme. Same pipeline, different prompt and schema per meeting type, routed by calendar metadata or a tag the organizer sets. Mixing them produces mush — separate them from day one.
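A sketch of that routing; SALES_PROMPT, INTERVIEW_PROMPT, and InterviewSummary are illustrative names defined elsewhere in the pipeline, and MeetingSummary is the schema from the code above.

from pydantic import BaseModel

# One pipeline, one (prompt, schema) pair per meeting type. Never mixed.
PIPELINES: dict[str, tuple[str, type[BaseModel]]] = {
    "sales_call": (SALES_PROMPT, MeetingSummary),  # MEDDIC/BANT, objections
    "customer_interview": (INTERVIEW_PROMPT, InterviewSummary),  # JTBD quotes
}

def route(meeting: dict) -> tuple[str, type[BaseModel]]:
    """Pick the pipeline from an organizer-set tag, falling back to a
    calendar-metadata heuristic."""
    kind = meeting.get("tag") or (
        "customer_interview" if "interview" in meeting["title"].lower()
        else "sales_call"
    )
    return PIPELINES[kind]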
Continue recon.
REL-01 AI Integration Services
How we wire Claude, GPT, and RAG into existing workflows without rip-and-replace.
REL-02 Shipped Integrations
Real rollouts: timelines, model choices, and what we'd do differently next time.
REL-03 Fixed-Scope Packages
Defined pilots for meeting summarization and other AI integration patterns.
REL-04 Scope a Pilot
Bring a recording and a target system. We'll sketch the pipeline on a call.
Stop paying reps to take notes. Let's wire your meetings into the systems they should already be in.
Talk to a VooStack operator. We respond within one business day.