Which Services Include Speaker Identification? (2026)
Speaker identification pricing varies wildly across transcription services. Some include it free in base pricing, others charge $0.02+ per hour as an add-on, and a few bury the cost in unclear documentation. Whether speaker ID is included or extra can raise your effective rate by 13% or more, which matters most for meeting transcription, interviews, and podcast production, where knowing "who said what" is the point.
This guide compares speaker identification pricing across 7 major transcription services, showing exactly which include it free and which charge extra.
Quick Navigation
- Speaker Identification: Included vs Extra
- Services That Include Speaker ID Free
- Services That Charge Extra for Speaker ID
- Full Comparison Table
- Real Cost Impact: Examples
- When Speaker ID Matters Most
- How to Choose Based on Your Needs
- Related Resources
Speaker Identification: Included vs Extra
Speaker identification (technically called speaker diarization) detects different voices in your audio and labels them consistently throughout the transcript—"Speaker 1 said this, then Speaker 2 responded."
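To make the labeling idea concrete, here is a minimal Python sketch of how diarized segments might be rendered with consistent labels. The `Segment` structure and the `SPEAKER_00`-style raw IDs are assumptions for illustration, not the output format of any particular service.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One diarized utterance; field names are illustrative assumptions."""
    raw_speaker: str  # opaque ID assigned by a diarizer, e.g. "SPEAKER_01"
    text: str


def label_speakers(segments: list[Segment]) -> list[str]:
    """Render segments as 'Speaker N: text', numbering voices by first appearance."""
    labels: dict[str, str] = {}
    lines = []
    for seg in segments:
        # The same raw ID always maps to the same label, so attribution stays consistent.
        if seg.raw_speaker not in labels:
            labels[seg.raw_speaker] = f"Speaker {len(labels) + 1}"
        lines.append(f"{labels[seg.raw_speaker]}: {seg.text}")
    return lines


segments = [
    Segment("SPEAKER_01", "Can we ship on Friday?"),
    Segment("SPEAKER_00", "Only if QA signs off."),
    Segment("SPEAKER_01", "I'll check with them today."),
]
for line in label_speakers(segments):
    print(line)
```

The first voice heard becomes "Speaker 1" regardless of its raw ID, and it keeps that label every time it returns.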
The pricing models fall into two categories:
Included in base price: You pay one rate and get speaker labels automatically. No surprise charges, no feature toggles.
Add-on feature: Base transcription is cheaper, but enabling speaker identification adds a per-minute surcharge. For the add-on services priced in this guide, that raises the effective rate by roughly 13-26%.
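The surcharge percentage is simple arithmetic: the add-on rate divided by the base rate. A small sketch with hypothetical function names, using AssemblyAI's hourly figures from later in this guide as sample inputs:

```python
def effective_rate(base_rate: float, addon_rate: float = 0.0) -> float:
    """Rate actually paid once the speaker ID add-on is enabled (same unit as the inputs)."""
    return base_rate + addon_rate


def surcharge_percent(base_rate: float, addon_rate: float) -> float:
    """Add-on cost expressed as a percentage of the base rate."""
    if base_rate <= 0:
        raise ValueError("base_rate must be positive")
    return 100.0 * addon_rate / base_rate


# Sample inputs in dollars per hour: $0.15 base, $0.02 add-on.
print(f"${effective_rate(0.15, 0.02):.2f}/hour")
print(f"+{surcharge_percent(0.15, 0.02):.1f}%")
```

Both functions are unit-agnostic, so per-minute rates work the same way as long as base and add-on use the same unit.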
Services That Include Speaker ID Free
BrassTranscripts
Pricing: $2.50 flat (1-15 min) | $6.00 flat (16-120 min)
Speaker ID: Included, no extra charge
Speaker identification runs automatically on every upload using Pyannote 3.1 diarization. No checkbox to enable, no add-on to purchase. A 60-minute meeting with 4 speakers costs the same $6.00 as a single-speaker podcast episode.
- Works with 2-6 speakers
- Labels included in all output formats (TXT, SRT, VTT, JSON)
- No API required—upload via web interface
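The flat-rate tiers above can be sketched as a lookup. The function name is hypothetical, and rounding partial minutes up to the next whole minute is an assumption, since the tiers are only stated in whole minutes:

```python
import math


def brass_flat_price(duration_min: float) -> float:
    """Flat per-file price in dollars for the two published tiers."""
    minutes = math.ceil(duration_min)  # assumption: partial minutes round up
    if minutes < 1 or minutes > 120:
        raise ValueError("published tiers cover files of 1-120 minutes")
    return 2.50 if minutes <= 15 else 6.00


# Speaker count is not an input: a 4-speaker meeting and a solo episode cost the same.
print(brass_flat_price(60))
print(brass_flat_price(12))
```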
Otter.ai
Pricing: $8.33-16.99/month (Pro) | $20-30/month (Business)
Speaker ID: Included in all paid plans
Otter.ai includes speaker diarization in subscription tiers. The free plan also supports speaker ID but with 30-minute meeting limits.
- Real-time transcription with live speaker labels
- Meeting integrations (Zoom, Teams, Meet)
- Subscription-based with minute caps
Amazon Transcribe
Pricing: $0.024/min (Tier 1) down to $0.0078/min (Tier 4)
Speaker ID: Included in base price
According to AWS Transcribe pricing, speaker identification is included in standard transcription at no additional charge.
- Requires AWS account and S3 setup
- API-only (no web upload interface)
- Best for high-volume enterprise use with existing AWS infrastructure
Services That Charge Extra for Speaker ID
AssemblyAI
Base pricing: $0.0025/min ($0.15/hour)
Speaker ID add-on: +$0.00033/min (+$0.02/hour)
Total with speaker ID: $0.00283/min ($0.17/hour)
According to AssemblyAI's pricing page, speaker diarization costs an additional $0.02 per hour on top of base transcription.
Cost impact: Adding speaker ID increases your bill by ~13%.
Example: 10 hours of meeting transcription
- Base only: $1.50
- With speaker ID: $1.70 (+$0.20)
Deepgram
Base pricing: $0.0077/min Nova-3 (PAYG) | $0.0065/min (Growth)
Speaker ID add-on: ~$0.001-0.002/min (pricing not clearly published)
Total with speaker ID: ~$0.0087-0.0097/min
Deepgram offers speaker diarization as a separate feature. Exact pricing isn't prominently displayed on their public pricing page.
Cost impact: Adding speaker ID increases your bill by ~13-26%.
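Because the add-on price is only an estimate, it can help to budget with a low and high bound rather than a single number. A rough sketch, assuming the pay-as-you-go base rate and the estimated add-on range quoted above (the function name is hypothetical):

```python
DEEPGRAM_BASE_PER_MIN = 0.0077  # Nova-3 pay-as-you-go rate quoted above
ADDON_LOW_PER_MIN = 0.001       # low end of the estimated add-on range
ADDON_HIGH_PER_MIN = 0.002      # high end of the estimated add-on range


def cost_range(hours: float, base: float, addon_low: float, addon_high: float) -> tuple[float, float]:
    """Return (low, high) dollar cost for the given hours of audio."""
    minutes = hours * 60
    return minutes * (base + addon_low), minutes * (base + addon_high)


low, high = cost_range(20, DEEPGRAM_BASE_PER_MIN, ADDON_LOW_PER_MIN, ADDON_HIGH_PER_MIN)
print(f"20 hours: ${low:.2f} to ${high:.2f}")
```

Treat both bounds as estimates until the add-on rate is confirmed with the vendor.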
Google Cloud Speech-to-Text
Base pricing: $0.016-0.024/min depending on model
Speaker ID add-on: Extra charge (pricing not clearly documented)
Google Cloud offers speaker diarization but doesn't clearly publish the add-on pricing on their standard pricing page. Based on GCP patterns, expect additional per-minute charges.
Cost impact: Unknown without GCP account access.
Rev.ai
Base pricing: $0.02/min (English) | $0.033/min (Global)
Speaker ID: Included in async transcription, extra for streaming
Rev.ai includes speaker diarization for pre-recorded audio but charges extra for real-time streaming with speaker labels.
Full Comparison Table
| Service | Base Rate | Speaker ID | Total w/ Speaker ID | Model |
|---|---|---|---|---|
| BrassTranscripts | $6.00 flat (16-120 min) | Included | $6.00 flat | Pay-per-file |
| Otter.ai | $8.33-16.99/mo | Included | Same | Subscription |
| Amazon Transcribe | $0.024/min | Included | $0.024/min | Pay-per-minute |
| AssemblyAI | $0.0025/min | +$0.00033/min | $0.00283/min | Pay-per-minute |
| Deepgram | $0.0077/min | +~$0.0015/min | ~$0.0092/min | Pay-per-minute |
| Google Cloud | $0.016-0.024/min | Extra (rate not published) | Unknown | Pay-per-minute |
| Rev.ai | $0.02/min | Included (async) | $0.02/min | Pay-per-minute |
Prices verified January-February 2026. Check official pricing pages for current rates.
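For the pay-per-minute rows, the table translates directly into a small cost calculator. This is a sketch only: the class and function names are hypothetical, the Deepgram add-on is the table's estimate rather than a published rate, and Google Cloud is omitted because its add-on price is unknown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerMinuteService:
    name: str
    base_per_min: float
    speaker_id_per_min: float  # 0.0 when speaker ID is included in the base rate

    def cost(self, minutes: float) -> float:
        """Dollar cost of transcribing `minutes` of audio with speaker ID on."""
        return minutes * (self.base_per_min + self.speaker_id_per_min)


SERVICES = [
    PerMinuteService("Amazon Transcribe", 0.024, 0.0),   # Tier 1 rate
    PerMinuteService("AssemblyAI", 0.0025, 0.00033),
    PerMinuteService("Deepgram", 0.0077, 0.0015),        # add-on is an estimate
    PerMinuteService("Rev.ai (async)", 0.02, 0.0),
]


def rank_by_cost(hours: float) -> list[tuple[str, float]]:
    """Return (service, dollars) pairs for the given monthly hours, cheapest first."""
    minutes = hours * 60
    priced = [(s.name, round(s.cost(minutes), 2)) for s in SERVICES]
    return sorted(priced, key=lambda pair: pair[1])


for name, dollars in rank_by_cost(10):
    print(f"{name}: ${dollars:.2f}")
```

Swapping in your own monthly hours gives a quick ranking, though it ignores volume tiers, subscriptions, and flat per-file pricing.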
Real Cost Impact: Examples
Example 1: Weekly Team Meetings (4 hours/month)
A team transcribing four 1-hour meetings monthly:
| Service | Monthly Cost | Speaker ID |
|---|---|---|
| BrassTranscripts | $24.00 (4 × $6.00) | Included |
| Otter.ai Pro | $16.99 subscription | Included |
| AssemblyAI | $0.60 (base) + $0.08 (speaker ID) = $0.68 | +13% |
| Amazon Transcribe | $5.76 | Included |
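Verdict: On raw cost, AssemblyAI (under $1) and Amazon Transcribe ($5.76) are cheapest at this volume, but both are API-driven and Amazon requires AWS setup. Among options with a ready-made interface, Otter.ai Pro ($16.99) costs less than BrassTranscripts ($24.00), while BrassTranscripts needs no subscription.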
Example 2: Podcast Production (20 hours/month)
A podcast producer transcribing 20 hours of interviews monthly, as twenty 1-hour files:
| Service | Monthly Cost | Speaker ID |
|---|---|---|
| BrassTranscripts | $120.00 (20 × $6.00) | Included |
| AssemblyAI | $3.00 (base) + $0.40 (speaker ID) = $3.40 | +13% |
| Amazon Transcribe | $28.80 | Included |
| Deepgram | $9.24 (base) + ~$1.80 (speaker ID) = ~$11.04 | +19% |
Verdict: At higher volumes, per-minute services (AssemblyAI, Amazon Transcribe) are significantly cheaper than flat-rate pricing. But they require API integration—no simple upload interface.
Example 3: Variable Usage (2-15 hours/month)
A consultant with unpredictable transcription needs (BrassTranscripts figures assume 1-hour files):
| Scenario | BrassTranscripts | Otter.ai Pro | AssemblyAI + Speaker ID |
|---|---|---|---|
| Light month (2 hrs) | $12.00 | $16.99 | $0.34 |
| Heavy month (15 hrs) | $90.00 | $16.99 | $2.55 |
| Average (8 hrs) | $48.00 | $16.99 | $1.36 |
Verdict: Otter.ai's subscription wins for predictable heavy usage that stays within its minute caps. AssemblyAI wins on pure cost but requires API coding. BrassTranscripts offers a middle ground: no subscription, no API needed, speaker ID included.
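The subscription versus pay-per-file trade-off comes down to a break-even point. A minimal sketch, assuming every file falls in the $6.00 tier and ignoring subscription minute caps (the function name is hypothetical):

```python
import math


def break_even_files(subscription_per_month: float, price_per_file: float) -> int:
    """Smallest monthly file count at which per-file spend meets or exceeds the subscription."""
    if price_per_file <= 0:
        raise ValueError("price_per_file must be positive")
    return math.ceil(subscription_per_month / price_per_file)


files = break_even_files(16.99, 6.00)
print(f"Break-even at {files} files per month")
```

Below that file count, paying per file is cheaper; at or above it, the subscription costs less or the same.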
When Speaker ID Matters Most
Speaker identification is essential for:
Business meetings: Track accountability—who committed to what, who raised concerns, who made decisions.
Research interviews: Distinguish interviewer questions from participant responses for qualitative analysis.
Podcast production: Separate host from guests for editing, pull quotes, and show notes.
Legal depositions: Document who made each statement for the record.
Focus groups: Analyze individual participant contributions across group discussions.
Sales calls: Review what the prospect said versus what your rep promised.
If your use case involves multiple speakers and you need to know who said what, factor speaker ID pricing into your service comparison—not just base transcription rates.
How to Choose Based on Your Needs
Choose services with speaker ID included if:
- You transcribe meetings, interviews, or multi-speaker content
- You want predictable pricing without add-on surprises
- You don't want to remember to enable a feature toggle
Best options: BrassTranscripts (no subscription, simple upload), Otter.ai (subscription with integrations), Amazon Transcribe (high volume with AWS)
Consider add-on pricing if:
- You're cost-optimizing at scale (100+ hours/month)
- You have engineering resources for API integration
- Some transcriptions don't need speaker labels
Best options: AssemblyAI (cheapest per-minute with speaker ID add-on), Deepgram (real-time streaming capabilities)
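When only part of your audio needs speaker labels, add-on pricing lets you pay the surcharge on that share alone. A rough sketch using the AssemblyAI hourly rates quoted earlier; the function name and the 40% share are illustrative assumptions:

```python
def blended_monthly_cost(hours: float, share_with_speaker_id: float,
                         base_per_hour: float, addon_per_hour: float) -> float:
    """Monthly cost when the add-on is enabled for only a fraction of the audio."""
    if not 0.0 <= share_with_speaker_id <= 1.0:
        raise ValueError("share_with_speaker_id must be between 0 and 1")
    return hours * (base_per_hour + share_with_speaker_id * addon_per_hour)


# 100 hours per month at $0.15/hour base and $0.02/hour add-on.
partial = blended_monthly_cost(100, 0.40, 0.15, 0.02)
full = blended_monthly_cost(100, 1.00, 0.15, 0.02)
print(f"40% labeled: ${partial:.2f} vs 100% labeled: ${full:.2f}")
```

At these rates the absolute savings are small, so the toggle mostly matters at much higher volumes or with pricier add-ons.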
Skip speaker ID services if:
- You only transcribe single-speaker content (lectures, voice memos, dictation)
- Speaker attribution isn't relevant to your workflow
Best options: Any service—compare on base transcription price alone.
Try Speaker-Identified Transcription
Want to see how speaker identification works on your audio? Upload a file to BrassTranscripts and preview the first 30 words free—including speaker labels. No account required, no credit card needed for the preview.
For technical details on how speaker identification works, see our complete guide to AI transcription with speaker identification.
Related Resources
- What Is Speaker Diarization? - How the technology works
- AI Transcription with Speaker Identification: Complete Guide - In-depth technical guide
- How to Get Speaker Names in Transcripts - Assigning real names to speaker labels
- Multi-Speaker Transcription: Fix "Who Said What" Issues - Troubleshooting speaker attribution errors
- Amazon Transcribe Pricing 2026 - Detailed AWS cost breakdown
Pricing Disclaimer
Information valid as of February 2026. Pricing data verified from official pricing pages: AssemblyAI, Deepgram, Amazon Transcribe, Otter.ai, Google Cloud Speech-to-Text. Services may change pricing, features, or terms at any time. Always verify current rates directly before making purchasing decisions.