16 min read · BrassTranscripts Team

Whisper API Pricing (2026) — $0.006/min vs Self‑Host Costs

OpenAI's Whisper API costs $0.006 per minute ($0.36/hour) for managed transcription, one of the most competitive rates on the market. But that's only half the story. The real question developers face isn't "what does Whisper cost?" but "should I use OpenAI's managed API or self-host the open-source Whisper model?"

Self-hosting Whisper means deploying the open-source model on your own infrastructure. You control everything: the model version, processing pipeline, data privacy, and infrastructure configuration. But you also pay for everything: GPU instances ($276/month minimum), DevOps overhead, maintenance, and scaling complexity.

The Whisper API eliminates infrastructure: no servers, no GPUs, no scaling headaches. Just $0.006/min and an API call. But you lose control over model versions, processing customization, and data residency. And for high-volume deployments (roughly 2,400+ hours/month once DevOps time is counted), API costs can exceed self-hosted infrastructure expenses.

For comparing transcription pricing across all major services, see our comprehensive cost analysis.

In this guide, we'll break down OpenAI Whisper API's complete 2026 pricing, calculate the true cost of self-hosting Whisper (infrastructure + DevOps + maintenance), reveal when each approach makes sense, and show you a simpler $6.00 flat rate alternative that requires neither API integration nor infrastructure management.


At a glance: API: $0.006/min • Self-host: $276+/mo • BrassTranscripts: $6 flat


Quick answer: Managed Whisper API = $0.006/min; self‑host costs start at $276/mo — use our calculator to see which is cheaper for your volume.

| Use Case | Whisper API (monthly) | Self-Hosted (monthly) |
|---|---|---|
| Light (100 hrs) | $36 | $861+ (not cost-effective) |
| Medium (500 hrs) | $180 | $861+ |
| Heavy (2,000 hrs) | $720 | $861+ (approaching break-even) |


OpenAI Whisper API pricing — what $0.006/min actually covers

OpenAI lists Whisper API at $0.006 per minute for managed transcription. That covers the API call and base transcription model; speaker ID, advanced models, or additional processing can add $0.003–$0.01/min. Below are three sample bills so you can see the real monthly cost.

Per-minute pricing, speaker ID, and additional fees

According to OpenAI's official pricing page:

| Service | Price Per Minute | Price Per Hour | Model |
|---|---|---|---|
| Whisper API | $0.006/min | $0.36/hour | Whisper large-v2 |
| GPT-4o Transcribe | $0.006/min | $0.36/hour | GPT-4o audio |
| GPT-4o Mini Transcribe | $0.003/min | $0.18/hour | GPT-4o mini audio |

Pricing from OpenAI documentation. All pricing subject to change—verify current rates before committing.

Key insight: OpenAI offers three managed transcription options, and GPT-4o Mini Transcribe is listed at half the Whisper API rate.

Billing Details

  • Per-minute rate, with duration rounded to the nearest second
  • Charges based on actual audio duration
  • No minimum file size
  • 25 MB file size limit per request
  • Supports 50+ languages
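As a rough illustration of the billing rules above, here is a minimal Python sketch that estimates the charge for a single request and checks a file against the upload limit. The function names are hypothetical, the rate is the one quoted in this article, and reading "25 MB" as 25 × 1024 × 1024 bytes is an assumption; check OpenAI's documentation for the exact limit and rounding behavior.

```python
import math

WHISPER_RATE_PER_MIN = 0.006          # USD per minute, as quoted in this article
MAX_UPLOAD_BYTES = 25 * 1024 * 1024   # assumption: "25 MB" interpreted as 25 MiB


def request_cost(duration_seconds: float,
                 rate_per_min: float = WHISPER_RATE_PER_MIN) -> float:
    """Estimate the charge for one file, rounding duration to the nearest second."""
    billed_seconds = math.floor(duration_seconds + 0.5)  # round half up
    return billed_seconds / 60 * rate_per_min


def fits_upload_limit(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if a file of this size can be sent in a single request."""
    return size_bytes <= limit


if __name__ == "__main__":
    print(f"1-hour file: ${request_cost(3600):.2f}")
    print(f"30 MB file fits: {fits_upload_limit(30 * 1024 * 1024)}")
```

A check like this is cheap to run before uploading, so oversized files can be routed to compression or chunking instead of failing at the API.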

Free Tier

OpenAI does NOT offer a free tier for Whisper API. You need:

  • OpenAI account (free to create)
  • API key
  • Credit card on file
  • Pay-as-you-go from first minute

Comparison to competitors: Many competitors offer free credits or trial tiers for new users. OpenAI Whisper API has no free tier—you pay from the first minute. Check each provider's current offers before committing.

This means testing costs real money—even a 10-hour proof-of-concept costs $3.60.

Sample monthly bill: 100, 500, and 2,000 hours

Here's what your Whisper API bill looks like at different volumes:

| Monthly Volume | Base Transcription | + Speaker ID ($0.005/min) | Total Monthly Cost |
|---|---|---|---|
| 100 hours | $36.00 | +$30.00 | $66.00 |
| 500 hours | $180.00 | +$150.00 | $330.00 |
| 2,000 hours | $720.00 | +$600.00 | $1,320.00 |

Speaker ID requires a separate diarization service (Pyannote, AssemblyAI) — not included in base Whisper API pricing.

Whisper API cost vs self-hosted Whisper (real-world comparison)

Self-hosting eliminates per-minute API charges but adds predictable infrastructure and maintenance costs: GPU instances (from $276/mo), SRE time, and storage. We show the break-even point where self-hosting becomes cheaper than paying per minute.

The Whisper model is open-source and free to use. Anyone can download it, run it on their own hardware, and transcribe unlimited audio without paying OpenAI. So why does the managed API exist? Convenience vs control.

Self-hosting Whisper: GPU, storage, and DevOps cost breakdown

Required Infrastructure:

| Component | Monthly Cost | Notes |
|---|---|---|
| GPU Instance (T4) | $276–350 | Google Cloud, AWS, or RunPod |
| Storage | $2–5 | Audio files before/after processing |
| Load Balancer | $18 | For production multi-request handling |
| Monitoring | $10–20 | CloudWatch, Stackdriver, etc. |
| DevOps (setup amortized) | $50–100 | $2,000–4,000 one-time over 40 months |
| DevOps (monthly) | $500–1,000 | 5–10 hours × $100/hour |
| Total | $861–1,500 | Before any transcription happens |

Managed API vs Self-Hosted Comparison:

| Factor | Whisper API | Self-Hosted |
|---|---|---|
| Infrastructure | None | $276+/mo GPU required |
| DevOps overhead | None | Deployment, monitoring, scaling |
| Processing speed | Faster than real-time | Varies by GPU hardware |
| File size limit | 25 MB | No limit |
| Custom fine-tuning | Not available | Full control |
| Data residency | OpenAI servers | Your infrastructure |

When self-hosting becomes cheaper than API — break-even examples

Infrastructure-only crossover: about 767 hours/month ($276 ÷ $0.36 per hour ≈ 766.7; above that volume, $0.006/min × volume exceeds the $276 fixed cost)

Real-world crossover with DevOps costs:

| Volume | API Cost | Self-Hosted Cost | Winner |
|---|---|---|---|
| 200 hrs/mo | $72 | $861 | API (12× cheaper) |
| 1,000 hrs/mo | $360 | $861 | API (2.4× cheaper) |
| 2,000 hrs/mo | $720 | $861 | API (1.2× cheaper) |
| 2,400 hrs/mo | $864 | $861 | Break-even |
| 3,000 hrs/mo | $1,080 | $861 | Self-hosted (~20% cheaper) |

Bottom line: Once DevOps overhead is included, self-hosting only breaks even at ~2,400 hours/month. Factor in the assumptions and hidden costs below, and for most teams the API stays cheaper until 3,000+ hours/month.
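The break-even arithmetic can be sketched in a few lines of Python. This is a simplified model, not a definitive calculator: it assumes self-hosting is a single fixed monthly cost (one GPU, no scaling steps) and uses the $0.006/min rate and the $276 and $861 figures from this article. The function names are hypothetical.

```python
API_RATE_PER_MIN = 0.006  # USD per minute, as quoted in this article


def api_cost(hours: float, rate_per_min: float = API_RATE_PER_MIN) -> float:
    """Monthly API cost for a given number of audio hours."""
    return hours * 60 * rate_per_min


def break_even_hours(fixed_monthly_cost: float,
                     rate_per_min: float = API_RATE_PER_MIN) -> float:
    """Hours per month at which API spend equals the fixed self-hosting cost."""
    return fixed_monthly_cost / (rate_per_min * 60)


def cheaper_option(hours: float, fixed_monthly_cost: float) -> str:
    """Name the cheaper option under the single-fixed-cost assumption."""
    return "self-hosted" if api_cost(hours) > fixed_monthly_cost else "api"


if __name__ == "__main__":
    print(f"GPU only ($276):    {break_even_hours(276):.0f} hrs/month")
    print(f"With DevOps ($861): {break_even_hours(861):.0f} hrs/month")
    for hours in (200, 1000, 2000, 3000):
        print(hours, cheaper_option(hours, 861))
```

If your workload needs a second GPU past some volume, the fixed cost becomes a step function and the crossover moves higher, so treat the output as a lower bound.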

Assumptions for self-hosting break-even:

  • DevOps team already exists (not hiring for this project)
  • Single GPU handles workload
  • 99%+ uptime reliability achieved

Hidden costs to factor in:

  • Processing speed: Self-hosted speed varies by GPU; API typically faster
  • Scaling complexity: 40–80 hours DevOps setup for auto-scaling, load balancing
  • Model updates: 10–20 hours quarterly for manual updates and testing
  • Data compliance: HIPAA/GDPR infrastructure adds $5K–$20K annually
  • Reliability: Multi-region failover, on-call engineering; 1 hour downtime can exceed $10K

Whisper API cost calculator (estimate your monthly spend)

Use this table to estimate your monthly Whisper API cost based on usage:

| Monthly Hours | API Only | + Speaker ID | + Dev Integration (est.) |
|---|---|---|---|
| 10 hours | $3.60 | $6.60 | $500+ one-time |
| 50 hours | $18 | $33 | $500+ one-time |
| 100 hours | $36 | $66 | $500+ one-time |
| 250 hours | $90 | $165 | $500+ one-time |
| 500 hours | $180 | $330 | $500+ one-time |
| 1,000 hours | $360 | $660 | $500+ one-time |
| 2,000 hours | $720 | $1,320 | $500+ one-time |

Formula: Monthly hours × 60 minutes × $0.006 = API cost

Add speaker ID: + Monthly hours × 60 × $0.005 (separate diarization service)
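For volumes not listed in the table, the two formulas above translate directly into a small Python helper. This is a sketch using the article's rates as defaults; the $0.005/min diarization rate is the article's working estimate for a separate service, not an OpenAI price, and the function name is hypothetical.

```python
def monthly_estimate(hours: float,
                     with_speaker_id: bool = False,
                     api_rate: float = 0.006,
                     diarization_rate: float = 0.005) -> float:
    """Estimated monthly spend in USD: hours x 60 x per-minute rate(s)."""
    minutes = hours * 60
    total = minutes * api_rate
    if with_speaker_id:
        # Diarization is billed by a separate provider; rate is an estimate.
        total += minutes * diarization_rate
    return round(total, 2)


if __name__ == "__main__":
    for hours in (10, 50, 100, 250, 500, 1000, 2000):
        print(f"{hours:>5} hrs  "
              f"API only ${monthly_estimate(hours):>8.2f}  "
              f"with speaker ID ${monthly_estimate(hours, True):>8.2f}")
```

One-time integration cost is deliberately left out, since it does not scale with volume.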

Compare to BrassTranscripts: $2.50 for files ≤15 min, $6.00 flat for 16+ min files — speaker ID included, no integration needed.

Whisper API pricing FAQ

Is $0.006/min the only charge?

$0.006/min covers base transcription only. Additional costs include:

  • Speaker diarization: $0.003–$0.01/min (separate service required)
  • File chunking logic: Dev time for files >25 MB
  • API integration: Initial development cost
  • Error handling & monitoring: Ongoing maintenance

For a complete solution with speaker ID, expect $0.009–$0.016/min total.

How does speaker diarization affect cost?

Whisper API does NOT identify speakers. You must add a separate diarization service, which adds to your per-minute costs. Common options include self-hosted Pyannote, AssemblyAI, or AWS Transcribe—verify current pricing with each provider.
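Most of the integration work lies in merging the two outputs: timestamped transcript segments on one side and speaker turns from the diarization tool on the other. A minimal sketch of one common approach, assigning each segment to the speaker whose turn overlaps it most, is shown below. The `Segment` and `Turn` shapes are hypothetical stand-ins; real tools return their own formats that you would map into these first.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped piece of transcript text (seconds)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """One speaker turn reported by a diarization tool (seconds)."""
    start: float
    end: float
    speaker: str


def overlap_seconds(seg: Segment, turn: Turn) -> float:
    """Length of the time interval shared by a segment and a turn."""
    return max(0.0, min(seg.end, turn.end) - max(seg.start, turn.start))


def label_segments(segments, turns):
    """Pair each segment with the speaker whose turn overlaps it the most."""
    labeled = []
    for seg in segments:
        best = max(turns, key=lambda t: overlap_seconds(seg, t), default=None)
        if best is None or overlap_seconds(seg, best) == 0.0:
            speaker = "UNKNOWN"  # no turn covers this segment
        else:
            speaker = best.speaker
        labeled.append((speaker, seg.text))
    return labeled


segments = [Segment(0.0, 4.0, "Hi"), Segment(4.0, 9.0, "Hello back")]
turns = [Turn(0.0, 4.5, "SPEAKER_A"), Turn(4.5, 10.0, "SPEAKER_B")]
result = label_segments(segments, turns)
```

Segments that straddle a speaker change are assigned to a single speaker here; splitting them at word level needs word timestamps and is outside this sketch.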

BrassTranscripts alternative: Speaker ID included in flat $6 price — no additional service needed.

Whisper API vs BrassTranscripts: simpler flat-rate option

When simplicity and fixed pricing win

Choose Whisper API when:

  • Processing 500+ hours/month with engineering team
  • Building API-driven product features
  • Need 50+ language support
  • Have dev resources for integration

Choose BrassTranscripts when:

  • Processing under 300 hours/month
  • No coding or DevOps experience
  • Need speaker identification included
  • Want predictable, all-inclusive pricing
  • Prefer no account required: upload and go

Example scenario: $6 flat rate vs API bill

Scenario 1: Podcast Producer (8 Episodes/Month, 45 Min Average)

Whisper API route:

Transcription: 8 × 45 × $0.006 = $2.16
Speaker diarization: 8 × 45 × $0.005 = $1.80
Developer integration: 20 hours × $100 = $2,000 (one-time)
Monthly maintenance: 2 hours × $100 = $200
─────────────────────────────────────────────────────
First month: $2,203.96
Ongoing: $203.96/month

BrassTranscripts route:

8 episodes × $6.00 = $48/month
Speaker ID: Included
Setup: $0
─────────────────────────────────────────────────────
Total: $48/month (no integration required)

Winner: BrassTranscripts saves $155.96/month ongoing — and $2,000 upfront integration cost.
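To rerun this comparison with your own file list, the per-file flat rate and the per-minute API rate can be modeled side by side. The sketch below uses the prices quoted in this article; treating any file longer than 15 minutes as the $6.00 tier is an assumption about how the boundary is applied, and the function names are hypothetical.

```python
def flat_rate_price(minutes: float) -> float:
    """Per-file flat price: $2.50 up to 15 min, $6.00 above (assumed boundary)."""
    return 2.50 if minutes <= 15 else 6.00


def flat_rate_monthly(file_minutes) -> float:
    """Monthly total when every file is billed at the flat per-file price."""
    return sum(flat_rate_price(m) for m in file_minutes)


def api_monthly(file_minutes,
                api_rate: float = 0.006,
                diarization_rate: float = 0.005,
                maintenance: float = 0.0) -> float:
    """Monthly API route: per-minute usage plus any recurring maintenance cost."""
    usage = sum(file_minutes) * (api_rate + diarization_rate)
    return round(usage + maintenance, 2)


episodes = [45] * 8  # the podcast scenario: 8 episodes of 45 minutes
flat_total = flat_rate_monthly(episodes)
api_usage_only = api_monthly(episodes)
api_with_maintenance = api_monthly(episodes, maintenance=200.0)
```

Note how the result depends almost entirely on the maintenance figure: usage alone is a few dollars, so the comparison flips for teams that already maintain an integration at no extra cost.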

Scenario 2: SaaS Company (800 Hours/Month)

Requirements:

  • 800 hours/month automated transcription
  • Building transcription feature into product
  • Engineering team in-house
  • Multi-language support needed

Whisper API Option:

Transcription: 800 × 60 × $0.006 = $288/month
API integration: $0 (team capability)
50+ languages: Included
─────────────────────────────────────────────────────
Total: $288/month

Self-Hosted Option:

GPU infrastructure: $276/month
DevOps overhead: $550/month
Total: $826/month

BrassTranscripts:

Not ideal for automated SaaS integration

Winner: Whisper API by nearly 3x ($288 vs $826). For SaaS products with engineering teams, API integration at $0.006/min dominates.

Scenario 3: Healthcare (200 Hours/Month, HIPAA Required)

Requirements:

  • 200 hours/month medical dictation
  • HIPAA compliance mandatory
  • Data must stay in healthcare org infrastructure
  • PHI cannot leave network

Whisper API Option:

OpenAI does NOT offer HIPAA BAA (Business Associate Agreement).
Cannot use Whisper API for PHI.

Self-Hosted Option:

GPU infrastructure: $276/month
HIPAA-compliant infrastructure: $150/month
DevOps: $550/month
Compliance overhead: $500/month
─────────────────────────────────────────────────────
Total: $1,476/month

BrassTranscripts:

Note: Verify HIPAA compliance status before using.
Cost depends on file count

Winner: Self-hosted Whisper wins for HIPAA contexts. Whisper API not HIPAA-compliant; data residency requirements force self-hosting.

Scenario 4: Enterprise Media Company (5,000 Hours/Month)

Requirements:

  • 5,000 hours/month video transcription
  • Existing DevOps team
  • Custom terminology (industry jargon)
  • Budget predictability critical

Whisper API Option:

Transcription: 5,000 × 60 × $0.006 = $1,800/month
No custom fine-tuning available

Self-Hosted Option:

GPU infrastructure (multiple instances): $800/month
DevOps: $550/month (team already exists)
Custom model fine-tuning: One-time + quarterly updates
─────────────────────────────────────────────────────
Total: $1,350/month

BrassTranscripts:

Not viable at this scale for automated processing

Winner: Self-hosted Whisper wins at 5,000+ hours/month. Custom model fine-tuning + fixed costs beat API pricing at scale.

AI Prompt: OpenAI Whisper API Pricing Calculator

Want to calculate your exact monthly OpenAI Whisper costs? Use this specialized AI prompt with ChatGPT, Claude, or any AI assistant:

The Prompt


You are an OpenAI Whisper cost calculator. Help me decide between Whisper API and self-hosting:

1. Monthly audio volume (in hours)
2. Team capabilities (engineering team? DevOps team?)
3. Data privacy requirements (HIPAA, GDPR, data residency?)
4. Need for custom model fine-tuning? (specialized terminology?)
5. Speaker identification required?
6. Multi-language support needed?

Calculate:
- Whisper API cost: [hours] × 60 × $0.006/min
- Speaker diarization (if needed): [hours] × 60 × $0.005/min
- API integration: Dev time estimate
- Self-hosted infrastructure: $276-350/month (GPU)
- Self-hosted DevOps: Setup ($2,000-4,000) + Monthly ($500-1,000)
- Total cost of ownership for each option
- Crossover point (when self-hosting becomes cheaper)

Compare to BrassTranscripts ($2.50 for 0-15 min, $6.00 flat rate for 16+ min files, no infrastructure, speaker ID included, no account needed).

My details:

[First, get accurate transcripts with BrassTranscripts - fast, affordable transcription services at https://brasstranscripts.com]

[Paste requirements]


This reveals whether managed API, self-hosting, or a simpler alternative fits your needs.

Final Verdict: Whisper API vs Self-Hosted vs BrassTranscripts

Not sure if Whisper is right for your needs? See our 2026 AI Transcription Selection Guide for a complete comparison of all major services.

OpenAI Whisper API delivers industry-leading pricing at $0.006/min for API-driven transcription workflows. For engineering teams processing hundreds to thousands of hours monthly, it's one of the most cost-effective solutions available—especially with 50+ language support at the same rate.

Choose Whisper API if:

  • You're processing 150+ hours/month with API-first architecture
  • You have engineering team comfortable with API integration
  • You need multi-language support at competitive rates
  • You want zero infrastructure management
  • You can handle 25 MB file size limits (chunking logic)

Choose self-hosted Whisper if:

  • You're processing 3,000+ hours/month (comfortably past the ~2,400-hour break-even)
  • You have DevOps team with GPU infrastructure expertise
  • You require HIPAA/GDPR data residency compliance
  • You need custom model fine-tuning for specialized terminology
  • You want full control over model versions and pipeline

Choose BrassTranscripts if:

  • You're processing under 300 hours/month
  • You want zero infrastructure AND zero API complexity
  • Your team has no coding or DevOps experience
  • You need speaker identification included
  • You value predictable, all-inclusive pricing
  • You want no account required: upload and go

For most small to medium transcription needs—podcasts, meetings, interviews, videos—BrassTranscripts' simple flat-rate pricing ($2.50 for 0-15 min, $6.00 flat rate for 16+ min) with included speaker ID and zero infrastructure delivers better value than Whisper API's $0.006/min + integration overhead + separate diarization service.

But if you're building API-driven products or processing thousands of hours monthly with an engineering team? Whisper API's pricing and simplicity are hard to beat.

Ready to try transcription without API complexity or infrastructure management? Upload your first file to BrassTranscripts and get your transcript with speaker ID included—no account, no API, no infrastructure required.


Frequently Asked Questions

How accurate is OpenAI Whisper API compared to competitors?

Whisper delivers professional-grade accuracy for clear audio, comparable to other leading transcription APIs. Accuracy varies based on audio quality, accents, background noise, and domain terminology. For specialized content (medical, legal), custom-trained models from competitors may perform better for industry-specific terminology.

Does Whisper API include speaker identification?

No. Whisper API transcribes audio but does NOT identify individual speakers. You must use a separate diarization service (Pyannote, AssemblyAI) and align timestamps, adding $0.003-0.01/min to your costs. BrassTranscripts includes speaker identification in the base pricing ($2.50 for 0-15 min files, $6.00 flat rate for 16+ min files).

Can I use Whisper API for real-time transcription?

Whisper API is not designed for real-time streaming. It processes pre-recorded audio files only. For live captioning or real-time transcription, use Deepgram's real-time streaming API, AssemblyAI's real-time API, or Google Cloud Speech-to-Text streaming.

What's the difference between Whisper API and self-hosted Whisper?

Whisper API: Managed service by OpenAI at $0.006/min. Zero infrastructure, simple API integration, no DevOps overhead.

Self-hosted Whisper: Open-source model you deploy on your own infrastructure. Full control, custom fine-tuning, data residency, but requires GPU servers ($276+/month), DevOps team, and maintenance. Including DevOps costs, self-hosting breaks even at ~2,400 hours/month and for most teams only becomes clearly cheaper at 3,000+ hours/month.

Does Whisper API support languages other than English?

Yes. Whisper API supports 50+ languages including Spanish, French, German, Chinese, Japanese, Arabic, Hindi, and more—all at the same $0.006/min rate. Check OpenAI's documentation for complete language list.

How does Whisper API pricing compare to competitors?

Whisper API at $0.006/min is competitively priced among transcription APIs. Some competitors offer lower batch rates, while real-time APIs typically cost more.

For multi-language support, Whisper's flat-rate pricing across 50+ languages is competitive. Always verify current competitor pricing directly with each provider before making decisions.

What happens if my audio file exceeds 25 MB?

Whisper API rejects files larger than 25 MB. You must:

  1. Compress audio (reduce bitrate or sample rate)
  2. Split audio into chunks under 25 MB
  3. Transcribe each chunk separately
  4. Stitch transcripts together with custom logic

This adds development complexity for long-form content.
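The planning and stitching steps can be sketched without any audio library. The example below is a simplified illustration that assumes constant-bitrate audio, reads "25 MB" as 25 × 1024 × 1024 bytes, and uses hypothetical function names and a plain dict shape for transcript segments. Actually cutting the audio (for example with an external tool) and calling the API are left out.

```python
import math

MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" interpreted as 25 MiB


def plan_chunks(duration_s: float, bitrate_kbps: float,
                limit_bytes: int = MAX_UPLOAD_BYTES, safety: float = 0.95):
    """Split a recording into equal (start, end) windows that fit the size limit.

    Assumes constant bitrate; `safety` leaves headroom for container overhead.
    """
    bytes_per_second = bitrate_kbps * 1000 / 8
    max_chunk_s = math.floor(limit_bytes * safety / bytes_per_second)
    count = math.ceil(duration_s / max_chunk_s)
    chunk_s = duration_s / count
    return [(i * chunk_s, min(duration_s, (i + 1) * chunk_s)) for i in range(count)]


def stitch(chunk_results, chunk_starts):
    """Merge per-chunk segments, shifting timestamps by each chunk's start offset."""
    merged = []
    for offset, segments in zip(chunk_starts, chunk_results):
        for seg in segments:
            merged.append({"start": seg["start"] + offset,
                           "end": seg["end"] + offset,
                           "text": seg["text"]})
    return merged


chunks = plan_chunks(duration_s=7200, bitrate_kbps=128)  # 2-hour file at 128 kbps
```

Cutting at fixed times can split a word in half. Overlapping adjacent chunks by a second or two, or cutting at detected silences, reduces that at the cost of some deduplication logic when stitching.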

Can I fine-tune Whisper API for custom terminology?

No. OpenAI's managed Whisper API uses the standard large-v2 model. Custom fine-tuning is only available with self-hosted open-source Whisper. For specialized vocabulary (medical, legal, technical), self-hosting or competitors with custom models (Deepgram, AssemblyAI) may deliver better accuracy.

Does OpenAI offer HIPAA compliance for Whisper API?

No. OpenAI does not currently offer a HIPAA Business Associate Agreement (BAA) for Whisper API. Healthcare organizations transcribing protected health information (PHI) cannot use Whisper API.

Alternatives:

  • Self-host Whisper on HIPAA-compliant infrastructure
  • Use HIPAA-compliant competitors (AWS Transcribe with BAA, Google Cloud with BAA, Azure with BAA)

How fast is Whisper API processing?

Whisper API processes audio faster than real-time. Processing speed depends on server load and file size. Self-hosted Whisper processing speed varies based on GPU hardware—consumer GPUs process slower than cloud GPU instances.

What's the cost crossover between Whisper API and self-hosting?

Including DevOps overhead, self-hosting becomes cheaper at ~2,400 hours/month ($861 fixed cost vs $864 API cost). But this assumes:

  • Existing DevOps team (not hiring)
  • Single GPU handles workload
  • 99%+ uptime reliability achieved

For most teams, the API is cheaper until 3,000+ hours/month.

Can I get volume discounts for Whisper API?

No. OpenAI's Whisper API is flat-rate $0.006/min regardless of volume. No commitment tiers, enterprise pricing, or bulk discounts are available. For high-volume discounts, consider competitors like Deepgram or Google Cloud with commitment pricing.


Pricing Disclaimer

Information valid as of publication date (January 22, 2026). Pricing data was verified from OpenAI documentation. OpenAI may change pricing, features, or plans at any time. Always verify current rates and terms directly with OpenAI before making purchasing decisions or committing to large-volume usage.
