What File Formats Can Be Transcribed? [Complete Audio & Video Format Guide]
BrassTranscripts supports 11 file formats total: 9 audio formats (MP3, M4A, WAV, AAC, FLAC, OGG, Opus, WebM, MPGA) and 2 video formats (MP4, MPEG). Maximum file size is 250MB. Audio/video duration must be between 5 minutes and 2 hours.
Quick Navigation
- Complete Supported Formats List
- Audio Formats Explained
- Video Formats Explained
- File Size and Duration Limits
- Format Conversion Tips
- Common Format Questions
Complete Supported Formats List
BrassTranscripts supports 11 total formats (9 audio + 2 video). Read about why we chose to support these 11 formats in our technical decisions.
Audio Formats (9 Total)
| Format | Extension | Best For | Quality |
|---|---|---|---|
| MP3 | .mp3 | General use, podcasts | Good compression, widely compatible |
| M4A | .m4a | Apple devices, iTunes | Better quality than MP3 at same size |
| WAV | .wav | Professional recordings | Uncompressed, maximum quality |
| AAC | .aac | Streaming, modern devices | Efficient compression, good quality |
| FLAC | .flac | Archival, audiophile | Lossless compression |
| OGG | .ogg | Open-source projects | Free format, good compression |
| Opus | .opus | Voice recordings, VoIP | Optimized for speech |
| WebM | .webm | Web recordings | Browser-native format |
| MPGA | .mpga | MPEG audio streams | Legacy audio format |
Video Formats (2 Total)
| Format | Extension | Best For | Note |
|---|---|---|---|
| MP4 | .mp4 | Universal video standard | Audio track extracted for transcription |
| MPEG | .mpeg | Legacy video files | Audio track extracted for transcription |
Important: For video files, BrassTranscripts extracts the audio track and transcribes it. Video content is not analyzed; only the spoken audio is transcribed.
Audio Formats Explained
MP3 (.mp3) - Most Common Format
Why it's popular: MP3 has been the standard audio format for decades. Nearly every device and software supports it.
Transcription quality: Excellent. Even with compression, MP3 retains sufficient audio quality for accurate transcription.
When to use:
- Podcast recordings
- Downloaded audio files
- General audio transcription needs
- Files from older recording equipment
Technical specs: Lossy compression, typically 128-320 kbps bitrate
M4A (.m4a) - Apple's Preferred Format
Why it's popular: Default format for Apple devices (iPhone, iPad, Mac). iTunes and Voice Memos app both create M4A files.
Transcription quality: Excellent. M4A typically provides better audio quality than MP3 at the same file size.
When to use:
- iPhone/iPad recordings (Voice Memos app)
- iTunes audio files
- Apple ecosystem recordings
- Higher quality audio at smaller sizes
Technical specs: AAC compression in MP4 container, typically 128-256 kbps
WAV (.wav) - Professional Uncompressed Audio
Why it's popular: Standard for professional audio recording. No compression means no quality loss.
Transcription quality: Maximum. Uncompressed audio preserves every detail.
When to use:
- Professional studio recordings
- High-quality interviews
- Archival recordings
- When file size is not a concern
Technical specs: Uncompressed PCM audio, typically 1,411 kbps (CD quality)
Warning: WAV files are large. A 1-hour recording can be 600MB+, exceeding BrassTranscripts' 250MB limit. Consider converting to FLAC or high-bitrate MP3 for long recordings.
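The size warning follows from simple arithmetic: file size is roughly bitrate multiplied by duration. The sketch below is illustrative only; the function name `estimated_size_mb` is hypothetical, it assumes 1 MB = 1,000,000 bytes, and it ignores container and header overhead.

```python
def estimated_size_mb(bitrate_kbps: float, minutes: float) -> float:
    """Approximate file size in MB (assumes 1 MB = 1,000,000 bytes, no header overhead)."""
    total_bits = bitrate_kbps * 1000 * minutes * 60
    return total_bits / 8 / 1_000_000


# CD-quality WAV: 44,100 samples/s x 16 bits x 2 channels = 1,411.2 kbps
cd_wav_kbps = 44_100 * 16 * 2 / 1000

# One hour of CD-quality WAV comes out at roughly 635 MB, well over the 250MB limit.
print(f"{estimated_size_mb(cd_wav_kbps, 60):.0f} MB")
```

The same function can be used to check whether a lower-bitrate version of a long recording would fit before converting it.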
AAC (.aac) - Modern Efficient Format
Why it's popular: Standard for streaming services and modern devices. Better compression efficiency than MP3.
Transcription quality: Excellent. AAC maintains high quality even at lower bitrates.
When to use:
- Streaming audio downloads
- Modern device recordings
- YouTube audio extraction
- Efficient storage with high quality
Technical specs: Advanced Audio Coding, typically 128-256 kbps
FLAC (.flac) - Lossless Compression
Why it's popular: Audiophile favorite. Compresses audio without losing quality (unlike MP3/AAC).
Transcription quality: Maximum. Identical to WAV quality but 40-60% smaller file size.
When to use:
- High-quality archival recordings
- Professional interviews requiring perfect fidelity
- Long recordings where WAV would exceed size limits
- Music recordings with subtle audio details
Technical specs: Lossless compression, typically 700-900 kbps
OGG (.ogg) - Open-Source Alternative
Why it's popular: Free, patent-free format. Common in open-source software and Linux systems.
Transcription quality: Good to excellent, depending on encoding settings.
When to use:
- Linux system recordings
- Open-source project audio
- Game audio files
- Web-based recordings using open formats
Technical specs: Ogg Vorbis compression, variable bitrate
Opus (.opus) - Speech-Optimized Format
Why it's popular: Designed specifically for voice and speech. Used in VoIP applications and voice chat.
Transcription quality: Excellent for speech. Optimized for clarity over music quality.
When to use:
- VoIP call recordings (Discord, Signal, WhatsApp)
- Voice chat recordings
- Webinar recordings
- Speech-only content
Technical specs: Opus codec, typically 16-64 kbps for speech
WebM (.webm) - Browser-Native Format
Why it's popular: Standard for web-based audio recording. Browsers can record directly to WebM.
Transcription quality: Good. Quality depends on recording settings.
When to use:
- Browser-based recordings
- Web application audio captures
- Screen recordings with audio
- Online meeting recordings
Technical specs: Opus or Vorbis audio in WebM container
MPGA (.mpga) - MPEG Audio Stream
Why it's popular: Legacy format for MPEG audio streams. Less common than MP3 but still in use.
Transcription quality: Good. Similar to MP3 quality.
When to use:
- Legacy audio files
- MPEG stream extractions
- Older recording systems
Technical specs: MPEG-1 or MPEG-2 audio layer
Video Formats Explained
MP4 (.mp4) - Universal Video Standard
Why it's supported: MP4 is the most common video format worldwide. Every modern device and platform uses it.
How transcription works: BrassTranscripts extracts the audio track from MP4 files and transcribes the speech. Video content is not analyzed.
When to use:
- Zoom/Teams/Google Meet recordings
- YouTube video downloads
- Phone video recordings
- Screen recordings with audio
- Interview recordings (video cameras)
- Webinar recordings
Common sources:
- Video conferencing platforms (Zoom, Teams, Meet)
- Smartphone cameras (iPhone, Android)
- Screen recording software (OBS, Camtasia)
- Video editing software exports
- Security camera footage with audio
Technical specs: H.264 or H.265 video + AAC audio, maximum 250MB
MPEG (.mpeg) - Legacy Video Format
Why it's supported: Older video format still used in professional video equipment and legacy systems.
How transcription works: Audio track extracted and transcribed (same as MP4).
When to use:
- Legacy video files
- Professional video equipment recordings
- Older security camera footage
- Broadcast video files
Technical specs: MPEG-2 video + audio, maximum 250MB
File Size and Duration Limits
Maximum File Size: 250MB
Why this limit: Balances upload speed, processing efficiency, and practical recording lengths. All formats process at similar speeds - learn more in our processing time guide.
What 250MB allows (the 2-hour duration limit still applies):
- MP3 (128 kbps): ~4 hours of audio
- MP3 (320 kbps): ~1.7 hours of audio
- M4A (128 kbps): ~4 hours of audio
- WAV (CD quality): ~25 minutes of audio
- FLAC: ~35-40 minutes of audio
- MP4 video (1080p): ~15-30 minutes depending on bitrate
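Estimates like these can be reproduced by dividing the size limit by the bitrate. This is a rough sketch, not BrassTranscripts' own calculation: `max_minutes` is a hypothetical helper, it assumes 1 MB = 1,000,000 bytes and a constant bitrate, and real files will differ somewhat because of container overhead and variable-bitrate encoding.

```python
def max_minutes(bitrate_kbps: float, limit_mb: float = 250) -> float:
    """Minutes of audio that fit in limit_mb at a constant bitrate (decimal MB assumed)."""
    seconds = limit_mb * 1_000_000 * 8 / (bitrate_kbps * 1000)
    return seconds / 60


# Bitrates taken from the format descriptions in this guide.
for label, kbps in [("MP3 128 kbps", 128), ("MP3 320 kbps", 320), ("WAV CD quality", 1411.2)]:
    print(f"{label}: about {max_minutes(kbps):.0f} minutes")
```

Whatever the bitrate allows, the 2-hour duration limit described below still caps the length of a single upload.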
If your file exceeds 250MB:
- Compress the file: Convert high-quality formats (WAV) to efficient formats (MP3, M4A)
- Reduce bitrate: Re-encode at lower bitrate while maintaining speech clarity
- Split the file: Break long recordings into multiple segments under 250MB each
- Remove video: If transcribing video, extract audio-only (much smaller)
Duration Limits: 5 Minutes to 2 Hours
Minimum duration: 5 minutes
Why: Ensures meaningful transcription content. Very short clips often lack context and are inefficient to process.
Maximum duration: 2 hours
Why: Practical limit for transcription quality and processing time. Most meetings, interviews, and podcasts fit within 2 hours.
If your recording exceeds 2 hours:
- Split into segments: Break into 2-hour (or shorter) chunks
- Transcribe separately: Upload each segment individually
- Combine transcripts: Merge the text files afterward
Example: 3-hour conference recording → Split into 90-minute segments → Upload twice → Combine transcripts
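The split in the example can be planned ahead of time. The sketch below is one possible approach, with a hypothetical function name: it divides the recording into the fewest equal-length segments that each stay within the 2-hour maximum. Splitting evenly (rather than cutting at exactly 2 hours) also avoids leaving a short final segment that could fall under the 5-minute minimum.

```python
import math


def plan_segments(total_minutes: float, max_minutes: float = 120) -> list[tuple[float, float]]:
    """Return (start, end) minute marks for equal-length segments no longer than max_minutes."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    count = math.ceil(total_minutes / max_minutes)   # fewest segments that fit the limit
    length = total_minutes / count                   # equal length avoids a tiny last segment
    return [(i * length, (i + 1) * length) for i in range(count)]


# A 3-hour (180-minute) recording becomes two 90-minute segments.
print(plan_segments(180))
```

The start and end marks can then be used as cut points in an audio editor such as Audacity.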
Format Conversion Tips
When to Convert Formats
Convert WAV to FLAC or MP3:
- WAV files exceeding 250MB
- Long professional recordings
- Archival storage with size constraints
Convert video to audio-only:
- Video files exceeding 250MB
- Only speech matters (no need to upload video)
- Faster upload and processing
Convert exotic formats to MP3/M4A:
- Formats not listed above (WMA, RA, etc.)
- Ensuring maximum compatibility
- Reducing troubleshooting
Free Conversion Tools
Desktop software:
- Audacity (Windows, Mac, Linux) - Free, open-source audio editor and converter
- VLC Media Player (Windows, Mac, Linux) - Convert audio and video formats
- FFmpeg (Command-line tool) - Professional conversion for technical users
Online converters (use with caution for sensitive content):
- CloudConvert - Wide format support
- FreeConvert - Audio and video conversion
- Online-Convert - Batch conversion
Privacy note: For confidential recordings (business meetings, legal interviews, medical content), use desktop software only. Online converters upload your files to third-party servers.
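For readers using FFmpeg locally, the two most common conversions in this guide can be sketched as follows. The helper names and file names are hypothetical; the FFmpeg options used are `-vn` (drop the video stream), `-b:a` (audio bitrate), and `-c:a` (audio codec). The functions only build the argument lists, so nothing runs until you pass one to `subprocess.run`, which assumes FFmpeg is installed and on your PATH.

```python
import subprocess  # only needed if you actually run the commands


def extract_audio_cmd(src: str, dst: str, bitrate: str = "128k") -> list[str]:
    """Build an ffmpeg command that drops the video stream and encodes audio only."""
    return ["ffmpeg", "-i", src, "-vn", "-b:a", bitrate, dst]


def wav_to_flac_cmd(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that re-encodes a WAV file as lossless FLAC."""
    return ["ffmpeg", "-i", src, "-c:a", "flac", dst]


# Example file names are placeholders.
print(" ".join(extract_audio_cmd("meeting.mp4", "meeting.m4a")))
print(" ".join(wav_to_flac_cmd("interview.wav", "interview.flac")))

# To run a conversion locally (requires ffmpeg):
# subprocess.run(extract_audio_cmd("meeting.mp4", "meeting.m4a"), check=True)
```

Because the conversion happens on your own machine, this route is consistent with the privacy note above.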
Conversion Best Practices
Maintain quality for transcription:
- Use 128 kbps minimum for MP3/M4A (lower reduces accuracy)
- Keep sample rate at 44.1 kHz or higher (speech clarity)
- Preserve mono or stereo (don't force mono from stereo unnecessarily)
Reduce file size efficiently:
- Remove video track: Audio-only files are often around 90% smaller than the original video
- Use efficient formats: FLAC instead of WAV, M4A instead of WAV
- Normalize audio levels: Keeps speech at a consistent volume (this helps accuracy more than it reduces file size)
- Trim silence: Remove long silent sections (intro/outro music, dead air)
Avoid these mistakes:
- ❌ Converting already-compressed formats multiple times (MP3 → AAC → OGG degrades quality)
- ❌ Using very low bitrates (<64 kbps) to shrink files (damages transcription accuracy)
- ❌ Converting lossless to lossy then back to lossless (doesn't restore quality)
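The thresholds in this section can be turned into a quick pre-conversion check. This is a minimal sketch with a hypothetical function name; it encodes only the MP3/M4A guidance given above (128 kbps minimum, 64 kbps danger zone, 44.1 kHz sample rate) and would not apply to Opus, which is designed for lower speech bitrates.

```python
def encoding_warnings(bitrate_kbps: int, sample_rate_hz: int) -> list[str]:
    """Return warnings for MP3/M4A settings that fall below this guide's recommendations."""
    warnings = []
    if bitrate_kbps < 64:
        warnings.append("bitrate below 64 kbps: likely to damage transcription accuracy")
    elif bitrate_kbps < 128:
        warnings.append("bitrate below the recommended 128 kbps minimum")
    if sample_rate_hz < 44_100:
        warnings.append("sample rate below the recommended 44.1 kHz")
    return warnings


print(encoding_warnings(128, 44_100))  # no warnings
print(encoding_warnings(48, 22_050))   # two warnings
```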
Common Format Questions
Can I upload files from Zoom/Teams/Google Meet?
Yes, absolutely. All major video conferencing platforms export compatible formats:
- Zoom: MP4 (video) or M4A (audio-only) ✅
- Microsoft Teams: MP4 ✅
- Google Meet: MP4 ✅
- Webex: MP4 ✅
Pro tip: Download audio-only recordings when available (M4A from Zoom). They're much smaller than video files and upload faster.
What about iPhone/Android recordings?
iPhone (Voice Memos app): M4A ✅ Directly supported
Android:
- Most devices: MP3 or M4A ✅ Directly supported
- Samsung Voice Recorder: M4A ✅ Directly supported
- Google Recorder: M4A ✅ Directly supported
Can I transcribe YouTube videos?
Yes, but you need to download the audio first. BrassTranscripts does not download YouTube videos for you.
How to download YouTube audio:
- Use yt-dlp (command-line tool, most reliable) or 4K Video Downloader (desktop app)
- Extract audio as MP3 or M4A
- Upload to BrassTranscripts
Legal note: Only download content you have rights to transcribe (your own videos, educational use with permission, public domain content).
What if my format isn't supported?
Solution: Convert to MP3 or M4A (universally compatible)
Unsupported formats you might encounter:
- WMA (Windows Media Audio) → Convert to MP3
- RA (RealAudio) → Convert to MP3
- AMR (Adaptive Multi-Rate, old phone recordings) → Convert to M4A
- AIFF (Audio Interchange File Format) → Convert to M4A or FLAC
Use conversion tools mentioned earlier (Audacity, VLC, FFmpeg).
Does audio quality affect format choice?
For transcription accuracy: Format matters less than recording quality.
What actually affects accuracy:
- ✅ Clear speech (no mumbling, good enunciation)
- ✅ Low background noise (quiet environment)
- ✅ Good microphone (reduces distortion)
- ✅ Adequate volume levels (not too quiet or clipping)
Format impacts:
- Compressed formats (MP3, AAC, OGG): Minimal impact if bitrate is reasonable (128+ kbps)
- Lossless formats (WAV, FLAC): Maximum fidelity but often overkill for speech
- Very low bitrates (<64 kbps): Can damage transcription accuracy
Recommendation: Use MP3 or M4A at 128-192 kbps for speech. Higher quality formats don't significantly improve transcription but create larger files.
Can I transcribe files from cloud storage?
BrassTranscripts requires direct file upload. You cannot provide cloud storage links (Google Drive, Dropbox, OneDrive).
Workflow:
- Download file from cloud storage to your device
- Verify format is supported (see list above)
- Upload to BrassTranscripts
Why direct upload: Ensures security, privacy, and reliable file access during processing.
What happens to video content in MP4/MPEG files?
Video content is ignored. BrassTranscripts extracts and transcribes only the audio track.
This means:
- Visual information is not captured (slides, screen shares, text overlays)
- Speaker identification is based on voice, not visual appearance
- Gestures, body language, and visual context are not transcribed
For videos with important visual content: Consider taking screenshots or manual notes to supplement the transcript with visual descriptions.
Are there any format-specific pricing differences?
No. All supported formats have the same pricing.
Pricing is based on audio duration, not file format or size:
- 1-15 minutes: $2.25 flat rate
- 16+ minutes: $0.15 per minute
Example:
- 20-minute MP3 file: $3.00
- 20-minute WAV file: $3.00
- 20-minute MP4 video: $3.00
(All priced identically despite different file sizes)
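The pricing rule above can be written as a short function. This is a sketch of the stated tiers only, not BrassTranscripts' billing code: the name `price_usd` is hypothetical, and it assumes duration is billed in whole minutes, since this guide does not say how partial minutes are rounded. Note that the flat rate equals 15 minutes at the per-minute rate, so the price does not jump at the tier boundary.

```python
from decimal import Decimal

FLAT_RATE = Decimal("2.25")    # 1-15 minutes
PER_MINUTE = Decimal("0.15")   # 16+ minutes


def price_usd(minutes: int) -> Decimal:
    """Price for a recording, assuming whole-minute billing (an assumption, not documented)."""
    if minutes < 1:
        raise ValueError("minutes must be at least 1")
    if minutes <= 15:
        return FLAT_RATE
    return PER_MINUTE * minutes


print(price_usd(20))  # 3.00, matching the 20-minute examples above
```

`Decimal` is used instead of floating point so that currency amounts stay exact.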
Summary: Choose the Right Format
Best all-around formats:
- MP3: Maximum compatibility, good quality, reasonable size
- M4A: Apple ecosystem, better quality than MP3 at same size
- MP4: Video recordings (audio extracted automatically)
For maximum quality:
- WAV: Uncompressed, but watch 250MB limit
- FLAC: Lossless compression, best balance of quality and size
For specific use cases:
- Podcasts: MP3 or M4A
- Interviews: MP3, M4A, or FLAC
- Meetings: MP4 video recordings or M4A audio-only
- Professional recordings: WAV or FLAC
- VoIP/Voice chat: Opus or MP3
Format checklist:
- ✅ Is your format in the supported list? (11 formats total)
- ✅ Is your file under 250MB?
- ✅ Is your recording between 5 minutes and 2 hours?
- ✅ If not, can you convert or split the file?
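The checklist can also be automated before uploading. The sketch below is illustrative, with a hypothetical function name: it takes the file size and duration as inputs (obtaining the duration requires a separate tool), assumes 1 MB = 1,000,000 bytes, and checks only the limits stated in this guide.

```python
from pathlib import Path

# Extensions and limits as stated in this guide.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".m4a", ".wav", ".aac", ".flac", ".ogg", ".opus", ".webm", ".mpga",  # audio
    ".mp4", ".mpeg",                                                              # video
}
MAX_BYTES = 250 * 1_000_000    # assumption: 1 MB = 1,000,000 bytes
MIN_SECONDS = 5 * 60
MAX_SECONDS = 2 * 60 * 60


def upload_problems(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return a list of checklist failures; an empty list means the file looks ready."""
    problems = []
    if Path(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported format: convert to MP3 or M4A")
    if size_bytes > MAX_BYTES:
        problems.append("over 250MB: compress, remove the video track, or split")
    if duration_seconds < MIN_SECONDS:
        problems.append("shorter than 5 minutes")
    elif duration_seconds > MAX_SECONDS:
        problems.append("longer than 2 hours: split into segments")
    return problems


print(upload_problems("talk.mp3", 50_000_000, 30 * 60))       # []
print(upload_problems("talk.wma", 300_000_000, 3 * 60 * 60))  # three problems
```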
Ready to transcribe? Upload your audio or video file at BrassTranscripts.com and get your transcript in minutes.