Transcription File Formats: Audio & Video Guide
BrassTranscripts supports 11 file formats total: 9 audio formats (MP3, M4A, WAV, AAC, FLAC, OGG, Opus, WebM, MPGA) and 2 video formats (MP4, MPEG). Maximum file size is 250MB. Audio/video duration must be between 5 minutes and 2 hours.
Quick Navigation
- Complete Supported Formats List
- Audio Formats Explained
- Video Formats Explained
- File Size and Duration Limits
- Format Conversion Tips
- Common Format Questions
Complete Supported Formats List
BrassTranscripts supports 11 total formats (9 audio + 2 video). Read about why we chose to support these 11 formats in our technical decisions.
Audio Formats (9 Total)
| Format | Extension | Best For | Quality |
|---|---|---|---|
| MP3 | .mp3 | General use, podcasts | Good compression, widely compatible |
| M4A | .m4a | Apple devices, iTunes | Better quality than MP3 at same size |
| WAV | .wav | Professional recordings | Uncompressed, maximum quality |
| AAC | .aac | Streaming, modern devices | Efficient compression, good quality |
| FLAC | .flac | Archival, audiophile | Lossless compression |
| OGG | .ogg | Open-source projects | Free format, good compression |
| Opus | .opus | Voice recordings, VoIP | Optimized for speech |
| WebM | .webm | Web recordings | Browser-native format |
| MPGA | .mpga | MPEG audio streams | Legacy audio format |
Video Formats (2 Total)
| Format | Extension | Best For | Note |
|---|---|---|---|
| MP4 | .mp4 | Universal video standard | Audio track extracted for transcription |
| MPEG | .mpeg | Legacy video files | Audio track extracted for transcription |
Important: For video files, BrassTranscripts extracts the audio track and transcribes it. Video content is not analyzed—only the spoken audio.
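As a minimal sketch of how the two tables above could be used to check a file before uploading, the snippet below classifies a file by its extension. The function name `classify_format` and the sample file names are hypothetical, and the check looks only at the file name, not at the actual contents of the file.

```python
from pathlib import Path

# Extensions taken from the audio and video tables in this guide.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".wav", ".aac", ".flac",
                    ".ogg", ".opus", ".webm", ".mpga"}
VIDEO_EXTENSIONS = {".mp4", ".mpeg"}


def classify_format(filename: str) -> str:
    """Return "audio", "video", or "unsupported" based on the file extension."""
    ext = Path(filename).suffix.lower()  # case-insensitive: ".M4A" -> ".m4a"
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    if ext in VIDEO_EXTENSIONS:
        return "video"
    return "unsupported"


print(classify_format("interview.M4A"))  # audio
print(classify_format("meeting.mp4"))    # video
print(classify_format("voicemail.wma"))  # unsupported
```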
Audio Formats Explained
MP3 (.mp3) - Most Common Format
Why it's popular: MP3 has been the standard audio format for decades. Nearly every device and software supports it.
Transcription quality: Excellent. Even with compression, MP3 retains sufficient audio quality for accurate transcription.
When to use:
- Podcast recordings
- Downloaded audio files
- General audio transcription needs
- Files from older recording equipment
Technical specs: Lossy compression, typically 128-320 kbps bitrate
M4A (.m4a) - Apple's Preferred Format
Why it's popular: Default format for Apple devices (iPhone, iPad, Mac). iTunes and the Voice Memos app both create M4A files.
Transcription quality: Excellent. M4A typically provides better audio quality than MP3 at the same file size.
When to use:
- iPhone/iPad recordings (Voice Memos app)
- iTunes audio files
- Apple ecosystem recordings
- Higher quality audio at smaller sizes
Technical specs: AAC compression in MP4 container, typically 128-256 kbps
WAV (.wav) - Professional Uncompressed Audio
Why it's popular: Standard for professional audio recording. No compression means no quality loss.
Transcription quality: Maximum. Uncompressed audio preserves every detail.
When to use:
- Professional studio recordings
- High-quality interviews
- Archival recordings
- When file size is not a concern
Technical specs: Uncompressed PCM audio, typically 1,411 kbps (CD quality)
Warning: WAV files are large. A 1-hour recording can be 600MB+, exceeding BrassTranscripts' 250MB limit. Consider converting to FLAC or high-bitrate MP3 for long recordings.
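The arithmetic behind that warning can be sketched as follows. This assumes CD-quality PCM (44.1 kHz, 16-bit, stereo), decimal megabytes, and ignores the small WAV header; the function name `pcm_size_mb` is hypothetical.

```python
def pcm_size_mb(duration_s: float, sample_rate_hz: int = 44_100,
                bit_depth: int = 16, channels: int = 2) -> float:
    """Approximate size of uncompressed PCM audio in MB (1 MB = 1,000,000 bytes)."""
    # CD quality: 44,100 samples/s * 16 bits * 2 channels = 1,411,200 bits/s
    bits_per_second = sample_rate_hz * bit_depth * channels
    return bits_per_second * duration_s / 8 / 1_000_000


one_hour = pcm_size_mb(3600)
print(f"1 hour of CD-quality WAV: about {one_hour:.0f} MB")  # about 635 MB
```

Under these assumptions, a one-hour recording is roughly two and a half times the 250MB limit, which is why converting long WAV recordings is usually necessary.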
AAC (.aac) - Modern Efficient Format
Why it's popular: Standard for streaming services and modern devices. Better compression efficiency than MP3.
Transcription quality: Excellent. AAC maintains high quality even at lower bitrates.
When to use:
- Streaming audio downloads
- Modern device recordings
- YouTube audio extraction
- Efficient storage with high quality
Technical specs: Advanced Audio Coding, typically 128-256 kbps
FLAC (.flac) - Lossless Compression
Why it's popular: Audiophile favorite. Compresses audio without losing quality (unlike MP3/AAC).
Transcription quality: Maximum. Identical to WAV quality, with a file size that is typically 40-60% smaller.
When to use:
- High-quality archival recordings
- Professional interviews requiring perfect fidelity
- Long recordings where WAV would exceed size limits
- Music recordings with subtle audio details
Technical specs: Lossless compression, typically 700-900 kbps
OGG (.ogg) - Open-Source Alternative
Why it's popular: Free, patent-free format. Common in open-source software and Linux systems.
Transcription quality: Good to excellent, depending on encoding settings.
When to use:
- Linux system recordings
- Open-source project audio
- Game audio files
- Web-based recordings using open formats
Technical specs: Ogg Vorbis compression, variable bitrate
Opus (.opus) - Speech-Optimized Format
Why it's popular: Designed specifically for voice and speech. Used in VoIP applications and voice chat.
Transcription quality: Excellent for speech. Optimized for clarity over music quality.
When to use:
- VoIP call recordings (Discord, Signal, WhatsApp)
- Voice chat recordings
- Webinar recordings
- Speech-only content
Technical specs: Opus codec, typically 16-64 kbps for speech
WebM (.webm) - Browser-Native Format
Why it's popular: Standard for web-based audio recording. Browsers can record directly to WebM.
Transcription quality: Good. Quality depends on recording settings.
When to use:
- Browser-based recordings
- Web application audio captures
- Screen recordings with audio
- Online meeting recordings
Technical specs: Opus or Vorbis audio in WebM container
MPGA (.mpga) - MPEG Audio Stream
Why it's popular: Legacy format for MPEG audio streams. Less common than MP3 but still in use.
Transcription quality: Good. Similar to MP3 quality.
When to use:
- Legacy audio files
- MPEG stream extractions
- Older recording systems
Technical specs: MPEG-1 or MPEG-2 audio (Layer I, II, or III)
Video Formats Explained
MP4 (.mp4) - Universal Video Standard
Why it's supported: MP4 is the most common video format worldwide. Nearly every modern device and platform supports it.
How transcription works: BrassTranscripts extracts the audio track from MP4 files and transcribes the speech. Video content is not analyzed.
When to use:
- Zoom/Teams/Google Meet recordings
- YouTube video downloads
- Phone video recordings
- Screen recordings with audio
- Interview recordings (video cameras)
- Webinar recordings
Common sources:
- Video conferencing platforms (Zoom, Teams, Meet)
- Smartphone cameras (iPhone, Android)
- Screen recording software (OBS, Camtasia)
- Video editing software exports
- Security camera footage with audio
Technical specs: H.264 or H.265 video + AAC audio, maximum 250MB
MPEG (.mpeg) - Legacy Video Format
Why it's supported: Older video format still used in professional video equipment and legacy systems.
How transcription works: Audio track extracted and transcribed (same as MP4).
When to use:
- Legacy video files
- Professional video equipment recordings
- Older security camera footage
- Broadcast video files
Technical specs: MPEG-2 video + audio, maximum 250MB
File Size and Duration Limits
Maximum File Size: 250MB
Why this limit: Balances upload speed, processing efficiency, and practical recording lengths. All formats process at similar speeds - learn more in our processing time guide.
What 250MB allows (by file size alone; the 2-hour duration limit still applies):
- MP3 (128 kbps): ~4 hours of audio
- MP3 (320 kbps): ~1.7 hours of audio
- M4A (128 kbps): ~4 hours of audio
- WAV (CD quality): ~24 minutes of audio
- FLAC (700-900 kbps): ~35-45 minutes of audio
- MP4 video (1080p): ~15-30 minutes depending on bitrate
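The estimates in the list above follow from dividing the size limit by the bitrate. A small sketch of that calculation, assuming constant bitrate and decimal units (1 MB = 8,000 kilobits); the function name `max_minutes` is hypothetical:

```python
LIMIT_MB = 250  # upload limit stated in this guide


def max_minutes(bitrate_kbps: float, limit_mb: float = LIMIT_MB) -> float:
    """Minutes of audio that fit under limit_mb at a constant bitrate."""
    limit_kilobits = limit_mb * 8 * 1000  # 1 MB = 8,000 kilobits (decimal)
    return limit_kilobits / bitrate_kbps / 60


for label, kbps in [("MP3 128 kbps", 128),
                    ("MP3 320 kbps", 320),
                    ("WAV CD quality", 1411)]:
    print(f"{label}: ~{max_minutes(kbps):.0f} minutes")
```

This prints roughly 260, 104, and 24 minutes, in line with the approximate figures listed above. Real files vary because of variable bitrates and container overhead.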
If your file exceeds 250MB:
- Compress the file: Convert high-quality formats (WAV) to efficient formats (MP3, M4A)
- Reduce bitrate: Re-encode at lower bitrate while maintaining speech clarity
- Split the file: Break long recordings into multiple segments under 250MB each
- Remove video: If transcribing video, extract audio-only (much smaller)
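As a rough illustration of the "remove video" and "reduce bitrate" options, the sketch below builds an FFmpeg command line in Python. The function name `ffmpeg_to_mp3` and the file names are hypothetical. The flags `-vn`, `-c:a`, and `-b:a` are standard FFmpeg options, and the `libmp3lame` encoder assumes an FFmpeg build that includes LAME.

```python
import shlex


def ffmpeg_to_mp3(src: str, dst: str, bitrate_kbps: int = 128) -> list[str]:
    """Build an FFmpeg command that drops any video stream and encodes MP3 audio."""
    return [
        "ffmpeg",
        "-i", src,                   # input file (audio or video)
        "-vn",                       # skip the video stream, keep audio only
        "-c:a", "libmp3lame",        # MP3 encoder (needs a build with LAME)
        "-b:a", f"{bitrate_kbps}k",  # target audio bitrate
        dst,
    ]


cmd = ffmpeg_to_mp3("meeting.mp4", "meeting.mp3")
print(shlex.join(cmd))
# ffmpeg -i meeting.mp4 -vn -c:a libmp3lame -b:a 128k meeting.mp3
# To run it: subprocess.run(cmd, check=True), with FFmpeg installed and on PATH.
```

Building the command as a list rather than a single string avoids shell-quoting problems with file names that contain spaces.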
Duration Limits: 5 Minutes to 2 Hours
Minimum duration: 5 minutes
Why: Ensures meaningful transcription content. Very short clips often lack context and are inefficient to process.
Maximum duration: 2 hours
Why: Practical limit for transcription quality and processing time. Most meetings, interviews, and podcasts fit within 2 hours.
If your recording exceeds 2 hours:
- Split into segments: Break into 2-hour (or shorter) chunks
- Transcribe separately: Upload each segment individually
- Combine transcripts: Merge the text files afterward
Example: 3-hour conference recording → Split into 90-minute segments → Upload twice → Combine transcripts
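One way to plan such a split is to divide the recording into equal segments, which avoids leaving a very short final piece that would fall under the 5-minute minimum. The sketch below assumes durations in minutes; `plan_segments` is a hypothetical helper, not part of any BrassTranscripts tool.

```python
import math


def plan_segments(total_min: float, max_min: float = 120,
                  min_min: float = 5) -> list[tuple[float, float]]:
    """Split a recording into equal segments of at most max_min minutes.

    Returns (start, end) pairs in minutes.
    """
    if total_min < min_min:
        raise ValueError("recording is shorter than the minimum duration")
    count = math.ceil(total_min / max_min)  # fewest segments that fit the cap
    length = total_min / count              # equal length for every segment
    return [(i * length, (i + 1) * length) for i in range(count)]


print(plan_segments(180))  # [(0.0, 90.0), (90.0, 180.0)]
```

A 121-minute recording, for example, becomes two segments of about 60 minutes each instead of one 120-minute file and a 1-minute remainder.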
Format Conversion Tips
When to Convert Formats
Convert WAV to FLAC or MP3:
- WAV files exceeding 250MB
- Long professional recordings
- Archival storage with size constraints
Convert video to audio-only:
- Video files exceeding 250MB
- Only speech matters (no need to upload video)
- Faster upload and processing
Convert exotic formats to MP3/M4A:
- Formats not listed above (WMA, RA, etc.)
- Ensuring maximum compatibility
- Reducing troubleshooting
Free Conversion Tools
Desktop software:
- Audacity (Windows, Mac, Linux) - Free, open-source audio editor and converter
- VLC Media Player (Windows, Mac, Linux) - Convert audio and video formats
- FFmpeg (Command-line tool) - Professional conversion for technical users
Online converters (use with caution for sensitive content):
- CloudConvert - Wide format support
- FreeConvert - Audio and video conversion
- Online-Convert - Batch conversion
Privacy note: For confidential recordings (business meetings, legal interviews, medical content), use desktop software only. Online converters upload your files to third-party servers.
Conversion Best Practices
Maintain quality for transcription:
- Use 128 kbps minimum for MP3/M4A (lower reduces accuracy)
- Keep sample rate at 44.1 kHz or higher (speech clarity)
- Preserve the original channel layout (don't downmix stereo to mono unnecessarily)
Reduce file size efficiently:
- Remove video track: Audio-only files are 90% smaller than video
- Use efficient formats: FLAC instead of WAV, M4A instead of WAV
- Normalize audio levels: Does not reduce file size on its own, but helps quiet speech stay clear after compression
- Trim silence: Remove long silent sections (intro/outro music, dead air)
Avoid these mistakes:
- ❌ Converting already-compressed formats multiple times (MP3 → AAC → OGG degrades quality)
- ❌ Using very low bitrates (<64 kbps) to shrink files (damages transcription accuracy)
- ❌ Converting lossless to lossy then back to lossless (doesn't restore quality)
Common Format Questions
Can I upload files from Zoom/Teams/Google Meet?
Yes, absolutely. All major video conferencing platforms export compatible formats:
- Zoom: MP4 (video) or M4A (audio-only) ✅
- Microsoft Teams: MP4 ✅
- Google Meet: MP4 ✅
- Webex: MP4 ✅
Pro tip: Download audio-only recordings when available (M4A from Zoom). They're much smaller than video files and upload faster.
What about iPhone/Android recordings?
iPhone (Voice Memos app): M4A ✅ Directly supported
Android:
- Most devices: MP3 or M4A ✅ Directly supported
- Samsung Voice Recorder: M4A ✅ Directly supported
- Google Recorder: M4A ✅ Directly supported
Can I transcribe YouTube videos?
Yes, but you need to download the audio first. BrassTranscripts does not download YouTube videos for you.
How to download YouTube audio:
- Use yt-dlp (command-line tool, most reliable)
- Use 4K Video Downloader (desktop app)
- Extract audio as MP3 or M4A
- Upload to BrassTranscripts
Legal note: Only download content you have rights to transcribe (your own videos, educational use with permission, public domain content).
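For the yt-dlp route, a minimal sketch of the command is shown below, built in Python so it can be inspected before running. `ytdlp_audio_command` is a hypothetical helper and the URL is a placeholder. `--extract-audio` and `--audio-format` are existing yt-dlp options, and audio conversion also requires FFmpeg to be installed.

```python
def ytdlp_audio_command(url: str, audio_format: str = "mp3") -> list[str]:
    """Build a yt-dlp command that downloads a video and keeps only the audio."""
    if audio_format not in {"mp3", "m4a"}:
        raise ValueError("this sketch only covers the MP3 and M4A targets")
    return [
        "yt-dlp",
        "--extract-audio",               # keep audio, discard video
        "--audio-format", audio_format,  # convert to MP3 or M4A
        url,
    ]


# Placeholder URL; replace with a video you have the rights to transcribe.
cmd = ytdlp_audio_command("https://www.youtube.com/watch?v=VIDEO_ID")
print(" ".join(cmd))
# To run it: subprocess.run(cmd, check=True), with yt-dlp and FFmpeg installed.
```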
What if my format isn't supported?
Solution: Convert to MP3 or M4A (universally compatible)
Unsupported formats you might encounter:
- WMA (Windows Media Audio) → Convert to MP3
- RA (RealAudio) → Convert to MP3
- AMR (Adaptive Multi-Rate, old phone recordings) → Convert to M4A
- AIFF (Apple Interchange File Format) → Convert to M4A or FLAC
Use conversion tools mentioned earlier (Audacity, VLC, FFmpeg).
Does audio quality affect format choice?
For transcription accuracy: Format matters less than recording quality.
What actually affects accuracy:
- ✅ Clear speech (no mumbling, good enunciation)
- ✅ Low background noise (quiet environment)
- ✅ Good microphone (reduces distortion)
- ✅ Adequate volume levels (not too quiet or clipping)
Format impacts:
- Compressed formats (MP3, AAC, OGG): Minimal impact if bitrate is reasonable (128+ kbps)
- Lossless formats (WAV, FLAC): Maximum fidelity but often overkill for speech
- Very low bitrates (<64 kbps): Can damage transcription accuracy
Recommendation: Use MP3 or M4A at 128-192 kbps for speech. Higher quality formats don't significantly improve transcription but create larger files.
Can I transcribe files from cloud storage?
BrassTranscripts requires direct file upload. You cannot provide cloud storage links (Google Drive, Dropbox, OneDrive).
Workflow:
- Download file from cloud storage to your device
- Verify format is supported (see list above)
- Upload to BrassTranscripts
Why direct upload: Ensures security, privacy, and reliable file access during processing.
What happens to video content in MP4/MPEG files?
Video content is ignored. BrassTranscripts extracts and transcribes only the audio track.
This means:
- Visual information is not captured (slides, screen shares, text overlays)
- Speaker identification is based on voice, not visual appearance
- Gestures, body language, and visual context are not transcribed
For videos with important visual content: Consider adding manual notes or screenshots to supplement the transcript with visual descriptions.
Are there any format-specific pricing differences?
No. All supported formats have the same pricing.
Pricing is based on audio duration, not file format or size:
- 1-15 minutes: $2.50 flat rate
- 16-120 minutes: $6.00 flat rate
Example:
- 20-minute MP3 file: $6.00
- 20-minute WAV file: $6.00
- 20-minute MP4 video: $6.00
(All priced identically despite different file sizes)
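The two tiers can be expressed as a short function. This is only a sketch of the pricing as quoted above: `price_usd` is a hypothetical name, it assumes anything longer than 15 minutes falls into the higher tier, and it does not enforce the 5-minute minimum duration described earlier in this guide.

```python
def price_usd(duration_min: float) -> float:
    """Flat-rate price for a recording, using the two tiers quoted in this guide."""
    if not 1 <= duration_min <= 120:
        raise ValueError("duration outside the 1-120 minute pricing range")
    # Assumption: anything longer than 15 minutes is billed at the higher tier.
    return 2.50 if duration_min <= 15 else 6.00


# File format and size play no role; only the duration is an input.
for name in ("talk.mp3", "talk.wav", "talk.mp4"):
    print(f"20-minute {name}: ${price_usd(20):.2f}")
```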
Frequently Asked Questions
What audio and video file formats does BrassTranscripts support?
BrassTranscripts supports 11 file formats: 9 audio formats (MP3, M4A, WAV, AAC, FLAC, OGG, Opus, WebM, MPGA) and 2 video formats (MP4, MPEG). The maximum file size is 250MB and recordings must be between 5 minutes and 2 hours in duration.
Can I upload a Zoom, Teams, or Google Meet recording directly?
Yes. All major video conferencing platforms export compatible formats. Zoom saves recordings as MP4 (video) or M4A (audio-only), Microsoft Teams as MP4, Google Meet as MP4, and Webex as MP4. For faster uploads, download the audio-only version when the platform offers it, since audio files are significantly smaller than video files.
What should I do if my file exceeds the 250MB size limit?
Convert the file to a more efficient format using Audacity, VLC, or FFmpeg. Converting a WAV file to MP3 or M4A at 128 kbps typically reduces file size by 80–90 percent while preserving sufficient audio quality for transcription. Alternatively, split a long recording into segments of 90 minutes or less and upload each one separately.
Does the audio format affect transcription accuracy?
Format matters far less than recording quality. A clear MP3 recorded at 128 kbps in a quiet room will produce more accurate results than a lossless WAV recorded with a poor microphone and background noise. The main format-related risk is using very low bitrates below 64 kbps, which can damage speech clarity enough to reduce accuracy.
What if my audio format is not in the supported list?
Convert unsupported formats to MP3 or M4A before uploading. Common unsupported formats include WMA (Windows Media Audio), AMR (older phone recordings), RA (RealAudio), and AIFF (Apple Interchange File Format). Audacity and VLC are free desktop tools that handle these conversions without uploading files to a third-party server, which is important for confidential recordings.
Are there any format-specific pricing differences at BrassTranscripts?
No. All 11 supported formats are priced identically based on audio duration, not file size or format type. BrassTranscripts charges $2.50 for recordings 1–15 minutes and $6.00 for recordings 16–120 minutes, regardless of whether the file is a small Opus voice recording or a large MP4 video file.
Summary: Choose the Right Format
Best all-around formats:
- MP3: Maximum compatibility, good quality, reasonable size
- M4A: Apple ecosystem, better quality than MP3 at same size
- MP4: Video recordings (audio extracted automatically)
For maximum quality:
- WAV: Uncompressed, but watch 250MB limit
- FLAC: Lossless compression, best balance of quality and size
For specific use cases:
- Podcasts: MP3 or M4A
- Interviews: MP3, M4A, or FLAC
- Meetings: MP4 video recordings or M4A audio-only
- Professional recordings: WAV or FLAC
- VoIP/Voice chat: Opus or MP3
Format checklist:
- ✅ Is your format in the supported list? (11 formats total)
- ✅ Is your file under 250MB?
- ✅ Is your recording between 5 minutes and 2 hours?
- ✅ If not, can you convert or split the file?
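The checklist above can be sketched as a single pre-upload check. The function `preflight` and its messages are hypothetical, the size limit is assumed to mean decimal megabytes, and the caller is expected to supply the file size and duration (for example from the file system and a media inspection tool).

```python
from pathlib import Path

SUPPORTED = {".mp3", ".m4a", ".wav", ".aac", ".flac", ".ogg",
             ".opus", ".webm", ".mpga", ".mp4", ".mpeg"}
MAX_BYTES = 250 * 1000 * 1000  # assumption: 250MB means decimal megabytes
MIN_SECONDS, MAX_SECONDS = 5 * 60, 2 * 60 * 60


def preflight(filename: str, size_bytes: int, duration_s: float) -> list[str]:
    """Return a list of problems; an empty list means the file passes."""
    problems = []
    if Path(filename).suffix.lower() not in SUPPORTED:
        problems.append("unsupported format: convert to MP3 or M4A")
    if size_bytes > MAX_BYTES:
        problems.append("over 250MB: compress, extract audio, or split")
    if duration_s < MIN_SECONDS:
        problems.append("shorter than 5 minutes")
    elif duration_s > MAX_SECONDS:
        problems.append("longer than 2 hours: split into segments")
    return problems


print(preflight("lecture.wav", 635_000_000, 3600))
# ['over 250MB: compress, extract audio, or split']
```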
Ready to transcribe? Upload your audio or video file at BrassTranscripts.com and get your transcript in minutes.