Transcription Research Index
BrassTranscripts maintains this curated index of primary sources — peer-reviewed papers, active open-source tools, and documented benchmarks — so builders and researchers can verify the evidence behind AI transcription claims. Every entry includes a capsule annotation explaining what BrassTranscripts draws from it and what builders should do with it.
About This Index
BrassTranscripts treats this index as the evidence layer underneath its product documentation. When a feature description says "speaker diarization performance depends on overlap," this is the research that backs that claim. The index covers five categories corresponding to the main quality dimensions of AI transcription: accuracy, speaker identification, language coverage, audio robustness, and benchmark methodology.
Every entry meets three inclusion criteria: the source must be a primary document (a paper, benchmark, or documented tool, not a summary); it must be current (tools under active maintenance, papers peer-reviewed or posted as preprints); and it must have a concrete applied implication for builders or users of AI transcription services.
Transcription Accuracy
The AI speech recognition architecture, the inference engine, the Open ASR Leaderboard, and the Artificial Analysis benchmark — the core evidence base for AI transcription quality in real-world conditions.
Speaker Diarization
The neural diarization engine BrassTranscripts uses, overlap-aware diarization research, the ETH Zurich multi-model benchmark, and the in-the-wild datasets that reveal what speaker identification actually costs.
Multilingual Speech
FLEURS, Common Voice v20, Earnings-22, peer-reviewed accent research, and two underrepresented-population benchmarks (Indian English, African English) that expose real-world coverage gaps.
Audio Quality
Reverberation benchmarks, SNR thresholds, the CHiME-7 far-field challenge, and a practical WER measurement library: the evidence base for explaining which audio problems hurt transcripts.
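For readers new to the metric, word error rate (WER) is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. The sketch below is a minimal illustration in Python, not the library indexed in this category. The function name `word_error_rate` is hypothetical, and the normalization (lowercasing and whitespace splitting) is a simplifying assumption; production tools usually apply more careful text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Illustrative sketch only: normalization here is just lowercasing and
    whitespace splitting, which is an assumption, not a standard.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference
    # words processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the transcript contains many inserted words, which is worth keeping in mind when comparing scores on noisy audio.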
ASR Benchmarks
The Open ASR Leaderboard long-form track, Artificial Analysis STT comparison, MLPerf Inference v5.1, and NIST SCTK — the methodological scaffolding behind every accuracy claim.
Curation Principles
Primary sources only
Papers, benchmarks, and documented tools — not summaries or second-hand descriptions. Each entry links to the authoritative source.
Applied annotations
Every entry includes a capsule explaining what builders should do with the finding — not a restatement of the abstract.
Quarterly refresh
Tool star counts, last-commit dates, and report publication dates are reviewed quarterly. Stale entries are updated or removed.
See the research applied
BrassTranscripts puts these findings into practice — professional AI transcription with speaker identification, 99+ languages, and real-world audio robustness.
Start Transcribing →