A comprehensive benchmark suite for evaluating and comparing different pitch detection algorithms across multiple datasets and metrics.
SwiftF0 achieves the highest average harmonic-mean accuracy (81.7%) across seven datasets while also delivering near real-time performance (≈42× faster than the TorchCREPE baseline on CPU). pYIN follows with the second-highest average accuracy (71.9%). TorchCREPE ranks third (71.0%) and remains the slowest algorithm, taking ≈5.5 s to process 5 s of audio on CPU. Praat delivers an excellent speed–accuracy balance: it processes 5 s of audio in just 7 ms on CPU (≈809× faster than TorchCREPE) while maintaining a strong overall accuracy of 66.3%. For a detailed breakdown of results, see Benchmark Results.
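The timings quoted above can be related through a real-time factor (processing time divided by audio duration). The sketch below works through that arithmetic using only the rounded figures from this summary. The helper name `real_time_factor` is an illustrative choice and not part of the benchmark code, and the SwiftF0 time is an estimate derived from the reported ≈42× speedup rather than a measured value.

```python
AUDIO_SECONDS = 5.0  # clip length used in the timings quoted above

# Rounded CPU processing times taken from the summary paragraph (seconds).
cpu_seconds = {
    "TorchCREPE": 5.5,
    "Praat": 0.007,
}


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time per second of audio; values below 1 are faster than real time."""
    return processing_seconds / audio_seconds


rtf = {name: real_time_factor(t, AUDIO_SECONDS) for name, t in cpu_seconds.items()}

# Speedup of Praat over TorchCREPE implied by the rounded timings.
praat_speedup = cpu_seconds["TorchCREPE"] / cpu_seconds["Praat"]

# Assumption: SwiftF0's time is not quoted directly, so it is estimated
# from the reported ~42x speedup over TorchCREPE.
swiftf0_seconds_estimate = cpu_seconds["TorchCREPE"] / 42

for name, value in rtf.items():
    print(f"{name}: real-time factor {value:.4f}")
print(f"Praat speedup from rounded timings: {praat_speedup:.0f}x")
print(f"Estimated SwiftF0 time for 5 s of audio: {swiftf0_seconds_estimate * 1000:.0f} ms")
```

Because the inputs are rounded, the ratio computed here lands near the reported ≈809× but does not match it exactly.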
Algorithm | NSynth | PTDB | SpeechSynth | MIR‑1K | MDB‑STEM‑Synth | Vocadito | Bach10‑mf0‑synth | Average |
---|---|---|---|---|---|---|---|---|
BasicPitch | 11.9% | 12.8% | 55.9% | 25.7% | 8.1% | 13.1% | 19.4% | 21.0% |
pYIN | 17.8% | 72.3% | 55.8% | 89.4% | 83.6% | 89.8% | 94.4% | 71.9% |
Praat | 22.5% | 80.4% | 77.0% | 74.1% | 59.1% | 82.2% | 69.1% | 66.3% |
PENN | 2.0% | 82.5% | 77.0% | 80.4% | 61.4% | 57.2% | 45.4% | 58.0% |
RAPT | 13.2% | 70.7% | 67.3% | 76.5% | 70.3% | 78.0% | 78.8% | 65.0% |
SWIPE | 13.4% | 50.8% | 66.8% | 73.6% | 58.6% | 72.7% | 74.8% | 58.7% |
TorchCREPE | 73.4% | 66.0% | 82.4% | 71.4% | 49.6% | 64.2% | 90.3% | 71.0% |
YAAPT | 2.3% | 67.9% | 78.7% | 70.0% | 24.9% | 86.0% | 31.2% | 51.6% |
SwiftF0 | 33.6% | 87.0% | 88.7% | 93.3% | 82.6% | 92.1% | 94.6% | 81.7% |
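As a cross-check, the Average column can be recomputed from the per-dataset scores. The following sketch copies the values from the table, takes the unweighted mean across the seven datasets, and lists the best algorithm per dataset. The variable names are illustrative, and it assumes the average is a plain arithmetic mean of the seven columns, which is consistent with the reported figures.

```python
from statistics import mean

DATASETS = [
    "NSynth", "PTDB", "SpeechSynth", "MIR-1K",
    "MDB-stem-synth", "Vocadito", "Bach10-mf0-synth",
]

# Per-dataset scores (%) copied from the table above, in DATASETS order.
scores = {
    "BasicPitch": [11.9, 12.8, 55.9, 25.7, 8.1, 13.1, 19.4],
    "pYIN":       [17.8, 72.3, 55.8, 89.4, 83.6, 89.8, 94.4],
    "Praat":      [22.5, 80.4, 77.0, 74.1, 59.1, 82.2, 69.1],
    "PENN":       [2.0, 82.5, 77.0, 80.4, 61.4, 57.2, 45.4],
    "RAPT":       [13.2, 70.7, 67.3, 76.5, 70.3, 78.0, 78.8],
    "SWIPE":      [13.4, 50.8, 66.8, 73.6, 58.6, 72.7, 74.8],
    "TorchCREPE": [73.4, 66.0, 82.4, 71.4, 49.6, 64.2, 90.3],
    "YAAPT":      [2.3, 67.9, 78.7, 70.0, 24.9, 86.0, 31.2],
    "SwiftF0":    [33.6, 87.0, 88.7, 93.3, 82.6, 92.1, 94.6],
}

# Unweighted mean over the seven datasets, rounded like the table.
averages = {name: round(mean(values), 1) for name, values in scores.items()}

# Algorithms ordered from highest to lowest average.
ranking = sorted(averages, key=averages.get, reverse=True)

# Highest-scoring algorithm for each individual dataset.
best_per_dataset = {
    dataset: max(scores, key=lambda name: scores[name][i])
    for i, dataset in enumerate(DATASETS)
}

for name in ranking:
    print(f"{name:<11} {averages[name]:5.1f}%")
for dataset, name in best_per_dataset.items():
    print(f"{dataset}: {name}")
```

The per-dataset view shows what the average hides: TorchCREPE leads on NSynth and pYIN on MDB-stem-synth, while SwiftF0 leads on the remaining five datasets.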
Install dependencies:
pip install -r requirements.txt
Visualize algorithm comparisons:
python visualize_algorithms.py audio_file.wav
Run speed benchmark:
python speed_benchmark.py
Run pitch detection benchmark:
python pitch_benchmark.py --dataset DATASET_NAME --data-dir DATA_PATH
[Experimental] Run music transcription benchmark:
python note_benchmark.py --dataset DATASET_NAME --data-dir DATA_PATH
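To run the pitch detection benchmark over several datasets in one go, the command above can be wrapped in a small driver script. This is a minimal sketch, not part of the repository: the dataset identifiers and directories are hypothetical placeholders, and only the `--dataset` and `--data-dir` flags are taken from the command shown above. It prints the commands by default and executes them only when `DRY_RUN` is set to `False`.

```python
import shlex
import subprocess
import sys
from pathlib import Path

# Hypothetical placeholders: replace with the dataset names that
# pitch_benchmark.py accepts and the paths where the data actually lives.
DATASETS = {
    "DATASET_A": Path("data/dataset_a"),
    "DATASET_B": Path("data/dataset_b"),
}

DRY_RUN = True  # set to False to actually launch the benchmark runs


def build_command(dataset: str, data_dir: Path) -> list:
    """Build the argument list for a single pitch_benchmark.py run."""
    return [
        sys.executable, "pitch_benchmark.py",
        "--dataset", dataset,
        "--data-dir", str(data_dir),
    ]


commands = [build_command(name, path) for name, path in DATASETS.items()]

for command in commands:
    print(shlex.join(command))
    if not DRY_RUN:
        # check=True stops the loop if one benchmark run fails.
        subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a data directory contains spaces.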
- Comprehensive evaluation across various datasets:
  - PTDB
  - NSynth
  - MDB-stem-synth
  - MIR-1K
  - Vocadito
  - Bach10-mf0-synth
  - SpeechSynth (a novel synthetic speech dataset)
- Performance benchmarking for CPU and GPU execution
- Testing under noisy conditions using the CHiME-Home dataset
- Visualization tools for algorithm comparison
- Implementation of popular pitch detection algorithms:
  - YAAPT (pYAAPT implementation)
  - Praat (Parselmouth implementation)
  - TorchCREPE (PyTorch implementation of CREPE) and CREPE (original implementation)
  - Pitch-Estimating Neural Networks (PENN)
  - SWIPE (SPTK implementation)
  - RAPT (SPTK implementation)
  - pYIN (librosa implementation)
  - BasicPitch
  - SwiftF0
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this benchmark in your research, please cite:
@software{pitch_detection_benchmark,
  title = {Pitch Detection Benchmark},
  author = {Lars Nieradzik},
  year = {2025},
  url = {https://github.com/lars76/pitch-detection-benchmark}
}