From 659003d7610398fc7b5ba0c51bb5aa87c62236d1 Mon Sep 17 00:00:00 2001 From: Max Morrison Date: Wed, 31 Jul 2024 15:17:58 -0500 Subject: [PATCH] Update README.md --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index 7a0e694..5d3119f 100644 --- a/README.md +++ b/README.md @@ -89,6 +89,8 @@ pitch, periodicity = penn.from_audio( gpu=gpu) ``` +Note that pitch estimation is performed independently on each frame of audio. Then, a _decoding_ step occurs, which may or may not be computed independently on each frame. Most often, Viterbi decoding is used (as in, e.g., PYIN and CREPE). However, Viterbi decoding is slow. We made a fast Viterbi decoder called [torbi](https://github.com/maxrmorrison/torbi), which [we are working on adding to PyTorch](https://github.com/pytorch/pytorch/issues/121160). Until `torbi` is integrated into PyTorch (or otherwise made pip-installable), it is recommended to use the `dev` branch of `penn`, which uses `torbi` decoding by default, but is not pip-installable. Our paper [_Fine-Grained and Interpretable Neural Speech Editing_](https://www.maxrmorrison.com/sites/promonet/) introduces and demonstrates the efficacy of `torbi` for pitch decoding. + ### Application programming interface