Merge remote-tracking branch 'orig_rvc/main' into dev

# Conflicts:
#	.gitignore
#	docs/en/README.en.md
#	infer/lib/audio.py
SocAIty · Jul 9, 2024 · d37ac64
2 parents 448a010 + 5524451

Showing 86 changed files with 5,990 additions and 3,204 deletions.
diff --git a/.env b/.env
@@ -5,4 +5,5 @@ no_proxy = localhost, 127.0.0.1, ::1
 weight_root = assets/weights
 weight_uvr5_root = assets/uvr5_weights
 index_root = logs
+outside_index_root = assets/indices
 rmvpe_root = assets/rmvpe
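The hunk above adds a second index location, `outside_index_root`, next to the existing `index_root`. As a minimal sketch of how such `key = value` settings could be read, the snippet below parses the same lines with a small hand-written parser. The function name `parse_env` and the `SAMPLE` text are illustrative assumptions; how the project itself loads `.env` is not shown in this diff.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple `key = value` lines, skipping blanks and # comments."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Sample text copied from the .env hunk (post-merge state).
SAMPLE = """\
weight_root = assets/weights
weight_uvr5_root = assets/uvr5_weights
index_root = logs
outside_index_root = assets/indices
rmvpe_root = assets/rmvpe
"""

settings = parse_env(SAMPLE)
# Assumption: a consumer would search both index directories.
index_dirs = [settings["index_root"], settings["outside_index_root"]]
print(index_dirs)
```

A hand-written parser is used here only to keep the sketch dependency-free; a dedicated dotenv library would normally be a better fit.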
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
@@ -1,14 +1,20 @@
 # Pull request checklist
 
 - [ ] The PR has a proper title. Use [Semantic Commit Messages](https://seesparkbox.com/foundry/semantic_commit_messages). (No more branch-name title please)
-- [ ] Make sure you are requesting the right branch: `dev`.
 - [ ] Make sure this is ready to be merged into the relevant branch. Please don't create a PR and let it hang for a few days.
-- [ ] Ensure all tests are passing.
-- [ ] Ensure linting is passing.
+- [ ] Ensure you can run the code you submitted successfully. These submissions will be prioritized for review:
+
+    Introduce improvements in program execution speed;
+
+    Introduce improvements in synthesis quality;
+
+    Fix existing bugs reported by user feedback (or that you encountered yourself);
+
+    Introduce more convenient user operations.
 
 # PR type
 
-- Bug fix / new feature / chore
+- Bug fix / new feature / synthesis quality improvement / program execution speed improvement
 
 # Description
 

diff --git a/.github/workflows/unitest.yml b/.github/workflows/unitest.yml
@@ -33,4 +33,4 @@ jobs:
         python infer/modules/train/preprocess.py logs/mute/0_gt_wavs 48000 8 logs/mi-test True 3.7
         touch logs/mi-test/extract_f0_feature.log
         python infer/modules/train/extract/extract_f0_print.py logs/mi-test $(nproc) pm
-        python infer/modules/train/extract_feature_print.py cpu 1 0 0 logs/mi-test v1
+        python infer/modules/train/extract_feature_print.py cpu 1 0 0 logs/mi-test v1 True
diff --git a/.gitignore b/.gitignore
@@ -7,7 +7,6 @@ __pycache__
 tools/aria2c/
 tools/flag.txt
 
-
 # Imported from huggingface.co/lj1995/VoiceConversionWebUI
 /pretrained
 /pretrained_v2
@@ -23,6 +22,10 @@ rmvpe.pt
 # To set a Python version for the project
 .tool-versions
 
+/runtime
+/assets/weights/*
+ffmpeg.*
+ffprobe.*
 
 ### IDE FILES
 

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -0,0 +1,11 @@
+# 贡献规则
+1. 一般来说，作者`@RVC-Boss`将拒绝所有的算法更改，除非它是为了修复某个代码层面的错误或警告
+2. 您可以贡献本仓库的其他位置，如翻译和WebUI，但请尽量作最小更改
+3. 所有更改都需要由`@RVC-Boss`批准，因此您的PR可能会被搁置
+4. 由此带来的不便请您谅解
+
+# Contributing Rules
+1. Generally, the author `@RVC-Boss` will reject all algorithm changes unless the change fixes a code-level error or warning.
+2. You can contribute to other parts of this repo like translations and WebUI, but please minimize your changes as much as possible.
+3. All changes need to be approved by `@RVC-Boss`, so your PR may be put on hold.
+4. Please accept our apologies for any inconvenience caused.
diff --git a/README.md b/README.md
@@ -16,26 +16,33 @@
 
 [**Changelog**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/docs/Changelog_CN.md) | [**FAQ**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98%E8%A7%A3%E7%AD%94) | [**AutoDL · Train an AI singer for 0.5 CNY**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/Autodl%E8%AE%AD%E7%BB%83RVC%C2%B7AI%E6%AD%8C%E6%89%8B%E6%95%99%E7%A8%8B) | [**Controlled experiment records**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/%E5%AF%B9%E7%85%A7%E5%AE%9E%E9%AA%8C%C2%B7%E5%AE%9E%E9%AA%8C%E8%AE%B0%E5%BD%95) | [**Online demo**](https://modelscope.cn/studios/FlowerCry/RVCv2demo)
 
-</div>
-
-------
-
-[**English**](./docs/en/README.en.md) | [**中文简体**](./README.md) | [**日本語**](./docs/jp/README.ja.md) | [**한국어**](./docs/kr/README.ko.md) ([**韓國語**](./docs/kr/README.ko.han.md)) | [**Français**](./docs/fr/README.fr.md)| [**Türkçe**](./docs/tr/README.tr.md)
-
-Click here to watch our [demo video](https://www.bilibili.com/video/BV1pm4y1z7Gm/)!
-
-Training and inference UI: go-web.bat
+[**English**](./docs/en/README.en.md) | [**中文简体**](./README.md) | [**日本語**](./docs/jp/README.ja.md) | [**한국어**](./docs/kr/README.ko.md) ([**韓國語**](./docs/kr/README.ko.han.md)) | [**Français**](./docs/fr/README.fr.md) | [**Türkçe**](./docs/tr/README.tr.md) | [**Português**](./docs/pt/README.pt.md)
 
-![image](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/assets/129054828/092e5c12-0d49-4168-a590-0b0ef6a4f630)
-
-Real-time voice conversion UI: go-realtime-gui.bat
-
-![image](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/assets/129054828/143246a9-8b42-4dd1-a197-430ede4d15d7)
+</div>
 
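 > The base model was trained on nearly 50 hours of the open-source, high-quality VCTK dataset, so there are no copyright concerns; feel free to use it.
 
 > Look forward to the RVCv3 base model: more parameters, more training data, better results, roughly the same inference speed, and less training data required.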
 
+<table>
+   <tr>
+		<td align="center">Training and inference UI</td>
+		<td align="center">Real-time voice conversion UI</td>
+	</tr>
+  <tr>
+		<td align="center"><img src="https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/assets/129054828/092e5c12-0d49-4168-a590-0b0ef6a4f630"></td>
+    <td align="center"><img src="https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/assets/129054828/730b4114-8805-44a1-ab1a-04668f3c30a6"></td>
+	</tr>
+	<tr>
+		<td align="center">go-web.bat</td>
+		<td align="center">go-realtime-gui.bat</td>
+	</tr>
+  <tr>
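+    <td align="center">You can freely choose the operation you want to perform.</td>
+		<td align="center">We have achieved 170 ms end-to-end latency. With ASIO input/output devices, 90 ms end-to-end latency is achievable, but this depends heavily on hardware driver support.</td>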
+	</tr>
+</table>
+
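 ## Introduction
 This repository has the following features:
 + Uses top-1 retrieval to replace input source features with training-set features, preventing timbre leakage
@@ -47,47 +54,55 @@
 + Uses the state-of-the-art [vocal pitch extraction algorithm InterSpeech2023-RMVPE](#参考项目) to eliminate the muted-sound problem. It gives the best results (by a significant margin) while being faster and lighter on resources than crepe_full
 + Acceleration support for AMD and Intel GPUs
 
+Click here to watch our [demo video](https://www.bilibili.com/video/BV1pm4y1z7Gm/)!
+
 ## Environment setup
 Run the following commands in an environment with a Python version above 3.8.  
 
-(Windows/Linux)  
-First, install the main dependencies via pip:
+### General method for Windows/Linux/MacOS and other platforms
+Choose either of the following methods.
+#### 1. Install dependencies via pip
+1. Install PyTorch and its core dependencies; skip this step if they are already installed. See: https://pytorch.org/get-started/locally/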
 ```bash
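-# Install PyTorch and its core dependencies; skip if already installed
-# See: https://pytorch.org/get-started/locally/
 pip install torch torchvision torchaudio
-
-# On Windows with an Nvidia Ampere GPU (RTX30xx), per the experience reported in #21, you need to specify the CUDA version that matches PyTorch
-#pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
+```
+2. On Windows with an Nvidia Ampere GPU (RTX30xx), per the experience reported in #21, you need to specify the CUDA version that matches PyTorch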
+```bash
+pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
+```
+3. Install the dependencies that match your GPU
+- Nvidia GPUs
+```bash
+pip install -r requirements.txt
+```
+- AMD/Intel GPUs
+```bash
+pip install -r requirements-dml.txt
+```
+- AMD GPUs with ROCm (Linux)
+```bash
+pip install -r requirements-amd.txt
+```
+- Intel GPUs with IPEX (Linux)
+```bash
+pip install -r requirements-ipex.txt
 ```
 
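-You can use Poetry to install the dependencies:
+#### 2. Install dependencies via Poetry
+Install the Poetry dependency manager; skip this step if it is already installed. See: https://python-poetry.org/docs/#installation
 ```bash
-# Install the Poetry dependency manager; skip if already installed
-# See: https://python-poetry.org/docs/#installation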
 curl -sSL https://install.python-poetry.org | python3 -
-
-# Install dependencies via Poetry
-poetry install
 ```
 
-You can also install the dependencies via pip:
+When installing dependencies with Poetry, Python 3.7-3.10 is recommended; other versions conflict when installing llvmlite==0.39.0
 ```bash
-Nvidia GPUs:
-  pip install -r requirements.txt
-
-AMD/Intel GPUs:
-  pip install -r requirements-dml.txt
-
-AMD GPUs with ROCm (Linux):
-  pip install -r requirements-amd.txt
-
-Intel GPUs with IPEX (Linux):
-  pip install -r requirements-ipex.txt
+poetry init -n
+poetry env use "path to your python.exe"
+poetry run pip install -r requirements.txt
 ```
 
-------
-Mac users can install the dependencies via `run.sh`:
+### MacOS
+You can install the dependencies via `run.sh`
 ```bash
 sh ./run.sh
 ```
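The pip section of the README hunk above names one requirements file per GPU backend. As a rough sketch, that choice can be written as a lookup table. The backend labels (`nvidia`, `dml`, `rocm`, `ipex`) and the helper `pip_install_args` are hypothetical names introduced for illustration; only the four file names come from the README.

```python
# Hypothetical backend labels (not from the repository) mapped to the
# requirements files named in the README.
REQUIREMENTS_BY_BACKEND = {
    "nvidia": "requirements.txt",
    "dml": "requirements-dml.txt",    # AMD/Intel GPUs
    "rocm": "requirements-amd.txt",   # AMD GPUs with ROCm, Linux
    "ipex": "requirements-ipex.txt",  # Intel GPUs with IPEX, Linux
}


def pip_install_args(backend):
    """Return the pip argument list for the given backend label."""
    try:
        requirements_file = REQUIREMENTS_BY_BACKEND[backend]
    except KeyError:
        known = ", ".join(sorted(REQUIREMENTS_BY_BACKEND))
        raise ValueError(
            f"unknown backend {backend!r}; expected one of: {known}"
        ) from None
    return ["pip", "install", "-r", requirements_file]


# Print the command instead of running it, so the sketch has no side effects.
print(" ".join(pip_install_args("rocm")))
```

The list form can be handed to `subprocess.run` if the install should actually be executed.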
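@@ -97,48 +112,48 @@ RVC requires some additional pretrained models for inference and training.
 
 You can download these models from our [Hugging Face space](https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/).
 
-Below is a list of the names of all the pretrained models and other files that RVC requires: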
-```bash
-./assets/hubert/hubert_base.pt
-
-./assets/pretrained 
-
-./assets/uvr5_weights
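+### 1. Download assets
+Below is a list of the names of all the pretrained models and other files that RVC requires. You can find scripts for downloading them in the `tools` folder.
 
-To test the v2 models, you also need to download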
+- ./assets/hubert/hubert_base.pt
 
-./assets/pretrained_v2
+- ./assets/pretrained 
 
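-If you are on Windows, you may need this file; skip it if ffmpeg and ffprobe are already installed. Ubuntu/Debian users can install both with apt install ffmpeg, and Mac users can install them with brew install ffmpeg (brew must be installed first)
+- ./assets/uvr5_weights
 
-./ffmpeg
+To use the v2 models, you also need to download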
 
-https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffmpeg.exe
+- ./assets/pretrained_v2
 
-./ffprobe
+### 2. Install ffmpeg
+Skip this step if ffmpeg and ffprobe are already installed.
 
-https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffprobe.exe
+#### Ubuntu/Debian users
+```bash
+sudo apt install ffmpeg
+```
+#### MacOS users
+```bash
+brew install ffmpeg
+```
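+#### Windows users
+After downloading, place the files in the repository root.
+- Download [ffmpeg.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffmpeg.exe)
 
-To use the latest RMVPE vocal pitch extraction algorithm, download the pitch extraction model weights and place them in the RVC root directory
+- Download [ffprobe.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffprobe.exe)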
 
-https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.pt
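+### 3. Download the files required by the RMVPE vocal pitch extraction algorithm
 
-    AMD and Intel GPU users who need the DML environment should download
+To use the latest RMVPE vocal pitch extraction algorithm, download the pitch extraction model weights and place them in the RVC root directory.
 
-    https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.onnx
+- Download [rmvpe.pt](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.pt)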
 
-```
-Then start the WebUI with the following command:
-```bash
-python infer-web.py
-```
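-If you are on Windows or macOS, you can simply download and extract `RVC-beta.7z`; on Windows run `go-web.bat` to start the WebUI, and on macOS run `sh ./run.sh`.
+#### Download the DML environment for RMVPE (optional, AMD/Intel GPU users)
 
-Intel GPU users who need IPEX should first run `source /opt/intel/oneapi/setvars.sh` in a terminal (Linux only).
+- Download [rmvpe.onnx](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.onnx)
 
-The repository also includes `小白简易教程.doc` (a simple beginner's tutorial) for reference.
+### 4. ROCm for AMD GPUs (optional, Linux only)
 
-## ROCm for AMD GPUs (Linux only)
 To run RVC on Linux using AMD's ROCm, first install the required drivers from [here](https://rocm.docs.amd.com/en/latest/deploy/linux/os-native/install.html).
 
 If you are using Arch Linux, you can install the required drivers with pacman: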
@@ -155,11 +170,32 @@ export HSA_OVERRIDE_GFX_VERSION=10.3.0
 sudo usermod -aG render $USERNAME
 sudo usermod -aG video $USERNAME
 ````
-Then run the WebUI:
+
+## Getting started
+### Launch directly
+Start the WebUI with the following command
 ```bash
 python infer-web.py
 ```
 
+If you installed the dependencies with Poetry, start the WebUI as follows
+```bash
+poetry run python infer-web.py
+```
+
+### Use the all-in-one package
+Download and extract `RVC-beta.7z`
+#### Windows users
+Double-click `go-web.bat`
+#### MacOS users
+```bash
+sh ./run.sh
+```
+### For Intel GPU users who need IPEX (Linux only)
+```bash
+source /opt/intel/oneapi/setvars.sh
+```
+
 ## 参考项目
 + [ContentVec](https://github.com/auspicious3000/contentvec/)
 + [VITS](https://github.com/jaywalnut310/vits)
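The revised README tells users to skip the ffmpeg step when `ffmpeg` and `ffprobe` are already installed. A minimal sketch of such a check, using the standard-library `shutil.which`, is shown below. The helper name `missing_tools` is an assumption for illustration and is not part of the repository. Whether a copy placed in the repository root on Windows is found depends on how the platform searches the current directory.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe"), which=shutil.which):
    """Return the names in `tools` that `which` cannot locate."""
    return [name for name in tools if which(name) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("ffmpeg and ffprobe are available")
```

Passing the lookup function as a parameter keeps the check easy to exercise without touching the real PATH.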

diff --git a/Retrieval_based_Voice_Conversion_WebUI.ipynb b/Retrieval_based_Voice_Conversion_WebUI.ipynb
@@ -290,7 +290,7 @@
     "\n",
     "!python3 extract_f0_print.py logs/{MODELNAME} {THREADCOUNT} {ALGO}\n",
     "\n",
-    "!python3 extract_feature_print.py cpu 1 0 0 logs/{MODELNAME}"
+    "!python3 extract_feature_print.py cpu 1 0 0 logs/{MODELNAME} True"
    ]
   },
   {

diff --git a/Retrieval_based_Voice_Conversion_WebUI_v2.ipynb b/Retrieval_based_Voice_Conversion_WebUI_v2.ipynb
@@ -309,7 +309,7 @@
     "\n",
     "!python3 extract_f0_print.py logs/{MODELNAME} {THREADCOUNT} {ALGO}\n",
     "\n",
-    "!python3 extract_feature_print.py cpu 1 0 0 logs/{MODELNAME}"
+    "!python3 extract_feature_print.py cpu 1 0 0 logs/{MODELNAME} True"
    ]
   },
   {