VoxCPM-0.5B: A Small Yet Powerful One-Click Voice Cloning Package

VoxCPM: A tokenizer-free TTS for context-aware speech generation and realistic voice cloning.

How to Use

Download and extract the package.
Double-click 双击启动.bat ("double-click to launch"). On first launch, the SenseVoiceSmall model is downloaded from modelscope.cn. This model transcribes the reference audio into text.

After a successful startup, the web interface opens automatically in your browser. If it doesn't, open http://127.0.0.1:7860 manually.
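If the browser does not open by itself, a small script can wait until the interface responds and then open it. The following is a minimal sketch, not part of the package: it assumes only the address given above, and the names wait_for_ui and open_ui_when_ready, as well as the timeout values, are hypothetical. Proxies are bypassed deliberately so that a system-wide proxy setting does not interfere with the local address.

```python
import time
import urllib.error
import urllib.request
import webbrowser

URL = "http://127.0.0.1:7860"  # address given in this guide

# An opener with an empty ProxyHandler ignores proxy environment variables,
# so requests to the local address are never routed through a proxy.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_ui(url=URL, timeout=120.0, interval=2.0):
    """Return True once `url` answers an HTTP request, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with _opener.open(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server answered, even though the status was an error code.
            return True
        except (urllib.error.URLError, OSError):
            # Not reachable yet: wait a little and try again.
            time.sleep(interval)
    return False


def open_ui_when_ready(url=URL):
    """Open the interface in the default browser once it is reachable."""
    if wait_for_ui(url):
        webbrowser.open(url)
        return True
    print("The interface did not respond; check the launcher window for an Error: message.")
    return False
```

Calling open_ui_when_ready() after double-clicking the launcher waits up to two minutes (an arbitrary choice) before giving up.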

Loading screen: 启动中 ("Starting up")

If the bottom of the window displays the image below, it indicates success.

If an Error: message appears as shown below, startup has failed. Close the window and try again.

Upon success, the address http://127.0.0.1:7860 will automatically open in your browser.

Upload a 3-10 second reference audio clip to clone its voice. After the upload, the text spoken in the clip is recognized automatically; you can correct it manually. Then enter the text you want to synthesize into speech.
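To check a clip's length before uploading it, the duration can be read from the file header. The sketch below is an illustration only and assumes the reference clip is an uncompressed WAV file, since Python's standard wave module reads nothing else; the function names are hypothetical, and the 3 and 10 second limits are the ones recommended above.

```python
import wave

MIN_SECONDS = 3.0   # lower bound recommended in this guide
MAX_SECONDS = 10.0  # upper bound recommended in this guide


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds (frame count / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_suitable_reference(path):
    """Return True if the clip length falls inside the recommended 3-10 second range."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

For other formats such as MP3, a different library would be needed; this sketch does not cover them.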

The package already includes the models, but it may still check for updates. If a network connection error appears during use, with a message containing a string similar to HTTPConnection, and you do not use a proxy/VPN, right-click 双击启动.bat and choose Edit. Remove rem from the line rem set HF_ENDPOINT=https://hf-mirror.com, save the file, and double-click it again to restart.
If you use a proxy/VPN and know your tool's proxy port, skip the previous step. Instead, remove rem from the line rem set https_proxy=http://127.0.0.1:10808, change 10808 to your proxy port, save, and restart. This usually gives a more stable connection and fewer connection errors.
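The edit described above amounts to turning a commented-out line (rem set NAME=value) into an active one (set NAME=value). The sketch below shows that transformation on the text of the script. It is a hypothetical helper, not something shipped with the package: the name enable_setting is made up, the port 7890 in the usage note is only an example, and reading or writing the actual file is left out because the script's text encoding is not stated here.

```python
import re


def enable_setting(bat_text, name, value=None):
    """Uncomment `rem set NAME=...` in batch-script text.

    If `value` is given it replaces the existing value; otherwise the
    value already present on the line is kept. Other lines are untouched.
    """
    pattern = re.compile(
        r"^(?P<indent>[ \t]*)rem[ \t]+set[ \t]+"
        + re.escape(name)
        + r"=(?P<value>[^\r\n]*)",
        re.IGNORECASE | re.MULTILINE,
    )

    def activate(match):
        new_value = match.group("value") if value is None else value
        return f"{match.group('indent')}set {name}={new_value}"

    return pattern.sub(activate, bat_text)
```

For example, enable_setting(text, "HF_ENDPOINT") activates the mirror line as written, while enable_setting(text, "https_proxy", "http://127.0.0.1:7890") activates the proxy line and swaps in a different port. Editing the file by hand in a text editor, as described above, achieves the same result.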