ByteDance VolcEngine Speech Synthesis
Speech synthesis, or text-to-speech (TTS), is well served by open-source projects such as GPT-SoVITS and ChatTTS and by free options such as edge-tts. Commercial-grade services exist too, among them ByteDance VolcEngine's speech synthesis. If cost is the priority, open-source solutions are the natural choice; if quality matters more, commercial services are the better fit. As large language models have developed, prices have also kept falling, which makes commercial APIs a good choice for voiceovers.
Since version 2.88, the video translation software supports ByteDance VolcEngine's speech synthesis service. It provides voiceovers in 8 languages: Chinese, English, Japanese, Portuguese, Spanish, Thai, Vietnamese, and Indonesian. For Chinese, it also supports dialects such as Northeastern Mandarin and Sichuanese. The service offers 20,000 free requests, enough to synthesize approximately 10 hours of audio.
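As a quick sanity check on the free quota, the two figures above imply an average clip length per request. The snippet below is only arithmetic on the numbers quoted in this page. The variable names are illustrative, and the actual way VolcEngine counts requests is not described here.

```python
# Figures quoted above: 20,000 free requests, roughly 10 hours of audio.
FREE_REQUESTS = 20_000
ESTIMATED_HOURS = 10

# Average audio length per request implied by those two figures.
seconds_per_request = ESTIMATED_HOURS * 3600 / FREE_REQUESTS
print(seconds_per_request)  # 1.8
```

In other words, the 10-hour estimate assumes short requests of about 1.8 seconds of audio each, which is in the range of a single subtitle line.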
Supported Chinese Voices
Only some Chinese voices are shown here. For voices in the other 7 languages, please check this link: https://www.volcengine.com/docs/6561/97465
Many Chinese voices are supported, including various dialects and popular movie narration voices like 'Xiaoshuai' and 'Xiaomei' from Douyin (TikTok).
Voice Name | voice_type |
---|---|
CanCan 2.0 | BV700_V2_streaming |
YangYang | BV705_streaming |
Sunny Youth | BV123_streaming |
Relaxed Youth | BV120_streaming |
Generic Son-in-law | BV119_streaming |
Ancient Style Young Woman | BV115_streaming |
Authoritative Uncle | BV107_streaming |
Simple and Honest Youth | BV100_streaming |
Gentle Lady | BV104_streaming |
Cheerful Youth | BV004_streaming |
Doting Young Lady | BV113_streaming |
Refined Youth | BV102_streaming |
Sweet Xiaoyuan | BV405_streaming |
Friendly Female Voice | BV007_streaming |
Intellectual Female Voice | BV009_streaming |
ChengCheng | BV419_streaming |
TongTong | BV415_streaming |
Friendly Male Voice | BV008_streaming |
Dubbed Movie Male Voice | BV408_streaming |
Lazy Lamb | BV426_streaming |
Fresh and Artistic Female Voice | BV428_streaming |
Motivational Female Voice | BV403_streaming |
Wise Elder | BV158_streaming |
Benevolent Grandma | BV157_streaming |
Rap Brother | BR001_streaming |
Energetic Narrator Male | BV410_streaming |
Film/TV Narrator Xiaoshuai | BV411_streaming |
Narrator Xiaoshuai - Multi-Emotion | BV437_streaming |
Film/TV Narrator Xiaomei | BV412_streaming |
Playboy Youth | BV159_streaming |
Livestreaming Queen | BV418_streaming |
Steady Narrator Male | BV142_streaming |
Charming Youth | BV143_streaming |
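If you refer to these voices from a script, the table can be kept as a plain mapping from display name to `voice_type`. The sketch below copies a few rows from the table. The helper name `voice_type_for` is hypothetical and not part of any VolcEngine SDK.

```python
# A few rows copied from the table above: display name -> voice_type.
CHINESE_VOICES = {
    "CanCan 2.0": "BV700_V2_streaming",
    "YangYang": "BV705_streaming",
    "Film/TV Narrator Xiaoshuai": "BV411_streaming",
    "Film/TV Narrator Xiaomei": "BV412_streaming",
    "Rap Brother": "BR001_streaming",
}


def voice_type_for(name: str) -> str:
    """Return the voice_type for a display name, or raise a readable error."""
    try:
        return CHINESE_VOICES[name]
    except KeyError:
        known = ", ".join(sorted(CHINESE_VOICES))
        raise ValueError(f"Unknown voice {name!r}; known voices: {known}") from None


print(voice_type_for("Film/TV Narrator Xiaoshuai"))  # BV411_streaming
```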
How to Enable
- First, open the console, then register, log in, and complete real-name verification:
https://console.volcengine.com/
- After entering the console, open the "Speech Technology" page.
Alternatively, go to it directly: https://console.volcengine.com/speech/app
Then create an application. The name and description can be anything you like. The key step is to select "Speech Synthesis Service", then click "OK" to finish.
- Next, go to the speech synthesis page to activate the free trial.
Navigate to: https://console.volcengine.com/speech/service/8
At the top, select the application you just created, then click "Trial" to activate.
- Copy the 3 parameters, then you can fill them into your video translation software.
The first is `cluster id`. Copy the value shown under "cluster id".
The second is `App id`. Scroll down on the same page to find it.
The third is `Access Token`, located to the right of `App id`. Copy it.
- Enter these values in your video translation software. Open Menu > TTS Settings > ByteDance VolcEngine Speech Synthesis, fill in the three fields, run the test, and save once it passes.
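To check the three parameters outside the GUI, the sketch below shows how they might map onto a direct HTTP request. Treat it as a sketch under stated assumptions, not the software's actual implementation. The endpoint URL, the `Bearer;` header format, the JSON field names, and the base64 `data` field in the response are all assumptions based on VolcEngine's public HTTP TTS interface and are not confirmed by this page. `build_tts_request` and `synthesize` are hypothetical helper names. Verify everything against the official documentation first.

```python
import base64
import json
import urllib.request
import uuid

# Assumed endpoint of the VolcEngine HTTP TTS interface; verify in the official docs.
TTS_URL = "https://openspeech.bytedance.com/api/v1/tts"


def build_tts_request(appid, access_token, cluster, voice_type, text):
    """Build headers and JSON body from the three console parameters (assumed layout)."""
    headers = {
        # Assumed header format: "Bearer;" followed directly by the token.
        "Authorization": f"Bearer;{access_token}",
        "Content-Type": "application/json",
    }
    body = {
        "app": {"appid": appid, "token": access_token, "cluster": cluster},
        "user": {"uid": "demo-user"},  # arbitrary caller identifier
        "audio": {"voice_type": voice_type, "encoding": "mp3"},
        "request": {
            "reqid": str(uuid.uuid4()),  # unique ID per request
            "text": text,
            "text_type": "plain",
            "operation": "query",
        },
    }
    return headers, body


def synthesize(appid, access_token, cluster, voice_type, text):
    """Send the request and return audio bytes (performs a real network call)."""
    headers, body = build_tts_request(appid, access_token, cluster, voice_type, text)
    req = urllib.request.Request(
        TTS_URL, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        result = json.load(resp)
    # Assumption: the audio is returned base64-encoded in the "data" field.
    return base64.b64decode(result["data"])
```

A call such as `synthesize("YOUR_APP_ID", "YOUR_ACCESS_TOKEN", "YOUR_CLUSTER_ID", "BV700_V2_streaming", "你好")` would then return MP3 bytes if the assumptions hold. If it fails, the three values are the first thing to recheck.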
How to Use in Video Translation Software
After entering the parameters and confirming that they work, select the target language in the software, then choose ByteDance VolcEngine Speech Synthesis as the dubbing channel. You can click to preview each voice.
Select the voice you prefer to start the dubbing process.
Special Note
If you activate the official version rather than the free trial, only the "General Male" and "General Female" voices are available by default. Other voices must be purchased and activated separately in the ByteDance VolcEngine console.