ByteDance VolcEngine Speech Synthesis

Speech synthesis, or text-to-speech (TTS), has many excellent open-source solutions such as GPT-SoVITS and ChatTTS, as well as free options such as edge-tts. Commercial-grade services also exist, such as ByteDance VolcEngine's speech synthesis. Open-source solutions are the natural choice when cost matters most, while commercial services are more suitable when quality matters most. With the development of large language models, prices keep falling, which makes commercial APIs a good choice for voiceovers.

ByteDance VolcEngine's speech synthesis service has been supported since version 2.88. It provides voiceovers in 8 languages: Chinese, English, Japanese, Portuguese, Spanish, Thai, Vietnamese, and Indonesian. For Chinese, it also supports dialects such as Northeastern Mandarin and Sichuanese. It offers 20,000 free requests, enough to synthesize approximately 10 hours of audio.
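As a rough sanity check on the free quota, the two figures above can be combined into an average audio length per request. This is only arithmetic on the numbers quoted here; the function name is illustrative, and actual audio length depends on the text sent in each request.

```python
FREE_REQUESTS = 20_000    # free requests quoted above
APPROX_AUDIO_HOURS = 10   # approximate total audio quoted above


def average_seconds_per_request(requests: int, hours: float) -> float:
    """Average audio length per request implied by the two figures."""
    return hours * 3600 / requests


avg = average_seconds_per_request(FREE_REQUESTS, APPROX_AUDIO_HOURS)
print(f"{avg:.1f} seconds of audio per request on average")  # prints 1.8
```

In other words, the 10-hour estimate assumes short requests of roughly one subtitle line each.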

Supported Chinese Voices

Only some Chinese voices are shown here. For voices in the other 7 languages, please check this link: https://www.volcengine.com/docs/6561/97465

Many Chinese voices are supported, including various dialects and popular movie narration voices like 'Xiaoshuai' and 'Xiaomei' from Douyin (TikTok).

| Voice Name | voice_type |
| --- | --- |
| CanCan 2.0 | BV700_V2_streaming |
| YangYang | BV705_streaming |
| Sunny Youth | BV123_streaming |
| Relaxed Youth | BV120_streaming |
| Generic Son-in-law | BV119_streaming |
| Ancient Style Young Woman | BV115_streaming |
| Authoritative Uncle | BV107_streaming |
| Simple and Honest Youth | BV100_streaming |
| Gentle Lady | BV104_streaming |
| Cheerful Youth | BV004_streaming |
| Doting Young Lady | BV113_streaming |
| Refined Youth | BV102_streaming |
| Sweet Xiaoyuan | BV405_streaming |
| Friendly Female Voice | BV007_streaming |
| Intellectual Female Voice | BV009_streaming |
| ChengCheng | BV419_streaming |
| TongTong | BV415_streaming |
| Friendly Male Voice | BV008_streaming |
| Dubbed Movie Male Voice | BV408_streaming |
| Lazy Lamb | BV426_streaming |
| Fresh and Artistic Female Voice | BV428_streaming |
| Motivational Female Voice | BV403_streaming |
| Wise Elder | BV158_streaming |
| Benevolent Grandma | BV157_streaming |
| Rap Brother | BR001_streaming |
| Energetic Narrator Male | BV410_streaming |
| Film/TV Narrator Xiaoshuai | BV411_streaming |
| Narrator Xiaoshuai - Multi-Emotion | BV437_streaming |
| Film/TV Narrator Xiaomei | BV412_streaming |
| Playboy Youth | BV159_streaming |
| Livestreaming Queen | BV418_streaming |
| Steady Narrator Male | BV142_streaming |
| Charming Youth | BV143_streaming |
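If you script against the service, the voice list above can be kept as a lookup from display name to voice_type. The following is a minimal sketch with a few rows copied from the list; the names `VOICES` and `voice_type_for` are hypothetical helpers for illustration, not part of any SDK.

```python
# A few entries copied from the voice list above; extend as needed.
VOICES = {
    "CanCan 2.0": "BV700_V2_streaming",
    "YangYang": "BV705_streaming",
    "Film/TV Narrator Xiaoshuai": "BV411_streaming",
    "Film/TV Narrator Xiaomei": "BV412_streaming",
    "Rap Brother": "BR001_streaming",
}


def voice_type_for(name: str) -> str:
    """Return the voice_type for a display name, or raise ValueError."""
    try:
        return VOICES[name]
    except KeyError:
        raise ValueError(f"Unknown voice name: {name!r}") from None
```

Failing loudly on an unknown name avoids sending a misspelled voice_type to the service and spending a request on an error.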

How to Enable

  1. First, open the console link below, then register, log in, and complete real-name verification.

https://console.volcengine.com/

  2. After entering the console, open the "Speech Technology" page as shown in the image below.

image.png

Alternatively, click this link to directly access: https://console.volcengine.com/speech/app

Then, create an application as shown in the image below. You can fill in the name and description as you wish. The key is to select "Speech Synthesis Service," and then click "OK" to complete.

image.png

  3. Next, go to the speech synthesis page to activate the free trial.

Navigate to: https://console.volcengine.com/speech/service/8

At the top, select the application you just created, then click "Trial" to activate.

image.png

  4. Copy the following 3 parameters; you will enter them into your video translation software.

The first is the cluster id. As shown, copy the value under "cluster id".

image.png

The second is the App id. Scroll down on the same page to find it.

image.png

The third is Access Token, located to the right of App id. Copy it.

image.png

  5. Enter these into your video translation software. Open Menu > TTS Settings > ByteDance VolcEngine Speech Synthesis, fill in the three parameters, test them, and save once the test reports no issues.

image.png
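To see how the three parameters fit together outside the software, here is a minimal sketch of a direct request. The endpoint URL, the `Bearer;` header format, and the JSON field names are assumptions based on the service's public HTTP interface as I understand it and should be checked against the official documentation before use; the function names and the `demo-user` uid are hypothetical.

```python
import base64
import json
import urllib.request
import uuid

# Assumed endpoint; confirm it in the official VolcEngine documentation.
TTS_URL = "https://openspeech.bytedance.com/api/v1/tts"


def build_tts_request(appid, access_token, cluster, text,
                      voice_type="BV700_V2_streaming"):
    """Build headers and JSON payload from the three console parameters.

    Field names follow the assumed request format, not a verified spec.
    """
    headers = {
        "Authorization": f"Bearer;{access_token}",  # assumed header format
        "Content-Type": "application/json",
    }
    payload = {
        "app": {"appid": appid, "token": access_token, "cluster": cluster},
        "user": {"uid": "demo-user"},  # hypothetical caller identifier
        "audio": {"voice_type": voice_type, "encoding": "mp3"},
        "request": {
            "reqid": str(uuid.uuid4()),  # unique id per request
            "text": text,
            "text_type": "plain",
            "operation": "query",
        },
    }
    return headers, payload


def synthesize(appid, access_token, cluster, text, voice_type="BV700_V2_streaming"):
    """Send the request and return audio bytes (assumes base64 in 'data')."""
    headers, payload = build_tts_request(appid, access_token, cluster, text, voice_type)
    req = urllib.request.Request(
        TTS_URL, data=json.dumps(payload).encode("utf-8"),
        headers=headers, method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return base64.b64decode(body["data"])
```

Calling `synthesize("your-app-id", "your-access-token", "your-cluster-id", "Hello")` with the values copied in step 4 would spend one request from the quota, so it is worth testing with a short sentence first.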

How to Use in Video Translation Software

After filling in the parameters and confirming that they work, first select the target language in the software, then choose ByteDance VolcEngine Speech Synthesis from the dubbing channels. You can click to preview each voice.

image.png

Select your preferred voice to start the dubbing process.

Special Note

If you activate the official (non-trial) version, only the "General Male" and "General Female" voices are available by default. Other voices must be purchased and activated separately in the ByteDance VolcEngine console.