Using Qwen-TTS for Voiceovers | pyVideoTrans, Open Source Video Translation Tool (pyvideotrans.com, github.com/jianchang512/pyvideotrans)

pyVideoTrans v3.74-0720 and later integrate Alibaba's Qwen-TTS speech synthesis service.

Simply put, Qwen-TTS is an advanced text-to-speech technology that converts text into realistic and natural-sounding human voices. A key highlight is its ability to automatically adjust speech rhythm and emotion based on the text content.

Qwen-TTS Model

The qwen-tts model supports Chinese and English, plus three Chinese dialects: Beijing dialect, Shanghainese (Wu dialect), and Sichuan dialect. Model names: qwen-tts and qwen-tts-latest.

Qwen-TTS currently offers the following voices. All of them support both Chinese and English:

Beijing dialect: Dylan
Shanghainese: Jada
Sichuan dialect: Sunny
Others: Chelsie, Cherry, Ethan, Serena
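
If you script around these voices, for example to validate a name before entering it in a configuration, the list above can be kept as a small lookup table. This is only a sketch: the table name and the two helper functions are made up for illustration and are not part of pyVideoTrans or the Qwen-TTS API.

```python
from typing import Dict, List

# Voices of the qwen-tts model, grouped exactly as in the list above.
# QWEN_TTS_VOICES and the helpers below are hypothetical names for illustration.
QWEN_TTS_VOICES: Dict[str, List[str]] = {
    "Beijing dialect": ["Dylan"],
    "Shanghainese": ["Jada"],
    "Sichuan dialect": ["Sunny"],
    "Others": ["Chelsie", "Cherry", "Ethan", "Serena"],
}


def voices_for(group: str) -> List[str]:
    """Return the voices in a group, or an empty list for an unknown group."""
    return list(QWEN_TTS_VOICES.get(group, []))


def is_known_voice(name: str) -> bool:
    """Check a voice name against the list above (case-sensitive)."""
    return any(name in names for names in QWEN_TTS_VOICES.values())
```

For instance, `voices_for("Sichuan dialect")` returns `["Sunny"]`, and `is_known_voice("Rocky")` is false because Rocky belongs to the Qwen3-TTS list, not this one.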

Qwen3-TTS Model

The qwen3-tts model supports 10 languages and multiple Chinese dialects. Model name: qwen3-tts-flash
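Available voices, with the Chinese display name in parentheses:

Others: Cherry (芊悦), Ethan (晨煦), Nofish (不吃鱼), Jennifer (詹妮弗), Ryan (甜茶), Katerina (卡捷琳娜), Elias (墨讲师)
Shanghainese: Jada (上海-阿珍)
Beijing dialect: Dylan (北京-晓东)
Sichuan dialect: Sunny (四川-晴儿), Eric (四川-程川)
Nanjing dialect: li (南京-老李)
Shaanxi dialect: Marcus (陕西-秦川)
Minnan dialect: Roy (闽南-阿杰)
Tianjin dialect: Peter (天津-李彼得)
Cantonese: Rocky (粤语-阿强), Kiki (粤语-阿清)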

About Free Quota:
Alibaba provides a free quota of 1 million tokens for this service, which can synthesize approximately 20,000 seconds of audio, or about 333 minutes (approximately 5.5 hours).
This quota is sufficient for most individual users' regular use and feature testing.
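
The two figures above imply an average of roughly 50 tokens per second of audio (1,000,000 / 20,000). The sketch below uses that derived rate to estimate how much of the free quota a job might consume. The rate is an approximation worked out from this page's numbers, not an official billing formula, and the function names are illustrative only; actual usage depends on the text being synthesized.

```python
FREE_QUOTA_TOKENS = 1_000_000   # free quota stated above
APPROX_AUDIO_SECONDS = 20_000   # approximate audio length stated above

# Implied average rate, derived from the two figures above (an estimate only).
TOKENS_PER_SECOND = FREE_QUOTA_TOKENS / APPROX_AUDIO_SECONDS


def estimate_tokens(audio_seconds: float) -> float:
    """Rough token cost of synthesizing the given number of seconds of speech."""
    return audio_seconds * TOKENS_PER_SECOND


def remaining_minutes(tokens_used: float) -> float:
    """Rough minutes of audio left in the free quota after tokens_used."""
    remaining = max(FREE_QUOTA_TOKENS - tokens_used, 0)
    return remaining / TOKENS_PER_SECOND / 60
```

Under this estimate, ten minutes of continuous speech (`estimate_tokens(600)`) costs about 30,000 tokens, or about 3% of the free quota.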

How to Use the Qwen-TTS Feature?

No complex settings are required. With just a few simple steps, you can use the powerful Qwen-TTS in pyVideoTrans.

Step 1: Obtain and Configure Your API KEY

Alibaba provides a free quota for each user.

Visit the Alibaba Cloud Bailian platform: https://bailian.console.aliyun.com/?tab=model#/api-key

Log in to your Alibaba Cloud account (register if you don't have one).
On the API-KEY management page, click "Create API-KEY". The system generates a string starting with "sk-"; this is your API KEY. Copy it.
Return to pyVideoTrans, open TTS Settings in the top menu bar, and select Qwen TTS from the drop-down menu.
In the Qwen TTS configuration window that pops up, paste the copied API KEY into the "API KEY" input box. Click the "Test" button to play a sample; if you hear audio, the configuration is working. Finally, click Save.
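
A common cause of a failed test is a key that was copied with stray whitespace or only partially. The sketch below is a quick local sanity check before pasting. The only rule taken from this page is the "sk-" prefix; the whitespace check is an assumption, and `looks_like_api_key` is a hypothetical helper, not part of pyVideoTrans. It does not verify that the key is actually valid on Alibaba's side.

```python
def looks_like_api_key(value: str) -> bool:
    """Local format check only: 'sk-' prefix, non-empty remainder, no inner whitespace."""
    key = value.strip()  # drop leading/trailing spaces and newlines picked up when copying
    if not key.startswith("sk-"):
        return False
    if len(key) == len("sk-"):
        return False  # the prefix alone is not a key
    # Assumption: a key never contains whitespace in the middle.
    return not any(ch.isspace() for ch in key)
```

The "Test" button in the configuration window remains the real check, since only the service can confirm that a key is active.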

Step 2: Use Qwen-TTS in Video Translation

After the configuration is complete, you can enable Qwen-TTS when processing a single video.

In the main interface of pyVideoTrans, find the "Voiceover Channel" drop-down menu, click it, and select "Qwen TTS".
In the adjacent "Voiceover Role" menu, select the voice you want: for example, "Cherry" for a standard female voice, or "Sunny" for a Sichuan dialect voiceover.

Step 3: Use in Batch Voiceover and Multi-Role Voiceover

Qwen-TTS is also available in the batch processing features, so you can voice many files or characters in one pass.

Batch Voiceover for Subtitles: If you have multiple SRT subtitle files that need voiceover, switch to the "Batch Voiceover for Subtitles" interface. Select "Qwen TTS" in the "Voiceover Channel" menu below, then choose the role you want.
Multi-Role Voiceover for Subtitles: When processing conversations with multiple characters, this feature also applies. You can assign different Qwen-TTS voices to different characters in the "Multi-Role Voiceover for Subtitles" feature area.
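
In pyVideoTrans the character-to-voice assignment is done in the interface. If you want to plan it beforehand, the idea can be sketched as a simple mapping. Everything here is hypothetical and only illustrates the concept: `assign_voices`, the default voice order, and the character names are assumptions, not pyVideoTrans code.

```python
from itertools import cycle
from typing import Dict, Iterable, Sequence

# Voices from the qwen-tts list above; the order is an arbitrary choice.
DEFAULT_VOICES = ("Cherry", "Ethan", "Chelsie", "Serena")


def assign_voices(
    characters: Iterable[str],
    voices: Sequence[str] = DEFAULT_VOICES,
) -> Dict[str, str]:
    """Give each distinct character a voice, in order of first appearance.

    Voices are reused from the start when characters outnumber voices.
    """
    if not voices:
        raise ValueError("at least one voice is required")
    mapping: Dict[str, str] = {}
    pool = cycle(voices)
    for name in characters:
        if name not in mapping:  # a character keeps its first assigned voice
            mapping[name] = next(pool)
    return mapping
```

With speakers appearing in the order Anna, Ben, Anna, this yields Anna as Cherry and Ben as Ethan; the resulting plan can then be entered by hand in the "Multi-Role Voiceover for Subtitles" area.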
