
Two Key Factors Determining Quality:

  1. The accuracy of the recognized text.

  2. The quality of the translation of that text.

Recognition accuracy directly limits translation quality, because errors in the recognized text carry over into the translation. To improve the final result, you need to address both factors.

I: Improving Text Recognition Accuracy:

  1. Use the large-v3 model.

    Recognition accuracy improves from the base model through the small and medium models to large-v3, but so does the demand on computer resources. If your computer has a capable NVIDIA graphics card with 8GB or more of video memory, and you have configured the CUDA and cuDNN environment, try the large-v3 model, which significantly improves subtitle recognition accuracy.

    See the CUDA and cuDNN environment installation guide.
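The rule of thumb above can be sketched as a small check. This is a minimal illustration, not part of the software: the function names `total_vram_gb` and `choose_model` and the `"medium"` fallback are assumptions made for the example, and only the 8GB threshold for large-v3 comes from the text. The helper relies on the `nvidia-smi` command-line tool that ships with the NVIDIA driver and returns `None` when it is not available.

```python
import shutil
import subprocess


def total_vram_gb():
    """Return the first NVIDIA GPU's total memory in whole GB, or None if unknown."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA driver tools on this machine
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout
        # nvidia-smi reports MiB; round so a nominal 8GB card that
        # reports slightly under 8192 MiB still counts as 8 GB.
        return round(float(out.splitlines()[0]) / 1024)
    except (subprocess.SubprocessError, OSError, ValueError, IndexError):
        return None


def choose_model(vram_gb, cuda_ready, fallback="medium"):
    """Pick large-v3 only when the conditions from the text are met.

    `fallback` is a hypothetical default, not a recommendation from the docs.
    """
    if cuda_ready and vram_gb is not None and vram_gb >= 8:
        return "large-v3"
    return fallback


if __name__ == "__main__":
    # cuda_ready must be confirmed by you; this sketch does not verify cuDNN.
    print(choose_model(total_vram_gb(), cuda_ready=True))
```

Whether CUDA and cuDNN are correctly configured still has to be confirmed separately; the sketch only takes that as an input flag.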

2. Separate the background sound in the video.

Heavy background music or noise in the video interferes with text recognition. Try selecting "Keep Background Sound": this separates the background sound before recognition so that only the human voice is recognized, which gives much better results.

Alternatively, you can separate the human voice from the background sound yourself, using either a third-party separation tool or the "Separate Vocal Background" function on the left side of the software.

Next, use the "Audio and Video to Text" function to recognize the separated human voice and obtain text subtitles.

Then, under "Text Subtitle Translation", translate the subtitles into the target language.

Finally, in the "Standard Function Mode", import the subtitles, add the background music, and embed the dubbing and subtitles into the video. Although these steps are somewhat cumbersome, they can significantly improve the translation result.

3. Manually modify and adjust

After subtitle recognition completes, and again after translation completes, the full current text is displayed in the subtitle area on the right side of the software. Click the pause button to pause, then modify and adjust the text manually. However accurate machine recognition and translation are, they do not match manual proofreading.

II: Improving Subtitle Translation Quality:

ChatGPT, DeepL, and Azure give the best translation quality. All three require paid accounts but do not accept payment from domestic (mainland China) users, and ChatGPT and Azure also require a proxy, so the barrier to entry is high.

If you have a paid account and can configure a proxy, use one of these three translation channels to improve translation quality (many relay proxy services for ChatGPT are available in China).

The next best are Google, Gemini, and Microsoft. All three are free; Google and Gemini require a proxy, while Microsoft does not.

Note, however, that Gemini applies stricter safety restrictions. If your video's dialogue contains age-restricted content, Gemini may refuse to translate it.

After those, you can choose Baidu Translate or Tencent Translate. Each requires applying for a free key and appid on its website. Tencent offers a higher free quota, while Baidu's free quota is very low.

In summary, if you meet the conditions, prefer ChatGPT/DeepL, then Google, then Microsoft, and finally Tencent Translate and Baidu Translate.

You can also use DeepLx to access DeepL for free, but it is unstable and your IP is easily banned.
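The channel ranking and its requirements can be written down as a small lookup. This is only an illustrative sketch: the `Channel` class and `usable_channels` function are hypothetical names, not part of the software, and the requirement flags are a reading of the paragraphs above (for example, DeepL is marked as not needing a proxy because only ChatGPT and Azure are named as requiring one). DeepLx is left out because it is described as unstable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    name: str
    needs_paid_account: bool
    needs_proxy: bool
    needs_api_key: bool  # free key/appid requested on the provider's website


# Ordered from most to least preferred, following the ranking in the text.
CHANNELS = [
    Channel("ChatGPT", True, True, False),
    Channel("DeepL", True, False, False),
    Channel("Azure", True, True, False),
    Channel("Google", False, True, False),
    Channel("Gemini", False, True, False),
    Channel("Microsoft", False, False, False),
    Channel("Tencent Translate", False, False, True),
    Channel("Baidu Translate", False, False, True),
]


def usable_channels(has_paid_account, has_proxy, has_api_key):
    """Return channel names you can use, best first, given what you have."""
    return [
        c.name
        for c in CHANNELS
        if (has_paid_account or not c.needs_paid_account)
        and (has_proxy or not c.needs_proxy)
        and (has_api_key or not c.needs_api_key)
    ]


if __name__ == "__main__":
    # Example: no paid account, but a proxy is configured.
    print(usable_channels(has_paid_account=False, has_proxy=True, has_api_key=False))
```

The first name in the returned list is the preferred choice for your situation; with no paid account, proxy, or key, only Microsoft remains.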

As with recognition, a pause button appears once translation completes. Click it to pause, then check and modify the translation results manually in the subtitle area on the right.