
Gemini AI is not only an excellent large language model for chat; it is also a capable speech recognition and audio/video-to-text tool. It offers over 1,500 free calls per day, which should be enough for everyday use.

How to Enable Gemini AI Service

First, visit the Gemini AI Studio page at https://aistudio.google.com/ and check whether it opens.
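If you want to run the same check from the machine that will run the software, a minimal Python sketch could look like the following. It only tests whether the site answers over the network, optionally through a proxy; it cannot tell whether your region is supported, so the browser check is still needed. The names `make_opener` and `can_reach` and the proxy address in the comment are hypothetical, chosen for illustration.

```python
import urllib.error
import urllib.request

STUDIO_URL = "https://aistudio.google.com/"


def make_opener(proxy_url=None):
    """Build a urllib opener that optionally routes traffic through a proxy."""
    if proxy_url:
        # Route both HTTP and HTTPS requests through the given proxy.
        handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    else:
        # No explicit proxy: fall back to the system/environment proxy settings.
        handler = urllib.request.ProxyHandler()
    return urllib.request.build_opener(handler)


def can_reach(url=STUDIO_URL, proxy_url=None, timeout=10):
    """Return True if the URL answers with a non-error HTTP status."""
    opener = make_opener(proxy_url)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status < 400
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts, and HTTP errors.
        return False


# Example (assumes a local proxy on port 7890; adjust to your own setup):
# print(can_reach(proxy_url="http://127.0.0.1:7890"))
```

A `False` result usually means the proxy is not running or not being used; a `True` result only means the network path works.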

  1. A proxy or VPN is a prerequisite: This is perhaps the only hurdle to using Gemini AI. Even with a proxy or VPN enabled, the site may sometimes still show a "Country or region not supported" message.

If that happens, switch VPN nodes until the page displays the interface shown below:

image.png

  2. Get API Key: In the upper left corner of the page shown above, click the Get API Key button and create a new key.

    image.png

  3. Paste API Key: Paste the key into the pyVideoTrans software. Open the software's settings menu, find the "Gemini Pro Gemini Key" option, and paste the key in.

    image.png
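Before pasting the key into the software, you may want to confirm that it works. One way, sketched below under the assumption that the key is valid for the public Gemini REST API, is to ask the API which models the key can see. The helper names are hypothetical; `urllib.request.urlopen` picks up proxy settings from environment variables such as `HTTPS_PROXY`, so set that first if you rely on a proxy.

```python
import json
import urllib.request

MODELS_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def build_list_models_request(api_key):
    """Build (but do not send) a request listing the models visible to this key."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"x-goog-api-key": api_key},
        method="GET",
    )


def list_model_names(api_key, timeout=15):
    """Send the request and return the model names reported by the API.

    Raises urllib.error.HTTPError if the key is rejected or the region is blocked.
    """
    req = build_list_models_request(api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        data = json.load(resp)
    return [model["name"] for model in data.get("models", [])]


# Example (replace the placeholder with your own key):
# print(list_model_names("YOUR_API_KEY"))
```

If the call returns a list of names, the key and the network path are both working, and one of those names is what you would enter as the model in the software.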

Using in Video Translation and Dubbing Software

Please upgrade to the v3.07 patch version first.

  1. In the menu bar, open Translation Settings > Gemini pro and fill in your key and the model you want to use. You can also edit the transcription prompt here.

image.png

  1. Make sure your proxy/VPN is enabled; otherwise requests will fail.

image.png

  1. In the speech recognition channel, select Gemini large model recognition, upload the audio or video, and choose the spoken language. Do not select Chinese re-segmentation: Gemini's own segmentation is good enough, and enabling it may make the results worse.

image.png

  1. Wait for the recognition result. If you are not satisfied, adjust the prompt and run the recognition again.

image.png
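For readers curious about what a transcription request of this kind can look like outside the software, here is a minimal sketch against the public Gemini REST API. It is not necessarily how pyVideoTrans calls Gemini internally. The function names and the default prompt are hypothetical, the model ID must be supplied by you, and sending audio inline like this is only suitable for small files.

```python
import base64
import json
import urllib.request

API_BASE = "https://generativelanguage.googleapis.com/v1beta/models"

# Hypothetical prompt for illustration; not the prompt shipped with the software.
DEFAULT_PROMPT = "Transcribe this audio. Output one sentence per line."


def build_transcription_payload(audio_bytes, mime_type, prompt=DEFAULT_PROMPT):
    """Build the JSON body: a text prompt plus base64-encoded inline audio."""
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {"inline_data": {"mime_type": mime_type, "data": encoded}},
                ]
            }
        ]
    }


def transcribe(audio_path, api_key, model, mime_type="audio/mp3",
               prompt=DEFAULT_PROMPT, timeout=300):
    """Send one audio file to the given model and return the generated text."""
    with open(audio_path, "rb") as f:
        payload = build_transcription_payload(f.read(), mime_type, prompt)
    req = urllib.request.Request(
        f"{API_BASE}/{model}:generateContent",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    # The generated text is in the first part of the first candidate.
    return body["candidates"][0]["content"]["parts"][0]["text"]


# Example (placeholders; use your own key, file, and a model ID available to you):
# print(transcribe("clip.mp3", "YOUR_API_KEY", model="YOUR_MODEL_ID"))
```

Changing `prompt` here is the equivalent of editing the transcription prompt in the software: the same audio can be sent again with different instructions if the first result is not satisfactory.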