Using Local Large Language Models and Domestic AIs Compatible with the OpenAI ChatGPT API
In video translation and dubbing software, AI large language models can serve as efficient translation channels, significantly improving translation quality through contextual understanding.
Currently, most domestic AI interfaces are compatible with the OpenAI API, so you can use them directly through the software's OpenAI ChatGPT or local large language model channel. Alternatively, you can deploy a model locally with Ollama.
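Because these providers all expose OpenAI-compatible endpoints, they can be called with one and the same request shape; only the base URL, API Key (SK), and model name differ. A minimal standard-library sketch (the Moonshot URL, placeholder key, and model below are examples only; substitute any provider's values from the sections that follow):

```python
import json
import urllib.request

# Example values only -- substitute the base URL, API Key (SK), and model
# name for whichever provider you use (Moonshot shown as a placeholder).
API_URL = "https://api.moonshot.cn/v1"
API_KEY = "sk-your-key-here"  # placeholder, not a real key
MODEL = "moonshot-v1-8k"

def build_request(text: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request (not yet sent)."""
    payload = {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "Translate the user's text into English."},
            {"role": "user", "content": text},
        ],
    }
    return urllib.request.Request(
        f"{API_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

def parse_response(raw: bytes) -> str:
    """Pull the model's reply out of an OpenAI-style response body."""
    return json.loads(raw)["choices"][0]["message"]["content"]

# With a valid key you would send it like this:
# with urllib.request.urlopen(build_request("你好")) as resp:
#     print(parse_response(resp.read()))
```

This is the same request the software itself issues when you fill in the endpoint, SK, and model fields described below.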
Moonshot AI Usage
- Go to Menu--Translation Settings--OpenAI ChatGPT API Settings.
- Enter https://api.moonshot.cn/v1 as the API endpoint address.
- Enter the API Key obtained from the Moonshot Open Platform in the SK field. You can obtain one at https://platform.moonshot.cn/console/api-keys
- Enter moonshot-v1-8k,moonshot-v1-32k,moonshot-v1-128k in the model text box.
- Select the desired model in the model dropdown, then test and save the settings.
DeepSeek AI Usage
- Go to Menu--Translation Settings--OpenAI ChatGPT API Settings.
- Enter https://api.deepseek.com/v1 as the API endpoint address.
- Enter the API Key obtained from the DeepSeek Open Platform in the SK field. You can obtain one at https://platform.deepseek.com/api_keys
- Enter deepseek-chat in the model text box.
- Select deepseek-chat in the model dropdown, then test and save the settings.
Zhipu AI BigModel Usage
- Go to Menu--Translation Settings--OpenAI ChatGPT API Settings.
- Enter https://open.bigmodel.cn/api/paas/v4/ as the API endpoint address.
- Enter the API Key obtained from the Zhipu AI Platform in the SK field. You can obtain one at https://www.bigmodel.cn/usercenter/apikeys
- Enter glm-4-plus,glm-4-0520,glm-4,glm-4-air,glm-4-airx,glm-4-long,glm-4-flashx,glm-4-flash in the model text box.
- Select the desired model in the model dropdown. You can choose the free model glm-4-flash. Then test and save the settings.
Baichuan AI Usage
- Go to Menu--Translation Settings--OpenAI ChatGPT API Settings.
- Enter https://api.baichuan-ai.com/v1 as the API endpoint address.
- Enter the API Key obtained from the Baichuan AI Platform in the SK field. You can obtain one at https://platform.baichuan-ai.com/console/apikey
- Enter Baichuan4,Baichuan3-Turbo,Baichuan3-Turbo-128k,Baichuan2-Turbo in the model text box.
- Select the desired model in the model dropdown, then test and save the settings.
Lingyi Wanwu (01.AI)
Official Website: https://lingyiwanwu.com
API Key Acquisition: https://platform.lingyiwanwu.com/apikeys
API URL: https://api.lingyiwanwu.com/v1
Available Model: yi-lightning
Alibaba Bailian
Alibaba Bailian is an AI model marketplace that provides Alibaba's own models as well as models from other vendors, including deepseek-r1.
Official Website: https://bailian.console.aliyun.com
API KEY (SK) Acquisition: https://bailian.console.aliyun.com/?apiKey=1#/api-key
API URL: https://dashscope.aliyuncs.com/compatible-mode/v1
Available Models: Numerous, see details at https://bailian.console.aliyun.com/#/model-market
Silicon Flow
Another AI marketplace similar to Alibaba Bailian, providing mainstream domestic models, including deepseek-r1.
Official Website: https://siliconflow.cn
API KEY (SK) Acquisition: https://cloud.siliconflow.cn/account/ak
API URL: https://api.siliconflow.cn/v1
Available Models: Numerous, see details at https://cloud.siliconflow.cn/models?types=chat
Note: Silicon Flow offers Qwen/Qwen2.5-7B-Instruct as a free model, which can be used at no cost.
ByteDance Volcano Engine Ark
An AI marketplace similar to Alibaba Bailian; in addition to the Doubao series models, it also offers some third-party models, including deepseek-r1.
Official Website: https://www.volcengine.com/product/ark
API KEY (SK) Acquisition: https://console.volcengine.com/ark/region:ark+cn-beijing/apiKey
API URL: https://ark.cn-beijing.volces.com/api/v3
MODELS: Numerous, see details at https://console.volcengine.com/ark/region:ark+cn-beijing/model?vendor=Bytedance&view=LIST_VIEW
Note: ByteDance Volcano Engine Ark's compatibility with the OpenAI SDK is a bit unusual. You cannot fill in a model name directly; you must first create an inference endpoint in the Ark console, select the model for that endpoint, and then enter the inference endpoint ID wherever the software asks for a model name. If this feels like too much trouble, you can skip this provider, since apart from a slightly lower price it has no particular advantage. See how to create an inference endpoint: https://www.volcengine.com/docs/82379/1099522
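Sketched concretely, the request body differs from other channels only in the model field (the endpoint ID below, ep-2024xxxx-xxxxx, is a made-up placeholder; a real one comes from your Ark console):

```python
import json

# Ark quirk: the "model" field carries an inference endpoint ID created in
# the console, not a model name. "ep-2024xxxx-xxxxx" is a made-up example.
payload = {
    "model": "ep-2024xxxx-xxxxx",
    "messages": [{"role": "user", "content": "Translate this line."}],
}
body = json.dumps(payload)
# POST this body to https://ark.cn-beijing.volces.com/api/v3/chat/completions
# with the usual "Authorization: Bearer <API KEY>" header.
```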
Important Notes:
Most AI translation channels limit the number of requests per minute. If an error saying the request frequency has been exceeded appears during use, click "Translation Channel↓" on the software's main interface and set the pause seconds to 10 in the pop-up window. The software will then wait 10 seconds after each translation before initiating the next request, i.e., at most 6 requests per minute, preventing the limit from being exceeded.
If the selected model is not capable enough (locally deployed models in particular are often small due to hardware limits), it may fail to return the translated text in the required format, and the results may contain many blank lines. In that case, try a larger model, or open Menu--Tools/Options--Advanced Options and uncheck "Send Complete Subtitle Content when using AI translation".
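The pause-between-requests behavior described in the first note amounts to a simple pacing loop. A minimal sketch, where translate_one stands in for whatever API call the channel actually makes:

```python
import time

def translate_paced(segments, translate_one, pause_seconds=10):
    """Translate segments one at a time, sleeping between requests so a
    per-minute limit is never exceeded (10 s pause -> at most 6/minute)."""
    results = []
    for i, seg in enumerate(segments):
        results.append(translate_one(seg))
        if i < len(segments) - 1:  # no need to wait after the last segment
            time.sleep(pause_seconds)
    return results

# Stand-in "translator" for illustration; a real one would call the API.
out = translate_paced(["hello", "world"], lambda s: s.upper(), pause_seconds=0)
# out == ["HELLO", "WORLD"]
```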
Deploying Tongyi Qianwen Large Model Locally Using Ollama
If you have some hands-on skills, you can also deploy a large language model locally and use it for translation. The following uses Tongyi Qianwen as an example to introduce the deployment and usage methods.
1. Download and Install Ollama
Open the website https://ollama.com/download and click Download. When the download is complete, double-click to open the installer and click Install to complete the installation.
After installation, a black or blue terminal window will pop up automatically. Type ollama run qwen and press Enter; this will automatically download the Tongyi Qianwen (Qwen) model.
Wait for the download to finish. No proxy is required, and the speed is quite fast.
The model runs automatically once the download completes. When the progress reaches 100% and "Success" is displayed, the model is running and the Tongyi Qianwen deployment is complete. You can start using it right away. Isn't it simple?
The default API endpoint address is http://localhost:11434
If the window is closed, how do you open it again? It's also simple: click the Start menu, find "Command Prompt" or "Windows PowerShell" (or press Win+Q and type cmd to search), open it, and enter ollama run qwen again.
2. Use Directly in the Console Command Window
As shown in the figure, once this interface is displayed you can type text directly into the window and start chatting.
3. Get a Friendlier UI
The console interface is not very friendly, so let's install a graphical client.
Open the website https://chatboxai.app/zh and click Download.
Double-click after downloading and wait for the interface window to open automatically.
Click "Start Settings". In the pop-up, click Model at the top, select "Ollama" as the AI model provider, fill in http://localhost:11434 as the API host, select Qwen:latest in the model dropdown, and save.
This is the interface displayed after saving. Use your imagination and use it freely.
4. Fill the API into the Video Translation and Dubbing Software
Open Menu--Settings--Compatible with OpenAI and Local Large Models. Append ,qwen to the middle text box (as shown below), then select the qwen model. Fill in http://localhost:11434/v1 as the API URL, and enter anything in the SK field, such as 1234. Test the connection; if it succeeds, save and start using it.
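Under the hood, the software sends Ollama the same OpenAI-style request as any other channel. A minimal sketch of that request (note the dummy key, which Ollama ignores):

```python
import json
import urllib.request

# Ollama's OpenAI-compatible endpoint; the key can be anything (e.g. 1234).
req = urllib.request.Request(
    "http://localhost:11434/v1/chat/completions",
    data=json.dumps({
        "model": "qwen",
        "messages": [{"role": "user", "content": "Hello"}],
    }).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer 1234",
    },
    method="POST",
)
# With Ollama running: urllib.request.urlopen(req) returns the completion.
```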
5. What Other Models Can Be Used?
In addition to Tongyi Qianwen, many other models can be used, and the method is just as simple: ollama run model-name.
Open https://ollama.com/library to see all available model names. To use a model, copy its name and execute ollama run model-name.
Remember how to open the command window? Click the Start menu and find Command Prompt or Windows PowerShell.
For example, to install the openchat model: open Command Prompt, enter ollama run openchat, press Enter, and wait until Success is displayed.