Using Local Large Language Models and Domestic AIs Compatible with the OpenAI ChatGPT API
In video translation and dubbing software, AI large language models can serve as efficient translation channels, significantly improving translation quality through contextual understanding.
Currently, most domestic AI interfaces are compatible with the OpenAI API, so you can use them directly through the software's OpenAI ChatGPT or local large language model channel. Alternatively, you can deploy a model locally with Ollama.
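Because these providers all expose OpenAI-compatible endpoints, they can be called with one and the same request shape; only the base URL, API Key (SK), and model name differ. A minimal standard-library sketch (the Moonshot URL, placeholder key, and model below are examples only; substitute any provider's values from the sections that follow):

```python
import json
import urllib.request

# Example values only -- substitute the base URL, API Key (SK), and model
# name for whichever provider you use (Moonshot shown as a placeholder).
API_URL = "https://api.moonshot.cn/v1"
API_KEY = "sk-your-key-here"  # placeholder, not a real key
MODEL = "moonshot-v1-8k"

def build_request(text: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request (not yet sent)."""
    payload = {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "Translate the user's text into English."},
            {"role": "user", "content": text},
        ],
    }
    return urllib.request.Request(
        f"{API_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

def parse_response(raw: bytes) -> str:
    """Pull the model's reply out of an OpenAI-style response body."""
    return json.loads(raw)["choices"][0]["message"]["content"]

# With a valid key you would send it like this:
# with urllib.request.urlopen(build_request("你好")) as resp:
#     print(parse_response(resp.read()))
```

This is the same request the software itself issues when you fill in the endpoint, SK, and model fields described below.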
Moonshot AI Usage
- Go to Menu--Translation Settings--OpenAI ChatGPT API Settings.
- Enter https://api.moonshot.cn/v1 as the API endpoint address.
- Enter the API Key obtained from the Moonshot Open Platform in the SK field. You can obtain one at https://platform.moonshot.cn/console/api-keys
- Enter moonshot-v1-8k,moonshot-v1-32k,moonshot-v1-128k in the model text box.
- Select the desired model in the model dropdown, then test and save the settings.
DeepSeek AI Usage
- Go to Menu--Translation Settings--OpenAI ChatGPT API Settings.
- Enter https://api.deepseek.com/v1 as the API endpoint address.
- Enter the API Key obtained from the DeepSeek Open Platform in the SK field. You can obtain one at https://platform.deepseek.com/api_keys
- Enter deepseek-chat in the model text box.
- Select deepseek-chat in the model dropdown, then test and save the settings.
Zhipu AI BigModel Usage
- Go to Menu--Translation Settings--OpenAI ChatGPT API Settings.
- Enter https://open.bigmodel.cn/api/paas/v4/ as the API endpoint address.
- Enter the API Key obtained from the Zhipu AI Platform in the SK field. You can obtain one at https://www.bigmodel.cn/usercenter/apikeys
- Enter glm-4-plus,glm-4-0520,glm-4,glm-4-air,glm-4-airx,glm-4-long,glm-4-flashx,glm-4-flash in the model text box.
- Select the desired model in the model dropdown. You can choose the free model glm-4-flash. Then test and save the settings.
Baichuan AI Usage
- Go to Menu--Translation Settings--OpenAI ChatGPT API Settings.
- Enter https://api.baichuan-ai.com/v1 as the API endpoint address.
- Enter the API Key obtained from the Baichuan AI Platform in the SK field. You can obtain one at https://platform.baichuan-ai.com/console/apikey
- Enter Baichuan4,Baichuan3-Turbo,Baichuan3-Turbo-128k,Baichuan2-Turbo in the model text box.
- Select the desired model in the model dropdown, then test and save the settings.
Lingyi Wanwu (01.AI)
Official Website: https://lingyiwanwu.com
API Key Acquisition: https://platform.lingyiwanwu.com/apikeys
API URL: https://api.lingyiwanwu.com/v1
Available Model: yi-lightning
Alibaba Bailian
Alibaba Bailian is an AI model marketplace that provides Alibaba's own models as well as models from other vendors, including deepseek-r1.
Official Website: https://bailian.console.aliyun.com
API KEY (SK) Acquisition: https://bailian.console.aliyun.com/?apiKey=1#/api-key
API URL: https://dashscope.aliyuncs.com/compatible-mode/v1
Available Models: Numerous, see details at https://bailian.console.aliyun.com/#/model-market
Silicon Flow
Another AI marketplace similar to Alibaba Bailian, providing mainstream domestic models, including deepseek-r1.
Official Website: https://siliconflow.cn
API KEY (SK) Acquisition: https://cloud.siliconflow.cn/account/ak
API URL: https://api.siliconflow.cn/v1
Available Models: Numerous, see details at https://cloud.siliconflow.cn/models?types=chat
Note: Silicon Flow offers Qwen/Qwen2.5-7B-Instruct as a free model, which can be used at no cost.
ByteDance Volcano Engine Ark
An AI marketplace similar to Alibaba Bailian; in addition to the Doubao series models, it also offers some third-party models, including deepseek-r1.
Official Website: https://www.volcengine.com/product/ark
API KEY (SK) Acquisition: https://console.volcengine.com/ark/region:ark+cn-beijing/apiKey
API URL: https://ark.cn-beijing.volces.com/api/v3
MODELS: Numerous, see details at https://console.volcengine.com/ark/region:ark+cn-beijing/model?vendor=Bytedance&view=LIST_VIEW
Note: ByteDance Volcano Engine Ark's compatibility with the OpenAI SDK is a bit unusual. You cannot fill in a model name directly; you must first create an inference endpoint in the Ark console, select the model for that endpoint, and then enter the inference endpoint ID wherever the software asks for a model name. If this feels like too much trouble, you can skip this provider, since apart from a slightly lower price it has no particular advantage. See how to create an inference endpoint: https://www.volcengine.com/docs/82379/1099522
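Sketched concretely, the request body differs from other channels only in the model field (the endpoint ID below, ep-2024xxxx-xxxxx, is a made-up placeholder; a real one comes from your Ark console):

```python
import json

# Ark quirk: the "model" field carries an inference endpoint ID created in
# the console, not a model name. "ep-2024xxxx-xxxxx" is a made-up example.
payload = {
    "model": "ep-2024xxxx-xxxxx",
    "messages": [{"role": "user", "content": "Translate this line."}],
}
body = json.dumps(payload)
# POST this body to https://ark.cn-beijing.volces.com/api/v3/chat/completions
# with the usual "Authorization: Bearer <API KEY>" header.
```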
Important Notes:
Most AI translation channels limit the number of requests per minute. If an error saying the request frequency has been exceeded appears during use, click "Translation Channel↓" on the software's main interface and set the pause seconds to 10 in the pop-up window. The software will then wait 10 seconds after each translation before initiating the next request, i.e., at most 6 requests per minute, preventing the limit from being exceeded.
If the selected model is not capable enough (locally deployed models in particular are often small due to hardware limits), it may fail to return the translated text in the required format, and the results may contain many blank lines. In that case, try a larger model, or open Menu--Tools/Options--Advanced Options and uncheck "Send Complete Subtitle Content when using AI translation".
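The pause-between-requests behavior described in the first note amounts to a simple pacing loop. A minimal sketch, where translate_one stands in for whatever API call the channel actually makes:

```python
import time

def translate_paced(segments, translate_one, pause_seconds=10):
    """Translate segments one at a time, sleeping between requests so a
    per-minute limit is never exceeded (10 s pause -> at most 6/minute)."""
    results = []
    for i, seg in enumerate(segments):
        results.append(translate_one(seg))
        if i < len(segments) - 1:  # no need to wait after the last segment
            time.sleep(pause_seconds)
    return results

# Stand-in "translator" for illustration; a real one would call the API.
out = translate_paced(["hello", "world"], lambda s: s.upper(), pause_seconds=0)
# out == ["HELLO", "WORLD"]
```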
Deploying Tongyi Qianwen Large Model Locally Using Ollama
If you have some hands-on skills, you can also deploy a large language model locally and use it for translation. The following uses Tongyi Qianwen as an example to introduce the deployment and usage methods.
1. Download and Install Ollama
Open the website https://ollama.com/download and click Download. When the download is complete, double-click to open the installer and click Install to complete the installation.
After installation, a black or blue terminal window will pop up automatically. Type ollama run qwen and press Enter; this will automatically download the Tongyi Qianwen (Qwen) model.
Wait for the download to finish. No proxy is required, and the speed is quite fast.
The model runs automatically once the download completes. When the progress reaches 100% and "Success" is displayed, the model is running and the Tongyi Qianwen deployment is complete. You can start using it right away. Isn't it simple?
The default API endpoint address is http://localhost:11434
If the window is closed, how do you open it again? It's also simple: click the Start menu, find "Command Prompt" or "Windows PowerShell" (or press Win+Q and type cmd to search), open it, and enter ollama run qwen again.
2. Use Directly in the Console Command Window
As shown in the figure, once this interface is displayed you can type text directly into the window and start chatting.
3. Get a Friendlier UI
The console interface is not very friendly, so let's install a graphical client.
Open the website https://chatboxai.app/zh and click Download.
Double-click after downloading and wait for the interface window to open automatically.
Click "Start Settings". In the pop-up, click Model at the top, select "Ollama" as the AI model provider, fill in http://localhost:11434 as the API host, select Qwen:latest in the model dropdown, and save.
This is the interface displayed after saving. Use your imagination and use it freely.
4. Fill the API into the Video Translation and Dubbing Software
Open Menu--Settings--Compatible with OpenAI and Local Large Models. Append ,qwen to the middle text box (as shown below), then select the qwen model. Fill in http://localhost:11434/v1 as the API URL, and enter anything in the SK field, such as 1234. Test the connection; if it succeeds, save and start using it.
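Under the hood, the software sends Ollama the same OpenAI-style request as any other channel. A minimal sketch of that request (note the dummy key, which Ollama ignores):

```python
import json
import urllib.request

# Ollama's OpenAI-compatible endpoint; the key can be anything (e.g. 1234).
req = urllib.request.Request(
    "http://localhost:11434/v1/chat/completions",
    data=json.dumps({
        "model": "qwen",
        "messages": [{"role": "user", "content": "Hello"}],
    }).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer 1234",
    },
    method="POST",
)
# With Ollama running: urllib.request.urlopen(req) returns the completion.
```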
5. What Other Models Can Be Used?
In addition to Tongyi Qianwen, many other models can be used, and the method is just as simple: ollama run model-name.
Open https://ollama.com/library to see all available model names. To use a model, copy its name and execute ollama run model-name.
Remember how to open the command window? Click the Start menu and find Command Prompt or Windows PowerShell.
For example, to install the openchat model: open Command Prompt, enter ollama run openchat, press Enter, and wait until Success is displayed.