Merge pull request #658 from harry0703/dev

bump version to 1.2.6
Merge branch 'add-siliconflow-tts' into dev
2025-05-10 14:14:42 +08:00 · 2025-05-10 14:13:37 +08:00 · 2025-05-10 14:13:18 +08:00 · 2025-05-10 14:12:11 +08:00 · 2025-05-10 14:11:26 +08:00 · 2025-05-10 14:10:42 +08:00
65 changed files with 5296 additions and 1404 deletions
--- a/.github/ISSUE_TEMPLATE/bug_report.yml
+++ b/.github/ISSUE_TEMPLATE/bug_report.yml
@@ -0,0 +1,81 @@
 name: 🐛 Bug
 description: 出现错误或未按预期工作
 title: "请在此处填写标题"
 labels:
  - bug
 body:
  - type: markdown
    attributes:
      value: |
        **在提交此问题之前，请确保您已阅读以下文档：[Getting Started (英文)](https://github.com/harry0703/MoneyPrinterTurbo/blob/main/README-en.md#system-requirements-) 或 [快速开始 (中文)](https://github.com/harry0703/MoneyPrinterTurbo/blob/main/README.md#%E5%BF%AB%E9%80%9F%E5%BC%80%E5%A7%8B-)。**
        **请填写以下信息：**
  - type: checkboxes
    attributes:
      label: 是否已存在类似问题？
      description: |
        请务必检查此问题是否已有用户反馈。
        在提交新问题前，使用 GitHub 的问题搜索框（包括已关闭的问题）或通过 Google、StackOverflow 等工具搜索，确认该问题是否重复。
        您可能已经可以找到解决问题的方法！
      options:
        - label: 我已搜索现有问题
          required: true
  - type: textarea
    attributes:
      label: 当前行为
      description: 描述您当前遇到的情况。
      placeholder: |
        MoneyPrinterTurbo 未按预期工作。当我执行某个操作时，视频未成功生成/程序报错了...
    validations:
      required: true
  - type: textarea
    attributes:
      label: 预期行为
      description: 描述您期望发生的情况。
      placeholder: |
        当我执行某个操作时，程序应当...
    validations:
      required: true
  - type: textarea
    attributes:
      label: 重现步骤
      description: 描述重现问题的步骤。描述的越详细，越有助于定位和修复问题。
    validations:
      required: true
  - type: textarea
    attributes:
      label: 堆栈追踪/日志
      description: |
        如果您有任何堆栈追踪或日志，请将它们粘贴在此处。（注意不要包含敏感信息）
    validations:
      required: true
  - type: input
    attributes:
      label: Python 版本
      description: 您遇到此问题时使用的 Python 版本。
      placeholder: v3.13.0, v3.10.0 等
    validations:
      required: true
  - type: input
    attributes:
      label: 操作系统
      description: 您使用 MoneyPrinterTurbo 遇到问题时的操作系统信息。
      placeholder: macOS 14.1, Windows 11 等
    validations:
      required: true
  - type: input
    attributes:
      label: MoneyPrinterTurbo 版本
      description: 您在哪个版本的 MoneyPrinterTurbo 中遇到了此问题？
      placeholder: v1.2.2 等
    validations:
      required: true
  - type: textarea
    attributes:
      label: 其他信息
      description: 您还有什么其他信息想补充吗？例如问题的截图或视频记录。
    validations:
      required: false
--- a/.github/ISSUE_TEMPLATE/config.yml
+++ b/.github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1 @@
 blank_issues_enabled: false
--- a/.github/ISSUE_TEMPLATE/feature_request.yml
+++ b/.github/ISSUE_TEMPLATE/feature_request.yml
@@ -0,0 +1,38 @@
 name: ✨ 增加功能
 description: 为此项目提出一个新想法
 title: "请在此处填写标题"
 labels:
  - enhancement
 body:
  - type: checkboxes
    attributes:
      label: 是否已存在类似的功能请求？
      description: 请确保此功能请求是否重复。
      options:
        - label: 我已搜索现有的功能请求
          required: true
  - type: textarea
    attributes:
      label: 痛点
      description: 请解释您的功能请求。
      placeholder: 我希望可以实现这一点
    validations:
      required: true
  - type: textarea
    attributes:
      label: 建议的解决方案
      description: 请描述您能想到的解决方案。
      placeholder: 您可以添加这个功能 / 更改这个流程 / 使用某种方法
    validations:
      required: true
  - type: textarea
    attributes:
      label: 有用的资源
      description: 请提供一些有助于实现您建议的资源。
  - type: textarea
    attributes:
      label: 其他信息
      description: 您还有什么其他想补充的信息吗？例如问题的截图或视频记录。
    validations:
      required: false
--- a/.gitignore
+++ b/.gitignore
@@ -20,3 +20,9 @@ node_modules
 /sites/docs/.vuepress/.cache
 # Static files directory generated by the default VuePress build
 /sites/docs/.vuepress/dist
 # Model directory
 /models/
 ./models/*
 venv/
 .venv
--- a/.pdm-python
+++ b/.pdm-python
@@ -0,0 +1 @@
 ./MoneyPrinterTurbo/.venv/bin/python
--- a/4
+++ b/4
@@ -1,5 +1,5 @@
 # Use an official Python runtime as a parent image
-FROM python:3.10-slim-bullseye
+FROM python:3.11-slim-bullseye
 # Set the working directory in the container
 WORKDIR /MoneyPrinterTurbo
@@ -41,4 +41,4 @@ CMD ["streamlit", "run", "./webui/Main.py","--browser.serverAddress=127.0.0.1","
 ## For Linux or MacOS:
 # docker run -v $(pwd)/config.toml:/MoneyPrinterTurbo/config.toml -v $(pwd)/storage:/MoneyPrinterTurbo/storage -p 8501:8501 moneyprinterturbo
 ## For Windows:
-# docker run -v %cd%/config.toml:/MoneyPrinterTurbo/config.toml -v %cd%/storage:/MoneyPrinterTurbo/storage -p 8501:8501 moneyprinterturbo
+# docker run -v ${PWD}/config.toml:/MoneyPrinterTurbo/config.toml -v ${PWD}/storage:/MoneyPrinterTurbo/storage -p 8501:8501 moneyprinterturbo
--- a/README-en.md
+++ b/README-en.md
@@ -35,9 +35,18 @@ like to express our special thanks to
 **RecCloud (AI-Powered Multimedia Service Platform)** for providing a free `AI Video Generator` service based on this
 project. It allows for online use without deployment, which is very convenient.
-https://reccloud.com
+- Chinese version: https://reccloud.cn
 - English version: https://reccloud.com
-![](docs/reccloud.com.jpg)
+![](docs/reccloud.cn.jpg)
 ## Thanks for Sponsorship 🙏
 Thanks to Picwish https://picwish.cn for supporting and sponsoring this project, enabling continuous updates and maintenance.
 Picwish focuses on the **image processing field**, providing a rich set of **image processing tools** that greatly simplify complex operations, truly making image processing easier.
 ![picwish.jpg](docs/picwish.jpg)
 ## Features 🎯
@@ -51,28 +60,26 @@ https://reccloud.com
  satisfactory one
 - [x] Supports setting the **duration of video clips**, facilitating adjustments to material switching frequency
 - [x] Supports video copy in both **Chinese** and **English**
- [x] Supports **multiple voice** synthesis
+- [x] Supports **multiple voice** synthesis, with **real-time preview** of effects
 - [x] Supports **subtitle generation**, with adjustable `font`, `position`, `color`, `size`, and also
  supports `subtitle outlining`
 - [x] Supports **background music**, either random or specified music files, with adjustable `background music volume`
- [x] Video material sources are **high-definition** and **royalty-free**
+- [x] Video material sources are **high-definition** and **royalty-free**, and you can also use your own **local materials**
- [x] Supports integration with various models such as **OpenAI**, **moonshot**, **Azure**, **gpt4free**, **one-api**,
+- [x] Supports integration with various models such as **OpenAI**, **Moonshot**, **Azure**, **gpt4free**, **one-api**,
-  **qianwen**, **Google Gemini**, **Ollama** and more
+  **Qwen**, **Google Gemini**, **Ollama**, **DeepSeek**, **ERNIE** and more
    - For users in China, it is recommended to use **DeepSeek** or **Moonshot** as the large model provider (directly accessible in China, no VPN needed. Free credits upon registration, generally sufficient for use)
 ❓[How to Use the Free OpenAI GPT-3.5 Model?](https://github.com/harry0703/MoneyPrinterTurbo/blob/main/README-en.md#common-questions-)
 ### Future Plans 📅
- [ ] Introduce support for GPT-SoVITS dubbing
+- [ ] GPT-SoVITS dubbing support
- [ ] Enhance voice synthesis with large models for a more natural and emotionally resonant voice output
+- [ ] Optimize voice synthesis using large models for more natural and emotionally rich voice output
- [ ] Incorporate video transition effects to ensure a smoother viewing experience
+- [ ] Add video transition effects for a smoother viewing experience
- [ ] Improve the relevance of video content
+- [ ] Add more video material sources, improve the matching between video materials and script
- [ ] Add options for video length: short, medium, long
+- [ ] Add video length options: short, medium, long
- [ ] Package the application into a one-click launch bundle for Windows and macOS for ease of use
+- [ ] Support more voice synthesis providers, such as OpenAI TTS
- [ ] Enable the use of custom materials
+- [ ] Automate upload to YouTube platform
 - [ ] Offer voiceover and background music options with real-time preview
 - [ ] Support a wider range of voice synthesis providers, such as OpenAI TTS, Azure TTS
 - [ ] Automate the upload process to the YouTube platform
 ## Video Demos 📺
@@ -115,10 +122,27 @@ https://reccloud.com
 - Recommended minimum 4 CPU cores or more, 8G of memory or more, GPU is not required
 - Windows 10 or MacOS 11.0, and their later versions
 ## Quick Start 🚀
 Download the one-click startup package, extract and use directly (the path should not contain **Chinese characters**, **special characters**, or **spaces**)
 ### Windows
 - Baidu Netdisk (1.2.1 latest version): https://pan.baidu.com/s/1pSNjxTYiVENulTLm6zieMQ?pwd=g36q Extraction code: g36q
 After downloading, it is recommended to **double-click** `update.bat` first to update to the **latest code**, then double-click `start.bat` to launch
 After launching, the browser will open automatically (if it opens blank, it is recommended to use **Chrome** or **Edge**)
 ### Other Systems
 One-click startup packages have not been created yet. See the **Installation & Deployment** section below. It is recommended to use **docker** for deployment, which is more convenient.
 ## Installation & Deployment 📥
 ### Prerequisites
 - Try to avoid using **Chinese paths** to prevent unpredictable issues
- Ensure your **network** is stable, meaning you can access foreign websites normally
+- Ensure your **network** is stable, VPN needs to be in `global traffic` mode
 #### ① Clone the Project
@@ -132,11 +156,6 @@ git clone https://github.com/harry0703/MoneyPrinterTurbo.git
 - Follow the instructions in the `config.toml` file to configure `pexels_api_keys` and `llm_provider`, and according to
  the llm_provider's service provider, set up the corresponding API Key
 #### ③ Configure Large Language Models (LLM)
 - To use `GPT-4.0` or `GPT-3.5`, you need an `API Key` from `OpenAI`. If you don't have one, you can set `llm_provider`
  to `g4f` (a free-to-use GPT library https://github.com/xtekky/gpt4free)
 ### Docker Deployment 🐳
 #### ① Launch the Docker Container
@@ -152,6 +171,8 @@ cd MoneyPrinterTurbo
 docker-compose up
 ```
 > Note: The latest version of docker automatically installs docker compose as a plug-in, and the start command is adjusted to `docker compose up`
 #### ② Access the Web Interface
 Open your browser and visit http://0.0.0.0:8501
@@ -162,27 +183,28 @@ Open your browser and visit http://0.0.0.0:8080/docs Or http://0.0.0.0:8080/redo
 ### Manual Deployment 📦
-#### ① Create a Python Virtual Environment
+> Video tutorials
 >
 > - Complete usage demonstration: https://v.douyin.com/iFhnwsKY/
 > - How to deploy on Windows: https://v.douyin.com/iFyjoW3M
-It is recommended to create a Python virtual environment
+#### ① Install Dependencies
-using [conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html)
+
 It is recommended to use [pdm](https://pdm-project.org/en/latest/#installation)
 ```shell
 git clone https://github.com/harry0703/MoneyPrinterTurbo.git
 cd MoneyPrinterTurbo
-conda create -n MoneyPrinterTurbo python=3.10
+pdm sync
 conda activate MoneyPrinterTurbo
 pip install -r requirements.txt
 ```
 #### ② Install ImageMagick
 ###### Windows:
- Download https://imagemagick.org/archive/binaries/ImageMagick-7.1.1-29-Q16-x64-static.exe
+- Download https://imagemagick.org/script/download.php Choose the Windows version, make sure to select the **static library** version, such as ImageMagick-7.1.1-32-Q16-x64-**static**.exe
 - Install the downloaded ImageMagick, **do not change the installation path**
- Modify the `config.toml` configuration file, set `imagemagick_path` to your actual installation path (if you didn't
+- Modify the `config.toml` configuration file, set `imagemagick_path` to your actual installation path
  change the path during installation, just uncomment it)
 ###### MacOS:
@@ -209,14 +231,12 @@ Note that you need to execute the following commands in the `root directory` of
 ###### Windows
 ```bat
 conda activate MoneyPrinterTurbo
 webui.bat
 ```
 ###### MacOS or Linux
 ```shell
 conda activate MoneyPrinterTurbo
 sh webui.sh
 ```
@@ -235,13 +255,15 @@ online for a quick experience.
 A list of all supported voices can be viewed here: [Voice List](./docs/voice-list.txt)
 2024-04-16 v1.1.2 Added 9 new Azure voice synthesis voices that require API KEY configuration. These voices sound more realistic.
 ## Subtitle Generation 📜
 Currently, there are 2 ways to generate subtitles:
- edge: Faster generation speed, better performance, no specific requirements for computer configuration, but the
+- **edge**: Faster generation speed, better performance, no specific requirements for computer configuration, but the
  quality may be unstable
- whisper: Slower generation speed, poorer performance, specific requirements for computer configuration, but more
+- **whisper**: Slower generation speed, poorer performance, specific requirements for computer configuration, but more
  reliable quality
 You can switch between them by modifying the `subtitle_provider` in the `config.toml` configuration file
@@ -250,18 +272,22 @@ It is recommended to use `edge` mode, and switch to `whisper` mode if the qualit
 satisfactory.
 > Note:
-> If left blank, it means no subtitles will be generated.
+>
 > 1. In whisper mode, you need to download a model file from HuggingFace, about 3GB in size, please ensure good internet connectivity
 > 2. If left blank, it means no subtitles will be generated.
-**Download whisper**
+> Since HuggingFace is not accessible in China, you can use the following methods to download the `whisper-large-v3` model file
 - Please ensure a good internet connectivity
 - `whisper` model can be downloaded from HuggingFace: https://huggingface.co/openai/whisper-large-v3/tree/main
-After downloading the model to local machine, copy the whole folder and put it into the following path: `.\MoneyPrinterTurbo\models`
+Download links:
-This is what the final path should look like: `.\MoneyPrinterTurbo\models\whisper-large-v3`
+- Baidu Netdisk: https://pan.baidu.com/s/11h3Q6tsDtjQKTjUu3sc5cA?pwd=xjs9
 - Quark Netdisk: https://pan.quark.cn/s/3ee3d991d64b
 After downloading the model, extract it and place the entire directory in `.\MoneyPrinterTurbo\models`,
 The final file path should look like this: `.\MoneyPrinterTurbo\models\whisper-large-v3`
 ```
-MoneyPrinterTurbo  
+MoneyPrinterTurbo
  ├─models
  │   └─whisper-large-v3
  │          config.json
@@ -302,6 +328,16 @@ Once successfully started, modify the `config.toml` configuration as follows:
 - Change `openai_base_url` to `http://localhost:3040/v1/`
 - Set `openai_model_name` to `gpt-3.5-turbo`
 > Note: This method may be unstable
 ### ❓AttributeError: 'str' object has no attribute 'choices'
 This issue is caused by the large language model not returning a correct response.
 It's likely a network issue. Use a **VPN**, or set `openai_base_url` to your proxy, which should solve the problem.
 At the same time, it is recommended to use **Moonshot** or **DeepSeek** as the large model provider, as these service providers have faster access and are more stable in China.
 ### ❓RuntimeError: No ffmpeg exe could be found
 Normally, ffmpeg will be automatically downloaded and detected.
@@ -326,14 +362,14 @@ ffmpeg_path = "C:\\Users\\harry\\Downloads\\ffmpeg.exe"
 [issue 56](https://github.com/harry0703/MoneyPrinterTurbo/issues/56)
 ```
-failed to generate audio, maybe the network is not available. 
+failed to generate audio, maybe the network is not available.
 if you are in China, please use a VPN.
 ```
 [issue 44](https://github.com/harry0703/MoneyPrinterTurbo/issues/44)
 ```
-failed to download videos, maybe the network is not available. 
+failed to download videos, maybe the network is not available.
 if you are in China, please use a VPN.
 ```
@@ -353,6 +389,43 @@ For Linux systems, you can manually install it, refer to https://cn.linux-consol
 Thanks to [@wangwenqiao666](https://github.com/wangwenqiao666) for their research and exploration
 ### ❓ImageMagick's security policy prevents operations related to temporary file @/tmp/tmpur5hyyto.txt
 You can find these policies in ImageMagick's configuration file policy.xml.
 This file is usually located in /etc/ImageMagick-`X`/ or a similar location in the ImageMagick installation directory.
 Modify the entry containing `pattern="@"`, change `rights="none"` to `rights="read|write"` to allow read and write operations on files.
 ### ❓OSError: [Errno 24] Too many open files
 This issue is caused by the system's limit on the number of open files. You can solve it by modifying the system's file open limit.
 Check the current limit:
 ```shell
 ulimit -n
 ```
 If it's too low, you can increase it, for example:
 ```shell
 ulimit -n 10240
 ```
 ### ❓Whisper model download failed, with the following error
 LocalEntryNotFoundError: Cannot find an appropriate cached snapshot folder for the specified revision on the local disk and
 outgoing traffic has been disabled.
 To enable repo look-ups and downloads online, pass 'local_files_only=False' as input.
 or
 An error occured while synchronizing the model Systran/faster-whisper-large-v3 from the Hugging Face Hub:
 An error happened while trying to locate the files on the Hub and we cannot find the appropriate snapshot folder for the
 specified revision on the local disk. Please check your internet connection and try again.
 Trying to load the model directly from the local cache, if it exists.
 Solution: [Click to see how to manually download the model from netdisk](#subtitle-generation-)
 ## Feedback & Suggestions 📢
 - You can submit an [issue](https://github.com/harry0703/MoneyPrinterTurbo/issues) or
--- a/README.md
+++ b/README.md
@@ -59,7 +59,7 @@
 - [x] 支持 **背景音乐**，随机或者指定音乐文件，可设置`背景音乐音量`
 - [x] 视频素材来源 **高清**，而且 **无版权**，也可以使用自己的 **本地素材**
 - [x] 支持 **OpenAI**、**Moonshot**、**Azure**、**gpt4free**、**one-api**、**通义千问**、**Google Gemini**、**Ollama**、
-  **DeepSeek** 等多种模型接入
+  **DeepSeek**、 **文心一言** 等多种模型接入
    - 中国用户建议使用 **DeepSeek** 或 **Moonshot** 作为大模型提供商（国内可直接访问，不需要VPN。注册就送额度，基本够用）
 ### 后期计划 📅
@@ -72,10 +72,6 @@
 - [ ] 支持更多的语音合成服务商，比如 OpenAI TTS
 - [ ] 自动上传到YouTube平台
 ## 交流讨论 💬
 <img src="docs/wechat-group.jpg" width="250">
 ## 视频演示 📺
 ### 竖屏 9:16
@@ -121,20 +117,15 @@
 ## 快速开始 🚀
-下载一键启动包，解压直接使用（路径不要有 **中文** 和 **空格**）
+下载一键启动包，解压直接使用（路径不要有 **中文**、**特殊字符**、**空格**）
 ### Windows
-
+- 百度网盘（1.2.1 老版本）: https://pan.baidu.com/s/1pSNjxTYiVENulTLm6zieMQ?pwd=g36q 提取码: g36q
 - 百度网盘: https://pan.baidu.com/s/1jKF1mgsjfN8fBk6uTEHArQ?pwd=jrp7 提取码: jrp7
 下载后，建议先**双击执行** `update.bat` 更新到**最新代码**，然后双击 `start.bat` 启动
 启动后，会自动打开浏览器（如果打开是空白，建议换成 **Chrome** 或者 **Edge** 打开）
 ### 其他系统
 还没有制作一键启动包，看下面的 **安装部署** 部分，建议使用 **docker** 部署，更加方便。
 ## 安装部署 📥
 ### 前提条件
@@ -148,7 +139,7 @@
 git clone https://github.com/harry0703/MoneyPrinterTurbo.git
 ```
-#### ② 修改配置文件
+#### ② 修改配置文件（可选，建议启动后也可以在 WebUI 里面配置）
 - 将 `config.example.toml` 文件复制一份，命名为 `config.toml`
 - 按照 `config.toml` 文件中的说明，配置好 `pexels_api_keys` 和 `llm_provider`，并根据 llm_provider 对应的服务商，配置相关的
@@ -170,6 +161,8 @@ cd MoneyPrinterTurbo
 docker-compose up
 ```
 > 注意：最新版的docker安装时会自动以插件的形式安装docker compose，启动命令调整为docker compose up
 #### ② 访问Web界面
 打开浏览器，访问 http://0.0.0.0:8501
@@ -185,16 +178,14 @@ docker-compose up
 - 完整的使用演示：https://v.douyin.com/iFhnwsKY/
 - 如何在Windows上部署：https://v.douyin.com/iFyjoW3M
-#### ① 创建虚拟环境
+#### ① 依赖安装
-建议使用 [conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html) 创建 python 虚拟环境
+建议使用 [pdm](https://pdm-project.org/en/latest/#installation)
 ```shell
 git clone https://github.com/harry0703/MoneyPrinterTurbo.git
 cd MoneyPrinterTurbo
-conda create -n MoneyPrinterTurbo python=3.10
+pdm sync
 conda activate MoneyPrinterTurbo
 pip install -r requirements.txt
 ```
 #### ② 安装好 ImageMagick
@@ -225,14 +216,12 @@ pip install -r requirements.txt
 ###### Windows
 ```bat
 conda activate MoneyPrinterTurbo
 webui.bat
 ```
 ###### MacOS or Linux
 ```shell
 conda activate MoneyPrinterTurbo
 sh webui.sh
 ```
@@ -300,33 +289,6 @@ MoneyPrinterTurbo
 ## 常见问题 🤔
 ### ❓如何使用免费的OpenAI GPT-3.5模型?
 [OpenAI宣布ChatGPT里面3.5已经免费了](https://openai.com/blog/start-using-chatgpt-instantly)，有开发者将其封装成了API，可以直接调用
 **确保你安装和启动了docker服务**，执行以下命令启动docker服务
 ```shell
 docker run -p 3040:3040 missuo/freegpt35
 ```
 启动成功后，修改 `config.toml` 中的配置
 - `llm_provider` 设置为 `openai`
 - `openai_api_key` 随便填写一个即可，比如 '123456'
 - `openai_base_url` 改为 `http://localhost:3040/v1/`
 - `openai_model_name` 改为 `gpt-3.5-turbo`
 > 注意：该方式稳定性较差
 ### ❓AttributeError: 'str' object has no attribute 'choices'`
 这个问题是由于大模型没有返回正确的回复导致的。
 大概率是网络原因， 使用 **VPN**，或者设置 `openai_base_url` 为你的代理 ，应该就可以解决了。
 同时建议使用 **Moonshot** 或 **DeepSeek** 作为大模型提供商，这两个服务商在国内访问速度更快，更加稳定。
 ### ❓RuntimeError: No ffmpeg exe could be found
 通常情况下，ffmpeg 会被自动下载，并且会被自动检测到。
--- a/app/asgi.py
+++ b/app/asgi.py
@@ -1,12 +1,13 @@
 """Application implementation - ASGI."""
 import os
 from fastapi import FastAPI, Request
 from fastapi.exceptions import RequestValidationError
 from fastapi.responses import JSONResponse
 from loguru import logger
 from fastapi.staticfiles import StaticFiles
 from fastapi.middleware.cors import CORSMiddleware
 from fastapi.responses import JSONResponse
 from fastapi.staticfiles import StaticFiles
 from loguru import logger
 from app.config import config
 from app.models.exception import HttpException
@@ -24,7 +25,9 @@ def exception_handler(request: Request, e: HttpException):
 def validation_exception_handler(request: Request, e: RequestValidationError):
    return JSONResponse(
        status_code=400,
-        content=utils.get_response(status=400, data=e.errors(), message='field required'),
+        content=utils.get_response(
            status=400, data=e.errors(), message="field required"
        ),
    )
@@ -61,7 +64,9 @@ app.add_middleware(
 )
 task_dir = utils.task_dir()
-app.mount("/tasks", StaticFiles(directory=task_dir, html=True, follow_symlink=True), name="")
+app.mount(
    "/tasks", StaticFiles(directory=task_dir, html=True, follow_symlink=True), name=""
 )
 public_dir = utils.public_dir()
 app.mount("/", StaticFiles(directory=public_dir, html=True), name="")
--- a/app/config/init.py
+++ b/app/config/init.py
@@ -10,7 +10,9 @@ from app.utils import utils
 def __init_logger():
    # _log_file = utils.storage_dir("logs/server.log")
    _lvl = config.log_level
-    root_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
+    root_dir = os.path.dirname(
        os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
    )
    def format_record(record):
         # get the full path of the file from the log record
@@ -21,10 +23,13 @@ def __init_logger():
        record["file"].path = f"./{relative_path}"
         # return the modified format string
         # you can adjust the format here as needed
-        _format = '<green>{time:%Y-%m-%d %H:%M:%S}</> | ' + \
+        _format = (
-                  '<level>{level}</> | ' + \
+            "<green>{time:%Y-%m-%d %H:%M:%S}</> | "
-                  '"{file.path}:{line}":<blue> {function}</> ' + \
+            + "<level>{level}</> | "
-                  '- <level>{message}</>' + "\n"
+            + '"{file.path}:{line}":<blue> {function}</> '
            + "- <level>{message}</>"
            + "\n"
        )
        return _format
    logger.remove()
--- a/app/config/config.py
+++ b/app/config/config.py
@@ -1,7 +1,8 @@
 import os
 import socket
 import toml
 import shutil
 import socket
 import toml
 from loguru import logger
 root_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
@@ -17,7 +18,7 @@ def load_config():
        example_file = f"{root_dir}/config.example.toml"
        if os.path.isfile(example_file):
            shutil.copyfile(example_file, config_file)
-            logger.info(f"copy config.example.toml to config.toml")
+            logger.info("copy config.example.toml to config.toml")
    logger.info(f"load config from file: {config_file}")
@@ -25,7 +26,7 @@ def load_config():
        _config_ = toml.load(config_file)
    except Exception as e:
        logger.warning(f"load config failed: {str(e)}, try to load as utf-8-sig")
-        with open(config_file, mode="r", encoding='utf-8-sig') as fp:
+        with open(config_file, mode="r", encoding="utf-8-sig") as fp:
            _cfg_content = fp.read()
            _config_ = toml.loads(_cfg_content)
    return _config_
@@ -35,6 +36,7 @@ def save_config():
    with open(config_file, "w", encoding="utf-8") as f:
        _cfg["app"] = app
        _cfg["azure"] = azure
        _cfg["siliconflow"] = siliconflow
        _cfg["ui"] = ui
        f.write(toml.dumps(_cfg))
@@ -44,7 +46,13 @@ app = _cfg.get("app", {})
 whisper = _cfg.get("whisper", {})
 proxy = _cfg.get("proxy", {})
 azure = _cfg.get("azure", {})
-ui = _cfg.get("ui", {})
+siliconflow = _cfg.get("siliconflow", {})
 ui = _cfg.get(
    "ui",
    {
        "hide_log": False,
    },
 )
 hostname = socket.gethostname()
@@ -52,9 +60,11 @@ log_level = _cfg.get("log_level", "DEBUG")
 listen_host = _cfg.get("listen_host", "0.0.0.0")
 listen_port = _cfg.get("listen_port", 8080)
 project_name = _cfg.get("project_name", "MoneyPrinterTurbo")
-project_description = _cfg.get("project_description",
+project_description = _cfg.get(
-                               "<a href='https://github.com/harry0703/MoneyPrinterTurbo'>https://github.com/harry0703/MoneyPrinterTurbo</a>")
+    "project_description",
-project_version = _cfg.get("project_version", "1.1.9")
+    "<a href='https://github.com/harry0703/MoneyPrinterTurbo'>https://github.com/harry0703/MoneyPrinterTurbo</a>",
 )
 project_version = _cfg.get("project_version", "1.2.6")
 reload_debug = False
 imagemagick_path = app.get("imagemagick_path", "")
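A note on the new `ui` default in `config.py`: `dict.get` returns its fallback only when the key is missing entirely, so `{"hide_log": False}` applies when `config.toml` has no `[ui]` table, not when the table exists without a `hide_log` key. A minimal sketch of that behavior, using plain dicts in place of the parsed TOML. The helpers `load_ui` and `load_ui_merged` and the `language` key are hypothetical and not part of the PR.

```python
from typing import Any, Dict

UI_DEFAULTS: Dict[str, Any] = {"hide_log": False}


def load_ui(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Mirror the diff: use the defaults only if the "ui" key is absent."""
    return cfg.get("ui", dict(UI_DEFAULTS))


def load_ui_merged(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical alternative: layer the "ui" table over the defaults."""
    return {**UI_DEFAULTS, **cfg.get("ui", {})}


no_section: Dict[str, Any] = {}
partial_section = {"ui": {"language": "en-US"}}  # hypothetical key

print(load_ui(no_section))              # {'hide_log': False}
print(load_ui(partial_section))         # {'language': 'en-US'}
print(load_ui_merged(partial_section))  # {'hide_log': False, 'language': 'en-US'}
```

If callers read `ui["hide_log"]` directly, the merged variant avoids a `KeyError` for configs that already have a `[ui]` table; whether that matters depends on how the WebUI reads the value, which this diff does not show.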
--- a/app/controllers/base.py
+++ b/app/controllers/base.py
@@ -7,14 +7,14 @@ from app.models.exception import HttpException
 def get_task_id(request: Request):
-    task_id = request.headers.get('x-task-id')
+    task_id = request.headers.get("x-task-id")
    if not task_id:
        task_id = uuid4()
    return str(task_id)
 def get_api_key(request: Request):
-    api_key = request.headers.get('x-api-key')
+    api_key = request.headers.get("x-api-key")
    return api_key
@@ -23,5 +23,9 @@ def verify_token(request: Request):
    if token != config.app.get("api_key", ""):
        request_id = get_task_id(request)
        request_url = request.url
-        user_agent = request.headers.get('user-agent')
+        user_agent = request.headers.get("user-agent")
-        raise HttpException(task_id=request_id, status_code=401, message=f"invalid token: {request_url}, {user_agent}")
+        raise HttpException(
            task_id=request_id,
            status_code=401,
            message=f"invalid token: {request_url}, {user_agent}",
        )
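The header handling in `get_task_id` and `verify_token` can be sketched without FastAPI. This is a standalone illustration under stated assumptions: headers are a plain `dict` with lower-case keys (Starlette's real `Headers` object is case-insensitive), `EXPECTED_KEY` stands in for `config.app.get("api_key", "")`, and `PermissionError` stands in for `HttpException`. Where the token is read from is not visible in this hunk, so the sketch assumes the `x-api-key` header returned by `get_api_key`.

```python
from typing import Mapping
from uuid import uuid4

EXPECTED_KEY = "secret"  # hypothetical; stands in for config.app.get("api_key", "")


def get_task_id(headers: Mapping[str, str]) -> str:
    """Return the caller-supplied task id, or a fresh UUID4 as a string."""
    task_id = headers.get("x-task-id")
    if not task_id:
        task_id = uuid4()
    return str(task_id)


def verify_token(headers: Mapping[str, str]) -> None:
    """Raise if the x-api-key header does not match the configured key."""
    token = headers.get("x-api-key")  # assumption: the token comes from this header
    if token != EXPECTED_KEY:
        raise PermissionError(
            f"invalid token (task_id={get_task_id(headers)}, "
            f"user_agent={headers.get('user-agent')})"
        )
```

Calling `verify_token({"x-api-key": "secret"})` returns silently, while any other value raises with the task id and user agent in the message, matching the shape of the 401 raised in the diff.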
--- a/app/controllers/manager/base_manager.py
+++ b/app/controllers/manager/base_manager.py
@@ -1,5 +1,5 @@
 import threading
-from typing import Callable, Any, Dict
+from typing import Any, Callable, Dict
 class TaskManager:
@@ -18,28 +18,35 @@ class TaskManager:
                print(f"add task: {func.__name__}, current_tasks: {self.current_tasks}")
                self.execute_task(func, *args, **kwargs)
            else:
-                print(f"enqueue task: {func.__name__}, current_tasks: {self.current_tasks}")
+                print(
                    f"enqueue task: {func.__name__}, current_tasks: {self.current_tasks}"
                )
                self.enqueue({"func": func, "args": args, "kwargs": kwargs})
    def execute_task(self, func: Callable, *args: Any, **kwargs: Any):
-        thread = threading.Thread(target=self.run_task, args=(func, *args), kwargs=kwargs)
+        thread = threading.Thread(
            target=self.run_task, args=(func, *args), kwargs=kwargs
        )
        thread.start()
    def run_task(self, func: Callable, *args: Any, **kwargs: Any):
        try:
            with self.lock:
                self.current_tasks += 1
-            func(*args, **kwargs)  # 在这里调用函数，传递*args和**kwargs
+            func(*args, **kwargs)  # call the function here, passing *args and **kwargs.
        finally:
            self.task_done()
    def check_queue(self):
        with self.lock:
-            if self.current_tasks < self.max_concurrent_tasks and not self.is_queue_empty():
+            if (
                self.current_tasks < self.max_concurrent_tasks
                and not self.is_queue_empty()
            ):
                task_info = self.dequeue()
-                func = task_info['func']
+                func = task_info["func"]
-                args = task_info.get('args', ())
+                args = task_info.get("args", ())
-                kwargs = task_info.get('kwargs', {})
+                kwargs = task_info.get("kwargs", {})
                self.execute_task(func, *args, **kwargs)
    def task_done(self):
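The control flow in `TaskManager` (run immediately while below `max_concurrent_tasks`, otherwise enqueue, then pull from the queue as tasks finish) can be illustrated with an in-memory queue. The sketch below is not the PR's code: `SketchTaskManager` is a hypothetical class, it uses a `collections.deque` where the real subclasses supply `enqueue`/`dequeue`, and it reserves the slot before the thread starts rather than incrementing the counter inside `run_task`.

```python
import threading
from collections import deque
from typing import Any, Callable, Deque, Dict, Optional


class SketchTaskManager:
    """Hypothetical bounded-concurrency runner with an in-memory queue."""

    def __init__(self, max_concurrent_tasks: int) -> None:
        self.max_concurrent_tasks = max_concurrent_tasks
        self.current_tasks = 0
        self.lock = threading.Lock()
        self.queue: Deque[Dict[str, Any]] = deque()

    def add_task(self, func: Callable, *args: Any, **kwargs: Any) -> None:
        with self.lock:
            start_now = self.current_tasks < self.max_concurrent_tasks
            if start_now:
                self.current_tasks += 1  # reserve the slot while holding the lock
            else:
                self.queue.append({"func": func, "args": args, "kwargs": kwargs})
        if start_now:
            self._start(func, *args, **kwargs)

    def _start(self, func: Callable, *args: Any, **kwargs: Any) -> None:
        thread = threading.Thread(
            target=self._run, args=(func, *args), kwargs=kwargs
        )
        thread.start()

    def _run(self, func: Callable, *args: Any, **kwargs: Any) -> None:
        try:
            func(*args, **kwargs)
        finally:
            self._task_done()  # always release the slot, even on error

    def _task_done(self) -> None:
        next_task: Optional[Dict[str, Any]] = None
        with self.lock:
            self.current_tasks -= 1
            if self.queue and self.current_tasks < self.max_concurrent_tasks:
                next_task = self.queue.popleft()
                self.current_tasks += 1
        if next_task is not None:
            self._start(
                next_task["func"], *next_task["args"], **next_task["kwargs"]
            )
```

Reserving the slot inside `add_task` keeps the "is there room?" check and the counter update under one lock acquisition, so two near-simultaneous submissions cannot both pass the check for the last free slot.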
--- a/app/controllers/manager/redis_manager.py
+++ b/app/controllers/manager/redis_manager.py
@@ -8,7 +8,7 @@ from app.models.schema import VideoParams
 from app.services import task as tm
 FUNC_MAP = {
-    'start': tm.start,
+    "start": tm.start,
    # 'start_test': tm.start_test
 }
@@ -24,11 +24,15 @@ class RedisTaskManager(TaskManager):
    def enqueue(self, task: Dict):
        task_with_serializable_params = task.copy()
-        if 'params' in task['kwargs'] and isinstance(task['kwargs']['params'], VideoParams):
+        if "params" in task["kwargs"] and isinstance(
-            task_with_serializable_params['kwargs']['params'] = task['kwargs']['params'].dict()
+            task["kwargs"]["params"], VideoParams
        ):
            task_with_serializable_params["kwargs"]["params"] = task["kwargs"][
                "params"
            ].dict()
         # convert the function object to its name
-        task_with_serializable_params['func'] = task['func'].__name__
+        task_with_serializable_params["func"] = task["func"].__name__
        self.redis_client.rpush(self.queue, json.dumps(task_with_serializable_params))
    def dequeue(self):
@@ -36,10 +40,14 @@ class RedisTaskManager(TaskManager):
        if task_json:
            task_info = json.loads(task_json)
             # convert the function name back to the function object
-            task_info['func'] = FUNC_MAP[task_info['func']]
+            task_info["func"] = FUNC_MAP[task_info["func"]]
-            if 'params' in task_info['kwargs'] and isinstance(task_info['kwargs']['params'], dict):
+            if "params" in task_info["kwargs"] and isinstance(
-                task_info['kwargs']['params'] = VideoParams(**task_info['kwargs']['params'])
+                task_info["kwargs"]["params"], dict
            ):
                task_info["kwargs"]["params"] = VideoParams(
                    **task_info["kwargs"]["params"]
                )
            return task_info
        return None
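The enqueue/dequeue round trip in `RedisTaskManager` boils down to making the task JSON-safe on the way in and rebuilding Python objects on the way out. A minimal sketch under stated assumptions: a `dataclass` named `Params` stands in for the pydantic `VideoParams`, a plain list stands in for the Redis list, and `encode_task`/`decode_task` are hypothetical helper names. Unlike the `task.copy()` in the diff, which is shallow and therefore shares the inner `kwargs` dict with the caller, the sketch copies `kwargs` before rewriting `params`.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass
class Params:
    """Hypothetical stand-in for VideoParams."""

    video_subject: str
    paragraph_number: int = 1


def start(task_id: str, params: Params) -> str:
    """Hypothetical task function; the real one is tm.start."""
    return f"{task_id}:{params.video_subject}"


# Only functions listed here can be restored from their serialized name.
FUNC_MAP = {"start": start}


def encode_task(task: Dict[str, Any]) -> str:
    payload = {
        "func": task["func"].__name__,  # function object -> name
        "args": list(task.get("args", ())),
        "kwargs": dict(task.get("kwargs", {})),  # copy so the caller's dict is untouched
    }
    params = payload["kwargs"].get("params")
    if isinstance(params, Params):
        payload["kwargs"]["params"] = asdict(params)  # model -> plain dict
    return json.dumps(payload)


def decode_task(task_json: str) -> Dict[str, Any]:
    info = json.loads(task_json)
    info["func"] = FUNC_MAP[info["func"]]  # name -> function object
    params = info["kwargs"].get("params")
    if isinstance(params, dict):
        info["kwargs"]["params"] = Params(**params)  # plain dict -> model
    return info


queue: List[str] = []  # stands in for the Redis list
task = {"func": start, "args": ("task-1",), "kwargs": {"params": Params("cats")}}
queue.append(encode_task(task))

info = decode_task(queue.pop(0))
print(info["func"](*info["args"], **info["kwargs"]))  # task-1:cats
```

Mapping names through `FUNC_MAP` rather than importing by dotted path means a queued payload can only ever invoke functions that were registered explicitly.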
--- a/app/controllers/ping.py
+++ b/app/controllers/ping.py
@@ -1,9 +1,13 @@
-from fastapi import APIRouter
+from fastapi import APIRouter, Request
 from fastapi import Request
 router = APIRouter()
-@router.get("/ping", tags=["Health Check"], description="检查服务可用性", response_description="pong")
+@router.get(
    "/ping",
    tags=["Health Check"],
    description="检查服务可用性",
    response_description="pong",
 )
 def ping(request: Request) -> str:
    return "pong"
--- a/app/controllers/v1/base.py
+++ b/app/controllers/v1/base.py
@@ -1,10 +1,10 @@
-from fastapi import APIRouter, Depends
+from fastapi import APIRouter
 def new_router(dependencies=None):
    router = APIRouter()
-    router.tags = ['V1']
+    router.tags = ["V1"]
-    router.prefix = '/api/v1'
+    router.prefix = "/api/v1"
     # apply the authentication dependencies to all routes
    if dependencies:
        router.dependencies = dependencies
--- a/app/controllers/v1/llm.py
+++ b/app/controllers/v1/llm.py
@@ -1,31 +1,45 @@
 from fastapi import Request
 from app.controllers.v1.base import new_router
-from app.models.schema import VideoScriptResponse, VideoScriptRequest, VideoTermsResponse, VideoTermsRequest
+from app.models.schema import (
+    VideoScriptRequest,
+    VideoScriptResponse,
+    VideoTermsRequest,
+    VideoTermsResponse,
+)
 from app.services import llm
 from app.utils import utils
-# 认证依赖项
+# authentication dependency
 # router = new_router(dependencies=[Depends(base.verify_token)])
 router = new_router()
-@router.post("/scripts", response_model=VideoScriptResponse, summary="Create a script for the video")
+@router.post(
+    "/scripts",
+    response_model=VideoScriptResponse,
+    summary="Create a script for the video",
+)
 def generate_video_script(request: Request, body: VideoScriptRequest):
-    video_script = llm.generate_script(video_subject=body.video_subject,
-                                       language=body.video_language,
-                                       paragraph_number=body.paragraph_number)
-    response = {
-        "video_script": video_script
-    }
+    video_script = llm.generate_script(
+        video_subject=body.video_subject,
+        language=body.video_language,
+        paragraph_number=body.paragraph_number,
+    )
+    response = {"video_script": video_script}
    return utils.get_response(200, response)
-@router.post("/terms", response_model=VideoTermsResponse, summary="Generate video terms based on the video script")
+@router.post(
+    "/terms",
+    response_model=VideoTermsResponse,
+    summary="Generate video terms based on the video script",
+)
 def generate_video_terms(request: Request, body: VideoTermsRequest):
-    video_terms = llm.generate_terms(video_subject=body.video_subject,
-                                     video_script=body.video_script,
-                                     amount=body.amount)
-    response = {
-        "video_terms": video_terms
-    }
+    video_terms = llm.generate_terms(
+        video_subject=body.video_subject,
+        video_script=body.video_script,
+        amount=body.amount,
+    )
+    response = {"video_terms": video_terms}
    return utils.get_response(200, response)
--- a/app/controllers/v1/video.py
+++ b/app/controllers/v1/video.py
@@ -1,11 +1,12 @@
-import os
 import glob
+import os
 import pathlib
 import shutil
 from typing import Union
-from fastapi import Request, Depends, Path, BackgroundTasks, UploadFile
-from fastapi.responses import FileResponse, StreamingResponse
+from fastapi import BackgroundTasks, Depends, Path, Request, UploadFile
 from fastapi.params import File
+from fastapi.responses import FileResponse, StreamingResponse
 from loguru import logger
 from app.config import config
@@ -14,10 +15,19 @@ from app.controllers.manager.memory_manager import InMemoryTaskManager
 from app.controllers.manager.redis_manager import RedisTaskManager
 from app.controllers.v1.base import new_router
 from app.models.exception import HttpException
-from app.models.schema import TaskVideoRequest, TaskQueryResponse, TaskResponse, TaskQueryRequest, \
-    BgmUploadResponse, BgmRetrieveResponse, TaskDeletionResponse
-from app.services import task as tm
+from app.models.schema import (
+    AudioRequest,
+    BgmRetrieveResponse,
+    BgmUploadResponse,
+    SubtitleRequest,
+    TaskDeletionResponse,
+    TaskQueryRequest,
+    TaskQueryResponse,
+    TaskResponse,
+    TaskVideoRequest,
+)
 from app.services import state as sm
+from app.services import task as tm
 from app.utils import utils
 # Authentication dependency
@@ -34,48 +44,81 @@ _max_concurrent_tasks = config.app.get("max_concurrent_tasks", 5)
 redis_url = f"redis://:{_redis_password}@{_redis_host}:{_redis_port}/{_redis_db}"
 # Select the appropriate task manager based on the configuration
 if _enable_redis:
-    task_manager = RedisTaskManager(max_concurrent_tasks=_max_concurrent_tasks, redis_url=redis_url)
+    task_manager = RedisTaskManager(
+        max_concurrent_tasks=_max_concurrent_tasks, redis_url=redis_url
+    )
 else:
    task_manager = InMemoryTaskManager(max_concurrent_tasks=_max_concurrent_tasks)
 # @router.post("/videos-test", response_model=TaskResponse, summary="Generate a short video")
 # async def create_video_test(request: Request, body: TaskVideoRequest):
 #     task_id = utils.get_uuid()
 #     request_id = base.get_task_id(request)
 #     try:
 #         task = {
 #             "task_id": task_id,
 #             "request_id": request_id,
 #             "params": body.dict(),
 #         }
 #         task_manager.add_task(tm.start_test, task_id=task_id, params=body)
 #         return utils.get_response(200, task)
 #     except ValueError as e:
 #         raise HttpException(task_id=task_id, status_code=400, message=f"{request_id}: {str(e)}")
@router.post("/videos", response_model=TaskResponse, summary="Generate a short video")
-def create_video(background_tasks: BackgroundTasks, request: Request, body: TaskVideoRequest):
+def create_video(
+    background_tasks: BackgroundTasks, request: Request, body: TaskVideoRequest
+):
    return create_task(request, body, stop_at="video")
@router.post("/subtitle", response_model=TaskResponse, summary="Generate subtitle only")
 def create_subtitle(
    background_tasks: BackgroundTasks, request: Request, body: SubtitleRequest
 ):
    return create_task(request, body, stop_at="subtitle")
@router.post("/audio", response_model=TaskResponse, summary="Generate audio only")
 def create_audio(
    background_tasks: BackgroundTasks, request: Request, body: AudioRequest
 ):
    return create_task(request, body, stop_at="audio")
 def create_task(
    request: Request,
    body: Union[TaskVideoRequest, SubtitleRequest, AudioRequest],
    stop_at: str,
 ):
    task_id = utils.get_uuid()
    request_id = base.get_task_id(request)
    try:
        task = {
            "task_id": task_id,
            "request_id": request_id,
-            "params": body.dict(),
+            "params": body.model_dump(),
        }
        sm.state.update_task(task_id)
-        # background_tasks.add_task(tm.start, task_id=task_id, params=body)
-        task_manager.add_task(tm.start, task_id=task_id, params=body)
-        logger.success(f"video created: {utils.to_json(task)}")
+        task_manager.add_task(tm.start, task_id=task_id, params=body, stop_at=stop_at)
+        logger.success(f"Task created: {utils.to_json(task)}")
        return utils.get_response(200, task)
    except ValueError as e:
-        raise HttpException(task_id=task_id, status_code=400, message=f"{request_id}: {str(e)}")
+        raise HttpException(
+            task_id=task_id, status_code=400, message=f"{request_id}: {str(e)}"
+        )
 from fastapi import Query
@router.get("/tasks", response_model=TaskQueryResponse, summary="Get all tasks")
 def get_all_tasks(request: Request, page: int = Query(1, ge=1), page_size: int = Query(10, ge=1)):
    request_id = base.get_task_id(request)
    tasks, total = sm.state.get_all_tasks(page, page_size)
    response = {
        "tasks": tasks,
        "total": total,
        "page": page,
        "page_size": page_size,
    }
    return utils.get_response(200, response)
-@router.get("/tasks/{task_id}", response_model=TaskQueryResponse, summary="Query task status")
-def get_task(request: Request, task_id: str = Path(..., description="Task ID"),
-             query: TaskQueryRequest = Depends()):
+
+@router.get(
+    "/tasks/{task_id}", response_model=TaskQueryResponse, summary="Query task status"
+)
+def get_task(
+    request: Request,
+    task_id: str = Path(..., description="Task ID"),
+    query: TaskQueryRequest = Depends(),
+):
    endpoint = config.app.get("endpoint", "")
    if not endpoint:
        endpoint = str(request.base_url)
@@ -108,10 +151,16 @@ def get_task(request: Request, task_id: str = Path(..., description="Task ID"),
            task["combined_videos"] = urls
        return utils.get_response(200, task)
-    raise HttpException(task_id=task_id, status_code=404, message=f"{request_id}: task not found")
+    raise HttpException(
+        task_id=task_id, status_code=404, message=f"{request_id}: task not found"
+    )
-@router.delete("/tasks/{task_id}", response_model=TaskDeletionResponse, summary="Delete a generated short video task")
+@router.delete(
+    "/tasks/{task_id}",
+    response_model=TaskDeletionResponse,
+    summary="Delete a generated short video task",
+)
 def delete_video(request: Request, task_id: str = Path(..., description="Task ID")):
    request_id = base.get_task_id(request)
    task = sm.state.get_task(task_id)
@@ -125,32 +174,40 @@ def delete_video(request: Request, task_id: str = Path(..., description="Task ID
        logger.success(f"video deleted: {utils.to_json(task)}")
        return utils.get_response(200)
-    raise HttpException(task_id=task_id, status_code=404, message=f"{request_id}: task not found")
+    raise HttpException(
+        task_id=task_id, status_code=404, message=f"{request_id}: task not found"
+    )
-@router.get("/musics", response_model=BgmRetrieveResponse, summary="Retrieve local BGM files")
+@router.get(
+    "/musics", response_model=BgmRetrieveResponse, summary="Retrieve local BGM files"
+)
 def get_bgm_list(request: Request):
    suffix = "*.mp3"
    song_dir = utils.song_dir()
    files = glob.glob(os.path.join(song_dir, suffix))
    bgm_list = []
    for file in files:
-        bgm_list.append({
-            "name": os.path.basename(file),
-            "size": os.path.getsize(file),
-            "file": file,
-        })
-    response = {
-        "files": bgm_list
-    }
+        bgm_list.append(
+            {
+                "name": os.path.basename(file),
+                "size": os.path.getsize(file),
+                "file": file,
+            }
+        )
+    response = {"files": bgm_list}
    return utils.get_response(200, response)
-@router.post("/musics", response_model=BgmUploadResponse, summary="Upload the BGM file to the songs directory")
+@router.post(
+    "/musics",
+    response_model=BgmUploadResponse,
+    summary="Upload the BGM file to the songs directory",
+)
 def upload_bgm_file(request: Request, file: UploadFile = File(...)):
    request_id = base.get_task_id(request)
    # check file ext
-    if file.filename.endswith('mp3'):
+    if file.filename.endswith("mp3"):
        song_dir = utils.song_dir()
        save_path = os.path.join(song_dir, file.filename)
        # save file
@@ -158,26 +215,26 @@ def upload_bgm_file(request: Request, file: UploadFile = File(...)):
            # If the file already exists, it will be overwritten
            file.file.seek(0)
            buffer.write(file.file.read())
-        response = {
-            "file": save_path
-        }
+        response = {"file": save_path}
        return utils.get_response(200, response)
-    raise HttpException('', status_code=400, message=f"{request_id}: Only *.mp3 files can be uploaded")
+    raise HttpException(
+        "", status_code=400, message=f"{request_id}: Only *.mp3 files can be uploaded"
+    )
@router.get("/stream/{file_path:path}")
 async def stream_video(request: Request, file_path: str):
    tasks_dir = utils.task_dir()
    video_path = os.path.join(tasks_dir, file_path)
-    range_header = request.headers.get('Range')
+    range_header = request.headers.get("Range")
    video_size = os.path.getsize(video_path)
    start, end = 0, video_size - 1
    length = video_size
    if range_header:
-        range_ = range_header.split('bytes=')[1]
+        range_ = range_header.split("bytes=")[1]
-        start, end = [int(part) if part else None for part in range_.split('-')]
+        start, end = [int(part) if part else None for part in range_.split("-")]
        if start is None:
            start = video_size - end
            end = video_size - 1
@@ -186,7 +243,7 @@ async def stream_video(request: Request, file_path: str):
        length = end - start + 1
    def file_iterator(file_path, offset=0, bytes_to_read=None):
-        with open(file_path, 'rb') as f:
+        with open(file_path, "rb") as f:
            f.seek(offset, os.SEEK_SET)
            remaining = bytes_to_read or video_size
            while remaining > 0:
@@ -197,10 +254,12 @@ async def stream_video(request: Request, file_path: str):
                remaining -= len(data)
                yield data
-    response = StreamingResponse(file_iterator(video_path, start, length), media_type='video/mp4')
-    response.headers['Content-Range'] = f'bytes {start}-{end}/{video_size}'
-    response.headers['Accept-Ranges'] = 'bytes'
-    response.headers['Content-Length'] = str(length)
+    response = StreamingResponse(
+        file_iterator(video_path, start, length), media_type="video/mp4"
+    )
+    response.headers["Content-Range"] = f"bytes {start}-{end}/{video_size}"
+    response.headers["Accept-Ranges"] = "bytes"
+    response.headers["Content-Length"] = str(length)
    response.status_code = 206  # Partial Content
    return response
@@ -219,8 +278,10 @@ async def download_video(_: Request, file_path: str):
    file_path = pathlib.Path(video_path)
    filename = file_path.stem
    extension = file_path.suffix
-    headers = {
-        "Content-Disposition": f"attachment; filename={filename}{extension}"
-    }
-    return FileResponse(path=video_path, headers=headers, filename=f"{filename}{extension}",
-                        media_type=f'video/{extension[1:]}')
+    headers = {"Content-Disposition": f"attachment; filename={filename}{extension}"}
+    return FileResponse(
+        path=video_path,
+        headers=headers,
+        filename=f"{filename}{extension}",
+        media_type=f"video/{extension[1:]}",
+    )
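Review note: the `Range` handling in `stream_video` above can be read as a small pure function. The sketch below is one way to express it, not the handler's actual code. `parse_range` is a hypothetical name, and because the hunk hides the lines between `end = video_size - 1` and `length = end - start + 1`, the open-ended (`bytes=500-`) branch and the clamping to the file size are assumptions. Multi-range headers and malformed input are not handled.

```python
from typing import Optional, Tuple


def parse_range(range_header: Optional[str], size: int) -> Tuple[int, int, int]:
    """Return (start, end, length) for a single 'bytes=' range; end is inclusive."""
    start, end = 0, size - 1
    if range_header:
        spec = range_header.split("bytes=")[1]
        first, last = [int(part) if part else None for part in spec.split("-")]
        if first is None:
            # Suffix form "bytes=-N": the last N bytes of the file.
            start = max(size - last, 0)
            end = size - 1
        else:
            start = first
            # Open-ended "bytes=N-" runs to the end; otherwise clamp to the file size.
            end = size - 1 if last is None else min(last, size - 1)
    return start, end, end - start + 1


whole = parse_range(None, 1000)
head = parse_range("bytes=0-99", 1000)
tail = parse_range("bytes=-200", 1000)
open_ended = parse_range("bytes=500-", 1000)
```

The three values map directly onto the `Content-Range` (`bytes {start}-{end}/{size}`) and `Content-Length` headers set in the handler.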
--- a/app/models/const.py
+++ b/app/models/const.py
@@ -1,11 +1,25 @@
 PUNCTUATIONS = [
-    "?", ",", ".", "、", ";", ":", "!", "…",
-    "？", "，", "。", "、", "；", "：", "！", "...",
+    "?",
+    ",",
+    ".",
+    "、",
+    ";",
+    ":",
+    "!",
+    "…",
+    "？",
+    "，",
+    "。",
+    "、",
+    "；",
+    "：",
+    "！",
+    "...",
 ]
 TASK_STATE_FAILED = -1
 TASK_STATE_COMPLETE = 1
 TASK_STATE_PROCESSING = 4
-FILE_TYPE_VIDEOS = ['mp4', 'mov', 'mkv', 'webm']
+FILE_TYPE_VIDEOS = ["mp4", "mov", "mkv", "webm"]
-FILE_TYPE_IMAGES = ['jpg', 'jpeg', 'png', 'bmp']
+FILE_TYPE_IMAGES = ["jpg", "jpeg", "png", "bmp"]
--- a/app/models/exception.py
+++ b/app/models/exception.py
@@ -5,16 +5,18 @@ from loguru import logger
 class HttpException(Exception):
-    def __init__(self, task_id: str, status_code: int, message: str = '', data: Any = None):
+    def __init__(
+        self, task_id: str, status_code: int, message: str = "", data: Any = None
+    ):
        self.message = message
        self.status_code = status_code
        self.data = data
-        # 获取异常堆栈信息
+        # Retrieve the exception stack trace information.
        tb_str = traceback.format_exc().strip()
        if not tb_str or tb_str == "NoneType: None":
-            msg = f'HttpException: {status_code}, {task_id}, {message}'
+            msg = f"HttpException: {status_code}, {task_id}, {message}"
        else:
-            msg = f'HttpException: {status_code}, {task_id}, {message}\n{tb_str}'
+            msg = f"HttpException: {status_code}, {task_id}, {message}\n{tb_str}"
        if status_code == 400:
            logger.warning(msg)
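Review note: the `"NoneType: None"` comparison in `HttpException.__init__` above exists because `traceback.format_exc()` returns that placeholder text when no exception is currently being handled. A minimal sketch of the same check outside the class follows. `build_log_message` is a hypothetical helper name; the message format is copied from the diff.

```python
import traceback


def build_log_message(status_code: int, task_id: str, message: str) -> str:
    # Outside an `except` block there is no active exception, and
    # format_exc() yields the placeholder "NoneType: None".
    tb_str = traceback.format_exc().strip()
    if not tb_str or tb_str == "NoneType: None":
        return f"HttpException: {status_code}, {task_id}, {message}"
    return f"HttpException: {status_code}, {task_id}, {message}\n{tb_str}"


# No active exception: the message stays on one line.
plain = build_log_message(404, "task-1", "task not found")

# Inside an except block: the active traceback is appended.
try:
    1 / 0
except ZeroDivisionError:
    with_trace = build_log_message(400, "task-2", "bad input")
```

This is why an `HttpException` raised from inside an `except ValueError` handler, as in `create_task`, logs the original stack trace while one raised directly does not.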
--- a/app/models/schema.py
+++ b/app/models/schema.py
@@ -1,12 +1,16 @@
+import warnings
 from enum import Enum
-from typing import Any, Optional, List
+from typing import Any, List, Optional, Union
 import pydantic
 from pydantic import BaseModel
-import warnings
 # Ignore specific Pydantic warnings
-warnings.filterwarnings("ignore", category=UserWarning, message="Field name.*shadows an attribute in parent.*")
+warnings.filterwarnings(
+    "ignore",
+    category=UserWarning,
+    message="Field name.*shadows an attribute in parent.*",
+)
 class VideoConcatMode(str, Enum):
@@ -14,6 +18,15 @@ class VideoConcatMode(str, Enum):
    sequential = "sequential"
 class VideoTransitionMode(str, Enum):
    none = None
    shuffle = "Shuffle"
    fade_in = "FadeIn"
    fade_out = "FadeOut"
    slide_in = "SlideIn"
    slide_out = "SlideOut"
 class VideoAspect(str, Enum):
    landscape = "16:9"
    portrait = "9:16"
@@ -40,45 +53,6 @@ class MaterialInfo:
    duration: int = 0
 # VoiceNames = [
 #     # zh-CN
 #     "female-zh-CN-XiaoxiaoNeural",
 #     "female-zh-CN-XiaoyiNeural",
 #     "female-zh-CN-liaoning-XiaobeiNeural",
 #     "female-zh-CN-shaanxi-XiaoniNeural",
 #
 #     "male-zh-CN-YunjianNeural",
 #     "male-zh-CN-YunxiNeural",
 #     "male-zh-CN-YunxiaNeural",
 #     "male-zh-CN-YunyangNeural",
 #
 #     # "female-zh-HK-HiuGaaiNeural",
 #     # "female-zh-HK-HiuMaanNeural",
 #     # "male-zh-HK-WanLungNeural",
 #     #
 #     # "female-zh-TW-HsiaoChenNeural",
 #     # "female-zh-TW-HsiaoYuNeural",
 #     # "male-zh-TW-YunJheNeural",
 #
 #     # en-US
 #
 #     "female-en-US-AnaNeural",
 #     "female-en-US-AriaNeural",
 #     "female-en-US-AvaNeural",
 #     "female-en-US-EmmaNeural",
 #     "female-en-US-JennyNeural",
 #     "female-en-US-MichelleNeural",
 #
 #     "male-en-US-AndrewNeural",
 #     "male-en-US-BrianNeural",
 #     "male-en-US-ChristopherNeural",
 #     "male-en-US-EricNeural",
 #     "male-en-US-GuyNeural",
 #     "male-en-US-RogerNeural",
 #     "male-en-US-SteffanNeural",
 # ]
 class VideoParams(BaseModel):
    """
    {
@@ -93,30 +67,36 @@ class VideoParams(BaseModel):
      "stroke_width": 1.5
    }
    """
    video_subject: str
-    video_script: str = ""  # 用于生成视频的脚本
+    video_script: str = ""  # Script used to generate the video
-    video_terms: Optional[str | list] = None  # 用于生成视频的关键词
+    video_terms: Optional[str | list] = None  # Keywords used to generate the video
    video_aspect: Optional[VideoAspect] = VideoAspect.portrait.value
    video_concat_mode: Optional[VideoConcatMode] = VideoConcatMode.random.value
    video_transition_mode: Optional[VideoTransitionMode] = None
    video_clip_duration: Optional[int] = 5
    video_count: Optional[int] = 1
    video_source: Optional[str] = "pexels"
-    video_materials: Optional[List[MaterialInfo]] = None  # 用于生成视频的素材
+    video_materials: Optional[List[MaterialInfo]] = (
+        None  # Materials used to generate the video
+    )
    video_language: Optional[str] = ""  # auto detect
    voice_name: Optional[str] = ""
    voice_volume: Optional[float] = 1.0
    voice_rate: Optional[float] = 1.0
    bgm_type: Optional[str] = "random"
    bgm_file: Optional[str] = ""
    bgm_volume: Optional[float] = 0.2
    subtitle_enabled: Optional[bool] = True
    subtitle_position: Optional[str] = "bottom"  # top, bottom, center
    custom_position: float = 70.0
    font_name: Optional[str] = "STHeitiMedium.ttc"
    text_fore_color: Optional[str] = "#FFFFFF"
-    text_background_color: Optional[str] = "transparent"
+    text_background_color: Union[bool, str] = True
    font_size: int = 60
    stroke_color: Optional[str] = "#000000"
@@ -125,6 +105,38 @@ class VideoParams(BaseModel):
    paragraph_number: Optional[int] = 1
 class SubtitleRequest(BaseModel):
    video_script: str
    video_language: Optional[str] = ""
    voice_name: Optional[str] = "zh-CN-XiaoxiaoNeural-Female"
    voice_volume: Optional[float] = 1.0
    voice_rate: Optional[float] = 1.2
    bgm_type: Optional[str] = "random"
    bgm_file: Optional[str] = ""
    bgm_volume: Optional[float] = 0.2
    subtitle_position: Optional[str] = "bottom"
    font_name: Optional[str] = "STHeitiMedium.ttc"
    text_fore_color: Optional[str] = "#FFFFFF"
    text_background_color: Union[bool, str] = True
    font_size: int = 60
    stroke_color: Optional[str] = "#000000"
    stroke_width: float = 1.5
    video_source: Optional[str] = "local"
    subtitle_enabled: Optional[str] = "true"
 class AudioRequest(BaseModel):
    video_script: str
    video_language: Optional[str] = ""
    voice_name: Optional[str] = "zh-CN-XiaoxiaoNeural-Female"
    voice_volume: Optional[float] = 1.0
    voice_rate: Optional[float] = 1.2
    bgm_type: Optional[str] = "random"
    bgm_file: Optional[str] = ""
    bgm_volume: Optional[float] = 0.2
    video_source: Optional[str] = "local"
 class VideoScriptParams:
    """
    {
@@ -133,6 +145,7 @@ class VideoScriptParams:
      "paragraph_number": 1
    }
    """
    video_subject: Optional[str] = "春天的花海"
    video_language: Optional[str] = ""
    paragraph_number: Optional[int] = 1
@@ -146,14 +159,17 @@ class VideoTermsParams:
      "amount": 5
    }
    """
    video_subject: Optional[str] = "春天的花海"
-    video_script: Optional[str] = "春天的花海，如诗如画般展现在眼前。万物复苏的季节里，大地披上了一袭绚丽多彩的盛装。金黄的迎春、粉嫩的樱花、洁白的梨花、艳丽的郁金香……"
+    video_script: Optional[str] = (
+        "春天的花海，如诗如画般展现在眼前。万物复苏的季节里，大地披上了一袭绚丽多彩的盛装。金黄的迎春、粉嫩的樱花、洁白的梨花、艳丽的郁金香……"
+    )
    amount: Optional[int] = 5
 class BaseResponse(BaseModel):
    status: int = 200
-    message: Optional[str] = 'success'
+    message: Optional[str] = "success"
    data: Any = None
@@ -188,9 +204,7 @@ class TaskResponse(BaseResponse):
            "example": {
                "status": 200,
                "message": "success",
-                "data": {
-                    "task_id": "6c85c8cc-a77a-42b9-bc30-947815aa0558"
-                }
+                "data": {"task_id": "6c85c8cc-a77a-42b9-bc30-947815aa0558"},
            },
        }
@@ -209,8 +223,8 @@ class TaskQueryResponse(BaseResponse):
                    ],
                    "combined_videos": [
                        "http://127.0.0.1:8080/tasks/6c85c8cc-a77a-42b9-bc30-947815aa0558/combined-1.mp4"
-                    ]
+                    ],
-                }
+                },
            },
        }
@@ -229,8 +243,8 @@ class TaskDeletionResponse(BaseResponse):
                    ],
                    "combined_videos": [
                        "http://127.0.0.1:8080/tasks/6c85c8cc-a77a-42b9-bc30-947815aa0558/combined-1.mp4"
-                    ]
+                    ],
-                }
+                },
            },
        }
@@ -243,7 +257,7 @@ class VideoScriptResponse(BaseResponse):
                "message": "success",
                "data": {
                    "video_script": "春天的花海，是大自然的一幅美丽画卷。在这个季节里，大地复苏，万物生长，花朵争相绽放，形成了一片五彩斑斓的花海..."
-                }
+                },
            },
        }
@@ -254,9 +268,7 @@ class VideoTermsResponse(BaseResponse):
            "example": {
                "status": 200,
                "message": "success",
-                "data": {
-                    "video_terms": ["sky", "tree"]
-                }
+                "data": {"video_terms": ["sky", "tree"]},
            },
        }
@@ -272,10 +284,10 @@ class BgmRetrieveResponse(BaseResponse):
                        {
                            "name": "output013.mp3",
                            "size": 1891269,
-                            "file": "/MoneyPrinterTurbo/resource/songs/output013.mp3"
+                            "file": "/MoneyPrinterTurbo/resource/songs/output013.mp3",
                        }
                    ]
-                }
+                },
            },
        }
@@ -286,8 +298,6 @@ class BgmUploadResponse(BaseResponse):
            "example": {
                "status": 200,
                "message": "success",
-                "data": {
-                    "file": "/MoneyPrinterTurbo/resource/songs/example.mp3"
-                }
+                "data": {"file": "/MoneyPrinterTurbo/resource/songs/example.mp3"},
            },
        }
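Review note: one detail in the new `VideoTransitionMode` enum above may be worth checking. With a `str` mixin, a member declared as `none = None` is built through `str(None)`, so its value is the string `"None"` rather than `None`. The sketch below shows this with the standard library only. `TransitionMode` is a hypothetical, trimmed mirror of the enum, and `parse_mode` is a hypothetical helper, not part of this PR.

```python
from enum import Enum
from typing import Optional


class TransitionMode(str, Enum):  # hypothetical mirror of VideoTransitionMode
    none = None  # stored as the string "None" because of the str mixin
    fade_in = "FadeIn"


def parse_mode(raw: Optional[str]) -> Optional[TransitionMode]:
    """Map a missing value to None instead of looking it up in the enum."""
    if raw is None:
        return None
    return TransitionMode(raw)


value_of_none = TransitionMode.none.value

try:
    TransitionMode(None)
    lookup_failed = False
except ValueError:
    # None itself is not a valid value; only the string "None" is.
    lookup_failed = True
```

Since `video_transition_mode` defaults to `None` as an `Optional` field, that default never passes through the enum lookup; the distinction would matter only for callers that send the literal string `"None"`.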
--- a/app/router.py
+++ b/app/router.py
@@ -6,9 +6,10 @@ Resources:
    1. https://fastapi.tiangolo.com/tutorial/bigger-applications
 """
 from fastapi import APIRouter
-from app.controllers.v1 import video, llm
+from app.controllers.v1 import llm, video
 root_api_router = APIRouter()
 # v1
--- a/app/services/llm.py
+++ b/app/services/llm.py
@@ -1,10 +1,11 @@
+import json
 import logging
 import re
-import json
 from typing import List
 import g4f
 from loguru import logger
-from openai import OpenAI
-from openai import AzureOpenAI
+from openai import AzureOpenAI, OpenAI
 from openai.types.chat import ChatCompletion
 from app.config import config
@@ -13,189 +14,254 @@ _max_retries = 5
 def _generate_response(prompt: str) -> str:
-    content = ""
+    try:
-    llm_provider = config.app.get("llm_provider", "openai")
+        content = ""
-    logger.info(f"llm provider: {llm_provider}")
+        llm_provider = config.app.get("llm_provider", "openai")
-    if llm_provider == "g4f":
+        logger.info(f"llm provider: {llm_provider}")
-        model_name = config.app.get("g4f_model_name", "")
+        if llm_provider == "g4f":
-        if not model_name:
+            model_name = config.app.get("g4f_model_name", "")
-            model_name = "gpt-3.5-turbo-16k-0613"
+            if not model_name:
-        import g4f
+                model_name = "gpt-3.5-turbo-16k-0613"
-        content = g4f.ChatCompletion.create(
+            content = g4f.ChatCompletion.create(
            model=model_name,
            messages=[{"role": "user", "content": prompt}],
        )
    else:
        api_version = ""  # for azure
        if llm_provider == "moonshot":
            api_key = config.app.get("moonshot_api_key")
            model_name = config.app.get("moonshot_model_name")
            base_url = "https://api.moonshot.cn/v1"
        elif llm_provider == "ollama":
            # api_key = config.app.get("openai_api_key")
            api_key = "ollama"  # any string works but you are required to have one
            model_name = config.app.get("ollama_model_name")
            base_url = config.app.get("ollama_base_url", "")
            if not base_url:
                base_url = "http://localhost:11434/v1"
        elif llm_provider == "openai":
            api_key = config.app.get("openai_api_key")
            model_name = config.app.get("openai_model_name")
            base_url = config.app.get("openai_base_url", "")
            if not base_url:
                base_url = "https://api.openai.com/v1"
        elif llm_provider == "oneapi":
            api_key = config.app.get("oneapi_api_key")
            model_name = config.app.get("oneapi_model_name")
            base_url = config.app.get("oneapi_base_url", "")
        elif llm_provider == "azure":
            api_key = config.app.get("azure_api_key")
            model_name = config.app.get("azure_model_name")
            base_url = config.app.get("azure_base_url", "")
            api_version = config.app.get("azure_api_version", "2024-02-15-preview")
        elif llm_provider == "gemini":
            api_key = config.app.get("gemini_api_key")
            model_name = config.app.get("gemini_model_name")
            base_url = "***"
        elif llm_provider == "qwen":
            api_key = config.app.get("qwen_api_key")
            model_name = config.app.get("qwen_model_name")
            base_url = "***"
        elif llm_provider == "cloudflare":
            api_key = config.app.get("cloudflare_api_key")
            model_name = config.app.get("cloudflare_model_name")
            account_id = config.app.get("cloudflare_account_id")
            base_url = "***"
        elif llm_provider == "deepseek":
            api_key = config.app.get("deepseek_api_key")
            model_name = config.app.get("deepseek_model_name")
            base_url = config.app.get("deepseek_base_url")
            if not base_url:
                base_url = "https://api.deepseek.com"
        else:
            raise ValueError("llm_provider is not set, please set it in the config.toml file.")
        if not api_key:
            raise ValueError(f"{llm_provider}: api_key is not set, please set it in the config.toml file.")
        if not model_name:
            raise ValueError(f"{llm_provider}: model_name is not set, please set it in the config.toml file.")
        if not base_url:
            raise ValueError(f"{llm_provider}: base_url is not set, please set it in the config.toml file.")
        if llm_provider == "qwen":
            import dashscope
            from dashscope.api_entities.dashscope_response import GenerationResponse
            dashscope.api_key = api_key
            response = dashscope.Generation.call(
                model=model_name,
-                messages=[{"role": "user", "content": prompt}]
+                messages=[{"role": "user", "content": prompt}],
            )
        else:
            api_version = ""  # for azure
            if llm_provider == "moonshot":
                api_key = config.app.get("moonshot_api_key")
                model_name = config.app.get("moonshot_model_name")
                base_url = "https://api.moonshot.cn/v1"
            elif llm_provider == "ollama":
                # api_key = config.app.get("openai_api_key")
                api_key = "ollama"  # any string works but you are required to have one
                model_name = config.app.get("ollama_model_name")
                base_url = config.app.get("ollama_base_url", "")
                if not base_url:
                    base_url = "http://localhost:11434/v1"
            elif llm_provider == "openai":
                api_key = config.app.get("openai_api_key")
                model_name = config.app.get("openai_model_name")
                base_url = config.app.get("openai_base_url", "")
                if not base_url:
                    base_url = "https://api.openai.com/v1"
            elif llm_provider == "oneapi":
                api_key = config.app.get("oneapi_api_key")
                model_name = config.app.get("oneapi_model_name")
                base_url = config.app.get("oneapi_base_url", "")
            elif llm_provider == "azure":
                api_key = config.app.get("azure_api_key")
                model_name = config.app.get("azure_model_name")
                base_url = config.app.get("azure_base_url", "")
                api_version = config.app.get("azure_api_version", "2024-02-15-preview")
            elif llm_provider == "gemini":
                api_key = config.app.get("gemini_api_key")
                model_name = config.app.get("gemini_model_name")
                base_url = "***"
            elif llm_provider == "qwen":
                api_key = config.app.get("qwen_api_key")
                model_name = config.app.get("qwen_model_name")
                base_url = "***"
            elif llm_provider == "cloudflare":
                api_key = config.app.get("cloudflare_api_key")
                model_name = config.app.get("cloudflare_model_name")
                account_id = config.app.get("cloudflare_account_id")
                base_url = "***"
            elif llm_provider == "deepseek":
                api_key = config.app.get("deepseek_api_key")
                model_name = config.app.get("deepseek_model_name")
                base_url = config.app.get("deepseek_base_url")
                if not base_url:
                    base_url = "https://api.deepseek.com"
            elif llm_provider == "ernie":
                api_key = config.app.get("ernie_api_key")
                secret_key = config.app.get("ernie_secret_key")
                base_url = config.app.get("ernie_base_url")
                model_name = "***"
                if not secret_key:
                    raise ValueError(
                        f"{llm_provider}: secret_key is not set, please set it in the config.toml file."
                    )
            else:
                raise ValueError(
                    "llm_provider is not set, please set it in the config.toml file."
                )
            if not api_key:
                raise ValueError(
                    f"{llm_provider}: api_key is not set, please set it in the config.toml file."
                )
            if not model_name:
                raise ValueError(
                    f"{llm_provider}: model_name is not set, please set it in the config.toml file."
                )
            if not base_url:
                raise ValueError(
                    f"{llm_provider}: base_url is not set, please set it in the config.toml file."
                )
            if llm_provider == "qwen":
                import dashscope
                from dashscope.api_entities.dashscope_response import GenerationResponse
                dashscope.api_key = api_key
                response = dashscope.Generation.call(
                    model=model_name, messages=[{"role": "user", "content": prompt}]
                )
                if response:
                    if isinstance(response, GenerationResponse):
                        status_code = response.status_code
                        if status_code != 200:
                            raise Exception(
                                f'[{llm_provider}] returned an error response: "{response}"'
                            )
                        content = response["output"]["text"]
                        return content.replace("\n", "")
                    else:
                        raise Exception(
                            f'[{llm_provider}] returned an invalid response: "{response}"'
                        )
                else:
                    raise Exception(f"[{llm_provider}] returned an empty response")
            if llm_provider == "gemini":
                import google.generativeai as genai
                genai.configure(api_key=api_key, transport="rest")
                generation_config = {
                    "temperature": 0.5,
                    "top_p": 1,
                    "top_k": 1,
                    "max_output_tokens": 2048,
                }
                safety_settings = [
                    {
                        "category": "HARM_CATEGORY_HARASSMENT",
                        "threshold": "BLOCK_ONLY_HIGH",
                    },
                    {
                        "category": "HARM_CATEGORY_HATE_SPEECH",
                        "threshold": "BLOCK_ONLY_HIGH",
                    },
                    {
                        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
                        "threshold": "BLOCK_ONLY_HIGH",
                    },
                    {
                        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
                        "threshold": "BLOCK_ONLY_HIGH",
                    },
                ]
                model = genai.GenerativeModel(
                    model_name=model_name,
                    generation_config=generation_config,
                    safety_settings=safety_settings,
                )
                try:
                    response = model.generate_content(prompt)
                    candidates = response.candidates
                    generated_text = candidates[0].content.parts[0].text
                except (AttributeError, IndexError) as e:
                    print("Gemini Error:", e)
                return generated_text
            if llm_provider == "cloudflare":
                import requests
                response = requests.post(
                    f"https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model_name}",
                    headers={"Authorization": f"Bearer {api_key}"},
                    json={
                        "messages": [
                            {
                                "role": "system",
                                "content": "You are a friendly assistant",
                            },
                            {"role": "user", "content": prompt},
                        ]
                    },
                )
                result = response.json()
                logger.info(result)
                return result["result"]["response"]
            if llm_provider == "ernie":
                import requests
                params = {
                    "grant_type": "client_credentials",
                    "client_id": api_key,
                    "client_secret": secret_key,
                }
                access_token = (
                    requests.post(
                        "https://aip.baidubce.com/oauth/2.0/token", params=params
                    )
                    .json()
                    .get("access_token")
                )
                url = f"{base_url}?access_token={access_token}"
                payload = json.dumps(
                    {
                        "messages": [{"role": "user", "content": prompt}],
                        "temperature": 0.5,
                        "top_p": 0.8,
                        "penalty_score": 1,
                        "disable_search": False,
                        "enable_citation": False,
                        "response_format": "text",
                    }
                )
                headers = {"Content-Type": "application/json"}
                response = requests.request(
                    "POST", url, headers=headers, data=payload
                ).json()
                return response.get("result")
            if llm_provider == "azure":
                client = AzureOpenAI(
                    api_key=api_key,
                    api_version=api_version,
                    azure_endpoint=base_url,
                )
            else:
                client = OpenAI(
                    api_key=api_key,
                    base_url=base_url,
                )
            response = client.chat.completions.create(
                model=model_name, messages=[{"role": "user", "content": prompt}]
            )
            if response:
-                if isinstance(response, GenerationResponse):
+                if isinstance(response, ChatCompletion):
-                    status_code = response.status_code
+                    content = response.choices[0].message.content
                    if status_code != 200:
                        raise Exception(
                            f"[{llm_provider}] returned an error response: \"{response}\"")
                    content = response["output"]["text"]
                    return content.replace("\n", "")
                else:
                    raise Exception(
-                        f"[{llm_provider}] returned an invalid response: \"{response}\"")
+                        f'[{llm_provider}] returned an invalid response: "{response}", please check your network '
                        f"connection and try again."
                    )
            else:
                raise Exception(
-                    f"[{llm_provider}] returned an empty response")
+                    f"[{llm_provider}] returned an empty response, please check your network connection and try again."
                )
-        if llm_provider == "gemini":
+        return content.replace("\n", "")
-            import google.generativeai as genai
+    except Exception as e:
-            genai.configure(api_key=api_key, transport='rest')
+        return f"Error: {str(e)}"
            generation_config = {
                "temperature": 0.5,
                "top_p": 1,
                "top_k": 1,
                "max_output_tokens": 2048,
            }
            safety_settings = [
                {
                    "category": "HARM_CATEGORY_HARASSMENT",
                    "threshold": "BLOCK_ONLY_HIGH"
                },
                {
                    "category": "HARM_CATEGORY_HATE_SPEECH",
                    "threshold": "BLOCK_ONLY_HIGH"
                },
                {
                    "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
                    "threshold": "BLOCK_ONLY_HIGH"
                },
                {
                    "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
                    "threshold": "BLOCK_ONLY_HIGH"
                },
            ]
            model = genai.GenerativeModel(model_name=model_name,
                                          generation_config=generation_config,
                                          safety_settings=safety_settings)
            try:
                response = model.generate_content(prompt)
                candidates = response.candidates
                generated_text = candidates[0].content.parts[0].text
            except (AttributeError, IndexError) as e:
                print("Gemini Error:", e)
            return generated_text
        if llm_provider == "cloudflare":
            import requests
            response = requests.post(
                f"https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model_name}",
                headers={"Authorization": f"Bearer {api_key}"},
                json={
                    "messages": [
                        {"role": "system", "content": "You are a friendly assistant"},
                        {"role": "user", "content": prompt}
                    ]
                }
            )
            result = response.json()
            logger.info(result)
            return result["result"]["response"]
        if llm_provider == "azure":
            client = AzureOpenAI(
                api_key=api_key,
                api_version=api_version,
                azure_endpoint=base_url,
            )
        else:
            client = OpenAI(
                api_key=api_key,
                base_url=base_url,
            )
        response = client.chat.completions.create(
            model=model_name,
            messages=[{"role": "user", "content": prompt}]
        )
        if response:
            if isinstance(response, ChatCompletion):
                content = response.choices[0].message.content
            else:
                raise Exception(
                    f"[{llm_provider}] returned an invalid response: \"{response}\", please check your network "
                    f"connection and try again.")
        else:
            raise Exception(
                f"[{llm_provider}] returned an empty response, please check your network connection and try again.")
    return content.replace("\n", "")
-def generate_script(video_subject: str, language: str = "", paragraph_number: int = 1) -> str:
+def generate_script(
    video_subject: str, language: str = "", paragraph_number: int = 1
 ) -> str:
    prompt = f"""
 # Role: Video Script Generator
@@ -236,10 +302,10 @@ Generate a script for a video, depending on the subject of the video.
        paragraphs = response.split("\n\n")
        # Select the specified number of paragraphs
-        selected_paragraphs = paragraphs[:paragraph_number]
+        # selected_paragraphs = paragraphs[:paragraph_number]
        # Join the selected paragraphs into a single string
-        return "\n\n".join(selected_paragraphs)
+        return "\n\n".join(paragraphs)
    for i in range(_max_retries):
        try:
@@ -260,8 +326,10 @@ Generate a script for a video, depending on the subject of the video.
        if i < _max_retries:
            logger.warning(f"failed to generate video script, trying again... {i + 1}")
-
+    if "Error: " in final_script:
-    logger.success(f"completed: \n{final_script}")
+        logger.error(f"failed to generate video script: {final_script}")
    else:
        logger.success(f"completed: \n{final_script}")
    return final_script.strip()
@@ -295,21 +363,30 @@ Please note that you must use English for generating video search terms; Chinese
    logger.info(f"subject: {video_subject}")
    search_terms = []
    response = ""
    for i in range(_max_retries):
        try:
            response = _generate_response(prompt)
            if "Error: " in response:
                logger.error(f"failed to generate video terms: {response}")
                return response
            search_terms = json.loads(response)
-            if not isinstance(search_terms, list) or not all(isinstance(term, str) for term in search_terms):
+            if not isinstance(search_terms, list) or not all(
                isinstance(term, str) for term in search_terms
            ):
                logger.error("response is not a list of strings.")
                continue
        except Exception as e:
-            match = re.search(r'\[.*]', response)
+            logger.warning(f"failed to generate video terms: {str(e)}")
-            if match:
+            if response:
-                try:
+                match = re.search(r"\[.*]", response)
-                    search_terms = json.loads(match.group())
+                if match:
-                except json.JSONDecodeError:
+                    try:
-                    pass
+                        search_terms = json.loads(match.group())
                    except Exception as e:
                        logger.warning(f"failed to generate video terms: {str(e)}")
                        pass
        if search_terms and len(search_terms) > 0:
            break
@@ -322,9 +399,13 @@ Please note that you must use English for generating video search terms; Chinese
 if __name__ == "__main__":
    video_subject = "生命的意义是什么"
-    script = generate_script(video_subject=video_subject, language="zh-CN", paragraph_number=1)
+    script = generate_script(
        video_subject=video_subject, language="zh-CN", paragraph_number=1
    )
    print("######################")
    print(script)
-    search_terms = generate_terms(video_subject=video_subject, video_script=script, amount=5)
+    search_terms = generate_terms(
        video_subject=video_subject, video_script=script, amount=5
    )
    print("######################")
    print(search_terms)
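
The fallback in `generate_terms` above (try `json.loads` first, then pull the first `[...]` span out of the response with a regular expression) can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `parse_search_terms` is a hypothetical helper name, and returning an empty list on failure is an assumption made for the example.

```python
import json
import re
from typing import List


def parse_search_terms(response: str) -> List[str]:
    """Parse a JSON list of strings, falling back to the first [...] span.

    Hypothetical helper; returns [] whenever no valid list of strings is found.
    """
    try:
        terms = json.loads(response)
    except json.JSONDecodeError:
        # Same pattern as in the diff: greedy match from the first "[" to the
        # last "]" on a line ("." does not match newlines by default).
        match = re.search(r"\[.*]", response)
        if not match:
            return []
        try:
            terms = json.loads(match.group())
        except json.JSONDecodeError:
            return []
    # Accept only a list whose items are all strings.
    if isinstance(terms, list) and all(isinstance(t, str) for t in terms):
        return terms
    return []


if __name__ == "__main__":
    print(parse_search_terms('Sure, here they are: ["cat", "dog"]'))
```

Because the pattern is greedy and single-line, it relies on the response having its newlines removed first, which `_generate_response` does before returning.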
--- a/app/services/material.py
+++ b/app/services/material.py
@@ -1,14 +1,14 @@
 import os
 import random
+from typing import List
 from urllib.parse import urlencode
 import requests
-from typing import List
 from loguru import logger
 from moviepy.video.io.VideoFileClip import VideoFileClip
 from app.config import config
-from app.models.schema import VideoAspect, VideoConcatMode, MaterialInfo
+from app.models.schema import MaterialInfo, VideoAspect, VideoConcatMode
 from app.utils import utils
 requested_count = 0
@@ -19,7 +19,8 @@ def get_api_key(cfg_key: str):
    if not api_keys:
        raise ValueError(
            f"\n\n##### {cfg_key} is not set #####\n\nPlease set it in the config.toml file: {config.config_file}\n\n"
-            f"{utils.to_json(config.app)}")
+            f"{utils.to_json(config.app)}"
        )
    # if only one key is provided, return it
    if isinstance(api_keys, str):
@@ -30,28 +31,32 @@ def get_api_key(cfg_key: str):
    return api_keys[requested_count % len(api_keys)]
-def search_videos_pexels(search_term: str,
+def search_videos_pexels(
-                         minimum_duration: int,
+    search_term: str,
-                         video_aspect: VideoAspect = VideoAspect.portrait,
+    minimum_duration: int,
-                         ) -> List[MaterialInfo]:
+    video_aspect: VideoAspect = VideoAspect.portrait,
 ) -> List[MaterialInfo]:
    aspect = VideoAspect(video_aspect)
    video_orientation = aspect.name
    video_width, video_height = aspect.to_resolution()
    api_key = get_api_key("pexels_api_keys")
    headers = {
-        "Authorization": api_key
+        "Authorization": api_key,
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36",
    }
    # Build URL
-    params = {
-        "query": search_term,
-        "per_page": 20,
-        "orientation": video_orientation
-    }
+    params = {"query": search_term, "per_page": 20, "orientation": video_orientation}
    query_url = f"https://api.pexels.com/videos/search?{urlencode(params)}"
    logger.info(f"searching videos: {query_url}, with proxies: {config.proxy}")
    try:
-        r = requests.get(query_url, headers=headers, proxies=config.proxy, verify=False, timeout=(30, 60))
+        r = requests.get(
            query_url,
            headers=headers,
            proxies=config.proxy,
            verify=False,
            timeout=(30, 60),
        )
        response = r.json()
        video_items = []
        if "videos" not in response:
@@ -83,10 +88,11 @@ def search_videos_pexels(search_term: str,
    return []
-def search_videos_pixabay(search_term: str,
+def search_videos_pixabay(
-                          minimum_duration: int,
+    search_term: str,
-                          video_aspect: VideoAspect = VideoAspect.portrait,
+    minimum_duration: int,
-                          ) -> List[MaterialInfo]:
+    video_aspect: VideoAspect = VideoAspect.portrait,
 ) -> List[MaterialInfo]:
    aspect = VideoAspect(video_aspect)
    video_width, video_height = aspect.to_resolution()
@@ -97,13 +103,15 @@ def search_videos_pixabay(search_term: str,
        "q": search_term,
        "video_type": "all",  # Accepted values: "all", "film", "animation"
        "per_page": 50,
-        "key": api_key
+        "key": api_key,
    }
    query_url = f"https://pixabay.com/api/videos/?{urlencode(params)}"
    logger.info(f"searching videos: {query_url}, with proxies: {config.proxy}")
    try:
-        r = requests.get(query_url, proxies=config.proxy, verify=False, timeout=(30, 60))
+        r = requests.get(
            query_url, proxies=config.proxy, verify=False, timeout=(30, 60)
        )
        response = r.json()
        video_items = []
        if "hits" not in response:
@@ -121,7 +129,7 @@ def search_videos_pixabay(search_term: str,
            for video_type in video_files:
                video = video_files[video_type]
                w = int(video["width"])
-                h = int(video["height"])
+                # h = int(video["height"])
                if w >= video_width:
                    item = MaterialInfo()
                    item.provider = "pixabay"
@@ -153,9 +161,21 @@ def save_video(video_url: str, save_dir: str = "") -> str:
        logger.info(f"video already exists: {video_path}")
        return video_path
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36"
    }
    # if video does not exist, download it
    with open(video_path, "wb") as f:
-        f.write(requests.get(video_url, proxies=config.proxy, verify=False, timeout=(60, 240)).content)
+        f.write(
            requests.get(
                video_url,
                headers=headers,
                proxies=config.proxy,
                verify=False,
                timeout=(60, 240),
            ).content
        )
    if os.path.exists(video_path) and os.path.getsize(video_path) > 0:
        try:
@@ -168,20 +188,21 @@ def save_video(video_url: str, save_dir: str = "") -> str:
        except Exception as e:
            try:
                os.remove(video_path)
-            except Exception as e:
+            except Exception:
                pass
            logger.warning(f"invalid video file: {video_path} => {str(e)}")
    return ""
-def download_videos(task_id: str,
+def download_videos(
-                    search_terms: List[str],
+    task_id: str,
-                    source: str = "pexels",
+    search_terms: List[str],
-                    video_aspect: VideoAspect = VideoAspect.portrait,
+    source: str = "pexels",
-                    video_contact_mode: VideoConcatMode = VideoConcatMode.random,
+    video_aspect: VideoAspect = VideoAspect.portrait,
-                    audio_duration: float = 0.0,
+    video_contact_mode: VideoConcatMode = VideoConcatMode.random,
-                    max_clip_duration: int = 5,
+    audio_duration: float = 0.0,
-                    ) -> List[str]:
+    max_clip_duration: int = 5,
 ) -> List[str]:
    valid_video_items = []
    valid_video_urls = []
    found_duration = 0.0
@@ -190,9 +211,11 @@ def download_videos(task_id: str,
        search_videos = search_videos_pixabay
    for search_term in search_terms:
-        video_items = search_videos(search_term=search_term,
+        video_items = search_videos(
-                                    minimum_duration=max_clip_duration,
+            search_term=search_term,
-                                    video_aspect=video_aspect)
+            minimum_duration=max_clip_duration,
            video_aspect=video_aspect,
        )
        logger.info(f"found {len(video_items)} videos for '{search_term}'")
        for item in video_items:
@@ -202,7 +225,8 @@ def download_videos(task_id: str,
                found_duration += item.duration
    logger.info(
-        f"found total videos: {len(valid_video_items)}, required duration: {audio_duration} seconds, found duration: {found_duration} seconds")
+        f"found total videos: {len(valid_video_items)}, required duration: {audio_duration} seconds, found duration: {found_duration} seconds"
    )
    video_paths = []
    material_directory = config.app.get("material_directory", "").strip()
@@ -218,14 +242,18 @@ def download_videos(task_id: str,
    for item in valid_video_items:
        try:
            logger.info(f"downloading video: {item.url}")
-            saved_video_path = save_video(video_url=item.url, save_dir=material_directory)
+            saved_video_path = save_video(
                video_url=item.url, save_dir=material_directory
            )
            if saved_video_path:
                logger.info(f"video saved: {saved_video_path}")
                video_paths.append(saved_video_path)
                seconds = min(max_clip_duration, item.duration)
                total_duration += seconds
                if total_duration > audio_duration:
-                    logger.info(f"total duration of downloaded videos: {total_duration} seconds, skip downloading more")
+                    logger.info(
                        f"total duration of downloaded videos: {total_duration} seconds, skip downloading more"
                    )
                    break
        except Exception as e:
            logger.error(f"failed to download video: {utils.to_json(item)} => {str(e)}")
@@ -234,4 +262,6 @@ def download_videos(task_id: str,
 if __name__ == "__main__":
-    download_videos("test123", ["Money Exchange Medium"], audio_duration=100, source="pixabay")
+    download_videos(
        "test123", ["Money Exchange Medium"], audio_duration=100, source="pixabay"
    )
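
`get_api_key` above returns `api_keys[requested_count % len(api_keys)]`, which rotates through the configured keys. The sketch below shows that round-robin idea on its own. It assumes the counter is incremented once per call (the increment is outside the hunks shown here), and `pick_api_key` and `_requested_count` are hypothetical names used only for illustration.

```python
from typing import List, Union

_requested_count = 0  # hypothetical module-level counter, assumed to grow per call


def pick_api_key(api_keys: Union[str, List[str]]) -> str:
    """Return a single key, cycling through the list when several are given."""
    global _requested_count
    if not api_keys:
        raise ValueError("no API keys configured")
    # A single key given as a plain string is returned unchanged.
    if isinstance(api_keys, str):
        return api_keys
    _requested_count += 1
    # The modulo keeps the index within bounds and wraps around the list.
    return api_keys[_requested_count % len(api_keys)]
```

With three keys, six consecutive calls visit every key twice in the same order, which spreads requests evenly across keys that each have their own rate limit.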
--- a/app/services/state.py
+++ b/app/services/state.py
@@ -1,12 +1,12 @@
 import ast
 from abc import ABC, abstractmethod
 from app.config import config
 from app.models import const
 # Base class for state management
 class BaseState(ABC):
    @abstractmethod
    def update_task(self, task_id: str, state: int, progress: int = 0, **kwargs):
        pass
@@ -15,19 +15,36 @@ class BaseState(ABC):
    def get_task(self, task_id: str):
        pass
    @abstractmethod
    def get_all_tasks(self, page: int, page_size: int):
        pass
 # Memory state management
 class MemoryState(BaseState):
    def __init__(self):
        self._tasks = {}
-    def update_task(self, task_id: str, state: int = const.TASK_STATE_PROCESSING, progress: int = 0, **kwargs):
+    def get_all_tasks(self, page: int, page_size: int):
        start = (page - 1) * page_size
        end = start + page_size
        tasks = list(self._tasks.values())
        total = len(tasks)
        return tasks[start:end], total
    def update_task(
        self,
        task_id: str,
        state: int = const.TASK_STATE_PROCESSING,
        progress: int = 0,
        **kwargs,
    ):
        progress = int(progress)
        if progress > 100:
            progress = 100
        self._tasks[task_id] = {
            "task_id": task_id,
            "state": state,
            "progress": progress,
            **kwargs,
@@ -43,17 +60,46 @@ class MemoryState(BaseState):
 # Redis state management
 class RedisState(BaseState):
-
-    def __init__(self, host='localhost', port=6379, db=0, password=None):
+    def __init__(self, host="localhost", port=6379, db=0, password=None):
        import redis
        self._redis = redis.StrictRedis(host=host, port=port, db=db, password=password)
-    def update_task(self, task_id: str, state: int = const.TASK_STATE_PROCESSING, progress: int = 0, **kwargs):
+    def get_all_tasks(self, page: int, page_size: int):
        start = (page - 1) * page_size
        end = start + page_size
        tasks = []
        cursor = 0
        total = 0
        while True:
            cursor, keys = self._redis.scan(cursor, count=page_size)
            total += len(keys)
            if total > start:
                for key in keys[max(0, start - total):end - total]:
                    task_data = self._redis.hgetall(key)
                    task = {
                        k.decode("utf-8"): self._convert_to_original_type(v) for k, v in task_data.items()
                    }
                    tasks.append(task)
                    if len(tasks) >= page_size:
                        break
            if cursor == 0 or len(tasks) >= page_size:
                break
        return tasks, total
    def update_task(
        self,
        task_id: str,
        state: int = const.TASK_STATE_PROCESSING,
        progress: int = 0,
        **kwargs,
    ):
        progress = int(progress)
        if progress > 100:
            progress = 100
        fields = {
            "task_id": task_id,
            "state": state,
            "progress": progress,
            **kwargs,
@@ -67,7 +113,10 @@ class RedisState(BaseState):
        if not task_data:
            return None
-        task = {key.decode('utf-8'): self._convert_to_original_type(value) for key, value in task_data.items()}
+        task = {
            key.decode("utf-8"): self._convert_to_original_type(value)
            for key, value in task_data.items()
        }
        return task
    def delete_task(self, task_id: str):
@@ -79,7 +128,7 @@ class RedisState(BaseState):
        Convert the value from byte string to its original data type.
        You can extend this method to handle other data types as needed.
        """
-        value_str = value.decode('utf-8')
+        value_str = value.decode("utf-8")
        try:
            # try to convert byte string array to list
@@ -100,4 +149,10 @@ _redis_port = config.app.get("redis_port", 6379)
 _redis_db = config.app.get("redis_db", 0)
 _redis_password = config.app.get("redis_password", None)
-state = RedisState(host=_redis_host, port=_redis_port, db=_redis_db, password=_redis_password) if _enable_redis else MemoryState()
+state = (
    RedisState(
        host=_redis_host, port=_redis_port, db=_redis_db, password=_redis_password
    )
    if _enable_redis
    else MemoryState()
 )
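
The new `MemoryState.get_all_tasks` slices the task list with a 1-based page number. A minimal standalone sketch of that arithmetic is shown below; `paginate` is a hypothetical function name, and the task dictionaries are made-up sample data.

```python
from typing import Any, Dict, List, Tuple


def paginate(
    tasks: Dict[str, Dict[str, Any]], page: int, page_size: int
) -> Tuple[List[Dict[str, Any]], int]:
    """Return one page of tasks plus the total count (pages start at 1)."""
    start = (page - 1) * page_size
    end = start + page_size
    items = list(tasks.values())
    # Slicing past the end of the list yields an empty page, not an error.
    return items[start:end], len(items)


if __name__ == "__main__":
    sample = {f"t{i}": {"task_id": f"t{i}", "progress": i * 20} for i in range(5)}
    page_items, total = paginate(sample, page=2, page_size=2)
    print([t["task_id"] for t in page_items], total)
```

The total is returned alongside the page so that a caller can compute the number of pages without a second query.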
--- a/app/services/subtitle.py
+++ b/app/services/subtitle.py
@@ -1,9 +1,9 @@
 import json
 import os.path
 import re
+from timeit import default_timer as timer
 from faster_whisper import WhisperModel
-from timeit import default_timer as timer
 from loguru import logger
 from app.config import config
@@ -23,18 +23,22 @@ def create(audio_file, subtitle_file: str = ""):
        if not os.path.isdir(model_path) or not os.path.isfile(model_bin_file):
            model_path = model_size
-        logger.info(f"loading model: {model_path}, device: {device}, compute_type: {compute_type}")
+        logger.info(
            f"loading model: {model_path}, device: {device}, compute_type: {compute_type}"
        )
        try:
-            model = WhisperModel(model_size_or_path=model_path,
+            model = WhisperModel(
-                                 device=device,
+                model_size_or_path=model_path, device=device, compute_type=compute_type
-                                 compute_type=compute_type)
+            )
        except Exception as e:
-            logger.error(f"failed to load model: {e} \n\n"
+            logger.error(
-                         f"********************************************\n"
+                f"failed to load model: {e} \n\n"
-                         f"this may be caused by network issue. \n"
+                f"********************************************\n"
-                         f"please download the model manually and put it in the 'models' folder. \n"
+                f"this may be caused by network issue. \n"
-                         f"see [README.md FAQ](https://github.com/harry0703/MoneyPrinterTurbo) for more details.\n"
+                f"please download the model manually and put it in the 'models' folder. \n"
-                         f"********************************************\n\n")
+                f"see [README.md FAQ](https://github.com/harry0703/MoneyPrinterTurbo) for more details.\n"
                f"********************************************\n\n"
            )
            return None
    logger.info(f"start, output file: {subtitle_file}")
@@ -49,7 +53,9 @@ def create(audio_file, subtitle_file: str = ""):
        vad_parameters=dict(min_silence_duration_ms=500),
    )
-    logger.info(f"detected language: '{info.language}', probability: {info.language_probability:.2f}")
+    logger.info(
        f"detected language: '{info.language}', probability: {info.language_probability:.2f}"
    )
    start = timer()
    subtitles = []
@@ -62,11 +68,9 @@ def create(audio_file, subtitle_file: str = ""):
        msg = "[%.2fs -> %.2fs] %s" % (seg_start, seg_end, seg_text)
        logger.debug(msg)
-        subtitles.append({
-            "msg": seg_text,
-            "start_time": seg_start,
-            "end_time": seg_end
-        })
+        subtitles.append(
+            {"msg": seg_text, "start_time": seg_start, "end_time": seg_end}
+        )
    for segment in segments:
        words_idx = 0
@@ -84,7 +88,7 @@ def create(audio_file, subtitle_file: str = ""):
                    is_segmented = True
                seg_end = word.end
-                # 如果包含标点,则断句
+                # If it contains punctuation, then break the sentence.
                seg_text += word.word
                if utils.str_contains_punctuation(word.word):
@@ -119,7 +123,11 @@ def create(audio_file, subtitle_file: str = ""):
    for subtitle in subtitles:
        text = subtitle.get("msg")
        if text:
-            lines.append(utils.text_to_srt(idx, text, subtitle.get("start_time"), subtitle.get("end_time")))
+            lines.append(
                utils.text_to_srt(
                    idx, text, subtitle.get("start_time"), subtitle.get("end_time")
                )
            )
            idx += 1
    sub = "\n".join(lines) + "\n"
@@ -136,12 +144,12 @@ def file_to_subtitles(filename):
    current_times = None
    current_text = ""
    index = 0
-    with open(filename, 'r', encoding="utf-8") as f:
+    with open(filename, "r", encoding="utf-8") as f:
        for line in f:
            times = re.findall("([0-9]*:[0-9]*:[0-9]*,[0-9]*)", line)
            if times:
                current_times = line
-            elif line.strip() == '' and current_times:
+            elif line.strip() == "" and current_times:
                index += 1
                times_texts.append((index, current_times.strip(), current_text.strip()))
                current_times, current_text = None, ""
@@ -150,27 +158,124 @@ def file_to_subtitles(filename):
    return times_texts
 def levenshtein_distance(s1, s2):
    if len(s1) < len(s2):
        return levenshtein_distance(s2, s1)
    if len(s2) == 0:
        return len(s1)
    previous_row = range(len(s2) + 1)
    for i, c1 in enumerate(s1):
        current_row = [i + 1]
        for j, c2 in enumerate(s2):
            insertions = previous_row[j + 1] + 1
            deletions = current_row[j] + 1
            substitutions = previous_row[j] + (c1 != c2)
            current_row.append(min(insertions, deletions, substitutions))
        previous_row = current_row
    return previous_row[-1]
 def similarity(a, b):
    distance = levenshtein_distance(a.lower(), b.lower())
    max_length = max(len(a), len(b))
    return 1 - (distance / max_length)
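
The two helpers above can be exercised on their own, as in the sketch below. It restates them with one change that is an assumption of this example rather than part of the diff: `similarity` returns 1.0 when both inputs are empty, since dividing by `max_length` would otherwise raise `ZeroDivisionError`.

```python
def levenshtein_distance(s1: str, s2: str) -> int:
    """Edit distance using two rolling rows of the dynamic-programming table."""
    if len(s1) < len(s2):
        s1, s2 = s2, s1
    previous_row = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1):
        current_row = [i + 1]
        for j, c2 in enumerate(s2):
            insertions = previous_row[j + 1] + 1
            deletions = current_row[j] + 1
            substitutions = previous_row[j] + (c1 != c2)
            current_row.append(min(insertions, deletions, substitutions))
        previous_row = current_row
    return previous_row[-1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; 1.0 means identical."""
    max_length = max(len(a), len(b))
    if max_length == 0:
        # Assumption for this sketch: two empty strings count as identical.
        return 1.0
    return 1 - levenshtein_distance(a.lower(), b.lower()) / max_length


if __name__ == "__main__":
    print(similarity("The quick brown fox", "the quick brown fox"))
    print(levenshtein_distance("kitten", "sitting"))
```

`correct` below treats a score above 0.8 as a match, so a script line and a merged subtitle may differ in up to roughly one character in five and still be aligned.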
 def correct(subtitle_file, video_script):
    subtitle_items = file_to_subtitles(subtitle_file)
    script_lines = utils.split_string_by_punctuations(video_script)
    corrected = False
-    if len(subtitle_items) == len(script_lines):
+    new_subtitle_items = []
-        for i in range(len(script_lines)):
+    script_index = 0
-            script_line = script_lines[i].strip()
+    subtitle_index = 0
-            subtitle_line = subtitle_items[i][2]
+
-            if script_line != subtitle_line:
+    while script_index < len(script_lines) and subtitle_index < len(subtitle_items):
-                logger.warning(f"line {i + 1}, script: {script_line}, subtitle: {subtitle_line}")
+        script_line = script_lines[script_index].strip()
-                subtitle_items[i] = (subtitle_items[i][0], subtitle_items[i][1], script_line)
+        subtitle_line = subtitle_items[subtitle_index][2].strip()
        if script_line == subtitle_line:
            new_subtitle_items.append(subtitle_items[subtitle_index])
            script_index += 1
            subtitle_index += 1
        else:
            combined_subtitle = subtitle_line
            start_time = subtitle_items[subtitle_index][1].split(" --> ")[0]
            end_time = subtitle_items[subtitle_index][1].split(" --> ")[1]
            next_subtitle_index = subtitle_index + 1
            while next_subtitle_index < len(subtitle_items):
                next_subtitle = subtitle_items[next_subtitle_index][2].strip()
                if similarity(
                    script_line, combined_subtitle + " " + next_subtitle
                ) > similarity(script_line, combined_subtitle):
                    combined_subtitle += " " + next_subtitle
                    end_time = subtitle_items[next_subtitle_index][1].split(" --> ")[1]
                    next_subtitle_index += 1
                else:
                    break
            if similarity(script_line, combined_subtitle) > 0.8:
                logger.warning(
                    f"Merged/Corrected - Script: {script_line}, Subtitle: {combined_subtitle}"
                )
                new_subtitle_items.append(
                    (
                        len(new_subtitle_items) + 1,
                        f"{start_time} --> {end_time}",
                        script_line,
                    )
                )
                corrected = True
            else:
                logger.warning(
                    f"Mismatch - Script: {script_line}, Subtitle: {combined_subtitle}"
                )
                new_subtitle_items.append(
                    (
                        len(new_subtitle_items) + 1,
                        f"{start_time} --> {end_time}",
                        script_line,
                    )
                )
                corrected = True
            script_index += 1
            subtitle_index = next_subtitle_index
    # Process the remaining lines of the script.
    while script_index < len(script_lines):
        logger.warning(f"Extra script line: {script_lines[script_index]}")
        if subtitle_index < len(subtitle_items):
            new_subtitle_items.append(
                (
                    len(new_subtitle_items) + 1,
                    subtitle_items[subtitle_index][1],
                    script_lines[script_index],
                )
            )
            subtitle_index += 1
        else:
            new_subtitle_items.append(
                (
                    len(new_subtitle_items) + 1,
                    "00:00:00,000 --> 00:00:00,000",
                    script_lines[script_index],
                )
            )
        script_index += 1
        corrected = True
    if corrected:
        with open(subtitle_file, "w", encoding="utf-8") as fd:
-            for item in subtitle_items:
+            for i, item in enumerate(new_subtitle_items):
-                fd.write(f"{item[0]}\n{item[1]}\n{item[2]}\n\n")
+                fd.write(f"{i + 1}\n{item[1]}\n{item[2]}\n\n")
-        logger.info(f"subtitle corrected")
+        logger.info("Subtitle corrected")
    else:
-        logger.success(f"subtitle is correct")
+        logger.success("Subtitle is correct")
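
The merge step in `correct` can be read as a greedy loop: keep appending the next subtitle fragment while doing so makes the combined text more similar to the script line. A small sketch of that idea follows. It uses `difflib.SequenceMatcher` from the standard library as a stand-in for the Levenshtein-based `similarity` in this file, and `ratio` and `merge_to_match` are hypothetical names, not functions from the PR.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def ratio(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (stand-in for the PR's similarity())."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def merge_to_match(script_line: str, fragments: List[str], start: int) -> Tuple[str, int]:
    """Greedily merge fragments[start], fragments[start + 1], ... while each
    extra fragment raises the similarity to script_line.

    Returns the merged text and the index of the first unused fragment.
    """
    combined = fragments[start]
    next_index = start + 1
    while next_index < len(fragments):
        candidate = combined + " " + fragments[next_index]
        if ratio(script_line, candidate) > ratio(script_line, combined):
            combined = candidate
            next_index += 1
        else:
            break
    return combined, next_index
```

For example, with fragments `["the quick brown", "fox jumps", "over the lazy dog"]` and the script line `"the quick brown fox jumps"`, the first two fragments are merged and the third is left for the next script line.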
 if __name__ == "__main__":
--- a/app/services/task.py
+++ b/app/services/task.py
@@ -7,57 +7,42 @@ from loguru import logger
 from app.config import config
 from app.models import const
-from app.models.schema import VideoParams, VideoConcatMode
+from app.models.schema import VideoConcatMode, VideoParams
-from app.services import llm, material, voice, video, subtitle
+from app.services import llm, material, subtitle, video, voice
 from app.services import state as sm
 from app.utils import utils
-def start(task_id, params: VideoParams):
+def generate_script(task_id, params):
    """
    {
        "video_subject": "",
        "video_aspect": "横屏 16:9（西瓜视频）",
        "voice_name": "女生-晓晓",
        "enable_bgm": false,
        "font_name": "STHeitiMedium 黑体-中",
        "text_color": "#FFFFFF",
        "font_size": 60,
        "stroke_color": "#000000",
        "stroke_width": 1.5
    }
    """
    logger.info(f"start task: {task_id}")
    sm.state.update_task(task_id, state=const.TASK_STATE_PROCESSING, progress=5)
    video_subject = params.video_subject
    voice_name = voice.parse_voice_name(params.voice_name)
    paragraph_number = params.paragraph_number
    n_threads = params.n_threads
    max_clip_duration = params.video_clip_duration
    logger.info("\n\n## generating video script")
    video_script = params.video_script.strip()
    if not video_script:
-        video_script = llm.generate_script(video_subject=video_subject, language=params.video_language,
+        video_script = llm.generate_script(
-                                           paragraph_number=paragraph_number)
+            video_subject=params.video_subject,
            language=params.video_language,
            paragraph_number=params.paragraph_number,
        )
    else:
        logger.debug(f"video script: \n{video_script}")
    if not video_script:
        sm.state.update_task(task_id, state=const.TASK_STATE_FAILED)
        logger.error("failed to generate video script.")
-        return
+        return None
-    sm.state.update_task(task_id, state=const.TASK_STATE_PROCESSING, progress=10)
+    return video_script
 def generate_terms(task_id, params, video_script):
    logger.info("\n\n## generating video terms")
    video_terms = params.video_terms
    if not video_terms:
-        video_terms = llm.generate_terms(video_subject=video_subject, video_script=video_script, amount=5)
+        video_terms = llm.generate_terms(
            video_subject=params.video_subject, video_script=video_script, amount=5
        )
    else:
        if isinstance(video_terms, str):
-            video_terms = [term.strip() for term in re.split(r'[,，]', video_terms)]
+            video_terms = [term.strip() for term in re.split(r"[,，]", video_terms)]
        elif isinstance(video_terms, list):
            video_terms = [term.strip() for term in video_terms]
        else:
@@ -68,9 +53,13 @@ def start(task_id, params: VideoParams):
    if not video_terms:
        sm.state.update_task(task_id, state=const.TASK_STATE_FAILED)
        logger.error("failed to generate video terms.")
-        return
+        return None
-    script_file = path.join(utils.task_dir(task_id), f"script.json")
+    return video_terms
 def save_script_data(task_id, video_script, video_terms, params):
    script_file = path.join(utils.task_dir(task_id), "script.json")
    script_data = {
        "script": video_script,
        "search_terms": video_terms,
@@ -80,11 +69,16 @@ def start(task_id, params: VideoParams):
    with open(script_file, "w", encoding="utf-8") as f:
        f.write(utils.to_json(script_data))
    sm.state.update_task(task_id, state=const.TASK_STATE_PROCESSING, progress=20)
 def generate_audio(task_id, params, video_script):
    logger.info("\n\n## generating audio")
-    audio_file = path.join(utils.task_dir(task_id), f"audio.mp3")
+    audio_file = path.join(utils.task_dir(task_id), "audio.mp3")
-    sub_maker = voice.tts(text=video_script, voice_name=voice_name, voice_file=audio_file)
+    sub_maker = voice.tts(
        text=video_script,
        voice_name=voice.parse_voice_name(params.voice_name),
        voice_rate=params.voice_rate,
        voice_file=audio_file,
    )
    if sub_maker is None:
        sm.state.update_task(task_id, state=const.TASK_STATE_FAILED)
        logger.error(
@@ -93,86 +87,102 @@ def start(task_id, params: VideoParams):
 2. check if the network is available. If you are in China, it is recommended to use a VPN and enable the global traffic mode.
        """.strip()
        )
-        return
+        return None, None, None
-    audio_duration = voice.get_audio_duration(sub_maker)
+    audio_duration = math.ceil(voice.get_audio_duration(sub_maker))
-    audio_duration = math.ceil(audio_duration)
+    return audio_file, audio_duration, sub_maker
    sm.state.update_task(task_id, state=const.TASK_STATE_PROCESSING, progress=30)
-    subtitle_path = ""
+def generate_subtitle(task_id, params, video_script, sub_maker, audio_file):
-    if params.subtitle_enabled:
+    if not params.subtitle_enabled:
-        subtitle_path = path.join(utils.task_dir(task_id), f"subtitle.srt")
+        return ""
        subtitle_provider = config.app.get("subtitle_provider", "").strip().lower()
        logger.info(f"\n\n## generating subtitle, provider: {subtitle_provider}")
        subtitle_fallback = False
        if subtitle_provider == "edge":
            voice.create_subtitle(text=video_script, sub_maker=sub_maker, subtitle_file=subtitle_path)
            if not os.path.exists(subtitle_path):
                subtitle_fallback = True
                logger.warning("subtitle file not found, fallback to whisper")
-        if subtitle_provider == "whisper" or subtitle_fallback:
+    subtitle_path = path.join(utils.task_dir(task_id), "subtitle.srt")
-            subtitle.create(audio_file=audio_file, subtitle_file=subtitle_path)
+    subtitle_provider = config.app.get("subtitle_provider", "edge").strip().lower()
-            logger.info("\n\n## correcting subtitle")
+    logger.info(f"\n\n## generating subtitle, provider: {subtitle_provider}")
            subtitle.correct(subtitle_file=subtitle_path, video_script=video_script)
-        subtitle_lines = subtitle.file_to_subtitles(subtitle_path)
+    subtitle_fallback = False
-        if not subtitle_lines:
+    if subtitle_provider == "edge":
-            logger.warning(f"subtitle file is invalid: {subtitle_path}")
+        voice.create_subtitle(
-            subtitle_path = ""
+            text=video_script, sub_maker=sub_maker, subtitle_file=subtitle_path
        )
        if not os.path.exists(subtitle_path):
            subtitle_fallback = True
            logger.warning("subtitle file not found, fallback to whisper")
-    sm.state.update_task(task_id, state=const.TASK_STATE_PROCESSING, progress=40)
+    if subtitle_provider == "whisper" or subtitle_fallback:
        subtitle.create(audio_file=audio_file, subtitle_file=subtitle_path)
        logger.info("\n\n## correcting subtitle")
        subtitle.correct(subtitle_file=subtitle_path, video_script=video_script)
-    downloaded_videos = []
+    subtitle_lines = subtitle.file_to_subtitles(subtitle_path)
    if not subtitle_lines:
        logger.warning(f"subtitle file is invalid: {subtitle_path}")
        return ""
    return subtitle_path
 def get_video_materials(task_id, params, video_terms, audio_duration):
    if params.video_source == "local":
        logger.info("\n\n## preprocess local materials")
-        materials = video.preprocess_video(materials=params.video_materials, clip_duration=max_clip_duration)
+        materials = video.preprocess_video(
-        print(materials)
+            materials=params.video_materials, clip_duration=params.video_clip_duration
-
+        )
        if not materials:
            sm.state.update_task(task_id, state=const.TASK_STATE_FAILED)
-            logger.error("no valid materials found, please check the materials and try again.")
+            logger.error(
-            return
+                "no valid materials found, please check the materials and try again."
-        for material_info in materials:
+            )
-            print(material_info)
+            return None
-            downloaded_videos.append(material_info.url)
+        return [material_info.url for material_info in materials]
    else:
        logger.info(f"\n\n## downloading videos from {params.video_source}")
-        downloaded_videos = material.download_videos(task_id=task_id,
+        downloaded_videos = material.download_videos(
-                                                     search_terms=video_terms,
+            task_id=task_id,
-                                                     source=params.video_source,
+            search_terms=video_terms,
-                                                     video_aspect=params.video_aspect,
+            source=params.video_source,
-                                                     video_contact_mode=params.video_concat_mode,
+            video_aspect=params.video_aspect,
-                                                     audio_duration=audio_duration * params.video_count,
+            video_contact_mode=params.video_concat_mode,
-                                                     max_clip_duration=max_clip_duration,
+            audio_duration=audio_duration * params.video_count,
-                                                     )
+            max_clip_duration=params.video_clip_duration,
-    if not downloaded_videos:
+        )
-        sm.state.update_task(task_id, state=const.TASK_STATE_FAILED)
+        if not downloaded_videos:
-        logger.error(
+            sm.state.update_task(task_id, state=const.TASK_STATE_FAILED)
-            "failed to download videos, maybe the network is not available. if you are in China, please use a VPN.")
+            logger.error(
-        return
+                "failed to download videos, maybe the network is not available. if you are in China, please use a VPN."
            )
            return None
        return downloaded_videos
    sm.state.update_task(task_id, state=const.TASK_STATE_PROCESSING, progress=50)
 def generate_final_videos(
    task_id, params, downloaded_videos, audio_file, subtitle_path
 ):
    final_video_paths = []
    combined_video_paths = []
-    video_concat_mode = params.video_concat_mode
+    video_concat_mode = (
-    if params.video_count > 1:
+        params.video_concat_mode if params.video_count == 1 else VideoConcatMode.random
-        video_concat_mode = VideoConcatMode.random
+    )
    video_transition_mode = params.video_transition_mode
    _progress = 50
    for i in range(params.video_count):
        index = i + 1
-        combined_video_path = path.join(utils.task_dir(task_id), f"combined-{index}.mp4")
+        combined_video_path = path.join(
            utils.task_dir(task_id), f"combined-{index}.mp4"
        )
        logger.info(f"\n\n## combining video: {index} => {combined_video_path}")
-        video.combine_videos(combined_video_path=combined_video_path,
+        video.combine_videos(
-                             video_paths=downloaded_videos,
+            combined_video_path=combined_video_path,
-                             audio_file=audio_file,
+            video_paths=downloaded_videos,
-                             video_aspect=params.video_aspect,
+            audio_file=audio_file,
-                             video_concat_mode=video_concat_mode,
+            video_aspect=params.video_aspect,
-                             max_clip_duration=max_clip_duration,
+            video_concat_mode=video_concat_mode,
-                             threads=n_threads)
+            video_transition_mode=video_transition_mode,
            max_clip_duration=params.video_clip_duration,
            threads=params.n_threads,
        )
        _progress += 50 / params.video_count / 2
        sm.state.update_task(task_id, progress=_progress)
@@ -180,13 +190,13 @@ def start(task_id, params: VideoParams):
        final_video_path = path.join(utils.task_dir(task_id), f"final-{index}.mp4")
        logger.info(f"\n\n## generating video: {index} => {final_video_path}")
-        # Put everything together
+        video.generate_video(
-        video.generate_video(video_path=combined_video_path,
+            video_path=combined_video_path,
-                             audio_path=audio_file,
+            audio_path=audio_file,
-                             subtitle_path=subtitle_path,
+            subtitle_path=subtitle_path,
-                             output_file=final_video_path,
+            output_file=final_video_path,
-                             params=params,
+            params=params,
-                             )
+        )
        _progress += 50 / params.video_count / 2
        sm.state.update_task(task_id, progress=_progress)
@@ -194,16 +204,136 @@ def start(task_id, params: VideoParams):
        final_video_paths.append(final_video_path)
        combined_video_paths.append(combined_video_path)
-    logger.success(f"task {task_id} finished, generated {len(final_video_paths)} videos.")
+    return final_video_paths, combined_video_paths
 def start(task_id, params: VideoParams, stop_at: str = "video"):
    logger.info(f"start task: {task_id}, stop_at: {stop_at}")
    sm.state.update_task(task_id, state=const.TASK_STATE_PROCESSING, progress=5)
    if type(params.video_concat_mode) is str:
        params.video_concat_mode = VideoConcatMode(params.video_concat_mode)
    # 1. Generate script
    video_script = generate_script(task_id, params)
    if not video_script or "Error: " in video_script:
        sm.state.update_task(task_id, state=const.TASK_STATE_FAILED)
        return
    sm.state.update_task(task_id, state=const.TASK_STATE_PROCESSING, progress=10)
    if stop_at == "script":
        sm.state.update_task(
            task_id, state=const.TASK_STATE_COMPLETE, progress=100, script=video_script
        )
        return {"script": video_script}
    # 2. Generate terms
    video_terms = ""
    if params.video_source != "local":
        video_terms = generate_terms(task_id, params, video_script)
        if not video_terms:
            sm.state.update_task(task_id, state=const.TASK_STATE_FAILED)
            return
    save_script_data(task_id, video_script, video_terms, params)
    if stop_at == "terms":
        sm.state.update_task(
            task_id, state=const.TASK_STATE_COMPLETE, progress=100, terms=video_terms
        )
        return {"script": video_script, "terms": video_terms}
    sm.state.update_task(task_id, state=const.TASK_STATE_PROCESSING, progress=20)
    # 3. Generate audio
    audio_file, audio_duration, sub_maker = generate_audio(
        task_id, params, video_script
    )
    if not audio_file:
        sm.state.update_task(task_id, state=const.TASK_STATE_FAILED)
        return
    sm.state.update_task(task_id, state=const.TASK_STATE_PROCESSING, progress=30)
    if stop_at == "audio":
        sm.state.update_task(
            task_id,
            state=const.TASK_STATE_COMPLETE,
            progress=100,
            audio_file=audio_file,
        )
        return {"audio_file": audio_file, "audio_duration": audio_duration}
    # 4. Generate subtitle
    subtitle_path = generate_subtitle(
        task_id, params, video_script, sub_maker, audio_file
    )
    if stop_at == "subtitle":
        sm.state.update_task(
            task_id,
            state=const.TASK_STATE_COMPLETE,
            progress=100,
            subtitle_path=subtitle_path,
        )
        return {"subtitle_path": subtitle_path}
    sm.state.update_task(task_id, state=const.TASK_STATE_PROCESSING, progress=40)
    # 5. Get video materials
    downloaded_videos = get_video_materials(
        task_id, params, video_terms, audio_duration
    )
    if not downloaded_videos:
        sm.state.update_task(task_id, state=const.TASK_STATE_FAILED)
        return
    if stop_at == "materials":
        sm.state.update_task(
            task_id,
            state=const.TASK_STATE_COMPLETE,
            progress=100,
            materials=downloaded_videos,
        )
        return {"materials": downloaded_videos}
    sm.state.update_task(task_id, state=const.TASK_STATE_PROCESSING, progress=50)
    # 6. Generate final videos
    final_video_paths, combined_video_paths = generate_final_videos(
        task_id, params, downloaded_videos, audio_file, subtitle_path
    )
    if not final_video_paths:
        sm.state.update_task(task_id, state=const.TASK_STATE_FAILED)
        return
    logger.success(
        f"task {task_id} finished, generated {len(final_video_paths)} videos."
    )
    kwargs = {
        "videos": final_video_paths,
-        "combined_videos": combined_video_paths
+        "combined_videos": combined_video_paths,
        "script": video_script,
        "terms": video_terms,
        "audio_file": audio_file,
        "audio_duration": audio_duration,
        "subtitle_path": subtitle_path,
        "materials": downloaded_videos,
    }
-    sm.state.update_task(task_id, state=const.TASK_STATE_COMPLETE, progress=100, **kwargs)
+    sm.state.update_task(
        task_id, state=const.TASK_STATE_COMPLETE, progress=100, **kwargs
    )
    return kwargs
-# def start_test(task_id, params: VideoParams):
+
-#     print(f"start task {task_id} \n")
+if __name__ == "__main__":
-#     time.sleep(5)
+    task_id = "task_id"
-#     print(f"task {task_id} finished \n")
+    params = VideoParams(
        video_subject="金钱的作用",
        voice_name="zh-CN-XiaoyiNeural-Female",
        voice_rate=1.0,
    )
    start(task_id, params, stop_at="video")
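
The refactored `start` runs the stages in a fixed order and returns early once the stage named by `stop_at` has completed. A minimal sketch of that control flow is given below under simplifying assumptions: `run_pipeline` and the demo stages are hypothetical, each stage signals failure by returning `None`, and the sketch returns all accumulated results, whereas `start` in this PR returns only stage-specific keys for some stop points.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A stage is a (name, function) pair; the function receives the results so far.
Stage = Tuple[str, Callable[[Dict], Optional[Dict]]]


def run_pipeline(stages: List[Stage], stop_at: str) -> Optional[Dict]:
    """Run stages in order and stop after the one named stop_at.

    Returns the accumulated results, or None as soon as a stage fails.
    """
    results: Dict = {}
    for name, stage in stages:
        output = stage(results)
        if output is None:  # a stage signals failure by returning None
            return None
        results.update(output)
        if name == stop_at:
            break
    return results


# Hypothetical stages standing in for generate_script, generate_terms, generate_audio.
demo_stages: List[Stage] = [
    ("script", lambda r: {"script": "hello world"}),
    ("terms", lambda r: {"terms": r["script"].split()}),
    ("audio", lambda r: {"audio_file": "audio.mp3"}),
]
```

Calling `run_pipeline(demo_stages, "terms")` runs the first two stages only, which mirrors `start(task_id, params, stop_at="terms")` skipping audio, subtitle, and video generation.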
--- a/app/services/utils/video_effects.py
+++ b/app/services/utils/video_effects.py
@@ -0,0 +1,21 @@
 from moviepy import Clip, vfx
 # FadeIn
 def fadein_transition(clip: Clip, t: float) -> Clip:
    return clip.with_effects([vfx.FadeIn(t)])
 # FadeOut
 def fadeout_transition(clip: Clip, t: float) -> Clip:
    return clip.with_effects([vfx.FadeOut(t)])
 # SlideIn
 def slidein_transition(clip: Clip, t: float, side: str) -> Clip:
    return clip.with_effects([vfx.SlideIn(t, side)])
 # SlideOut
 def slideout_transition(clip: Clip, t: float, side: str) -> Clip:
    return clip.with_effects([vfx.SlideOut(t, side)])
--- a/app/services/video.py
+++ b/app/services/video.py
@@ -1,15 +1,102 @@
 import glob
 import itertools
 import os
 import random
 import gc
 import shutil
 from typing import List
 from PIL import ImageFont, Image
 from loguru import logger
-from moviepy.editor import *
+from moviepy import (
    AudioFileClip,
    ColorClip,
    CompositeAudioClip,
    CompositeVideoClip,
    ImageClip,
    TextClip,
    VideoFileClip,
    afx,
    concatenate_videoclips,
 )
 from moviepy.video.tools.subtitles import SubtitlesClip
 from PIL import ImageFont
 from app.models import const
-from app.models.schema import VideoAspect, VideoParams, VideoConcatMode, MaterialInfo
+from app.models.schema import (
    MaterialInfo,
    VideoAspect,
    VideoConcatMode,
    VideoParams,
    VideoTransitionMode,
 )
 from app.services.utils import video_effects
 from app.utils import utils
 class SubClippedVideoClip:
    def __init__(self, file_path, start_time=None, end_time=None, width=None, height=None, duration=None):
        self.file_path = file_path
        self.start_time = start_time
        self.end_time = end_time
        self.width = width
        self.height = height
        if duration is None:
            self.duration = end_time - start_time
        else:
            self.duration = duration
    def __str__(self):
        return f"SubClippedVideoClip(file_path={self.file_path}, start_time={self.start_time}, end_time={self.end_time}, duration={self.duration}, width={self.width}, height={self.height})"
 audio_codec = "aac"
 video_codec = "libx264"
 fps = 30
 def close_clip(clip):
    if clip is None:
        return
    try:
        # close main resources
        if hasattr(clip, 'reader') and clip.reader is not None:
            clip.reader.close()
        # close audio resources
        if hasattr(clip, 'audio') and clip.audio is not None:
            if hasattr(clip.audio, 'reader') and clip.audio.reader is not None:
                clip.audio.reader.close()
            del clip.audio
        # close mask resources
        if hasattr(clip, 'mask') and clip.mask is not None:
            if hasattr(clip.mask, 'reader') and clip.mask.reader is not None:
                clip.mask.reader.close()
            del clip.mask
        # handle child clips in composite clips
        if hasattr(clip, 'clips') and clip.clips:
            for child_clip in clip.clips:
                if child_clip is not clip:  # avoid possible circular references
                    close_clip(child_clip)
        # clear clip list
        if hasattr(clip, 'clips'):
            clip.clips = []
    except Exception as e:
        logger.error(f"failed to close clip: {str(e)}")
    del clip
    gc.collect()
 def delete_files(files: List[str] | str):
    if isinstance(files, str):
        files = [files]
    for file in files:
        try:
            os.remove(file)
        except:
            pass
 def get_bgm_file(bgm_type: str = "random", bgm_file: str = ""):
    if not bgm_type:
@@ -27,113 +114,203 @@ def get_bgm_file(bgm_type: str = "random", bgm_file: str = ""):
    return ""
-def combine_videos(combined_video_path: str,
+def combine_videos(
-                   video_paths: List[str],
+    combined_video_path: str,
-                   audio_file: str,
+    video_paths: List[str],
-                   video_aspect: VideoAspect = VideoAspect.portrait,
+    audio_file: str,
-                   video_concat_mode: VideoConcatMode = VideoConcatMode.random,
+    video_aspect: VideoAspect = VideoAspect.portrait,
-                   max_clip_duration: int = 5,
+    video_concat_mode: VideoConcatMode = VideoConcatMode.random,
-                   threads: int = 2,
+    video_transition_mode: VideoTransitionMode = None,
-                   ) -> str:
+    max_clip_duration: int = 5,
    threads: int = 2,
 ) -> str:
    audio_clip = AudioFileClip(audio_file)
    audio_duration = audio_clip.duration
-    logger.info(f"max duration of audio: {audio_duration} seconds")
+    logger.info(f"audio duration: {audio_duration} seconds")
    # Required duration of each clip
    req_dur = audio_duration / len(video_paths)
    req_dur = max_clip_duration
-    logger.info(f"each clip will be maximum {req_dur} seconds long")
+    logger.info(f"maximum clip duration: {req_dur} seconds")
    output_dir = os.path.dirname(combined_video_path)
    aspect = VideoAspect(video_aspect)
    video_width, video_height = aspect.to_resolution()
-    clips = []
+    processed_clips = []
    subclipped_items = []
    video_duration = 0
    raw_clips = []
    for video_path in video_paths:
-        clip = VideoFileClip(video_path).without_audio()
+        clip = VideoFileClip(video_path)
        clip_duration = clip.duration
        clip_w, clip_h = clip.size
        close_clip(clip)
        start_time = 0
        while start_time < clip_duration:
-            end_time = min(start_time + max_clip_duration, clip_duration)
+            end_time = min(start_time + max_clip_duration, clip_duration)            
-            split_clip = clip.subclip(start_time, end_time)
+            if clip_duration - start_time >= max_clip_duration:
-            raw_clips.append(split_clip)
+                subclipped_items.append(SubClippedVideoClip(file_path= video_path, start_time=start_time, end_time=end_time, width=clip_w, height=clip_h))
-            # logger.info(f"splitting from {start_time:.2f} to {end_time:.2f}, clip duration {clip_duration:.2f}, split_clip duration {split_clip.duration:.2f}")
+            start_time = end_time    
            start_time = end_time
            if video_concat_mode.value == VideoConcatMode.sequential.value:
                break
-    # random video_paths order
+    # random subclipped_items order
    if video_concat_mode.value == VideoConcatMode.random.value:
-        random.shuffle(raw_clips)
+        random.shuffle(subclipped_items)
-
+        
    logger.debug(f"total subclipped items: {len(subclipped_items)}")
    # Add downloaded clips over and over until the duration of the audio (max_duration) has been reached
-    while video_duration < audio_duration:
+    for i, subclipped_item in enumerate(subclipped_items):
-        for clip in raw_clips:
+        if video_duration > audio_duration:
-            # Check if clip is longer than the remaining audio
+            break
-            if (audio_duration - video_duration) < clip.duration:
+        
-                clip = clip.subclip(0, (audio_duration - video_duration))
+        logger.debug(f"processing clip {i+1}: {subclipped_item.width}x{subclipped_item.height}, current duration: {video_duration:.2f}s, remaining: {audio_duration - video_duration:.2f}s")
-            # Only shorten clips if the calculated clip length (req_dur) is shorter than the actual clip to prevent still image
+        
-            elif req_dur < clip.duration:
+        try:
-                clip = clip.subclip(0, req_dur)
+            clip = VideoFileClip(subclipped_item.file_path).subclipped(subclipped_item.start_time, subclipped_item.end_time)
-            clip = clip.set_fps(30)
+            clip_duration = clip.duration
            # Not all videos are same size, so we need to resize them
            clip_w, clip_h = clip.size
            if clip_w != video_width or clip_h != video_height:
                clip_ratio = clip.w / clip.h
                video_ratio = video_width / video_height
-
+                logger.debug(f"resizing clip, source: {clip_w}x{clip_h}, ratio: {clip_ratio:.2f}, target: {video_width}x{video_height}, ratio: {video_ratio:.2f}")
                if clip_ratio == video_ratio:
-                    # scale proportionally
+                    clip = clip.resized(new_size=(video_width, video_height))
                    clip = clip.resize((video_width, video_height))
                else:
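                    # scale the video proportionally
                    if clip_ratio > video_ratio:
                        # scale proportionally to the target width
                        scale_factor = video_width / clip_w
                    else:
                        # scale proportionally to the target height
                        scale_factor = video_height / clip_h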
                        scale_factor = video_height / clip_h
                    new_width = int(clip_w * scale_factor)
                    new_height = int(clip_h * scale_factor)
                    clip_resized = clip.resize(newsize=(new_width, new_height))
-                    background = ColorClip(size=(video_width, video_height), color=(0, 0, 0))
+                    background = ColorClip(size=(video_width, video_height), color=(0, 0, 0)).with_duration(clip_duration)
-                    clip = CompositeVideoClip([
+                    clip_resized = clip.resized(new_size=(new_width, new_height)).with_position("center")
-                        background.set_duration(clip.duration),
+                    clip = CompositeVideoClip([background, clip_resized])
-                        clip_resized.set_position("center")
+                    
-                    ])
+                    close_clip(clip_resized)
-
+                    close_clip(background)
-                logger.info(f"resizing video to {video_width} x {video_height}, clip size: {clip_w} x {clip_h}")
+                    
            shuffle_side = random.choice(["left", "right", "top", "bottom"])
            if video_transition_mode is None or video_transition_mode.value == VideoTransitionMode.none.value:
                clip = clip
            elif video_transition_mode.value == VideoTransitionMode.fade_in.value:
                clip = video_effects.fadein_transition(clip, 1)
            elif video_transition_mode.value == VideoTransitionMode.fade_out.value:
                clip = video_effects.fadeout_transition(clip, 1)
            elif video_transition_mode.value == VideoTransitionMode.slide_in.value:
                clip = video_effects.slidein_transition(clip, 1, shuffle_side)
            elif video_transition_mode.value == VideoTransitionMode.slide_out.value:
                clip = video_effects.slideout_transition(clip, 1, shuffle_side)
            elif video_transition_mode.value == VideoTransitionMode.shuffle.value:
                transition_funcs = [
                    lambda c: video_effects.fadein_transition(c, 1),
                    lambda c: video_effects.fadeout_transition(c, 1),
                    lambda c: video_effects.slidein_transition(c, 1, shuffle_side),
                    lambda c: video_effects.slideout_transition(c, 1, shuffle_side),
                ]
                shuffle_transition = random.choice(transition_funcs)
                clip = shuffle_transition(clip)
            if clip.duration > max_clip_duration:
-                clip = clip.subclip(0, max_clip_duration)
+                clip = clip.subclipped(0, max_clip_duration)
-
+                
-            clips.append(clip)
+            # write clip to temp file
            clip_file = f"{output_dir}/temp-clip-{i+1}.mp4"
            clip.write_videofile(clip_file, logger=None, fps=fps, codec=video_codec)
            close_clip(clip)
            processed_clips.append(SubClippedVideoClip(file_path=clip_file, duration=clip.duration, width=clip_w, height=clip_h))
            video_duration += clip.duration
        except Exception as e:
            logger.error(f"failed to process clip: {str(e)}")
    # loop processed clips until the video duration matches or exceeds the audio duration.
    if video_duration < audio_duration:
        logger.warning(f"video duration ({video_duration:.2f}s) is shorter than audio duration ({audio_duration:.2f}s), looping clips to match audio length.")
        base_clips = processed_clips.copy()
        for clip in itertools.cycle(base_clips):
            if video_duration >= audio_duration:
                break
            processed_clips.append(clip)
            video_duration += clip.duration
        logger.info(f"video duration: {video_duration:.2f}s, audio duration: {audio_duration:.2f}s, looped {len(processed_clips)-len(base_clips)} clips")
    # merge video clips progressively, avoid loading all videos at once to avoid memory overflow
    logger.info("starting clip merging process")
    if not processed_clips:
        logger.warning("no clips available for merging")
        return combined_video_path
    # if there is only one clip, use it directly
    if len(processed_clips) == 1:
        logger.info("using single clip directly")
        shutil.copy(processed_clips[0].file_path, combined_video_path)
        delete_files(processed_clips)
        logger.info("video combining completed")
        return combined_video_path
    # create initial video file as base
    base_clip_path = processed_clips[0].file_path
    temp_merged_video = f"{output_dir}/temp-merged-video.mp4"
    temp_merged_next = f"{output_dir}/temp-merged-next.mp4"
    # copy first clip as initial merged video
    shutil.copy(base_clip_path, temp_merged_video)
    # merge remaining video clips one by one
    for i, clip in enumerate(processed_clips[1:], 1):
        logger.info(f"merging clip {i}/{len(processed_clips)-1}, duration: {clip.duration:.2f}s")
        try:
            # load current base video and next clip to merge
            base_clip = VideoFileClip(temp_merged_video)
            next_clip = VideoFileClip(clip.file_path)
            # merge these two clips
            merged_clip = concatenate_videoclips([base_clip, next_clip])
-    video_clip = concatenate_videoclips(clips)
+            # save merged result to temp file
-    video_clip = video_clip.set_fps(30)
+            merged_clip.write_videofile(
-    logger.info(f"writing")
+                filename=temp_merged_next,
-    # https://github.com/harry0703/MoneyPrinterTurbo/issues/111#issuecomment-2032354030
+                threads=threads,
-    video_clip.write_videofile(filename=combined_video_path,
+                logger=None,
-                               threads=threads,
+                temp_audiofile_path=output_dir,
-                               logger=None,
+                audio_codec=audio_codec,
-                               temp_audiofile_path=output_dir,
+                fps=fps,
-                               audio_codec="aac",
+            )
-                               fps=30,
+            close_clip(base_clip)
-                               )
+            close_clip(next_clip)
-    video_clip.close()
+            close_clip(merged_clip)
-    logger.success(f"completed")
+            
            # replace base file with new merged file
            delete_files(temp_merged_video)
            os.rename(temp_merged_next, temp_merged_video)
        except Exception as e:
            logger.error(f"failed to merge clip: {str(e)}")
            continue
    # after merging, rename final result to target file name
    os.rename(temp_merged_video, combined_video_path)
    # clean temp files
    clip_files = [clip.file_path for clip in processed_clips]
    delete_files(clip_files)
    logger.info("video combining completed")
    return combined_video_path
-def wrap_text(text, max_width, font='Arial', fontsize=60):
+def wrap_text(text, max_width, font="Arial", fontsize=60):
-    # 创建字体对象
+    # Create ImageFont
    font = ImageFont.truetype(font, fontsize)
    def get_text_size(inner_text):
@@ -145,13 +322,11 @@ def wrap_text(text, max_width, font='Arial', fontsize=60):
    if width <= max_width:
        return text, height
    # logger.warning(f"wrapping text, max_width: {max_width}, text_width: {width}, text: {text}")
    processed = True
    _wrapped_lines_ = []
    words = text.split(" ")
-    _txt_ = ''
+    _txt_ = ""
    for word in words:
        _before = _txt_
        _txt_ += f"{word} "
@@ -167,14 +342,13 @@ def wrap_text(text, max_width, font='Arial', fontsize=60):
    _wrapped_lines_.append(_txt_)
    if processed:
        _wrapped_lines_ = [line.strip() for line in _wrapped_lines_]
-        result = '\n'.join(_wrapped_lines_).strip()
+        result = "\n".join(_wrapped_lines_).strip()
        height = len(_wrapped_lines_) * height
        # logger.warning(f"wrapped text: {result}")
        return result, height
    _wrapped_lines_ = []
    chars = list(text)
-    _txt_ = ''
+    _txt_ = ""
    for word in chars:
        _txt_ += word
        _width, _height = get_text_size(_txt_)
@@ -182,24 +356,24 @@ def wrap_text(text, max_width, font='Arial', fontsize=60):
            continue
        else:
            _wrapped_lines_.append(_txt_)
-            _txt_ = ''
+            _txt_ = ""
    _wrapped_lines_.append(_txt_)
-    result = '\n'.join(_wrapped_lines_).strip()
+    result = "\n".join(_wrapped_lines_).strip()
    height = len(_wrapped_lines_) * height
    # logger.warning(f"wrapped text: {result}")
    return result, height
-def generate_video(video_path: str,
+def generate_video(
-                   audio_path: str,
+    video_path: str,
-                   subtitle_path: str,
+    audio_path: str,
-                   output_file: str,
+    subtitle_path: str,
-                   params: VideoParams,
+    output_file: str,
-                   ):
+    params: VideoParams,
 ):
    aspect = VideoAspect(params.video_aspect)
    video_width, video_height = aspect.to_resolution()
-    logger.info(f"start, video size: {video_width} x {video_height}")
+    logger.info(f"generating video: {video_width} x {video_height}")
    logger.info(f"  ① video: {video_path}")
    logger.info(f"  ② audio: {audio_path}")
    logger.info(f"  ③ subtitle: {subtitle_path}")
@@ -215,46 +389,71 @@ def generate_video(video_path: str,
        if not params.font_name:
            params.font_name = "STHeitiMedium.ttc"
        font_path = os.path.join(utils.font_dir(), params.font_name)
-        if os.name == 'nt':
+        if os.name == "nt":
            font_path = font_path.replace("\\", "/")
-        logger.info(f"using font: {font_path}")
+        logger.info(f"  ⑤ font: {font_path}")
    def create_text_clip(subtitle_item):
        params.font_size = int(params.font_size)
        params.stroke_width = int(params.stroke_width)
        phrase = subtitle_item[1]
        max_width = video_width * 0.9
-        wrapped_txt, txt_height = wrap_text(phrase,
+        wrapped_txt, txt_height = wrap_text(
-                                            max_width=max_width,
+            phrase, max_width=max_width, font=font_path, fontsize=params.font_size
-                                            font=font_path,
+        )
-                                            fontsize=params.font_size
+        interline = int(params.font_size * 0.25)
-                                            )
+        size=(int(max_width), int(txt_height + params.font_size * 0.25 + (interline * (wrapped_txt.count("\n") + 1))))
        _clip = TextClip(
-            wrapped_txt,
+            text=wrapped_txt,
            font=font_path,
-            fontsize=params.font_size,
+            font_size=params.font_size,
            color=params.text_fore_color,
            bg_color=params.text_background_color,
            stroke_color=params.stroke_color,
            stroke_width=params.stroke_width,
-            print_cmd=False,
+            # interline=interline,
            # size=size,
        )
        duration = subtitle_item[0][1] - subtitle_item[0][0]
-        _clip = _clip.set_start(subtitle_item[0][0])
+        _clip = _clip.with_start(subtitle_item[0][0])
-        _clip = _clip.set_end(subtitle_item[0][1])
+        _clip = _clip.with_end(subtitle_item[0][1])
-        _clip = _clip.set_duration(duration)
+        _clip = _clip.with_duration(duration)
        if params.subtitle_position == "bottom":
-            _clip = _clip.set_position(('center', video_height * 0.95 - _clip.h))
+            _clip = _clip.with_position(("center", video_height * 0.95 - _clip.h))
        elif params.subtitle_position == "top":
-            _clip = _clip.set_position(('center', video_height * 0.1))
+            _clip = _clip.with_position(("center", video_height * 0.05))
-        else:
+        elif params.subtitle_position == "custom":
-            _clip = _clip.set_position(('center', 'center'))
+            # Ensure the subtitle is fully within the screen bounds
            margin = 10  # Additional margin, in pixels
            max_y = video_height - _clip.h - margin
            min_y = margin
            custom_y = (video_height - _clip.h) * (params.custom_position / 100)
            custom_y = max(
                min_y, min(custom_y, max_y)
            )  # Constrain the y value within the valid range
            _clip = _clip.with_position(("center", custom_y))
        else:  # center
            _clip = _clip.with_position(("center", "center"))
        return _clip
-    video_clip = VideoFileClip(video_path)
+    video_clip = VideoFileClip(video_path).without_audio()
-    audio_clip = AudioFileClip(audio_path).volumex(params.voice_volume)
+    audio_clip = AudioFileClip(audio_path).with_effects(
        [afx.MultiplyVolume(params.voice_volume)]
    )
    def make_textclip(text):
        return TextClip(
            text=text,
            font=font_path,
            font_size=params.font_size,
        )
    if subtitle_path and os.path.exists(subtitle_path):
-        sub = SubtitlesClip(subtitles=subtitle_path, encoding='utf-8')
+        sub = SubtitlesClip(
            subtitles=subtitle_path, encoding="utf-8", make_textclip=make_textclip
        )
        text_clips = []
        for item in sub.subtitles:
            clip = create_text_clip(subtitle_item=item)
@@ -264,24 +463,28 @@ def generate_video(video_path: str,
    bgm_file = get_bgm_file(bgm_type=params.bgm_type, bgm_file=params.bgm_file)
    if bgm_file:
        try:
-            bgm_clip = (AudioFileClip(bgm_file)
+            bgm_clip = AudioFileClip(bgm_file).with_effects(
-                        .volumex(params.bgm_volume)
+                [
-                        .audio_fadeout(3))
+                    afx.MultiplyVolume(params.bgm_volume),
-            bgm_clip = afx.audio_loop(bgm_clip, duration=video_clip.duration)
+                    afx.AudioFadeOut(3),
                    afx.AudioLoop(duration=video_clip.duration),
                ]
            )
            audio_clip = CompositeAudioClip([audio_clip, bgm_clip])
        except Exception as e:
            logger.error(f"failed to add bgm: {str(e)}")
-    video_clip = video_clip.set_audio(audio_clip)
+    video_clip = video_clip.with_audio(audio_clip)
-    video_clip.write_videofile(output_file,
+    video_clip.write_videofile(
-                               audio_codec="aac",
+        output_file,
-                               temp_audiofile_path=output_dir,
+        audio_codec=audio_codec,
-                               threads=params.n_threads or 2,
+        temp_audiofile_path=output_dir,
-                               logger=None,
+        threads=params.n_threads or 2,
-                               fps=30,
+        logger=None,
-                               )
+        fps=fps,
    )
    video_clip.close()
-    logger.success(f"completed")
+    del video_clip
 def preprocess_video(materials: List[MaterialInfo], clip_duration=4):
@@ -292,93 +495,40 @@ def preprocess_video(materials: List[MaterialInfo], clip_duration=4):
        ext = utils.parse_extension(material.url)
        try:
            clip = VideoFileClip(material.url)
-        except Exception as e:
+        except Exception:
            clip = ImageClip(material.url)
        width = clip.size[0]
        height = clip.size[1]
        if width < 480 or height < 480:
-            logger.warning(f"video is too small, width: {width}, height: {height}")
+            logger.warning(f"low resolution material: {width}x{height}, minimum 480x480 required")
            continue
        if ext in const.FILE_TYPE_IMAGES:
            logger.info(f"processing image: {material.url}")
-            # 创建一个图片剪辑，并设置持续时间为3秒钟
+            # Create an image clip and set its duration to clip_duration seconds
-            clip = ImageClip(material.url).set_duration(clip_duration).set_position("center")
+            clip = (
-            # 使用resize方法来添加缩放效果。这里使用了lambda函数来使得缩放效果随时间变化。
+                ImageClip(material.url)
-            # 假设我们想要从原始大小逐渐放大到120%的大小。
+                .with_duration(clip_duration)
-            # t代表当前时间，clip.duration为视频总时长，这里是3秒。
+                .with_position("center")
-            # 注意：1 表示100%的大小，所以1.2表示120%的大小
+            )
-            zoom_clip = clip.resize(lambda t: 1 + (clip_duration * 0.03) * (t / clip.duration))
+            # Apply a zoom effect using the resize method.
            # A lambda function is used to make the zoom effect dynamic over time.
            # The zoom starts at the original size and grows linearly to 1 + clip_duration * 0.03
            # (112% for the default clip_duration of 4).
            # t represents the current time, and clip.duration is the total duration of the clip (clip_duration seconds).
            # Note: 1 represents 100% size, so 1.12 represents 112% size.
            zoom_clip = clip.resized(
                lambda t: 1 + (clip_duration * 0.03) * (t / clip.duration)
            )
-            # 如果需要，可以创建一个包含缩放剪辑的复合视频剪辑
+            # Optionally, create a composite video clip containing the zoomed clip.
-            # （这在您想要在视频中添加其他元素时非常有用）
+            # This is useful when you want to add other elements to the video.
            final_clip = CompositeVideoClip([zoom_clip])
-            # 输出视频
+            # Output the video to a file.
            video_file = f"{material.url}.mp4"
            final_clip.write_videofile(video_file, fps=30, logger=None)
-            final_clip.close()
+            close_clip(clip)
            material.url = video_file
-            logger.success(f"completed: {video_file}")
+            logger.success(f"image processed: {video_file}")
-    return materials
+    return materials
 if __name__ == "__main__":
    m = MaterialInfo()
    m.url = "/Users/harry/Downloads/IMG_2915.JPG"
    m.provider = "local"
    materials = preprocess_video([m], clip_duration=4)
    print(materials)
    # txt_en = "Here's your guide to travel hacks for budget-friendly adventures"
    # txt_zh = "测试长字段这是您的旅行技巧指南帮助您进行预算友好的冒险"
    # font = utils.resource_dir() + "/fonts/STHeitiMedium.ttc"
    # for txt in [txt_en, txt_zh]:
    #     t, h = wrap_text(text=txt, max_width=1000, font=font, fontsize=60)
    #     print(t)
    #
    # task_id = "aa563149-a7ea-49c2-b39f-8c32cc225baf"
    # task_dir = utils.task_dir(task_id)
    # video_file = f"{task_dir}/combined-1.mp4"
    # audio_file = f"{task_dir}/audio.mp3"
    # subtitle_file = f"{task_dir}/subtitle.srt"
    # output_file = f"{task_dir}/final.mp4"
    #
    # # video_paths = []
    # # for file in os.listdir(utils.storage_dir("test")):
    # #     if file.endswith(".mp4"):
    # #         video_paths.append(os.path.join(utils.storage_dir("test"), file))
    # #
    # # combine_videos(combined_video_path=video_file,
    # #                audio_file=audio_file,
    # #                video_paths=video_paths,
    # #                video_aspect=VideoAspect.portrait,
    # #                video_concat_mode=VideoConcatMode.random,
    # #                max_clip_duration=5,
    # #                threads=2)
    #
    # cfg = VideoParams()
    # cfg.video_aspect = VideoAspect.portrait
    # cfg.font_name = "STHeitiMedium.ttc"
    # cfg.font_size = 60
    # cfg.stroke_color = "#000000"
    # cfg.stroke_width = 1.5
    # cfg.text_fore_color = "#FFFFFF"
    # cfg.text_background_color = "transparent"
    # cfg.bgm_type = "random"
    # cfg.bgm_file = ""
    # cfg.bgm_volume = 1.0
    # cfg.subtitle_enabled = True
    # cfg.subtitle_position = "bottom"
    # cfg.n_threads = 2
    # cfg.paragraph_number = 1
    #
    # cfg.voice_volume = 1.0
    #
    # generate_video(video_path=video_file,
    #                audio_path=audio_file,
    #                subtitle_path=subtitle_file,
    #                output_file=output_file,
    #                params=cfg
    #                )
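The clip-looping step in `combine_videos` above (cycle through the processed clips until the video is at least as long as the audio) can be sketched in isolation. This is a minimal illustration, not the project's implementation: `ClipInfo` is a hypothetical stand-in for `SubClippedVideoClip`, and `loop_clips_to_duration` is an assumed helper name that does not exist in the repository.

```python
import itertools
from dataclasses import dataclass
from typing import List


@dataclass
class ClipInfo:
    # Hypothetical stand-in for SubClippedVideoClip: only the fields the loop needs.
    file_path: str
    duration: float


def loop_clips_to_duration(clips: List[ClipInfo], target: float) -> List[ClipInfo]:
    """Repeat clips in order until their total duration reaches `target`."""
    result = list(clips)
    total = sum(c.duration for c in result)
    # Guard against an endless cycle when there is nothing with a positive duration.
    if not clips or total <= 0:
        return result
    for clip in itertools.cycle(clips):
        if total >= target:
            break
        result.append(clip)
        total += clip.duration
    return result


clips = [ClipInfo("a.mp4", 2.0), ClipInfo("b.mp4", 3.0)]
looped = loop_clips_to_duration(clips, target=11.0)
print([c.file_path for c in looped])
```

Because only lightweight metadata is repeated, the looped list stays cheap in memory; the actual video files are opened one pair at a time during the merge step.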
--- a/app/services/voice.py
+++ b/app/services/voice.py
@@ -2,21 +2,48 @@ import asyncio
 import os
 import re
 from datetime import datetime
 from typing import Union
 from xml.sax.saxutils import unescape
 import edge_tts
 import requests
 from edge_tts import SubMaker, submaker
 from edge_tts.submaker import mktimestamp
 from loguru import logger
 from edge_tts import submaker, SubMaker
 import edge_tts
 from moviepy.video.tools import subtitles
 from app.config import config
 from app.utils import utils
 def get_siliconflow_voices() -> list[str]:
    """
    Get the list of SiliconFlow voices.
    Returns:
        A list of voices in the format ["siliconflow:FunAudioLLM/CosyVoice2-0.5B:alex-Male", ...]
    """
    # SiliconFlow voices and their genders (used for display)
    voices_with_gender = [
        ("FunAudioLLM/CosyVoice2-0.5B", "alex", "Male"),
        ("FunAudioLLM/CosyVoice2-0.5B", "anna", "Female"),
        ("FunAudioLLM/CosyVoice2-0.5B", "bella", "Female"),
        ("FunAudioLLM/CosyVoice2-0.5B", "benjamin", "Male"),
        ("FunAudioLLM/CosyVoice2-0.5B", "charles", "Male"),
        ("FunAudioLLM/CosyVoice2-0.5B", "claire", "Female"),
        ("FunAudioLLM/CosyVoice2-0.5B", "david", "Male"),
        ("FunAudioLLM/CosyVoice2-0.5B", "diana", "Female"),
    ]
    # Add the "siliconflow:" prefix and format each entry as a display name
    return [
        f"siliconflow:{model}:{voice}-{gender}"
        for model, voice, gender in voices_with_gender
    ]
 def get_all_azure_voices(filter_locals=None) -> list[str]:
-    if filter_locals is None:
-        filter_locals = ["zh-CN", "en-US", "zh-HK", "zh-TW", "vi-VN"]
-    voices_str = """
+    azure_voices_str = """
 Name: af-ZA-AdriNeural
 Gender: Female
@@ -302,21 +329,33 @@ Gender: Female
 Name: en-US-AnaNeural
 Gender: Female
 Name: en-US-AndrewMultilingualNeural
 Gender: Male
 Name: en-US-AndrewNeural
 Gender: Male
 Name: en-US-AriaNeural
 Gender: Female
 Name: en-US-AvaMultilingualNeural
 Gender: Female
 Name: en-US-AvaNeural
 Gender: Female
 Name: en-US-BrianMultilingualNeural
 Gender: Male
 Name: en-US-BrianNeural
 Gender: Male
 Name: en-US-ChristopherNeural
 Gender: Male
 Name: en-US-EmmaMultilingualNeural
 Gender: Female
 Name: en-US-EmmaNeural
 Gender: Female
@@ -602,12 +641,24 @@ Gender: Male
 Name: it-IT-ElsaNeural
 Gender: Female
-Name: it-IT-GiuseppeNeural
+Name: it-IT-GiuseppeMultilingualNeural
 Gender: Male
 Name: it-IT-IsabellaNeural
 Gender: Female
 Name: iu-Cans-CA-SiqiniqNeural
 Gender: Female
 Name: iu-Cans-CA-TaqqiqNeural
 Gender: Male
 Name: iu-Latn-CA-SiqiniqNeural
 Gender: Female
 Name: iu-Latn-CA-TaqqiqNeural
 Gender: Male
 Name: ja-JP-KeitaNeural
 Gender: Male
@@ -644,7 +695,7 @@ Gender: Male
 Name: kn-IN-SapnaNeural
 Gender: Female
-Name: ko-KR-HyunsuNeural
+Name: ko-KR-HyunsuMultilingualNeural
 Gender: Male
 Name: ko-KR-InJoonNeural
@@ -758,7 +809,7 @@ Gender: Male
 Name: pt-BR-FranciscaNeural
 Gender: Female
-Name: pt-BR-ThalitaNeural
+Name: pt-BR-ThalitaMultilingualNeural
 Gender: Female
 Name: pt-PT-DuarteNeural
@@ -988,27 +1039,20 @@ Name: zh-CN-XiaoxiaoMultilingualNeural-V2
 Gender: Female
    """.strip()
    voices = []
-    name = ''
+    # Define a regex pattern that matches the Name and Gender lines
-    for line in voices_str.split("\n"):
+    pattern = re.compile(r"Name:\s*(.+)\s*Gender:\s*(.+)\s*", re.MULTILINE)
-        line = line.strip()
+    # Find all matches using the regex
-        if not line:
+    matches = pattern.findall(azure_voices_str)
-            continue
+
-        if line.startswith("Name: "):
+    for name, gender in matches:
-            name = line[6:].strip()
+        # Apply the locale filter
-        if line.startswith("Gender: "):
+        if filter_locals and any(
-            gender = line[8:].strip()
+            name.lower().startswith(fl.lower()) for fl in filter_locals
-            if name and gender:
+        ):
-                # voices.append({
+            voices.append(f"{name}-{gender}")
-                #     "name": name,
+        elif not filter_locals:
-                #     "gender": gender,
+            voices.append(f"{name}-{gender}")
-                # })
+
-                if filter_locals:
-                    for filter_local in filter_locals:
-                        if name.lower().startswith(filter_local.lower()):
-                            voices.append(f"{name}-{gender}")
-                else:
-                    voices.append(f"{name}-{gender}")
-                name = ''
    voices.sort()
    return voices
@@ -1028,33 +1072,76 @@ def is_azure_v2_voice(voice_name: str):
    return ""
-def tts(text: str, voice_name: str, voice_file: str) -> [SubMaker, None]:
+def is_siliconflow_voice(voice_name: str):
    """Check whether the voice is a SiliconFlow voice."""
    return voice_name.startswith("siliconflow:")
 def tts(
    text: str,
    voice_name: str,
    voice_rate: float,
    voice_file: str,
    voice_volume: float = 1.0,
 ) -> Union[SubMaker, None]:
    if is_azure_v2_voice(voice_name):
        return azure_tts_v2(text, voice_name, voice_file)
-    return azure_tts_v1(text, voice_name, voice_file)
+    elif is_siliconflow_voice(voice_name):
        # Extract the model and voice from voice_name
        # Format: siliconflow:model:voice-Gender
        parts = voice_name.split(":")
        if len(parts) >= 3:
            model = parts[1]
            # Strip the gender suffix, e.g. "alex-Male" -> "alex"
            voice_with_gender = parts[2]
            voice = voice_with_gender.split("-")[0]
            # Build the full voice parameter in the form "model:voice"
            full_voice = f"{model}:{voice}"
            return siliconflow_tts(
                text, model, full_voice, voice_rate, voice_file, voice_volume
            )
        else:
            logger.error(f"Invalid siliconflow voice name format: {voice_name}")
            return None
    return azure_tts_v1(text, voice_name, voice_rate, voice_file)
-def azure_tts_v1(text: str, voice_name: str, voice_file: str) -> [SubMaker, None]:
+def convert_rate_to_percent(rate: float) -> str:
    if rate == 1.0:
        return "+0%"
    percent = round((rate - 1.0) * 100)
    if percent > 0:
        return f"+{percent}%"
    else:
        return f"{percent}%"
 def azure_tts_v1(
    text: str, voice_name: str, voice_rate: float, voice_file: str
 ) -> Union[SubMaker, None]:
    voice_name = parse_voice_name(voice_name)
    text = text.strip()
    rate_str = convert_rate_to_percent(voice_rate)
    for i in range(3):
        try:
            logger.info(f"start, voice name: {voice_name}, try: {i + 1}")
            async def _do() -> SubMaker:
-                communicate = edge_tts.Communicate(text, voice_name)
+                communicate = edge_tts.Communicate(text, voice_name, rate=rate_str)
                sub_maker = edge_tts.SubMaker()
                with open(voice_file, "wb") as file:
                    async for chunk in communicate.stream():
                        if chunk["type"] == "audio":
                            file.write(chunk["data"])
                        elif chunk["type"] == "WordBoundary":
-                            sub_maker.create_sub((chunk["offset"], chunk["duration"]), chunk["text"])
+                            sub_maker.create_sub(
                                (chunk["offset"], chunk["duration"]), chunk["text"]
                            )
                return sub_maker
            sub_maker = asyncio.run(_do())
            if not sub_maker or not sub_maker.subs:
-                logger.warning(f"failed, sub_maker is None or sub_maker.subs is None")
+                logger.warning("failed, sub_maker is None or sub_maker.subs is None")
                continue
            logger.info(f"completed, output file: {voice_file}")
@@ -1064,7 +1151,145 @@ def azure_tts_v1(text: str, voice_name: str, voice_file: str) -> [SubMaker, None
    return None
-def azure_tts_v2(text: str, voice_name: str, voice_file: str) -> [SubMaker, None]:
+def siliconflow_tts(
    text: str,
    model: str,
    voice: str,
    voice_rate: float,
    voice_file: str,
    voice_volume: float = 1.0,
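 ) -> Union[SubMaker, None]:
    """
    Generate speech using the SiliconFlow API.
    Args:
        text: The text to convert to speech
        model: Model name, e.g. "FunAudioLLM/CosyVoice2-0.5B"
        voice: Voice name, e.g. "FunAudioLLM/CosyVoice2-0.5B:alex"
        voice_rate: Speech speed, range [0.25, 4.0]
        voice_file: Path of the output audio file
        voice_volume: Speech volume, range [0.6, 5.0]; converted to SiliconFlow's gain range [-10, 10]
    Returns:
        A SubMaker object, or None
    """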
    text = text.strip()
    api_key = config.siliconflow.get("api_key", "")
    if not api_key:
        logger.error("SiliconFlow API key is not set")
        return None
    # Convert voice_volume to SiliconFlow's gain range
    # The default voice_volume of 1.0 corresponds to a gain of 0
    gain = voice_volume - 1.0
    # Clamp gain to the range [-10, 10]
    gain = max(-10, min(10, gain))
    url = "https://api.siliconflow.cn/v1/audio/speech"
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": "mp3",
        "sample_rate": 32000,
        "stream": False,
        "speed": voice_rate,
        "gain": gain,
    }
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    for i in range(3):  # try up to 3 times
        try:
            logger.info(
                f"start siliconflow tts, model: {model}, voice: {voice}, try: {i + 1}"
            )
            response = requests.post(url, json=payload, headers=headers)
            if response.status_code == 200:
                # Save the audio file
                with open(voice_file, "wb") as f:
                    f.write(response.content)
                # Create an empty SubMaker object
                sub_maker = SubMaker()
                # Get the actual duration of the audio file
                try:
                    # Try to get the audio duration with moviepy
                    from moviepy import AudioFileClip
                    audio_clip = AudioFileClip(voice_file)
                    audio_duration = audio_clip.duration
                    audio_clip.close()
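                    # Convert the audio duration to 100-nanosecond units (compatible with edge_tts)
                    audio_duration_100ns = int(audio_duration * 10000000)
                    # Use text splitting to create more accurate subtitles
                    # Split the text into sentences at punctuation marks
                    sentences = utils.split_string_by_punctuations(text)
                    if sentences:
                        # Estimate each sentence's duration (allocated in proportion to its character count)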
                        total_chars = sum(len(s) for s in sentences)
                        char_duration = (
                            audio_duration_100ns / total_chars if total_chars > 0 else 0
                        )
                        current_offset = 0
                        for sentence in sentences:
                            if not sentence.strip():
                                continue
                            # Compute the duration of the current sentence
                            sentence_chars = len(sentence)
                            sentence_duration = int(sentence_chars * char_duration)
                            # Append to the SubMaker
                            sub_maker.subs.append(sentence)
                            sub_maker.offset.append(
                                (current_offset, current_offset + sentence_duration)
                            )
                            # Advance the offset
                            current_offset += sentence_duration
                    else:
                        # If the text cannot be split, use the whole text as a single subtitle
                        sub_maker.subs = [text]
                        sub_maker.offset = [(0, audio_duration_100ns)]
                except Exception as e:
                    logger.warning(f"Failed to create accurate subtitles: {str(e)}")
                    # Fall back to a simple subtitle
                    sub_maker.subs = [text]
                    # Use the actual audio duration; if it is unavailable, assume 10 seconds
                    sub_maker.offset = [
                        (
                            0,
                            audio_duration_100ns
                            if "audio_duration_100ns" in locals()
                            else 100000000,  # 10 seconds in 100 ns units
                        )
                    ]
                logger.success(f"siliconflow tts succeeded: {voice_file}")
                logger.debug(f"siliconflow subtitles: {sub_maker.subs}, offsets: {sub_maker.offset}")
                return sub_maker
            else:
                logger.error(
                    f"siliconflow tts failed with status code {response.status_code}: {response.text}"
                )
        except Exception as e:
            logger.error(f"siliconflow tts failed: {str(e)}")
    return None
 def azure_tts_v2(text: str, voice_name: str, voice_file: str) -> Union[SubMaker, None]:
    voice_name = is_azure_v2_voice(voice_name)
    if not voice_name:
        logger.error(f"invalid voice name: {voice_name}")
@@ -1074,8 +1299,12 @@ def azure_tts_v2(text: str, voice_name: str, voice_file: str) -> [SubMaker, None
    def _format_duration_to_offset(duration) -> int:
        if isinstance(duration, str):
            time_obj = datetime.strptime(duration, "%H:%M:%S.%f")
-            milliseconds = (time_obj.hour * 3600000) + (time_obj.minute * 60000) + (time_obj.second * 1000) + (
+            milliseconds = (
-                    time_obj.microsecond // 1000)
+                (time_obj.hour * 3600000)
                + (time_obj.minute * 60000)
                + (time_obj.second * 1000)
                + (time_obj.microsecond // 1000)
            )
            return milliseconds * 10000
        if isinstance(duration, int):
@@ -1108,20 +1337,29 @@ def azure_tts_v2(text: str, voice_name: str, voice_file: str) -> [SubMaker, None
            # Creates an instance of a speech config with specified subscription key and service region.
            speech_key = config.azure.get("speech_key", "")
            service_region = config.azure.get("speech_region", "")
-            audio_config = speechsdk.audio.AudioOutputConfig(filename=voice_file, use_default_speaker=True)
+            audio_config = speechsdk.audio.AudioOutputConfig(
-            speech_config = speechsdk.SpeechConfig(subscription=speech_key,
+                filename=voice_file, use_default_speaker=True
-                                                   region=service_region)
+            )
            speech_config = speechsdk.SpeechConfig(
                subscription=speech_key, region=service_region
            )
            speech_config.speech_synthesis_voice_name = voice_name
            # speech_config.set_property(property_id=speechsdk.PropertyId.SpeechServiceResponse_RequestSentenceBoundary,
            #                            value='true')
-            speech_config.set_property(property_id=speechsdk.PropertyId.SpeechServiceResponse_RequestWordBoundary,
+            speech_config.set_property(
-                                       value='true')
+                property_id=speechsdk.PropertyId.SpeechServiceResponse_RequestWordBoundary,
                value="true",
            )
            speech_config.set_speech_synthesis_output_format(
-                speechsdk.SpeechSynthesisOutputFormat.Audio48Khz192KBitRateMonoMp3)
+                speechsdk.SpeechSynthesisOutputFormat.Audio48Khz192KBitRateMonoMp3
-            speech_synthesizer = speechsdk.SpeechSynthesizer(audio_config=audio_config,
+            )
-                                                             speech_config=speech_config)
+            speech_synthesizer = speechsdk.SpeechSynthesizer(
-            speech_synthesizer.synthesis_word_boundary.connect(speech_synthesizer_word_boundary_cb)
+                audio_config=audio_config, speech_config=speech_config
            )
            speech_synthesizer.synthesis_word_boundary.connect(
                speech_synthesizer_word_boundary_cb
            )
            result = speech_synthesizer.speak_text_async(text).get()
            if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
@@ -1129,9 +1367,13 @@ def azure_tts_v2(text: str, voice_name: str, voice_file: str) -> [SubMaker, None
                return sub_maker
            elif result.reason == speechsdk.ResultReason.Canceled:
                cancellation_details = result.cancellation_details
-                logger.error(f"azure v2 speech synthesis canceled: {cancellation_details.reason}")
+                logger.error(
                    f"azure v2 speech synthesis canceled: {cancellation_details.reason}"
                )
                if cancellation_details.reason == speechsdk.CancellationReason.Error:
-                    logger.error(f"azure v2 speech synthesis error: {cancellation_details.error_details}")
+                    logger.error(
                        f"azure v2 speech synthesis error: {cancellation_details.error_details}"
                    )
            logger.info(f"completed, output file: {voice_file}")
        except Exception as e:
            logger.error(f"failed, error: {str(e)}")
@@ -1168,11 +1410,7 @@ def create_subtitle(sub_maker: submaker.SubMaker, text: str, subtitle_file: str)
        """
        start_t = mktimestamp(start_time).replace(".", ",")
        end_t = mktimestamp(end_time).replace(".", ",")
-        return (
+        return f"{idx}\n{start_t} --> {end_t}\n{sub_text}\n"
            f"{idx}\n"
            f"{start_t} --> {end_t}\n"
            f"{sub_text}\n"
        )
    start_time = -1.0
    sub_items = []
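The formatter in the hunk above emits one SRT entry: an index line, a `start --> end` line with a comma as the decimal separator, and the subtitle text. A rough sketch of that layout follows. `format_srt_timestamp` is a hypothetical stand-in for `mktimestamp` and is assumed to take seconds; the input unit of the real helper is not shown in this diff.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS,mmm (hypothetical stand-in for mktimestamp)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def srt_entry(idx: int, start: float, end: float, text: str) -> str:
    """Build a single SRT entry in the same shape as the formatter in the diff."""
    start_t = format_srt_timestamp(start)
    end_t = format_srt_timestamp(end)
    return f"{idx}\n{start_t} --> {end_t}\n{text}\n"


print(srt_entry(1, 0.0, 1.5, "hello"))
```

Joining such entries with a newline between them gives the blank-line separation that SRT players expect.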
@@ -1229,12 +1467,16 @@ def create_subtitle(sub_maker: submaker.SubMaker, text: str, subtitle_file: str)
            try:
                sbs = subtitles.file_to_subtitles(subtitle_file, encoding="utf-8")
                duration = max([tb for ((ta, tb), txt) in sbs])
-                logger.info(f"completed, subtitle file created: {subtitle_file}, duration: {duration}")
+                logger.info(
                    f"completed, subtitle file created: {subtitle_file}, duration: {duration}"
                )
            except Exception as e:
                logger.error(f"failed, error: {str(e)}")
                os.remove(subtitle_file)
        else:
-            logger.warning(f"failed, sub_items len: {len(sub_items)}, script_lines len: {len(script_lines)}")
+            logger.warning(
                f"failed, sub_items len: {len(sub_items)}, script_lines len: {len(script_lines)}"
            )
    except Exception as e:
        logger.error(f"failed, error: {str(e)}")
@@ -1258,7 +1500,6 @@ if __name__ == "__main__":
    voices = get_all_azure_voices()
    print(len(voices))
    async def _do():
        temp_dir = utils.storage_dir("temp")
@@ -1307,12 +1548,13 @@ if __name__ == "__main__":
        for voice_name in voice_names:
            voice_file = f"{temp_dir}/tts-{voice_name}.mp3"
            subtitle_file = f"{temp_dir}/tts.mp3.srt"
-            sub_maker = azure_tts_v2(text=text, voice_name=voice_name, voice_file=voice_file)
+            sub_maker = azure_tts_v2(
                text=text, voice_name=voice_name, voice_file=voice_file
            )
            create_subtitle(sub_maker=sub_maker, text=text, subtitle_file=subtitle_file)
            audio_duration = get_audio_duration(sub_maker)
            print(f"voice: {voice_name}, audio duration: {audio_duration}s")
    loop = asyncio.get_event_loop_policy().get_event_loop()
    try:
        loop.run_until_complete(_do())
--- a/app/utils/utils.py
+++ b/app/utils/utils.py
@@ -1,12 +1,13 @@
 import json
 import locale
 import os
-import platform
+from pathlib import Path
 import threading
 from typing import Any
 from loguru import logger
 import json
 from uuid import uuid4
 import urllib3
 from loguru import logger
 from app.models import const
@@ -15,44 +16,44 @@ urllib3.disable_warnings()
 def get_response(status: int, data: Any = None, message: str = ""):
    obj = {
-        'status': status,
+        "status": status,
    }
    if data:
-        obj['data'] = data
+        obj["data"] = data
    if message:
-        obj['message'] = message
+        obj["message"] = message
    return obj
 def to_json(obj):
    try:
-        # 定义一个辅助函数来处理不同类型的对象
+        # Define a helper function to handle different types of objects
        def serialize(o):
-            # 如果对象是可序列化类型，直接返回
+            # If the object is a serializable type, return it directly
            if isinstance(o, (int, float, bool, str)) or o is None:
                return o
-            # 如果对象是二进制数据，转换为base64编码的字符串
+            # If the object is binary data, replace it with a placeholder string
            elif isinstance(o, bytes):
                return "*** binary data ***"
-            # 如果对象是字典，递归处理每个键值对
+            # If the object is a dictionary, recursively process each key-value pair
            elif isinstance(o, dict):
                return {k: serialize(v) for k, v in o.items()}
-            # 如果对象是列表或元组，递归处理每个元素
+            # If the object is a list or tuple, recursively process each element
            elif isinstance(o, (list, tuple)):
                return [serialize(item) for item in o]
-            # 如果对象是自定义类型，尝试返回其__dict__属性
+            # If the object is a custom type, attempt to return its __dict__ attribute
-            elif hasattr(o, '__dict__'):
+            elif hasattr(o, "__dict__"):
                return serialize(o.__dict__)
-            # 其他情况返回None（或者可以选择抛出异常）
+            # Return None for other cases (or choose to raise an exception)
            else:
                return None
-        # 使用serialize函数处理输入对象
+        # Use the serialize function to process the input object
        serialized_obj = serialize(obj)
-        # 序列化处理后的对象为JSON字符串
+        # Serialize the processed object into a JSON string
        return json.dumps(serialized_obj, ensure_ascii=False, indent=4)
-    except Exception as e:
+    except Exception:
        return None
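To make the behaviour of `to_json` concrete, here is a self-contained copy of the function as it appears in the diff, followed by a usage example. The `Task` class is hypothetical and exists only for illustration.

```python
import json
from typing import Any


def to_json(obj: Any):
    def serialize(o):
        # Primitives pass through unchanged
        if isinstance(o, (int, float, bool, str)) or o is None:
            return o
        # Binary data is replaced by a placeholder, not encoded
        elif isinstance(o, bytes):
            return "*** binary data ***"
        elif isinstance(o, dict):
            return {k: serialize(v) for k, v in o.items()}
        # Tuples become lists, since JSON has no tuple type
        elif isinstance(o, (list, tuple)):
            return [serialize(item) for item in o]
        # Custom objects are serialized through their attribute dict
        elif hasattr(o, "__dict__"):
            return serialize(o.__dict__)
        else:
            return None

    try:
        return json.dumps(serialize(obj), ensure_ascii=False, indent=4)
    except Exception:
        return None


class Task:  # hypothetical example type
    def __init__(self):
        self.task_id = "demo"
        self.payload = b"\x00\x01"
        self.tags = ("a", "b")


print(to_json(Task()))
```

Unsupported values such as a `set` are silently mapped to `null`, so callers that need strict serialization may prefer to raise instead.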
@@ -94,7 +95,7 @@ def task_dir(sub_dir: str = ""):
 def font_dir(sub_dir: str = ""):
-    d = resource_dir(f"fonts")
+    d = resource_dir("fonts")
    if sub_dir:
        d = os.path.join(d, sub_dir)
    if not os.path.exists(d):
@@ -103,7 +104,7 @@ def font_dir(sub_dir: str = ""):
 def song_dir(sub_dir: str = ""):
-    d = resource_dir(f"songs")
+    d = resource_dir("songs")
    if sub_dir:
        d = os.path.join(d, sub_dir)
    if not os.path.exists(d):
@@ -112,7 +113,7 @@ def song_dir(sub_dir: str = ""):
 def public_dir(sub_dir: str = ""):
-    d = resource_dir(f"public")
+    d = resource_dir("public")
    if sub_dir:
        d = os.path.join(d, sub_dir)
    if not os.path.exists(d):
@@ -182,7 +183,7 @@ def split_string_by_punctuations(s):
            next_char = s[i + 1]
        if char == "." and previous_char.isdigit() and next_char.isdigit():
-            # 取现1万，按2.5%收取手续费, 2.5 中的 . 不能作为换行标记
+            # In "withdraw 10,000, charged a 2.5% fee", the dot in "2.5" must not be treated as a line-break marker
            txt += char
            continue
@@ -199,7 +200,8 @@ def split_string_by_punctuations(s):
 def md5(text):
    import hashlib
-    return hashlib.md5(text.encode('utf-8')).hexdigest()
+
    return hashlib.md5(text.encode("utf-8")).hexdigest()
 def get_system_locale():
@@ -209,7 +211,7 @@ def get_system_locale():
        # en_US, en_GB return en
        language_code = loc[0].split("_")[0]
        return language_code
-    except Exception as e:
+    except Exception:
        return "en"
@@ -225,4 +227,4 @@ def load_locales(i18n_dir):
 def parse_extension(filename):
-    return os.path.splitext(filename)[1].strip().lower().replace(".", "")
+    return Path(filename).suffix.lower().lstrip('.')
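The `parse_extension` change swaps `os.path.splitext` for `Path.suffix`. The sketch below puts the two versions side by side under hypothetical names (`parse_extension_old`, `parse_extension_new`) so the edge cases can be compared. One difference worth noting: the old version stripped surrounding whitespace, which `Path.suffix` does not do.

```python
import os
from pathlib import Path


def parse_extension_old(filename: str) -> str:
    # Version removed by the diff
    return os.path.splitext(filename)[1].strip().lower().replace(".", "")


def parse_extension_new(filename: str) -> str:
    # Version added by the diff
    return Path(filename).suffix.lower().lstrip(".")


for name in ("video.MP4", "archive.tar.gz", "noext", "clip.mp4 "):
    print(repr(name), repr(parse_extension_old(name)), repr(parse_extension_new(name)))
```

If filenames can arrive with trailing whitespace (for example from user input), calling `.strip()` on the name before parsing would restore the earlier behaviour.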
--- a/changelog.py
+++ b/changelog.py
@@ -12,6 +12,6 @@ build_and_render(
    parse_refs=False,
    sections=["build", "deps", "feat", "fix", "refactor"],
    versioning="pep440",
-    bump="1.1.2",   # specify the bump version
+    bump="1.1.2",  # specify the bump version
    in_place=True,
 )
--- a/config.example.toml
+++ b/config.example.toml
@@ -1,194 +1,205 @@
 [app]
 video_source = "pexels" # "pexels" or "pixabay"
-    video_source = "pexels"  # "pexels" or "pixabay"
+# Whether to hide the configuration panel
-    # Pexels API Key
+hide_config = false
    # Register at https://www.pexels.com/api/ to get your API key.
    # You can use multiple keys to avoid rate limits.
    # For example: pexels_api_keys = ["123adsf4567adf89","abd1321cd13efgfdfhi"]
    # Note the format: enclose each key in double quotes and separate multiple keys with commas
    pexels_api_keys = []
-    # Pixabay API Key
+# Pexels API Key
-    # Register at https://pixabay.com/api/docs/ to get your API key.
+# Register at https://www.pexels.com/api/ to get your API key.
-    # You can use multiple keys to avoid rate limits.
+# You can use multiple keys to avoid rate limits.
-    # For example: pixabay_api_keys = ["123adsf4567adf89","abd1321cd13efgfdfhi"]
+# For example: pexels_api_keys = ["123adsf4567adf89","abd1321cd13efgfdfhi"]
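-    # Note the format: enclose each key in double quotes and separate multiple keys with commas
+# Note the format: enclose each key in double quotes and separate multiple keys with commas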
-    pixabay_api_keys = []
+pexels_api_keys = []
-    # 如果你没有 OPENAI API Key，可以使用 g4f 代替，或者使用国内的 Moonshot API
+# Pixabay API Key
-    # If you don't have an OPENAI API Key, you can use g4f instead
+# Register at https://pixabay.com/api/docs/ to get your API key.
 # You can use multiple keys to avoid rate limits.
 # For example: pixabay_api_keys = ["123adsf4567adf89","abd1321cd13efgfdfhi"]
 # Note the format: enclose each key in double quotes and separate multiple keys with commas
 pixabay_api_keys = []
-    # 支持的提供商 (Supported providers):
+# 支持的提供商 (Supported providers):
-    #   openai
+#   openai
-    #   moonshot (月之暗面)
+#   moonshot    (月之暗面)
-    #   oneapi
+#   azure
-    #   g4f
+#   qwen        (通义千问)
-    #   azure
+#   deepseek
-    #   qwen (通义千问)
+#   gemini
-    #   gemini
+#   ollama
-    llm_provider="openai"
+#   g4f
 #   oneapi
 #   cloudflare
 #   ernie       (文心一言)
 llm_provider = "openai"
-    ########## Ollama Settings
+########## Ollama Settings
-    # No need to set it unless you want to use your own proxy
+# No need to set it unless you want to use your own proxy
-    ollama_base_url = ""
+ollama_base_url = ""
-    # Check your available models at https://ollama.com/library
+# Check your available models at https://ollama.com/library
-    ollama_model_name = ""
+ollama_model_name = ""
-    ########## OpenAI API Key
+########## OpenAI API Key
-    # Get your API key at https://platform.openai.com/api-keys
+# Get your API key at https://platform.openai.com/api-keys
-    openai_api_key = ""
+openai_api_key = ""
-    # No need to set it unless you want to use your own proxy
+# No need to set it unless you want to use your own proxy
-    openai_base_url = ""
+openai_base_url = ""
-    # Check your available models at https://platform.openai.com/account/limits
+# Check your available models at https://platform.openai.com/account/limits
-    openai_model_name = "gpt-4-turbo"
+openai_model_name = "gpt-4o-mini"
-    ########## Moonshot API Key
+########## Moonshot API Key
-    # Visit https://platform.moonshot.cn/console/api-keys to get your API key.
+# Visit https://platform.moonshot.cn/console/api-keys to get your API key.
-    moonshot_api_key=""
+moonshot_api_key = ""
-    moonshot_base_url = "https://api.moonshot.cn/v1"
+moonshot_base_url = "https://api.moonshot.cn/v1"
-    moonshot_model_name = "moonshot-v1-8k"
+moonshot_model_name = "moonshot-v1-8k"
-    ########## OneAPI API Key
+########## OneAPI API Key
-    # Visit https://github.com/songquanpeng/one-api to get your API key
+# Visit https://github.com/songquanpeng/one-api to get your API key
-    oneapi_api_key=""
+oneapi_api_key = ""
-    oneapi_base_url=""
+oneapi_base_url = ""
-    oneapi_model_name=""
+oneapi_model_name = ""
-    ########## G4F
+########## G4F
-    # Visit https://github.com/xtekky/gpt4free to get more details
+# Visit https://github.com/xtekky/gpt4free to get more details
-    # Supported model list: https://github.com/xtekky/gpt4free/blob/main/g4f/models.py
+# Supported model list: https://github.com/xtekky/gpt4free/blob/main/g4f/models.py
-    g4f_model_name = "gpt-3.5-turbo"
+g4f_model_name = "gpt-3.5-turbo"
-    ########## Azure API Key
+########## Azure API Key
-    # Visit https://learn.microsoft.com/zh-cn/azure/ai-services/openai/ to get more details
+# Visit https://learn.microsoft.com/zh-cn/azure/ai-services/openai/ to get more details
-    # API documentation: https://learn.microsoft.com/zh-cn/azure/ai-services/openai/reference
+# API documentation: https://learn.microsoft.com/zh-cn/azure/ai-services/openai/reference
-    azure_api_key = ""
+azure_api_key = ""
-    azure_base_url=""
+azure_base_url = ""
-    azure_model_name="gpt-35-turbo" # replace with your model deployment name
+azure_model_name = "gpt-35-turbo"        # replace with your model deployment name
-    azure_api_version = "2024-02-15-preview"
+azure_api_version = "2024-02-15-preview"
-    ########## Gemini API Key
+########## Gemini API Key
-    gemini_api_key=""
+gemini_api_key = ""
-    gemini_model_name = "gemini-1.0-pro"
+gemini_model_name = "gemini-1.0-pro"
-    ########## Qwen API Key
+########## Qwen API Key
-    # Visit https://dashscope.console.aliyun.com/apiKey to get your API key
+# Visit https://dashscope.console.aliyun.com/apiKey to get your API key
-    # Visit below links to get more details
+# Visit below links to get more details
-    # https://tongyi.aliyun.com/qianwen/
+# https://tongyi.aliyun.com/qianwen/
-    # https://help.aliyun.com/zh/dashscope/developer-reference/model-introduction
+# https://help.aliyun.com/zh/dashscope/developer-reference/model-introduction
-    qwen_api_key = ""
+qwen_api_key = ""
-    qwen_model_name = "qwen-max"
+qwen_model_name = "qwen-max"
-    ########## DeepSeek API Key
+########## DeepSeek API Key
-    # Visit https://platform.deepseek.com/api_keys to get your API key
+# Visit https://platform.deepseek.com/api_keys to get your API key
-    deepseek_api_key = ""
+deepseek_api_key = ""
-    deepseek_base_url = "https://api.deepseek.com"
+deepseek_base_url = "https://api.deepseek.com"
-    deepseek_model_name = "deepseek-chat"
+deepseek_model_name = "deepseek-chat"
-    # Subtitle Provider, "edge" or "whisper"
+# Subtitle Provider, "edge" or "whisper"
-    # If empty, the subtitle will not be generated
+# If empty, the subtitle will not be generated
-    subtitle_provider = "edge"
+subtitle_provider = "edge"
-    #
+#
-    # ImageMagick
+# ImageMagick
-    #
+#
-    # Once you have installed it, ImageMagick will be automatically detected, except on Windows!
+# Once you have installed it, ImageMagick will be automatically detected, except on Windows!
-    # On Windows, for example "C:\Program Files (x86)\ImageMagick-7.1.1-Q16-HDRI\magick.exe"
+# On Windows, for example "C:\Program Files (x86)\ImageMagick-7.1.1-Q16-HDRI\magick.exe"
-    # Download from https://imagemagick.org/archive/binaries/ImageMagick-7.1.1-29-Q16-x64-static.exe
+# Download from https://imagemagick.org/archive/binaries/ImageMagick-7.1.1-29-Q16-x64-static.exe
-    # imagemagick_path = "C:\\Program Files (x86)\\ImageMagick-7.1.1-Q16\\magick.exe"
+# imagemagick_path = "C:\\Program Files (x86)\\ImageMagick-7.1.1-Q16\\magick.exe"
-    #
+#
-    # FFMPEG
+# FFMPEG
-    #
+#
-    # 通常情况下，ffmpeg 会被自动下载，并且会被自动检测到。
+# 通常情况下，ffmpeg 会被自动下载，并且会被自动检测到。
-    # 但是如果你的环境有问题，无法自动下载，可能会遇到如下错误：
+# 但是如果你的环境有问题，无法自动下载，可能会遇到如下错误：
-    #   RuntimeError: No ffmpeg exe could be found.
+#   RuntimeError: No ffmpeg exe could be found.
-    #   Install ffmpeg on your system, or set the IMAGEIO_FFMPEG_EXE environment variable.
+#   Install ffmpeg on your system, or set the IMAGEIO_FFMPEG_EXE environment variable.
-    # 此时你可以手动下载 ffmpeg 并设置 ffmpeg_path，下载地址：https://www.gyan.dev/ffmpeg/builds/
+# 此时你可以手动下载 ffmpeg 并设置 ffmpeg_path，下载地址：https://www.gyan.dev/ffmpeg/builds/
-    # Under normal circumstances, ffmpeg is downloaded automatically and detected automatically.
+# Under normal circumstances, ffmpeg is downloaded automatically and detected automatically.
-    # However, if there is an issue with your environment that prevents automatic downloading, you might encounter the following error:
+# However, if there is an issue with your environment that prevents automatic downloading, you might encounter the following error:
-    #   RuntimeError: No ffmpeg exe could be found.
+#   RuntimeError: No ffmpeg exe could be found.
-    #   Install ffmpeg on your system, or set the IMAGEIO_FFMPEG_EXE environment variable.
+#   Install ffmpeg on your system, or set the IMAGEIO_FFMPEG_EXE environment variable.
-    # In such cases, you can manually download ffmpeg and set the ffmpeg_path, download link: https://www.gyan.dev/ffmpeg/builds/
+# In such cases, you can manually download ffmpeg and set the ffmpeg_path, download link: https://www.gyan.dev/ffmpeg/builds/
-    # ffmpeg_path = "C:\\Users\\harry\\Downloads\\ffmpeg.exe"
+# ffmpeg_path = "C:\\Users\\harry\\Downloads\\ffmpeg.exe"
-    #########################################################################################
+#########################################################################################
-    # 当视频生成成功后，API服务提供的视频下载接入点，默认为当前服务的地址和监听端口
+# 当视频生成成功后，API服务提供的视频下载接入点，默认为当前服务的地址和监听端口
-    # 比如 http://127.0.0.1:8080/tasks/6357f542-a4e1-46a1-b4c9-bf3bd0df5285/final-1.mp4
+# 比如 http://127.0.0.1:8080/tasks/6357f542-a4e1-46a1-b4c9-bf3bd0df5285/final-1.mp4
-    # 如果你需要使用域名对外提供服务（一般会用nginx做代理），则可以设置为你的域名
+# 如果你需要使用域名对外提供服务（一般会用nginx做代理），则可以设置为你的域名
-    # 比如 https://xxxx.com/tasks/6357f542-a4e1-46a1-b4c9-bf3bd0df5285/final-1.mp4
+# 比如 https://xxxx.com/tasks/6357f542-a4e1-46a1-b4c9-bf3bd0df5285/final-1.mp4
-    # endpoint="https://xxxx.com"
+# endpoint="https://xxxx.com"
-    # When the video is successfully generated, the API service provides a download endpoint for the video, defaulting to the service's current address and listening port.
+# When the video is successfully generated, the API service provides a download endpoint for the video, defaulting to the service's current address and listening port.
-    # For example, http://127.0.0.1:8080/tasks/6357f542-a4e1-46a1-b4c9-bf3bd0df5285/final-1.mp4
+# For example, http://127.0.0.1:8080/tasks/6357f542-a4e1-46a1-b4c9-bf3bd0df5285/final-1.mp4
-    # If you need to provide the service externally using a domain name (usually done with nginx as a proxy), you can set it to your domain name.
+# If you need to provide the service externally using a domain name (usually done with nginx as a proxy), you can set it to your domain name.
-    # For example, https://xxxx.com/tasks/6357f542-a4e1-46a1-b4c9-bf3bd0df5285/final-1.mp4
+# For example, https://xxxx.com/tasks/6357f542-a4e1-46a1-b4c9-bf3bd0df5285/final-1.mp4
-    # endpoint="https://xxxx.com"
+# endpoint="https://xxxx.com"
-    endpoint=""
+endpoint = ""
-    # Video material storage location
+# Video material storage location
-    # material_directory = ""                    # Indicates that video materials will be downloaded to the default folder, the default folder is ./storage/cache_videos under the current project
+# material_directory = ""                    # Indicates that video materials will be downloaded to the default folder, the default folder is ./storage/cache_videos under the current project
-    # material_directory = "/user/harry/videos"  # Indicates that video materials will be downloaded to a specified folder
+# material_directory = "/user/harry/videos"  # Indicates that video materials will be downloaded to a specified folder
-    # material_directory = "task"                # Indicates that video materials will be downloaded to the current task's folder, this method does not allow sharing of already downloaded video materials
+# material_directory = "task"                # Indicates that video materials will be downloaded to the current task's folder, this method does not allow sharing of already downloaded video materials
-    # 视频素材存放位置
+# 视频素材存放位置
-    # material_directory = ""                    #表示将视频素材下载到默认的文件夹，默认文件夹为当前项目下的 ./storage/cache_videos
+# material_directory = ""                    #表示将视频素材下载到默认的文件夹，默认文件夹为当前项目下的 ./storage/cache_videos
-    # material_directory = "/user/harry/videos"  #表示将视频素材下载到指定的文件夹中
+# material_directory = "/user/harry/videos"  #表示将视频素材下载到指定的文件夹中
-    # material_directory = "task"                #表示将视频素材下载到当前任务的文件夹中，这种方式无法共享已经下载的视频素材
+# material_directory = "task"                #表示将视频素材下载到当前任务的文件夹中，这种方式无法共享已经下载的视频素材
-    material_directory = ""
+material_directory = ""
-    # Used for state management of the task
+# Used for state management of the task
-    enable_redis = false
+enable_redis = false
-    redis_host = "localhost"
+redis_host = "localhost"
-    redis_port = 6379
+redis_port = 6379
-    redis_db = 0
+redis_db = 0
-    redis_password = ""
+redis_password = ""
-    # Maximum number of concurrent text-to-video tasks
+# Maximum number of concurrent text-to-video tasks
-    max_concurrent_tasks = 5
+max_concurrent_tasks = 5
    # webui界面是否显示配置项
    # Whether the web UI hides the basic config panel
    hide_config = false
 [whisper]
-    # Only effective when subtitle_provider is "whisper"
+# Only effective when subtitle_provider is "whisper"
-    # Run on GPU with FP16
+# Run on GPU with FP16
-    # model = WhisperModel(model_size, device="cuda", compute_type="float16")
+# model = WhisperModel(model_size, device="cuda", compute_type="float16")
-    # Run on GPU with INT8
+# Run on GPU with INT8
-    # model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")
+# model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")
-    # Run on CPU with INT8
+# Run on CPU with INT8
-    # model = WhisperModel(model_size, device="cpu", compute_type="int8")
+# model = WhisperModel(model_size, device="cpu", compute_type="int8")
-    # recommended model_size: "large-v3"
+# recommended model_size: "large-v3"
-    model_size="large-v3"
+model_size = "large-v3"
-    # if you want to use GPU, set device="cuda"
+# if you want to use GPU, set device="cuda"
-    device="CPU"
+device = "CPU"
-    compute_type="int8"
+compute_type = "int8"
 [proxy]
-    ### Use a proxy to access the Pexels API
+### Use a proxy to access the Pexels API
-    ### Format: "http://<username>:<password>@<proxy>:<port>"
+### Format: "http://<username>:<password>@<proxy>:<port>"
-    ### Example: "http://user:pass@proxy:1234"
+### Example: "http://user:pass@proxy:1234"
-    ### Doc: https://requests.readthedocs.io/en/latest/user/advanced/#proxies
+### Doc: https://requests.readthedocs.io/en/latest/user/advanced/#proxies
-    # http = "http://10.10.1.10:3128"
+# http = "http://10.10.1.10:3128"
-    # https = "http://10.10.1.10:1080"
+# https = "http://10.10.1.10:1080"
 [azure]
-    # Azure Speech API Key
+# Azure Speech API Key
-    # Get your API key at https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices
+# Get your API key at https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices
-    speech_key=""
+speech_key = ""
-    speech_region=""
+speech_region = ""
 [siliconflow]
 # SiliconFlow API Key
 # Get your API key at https://siliconflow.cn
 api_key = ""
 [ui]
 # UI related settings
 # 是否隐藏日志信息
 # Whether to hide logs in the UI
 hide_log = false
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -6,7 +6,7 @@ services:
    build:
      context: .
      dockerfile: Dockerfile
-    container_name: "webui"
+    container_name: "moneyprinterturbo-webui"
    ports:
      - "8501:8501"
    command: [ "streamlit", "run", "./webui/Main.py","--browser.serverAddress=127.0.0.1","--server.enableCORS=True","--browser.gatherUsageStats=False" ]
@@ -16,7 +16,7 @@ services:
    build:
      context: .
      dockerfile: Dockerfile
-    container_name: "api"
+    container_name: "moneyprinterturbo-api"
    ports:
      - "8080:8080"
    command: [ "python3", "main.py" ]
--- a/docs/api.jpg
+++ b/docs/api.jpg
--- a/docs/webui-en.jpg
+++ b/docs/webui-en.jpg
--- a/docs/webui.jpg
+++ b/docs/webui.jpg
--- a/docs/wechat-group.jpg
+++ b/docs/wechat-group.jpg
--- a/main.py
+++ b/main.py
@@ -1,8 +1,16 @@
 import uvicorn
 from loguru import logger
 from app.config import config
-if __name__ == '__main__':
+if __name__ == "__main__":
-    logger.info("start server, docs: http://127.0.0.1:" + str(config.listen_port) + "/docs")
+    logger.info(
-    uvicorn.run(app="app.asgi:app", host=config.listen_host, port=config.listen_port, reload=config.reload_debug,
+        "start server, docs: http://127.0.0.1:" + str(config.listen_port) + "/docs"
-                log_level="warning")
+    )
    uvicorn.run(
        app="app.asgi:app",
        host=config.listen_host,
        port=config.listen_port,
        reload=config.reload_debug,
        log_level="warning",
    )
--- a/pdm.lock
+++ b/pdm.lock
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -0,0 +1,32 @@
 [project]
 name = "MoneyPrinterTurbo"
 version = "1.2.3"
 description = "Default template for PDM package"
 authors = [
    {name = "yyhhyyyyyy", email = "yyhhyyyyyy8@gmail.com"},
 ]
 dependencies = [
    "moviepy==2.1.1",
    "streamlit==1.40.2",
    "edge-tts==6.1.19",
    "fastapi==0.115.6",
    "uvicorn==0.32.1",
    "openai==1.56.1",
    "faster-whisper==1.1.0",
    "loguru==0.7.2",
    "google-generativeai==0.8.3",
    "dashscope==1.20.14",
    "g4f==0.3.8.1",
    "azure-cognitiveservices-speech==1.41.1",
    "redis==5.2.0",
    "python-multipart==0.0.19",
    "streamlit-authenticator==0.4.1",
    "pyyaml",
 ]
 requires-python = "==3.11.*"
 readme = "README.md"
 license = {text = "MIT"}
 [tool.pdm]
 distribution = false
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,26 +1,15 @@
-requests~=2.31.0
+moviepy==2.1.2
-moviepy~=2.0.0.dev2
+streamlit==1.45.0
-openai~=1.13.3
+edge_tts==6.1.19
-faster-whisper~=1.0.1
+fastapi==0.115.6
-edge_tts~=6.1.10
+uvicorn==0.32.1
-uvicorn~=0.27.1
+openai==1.56.1
-fastapi~=0.110.0
+faster-whisper==1.1.0
-tomli~=2.0.1
+loguru==0.7.3
-streamlit~=1.33.0
+google.generativeai==0.8.3
-loguru~=0.7.2
+dashscope==1.20.14
-aiohttp~=3.9.3
+g4f==0.5.2.2
-urllib3~=2.2.1
+azure-cognitiveservices-speech==1.41.1
-pillow~=10.3.0
+redis==5.2.0
-pydantic~=2.6.3
+python-multipart==0.0.19
-g4f~=0.3.0.4
+pyyaml
 dashscope~=1.15.0
 google.generativeai~=0.4.1
 python-multipart~=0.0.9
 redis==5.0.3
 # if you use pillow~=10.3.0, you will get "PIL.Image' has no attribute 'ANTIALIAS'" error when resize video
 # please install opencv-python to fix "PIL.Image' has no attribute 'ANTIALIAS'" error
 opencv-python~=4.9.0.80
 # for azure speech
 # https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/9-more-realistic-ai-voices-for-conversations-now-generally/ba-p/4099471
 azure-cognitiveservices-speech~=1.37.0
 git-changelog~=2.5.2
--- a/test/README.md
+++ b/test/README.md
@@ -0,0 +1,39 @@
 # MoneyPrinterTurbo Test Directory
 This directory contains unit tests for the **MoneyPrinterTurbo** project.
 ## Directory Structure
 - `services/`: Tests for components in the `app/services` directory  
  - `test_video.py`: Tests for the video service  
  - `test_task.py`: Tests for the task service  
 ## Running Tests
 You can run the tests using Python’s built-in `unittest` framework:
 ```bash
 # Run all tests
 python -m unittest discover -s test
 # Run a specific test file
 python -m unittest test/services/test_video.py
 # Run a specific test class
 python -m unittest test.services.test_video.TestVideoService
 # Run a specific test method
 python -m unittest test.services.test_video.TestVideoService.test_preprocess_video
 ```
 ## Adding New Tests
 To add tests for other components, follow these guidelines:
 1. Create test files prefixed with `test_` in the appropriate subdirectory
 2. Use `unittest.TestCase` as the base class for your test classes
 3. Name test methods with the `test_` prefix
 ## Test Resources
 Place any resource files required for testing in the `test/resources` directory.
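The guidelines above (a `test_`-prefixed file, a `unittest.TestCase` subclass, `test_`-prefixed methods) can be sketched as follows. The function under test, `parse_aspect`, is hypothetical and defined inline so the example is self-contained; a real test would import from `app` instead.

```python
import unittest


def parse_aspect(aspect: str) -> tuple[int, int]:
    """Hypothetical function under test: "9:16" -> (9, 16)."""
    width, height = aspect.split(":")
    return int(width), int(height)


class TestParseAspect(unittest.TestCase):
    def test_portrait(self):
        self.assertEqual(parse_aspect("9:16"), (9, 16))

    def test_invalid_value_raises(self):
        with self.assertRaises(ValueError):
            parse_aspect("portrait")


def run_tests() -> unittest.TestResult:
    # Load and run this one class programmatically
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestParseAspect)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real test file the `run_tests` helper is usually replaced by a trailing `unittest.main()` guard, as the test modules in this PR do.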
--- a/test/__init__.py
+++ b/test/__init__.py
@@ -0,0 +1 @@
 # Unit test package for test
--- a/test/resources/1.png
+++ b/test/resources/1.png
--- a/test/resources/1.png.mp4
+++ b/test/resources/1.png.mp4
--- a/test/resources/2.png
+++ b/test/resources/2.png
--- a/test/resources/2.png.mp4
+++ b/test/resources/2.png.mp4
--- a/test/resources/3.png
+++ b/test/resources/3.png
--- a/test/resources/3.png.mp4
+++ b/test/resources/3.png.mp4
--- a/test/resources/4.png
+++ b/test/resources/4.png
--- a/test/resources/5.png
+++ b/test/resources/5.png
--- a/test/resources/6.png
+++ b/test/resources/6.png
--- a/test/resources/7.png
+++ b/test/resources/7.png
--- a/test/resources/8.png
+++ b/test/resources/8.png
--- a/test/resources/9.png
+++ b/test/resources/9.png
--- a/test/services/__init__.py
+++ b/test/services/__init__.py
@@ -0,0 +1 @@
 # Unit test package for services
--- a/test/services/test_task.py
+++ b/test/services/test_task.py
@@ -0,0 +1,66 @@
 import unittest
 import os
 import sys
 from pathlib import Path
 # add project root to python path
 sys.path.insert(0, str(Path(__file__).parent.parent.parent))
 from app.services import task as tm
 from app.models.schema import MaterialInfo, VideoParams
 resources_dir = os.path.join(os.path.dirname(os.path.dirname(__file__)), "resources")
 class TestTaskService(unittest.TestCase):
    def setUp(self):
        pass
    def tearDown(self):
        pass
    def test_task_local_materials(self):
        task_id = "00000000-0000-0000-0000-000000000000"
        video_materials = []
        for i in range(1, 4):
            video_materials.append(MaterialInfo(
                provider="local",
                url=os.path.join(resources_dir, f"{i}.png"),
                duration=0
            ))
        params = VideoParams(
            video_subject="金钱的作用",
            video_script="金钱不仅是交换媒介，更是社会资源的分配工具。它能满足基本生存需求，如食物和住房，也能提供教育、医疗等提升生活品质的机会。拥有足够的金钱意味着更多选择权，比如职业自由或创业可能。但金钱的作用也有边界，它无法直接购买幸福、健康或真诚的人际关系。过度追逐财富可能导致价值观扭曲，忽视精神层面的需求。理想的状态是理性看待金钱，将其作为实现目标的工具而非终极目的。",
            video_terms="money importance, wealth and society, financial freedom, money and happiness, role of money",
            video_aspect="9:16",
            video_concat_mode="random",
            video_transition_mode="None",
            video_clip_duration=3,
            video_count=1,
            video_source="local",
            video_materials=video_materials,
            video_language="",
            voice_name="zh-CN-XiaoxiaoNeural-Female",
            voice_volume=1.0,
            voice_rate=1.0,
            bgm_type="random",
            bgm_file="",
            bgm_volume=0.2,
            subtitle_enabled=True,
            subtitle_position="bottom",
            custom_position=70.0,
            font_name="MicrosoftYaHeiBold.ttc",
            text_fore_color="#FFFFFF",
            text_background_color=True,
            font_size=60,
            stroke_color="#000000",
            stroke_width=1.5,
            n_threads=2,
            paragraph_number=1
        )
        result = tm.start(task_id=task_id, params=params)
        print(result)
 if __name__ == "__main__":
     unittest.main()
--- a/test/services/test_video.py
+++ b/test/services/test_video.py
@@ -0,0 +1,85 @@
 import unittest
 import os
 import sys
 from pathlib import Path
 from moviepy import (
    VideoFileClip,
 )
 # add project root to python path
 sys.path.insert(0, str(Path(__file__).parent.parent.parent))
 from app.models.schema import MaterialInfo
 from app.services import video as vd
 from app.utils import utils
 resources_dir = os.path.join(os.path.dirname(os.path.dirname(__file__)), "resources")
 class TestVideoService(unittest.TestCase):
    def setUp(self):
        self.test_img_path = os.path.join(resources_dir, "1.png")
    def tearDown(self):
        pass
    def test_preprocess_video(self):
        if not os.path.exists(self.test_img_path):
            self.fail(f"test image not found: {self.test_img_path}")
        # test preprocess_video function
        m = MaterialInfo()
        m.url = self.test_img_path
        m.provider = "local"
        print(m)
        materials = vd.preprocess_video([m], clip_duration=4)
        print(materials)
        # verify result
        self.assertIsNotNone(materials)
        self.assertEqual(len(materials), 1)
        self.assertTrue(materials[0].url.endswith(".mp4"))
         # read the generated video's info with moviepy
         clip = VideoFileClip(materials[0].url)
         print(clip)
         # release the file handle so the file can be removed below
         clip.close()
        # clean generated test video file
        if os.path.exists(materials[0].url):
            os.remove(materials[0].url)
    def test_wrap_text(self):
        """test text wrapping function"""
        try:
            font_path = os.path.join(utils.font_dir(), "STHeitiMedium.ttc")
            if not os.path.exists(font_path):
                self.fail(f"font file not found: {font_path}")
            # test english text wrapping
            test_text_en = "This is a test text for wrapping long sentences in english language"
            wrapped_text_en, text_height_en = vd.wrap_text(
                text=test_text_en,
                max_width=300,
                font=font_path,
                fontsize=30
            )
            print(wrapped_text_en, text_height_en)
            # verify text is wrapped
            self.assertIn("\n", wrapped_text_en)
            # test chinese text wrapping
            test_text_zh = "这是一段用来测试中文长句换行的文本内容，应该会根据宽度限制进行换行处理"
            wrapped_text_zh, text_height_zh = vd.wrap_text(
                text=test_text_zh,
                max_width=300,
                font=font_path,
                fontsize=30
             )
            print(wrapped_text_zh, text_height_zh)
            # verify chinese text is wrapped
            self.assertIn("\n", wrapped_text_zh)
        except Exception as e:
            self.fail(f"test wrap_text failed: {str(e)}")
 if __name__ == "__main__":
     unittest.main()
--- a/webui/Main.py
+++ b/webui/Main.py
--- a/webui/i18n/de.json
+++ b/webui/i18n/de.json
@@ -1,6 +1,14 @@
 {
-  "Language": "German",
+  "Language": "Deutsch",
  "Translation": {
    "Login Required": "Anmeldung erforderlich",
    "Please login to access settings": "Bitte melden Sie sich an, um auf die Einstellungen zuzugreifen",
    "Username": "Benutzername",
    "Password": "Passwort",
    "Login": "Anmelden",
    "Login Error": "Anmeldefehler",
    "Incorrect username or password": "Falscher Benutzername oder Passwort",
    "Please enter your username and password": "Bitte geben Sie Ihren Benutzernamen und Ihr Passwort ein",
    "Video Script Settings": "**Drehbuch / Topic des Videos**",
    "Video Subject": "Worum soll es in dem Video gehen? (Geben Sie ein Keyword an, :red[Dank KI wird automatisch ein Drehbuch generieren])",
    "Script Language": "Welche Sprache soll zum Generieren von Drehbüchern  verwendet werden? :red[KI generiert anhand dieses Begriffs das Drehbuch]",
@@ -10,12 +18,19 @@
    "Generate Video Keywords": "Klicken Sie, um KI zum Generieren zu verwenden [Video Keywords] basierend auf dem **Drehbuch**",
    "Please Enter the Video Subject": "Bitte geben Sie zuerst das Drehbuch an",
    "Generating Video Script and Keywords": "KI generiert ein Drehbuch und Schlüsselwörter...",
-    "Generating Video Keywords": "AI is generating video keywords...",
+    "Generating Video Keywords": "KI generiert Video-Schlüsselwörter...",
    "Video Keywords": "Video Schlüsselwörter (:blue[① Optional, KI generiert ② Verwende **, (Kommas)** zur Trennung der Wörter, in englischer Sprache])",
    "Video Settings": "**Video Einstellungen**",
    "Video Concat Mode": "Videoverkettungsmodus",
    "Random": "Zufällige Verkettung (empfohlen)",
    "Sequential": "Sequentielle Verkettung",
    "Video Transition Mode": "Video Übergangsmodus",
    "None": "Kein Übergang",
    "Shuffle": "Zufällige Übergänge",
    "FadeIn": "FadeIn",
    "FadeOut": "FadeOut",
    "SlideIn": "SlideIn",
    "SlideOut": "SlideOut",
    "Video Ratio": "Video-Seitenverhältnis",
    "Portrait": "Portrait 9:16",
    "Landscape": "Landschaft 16:9",
@@ -23,9 +38,10 @@
    "Number of Videos Generated Simultaneously": "Anzahl der parallel generierten Videos",
    "Audio Settings": "**Audio Einstellungen**",
    "Speech Synthesis": "Sprachausgabe",
-    "Speech Region": "Region(:red[Required，[Get Region](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
+    "Speech Region": "Region(:red[Erforderlich，[Region abrufen](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
-    "Speech Key": "API Key(:red[Required，[Get API Key](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
+    "Speech Key": "API-Schlüssel(:red[Erforderlich，[API-Schlüssel abrufen](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
    "Speech Volume": "Lautstärke der Sprachausgabe",
    "Speech Rate": "Lesegeschwindigkeit (1,0 bedeutet 1x)",
    "Male": "Männlich",
    "Female": "Weiblich",
    "Background Music": "Hintergrundmusik",
@@ -41,6 +57,7 @@
    "Top": "Oben",
    "Center": "Mittig",
    "Bottom": "Unten (empfohlen)",
    "Custom": "Benutzerdefinierte Position (70, was 70% von oben bedeutet)",
    "Font Size": "Schriftgröße für Untertitel",
    "Font Color": "Schriftfarbe",
    "Stroke Color": "Kontur",
@@ -52,26 +69,37 @@
    "Video Generation Completed": "Video erfolgreich generiert",
    "Video Generation Failed": "Video Generierung fehlgeschlagen",
    "You can download the generated video from the following links": "Sie können das generierte Video über die folgenden Links herunterladen",
-    "Basic Settings": "**Grunde Instellungen**",
+    "Basic Settings": "**Grundeinstellungen** (:blue[Klicken zum Erweitern])",
-    "Pexels API Key": "Pexels API Key ([Get API Key](https://www.pexels.com/api/))",
+    "Language": "Sprache",
-    "Pixabay API Key": "Pixabay API Key ([Get API Key](https://pixabay.com/api/docs/#api_search_videos))",
+    "Pexels API Key": "Pexels API-Schlüssel ([API-Schlüssel abrufen](https://www.pexels.com/api/))",
-    "Language": "Language",
+    "Pixabay API Key": "Pixabay API-Schlüssel ([API-Schlüssel abrufen](https://pixabay.com/api/docs/#api_search_videos))",
-    "LLM Provider": "LLM Provider",
+    "LLM Provider": "KI-Modellanbieter",
-    "API Key": "API Key (:red[Required])",
+    "API Key": "API-Schlüssel (:red[Erforderlich])",
-    "Base Url": "Base Url",
+    "Base Url": "Basis-URL",
-    "Model Name": "Model Name",
+    "Account ID": "Konto-ID (Aus dem Cloudflare-Dashboard)",
-    "Please Enter the LLM API Key": "Please Enter the **LLM API Key**",
+    "Model Name": "Modellname",
-    "Please Enter the Pexels API Key": "Please Enter the **Pexels API Key**",
+    "Please Enter the LLM API Key": "Bitte geben Sie den **KI-Modell API-Schlüssel** ein",
-    "Please Enter the Pixabay API Key": "Please Enter the **Pixabay API Key**",
+    "Please Enter the Pexels API Key": "Bitte geben Sie den **Pexels API-Schlüssel** ein",
-    "Get Help": "If you need help, or have any questions, you can join discord for help: https://harryai.cc",
+    "Please Enter the Pixabay API Key": "Bitte geben Sie den **Pixabay API-Schlüssel** ein",
-    "Video Source": "Video Source",
+    "Get Help": "Wenn Sie Hilfe benötigen oder Fragen haben, können Sie dem Discord beitreten: https://harryai.cc",
-    "TikTok": "TikTok (TikTok support is coming soon)",
+    "Video Source": "Videoquelle",
-    "Bilibili": "Bilibili (Bilibili support is coming soon)",
+    "TikTok": "TikTok (TikTok-Unterstützung kommt bald)",
-    "Xiaohongshu": "Xiaohongshu (Xiaohongshu support is coming soon)",
+    "Bilibili": "Bilibili (Bilibili-Unterstützung kommt bald)",
-    "Local file": "Local file",
+    "Xiaohongshu": "Xiaohongshu (Xiaohongshu-Unterstützung kommt bald)",
-    "Play Voice": "Play Voice",
+    "Local file": "Lokale Datei",
-    "Voice Example": "This is an example text for testing speech synthesis",
+    "Play Voice": "Sprachausgabe abspielen",
-    "Synthesizing Voice": "Synthesizing voice, please wait...",
+    "Voice Example": "Dies ist ein Beispieltext zum Testen der Sprachsynthese",
-    "TTS Provider": "Select the voice synthesis provider"
+    "Synthesizing Voice": "Sprachsynthese läuft, bitte warten...",
    "TTS Provider": "Sprachsynthese-Anbieter auswählen",
    "TTS Servers": "TTS-Server",
    "No voices available for the selected TTS server. Please select another server.": "Keine Stimmen für den ausgewählten TTS-Server verfügbar. Bitte wählen Sie einen anderen Server.",
    "SiliconFlow API Key": "SiliconFlow API-Schlüssel",
    "SiliconFlow TTS Settings": "SiliconFlow TTS-Einstellungen",
    "Speed: Range [0.25, 4.0], default is 1.0": "Geschwindigkeit: Bereich [0.25, 4.0], Standardwert ist 1.0",
    "Volume: Uses Speech Volume setting, default 1.0 maps to gain 0": "Lautstärke: Verwendet die Sprachlautstärke-Einstellung, Standardwert 1.0 entspricht Verstärkung 0",
    "Hide Log": "Protokoll ausblenden",
    "Hide Basic Settings": "Basis-Einstellungen ausblenden\n\nWenn diese Option deaktiviert ist, wird die Basis-Einstellungen-Leiste nicht auf der Seite angezeigt.\n\nWenn Sie sie erneut anzeigen möchten, setzen Sie `hide_config = false` in `config.toml`",
    "LLM Settings": "**LLM-Einstellungen**",
    "Video Source Settings": "**Videoquellen-Einstellungen**"
  }
 }
--- a/webui/i18n/en.json
+++ b/webui/i18n/en.json
@@ -1,6 +1,14 @@
 {
  "Language": "English",
  "Translation": {
    "Login Required": "Login Required",
    "Please login to access settings": "Please login to access settings",
    "Username": "Username",
    "Password": "Password",
    "Login": "Login",
    "Login Error": "Login Error",
    "Incorrect username or password": "Incorrect username or password",
    "Please enter your username and password": "Please enter your username and password",
    "Video Script Settings": "**Video Script Settings**",
    "Video Subject": "Video Subject (Provide a keyword, :red[AI will automatically generate] video script)",
    "Script Language": "Language for Generating Video Script (AI will automatically output based on the language of your subject)",
@@ -16,6 +24,13 @@
    "Video Concat Mode": "Video Concatenation Mode",
    "Random": "Random Concatenation (Recommended)",
    "Sequential": "Sequential Concatenation",
    "Video Transition Mode": "Video Transition Mode",
    "None": "None",
    "Shuffle": "Shuffle",
    "FadeIn": "FadeIn",
    "FadeOut": "FadeOut",
    "SlideIn": "SlideIn",
    "SlideOut": "SlideOut",
    "Video Ratio": "Video Aspect Ratio",
    "Portrait": "Portrait 9:16",
    "Landscape": "Landscape 16:9",
@@ -26,6 +41,7 @@
    "Speech Region": "Region(:red[Required，[Get Region](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
    "Speech Key": "API Key(:red[Required，[Get API Key](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
    "Speech Volume": "Speech Volume (1.0 represents 100%)",
    "Speech Rate": "Speech Rate (1.0 means 1x speed)",
    "Male": "Male",
    "Female": "Female",
    "Background Music": "Background Music",
@@ -41,6 +57,7 @@
    "Top": "Top",
    "Center": "Center",
    "Bottom": "Bottom (Recommended)",
    "Custom": "Custom position (70, indicating 70% down from the top)",
    "Font Size": "Subtitle Font Size",
    "Font Color": "Subtitle Font Color",
    "Stroke Color": "Subtitle Outline Color",
@@ -73,6 +90,16 @@
    "Play Voice": "Play Voice",
    "Voice Example": "This is an example text for testing speech synthesis",
    "Synthesizing Voice": "Synthesizing voice, please wait...",
-    "TTS Provider": "Select the voice synthesis provider"
+    "TTS Provider": "Select the voice synthesis provider",
    "TTS Servers": "TTS Servers",
    "No voices available for the selected TTS server. Please select another server.": "No voices available for the selected TTS server. Please select another server.",
    "SiliconFlow API Key": "SiliconFlow API Key [Click to get](https://cloud.siliconflow.cn/account/ak)",
    "SiliconFlow TTS Settings": "SiliconFlow TTS Settings",
    "Speed: Range [0.25, 4.0], default is 1.0": "Speed: Range [0.25, 4.0], default is 1.0",
    "Volume: Uses Speech Volume setting, default 1.0 maps to gain 0": "Volume: Uses Speech Volume setting, default 1.0 maps to gain 0",
    "Hide Log": "Hide Log",
    "Hide Basic Settings": "Hide Basic Settings\n\nHidden, the basic settings panel will not be displayed on the page.\n\nIf you need to display it again, please set `hide_config = false` in `config.toml`",
    "LLM Settings": "**LLM Settings**",
    "Video Source Settings": "**Video Source Settings**"
  }
 }
--- a/webui/i18n/pt.json
+++ b/webui/i18n/pt.json
@@ -0,0 +1,105 @@
 {
  "Language": "Português Brasileiro",
  "Translation": {
    "Login Required": "Login Necessário",
    "Please login to access settings": "Por favor, faça login para acessar as configurações",
    "Username": "Nome de usuário",
    "Password": "Senha",
    "Login": "Entrar",
    "Login Error": "Erro de Login",
    "Incorrect username or password": "Nome de usuário ou senha incorretos",
    "Please enter your username and password": "Por favor, digite seu nome de usuário e senha",
    "Video Script Settings": "**Configurações do Roteiro do Vídeo**",
    "Video Subject": "Tema do Vídeo (Forneça uma palavra-chave, :red[a IA irá gerar automaticamente] o roteiro do vídeo)",
    "Script Language": "Idioma para Gerar o Roteiro do Vídeo (a IA irá gerar automaticamente com base no idioma do seu tema)",
    "Generate Video Script and Keywords": "Clique para usar a IA para gerar o [Roteiro do Vídeo] e as [Palavras-chave do Vídeo] com base no **tema**",
    "Auto Detect": "Detectar Automaticamente",
    "Video Script": "Roteiro do Vídeo (:blue[① Opcional, gerado pela IA  ② Pontuação adequada ajuda na geração de legendas])",
    "Generate Video Keywords": "Clique para usar a IA para gerar [Palavras-chave do Vídeo] com base no **roteiro**",
    "Please Enter the Video Subject": "Por favor, insira o Roteiro do Vídeo primeiro",
    "Generating Video Script and Keywords": "A IA está gerando o roteiro do vídeo e as palavras-chave...",
    "Generating Video Keywords": "A IA está gerando as palavras-chave do vídeo...",
    "Video Keywords": "Palavras-chave do Vídeo (:blue[① Opcional, gerado pela IA ② Use **vírgulas em inglês** para separar, somente em inglês])",
    "Video Settings": "**Configurações do Vídeo**",
    "Video Concat Mode": "Modo de Concatenação de Vídeo",
    "Random": "Concatenação Aleatória (Recomendado)",
    "Sequential": "Concatenação Sequencial",
    "Video Transition Mode": "Modo de Transição de Vídeo",
    "None": "Nenhuma Transição",
    "Shuffle": "Transição Aleatória",
    "FadeIn": "FadeIn",
    "FadeOut": "FadeOut",
    "SlideIn": "SlideIn",
    "SlideOut": "SlideOut",
    "Video Ratio": "Proporção do Vídeo",
    "Portrait": "Retrato 9:16",
    "Landscape": "Paisagem 16:9",
    "Clip Duration": "Duração Máxima dos Clipes de Vídeo (segundos)",
    "Number of Videos Generated Simultaneously": "Número de Vídeos Gerados Simultaneamente",
    "Audio Settings": "**Configurações de Áudio**",
    "Speech Synthesis": "Voz de Síntese de Fala",
    "Speech Region": "Região(:red[Obrigatório，[Obter Região](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
    "Speech Key": "Chave da API(:red[Obrigatório，[Obter Chave da API](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
    "Speech Volume": "Volume da Fala (1.0 representa 100%)",
    "Speech Rate": "Velocidade da Fala (1.0 significa velocidade 1x)",
    "Male": "Masculino",
    "Female": "Feminino",
    "Background Music": "Música de Fundo",
    "No Background Music": "Sem Música de Fundo",
    "Random Background Music": "Música de Fundo Aleatória",
    "Custom Background Music": "Música de Fundo Personalizada",
    "Custom Background Music File": "Por favor, insira o caminho do arquivo para a música de fundo personalizada:",
    "Background Music Volume": "Volume da Música de Fundo (0.2 representa 20%, a música de fundo não deve ser muito alta)",
    "Subtitle Settings": "**Configurações de Legendas**",
    "Enable Subtitles": "Ativar Legendas (Se desmarcado, as configurações abaixo não terão efeito)",
    "Font": "Fonte da Legenda",
    "Position": "Posição da Legenda",
    "Top": "Superior",
    "Center": "Centralizar",
    "Bottom": "Inferior (Recomendado)",
    "Custom": "Posição personalizada (70, indicando 70% abaixo do topo)",
    "Font Size": "Tamanho da Fonte da Legenda",
    "Font Color": "Cor da Fonte da Legenda",
    "Stroke Color": "Cor do Contorno da Legenda",
    "Stroke Width": "Largura do Contorno da Legenda",
    "Generate Video": "Gerar Vídeo",
    "Video Script and Subject Cannot Both Be Empty": "O Tema do Vídeo e o Roteiro do Vídeo não podem estar ambos vazios",
    "Generating Video": "Gerando vídeo, por favor aguarde...",
    "Start Generating Video": "Começar a Gerar Vídeo",
    "Video Generation Completed": "Geração do Vídeo Concluída",
    "Video Generation Failed": "Falha na Geração do Vídeo",
    "You can download the generated video from the following links": "Você pode baixar o vídeo gerado a partir dos seguintes links",
    "Basic Settings": "**Configurações Básicas** (:blue[Clique para expandir])",
    "Language": "Idioma",
    "Pexels API Key": "Chave da API do Pexels ([Obter Chave da API](https://www.pexels.com/api/))",
    "Pixabay API Key": "Chave da API do Pixabay ([Obter Chave da API](https://pixabay.com/api/docs/#api_search_videos))",
    "LLM Provider": "Provedor LLM",
    "API Key": "Chave da API (:red[Obrigatório])",
    "Base Url": "URL Base",
    "Account ID": "ID da Conta (Obter no painel do Cloudflare)",
    "Model Name": "Nome do Modelo",
    "Please Enter the LLM API Key": "Por favor, insira a **Chave da API LLM**",
    "Please Enter the Pexels API Key": "Por favor, insira a **Chave da API do Pexels**",
    "Please Enter the Pixabay API Key": "Por favor, insira a **Chave da API do Pixabay**",
    "Get Help": "Se precisar de ajuda ou tiver alguma dúvida, você pode entrar no discord para obter ajuda: https://harryai.cc",
    "Video Source": "Fonte do Vídeo",
    "TikTok": "TikTok (Suporte para TikTok em breve)",
    "Bilibili": "Bilibili (Suporte para Bilibili em breve)",
    "Xiaohongshu": "Xiaohongshu (Suporte para Xiaohongshu em breve)",
    "Local file": "Arquivo local",
    "Play Voice": "Reproduzir Voz",
    "Voice Example": "Este é um exemplo de texto para testar a síntese de fala",
    "Synthesizing Voice": "Sintetizando voz, por favor aguarde...",
    "TTS Provider": "Selecione o provedor de síntese de voz",
    "TTS Servers": "Servidores TTS",
    "No voices available for the selected TTS server. Please select another server.": "Não há vozes disponíveis para o servidor TTS selecionado. Por favor, selecione outro servidor.",
    "SiliconFlow API Key": "Chave API do SiliconFlow",
    "SiliconFlow TTS Settings": "Configurações do SiliconFlow TTS",
    "Speed: Range [0.25, 4.0], default is 1.0": "Velocidade: Intervalo [0.25, 4.0], o padrão é 1.0",
    "Volume: Uses Speech Volume setting, default 1.0 maps to gain 0": "Volume: Usa a configuração de Volume de Fala, o padrão 1.0 corresponde ao ganho 0",
    "Hide Log": "Ocultar Log",
    "Hide Basic Settings": "Ocultar Configurações Básicas\n\nOculto, o painel de configurações básicas não será exibido na página.\n\nSe precisar exibi-lo novamente, defina `hide_config = false` em `config.toml`",
    "LLM Settings": "**Configurações do LLM**",
    "Video Source Settings": "**Configurações da Fonte do Vídeo**"
  }
 }
--- a/webui/i18n/vi.json
+++ b/webui/i18n/vi.json
@@ -1,6 +1,14 @@
 {
  "Language": "Tiếng Việt",
  "Translation": {
    "Login Required": "Yêu cầu đăng nhập",
    "Please login to access settings": "Vui lòng đăng nhập để truy cập cài đặt",
    "Username": "Tên đăng nhập",
    "Password": "Mật khẩu",
    "Login": "Đăng nhập",
    "Login Error": "Lỗi đăng nhập",
    "Incorrect username or password": "Tên đăng nhập hoặc mật khẩu không chính xác",
    "Please enter your username and password": "Vui lòng nhập tên đăng nhập và mật khẩu của bạn",
    "Video Script Settings": "**Cài Đặt Kịch Bản Video**",
    "Video Subject": "Chủ Đề Video (Cung cấp một từ khóa, :red[AI sẽ tự động tạo ra] kịch bản video)",
    "Script Language": "Ngôn Ngữ cho Việc Tạo Kịch Bản Video (AI sẽ tự động xuất ra dựa trên ngôn ngữ của chủ đề của bạn)",
@@ -16,6 +24,13 @@
    "Video Concat Mode": "Chế Độ Nối Video",
    "Random": "Nối Ngẫu Nhiên (Được Khuyến Nghị)",
    "Sequential": "Nối Theo Thứ Tự",
    "Video Transition Mode": "Chế Độ Chuyển Đổi Video",
    "None": "Không Có Chuyển Đổi",
    "Shuffle": "Chuyển Đổi Ngẫu Nhiên",
    "FadeIn": "FadeIn",
    "FadeOut": "FadeOut",
    "SlideIn": "SlideIn",
    "SlideOut": "SlideOut",
    "Video Ratio": "Tỷ Lệ Khung Hình Video",
    "Portrait": "Dọc 9:16",
    "Landscape": "Ngang 16:9",
@@ -26,6 +41,7 @@
    "Speech Region": "Vùng(:red[Bắt Buộc，[Lấy Vùng](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
    "Speech Key": "Khóa API(:red[Bắt Buộc，[Lấy Khóa API](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
    "Speech Volume": "Âm Lượng Giọng Đọc (1.0 đại diện cho 100%)",
    "Speech Rate": "Tốc độ đọc (1.0 biểu thị tốc độ gốc)",
    "Male": "Nam",
    "Female": "Nữ",
    "Background Music": "Âm Nhạc Nền",
@@ -41,6 +57,7 @@
    "Top": "Trên",
    "Center": "Giữa",
    "Bottom": "Dưới (Được Khuyến Nghị)",
    "Custom": "Vị trí tùy chỉnh (70, chỉ ra là cách đầu trang 70%)",
    "Font Size": "Cỡ Chữ Phụ Đề",
    "Font Color": "Màu Chữ Phụ Đề",
    "Stroke Color": "Màu Viền Phụ Đề",
@@ -52,10 +69,10 @@
    "Video Generation Completed": "Hoàn Tất Tạo Video",
    "Video Generation Failed": "Tạo Video Thất Bại",
    "You can download the generated video from the following links": "Bạn có thể tải video được tạo ra từ các liên kết sau",
    "Pexels API Key": "Khóa API Pexels ([Lấy Khóa API](https://www.pexels.com/api/))",
    "Pixabay API Key": "Pixabay API Key ([Get API Key](https://pixabay.com/api/docs/#api_search_videos))",
    "Basic Settings": "**Cài Đặt Cơ Bản** (:blue[Nhấp để mở rộng])",
    "Language": "Ngôn Ngữ",
    "Pexels API Key": "Khóa API Pexels ([Lấy Khóa API](https://www.pexels.com/api/))",
    "Pixabay API Key": "Khóa API Pixabay ([Lấy Khóa API](https://pixabay.com/api/docs/#api_search_videos))",
    "LLM Provider": "Nhà Cung Cấp LLM",
    "API Key": "Khóa API (:red[Bắt Buộc])",
    "Base Url": "Url Cơ Bản",
@@ -63,16 +80,26 @@
    "Model Name": "Tên Mô Hình",
    "Please Enter the LLM API Key": "Vui lòng Nhập **Khóa API LLM**",
    "Please Enter the Pexels API Key": "Vui lòng Nhập **Khóa API Pexels**",
-    "Please Enter the Pixabay API Key": "Vui lòng Nhập **Pixabay API Key**",
+    "Please Enter the Pixabay API Key": "Vui lòng Nhập **Khóa API Pixabay**",
    "Get Help": "Nếu bạn cần giúp đỡ hoặc có bất kỳ câu hỏi nào, bạn có thể tham gia discord để được giúp đỡ: https://harryai.cc",
-    "Video Source": "Video Source",
+    "Video Source": "Nguồn Video",
-    "TikTok": "TikTok (TikTok support is coming soon)",
+    "TikTok": "TikTok (Hỗ trợ TikTok sắp ra mắt)",
-    "Bilibili": "Bilibili (Bilibili support is coming soon)",
+    "Bilibili": "Bilibili (Hỗ trợ Bilibili sắp ra mắt)",
-    "Xiaohongshu": "Xiaohongshu (Xiaohongshu support is coming soon)",
+    "Xiaohongshu": "Xiaohongshu (Hỗ trợ Xiaohongshu sắp ra mắt)",
-    "Local file": "Local file",
+    "Local file": "Tệp cục bộ",
-    "Play Voice": "Play Voice",
+    "Play Voice": "Phát Giọng Nói",
-    "Voice Example": "This is an example text for testing speech synthesis",
+    "Voice Example": "Đây là văn bản mẫu để kiểm tra tổng hợp giọng nói",
-    "Synthesizing Voice": "Synthesizing voice, please wait...",
+    "Synthesizing Voice": "Đang tổng hợp giọng nói, vui lòng đợi...",
-    "TTS Provider": "Select the voice synthesis provider"
+    "TTS Provider": "Chọn nhà cung cấp tổng hợp giọng nói",
    "TTS Servers": "Máy chủ TTS",
    "No voices available for the selected TTS server. Please select another server.": "Không có giọng nói nào cho máy chủ TTS đã chọn. Vui lòng chọn máy chủ khác.",
    "SiliconFlow API Key": "Khóa API SiliconFlow",
    "SiliconFlow TTS Settings": "Cài đặt SiliconFlow TTS",
    "Speed: Range [0.25, 4.0], default is 1.0": "Tốc độ: Phạm vi [0.25, 4.0], mặc định là 1.0",
    "Volume: Uses Speech Volume setting, default 1.0 maps to gain 0": "Âm lượng: Sử dụng cài đặt Âm lượng Giọng nói, mặc định 1.0 tương ứng với tăng ích 0",
    "Hide Log": "Ẩn Nhật Ký",
    "Hide Basic Settings": "Ẩn Cài Đặt Cơ Bản\n\nẨn, thanh cài đặt cơ bản sẽ không hiển thị trên trang web.\n\nNếu bạn muốn hiển thị lại, vui lòng đặt `hide_config = false` trong `config.toml`",
    "LLM Settings": "**Cài Đặt LLM**",
    "Video Source Settings": "**Cài Đặt Nguồn Video**"
  }
-}
+}
--- a/webui/i18n/zh.json
+++ b/webui/i18n/zh.json
@@ -1,6 +1,14 @@
 {
  "Language": "简体中文",
  "Translation": {
    "Login Required": "需要登录",
    "Please login to access settings": "请登录后访问配置设置 (:gray[默认用户名: admin, 密码: admin, 您可以在 config.toml 中修改])",
    "Username": "用户名",
    "Password": "密码",
    "Login": "登录",
    "Login Error": "登录错误",
    "Incorrect username or password": "用户名或密码不正确",
    "Please enter your username and password": "请输入用户名和密码",
    "Video Script Settings": "**文案设置**",
    "Video Subject": "视频主题（给定一个关键词，:red[AI自动生成]视频文案）",
    "Script Language": "生成视频脚本的语言（一般情况AI会自动根据你输入的主题语言输出）",
@@ -16,6 +24,13 @@
    "Video Concat Mode": "视频拼接模式",
    "Random": "随机拼接（推荐）",
    "Sequential": "顺序拼接",
    "Video Transition Mode": "视频转场模式",
    "None": "无转场",
    "Shuffle": "随机转场",
    "FadeIn": "渐入",
    "FadeOut": "渐出",
    "SlideIn": "滑动入",
    "SlideOut": "滑动出",
    "Video Ratio": "视频比例",
    "Portrait": "竖屏 9:16（抖音视频）",
    "Landscape": "横屏 16:9（西瓜视频）",
@@ -26,6 +41,7 @@
    "Speech Region": "服务区域 (:red[必填，[点击获取](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
    "Speech Key": "API Key (:red[必填，密钥1 或 密钥2 均可 [点击获取](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
    "Speech Volume": "朗读音量（1.0表示100%）",
    "Speech Rate": "朗读速度（1.0表示1倍速）",
    "Male": "男性",
    "Female": "女性",
    "Background Music": "背景音乐",
@@ -41,6 +57,7 @@
    "Top": "顶部",
    "Center": "中间",
    "Bottom": "底部（推荐）",
    "Custom": "自定义位置（70，表示离顶部70%的位置）",
    "Font Size": "字幕大小",
    "Font Color": "字幕颜色",
    "Stroke Color": "描边颜色",
@@ -54,8 +71,8 @@
    "You can download the generated video from the following links": "你可以从以下链接下载生成的视频",
    "Basic Settings": "**基础设置** (:blue[点击展开])",
    "Language": "界面语言",
-    "Pexels API Key": "Pexels API Key ([点击获取](https://www.pexels.com/api/))",
+    "Pexels API Key": "Pexels API Key ([点击获取](https://www.pexels.com/api/)) :red[推荐使用]",
-    "Pixabay API Key": "Pixabay API Key ([点击获取](https://pixabay.com/api/docs/#api_search_videos))",
+    "Pixabay API Key": "Pixabay API Key ([点击获取](https://pixabay.com/api/docs/#api_search_videos)) :red[可以不用配置，如果 Pexels 无法使用，再选择Pixabay]",
    "LLM Provider": "大模型提供商",
    "API Key": "API Key (:red[必填，需要到大模型提供商的后台申请])",
    "Base Url": "Base Url (可选)",
@@ -73,6 +90,16 @@
    "Play Voice": "试听语音合成",
    "Voice Example": "这是一段测试语音合成的示例文本",
    "Synthesizing Voice": "语音合成中，请稍候...",
-    "TTS Provider": "语音合成提供商"
+    "TTS Provider": "语音合成提供商",
    "TTS Servers": "TTS服务器",
    "No voices available for the selected TTS server. Please select another server.": "当前选择的TTS服务器没有可用的声音，请选择其他服务器。",
    "SiliconFlow API Key": "硅基流动API密钥 [点击获取](https://cloud.siliconflow.cn/account/ak)",
    "SiliconFlow TTS Settings": "硅基流动TTS设置",
    "Speed: Range [0.25, 4.0], default is 1.0": "语速范围 [0.25, 4.0]，默认值为1.0",
    "Volume: Uses Speech Volume setting, default 1.0 maps to gain 0": "音量：使用朗读音量设置，默认值1.0对应增益0",
    "Hide Log": "隐藏日志",
    "Hide Basic Settings": "隐藏基础设置\n\n隐藏后，基础设置面板将不会显示在页面中。\n\n如需要再次显示，请在 `config.toml` 中设置 `hide_config = false`",
    "LLM Settings": "**大模型设置**",
    "Video Source Settings": "**视频源设置**"
  }
 }
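The locale files changed above all share the same layout: a top-level `"Language"` name and a `"Translation"` object keyed by the English source string. A minimal sketch of how key parity across those files could be checked is given below. It is not part of this change set; the function names `load_translation_keys` and `missing_keys` are hypothetical, and treating `en.json` as the reference locale is an assumption.

```python
import json
from pathlib import Path


def load_translation_keys(path: Path) -> set[str]:
    """Return the keys found under "Translation" in one locale file."""
    with path.open(encoding="utf-8") as f:
        data = json.load(f)
    return set(data.get("Translation", {}))


def missing_keys(i18n_dir: Path, reference: str = "en.json") -> dict[str, set[str]]:
    """Map each locale file name to the reference keys it lacks.

    Assumption: the reference locale (en.json by default) holds the full key set.
    Locales with no missing keys are left out of the result.
    """
    reference_keys = load_translation_keys(i18n_dir / reference)
    report: dict[str, set[str]] = {}
    for path in sorted(i18n_dir.glob("*.json")):
        if path.name == reference:
            continue
        gap = reference_keys - load_translation_keys(path)
        if gap:
            report[path.name] = gap
    return report
```

Run from the repository root, `missing_keys(Path("webui/i18n"))` would list, per locale file, the keys present in `en.json` but absent from that locale.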
Author	SHA1	Message	Date
Harry	0bfec956c5	Merge pull request #658 from harry0703/dev bump version to 1.2.6	2025-05-10 14:14:42 +08:00
harry	fec3a8b6bd	Merge branch 'add-siliconflow-tts' into dev	2025-05-10 14:13:37 +08:00
harry	3108c2e4e5	perf: bump version to 1.2.6	2025-05-10 14:13:18 +08:00
Harry	d8dd1f1acf	Merge pull request #657 from harry0703/add-siliconflow-tts feat: update SiliconFlow API Key descriptions in localization files	2025-05-10 14:12:11 +08:00
Harry	208ea5c11b	Merge pull request #653 from yyhhyyyyyy/add-siliconflow-tts feat: Increase SiliconFlow TTS services.	2025-05-10 14:11:26 +08:00
harry	71d791a9af	feat: update SiliconFlow API Key descriptions in localization files	2025-05-10 14:10:42 +08:00
Harry	03a06f141c	Merge pull request #655 from harry0703/dev Dev	2025-05-10 13:27:27 +08:00
harry	4c9ac5e6df	feat: loop video clips to match audio duration	2025-05-10 13:26:24 +08:00
harry	4a64e211f9	fix: correct condition for subclipping	2025-05-10 12:35:45 +08:00
harry	97c631e696	feat: improve file extension parsing using pathlib	2025-05-10 12:34:53 +08:00
harry	a601705bf4	feat: add unit tests	2025-05-10 12:34:37 +08:00
yyhhyyyyyy	45f32756a3	feat: increase siliconflow TTS services	2025-05-09 23:31:04 +08:00
yyhhyyyyyy	22f47d90de	feat: add TTS services provider selection list	2025-05-09 22:14:43 +08:00
Harry	c03dc9c984	Merge pull request #652 from harry0703/dev perf: optimize memory usage and processing performance, bump version to 1.2.5	2025-05-09 20:56:14 +08:00
harry	7569c08a62	perf: bump version to 1.2.5	2025-05-09 20:55:36 +08:00
harry	f07e5802f7	perf: optimize memory usage and processing performance	2025-05-09 20:55:12 +08:00
Harry	ffcfe8e03b	Merge pull request #642 from harry0703/dev feat: remove voice filter	2025-05-08 18:10:16 +08:00
harry	35a7ef657a	feat: remove voice filter	2025-05-08 18:09:26 +08:00
Harry	250ec4f65c	Merge pull request #641 from harry0703/dev update	2025-05-08 17:39:44 +08:00
harry	5d0ffdad8a	feat: update README.md for clarity and remove outdated information	2025-05-08 17:39:16 +08:00
harry	95e4d3170d	feat: rename container names in docker-compose.yml	2025-05-08 17:35:12 +08:00
harry	dfa8328bb0	feat: optimize code	2025-05-08 17:34:51 +08:00
harry	5177c1871a	feat: comment out interline and size parameters in video.py	2025-05-08 17:34:09 +08:00
Harry	1901c2905b	Merge pull request #639 from harry0703/dev feat: remove streamlit_authenticator	2025-05-08 15:53:06 +08:00
harry	b312c52a33	feat: remove streamlit_authenticator	2025-05-08 15:51:33 +08:00
Harry	fb974cefcf	Merge pull request #638 from harry0703/dev bump version to 1.2.4	2025-05-08 15:45:00 +08:00
harry	c7f7fa12b4	feat: optimize code and bump version to 1.2.4	2025-05-08 15:44:07 +08:00
harry	6a19e2bb29	feat: update requirements.txt and config.example.toml	2025-05-08 15:40:46 +08:00
Harry	443f5bf61e	Merge pull request #632 from eren1106/fix-subtitle-bug Fix subtitle generation not working by setting the default subtitle provider to "edge"	2025-05-08 09:10:19 +08:00
Harry	7d00e9c768	Merge pull request #617 from garylab/main Fix subtitle header and footer being cut off in some font families	2025-05-08 09:09:45 +08:00
Harry	c0ab0ba473	Merge pull request #614 from faycal-rakza/fix/comment fix(dockerfile): comment fix	2025-05-08 09:08:55 +08:00
Gary Meng	4b2f9e42d7	Merge branch 'harry0703:main' into main	2025-05-07 11:28:57 +04:00
eren	4ce32a8851	fix: set default subtitle provider to 'edge'	2025-05-01 14:35:23 +08:00
yyhhyyyyyy	47e4cff758	feat: Add PDM support with auth & i18n enhancements (#627) 1. Added PDM project dependency management: created pyproject.toml for dependency definitions; added PDM lock file for reproducible builds; created .pdm-python for virtual environment management. 2. Enhanced authentication & configuration: added user validation in base configuration; implemented streamlit-authenticator for login functionality; updated config.example.toml with user authentication fields. 3. Improved internationalization (i18n): updated translation files for multiple languages (en, de, pt, vi, zh); enhanced i18n support in the web UI; standardized translation structure across language files.	2025-04-27 13:35:45 +08:00
Gary Meng	96e109e199	Fix subtitle header and footer being cut off in some font families	2025-03-26 20:57:13 +04:00
Harry	36dffe8de3	Merge pull request #599 from bz-e/main refactor: Refactor the get_all_azure_voices function	2025-03-23 18:45:26 +08:00
Harry	6d2e4a8081	Merge pull request #603 from garymengcom/main Add get_all_tasks() endpoint and update .gitignore	2025-03-23 18:40:52 +08:00
faycal	a7c45b125f	fix(dockerfile): comment fix	2025-03-09 00:23:55 +01:00
Guozao Meng	6c2b5b8cf4	Update .gitignore	2025-03-08 22:54:10 +04:00
Guozao Meng	91e9f3900d	Add get_all_tasks() endpoint	2025-03-08 22:53:22 +04:00
evan.zhang5	ab1bd03f0b	refactor: Refactor the get_all_azure_voices function to reduce the amount of code by half	2025-02-27 17:31:32 +08:00
Harry	cd0cbc8061	Merge pull request #583 from iorikingdom/main Update requirements.txt	2025-02-10 11:08:23 +08:00
iorikingdom	c6c6390a83	Update requirements.txt	2025-02-09 02:26:43 +09:00
iorikingdom	6bfb9355cf	Update requirements.txt	2025-02-09 02:20:21 +09:00
harry	34d785a246	feat: remove wechat qrcode	2025-02-07 17:07:06 +08:00
harry	c9bd480514	fix: ModuleNotFoundError: No module named 'app'	2025-02-07 17:06:26 +08:00
Harry	5349f29415	Merge pull request #579 from vipinbihari/patch-1 Update video.py - Fixing BackGround Music Volume Multiplier	2025-02-05 14:53:04 +08:00
VIPIN BIHARI	6500cafa4f	Update video.py - Fixing Background Music Volume Multiplier. There was a typo in the MultiplyVolume function parameter. The name of the parameter should be bgm_voice	2025-01-29 21:08:17 +05:30
yyhhyy	e2e92a433e	✨ feat: Add video transition effects (fadein, fadeout, slidein, slideout)	2025-01-23 12:13:04 +08:00
yyhhyyyyyy	dd90cfecbb	✨ feat: Added SlideIn and SlideOut video transition effects and optimized front-end implementation	2025-01-09 19:46:57 +08:00
yyhhyyyyyy	7a5b037ad8	✨ feat: Add video transition effects (fadein, fadeout)	2024-12-24 22:39:48 +08:00
Harry	ee0d2371d5	Merge pull request #554 from yyhhyyyyyy/llm-logic 🐛 fix: fix the LLM logic	2024-12-12 16:54:09 +08:00
yyhhyyyyyy	c4586d37f5	🎨 style: format llm.py code	2024-12-12 14:32:17 +08:00
yyhhyyyyyy	2d8cd23fe7	🐛 fix: fix the LLM logic	2024-12-12 14:29:14 +08:00
Harry	85d446e2d0	Merge pull request #552 from yyhhyyyyyy/code-cleanup 🎨 style: Format Code	2024-12-10 14:45:11 +08:00
yyhhyyyyyy	afd064e15d	🎨 style: Format Code	2024-12-10 10:34:56 +08:00
Harry	809d6cabbb	Merge pull request #548 from harry0703/dev feat: add feature request template	2024-12-06 15:48:01 +08:00
harry	8058eed9ab	feat: add feature request template	2024-12-06 15:47:04 +08:00
Harry	15ee6126a5	Merge pull request #547 from harry0703/dev feat: add issue template	2024-12-06 15:37:45 +08:00
harry	b6a7ea2756	feat: add issue template	2024-12-06 15:37:23 +08:00
Harry	63c3402c94	Update version to 1.2.2	2024-12-06 13:45:43 +08:00
Harry	5a6dd6c7a5	Merge pull request #541 from yyhhyyyyyy/update-requirements ⬆️ deps: Upgrade dependencies to latest versions and address minor issues	2024-12-05 11:02:14 +08:00
yyhhyy	8c226322a0	Merge branch 'main' into update-requirements	2024-12-05 10:59:41 +08:00
Harry	3a7888937f	Merge pull request #536 from Felix3322/main better requirements.txt	2024-12-05 10:47:26 +08:00
yyhhyyyyyy	6760a0ad00	📝 docs: Update documentation	2024-12-05 10:34:09 +08:00
yyhhyyyyyy	6288b70ae2	⬆️ deps: Upgrade dependencies to latest versions and address minor issues	2024-12-05 10:16:38 +08:00
Jiaying Liu	4adc010388	Update requirements.txt	2024-11-27 15:04:46 -05:00
Harry	162b5e17c3	Merge pull request #508 from flingjie/main allow api key empty when using ollama	2024-11-20 15:45:40 +08:00
Harry	0d43ba2124	Merge pull request #505 from LucasHenriqueDiniz/main feat: add PT-BR translation	2024-11-20 15:45:18 +08:00
Harry	080d8d82b4	Merge pull request #504 from Dreyfi/fix-403-error-pexels-request Fix the response 403 from pexels - search_videos_pexels - failed to download videos, maybe the network is not available. if you are in China, please use a VPN.	2024-11-20 15:44:46 +08:00
Harry	fc50e16bc5	Merge pull request #486 from FLY-Open-AI/main [Readme] Docker deployment: optimize the startup command.	2024-11-20 15:44:08 +08:00
Jie.F	345b6d59a1	allow api key empty when using ollama the ollama API key is not required	2024-10-08 09:44:39 +08:00
Dreyfi	4ec19fd56a	Add headers with user_agent to save_video request	2024-09-30 15:48:54 +10:00
Lucas Diniz	136630ec60	feat: add PT-BR translation	2024-09-29 19:30:12 -03:00
Dreyfi	9d3d99a595	Fix the response 403 from pexels search_videos_pexels - failed to download videos, maybe the network is not available. if you are in China, please use a VPN.	2024-09-28 16:25:53 +10:00
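wangyanfei	747c745ec0	[Readme] Docker deployment: optimize the startup command. Recent versions of Docker automatically install docker compose as a plugin during installation, so the startup command is changed to docker compose up	2024-08-31 07:22:05 +08:00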
Harry	a53ca843e8	Merge pull request #467 from harry0703/dev update readme	2024-07-26 18:23:52 +08:00
harry	8b18d84d8a	update readme	2024-07-26 18:23:04 +08:00
Harry	edc4df6eb5	Merge pull request #466 from harry0703/dev fixed: subtitle generation failure	2024-07-26 17:56:32 +08:00
harry	5ed98d317c	fixed: subtitle generation failure	2024-07-26 17:55:26 +08:00
Harry	c22ef5f1d2	Merge pull request #462 from harry0703/dev update readme	2024-07-25 15:00:07 +08:00
harry	bcc9621976	update readme	2024-07-25 14:59:45 +08:00
Harry	6512e3f140	Merge pull request #461 from harry0703/dev Optimize memory usage in moviepy	2024-07-25 13:58:46 +08:00
harry	931e1a0caa	Optimize memory usage in moviepy Upgrade version number to 1.2.0	2024-07-25 13:57:39 +08:00
yyhhyy	84ae8e5248	Merge pull request #460 from yyhhyyyyyy/code-formatting Code Formatting	2024-07-25 13:39:05 +08:00
yyhhyyyyyy	5c2db3aa92	resolve issue with video concatenation order always being random	2024-07-25 13:36:21 +08:00
yyhhyyyyyy	905841965a	Format project code	2024-07-24 14:59:06 +08:00
Harry	bbd4e94941	Merge pull request #459 from yyhhyyyyyy/customize-subtitle-position feat: support custom subtitle positioning	2024-07-24 14:35:50 +08:00
yyhhyyyyyy	b89250874b	Change default value to 70.0	2024-07-24 14:31:56 +08:00
yyhhyyyyyy	e8b20c697d	feat: support custom subtitle positioning	2024-07-24 14:25:20 +08:00
Harry	e64041c93d	Merge pull request #458 from yyhhyyyyyy/refactor-task-add-subtitle-api Refactor task.py and add subtitle API	2024-07-24 11:47:27 +08:00
yyhhyyyyyy	17b4a61e64	1. Refactor task.py to encapsulate separable functions. 2. Add a new subtitle API.	2024-07-23 17:00:23 +08:00
Harry	6d520a4266	Merge pull request #453 from yyhhyyyyyy/fit-oneapi fix(oneapi): Fix the issue where model_name is always empty when using OneAPI as the LLM source.	2024-07-22 10:38:10 +08:00
yyhhyyyyyy	7ff8467f9d	Fix the issue where model_name is always empty when using OneAPI as the LLM source.	2024-07-20 09:36:19 +08:00
Harry	4cf9cefb5c	Merge pull request #450 from yyhhyyyyyy/fit-subtitle-correct fix(subtitle): Fix subtitle correction logic	2024-07-20 08:25:25 +08:00
yyhhyyyyyy	33534db8bb	1. .gitignore ignores the models folder 2. Fix subtitle correction logic	2024-07-19 15:00:17 +08:00
Harry	ec16f1c41b	Merge pull request #449 from harry0703/dev update readme	2024-07-19 14:21:56 +08:00
harry	9653d7d18a	update readme	2024-07-19 14:21:35 +08:00
Harry	36a367d713	Merge pull request #448 from yyhhyyyyyy/add-rate feat(azure_tts_v1): Allows to control the speed of speech generation.	2024-07-19 14:17:15 +08:00
yyhhyyyyyy	77b304537a	Speech Rate	2024-07-19 11:15:36 +08:00
yyhhyyyyyy	63fb848a17	1. Add azure_tts_v1 to control the speed of speech	2024-07-19 11:06:34 +08:00
Harry	6853163905	Merge pull request #447 from harry0703/dev update readme	2024-07-15 14:09:55 +08:00
harry	052c29b579	update readme	2024-07-15 14:09:33 +08:00
Harry	df62529f2a	Merge pull request #443 from harry0703/dev update readme	2024-07-09 13:41:04 +08:00
harry	934eff13ae	update readme	2024-07-09 13:40:43 +08:00
Harry	0472338184	Merge pull request #437 from harry0703/dev support baidu ERNIE llm	2024-07-03 21:13:51 +08:00
harry	66c81a04bf	support baidu ERNIE llm	2024-07-03 21:12:21 +08:00
Harry	8dd66cf624	Merge pull request #435 from harry0703/dev update readme	2024-07-02 10:00:53 +08:00
harry	dca23d99e4	update readme	2024-07-02 09:57:53 +08:00
Harry	42560cc7f5	Merge pull request #421 from harry0703/dev update readme	2024-06-21 11:01:41 +08:00
harry	11478063e7	update readme	2024-06-21 11:01:15 +08:00
Harry	bf0dbcc045	Merge pull request #414 from harry0703/dev update readme	2024-06-15 17:37:36 +08:00
harry	43df593ac3	update readme	2024-06-15 17:36:37 +08:00
Harry	7cf21c6541	Merge pull request #408 from harry0703/dev update readme	2024-06-11 11:50:48 +08:00
harry	f76f905833	update readme	2024-06-11 11:48:04 +08:00
Harry	0f27c26042	Merge pull request #399 from harry0703/dev update readme	2024-06-04 10:36:18 +08:00
harry	e1d7318cee	update readme	2024-06-04 10:34:32 +08:00
Harry	6408c31b7f	Merge pull request #391 from harry0703/dev update readme	2024-05-28 18:41:24 +08:00
harry	b0d694db08	update readme	2024-05-28 14:51:03 +08:00
Harry	730c2a461a	Merge pull request #381 from harry0703/dev update readme	2024-05-23 18:21:05 +08:00
harry	bdb49a4c82	update readme	2024-05-23 18:20:45 +08:00
Harry	a4692060a0	Merge pull request #372 from harry0703/dev enhanced exception handling for generating terms	2024-05-17 17:12:13 +08:00
harry	fc6844dd19	enhanced exception handling for generating terms	2024-05-17 17:11:35 +08:00
Harry	d740a6babd	Merge pull request #370 from harry0703/dev update readme	2024-05-17 08:44:01 +08:00
harry	9c58991830	update readme	2024-05-17 08:43:35 +08:00