feat: v3.0.2 - i18n, audio-only flow, task control, UX polish by doomsday616 · Pull Request #576 · Huanshere/VideoLingo

doomsday616 · 2026-06-11T06:22:40Z

feat: v3.0.2 - i18n, audio-only flow, task control, UX polish

Bumps version to 3.0.2.

Major themes:

i18n

- Auto-detects the browser language (Accept-Language) on first load; a top-right language selector overrides the choice for the current session.
- Removes the duplicate language selector from the sidebar.
- Routes display_language through query params + session_state, with config.yaml as a fallback only.
- Adds normalize_language_code() to map zh / zh-CN / zh-HK / zh-Hant and similar variants to the supported set.
- Translates previously hard-coded UI strings: WhisperX runtime, TTS engine names, Voice / 302ai API / ElevenLabs API labels, the "Star on GitHub" button, and the YouTube resolution "Best".
- Fixes the "here" link text staying in English in the zh-CN / zh-HK welcome string.
- Adds a CSS overlay for the file_uploader internals (Streamlit has no official i18n for these), covering the "Drag and drop file here", "Limit ... per file" and "Browse files" labels.
- Hides the Streamlit developer toolbar (client.toolbarMode = "viewer") and disables the file watcher (server.fileWatcherType = "none") so the English "File change / Rerun / Always rerun" prompts no longer appear.
- Fills missing translation keys across all seven languages: en / zh-CN / zh-HK / es / fr / ja / ru.
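
The language-resolution bullets above could look roughly like the sketch below. Only the name normalize_language_code() and the list of seven languages come from this PR; the function body, the rule that Traditional-script or HK/TW/MO tags map to zh-HK, the "en" default, and the detect_language() helper are assumptions made for illustration.

```python
from typing import Optional

# Languages listed in this PR; the mapping rules below are assumptions.
SUPPORTED = ("en", "zh-CN", "zh-HK", "es", "fr", "ja", "ru")
DEFAULT_LANGUAGE = "en"  # assumed default, not confirmed by the PR


def normalize_language_code(code: str) -> Optional[str]:
    """Map a browser language tag to a supported code, or None if unsupported."""
    tag = code.strip().replace("_", "-").lower()
    if not tag:
        return None
    if tag == "zh" or tag.startswith("zh-"):
        # Assumption: Traditional script or HK/TW/MO regions map to zh-HK,
        # every other Chinese variant maps to zh-CN.
        subtags = tag.split("-")[1:]
        traditional = {"hant", "hk", "tw", "mo"}
        return "zh-HK" if traditional.intersection(subtags) else "zh-CN"
    primary = tag.split("-")[0]
    for lang in SUPPORTED:
        if lang.lower() == primary:
            return lang
    return None


def detect_language(accept_language: str) -> str:
    """Pick the best supported language from an Accept-Language header value."""
    candidates = []
    for position, item in enumerate(accept_language.split(",")):
        piece = item.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by descending quality, then by original header order.
        candidates.append((-quality, position, tag))
    for _, _, tag in sorted(candidates):
        normalized = normalize_language_code(tag)
        if normalized is not None:
            return normalized
    return DEFAULT_LANGUAGE
```

Under these assumptions, a header such as "fr-CA,fr;q=0.9,en;q=0.8" resolves to fr, and an empty or fully unsupported header falls back to the default.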

Audio-only input flow

- Adds output/input_manifest.json, written by the upload / YouTube download path, recording the original media type. find_media_file() now reads the manifest first, so generated artefacts (dub.mp3, normalized_dub.wav) no longer poison detection.
- find_audio_files() now skips generated audio file names.
- find_media_file() distinguishes "no media" from "multiple media" errors instead of silently falling back.
- The sidebar no longer persistently writes burn_subtitles = false when the input is audio; the toggle is only disabled in the UI.
- The main pipeline now flows download -> subtitles -> (optional) dubbing, and shows the dubbing section only after subtitles are done AND the input is not audio.
- Adds prepare_audio_for_asr() so audio-only inputs are normalized to 16 kHz / mono / mp3 without going through video conversion.
- Removes the obsolete convert_audio_to_video() placeholder path.
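
The manifest-first detection described above can be sketched as follows. This is an illustration under assumptions, not the code in this PR: the manifest fields ("file", "media_type"), the extension sets, the exception class names and the write_manifest() helper are all hypothetical, and the real list of generated file names may be longer than the two named in the description.

```python
import json
from pathlib import Path

MANIFEST_NAME = "input_manifest.json"
# Generated artefacts named in the PR; the real skip list may be longer.
GENERATED_AUDIO = {"dub.mp3", "normalized_dub.wav"}
# Assumed extension sets, for illustration only.
VIDEO_EXTS = {".mp4", ".mov", ".mkv", ".webm"}
AUDIO_EXTS = {".mp3", ".wav", ".flac", ".m4a"}


class NoMediaError(FileNotFoundError):
    """No input media was found in the output directory (hypothetical name)."""


class MultipleMediaError(RuntimeError):
    """More than one candidate input was found (hypothetical name)."""


def write_manifest(output_dir: Path, media_path: Path) -> None:
    """Record the original input and its type right after upload or download."""
    kind = "audio" if media_path.suffix.lower() in AUDIO_EXTS else "video"
    data = {"file": media_path.name, "media_type": kind}  # assumed schema
    (output_dir / MANIFEST_NAME).write_text(json.dumps(data), encoding="utf-8")


def find_media_file(output_dir: Path) -> Path:
    """Prefer the manifest; otherwise scan the directory, ignoring generated audio."""
    manifest = output_dir / MANIFEST_NAME
    if manifest.exists():
        data = json.loads(manifest.read_text(encoding="utf-8"))
        recorded = output_dir / data["file"]
        if recorded.exists():
            return recorded
    candidates = sorted(
        p for p in output_dir.iterdir()
        if p.is_file()
        and p.suffix.lower() in VIDEO_EXTS | AUDIO_EXTS
        and p.name not in GENERATED_AUDIO
    )
    if not candidates:
        raise NoMediaError(f"no media file in {output_dir}")
    if len(candidates) > 1:
        names = ", ".join(p.name for p in candidates)
        raise MultipleMediaError(f"multiple media files in {output_dir}: {names}")
    return candidates[0]
```

Raising two distinct exception types lets the UI show a specific message for each case instead of returning to the upload view.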

Task control (pause / resume / stop)

- TaskRunner gains a class-level _current pointer and TaskRunner.check_cancel() so long-running core loops can cancel cooperatively.
- Adds a core.utils.check_cancel() wrapper, imported via the existing `from core.utils import *` pattern.
- Inserts check_cancel() into the hot loops: ASR segment loop, translate parallel loop, TTS warmup / parallel collection / chunk merge, audio segment merge, and the translate_lines entry. On stop, parallel loops also cancel pending futures that have not started yet.
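
A minimal sketch of the cooperative-cancel pattern described above. The names TaskRunner, _current and check_cancel() come from this PR; everything else is an assumption for illustration, including the TaskCancelled exception, the use of threading.Event, the choice to block inside check_cancel() while paused, and the run_parallel() helper.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, ClassVar, Iterable, List, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class TaskCancelled(Exception):
    """Raised inside a loop once stop was requested (hypothetical name)."""


class TaskRunner:
    # Class-level pointer to the active runner, as described in the PR.
    _current: ClassVar[Optional["TaskRunner"]] = None

    def __init__(self) -> None:
        self._stop = threading.Event()
        self._resume = threading.Event()
        self._resume.set()  # not paused initially
        TaskRunner._current = self

    def pause(self) -> None:
        self._resume.clear()

    def resume(self) -> None:
        self._resume.set()

    def stop(self) -> None:
        self._stop.set()
        self._resume.set()  # wake paused loops so they can observe the stop

    @classmethod
    def check_cancel(cls) -> None:
        runner = cls._current
        if runner is None:
            return  # no active runner (e.g. CLI use): never cancel
        runner._resume.wait()  # assumption: block here while paused
        if runner._stop.is_set():
            raise TaskCancelled()


def check_cancel() -> None:
    """Thin wrapper standing in for core.utils.check_cancel()."""
    TaskRunner.check_cancel()


def run_parallel(items: Iterable[T], worker: Callable[[T], R],
                 max_workers: int = 4) -> List[R]:
    """Collect results in completion order, cancelling pending work on stop."""
    results: List[R] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(worker, item) for item in items]
        try:
            for future in as_completed(futures):
                check_cancel()
                results.append(future.result())
        except TaskCancelled:
            for future in futures:
                future.cancel()  # only affects futures that have not started
            raise
    return results
```

Because Future.cancel() cannot interrupt a call that is already running, the workers themselves would also need to call check_cancel() inside their own loops to stop promptly.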

Done markers and completion detection

- Adds output/.subtitle_done and output/.dubbing_done markers, written by TaskRunner as the last step of each stage.
- text_done / audio_done now prefer the marker and fall back to a check that all final outputs are present, so half-failed runs are no longer mistaken for completion.
- Surfaces subtitle length tuning controls (max_split_length, subtitle.max_length) in an expandable section above "Start Processing Subtitles", with suggested ranges and a "Restore defaults" button.
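
The marker-first completion check could be sketched like this. The marker file names are taken from this PR; the mark_done() and stage_done() helpers and the idea of passing the list of final output names as a parameter are assumptions for illustration.

```python
from pathlib import Path
from typing import Iterable

SUBTITLE_MARKER = ".subtitle_done"
DUBBING_MARKER = ".dubbing_done"


def mark_done(output_dir: Path, marker: str) -> None:
    """Write the marker as the very last step of a stage."""
    (output_dir / marker).touch()


def stage_done(output_dir: Path, marker: str, final_outputs: Iterable[str]) -> bool:
    """Prefer the marker; otherwise require every final output to exist."""
    if (output_dir / marker).exists():
        return True
    outputs = list(final_outputs)
    # An empty list must not count as "all present".
    return bool(outputs) and all((output_dir / name).exists() for name in outputs)
```

Since the marker is written only after the stage finishes, a run that fails midway leaves no marker, and the fallback check then requires the complete set of outputs rather than any single file.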

Robustness fixes

- ElevenLabs ASR: elev2whisper() now always emits word-level timestamps; process_transcription() tolerates segments without `words` by synthesizing one from the segment text, avoiding a KeyError.
- download_video_section now surfaces detection errors (e.g. multiple media files in output/) with a clear message and a "Clear output and reselect" button, instead of silently falling back to the upload view.
- Re-uploading the same file is detected via session_state, avoiding an infinite rerun loop.
- give_star_button is rewritten as a plain string template; the previous f-string broke on the literal `{` in the embedded CSS.
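
Two of the fixes above can be illustrated with a short sketch. Both parts are assumptions rather than the PR's code: the Whisper-style segment schema ("text", "start", "end", "words"), the ensure_words() helper, the use of string.Template as the "plain string template", and the CSS and markup in STAR_BUTTON are all hypothetical.

```python
from string import Template
from typing import Any, Dict, List


def ensure_words(segment: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the segment's word list, synthesizing one entry if it is missing."""
    words = segment.get("words")
    if words:
        return words
    # Fallback: treat the whole segment text as one "word" spanning the segment.
    return [{
        "word": segment.get("text", "").strip(),
        "start": segment["start"],
        "end": segment["end"],
    }]


# string.Template only substitutes $-placeholders, so literal CSS braces are
# safe here; inside an f-string they would have to be doubled as {{ and }}.
STAR_BUTTON = Template(
    "<style>.star-btn { padding: 4px 8px; }</style>"
    '<a class="star-btn" href="$url">$label</a>'
)


def render_star_button(url: str, label: str) -> str:
    """Fill in the translated label and link without touching the CSS."""
    return STAR_BUTTON.substitute(url=url, label=label)
```

Using segment.get("words") instead of segment["words"] is what removes the KeyError for segments that arrive without word-level data.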

Tooling and config

- OneKeyStart.bat consolidated: it auto-detects .venv (uv install) or falls back to the legacy Conda env "videolingo"; OneKeyStart_uv.bat is removed.
- Logs now go to logs/videolingo_<timestamp>.log instead of the project root.
- .streamlit/config.toml: client.toolbarMode = "viewer", server.fileWatcherType = "none", server.maxUploadSize preserved.
- .gitignore: ignores logs/, videolingo_*.log, AGENTS.md, pr-body.md.
- setup.py + config.yaml header bumped to 3.0.2.

No new dependencies. No CLI behavior changes.


This was referenced Jun 11, 2026:

feat: v3.0.3 installer refactor and shared env doomsday616/VideoLingo#1 (Closed)
feat: v3.0.3 installer refactor and shared env #577 (Open)

doomsday616 force-pushed the feat/v3.0.2-i18n-and-bugfixes branch from 1e4ce21 to 0cf6e13 on June 12, 2026 08:25

doomsday616 force-pushed the feat/v3.0.2-i18n-and-bugfixes branch from 0cf6e13 to 8eb83cd on June 16, 2026 06:40

doomsday616 wants to merge 1 commit into Huanshere:main from doomsday616:feat/v3.0.2-i18n-and-bugfixes
