feat: v3.0.2 - i18n, audio-only flow, task control, UX polish #576

Open
doomsday616 wants to merge 1 commit into Huanshere:main from doomsday616:feat/v3.0.2-i18n-and-bugfixes

@doomsday616

Contributor

Bumps version to 3.0.2.

Major themes:

i18n

  • Detects the browser language from Accept-Language on first load; a
    top-right language selector overrides the choice for the session.
  • Removes the duplicate language selector from the sidebar.
  • Routes display_language through query params + session_state, with
    config.yaml as a fallback only.
  • Adds normalize_language_code() to map zh / zh-CN / zh-HK / zh-Hant
    and similar variants to the supported set.
  • Translates previously hard-coded UI strings: WhisperX runtime, TTS
    engine names, Voice / 302ai API / ElevenLabs API labels, "Star on
    GitHub" button, YouTube resolution "Best".
  • Fixes the "here" link text remaining in English in the zh-CN / zh-HK
    welcome string.
  • Adds a CSS overlay for the file_uploader internals (Streamlit has no
    official i18n for these) covering the "Drag and drop file here",
    "Limit ... per file" and "Browse files" labels.
  • Hides the Streamlit developer toolbar (client.toolbarMode = "viewer")
    and disables the file watcher (server.fileWatcherType = "none") so
    the English "File change / Rerun / Always rerun" prompts no longer
    appear.
  • Fills missing translation keys across all seven languages: en /
    zh-CN / zh-HK / es / fr / ja / ru.
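
A minimal sketch of how the language normalization described above could work. The PR only names normalize_language_code(); the mapping rules below (Traditional-script and HK / TW / MO tags go to zh-HK, every other zh tag goes to zh-CN) and the pick_from_accept_language() helper are assumptions for illustration, not the PR's actual code.

```python
from typing import Optional

# Supported set as listed in this PR description.
SUPPORTED = ("en", "zh-CN", "zh-HK", "es", "fr", "ja", "ru")


def normalize_language_code(code: Optional[str], default: str = "en") -> str:
    """Map a raw language tag to one of the supported codes (assumed rules)."""
    if not code:
        return default
    tag = code.strip().replace("_", "-").lower()
    if not tag:
        return default
    # Case-insensitive exact match first.
    for supported in SUPPORTED:
        if tag == supported.lower():
            return supported
    parts = tag.split("-")
    if parts[0] == "zh":
        # Assumption: Traditional script or HK/TW/MO regions map to zh-HK.
        if any(p in ("hant", "hk", "tw", "mo") for p in parts[1:]):
            return "zh-HK"
        return "zh-CN"
    # Fall back to the primary subtag, e.g. "fr-CA" -> "fr".
    for supported in SUPPORTED:
        if parts[0] == supported.lower():
            return supported
    return default


def pick_from_accept_language(header: str, default: str = "en") -> str:
    """Hypothetical helper: choose the best supported language from a header."""
    candidates = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag or tag == "*":
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by descending quality, then by original position.
        candidates.append((-quality, position, tag))
    for _, _, tag in sorted(candidates):
        normalized = normalize_language_code(tag, default="")
        if normalized:
            return normalized
    return default
```

A session-level override from the language selector would then simply take precedence over the value returned by pick_from_accept_language().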

Audio-only input flow

  • Adds output/input_manifest.json, written by the upload / YouTube
    download path, recording the original media type. find_media_file()
    now reads the manifest first, so generated artefacts (dub.mp3,
    normalized_dub.wav) no longer poison detection.
  • find_audio_files() now skips generated audio names.
  • find_media_file() distinguishes "no media" from "multiple media"
    errors instead of silently falling back.
  • Sidebar no longer persistently writes burn_subtitles = false when
    the input is audio; the toggle is only disabled in the UI.
  • Main pipeline now flows download -> subtitles -> (optional) dubbing,
    showing the dubbing section only after subtitles are done AND the
    input is not audio.
  • Adds prepare_audio_for_asr() so audio-only inputs are normalized
    directly to 16k / mono / mp3 without going through video conversion.
  • Removes the obsolete convert_audio_to_video() placeholder path.
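
The manifest-first detection described above might look roughly like the sketch below. The manifest field names ("file", "media_type"), the extension sets, the write_manifest() helper and both exception classes are hypothetical; only the file names input_manifest.json, dub.mp3 and normalized_dub.wav come from the PR text.

```python
import json
from pathlib import Path

MANIFEST_NAME = "input_manifest.json"
GENERATED_AUDIO = {"dub.mp3", "normalized_dub.wav"}  # names from the PR text
VIDEO_EXTS = {".mp4", ".mov", ".mkv", ".webm"}       # assumed extension set
AUDIO_EXTS = {".mp3", ".wav", ".flac", ".m4a"}       # assumed extension set
MEDIA_EXTS = VIDEO_EXTS | AUDIO_EXTS


class NoMediaError(FileNotFoundError):
    """No candidate input media was found (hypothetical error type)."""


class MultipleMediaError(RuntimeError):
    """More than one candidate input media was found (hypothetical error type)."""


def write_manifest(output_dir: Path, media_path: Path) -> None:
    """Record the original input so later detection does not have to guess."""
    kind = "audio" if media_path.suffix.lower() in AUDIO_EXTS else "video"
    manifest = {"file": media_path.name, "media_type": kind}
    (output_dir / MANIFEST_NAME).write_text(json.dumps(manifest), encoding="utf-8")


def find_media_file(output_dir: Path) -> Path:
    """Prefer the manifest; otherwise scan, skipping generated audio."""
    manifest_path = output_dir / MANIFEST_NAME
    if manifest_path.exists():
        data = json.loads(manifest_path.read_text(encoding="utf-8"))
        recorded = output_dir / data["file"]
        if recorded.exists():
            return recorded
    candidates = sorted(
        p for p in output_dir.iterdir()
        if p.is_file()
        and p.suffix.lower() in MEDIA_EXTS
        and p.name not in GENERATED_AUDIO
    )
    if not candidates:
        raise NoMediaError(f"no input media in {output_dir}")
    if len(candidates) > 1:
        names = ", ".join(p.name for p in candidates)
        raise MultipleMediaError(f"multiple input media in {output_dir}: {names}")
    return candidates[0]
```

Raising two distinct error types lets the UI layer show a specific message for each case rather than quietly returning to the upload view.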

Task control (pause / resume / stop)

  • TaskRunner gains a class-level _current pointer and TaskRunner.check_cancel()
    so long-running core loops can cancel cooperatively.
  • Adds a core.utils.check_cancel() wrapper, imported via the existing
    from core.utils import * pattern.
  • Inserts check_cancel() into the hot loops: ASR segment loop,
    translate parallel loop, TTS warmup / parallel collection / chunk
    merge, audio segment merge, translate_lines entry. On stop, parallel
    loops also cancel pending futures that have not started yet.
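
The cooperative cancellation pattern above can be sketched as follows. Only the names TaskRunner, _current and check_cancel() come from the PR; the TaskCancelled exception, the threading.Event fields, the pause / resume / stop methods and run_parallel() are assumptions made to keep the example self-contained.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


class TaskCancelled(Exception):
    """Raised inside a loop once a stop was requested (hypothetical name)."""


class TaskRunner:
    _current = None  # class-level pointer to the active runner

    def __init__(self):
        self._stop = threading.Event()
        self._resume = threading.Event()
        self._resume.set()  # not paused initially
        TaskRunner._current = self

    def pause(self):
        self._resume.clear()

    def resume(self):
        self._resume.set()

    def stop(self):
        self._stop.set()
        self._resume.set()  # wake a paused loop so it can observe the stop

    @classmethod
    def check_cancel(cls):
        runner = cls._current
        if runner is None:
            return  # no active runner (e.g. CLI use): behave as a no-op
        runner._resume.wait()  # blocks while paused
        if runner._stop.is_set():
            raise TaskCancelled()


def check_cancel():
    """Thin module-level wrapper, mirroring the core.utils helper in the PR."""
    TaskRunner.check_cancel()


def run_parallel(items, worker, max_workers=4):
    """Collect results, cancelling not-yet-started futures on stop."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(worker, item) for item in items]
        try:
            for future in as_completed(futures):
                check_cancel()
                results.append(future.result())
        except TaskCancelled:
            for future in futures:
                future.cancel()  # only affects futures that have not started
            raise
    return results
```

Because cancellation is cooperative, a worker that is already running finishes its current item; Future.cancel() only prevents queued work from starting.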

Done markers and completion detection

  • Adds output/.subtitle_done and output/.dubbing_done markers, written
    by the runner as the last step of each stage.
  • text_done / audio_done now prefer the marker, falling back to a check
    that all final outputs are present, so half-failed runs are no longer
    mistaken for completion.
  • Subtitle length tuning controls (max_split_length, subtitle.max_length)
    are surfaced in an expandable section above "Start Processing
    Subtitles", with suggested ranges and a "Restore defaults" button.
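
A minimal sketch of the marker-first completion check described above. The marker name .subtitle_done comes from the PR; the list of expected final outputs and the helper names mark_done() and stage_done() are placeholders, since the PR text does not enumerate them.

```python
from pathlib import Path
from typing import Sequence

SUBTITLE_MARKER = ".subtitle_done"          # marker name from the PR text
SUBTITLE_OUTPUTS = ("src.srt", "trans.srt")  # hypothetical final outputs


def mark_done(output_dir: Path, marker: str) -> None:
    """Write the marker as the very last step of a stage."""
    (output_dir / marker).touch()


def stage_done(output_dir: Path, marker: str, expected: Sequence[str]) -> bool:
    """Prefer the marker; otherwise require every final output to exist."""
    if (output_dir / marker).exists():
        return True
    # An empty expectation list must not count as "done".
    return bool(expected) and all((output_dir / name).exists() for name in expected)


def text_done(output_dir: Path) -> bool:
    return stage_done(output_dir, SUBTITLE_MARKER, SUBTITLE_OUTPUTS)
```

With this shape, a run that produced only some of its outputs before failing reports as not done, because neither the marker nor the full output set is present.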

Robustness fixes

  • ElevenLabs ASR: elev2whisper() now always emits word-level
    timestamps; process_transcription() tolerates segments without
    words by synthesizing one from the segment text, avoiding a KeyError.
  • download_video_section now surfaces detection errors (e.g. multiple
    media files in output/) with a clear message and a "Clear output and
    reselect" button, instead of silently falling back to the upload
    view.
  • Re-upload of the same file is detected via session_state, avoiding
    an infinite rerun loop.
  • give_star_button rewritten as a plain string template; the previous
    f-string broke on the literal { in the embedded CSS.
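
The fallback for segments without words could be sketched like this. The segment layout (text / start / end, with words holding word / start / end entries) follows the usual Whisper-style result format and is assumed here; ensure_words() is a hypothetical helper name, not a function from the PR.

```python
def ensure_words(segment: dict) -> dict:
    """Return a segment that always carries a non-empty "words" list."""
    if segment.get("words"):
        return segment
    # Synthesize a single pseudo-word spanning the whole segment.
    synthesized = {
        "word": segment.get("text", "").strip(),
        "start": segment["start"],
        "end": segment["end"],
    }
    # Copy rather than mutate, so the caller's input stays untouched.
    return {**segment, "words": [synthesized]}
```

Downstream code can then index segment["words"] unconditionally, at the cost of coarse timing for the synthesized entry.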

Tooling and config

  • OneKeyStart.bat consolidated: auto-detects .venv (uv install) or
    falls back to the legacy Conda env "videolingo"; OneKeyStart_uv.bat
    removed.
  • Logs now go to logs/videolingo_<timestamp>.log instead of being
    scattered in the project root.
  • .streamlit/config.toml: client.toolbarMode = "viewer",
    server.fileWatcherType = "none", server.maxUploadSize preserved.
  • .gitignore: ignores logs/, videolingo_*.log, AGENTS.md, pr-body.md.
  • setup.py + config.yaml header bumped to 3.0.2.

No new dependencies. No CLI behavior changes.

doomsday616 force-pushed the feat/v3.0.2-i18n-and-bugfixes branch from 0cf6e13 to 8eb83cd on June 16, 2026 at 06:40