Skip to content
This repository was archived by the owner on Jun 17, 2026. It is now read-only.

Codex/offline caption packaging readme#685

Open
sleel-sun wants to merge 4 commits into
siddharthvaddem:mainfrom
sleel-sun:codex/offline-caption-packaging-readme
Open

Codex/offline caption packaging readme#685
sleel-sun wants to merge 4 commits into
siddharthvaddem:mainfrom
sleel-sun:codex/offline-caption-packaging-readme

Conversation

@sleel-sun

@sleel-sun sleel-sun commented Jun 6, 2026

Copy link
Copy Markdown

Pull Request Template

Description

Motivation

Type of Change

  • New Feature
  • Bug Fix
  • Refactor / Code Cleanup
  • Documentation Update
  • Other (please specify)

Related Issue(s)

Screenshots / Video

Screenshot (if applicable):

![Screenshot Description](path/to/screenshot.png)

Video (if applicable):

<video src="path/to/video.mp4" controls width="600"></video>

Testing

Checklist

  • I have performed a self-review of my code.
  • I have added any necessary screenshots or videos.
  • I have linked related issue(s) and updated the changelog if applicable.

Thank you for contributing!

Summary by CodeRabbit

Release Notes

  • New Features

    • Added offline auto-caption generation for macOS using bundled Whisper model
    • Added drawing tools for recording annotations (pen, arrow, shapes, text, highlight)
    • Added annotation sticker presets for quick markup
    • Auto-caption generation in video editor with localized language support
  • Improvements

    • Enhanced native recording lifecycle management for macOS and Windows
    • Improved macOS accessibility detection for cursor recording
  • Bug Fixes

    • Fixed macOS cursor accessibility permission handling
  • Build & Configuration

    • Updated macOS build configuration for arm64 architecture
    • Added multilingual support for captions and annotation features

@coderabbitai

coderabbitai Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

Need the big picture first? Review this PR in Change Stack to see what changed before going file by file.

Review Change Stack

📝 Walkthrough

Walkthrough

PR adds offline auto-captions on macOS via bundled Whisper tooling, an on-screen annotation drawing overlay during recording with multiple tools and text input, preset annotation stickers, improved native recording state tracking and cleanup, and comprehensive i18n support for captions and sticker presets across all locales.

Changes

Captions, annotations, and native recording improvements

Layer / File(s) Summary
Whisper caption generation backend
electron/captions/whisper.ts, electron/captions/whisper.test.ts, electron/captions/jobs.ts, electron/captions/jobs.test.ts
CaptionJobRegistry manages concurrent caption jobs with per-jobId AbortController. generateCaptionsWithWhisper resolves platform-specific tool paths from env/defaults, extracts mono 16kHz WAV via FFmpeg, runs Whisper with optional language, retries on macOS GPU init failures using --no-gpu, and returns structured results.
Caption parsing and utilities
src/lib/captions.ts, src/lib/captions.test.ts
Whisper JSON parsing supporting two output formats (transcription vs segments arrays), segment normalization with min duration enforcement, locale-to-language-code mapping, and comprehensive test coverage.
Auto-caption hook and editor integration
src/components/video-editor/useAutoCaptionGeneration.ts, src/components/video-editor/useAutoCaptionGeneration.test.ts, src/components/video-editor/captionAnnotations.ts, src/components/video-editor/VideoEditor.tsx
useAutoCaptionGeneration hook deduplicates by source path, tracks caption status, cancels on unmount. createCaptionAnnotations converts segments to bottom-centered white-text annotations with semi-transparent black background. VideoEditor integrates auto-generation with UI status badge.
Recording annotation types and utilities
src/components/launch/recordingAnnotations.ts, src/components/launch/recordingAnnotations.test.ts
Tool types (pen, arrow, rectangle, ellipse, highlight, text), point/element shapes, hit detection via polyline proximity and geometric bounds for rectangles/ellipses/text, removal helpers.
Recording annotation overlay component
src/components/launch/RecordingAnnotationOverlay.tsx, src/components/launch/RecordingAnnotationOverlay.test.tsx
Full-screen SVG overlay with pointer-driven drawing, tool-specific geometry (polylines, arrows with markers, rectangles, ellipses, text with HTML input). Right-click erases, left-button commits and disarms tool. Listens to Electron events for tool/clear/undo.
Launch window annotation toolbar
src/components/launch/LaunchWindow.tsx
Recording-only toolbar with six annotation tool buttons, undo/clear buttons, active tool styling, event syncing with overlay, toolbar cleanup on recording stop.
Annotation sticker presets
src/components/video-editor/annotationStickers.ts, src/components/video-editor/AnnotationSettingsPanel.tsx, src/components/video-editor/AnnotationSettingsPanel.test.tsx
10+ SVG sticker definitions (check, cross, warning, question, numbers, arrows, etc.) encoded as data URLs; UI grid in annotation settings panel.
Overlay window factory and management
electron/windows.ts, electron/main.ts
createRecordingAnnotationOverlayWindow creates frameless transparent always-on-top window with mouse forwarding. Main process reuses/manages overlay reference, excludes from visible-window count (like countdown overlay). HUD width now computed from display work area (600–880 clamp).
Native recording state tracking and output parsing
electron/ipc/handlers.ts
Windows/macOS capture state flags (started/closed/stopRequested/closeCode), output drain/parsing helpers to extract stopped path from helper output, stop IPC branching on already-closed vs. pending state.
Overlay IPC handlers and caption start/cancel
electron/ipc/handlers.ts, electron/preload.ts, electron/electron-env.d.ts
New IPC handlers: show/hide overlay (with bounds), set tool (validated against allowlist), clear/undo. Caption handlers: start (creates jobId, wires to registry), cancel (by jobId). Preload/env updated with new API surface.
Screen recorder hook overlay and native improvements
src/hooks/useScreenRecorder.ts, src/hooks/useScreenRecorder.test.tsx
Safe overlay helpers with error suppression. macOS microphone-enabled forces browser fallback. Recoverable ScreenCaptureKit permission errors trigger fallback instead of throw. Screen track ended event listener stops recording when allowed. Overlay hidden on finalization/cleanup/error.
macOS cursor accessibility optimization
electron/native-bridge/cursor/recording/macNativeCursorRecordingSession.ts, *.test.ts
Early return when isTrustedAccessibilityClient(true) succeeds, skipping helper probing when already granted.
App window type routing and overlay styling
src/App.tsx, src/main.tsx
App routes windowType=recording-annotation-overlay to new overlay component. Window transparency styling now covers hud-overlay, source-selector, countdown-overlay, and recording-annotation-overlay.
Build config and packaging
electron-builder.json5, .gitignore
macOS DMG targets arm64 only (was x64+arm64). Native asset filters narrowed to darwin-arm64/**/*. Added ignores for .worktrees/ and /electron/native/captions/*/.
i18n and documentation
src/i18n/locales/*/editor.json, src/i18n/locales/*/settings.json, README.md
Caption generation status messages (generating/generated/skipped/unavailable/failed) added to 10 locales. Sticker preset labels added to annotation settings across all locales. README documents offline macOS captions via bundled whisper.cpp.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • siddharthvaddem/openscreen#436: Adds zh-TW locale support; this PR extends those new locale files with caption and sticker strings.
  • siddharthvaddem/openscreen#460: Countdown overlay visibility logic; this PR updates the same window-type condition paths to treat recording-annotation overlay alongside countdown.

Suggested reviewers

  • siddharthvaddem

Poem

📝✨ captions drift in silent whispers, arrows fly across the screen—

stickers bloom and annotations dance where recording's been.

arm64 builds lean and mean, while macOS hums with grace,

ten languages now speak the words of every screenshot's space. 🎬

✨ Finishing Touches
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch codex/offline-caption-packaging-readme

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d60e42f9ec

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread electron-builder.json5 Outdated
{
"from": "electron/native/bin",
"to": "electron/native/bin",
"filter": ["darwin-arm64/**/*"]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restore x64 macOS helper packaging

In the macOS packaging path I checked, scripts/build_macos.sh still iterates ARCHS=("arm64" "x64") and invokes electron-builder --mac --${ARCH}, while the runtime helper lookup uses darwin-x64 on Intel Macs (getNativeMacCaptureHelperCandidates). With this filter restricted to darwin-arm64, any x64 package produced by that script or by electron-builder --mac --x64 will omit the ScreenCaptureKit/cursor helper binaries, so native macOS recording becomes unavailable on Intel Macs even though the build flow still produces that architecture.

Useful? React with 👍 / 👎.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
electron/ipc/handlers.ts (1)

2235-2251: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Failed native stops can still be marked successful with a guessed path

lowkey risky: both stop handlers can return success: true with preferredPath even when the helper already exited unsuccessfully and never emitted a valid stopped path. That can persist a broken session manifest pointing at a non-existent/incomplete file.

nit: cleaner + safer guard in both stop handlers
- const screenVideoPath = stoppedPath || preferredPath;
+ const screenVideoPath =
+   stoppedPath ??
+   ((nativeWindowsCaptureCloseCode === 0 /* use nativeMacCaptureCloseCode in mac path */)
+     ? preferredPath
+     : null);
  if (!screenVideoPath) {
    throw new Error("Native Windows capture did not return an output path.");
  }
+ await fs.access(screenVideoPath, fsConstants.R_OK);

Apply the same pattern in the macOS stop handler.

Also applies to: 2342-2358

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@electron/ipc/handlers.ts` around lines 2235 - 2251, The stop logic can return
success with preferredPath even when the helper failed and never produced a real
stopped path; update the Windows stop branch (variables/functions:
nativeWindowsCaptureClosed, getNativeWindowsStoppedPathFromOutput(),
nativeWindowsCaptureCloseCode, waitForNativeWindowsCaptureStop(proc), proc,
preferredPath, screenVideoPath) so that you treat the stop as successful only if
a real stoppedPath was returned (or verify the file exists) and otherwise
surface an error/mark failure instead of falling back to preferredPath; apply
the identical safe-guarding pattern to the macOS stop handler as well so both
handlers only return success when they have a valid, present output path.
🧹 Nitpick comments (3)
src/i18n/locales/zh-TW/settings.json (1)

143-143: ⚡ Quick win

translation inconsistency with existing preset terminology

kinda noticed this file already translates "preset" as "預設" in lines 42-43 ("preset": "預設", "selectPreset": "選擇預設"), but here you're using "內建貼紙" (built-in stickers) instead of "預設貼紙" (preset stickers).

for internal consistency, should probably align with the existing translation convention throughout the file.

suggested alignment with existing terminology
-		"stickerPresets": "內建貼紙",
+		"stickerPresets": "預設貼紙",
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/i18n/locales/zh-TW/settings.json` at line 143, The translation for the
key "stickerPresets" uses "內建貼紙" but the file already uses "預設" for "preset" and
"selectPreset", so update the "stickerPresets" value to "預設貼紙" to keep
terminology consistent with the existing keys ("preset", "selectPreset", and
"stickerPresets").
src/components/launch/RecordingAnnotationOverlay.test.tsx (1)

13-14: ⚡ Quick win

Restore patched Element.prototype methods after each test

Directly overriding these prototype methods without restoring them can leak mocks to other suites — lowkey risky for flaky, order-dependent tests.

nit: cleaner test isolation patch
 describe("RecordingAnnotationOverlay", () => {
 	let emitToolChange: (tool: RecordingAnnotationTool | null) => void;
 	let setRecordingAnnotationTool: ReturnType<typeof vi.fn>;
+	let originalSetPointerCapture: typeof Element.prototype.setPointerCapture;
+	let originalReleasePointerCapture: typeof Element.prototype.releasePointerCapture;
 
 	beforeEach(() => {
+		originalSetPointerCapture = Element.prototype.setPointerCapture;
+		originalReleasePointerCapture = Element.prototype.releasePointerCapture;
 		Element.prototype.setPointerCapture = vi.fn();
 		Element.prototype.releasePointerCapture = vi.fn();
 		setRecordingAnnotationTool = vi.fn(async () => ({ success: true }));
@@
 	afterEach(() => {
 		vi.restoreAllMocks();
+		Element.prototype.setPointerCapture = originalSetPointerCapture;
+		Element.prototype.releasePointerCapture = originalReleasePointerCapture;
 		window.electronAPI = undefined as unknown as ElectronAPI;
 	});

Also applies to: 27-30

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/components/launch/RecordingAnnotationOverlay.test.tsx` around lines 13 -
14, The test currently replaces Element.prototype.setPointerCapture and
Element.prototype.releasePointerCapture with vi.fn() without restoring them;
save the original methods (e.g., const origSetPointerCapture =
Element.prototype.setPointerCapture and const origReleasePointerCapture =
Element.prototype.releasePointerCapture) before mocking and restore them in an
afterEach/afterAll cleanup (call Element.prototype.setPointerCapture =
origSetPointerCapture and Element.prototype.releasePointerCapture =
origReleasePointerCapture) so the patched prototype methods do not leak into
other test suites; apply the same pattern for the other mocks at lines 27-30.
src/hooks/useScreenRecorder.test.tsx (1)

140-141: ⚡ Quick win

Mock return shape should match the typed Electron overlay contract.

these mocks currently return undefined, which can miss behavior tied to { success: false, error }. worth aligning with the real contract so failures are testable.

small test-fidelity tweak
- showRecordingAnnotationOverlay = vi.fn(async () => undefined);
- hideRecordingAnnotationOverlay = vi.fn(async () => undefined);
+ showRecordingAnnotationOverlay = vi.fn(async () => ({ success: true }));
+ hideRecordingAnnotationOverlay = vi.fn(async () => ({ success: true }));

Also applies to: 164-165

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/hooks/useScreenRecorder.test.tsx` around lines 140 - 141, The mocks
showRecordingAnnotationOverlay and hideRecordingAnnotationOverlay return
undefined but should mirror the Electron overlay contract (objects like {
success: true } or { success: false, error }), so update both mock
implementations (including the other occurrences mentioned) to return the proper
shaped objects asynchronously; ensure success paths return { success: true } and
add separate test cases or mock variants that return { success: false, error:
new Error(...) } so failure behavior is testable and aligned with the real
contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@electron-builder.json5`:
- Around line 41-60: The build config currently ships only darwin-arm64
caption/native assets and targets arm64, causing electron/captions/whisper.ts to
fail for darwin-x64 runtime; update electron-builder.json5 so extraResources
filters include the x64 variants (e.g. add "darwin-x64/**/*" or a wildcard like
"darwin-*/**/*" for both electron/native/bin and electron/native/captions)
and/or add x64 to the mac target arches so the bundle contains darwin-x64
assets; confirm scripts/build_macos.sh outputs match these arch names and that
electron/captions/whisper.ts will find `${request.platform}-${request.arch}`
(e.g. darwin-x64) at runtime.

In `@electron/captions/whisper.ts`:
- Around line 173-176: safeJobId collision risk: update the temp dir creation
that uses request.jobId/safeJobId/jobDir (and the same pattern around lines
276-279) to append a cryptographically unique suffix (e.g., crypto.randomUUID()
or a uuid v4) or include a timestamp+pid to guarantee uniqueness; construct
jobDir from tempRoot + `openscreen-captions-${safeJobId}-${uniqueSuffix}` and
update all derived paths (wavPath, outputBase) and corresponding cleanup logic
to use this new jobDir so concurrent jobs cannot collide or delete each other’s
files.

In `@electron/ipc/handlers.ts`:
- Around line 2857-2879: Compute the jobId once and reuse it in both the try and
catch paths instead of recomputing with Date.now(); move or declare const jobId
= options?.jobId ?? `caption-${Date.now()}` outside/before the try block (or
save it to a stable variable at the top of the handler) so
captionJobRegistry.start({ jobId, ... }) uses the same id and the catch return
uses that same jobId (and keep the existing error string/message and other
fields like segments), ensuring resolveApprovedVideoPath and
captionJobRegistry.start still behave the same.

In `@src/components/video-editor/AnnotationSettingsPanel.tsx`:
- Around line 490-492: The sticker accessibility text is currently using
hardcoded English via sticker.label for aria-label and title in
AnnotationSettingsPanel; change this to use the app's i18n translation function
(e.g., t or translate) or map sticker IDs to localized strings before rendering
(e.g., t(`stickers.${sticker.id}`) or a lookup like getStickerLabel(sticker.id))
and pass that translated value to aria-label and title while keeping the onClick
to onContentChange(sticker.dataUrl) unchanged so tooltips and screen-readers
follow the active locale.

In `@src/components/video-editor/useAutoCaptionGeneration.ts`:
- Around line 42-48: The jobId generation using `jobId =
\`caption-${Date.now()}\`` is collision-prone; change it to a
collision-resistant ID (e.g., use `crypto.randomUUID()` or a UUID/v4 generator)
wherever `jobId` is created (both in the block around
`startCaptionGeneration(sourcePath, { jobId, ... })` and the similar creation at
lines 82-88) so that `startCaptionGeneration` and any cancellation IPC target
the correct in-flight caption job; ensure the same new ID format is used when
sending cancel requests so `cancel` logic matches the unique job identifier.

In `@src/components/video-editor/VideoEditor.tsx`:
- Line 287: The caption-generation is being retriggered when locale changes
because captionLanguage (useMemo) updates while autoCaptionSourcePath remains
set; add a one-shot guard so auto-captions run once per loaded source: introduce
a ref/state like autoCaptionGeneratedForSourceRef and update it when you
start/finish generation, then in the effect that kicks off auto-captions check
that autoCaptionSourcePath is truthy and
autoCaptionGeneratedForSourceRef.current !== autoCaptionSourcePath before
running; reset or clear the ref when the media source truly changes (so future
new sources can generate) and keep existing symbols captionLanguage and
autoCaptionSourcePath to locate where to add the guard.

In `@src/hooks/useScreenRecorder.ts`:
- Around line 147-161: safeShowRecordingAnnotationOverlay and
safeHideRecordingAnnotationOverlay currently only catch thrown exceptions and
ignore IPC results; update both to await the IPC call result, check if the
returned payload has { success: false } and log or report the included
error/message (e.g., processLogger.warn or console.warn) when success is false,
keeping the try/catch for real exceptions; ensure you reference the returned
value (e.g., const res = await
window.electronAPI?.showRecordingAnnotationOverlay?.()) and handle res.success
=== false before returning.

---

Outside diff comments:
In `@electron/ipc/handlers.ts`:
- Around line 2235-2251: The stop logic can return success with preferredPath
even when the helper failed and never produced a real stopped path; update the
Windows stop branch (variables/functions: nativeWindowsCaptureClosed,
getNativeWindowsStoppedPathFromOutput(), nativeWindowsCaptureCloseCode,
waitForNativeWindowsCaptureStop(proc), proc, preferredPath, screenVideoPath) so
that you treat the stop as successful only if a real stoppedPath was returned
(or verify the file exists) and otherwise surface an error/mark failure instead
of falling back to preferredPath; apply the identical safe-guarding pattern to
the macOS stop handler as well so both handlers only return success when they
have a valid, present output path.

---

Nitpick comments:
In `@src/components/launch/RecordingAnnotationOverlay.test.tsx`:
- Around line 13-14: The test currently replaces
Element.prototype.setPointerCapture and Element.prototype.releasePointerCapture
with vi.fn() without restoring them; save the original methods (e.g., const
origSetPointerCapture = Element.prototype.setPointerCapture and const
origReleasePointerCapture = Element.prototype.releasePointerCapture) before
mocking and restore them in an afterEach/afterAll cleanup (call
Element.prototype.setPointerCapture = origSetPointerCapture and
Element.prototype.releasePointerCapture = origReleasePointerCapture) so the
patched prototype methods do not leak into other test suites; apply the same
pattern for the other mocks at lines 27-30.

In `@src/hooks/useScreenRecorder.test.tsx`:
- Around line 140-141: The mocks showRecordingAnnotationOverlay and
hideRecordingAnnotationOverlay return undefined but should mirror the Electron
overlay contract (objects like { success: true } or { success: false, error }),
so update both mock implementations (including the other occurrences mentioned)
to return the proper shaped objects asynchronously; ensure success paths return
{ success: true } and add separate test cases or mock variants that return {
success: false, error: new Error(...) } so failure behavior is testable and
aligned with the real contract.

In `@src/i18n/locales/zh-TW/settings.json`:
- Line 143: The translation for the key "stickerPresets" uses "內建貼紙" but the
file already uses "預設" for "preset" and "selectPreset", so update the
"stickerPresets" value to "預設貼紙" to keep terminology consistent with the
existing keys ("preset", "selectPreset", and "stickerPresets").
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8bda5e53-d617-4196-abd5-f8643f5f56d2

📥 Commits

Reviewing files that changed from the base of the PR and between c6331ba and d60e42f.

📒 Files selected for processing (57)
  • .gitignore
  • README.md
  • electron-builder.json5
  • electron/captions/jobs.test.ts
  • electron/captions/jobs.ts
  • electron/captions/whisper.test.ts
  • electron/captions/whisper.ts
  • electron/electron-env.d.ts
  • electron/ipc/handlers.ts
  • electron/main.ts
  • electron/native-bridge/cursor/recording/macNativeCursorRecordingSession.test.ts
  • electron/native-bridge/cursor/recording/macNativeCursorRecordingSession.ts
  • electron/preload.ts
  • electron/windows.ts
  • src/App.tsx
  • src/components/launch/LaunchWindow.tsx
  • src/components/launch/RecordingAnnotationOverlay.test.tsx
  • src/components/launch/RecordingAnnotationOverlay.tsx
  • src/components/launch/recordingAnnotations.test.ts
  • src/components/launch/recordingAnnotations.ts
  • src/components/video-editor/AnnotationSettingsPanel.test.tsx
  • src/components/video-editor/AnnotationSettingsPanel.tsx
  • src/components/video-editor/VideoEditor.tsx
  • src/components/video-editor/annotationStickers.ts
  • src/components/video-editor/captionAnnotations.test.ts
  • src/components/video-editor/captionAnnotations.ts
  • src/components/video-editor/useAutoCaptionGeneration.test.ts
  • src/components/video-editor/useAutoCaptionGeneration.ts
  • src/hooks/useScreenRecorder.test.tsx
  • src/hooks/useScreenRecorder.ts
  • src/i18n/locales/ar/editor.json
  • src/i18n/locales/ar/settings.json
  • src/i18n/locales/en/editor.json
  • src/i18n/locales/en/settings.json
  • src/i18n/locales/es/editor.json
  • src/i18n/locales/es/settings.json
  • src/i18n/locales/fr/editor.json
  • src/i18n/locales/fr/settings.json
  • src/i18n/locales/it/editor.json
  • src/i18n/locales/it/settings.json
  • src/i18n/locales/ja-JP/editor.json
  • src/i18n/locales/ja-JP/settings.json
  • src/i18n/locales/ko-KR/editor.json
  • src/i18n/locales/ko-KR/settings.json
  • src/i18n/locales/ru/editor.json
  • src/i18n/locales/ru/settings.json
  • src/i18n/locales/tr/editor.json
  • src/i18n/locales/tr/settings.json
  • src/i18n/locales/vi/editor.json
  • src/i18n/locales/vi/settings.json
  • src/i18n/locales/zh-CN/editor.json
  • src/i18n/locales/zh-CN/settings.json
  • src/i18n/locales/zh-TW/editor.json
  • src/i18n/locales/zh-TW/settings.json
  • src/lib/captions.test.ts
  • src/lib/captions.ts
  • src/main.tsx

Comment thread electron-builder.json5
Comment on lines +41 to +60
"target": [
{
"target": "dmg",
"arch": ["arm64"]
}
],
"icon": "icons/icons/mac/icon.icns",
"artifactName": "${productName}-Mac-${arch}-${version}-Installer.${ext}",
"extraResources": [
{
"from": "electron/native/bin",
"to": "electron/native/bin",
"filter": ["darwin-arm64/**/*"]
},
{
"from": "electron/native/captions",
"to": "electron/native/captions",
"filter": ["darwin-arm64/**/*"]
}
],

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Check mac build arch defaults/flags =="
rg -n -C3 --no-heading 'ARCHS|--x64|--arm64|electron-builder --mac' scripts/build_macos.sh

echo
echo "== Check packaging config arch/resource filters =="
rg -n -C2 --no-heading '"target"|"arch"|darwin-arm64|darwin-x64|electron/native/captions|electron/native/bin' electron-builder.json5

echo
echo "== Check docs/automation mentions of mac x64 support =="
rg -n -C2 --no-heading 'mac.*x64|darwin-x64|--x64|universal|arm64-only' README.md .github 2>/dev/null || true

Repository: siddharthvaddem/openscreen

Length of output: 2367


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Inspect scripts/build_macos.sh around arch build + artifact handling =="
rg -n "ARCHS=|\-\-mac|\-\-dir|DMG_NAME|DMG_PATH|APP_BUNDLE|release/" scripts/build_macos.sh
echo
# show the relevant region (best-effort)
sed -n '1,140p' scripts/build_macos.sh
echo
sed -n '140,260p' scripts/build_macos.sh

echo
echo "== Check electron-builder.json5 for any darwin-x64/universal resource handling =="
rg -n --no-heading "darwin-x64|x64|universal|darwin-arm64|extraResources|macos" electron-builder.json5

echo
echo "== Show the full mac section of electron-builder.json5 (roughly) =="
# Print from mac: to next top-level key (best effort)
python3 - <<'PY'
import re, pathlib
p=pathlib.Path("electron-builder.json5")
s=p.read_text(encoding="utf-8", errors="ignore").splitlines()
start=None
for i,line in enumerate(s):
    if re.match(r'^\s*"mac"\s*:', line):
        start=i
        break
if start is None:
    raise SystemExit("mac section not found")
# find next top-level key after start
end=len(s)
for i in range(start+1,len(s)):
    if re.match(r'^\s*"[a-zA-Z]+"\s*:\s*{?', s[i]):
        end=i
        break
for line in s[start:end]:
    print(line)
PY

Repository: siddharthvaddem/openscreen

Length of output: 9489


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Inspect electron/native assets layout =="
# list only a few levels to keep output small
ls -la electron/native 2>/dev/null || true
find electron/native/bin -maxdepth 3 -type d 2>/dev/null | head -n 200 || true
find electron/native/captions -maxdepth 3 -type d 2>/dev/null | head -n 200 || true

echo
echo "== Look for darwin-x64 assets presence =="
rg -n --hidden --no-messages "darwin-x64" . || true
rg -n --hidden --no-messages "darwin-arm64" . || true

echo
echo "== Search for code that loads electron/native/bin/captions =="
rg -n --hidden --no-messages "native/(bin|captions)|electron/native/(bin|captions)|captions|darwin-(arm64|x64)" . || true

Repository: siddharthvaddem/openscreen

Length of output: 12329


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== electron-builder.json5 length + mac block with context =="
wc -l electron-builder.json5
rg -n --no-heading '^\s*"mac"\s*:' electron-builder.json5
rg -n --no-heading '"extraResources"|^\s*"target"|darwin-arm64|darwin-x64' electron-builder.json5
echo
python3 - <<'PY'
import pathlib,re
p=pathlib.Path("electron-builder.json5")
lines=p.read_text(encoding="utf-8",errors="ignore").splitlines()
# find mac block start
start=None
for i,l in enumerate(lines):
    if re.match(r'^\s*"mac"\s*:\s*{', l):
        start=i; break
if start is None: 
    print("mac block not found"); raise SystemExit(1)
# print until next top-level key at same indent (starts with "  "??) approximate by regex for ^\s*"[a-zA-Z]+"\s*:
end=len(lines)
for j in range(start+1,len(lines)):
    if re.match(r'^\s*"[a-zA-Z]+"\s*:\s*{?', lines[j]) and j>start:
        end=j
        break
for k in range(start, min(end, start+120)):
    print(f"{k+1:4d}:{lines[k]}")
PY

echo
echo "== Inspect electron/captions/whisper.ts asset path + platformArch logic =="
rg -n --no-heading "resourcesPath|platformArch|darwin-arm64|darwin-x64|native/captions" electron/captions/whisper.ts
python3 - <<'PY'
import pathlib
p=pathlib.Path("electron/captions/whisper.ts")
s=p.read_text(encoding="utf-8",errors="ignore").splitlines()
# print around lines referenced earlier (rough)
for idx in range(60,120):
    if idx < len(s):
        print(f"{idx+1:4d}:{s[idx]}")
PY

echo
echo "== Inspect cursor helper resolution fallback =="
rg -n --no-heading "OPENSCREEN_MAC_CURSOR_HELPER_EXE|openscreen-macos-cursor-helper|archTag|darwin-arm64|darwin-x64" electron/native-bridge/cursor/recording/macNativeCursorRecordingSession.ts
python3 - <<'PY'
import pathlib
p=pathlib.Path("electron/native-bridge/cursor/recording/macNativeCursorRecordingSession.ts")
s=p.read_text(encoding="utf-8",errors="ignore").splitlines()
for idx in range(1,120):
    if idx < len(s):
        print(f"{idx+1:4d}:{s[idx]}")
PY

Repository: siddharthvaddem/openscreen

Length of output: 7171


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Inspect whisper.ts where whisperPath/modelPath are used =="
rg -n --no-heading "whisperPath|modelPath|resolveCaptionTools|OPENSCREEN_WHISPER" electron/captions/whisper.ts
echo
python3 - <<'PY'
import pathlib
p=pathlib.Path("electron/captions/whisper.ts")
s=p.read_text(encoding="utf-8",errors="ignore").splitlines()
# print ranges likely around resolveCaptionTools usage
for start in [100,140,180,220,260,300]:
    end=min(len(s), start+80)
    print("\n--- lines",start+1,"-",end,"---")
    for i in range(start,end):
        print(f"{i+1:4d}:{s[i]}")
PY

Repository: siddharthvaddem/openscreen

Length of output: 10725


Flag arm64-only native/captions filters as lowkey risky for x64 runtime

electron-builder.json5 hard-pins mac dmg to arm64 and only includes extraResources with filter: ["darwin-arm64/**/*"] for both electron/native/bin and electron/native/captions, even though scripts/build_macos.sh builds an x64 .app. so the x64 bundle won’t contain the darwin-x64 caption/native assets, and electron/captions/whisper.ts will look for ${request.platform}-${request.arch} (darwin-x64 for x64) and return Local caption tools are not available when the tools/models aren’t found (no cross-arch fallback). kinda cursed—this shows up as “captions unavailable” only on x64.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@electron-builder.json5` around lines 41 - 60, The build config currently
ships only darwin-arm64 caption/native assets and targets arm64, causing
electron/captions/whisper.ts to fail for darwin-x64 runtime; update
electron-builder.json5 so extraResources filters include the x64 variants (e.g.
add "darwin-x64/**/*" or a wildcard like "darwin-*/**/*" for both
electron/native/bin and electron/native/captions) and/or add x64 to the mac
target arches so the bundle contains darwin-x64 assets; confirm
scripts/build_macos.sh outputs match these arch names and that
electron/captions/whisper.ts will find `${request.platform}-${request.arch}`
(e.g. darwin-x64) at runtime.

Comment on lines +173 to +176
const safeJobId = request.jobId.replace(/[^a-zA-Z0-9_-]/g, "-") || "job";
const jobDir = path.join(tempRoot, `openscreen-captions-${safeJobId}`);
const wavPath = path.join(jobDir, "audio.wav");
const outputBase = path.join(jobDir, "transcript");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Temp dir naming can collide across different job IDs (lowkey risky).

Different jobId values can sanitize to the same safeJobId, so two concurrent jobs may share one jobDir; then one job can overwrite artifacts or delete the other job’s files in cleanup.

nit: safer unique temp dir
+import { randomUUID } from "node:crypto";
 import { spawn } from "node:child_process";
 import fs from "node:fs/promises";
 import os from "node:os";
 import path from "node:path";
@@
 	const tempRoot = request.tempDir ?? os.tmpdir();
 	const safeJobId = request.jobId.replace(/[^a-zA-Z0-9_-]/g, "-") || "job";
-	const jobDir = path.join(tempRoot, `openscreen-captions-${safeJobId}`);
+	const jobDir = path.join(
+		tempRoot,
+		`openscreen-captions-${safeJobId}-${randomUUID()}`,
+	);

Also applies to: 276-279

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@electron/captions/whisper.ts` around lines 173 - 176, safeJobId collision
risk: update the temp dir creation that uses request.jobId/safeJobId/jobDir (and
the same pattern around lines 276-279) to append a cryptographically unique
suffix (e.g., crypto.randomUUID() or a uuid v4) or include a timestamp+pid to
guarantee uniqueness; construct jobDir from tempRoot +
`openscreen-captions-${safeJobId}-${uniqueSuffix}` and update all derived paths
(wavPath, outputBase) and corresponding cleanup logic to use this new jobDir so
concurrent jobs cannot collide or delete each other’s files.

Comment thread electron/ipc/handlers.ts
Comment on lines +2857 to +2879
const jobId = options?.jobId ?? `caption-${Date.now()}`;
const normalizedPath = resolveApprovedVideoPath(videoPath);
if (!normalizedPath) {
return {
jobId,
status: "error",
segments: [],
message: "File path is not approved or is not a supported video file",
};
}

return await captionJobRegistry.start({
jobId,
videoPath: normalizedPath,
language: options?.language,
});
} catch (error) {
return {
jobId: options?.jobId ?? `caption-${Date.now()}`,
status: "error",
segments: [],
message: "Failed to generate captions",
error: String(error),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Keep jobId stable in the caption error path

right now the catch branch can generate a new timestamped jobId, so callers may get a different id than the one used for the attempted start. kinda cursed for state correlation.

small fix
-ipcMain.handle(
+ipcMain.handle(
   "start-caption-generation",
   async (
     _,
     videoPath: string,
     options?: import("../../src/lib/captions").CaptionGenerationOptions,
   ) => {
+    const jobId = options?.jobId ?? `caption-${Date.now()}`;
     try {
-      const jobId = options?.jobId ?? `caption-${Date.now()}`;
       const normalizedPath = resolveApprovedVideoPath(videoPath);
       if (!normalizedPath) {
         return {
           jobId,
@@
     } catch (error) {
       return {
-        jobId: options?.jobId ?? `caption-${Date.now()}`,
+        jobId,
         status: "error",
         segments: [],
         message: "Failed to generate captions",
         error: String(error),
       };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@electron/ipc/handlers.ts` around lines 2857 - 2879, Compute the jobId once
and reuse it in both the try and catch paths instead of recomputing with
Date.now(); move or declare const jobId = options?.jobId ??
`caption-${Date.now()}` outside/before the try block (or save it to a stable
variable at the top of the handler) so captionJobRegistry.start({ jobId, ... })
uses the same id and the catch return uses that same jobId (and keep the
existing error string/message and other fields like segments), ensuring
resolveApprovedVideoPath and captionJobRegistry.start still behave the same.

Comment on lines +490 to +492
aria-label={`${sticker.label} sticker`}
title={sticker.label}
onClick={() => onContentChange(sticker.dataUrl)}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sticker a11y text is hardcoded in English (not locale-aware).

lowkey risky for i18n/a11y: aria-label and title are built from sticker.label (English literals), so screen-reader/tooltip text won’t follow the active locale. Please switch this to translated keys (or translate by preset id) before rendering.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/components/video-editor/AnnotationSettingsPanel.tsx` around lines 490 -
492, The sticker accessibility text is currently using hardcoded English via
sticker.label for aria-label and title in AnnotationSettingsPanel; change this
to use the app's i18n translation function (e.g., t or translate) or map sticker
IDs to localized strings before rendering (e.g., t(`stickers.${sticker.id}`) or
a lookup like getStickerLabel(sticker.id)) and pass that translated value to
aria-label and title while keeping the onClick to
onContentChange(sticker.dataUrl) unchanged so tooltips and screen-readers follow
the active locale.

Comment on lines +42 to +48
const jobId = `caption-${Date.now()}`;
let cancelled = false;
let settled = false;
setStatus("running");

void window.electronAPI
.startCaptionGeneration(sourcePath, { jobId, ...(language ? { language } : {}) })

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

jobId is collision-prone, which makes cancellation lowkey risky.

Using only Date.now() can produce duplicate IDs during rapid effect replays; with async cancel IPC, that can cancel the wrong in-flight caption job.

nit: cleaner/safer job id generation
-		const jobId = `caption-${Date.now()}`;
+		const jobId = `caption-${Date.now()}-${Math.random().toString(36).slice(2)}`;

Also applies to: 82-88

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/components/video-editor/useAutoCaptionGeneration.ts` around lines 42 -
48, The jobId generation using `jobId = \`caption-${Date.now()}\`` is
collision-prone; change it to a collision-resistant ID (e.g., use
`crypto.randomUUID()` or a UUID/v4 generator) wherever `jobId` is created (both
in the block around `startCaptionGeneration(sourcePath, { jobId, ... })` and the
similar creation at lines 82-88) so that `startCaptionGeneration` and any
cancellation IPC target the correct in-flight caption job; ensure the same new
ID format is used when sending cancel requests so `cancel` logic matches the
unique job identifier.

const t = useScopedT("editor");
const ts = useScopedT("settings");
const availableLocales = getAvailableLocales();
const captionLanguage = useMemo(() => getWhisperLanguageForLocale(locale), [locale]);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Auto-captions can retrigger on locale changes and stack duplicate caption annotations.

Because language tracks live locale while autoCaptionSourcePath stays set, switching UI locale can start a fresh generation for the same media and append another caption batch. kinda cursed UX-wise.

small guard to make auto-run one-shot per loaded source
 	const handleCaptionStatusMessage = useCallback(
 		(result: CaptionGenerationResult) => {
+			if (result.status !== "cancelled") {
+				setAutoCaptionSourcePath(null);
+			}
+
 			if (result.status === "success" && result.segments.length > 0) {
 				toast.success(t("captions.generated", { count: result.segments.length }));
 				return;
 			}

Also applies to: 1971-1998

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/components/video-editor/VideoEditor.tsx` at line 287, The
caption-generation is being retriggered when locale changes because
captionLanguage (useMemo) updates while autoCaptionSourcePath remains set; add a
one-shot guard so auto-captions run once per loaded source: introduce a
ref/state like autoCaptionGeneratedForSourceRef and update it when you
start/finish generation, then in the effect that kicks off auto-captions check
that autoCaptionSourcePath is truthy and
autoCaptionGeneratedForSourceRef.current !== autoCaptionSourcePath before
running; reset or clear the ref when the media source truly changes (so future
new sources can generate) and keep existing symbols captionLanguage and
autoCaptionSourcePath to locate where to add the guard.

Comment on lines +147 to +161
const safeShowRecordingAnnotationOverlay = useCallback(async () => {
try {
await window.electronAPI?.showRecordingAnnotationOverlay?.();
} catch (error) {
console.warn("Failed to show recording annotation overlay:", error);
}
}, []);

const safeHideRecordingAnnotationOverlay = useCallback(async () => {
try {
await window.electronAPI?.hideRecordingAnnotationOverlay?.();
} catch (error) {
console.warn("Failed to hide recording annotation overlay:", error);
}
}, []);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Overlay IPC failures can be silently swallowed when success === false.

right now only thrown errors are logged; a non-throwing { success: false, error } response is ignored. that can make annotation overlay failures kinda invisible during recording. Consider checking the returned payload and logging/reporting when success is false.

nit: cleaner failure handling
 const safeShowRecordingAnnotationOverlay = useCallback(async () => {
 	try {
-		await window.electronAPI?.showRecordingAnnotationOverlay?.();
+		const result = await window.electronAPI?.showRecordingAnnotationOverlay?.();
+		if (result && !result.success) {
+			console.warn("Failed to show recording annotation overlay:", result.error);
+		}
 	} catch (error) {
 		console.warn("Failed to show recording annotation overlay:", error);
 	}
 }, []);

 const safeHideRecordingAnnotationOverlay = useCallback(async () => {
 	try {
-		await window.electronAPI?.hideRecordingAnnotationOverlay?.();
+		const result = await window.electronAPI?.hideRecordingAnnotationOverlay?.();
+		if (result && !result.success) {
+			console.warn("Failed to hide recording annotation overlay:", result.error);
+		}
 	} catch (error) {
 		console.warn("Failed to hide recording annotation overlay:", error);
 	}
 }, []);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/hooks/useScreenRecorder.ts` around lines 147 - 161,
safeShowRecordingAnnotationOverlay and safeHideRecordingAnnotationOverlay
currently only catch thrown exceptions and ignore IPC results; update both to
await the IPC call result, check if the returned payload has { success: false }
and log or report the included error/message (e.g., processLogger.warn or
console.warn) when success is false, keeping the try/catch for real exceptions;
ensure you reference the returned value (e.g., const res = await
window.electronAPI?.showRecordingAnnotationOverlay?.()) and handle res.success
=== false before returning.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant