| 1 | +# Podcast Generation System & TTS Worker Fixes |
| 2 | + |
| 3 | +- Added `{{@Podcast:}}` tag for AI-powered podcast generation from any document content |
| 4 | +- Built multi-speaker script generation with configurable styles (debate, interview, chat, lecture, storytelling) |
| 5 | +- Integrated Kokoro TTS multi-speaker synthesis with voice pre-fetching and per-chunk progress |
| 6 | +- Created podcast marketplace UI with pre-built podcast templates and search/filter |
| 7 | +- Added podcast template system with 15+ curated templates across tech, science, business, creative, and education categories |
| 8 | +- Built WAV audio creation from Float32Array TTS output with download support |
| 9 | +- Added real-time podcast generation progress UI with phase indicators (research → script → audio → done) |
| 10 | +- Extracted `processMultiSegments()` as standalone async function in TTS worker for reliable synthesis |
| 11 | +- Fixed: TTS worker message delivery bug — bundled segments with `init` message to process in same handler execution |
| 12 | +- Fixed: Service worker cache-first strategy serving stale `tts-worker.js` — added cache-busting `?v=` param |
| 13 | +- Fixed: Excluded worker files from caching in the service worker's `shouldCacheResponse()` |
| 14 | +- Bumped service worker cache version from `v2` → `v3` to force cache invalidation |
| 15 | +- Added `worker.onerror` handler on main thread to catch worker-level errors |
| 16 | +- Added TTS worker version identifier (`TTS_WORKER_VERSION`) with startup logging |
| 17 | +- Added `_pendingMultiSegments` backup mechanism in `status: ready` handler |
| 18 | +- Added per-chunk 90s timeout via `Promise.race` to prevent infinite synthesis hangs |
| 19 | +- Added event loop yields (`setTimeout(0)`) between WASM calls so `postMessage` flushes |
| 20 | +- Added voice pre-fetch phase before synthesis to separate network vs WASM issues |
| 21 | +- Added heartbeat logger (10s interval) during multi-speaker synthesis |
| 22 | +- Added detailed timestamped logging across `textToSpeech.js`, `tts-worker.js`, `podcast-docgen.js` |
| 23 | +- Added help mode entries for podcast generation feature |
| 24 | +- Added podcast renderer integration in `renderer.js` for `{{@Podcast:}}` tag processing |
| 25 | + |
| 26 | +--- |
| 27 | + |
| 28 | +## Summary |
| 29 | +Complete podcast generation system: users write `{{@Podcast: topic}}` in any document and get an AI-generated multi-speaker podcast with web research, script writing, and Kokoro TTS audio synthesis. Also fixed a critical TTS worker bug where the service worker's cache-first strategy served stale worker code, and the Web Worker silently dropped `speak-multi` messages sent after the async `init` handler completed. |
| 30 | + |
| 31 | +--- |
| 32 | + |
| 33 | +## 1. Podcast Document Generator (`{{@Podcast:}}` Tag) |
| 34 | +**Files:** `js/podcast-docgen.js`, `css/podcast-docgen.css` |
| 35 | +**What:** New IIFE component that intercepts `{{@Podcast: topic}}` tags in rendered markdown. Performs 3-phase generation: (1) web search research via Jina API, (2) AI script generation with `[Speaker]` markers, (3) Kokoro TTS multi-speaker audio synthesis. Includes `parseScript()` for speaker segmentation, `createWavBlob()` for audio encoding, and real-time progress UI with phase indicators. |
| 36 | +**Impact:** Users can generate full podcast episodes from any topic directly in their documents — no external tools needed. |
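The two helpers named above, `parseScript()` and `createWavBlob()`, can be sketched roughly as follows. This is a minimal illustration and not the code in `js/podcast-docgen.js`: the script format (one `[Speaker] text` marker per line, with unmarked lines treated as continuations), the function name `encodeWav`, mono 16-bit PCM output, and the 24 kHz default sample rate are all assumptions.

```javascript
// Hypothetical sketch: split a script into speaker segments.
// Assumes each turn starts with a "[Speaker]" marker at the start of a line.
function parseScript(script) {
  const segments = [];
  for (const line of script.split('\n')) {
    const match = line.match(/^\s*\[([^\]]+)\]\s*(.*)$/);
    if (match) {
      segments.push({ speaker: match[1].trim(), text: match[2].trim() });
    } else if (line.trim() && segments.length > 0) {
      // Unmarked line: treat it as a continuation of the previous turn.
      const last = segments[segments.length - 1];
      last.text = `${last.text} ${line.trim()}`.trim();
    }
  }
  return segments.filter((segment) => segment.text.length > 0);
}

// Hypothetical sketch: encode Float32Array samples as a mono 16-bit PCM WAV.
// Returns an ArrayBuffer; in a browser it can be wrapped as
// new Blob([buffer], { type: 'audio/wav' }) for playback or download.
function encodeWav(samples, sampleRate = 24000) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true); // fmt chunk size
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, 1, true); // channel count: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, 'data');
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1] before scaling to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, Math.round(s < 0 ? s * 0x8000 : s * 0x7fff), true);
  }
  return buffer;
}
```

Returning an `ArrayBuffer` rather than a `Blob` keeps the encoder usable outside the browser; the blob wrapping and download link are a thin layer on top.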
| 37 | + |
| 38 | +## 2. Podcast Marketplace |
| 39 | +**Files:** `js/podcast-marketplace.js`, `css/podcast-marketplace.css`, `js/templates/podcasts.js` |
| 40 | +**What:** Built a browsable marketplace UI with 15+ curated podcast templates across categories (Tech, Science, Business, Creative, Education). Includes search/filter, category tabs, template cards with metadata (duration, speakers, style), and one-click generation. Templates define speaker count, style, custom prompts, and voice assignments. |
| 41 | +**Impact:** Users can browse and generate podcasts from pre-built templates without writing prompts. |
| 42 | + |
| 43 | +## 3. TTS Worker Multi-Speaker Fix (Critical Bug) |
| 44 | +**Files:** `js/tts-worker.js`, `js/textToSpeech.js` |
| 45 | +**What:** The Web Worker silently dropped `speak-multi` messages sent after the async `init` handler completed. Two root causes: the service worker served a stale cached `tts-worker.js` via its cache-first strategy, and the worker could not reliably process a second `postMessage` after `init`. Fix: (1) extracted `processMultiSegments()` as a standalone function, (2) bundled segments with the `init` message via a `pendingSegments` field, (3) the worker now processes those segments inline at the end of the init handler, (4) added a cache-busting `?v=` param to the worker URL, (5) added a `_pendingMultiSegments` backup dispatch from the `status: ready` handler. |
| 46 | +**Impact:** Podcast TTS synthesis now works reliably — previously it hung forever after model loaded. |
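The bundled-init flow described above might look something like the sketch below. Only `init`, `speak-multi`, `pendingSegments`, `status: ready`, and `processMultiSegments()` come from the description; the `createHandler` factory, the injected `loadModel`, `synthesize`, and `post` callbacks, and the `chunk`/`done` message shapes are hypothetical, introduced so the flow can be run outside a real Web Worker.

```javascript
// Hypothetical sketch of the worker-side message handler.
// loadModel, synthesize, and post are injected stand-ins for the real
// model loader, Kokoro synthesis call, and self.postMessage.
function createHandler({ loadModel, synthesize, post }) {
  let model = null;

  // Standalone async function, so it can be called from either branch below.
  async function processMultiSegments(segments) {
    for (let i = 0; i < segments.length; i++) {
      const audio = await synthesize(model, segments[i]);
      post({ type: 'chunk', index: i, total: segments.length, audio });
    }
    post({ type: 'done' });
  }

  return async function onMessage({ data }) {
    if (data.type === 'init') {
      model = await loadModel();
      post({ status: 'ready' });
      // Segments bundled with init are processed in this same handler run,
      // so nothing depends on a second postMessage arriving afterwards.
      if (Array.isArray(data.pendingSegments) && data.pendingSegments.length > 0) {
        await processMultiSegments(data.pendingSegments);
      }
    } else if (data.type === 'speak-multi') {
      await processMultiSegments(data.segments);
    }
  };
}
```

In a real worker the returned function would be assigned to `self.onmessage`, with `post` bound to `self.postMessage`.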
| 47 | + |
| 48 | +## 4. Service Worker Cache Fix |
| 49 | +**Files:** `sw.js` |
| 50 | +**What:** Bumped `CACHE_NAME` from `textagent-v2` to `textagent-v3` to invalidate stale caches. Added exclusion for `*worker*` files in `shouldCacheResponse()` so worker JS is always fetched fresh. This prevents the cache-first strategy from serving outdated worker code. |
| 51 | +**Impact:** Future worker code changes take effect immediately without manual cache clearing. |
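As a rough illustration of the two cache measures, the sketch below pairs a worker exclusion check with a versioned worker URL. The signature of `shouldCacheResponse()` in `sw.js` is not shown in this changelog, so the `(url, response)` parameters, the status check, the `workerUrl` helper, and the placeholder base origin are assumptions.

```javascript
const CACHE_NAME = 'textagent-v3'; // bumped from textagent-v2

// Hypothetical signature: decide whether a fetched response may be cached.
function shouldCacheResponse(url, response) {
  if (!response || response.status !== 200) return false;
  // The base origin is a placeholder so relative URLs can be parsed.
  const { pathname } = new URL(url, 'https://example.invalid');
  // Never cache anything that looks like a worker script (the *worker* rule).
  if (/worker/i.test(pathname)) return false;
  return true;
}

// Hypothetical helper: build the cache-busted worker URL for the main thread.
function workerUrl(version) {
  return `js/tts-worker.js?v=${encodeURIComponent(version)}`;
}
```

On the main thread the worker could then be created with `new Worker(workerUrl(version))`, so a version change produces a URL that no stale cache entry matches.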
| 52 | + |
| 53 | +## 5. TTS Synthesis Robustness |
| 54 | +**Files:** `js/tts-worker.js`, `js/textToSpeech.js` |
| 55 | +**What:** Added a per-chunk 90s timeout (`Promise.race`), event loop yields between WASM calls (`setTimeout(0)`), a voice pre-fetch phase, a heartbeat logger, a `worker.onerror` handler, and version stamping. Failed chunks are skipped gracefully instead of aborting the entire podcast. |
| 56 | +**Impact:** Audio synthesis is more resilient — provides real-time progress, detects hangs, and degrades gracefully on failures. |
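The timeout, yield, and skip-on-failure behavior can be sketched as below. The helper names (`withTimeout`, `yieldToEventLoop`, `synthesizeAll`) and the `synthesizeChunk` callback are hypothetical. Note that the yield wraps `setTimeout` in a promise: awaiting the bare return value of `setTimeout(0)` would not wait for the timer to fire.

```javascript
// Resolve on the next timer tick so queued postMessage calls can flush.
const yieldToEventLoop = () => new Promise((resolve) => setTimeout(resolve, 0));

// Reject if the promise has not settled within ms milliseconds.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it does not keep the event loop alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Synthesize chunks in order; a failed or timed-out chunk is skipped.
async function synthesizeAll(chunks, synthesizeChunk, timeoutMs = 90_000) {
  const results = [];
  for (let i = 0; i < chunks.length; i++) {
    try {
      results.push(await withTimeout(synthesizeChunk(chunks[i]), timeoutMs, `chunk ${i}`));
    } catch (err) {
      console.warn(`Skipping chunk ${i}: ${err.message}`);
    }
    await yieldToEventLoop();
  }
  return results;
}
```

One caveat with this pattern: `Promise.race` stops waiting but does not cancel the underlying work, and the timer can only fire if the synthesis call actually yields the thread. A fully synchronous WASM call that blocks would not be interrupted by it.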
| 57 | + |
| 58 | +## 6. Integration & UI Updates |
| 59 | +**Files:** `index.html`, `js/renderer.js`, `js/templates.js`, `js/modal-templates.js`, `js/help-mode.js`, `src/main.js` |
| 60 | +**What:** Added podcast module imports in `main.js`, podcast tag processing in `renderer.js`, marketplace modal in `modal-templates.js`, help entries in `help-mode.js`, and toolbar button in `templates.js`. Updated `index.html` with podcast CSS imports. |
| 61 | +**Impact:** Podcast features are fully integrated into the TextAgent UI with discoverable entry points. |
| 62 | + |
| 63 | +--- |
| 64 | + |
| 65 | +## Files Changed (14 total) |
| 66 | + |
| 67 | +| File | Lines Changed | Type | |
| 68 | +|------|:---:|------| |
| 69 | +| `js/podcast-docgen.js` | +1046 | New: podcast generation engine | |
| 70 | +| `js/podcast-marketplace.js` | +923 | New: marketplace UI | |
| 71 | +| `css/podcast-marketplace.css` | +730 | New: marketplace styles | |
| 72 | +| `css/podcast-docgen.css` | +406 | New: podcast player styles | |
| 73 | +| `js/templates/podcasts.js` | +279 | New: podcast templates | |
| 74 | +| `js/textToSpeech.js` | +205 −30 | Multi-speaker fix, worker caching, error handling | |
| 75 | +| `js/tts-worker.js` | +203 −0 | processMultiSegments, bundled init, version stamp | |
| 76 | +| `index.html` | +30 −23 | CSS imports, podcast integration | |
| 77 | +| `js/renderer.js` | +12 −1 | Podcast tag processing | |
| 78 | +| `js/help-mode.js` | +11 | Podcast help entries | |
| 79 | +| `src/main.js` | +9 | Module imports | |
| 80 | +| `js/templates.js` | +4 −1 | Toolbar button | |
| 81 | +| `sw.js` | +3 −1 | Cache version bump, worker exclusion | |
| 82 | +| `js/modal-templates.js` | +1 | Marketplace modal | |