feat(cli): support any OpenAI-compatible vision endpoint for image captioning by yuemeng200 · Pull Request #1809 · heygen-com/hyperframes

yuemeng200 · 2026-06-30T10:45:34Z

Summary

Image captioning in `hyperframes capture` currently supports only two hardcoded providers: Google Gemini and OpenRouter. This blocks users in regions where neither service is easily accessible (e.g. mainland China) from getting AI-generated asset descriptions.

This PR adds three new environment variables that let any OpenAI-compatible vision endpoint be used:

| Variable | Description |
| --- | --- |
| `HYPERFRAMES_VISION_API_KEY` | Bearer token for the custom endpoint |
| `HYPERFRAMES_VISION_BASE_URL` | Base URL (e.g. `https://ark.cn-beijing.volces.com/api/v3`) |
| `HYPERFRAMES_VISION_MODEL` | Model ID (required; warns and skips if unset) |

Provider priority (first match wins):

1. `HYPERFRAMES_VISION_*`: custom OpenAI-compatible endpoint ← new
2. `OPENROUTER_API_KEY`: OpenRouter (unchanged)
3. `GEMINI_API_KEY` / `GOOGLE_API_KEY`: Google Gemini (unchanged)
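
The priority order above can be sketched as a small selection function. This is an illustration only: `pickVisionProvider`, `VisionEnv`, and `VisionProvider` are hypothetical names, not identifiers from this PR, and the sketch assumes the custom endpoint is selected only when both the key and the base URL are set.

```typescript
// Hypothetical sketch of the "first match wins" provider order.
// None of these names are taken from the PR's actual code.
type VisionProvider = "custom" | "openrouter" | "gemini" | "none";

interface VisionEnv {
  HYPERFRAMES_VISION_API_KEY?: string;
  HYPERFRAMES_VISION_BASE_URL?: string;
  OPENROUTER_API_KEY?: string;
  GEMINI_API_KEY?: string;
  GOOGLE_API_KEY?: string;
}

function pickVisionProvider(env: VisionEnv): VisionProvider {
  // 1. Custom OpenAI-compatible endpoint.
  //    Assumption: both the key and the base URL must be present.
  if (env.HYPERFRAMES_VISION_API_KEY && env.HYPERFRAMES_VISION_BASE_URL) {
    return "custom";
  }
  // 2. OpenRouter.
  if (env.OPENROUTER_API_KEY) {
    return "openrouter";
  }
  // 3. Google Gemini, via either key name.
  if (env.GEMINI_API_KEY || env.GOOGLE_API_KEY) {
    return "gemini";
  }
  // No provider configured: captioning would be skipped.
  return "none";
}
```

In a Node or Bun process this could be called as `pickVisionProvider(process.env)`.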

Example (Volcengine ARK / Doubao vision):

HYPERFRAMES_VISION_API_KEY=ark-xxx
HYPERFRAMES_VISION_BASE_URL=https://ark.cn-beijing.volces.com/api/v3
HYPERFRAMES_VISION_MODEL=doubao-seed-2-0-mini-260428

Works equally well with Azure OpenAI, local Ollama, self-hosted vLLM, or any other endpoint that speaks the OpenAI chat completions wire format with image_url content parts.
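
For readers unfamiliar with the wire format, the following is a minimal sketch of a chat completions request carrying an `image_url` content part. `buildCaptionRequest` and its parameters are hypothetical names, the prompt text is a placeholder, and the sketch assumes the `/chat/completions` route is appended to the configured base URL. It is not the PR's `openAiCompatCaptionOne` helper.

```typescript
// Illustrative sketch only: function and parameter names are assumptions.
interface CaptionRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildCaptionRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  imageBase64: string,
  mimeType: string = "image/png",
  prompt: string = "Describe this image in one sentence.",
): CaptionRequest {
  // Assumption: the chat completions route is appended to the base URL,
  // after trimming any trailing slashes.
  const url = `${baseUrl.replace(/\/+$/, "")}/chat/completions`;

  const body = {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          // The image travels inline as a base64 data URL.
          {
            type: "image_url",
            image_url: { url: `data:${mimeType};base64,${imageBase64}` },
          },
        ],
      },
    ],
  };

  return {
    url,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed to `fetch(url, init)`; in the OpenAI chat completions response format the caption text is found at `choices[0].message.content`.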

Implementation notes

- The OpenRouter fetch path is refactored to share the new `openAiCompatCaptionOne` helper, with no behaviour change for existing OpenRouter users.
- If `HYPERFRAMES_VISION_API_KEY` and `HYPERFRAMES_VISION_BASE_URL` are set but `HYPERFRAMES_VISION_MODEL` is missing, a warning is pushed and captioning is skipped gracefully (same degradation pattern as a missing key).
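
The missing-model degradation could look roughly like the sketch below. `resolveCustomVision`, the result shape, and the warning text are all hypothetical; the PR's actual warning wording and control flow are not shown here.

```typescript
// Hypothetical sketch of the "warn and skip" behaviour for a missing model.
// Names, result shape, and warning text are assumptions, not the PR's code.
interface CustomVisionEnv {
  HYPERFRAMES_VISION_API_KEY?: string;
  HYPERFRAMES_VISION_BASE_URL?: string;
  HYPERFRAMES_VISION_MODEL?: string;
}

type CustomVisionResult =
  | { kind: "unconfigured" } // custom endpoint not requested
  | { kind: "skip" } // requested but unusable: a warning was pushed
  | { kind: "ready"; apiKey: string; baseUrl: string; model: string };

function resolveCustomVision(
  env: CustomVisionEnv,
  warnings: string[],
): CustomVisionResult {
  const apiKey = env.HYPERFRAMES_VISION_API_KEY;
  const baseUrl = env.HYPERFRAMES_VISION_BASE_URL;
  const model = env.HYPERFRAMES_VISION_MODEL;

  if (!apiKey || !baseUrl) {
    return { kind: "unconfigured" };
  }
  if (!model) {
    // Degrade gracefully: record a warning and make no network request.
    warnings.push(
      "HYPERFRAMES_VISION_MODEL is not set; skipping image captioning",
    );
    return { kind: "skip" };
  }
  return { kind: "ready", apiKey, baseUrl, model };
}
```

Returning a distinct `skip` result, rather than throwing, keeps the rest of the capture running when only the captioning configuration is incomplete.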

Tests

4 new unit tests added to `contentExtractor.test.ts`:

- Custom endpoint happy path (verifies URL, auth header, model, base64 image payload)
- Custom endpoint takes priority over `OPENROUTER_API_KEY` when both are set
- Missing `HYPERFRAMES_VISION_MODEL` → warning emitted, fetch not called
- No-key path now also stubs `HYPERFRAMES_VISION_API_KEY` for completeness

All 6 tests pass (bunx vitest run). Build clean (bun run build). Lint + format clean (oxlint + oxfmt).

…ible vision endpoint

Allow any OpenAI-compatible vision API (Volcengine ARK, Azure OpenAI, Ollama, vLLM, etc.) to be used for image captioning during `hyperframes capture`.

New env vars (highest priority, checked before OPENROUTER_API_KEY):

HYPERFRAMES_VISION_API_KEY - bearer token for the custom endpoint
HYPERFRAMES_VISION_BASE_URL - base URL (e.g. https://ark.cn-beijing.volces.com/api/v3)
HYPERFRAMES_VISION_MODEL - model ID (required; warns and skips if missing)

Priority order: HYPERFRAMES_VISION_* > OPENROUTER_API_KEY > GEMINI/GOOGLE_API_KEY

The OpenRouter path is refactored to share the same openAiCompatCaptionOne helper used by the custom endpoint, with no behaviour change for existing users.

Adds 4 new tests covering: custom endpoint happy path, priority over OpenRouter, missing-model warning, and the no-key skip path.

yuemeng200 wants to merge 1 commit into heygen-com:main from yuemeng200:feat/vision-openai-compatible-endpoint
