[AI-Generated] feat: auto-discover models from custom OpenAI-compatible endpoints#527

Open
HWiese1980 wants to merge 2 commits into AsyncFuncAI:main from HWiese1980:feature/auto-discover-models

@HWiese1980 HWiese1980 commented May 26, 2026

⚠️ Disclosure

This PR was fully AI-generated (code, tests, documentation, and this PR description) using Claude (Anthropic) via Hermes Agent. A human reviewed and approved the intent and final result, but did not write any of the code manually.


Summary

When OPENAI_BASE_URL is set (indicating a custom OpenAI-compatible endpoint), the /models/config API endpoint now dynamically discovers available models by querying the endpoint's /v1/models API.

This enables seamless integration with self-hosted LLM servers like llama.cpp, vLLM, LocalAI, and text-generation-webui without requiring manual model configuration in generator.json.

Changes

  • api/api.py: Add _discover_openai_models() async helper + modify get_model_config() to use it for the openai provider when OPENAI_BASE_URL is set
  • tests/unit/test_model_discovery.py: Unit tests for discovery success, failure fallback, empty response, and URL construction
  • README.md: Document custom endpoint configuration and automatic model discovery

Behavior

| Condition | Result |
| --- | --- |
| OPENAI_BASE_URL not set | Uses generator.json models (unchanged) |
| OPENAI_BASE_URL set + endpoint reachable | Discovers models from /v1/models |
| OPENAI_BASE_URL set + endpoint unreachable | Logs warning, falls back to generator.json |
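
The three cases above can be sketched as a small selection function. This is only an illustration of the described behavior, not the code in api/api.py: `select_models`, its parameters, and the use of plain strings for models are hypothetical, and it assumes that an empty discovery result is treated the same as a failure.

```python
from typing import Callable, List, Optional


def select_models(
    base_url: Optional[str],
    discover: Callable[[str], List[str]],
    static_models: List[str],
) -> List[str]:
    """Pick the model list according to the behavior table (hypothetical helper)."""
    # Case 1: no custom endpoint configured, keep the static generator.json models.
    if not base_url:
        return static_models
    # Cases 2 and 3: the discovery callable is assumed to return [] on any failure,
    # so an empty list means "fall back to the static models".
    discovered = discover(base_url)
    return discovered if discovered else static_models


if __name__ == "__main__":
    static = ["static-model"]
    print(select_models(None, lambda url: ["ignored"], static))
    print(select_models("http://localhost:8000/v1", lambda url: ["llama"], static))
    print(select_models("http://localhost:8000/v1", lambda url: [], static))
```

Passing the discovery step in as a callable keeps the fallback rule testable without a running endpoint.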

Example

# llama.cpp server running locally
export OPENAI_BASE_URL=http://localhost:8000/v1
export OPENAI_API_KEY=dummy

The model loaded in llama.cpp will automatically appear in the UI dropdown.
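
For the example above, the discovery request would go to http://localhost:8000/v1/models. A minimal sketch of that URL construction follows, assuming the models path is appended to OPENAI_BASE_URL after trimming any trailing slash; `build_models_url` is a hypothetical name, and the PR's actual helper may build the URL differently.

```python
def build_models_url(base_url: str) -> str:
    """Append the models path to an OpenAI-compatible base URL (hypothetical helper)."""
    # Assumption: the base URL already ends in the API version segment, e.g. ".../v1".
    # Trailing slashes are trimmed so the result never contains "//models".
    return base_url.rstrip("/") + "/models"


if __name__ == "__main__":
    print(build_models_url("http://localhost:8000/v1"))
    print(build_models_url("http://localhost:8000/v1/"))
```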

No new dependencies

Uses aiohttp, which is already among the project's dependencies.

When OPENAI_BASE_URL is set, the /models/config endpoint now queries
the custom endpoint's /v1/models API to dynamically discover available
models instead of relying solely on the static generator.json config.

This enables seamless integration with self-hosted LLM servers like
llama.cpp, vLLM, LocalAI, and text-generation-webui without requiring
manual model configuration.

Key changes:
- Add _discover_openai_models() helper that queries /v1/models
- Modify get_model_config() to use discovery for the openai provider
  when OPENAI_BASE_URL is set
- Graceful fallback to generator.json if discovery fails
- Add unit tests for the discovery feature
- Document the feature in README.md

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces dynamic model discovery for custom OpenAI-compatible endpoints by querying their /models API when OPENAI_BASE_URL is configured, falling back gracefully to static configurations on failure. The feedback suggests improving the robustness of the _discover_openai_models helper function by passing the OPENAI_API_KEY in the request headers, reducing the timeout from 10 to 3 seconds to prevent UI hangs, and adding defensive type checks when parsing the JSON response.

Comment thread api/api.py
Comment on lines +181 to +199
    try:
        async with aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=10)) as session:
            async with session.get(models_url) as response:
                response.raise_for_status()
                data = await response.json()

                discovered = []
                # OpenAI /v1/models returns {"data": [{"id": "model-name", ...}, ...]}
                for model_info in data.get("data", []):
                    model_id = model_info.get("id", "")
                    if model_id:
                        discovered.append(Model(id=model_id, name=model_id))

                if discovered:
                    logger.info(f"Discovered {len(discovered)} models from {models_url}")
                return discovered
    except Exception as e:
        logger.warning(f"Failed to discover models from {models_url}: {e}")
        return []

high

This helper function can be improved in three key areas to enhance correctness, performance, and robustness:

  1. Authentication Support: Many OpenAI-compatible custom endpoints (such as OpenRouter, Together AI, or private endpoints with token authentication) require an Authorization header even for the /v1/models endpoint. Since OPENAI_API_KEY is configured in the environment, we should pass it as a Bearer token in the request headers.
  2. Reduced Timeout: A 10-second timeout is excessively long for a blocking GET endpoint (/models/config). If the custom endpoint is offline or slow, the entire UI/API will hang for 10 seconds before falling back to static models. Reducing this to 3 seconds provides a much better user experience.
  3. Defensive Parsing: We should defensively verify that the returned JSON data and each model_info item are dictionaries before attempting to access them, preventing potential AttributeError exceptions if a non-compliant endpoint returns an unexpected format.
    headers = {}
    api_key = os.environ.get("OPENAI_API_KEY")
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"

    try:
        async with aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=3)) as session:
            async with session.get(models_url, headers=headers) as response:
                response.raise_for_status()
                data = await response.json()

                discovered = []
                if isinstance(data, dict):
                    # OpenAI /v1/models returns {"data": [{"id": "model-name", ...}, ...]}
                    for model_info in data.get("data", []):
                        if isinstance(model_info, dict):
                            model_id = model_info.get("id", "")
                            if model_id:
                                discovered.append(Model(id=model_id, name=model_id))

                if discovered:
                    logger.info(f"Discovered {len(discovered)} models from {models_url}")
                return discovered
    except Exception as e:
        logger.warning(f"Failed to discover models from {models_url}: {e}")
        return []
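
One way to make the defensive parsing in this suggestion testable without a network call is to pull it into a pure function. The sketch below is illustrative only: `parse_model_ids` is a hypothetical name, it returns plain id strings instead of the project's Model objects, and it adds two checks beyond the suggestion (that "data" is a list and that each id is a string).

```python
from typing import Any, List


def parse_model_ids(payload: Any) -> List[str]:
    """Extract model ids from a /v1/models style payload, ignoring malformed parts."""
    # A non-compliant endpoint might return a list, string, or null at the top level.
    if not isinstance(payload, dict):
        return []
    entries = payload.get("data", [])
    # Extra check (not in the suggestion): "data" could be null or a non-list value.
    if not isinstance(entries, list):
        return []
    model_ids = []
    for entry in entries:
        if not isinstance(entry, dict):
            continue  # skip entries that are not objects
        model_id = entry.get("id", "")
        # Extra check (not in the suggestion): keep only non-empty string ids.
        if isinstance(model_id, str) and model_id:
            model_ids.append(model_id)
    return model_ids


if __name__ == "__main__":
    sample = {"data": [{"id": "model-a"}, "junk", {"id": ""}, {"name": "no-id"}]}
    print(parse_model_ids(sample))
```

With the parsing separated out, the async helper only has to fetch the JSON and hand it over, and malformed responses can be covered by plain unit tests.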

@HWiese1980 HWiese1980 changed the title feat: auto-discover models from custom OpenAI-compatible endpoints [AI-Generated] feat: auto-discover models from custom OpenAI-compatible endpoints May 26, 2026
@HWiese1980 HWiese1980 force-pushed the feature/auto-discover-models branch 2 times, most recently from c7034c7 to b533a39 Compare May 26, 2026 08:14
…ve parsing

- Pass OPENAI_API_KEY as Bearer token in discovery request headers
- Reduce timeout from 10s to 3s to prevent UI hangs
- Add isinstance checks for defensive JSON parsing
@HWiese1980 HWiese1980 force-pushed the feature/auto-discover-models branch from b533a39 to e10ac16 Compare May 26, 2026 08:16
@HWiese1980
Author

Human here: ready to be merged in my opinion... also happy to hear some feedback.
