
Expand free-model catalogs; add /version, non-root Docker, and config sync #30

Merged
BillJr99 merged 1 commit into main from claude/free-models-comparison-7vI4R
May 28, 2026

@BillJr99
Owner

Summary

Refreshes the free-model catalogs for two already-supported providers and adds three quality-of-life fixes that came up while working through deployment. Four self-contained commits:

1. Expand believed-free model sets for GitHub Models and Google

Refreshed believed_free (+ model_reasoning, and free_limits for Google) using current free-tier catalogs, verified against the maintained cheahjs/free-llm-api-resources list:

  • GitHub Models: 1 → 38 models — OpenAI (gpt-4o/4.1/5 families, o1/o3/o4-mini), Meta Llama 3.x/4, Microsoft Phi-4, Mistral, Cohere, AI21 Jamba, DeepSeek, xAI Grok, each with a reasoning tag.
  • Google: 3 → 9 models — new Gemini 3 flash previews (gemini-3-flash-preview, gemini-3.1-flash-lite-preview) and the Gemma 3 instruct line (27b/12b/4b/1b), with free-tier limits.
  • The setup wizard reads these from free_models.json at runtime; config.example.json is regenerated from the same source of truth.
  • Extended the Google docs scraper to discover gemma-* IDs (not just gemini-*), so the new Gemma models are covered on future scrapes.
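
As a rough illustration of the scraper change described in the last bullet, the sketch below widens an ID pattern from gemini-* to also cover gemma-*. The regex, the function name `discover_model_ids`, and the sample text are assumptions made for this example; the real scraper's pattern and parsing logic are not shown in this PR description and may differ.

```python
import re

# Hypothetical pattern: an ID starts with "gemini-" or "gemma-" and continues
# with lowercase alphanumeric segments separated by "." or "-".
MODEL_ID_RE = re.compile(r"\b(?:gemini|gemma)-[a-z0-9]+(?:[.-][a-z0-9]+)*")


def discover_model_ids(page_text: str) -> list[str]:
    """Return unique model IDs found in page_text, in first-seen order."""
    matches = (m.group(0) for m in MODEL_ID_RE.finditer(page_text))
    return list(dict.fromkeys(matches))  # dict preserves insertion order


# Illustrative input only; not taken from the real docs page.
sample = (
    "Try gemini-3.1-flash-lite-preview or gemma-3-27b-it. "
    "gemma-3-27b-it is listed twice."
)
print(discover_model_ids(sample))
```

Because each segment must begin with an alphanumeric character, a sentence-ending period after an ID is not swallowed into the match.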

2. /version endpoint + gunicorn temp-dir permission fix

  • Add GET /version, which returns {"name": "llmproxy", "version": ...}. Without it, Flask returns a 404, since the /v1/<path> pass-through only covers /v1/*.
  • Pass an explicit, writable worker_tmp_dir to gunicorn (prefers /dev/shm). In containers run read-only or with an arbitrary --user, Python's tempfile resolution can fall through to the CWD (/app), which a non-root user can't write — crashing startup with a PermissionError.
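
A minimal sketch of the temp-dir selection idea from the second bullet, assuming a gunicorn config file is used. `worker_tmp_dir` is gunicorn's setting name; the helper `pick_worker_tmp_dir` and the candidate list are hypothetical and may not match how this PR actually passes the value.

```python
import os
import tempfile


def pick_worker_tmp_dir(candidates=("/dev/shm", "/tmp")):
    """Return the first candidate that is an existing, writable directory."""
    for path in candidates:
        # W_OK | X_OK: the process must be able to create and open files there.
        if os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK):
            return path
    # Last resort: defer to tempfile, which may itself fall back to the CWD.
    return tempfile.gettempdir()


# In a gunicorn config file (for example gunicorn.conf.py), this module-level
# name is read as the worker_tmp_dir setting.
worker_tmp_dir = pick_worker_tmp_dir()
```

Checking writability up front means the choice is made explicitly rather than left to the implicit fallback that ends in /app.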

3. Non-root Docker by default + docs

  • Dockerfile adds a default non-root USER (uid 1000, gid 0) with group-writable config/home dirs, so docker run works without --user and also under an arbitrary --user <uid>:0 (OpenShift-style). HOME is fixed so Path.home() resolves for unmapped uids.
  • README: document /version in the endpoints table; note the image is non-root by default; update the named-volume config path.
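
The HOME point in the first bullet can be illustrated with a small sketch. On POSIX systems, Path.home() uses the HOME environment variable when it is set and consults the passwd database only otherwise, so a fixed HOME keeps it working for a uid with no passwd entry. The helper name `home_with_env` and the path /home/app are hypothetical; the image's actual HOME value is not stated here.

```python
import os
from pathlib import Path


def home_with_env(home: str) -> Path:
    """Return Path.home() as resolved with HOME temporarily set to `home`."""
    previous = os.environ.get("HOME")
    os.environ["HOME"] = home
    try:
        # On POSIX, HOME takes precedence over the passwd lookup.
        return Path.home()
    finally:
        # Restore the caller's environment.
        if previous is None:
            del os.environ["HOME"]
        else:
            os.environ["HOME"] = previous


print(home_with_env("/home/app"))
```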

4. --config PATH to sync a live config.json

The proxy reads believed_free / model_reasoning / free_limits at runtime from the user's config.json, not the package sidecar. scripts/update_free_models.py --config PATH now reconciles a live config in the same run:

  • Scope limited to providers configured in that file; custom/unconfigured providers and non-model keys (e.g. free_limits _note) untouched.
  • believed_free/free_limits synced (add new, remove no-longer-free); model_reasoning is add-only (never pruned), matching the scraper and config.example.json semantics.
  • Honors --dry-run, works under --regen-config-only, preserves providers/server and all other sections.
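
The reconciliation rules above can be sketched as follows. This is not the code in scripts/update_free_models.py. The data shapes are assumptions: each section is keyed by provider name, believed_free holds a list of model IDs, and free_limits and model_reasoning hold per-model dicts. Treating "_"-prefixed keys as non-model keys, and not overwriting existing model_reasoning values, are also assumptions.

```python
import copy


def sync_config(config: dict, catalog: dict) -> dict:
    """Return a reconciled copy of config; neither input is modified."""
    result = copy.deepcopy(config)
    for provider in config.get("providers", {}):
        if provider not in catalog.get("believed_free", {}):
            continue  # custom provider unknown to the catalog: left as-is

        # believed_free: full sync (adds newly free, drops no-longer-free).
        result.setdefault("believed_free", {})[provider] = list(
            catalog["believed_free"][provider]
        )

        # free_limits: full sync of model keys; "_"-prefixed keys such as
        # "_note" are carried over unchanged.
        if provider in catalog.get("free_limits", {}):
            old = result.setdefault("free_limits", {}).get(provider, {})
            merged = {k: v for k, v in old.items() if k.startswith("_")}
            merged.update(catalog["free_limits"][provider])
            result["free_limits"][provider] = merged

        # model_reasoning: add-only; existing entries are never pruned.
        reasoning = result.setdefault("model_reasoning", {}).setdefault(provider, {})
        for model, tag in catalog.get("model_reasoning", {}).get(provider, {}).items():
            reasoning.setdefault(model, tag)
    return result


# Illustrative data only; the limits and tags are placeholders.
config = {
    "providers": {"google": {"api_key": "placeholder"}},
    "believed_free": {"google": ["old-model"]},
    "free_limits": {"google": {"_note": "kept", "old-model": 1}},
    "model_reasoning": {"google": {"old-model": "low"}},
}
catalog = {
    "believed_free": {"google": ["gemini-3-flash-preview"]},
    "free_limits": {"google": {"gemini-3-flash-preview": 10}},
    "model_reasoning": {"google": {"gemini-3-flash-preview": "high"}},
}
synced = sync_config(config, catalog)
```

Running the function again on its own output leaves it unchanged, which mirrors the idempotence check in the test plan.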

Test plan

  • Full suite green (239 passed), including new tests/test_scraper/test_config_sync.py (8 cases) and a /version route smoke test.
  • config.example.json regen test passes byte-for-byte; regen does not dirty the committed file.
  • End-to-end: --config dry-run shows a per-section diff and writes nothing; real run reconciles correctly and is idempotent on re-run; API keys / server / _note preserved.

Notes / caveats

  • Free-tier lists are best-effort (the file's own disclaimer applies) — GitHub model-ID casing and Google's tightening free tier can drift. Running update_free_models.py --dry-run from an environment with outbound web access will reconcile against live /v1/models.
  • I kept google/gemini-2.5-pro despite signals it may have moved to paid-only in 2026 — evidence was conflicting and the repo only removes on high-confidence negatives.
  • The Docker image change only takes effect once the image is rebuilt; I could not build it in CI here (no docker daemon), so a local docker build && docker run … && curl /version is worth doing before relying on it.

https://claude.ai/code/session_01Ey3m4qS4tWxG6kT5fF7YrA


Generated by Claude Code

The proxy reads believed_free / model_reasoning / free_limits at runtime from
the user's config.json, not the package sidecar. Add an optional --config PATH
so the scraper can reconcile a live config in the same run:

- Scope limited to providers configured in that file; custom/unconfigured
  providers and non-model keys (e.g. free_limits "_note") are left untouched.
- believed_free and free_limits are synced (add newly-free, remove
  no-longer-free); model_reasoning is add-only (never pruned), matching the
  scraper and config.example.json semantics.
- Honors --dry-run (prints a per-section diff, writes nothing) and also runs
  under --regen-config-only; preserves providers/server and all other sections.

Adds tests/test_scraper/test_config_sync.py and documents the flag in the
README "Keeping the free-models list current" section.
@BillJr99 merged commit a040cfe into main on May 28, 2026
4 checks passed