Expand believed-free model sets for GitHub Models and Google #29
Merged
Refresh the believed_free lists for two already-supported providers using the current free-tier catalogs (verified against the maintained cheahjs/free-llm-api-resources list):

- GitHub Models: 1 -> 38 models, covering the free catalog across OpenAI (gpt-4o/4.1/5 families, o1/o3/o4-mini), Meta Llama 3.x/4, Microsoft Phi-4, Mistral, Cohere, AI21 Jamba, DeepSeek, and xAI Grok, each with a reasoning-level tag.
- Google: add the new free Gemini 3 flash previews (gemini-3-flash-preview, gemini-3.1-flash-lite-preview) and the Gemma 3 instruct line (27b/12b/4b/1b-it), with free_limits from the published free-tier quotas.

The setup wizard reads these from free_models.json at runtime, and config.example.json is regenerated from the same source of truth.

Also extend the Google docs scraper (the keep-up-to-date source) to discover gemma-* IDs in addition to gemini-*, so the newly added Gemma models are covered on future scrapes.
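The scraper change can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: the pattern `MODEL_ID_RE` and the helper `discover_model_ids` are hypothetical names, and the real scraper may parse the docs page differently. It only illustrates widening a `gemini-*` match to also pick up `gemma-*` IDs.

```python
import re

# Hypothetical pattern: accept both "gemini-" and "gemma-" prefixes.
# The tail allows letters, digits, dots, and hyphens, but must end on a
# letter or digit so trailing punctuation is not swallowed.
MODEL_ID_RE = re.compile(r"\b(?:gemini|gemma)-[a-z0-9](?:[a-z0-9.\-]*[a-z0-9])?\b")


def discover_model_ids(page_text: str) -> list[str]:
    """Return unique model IDs found in page_text, in order of first appearance."""
    matches = (m.group(0) for m in MODEL_ID_RE.finditer(page_text))
    # dict.fromkeys de-duplicates while preserving insertion order.
    return list(dict.fromkeys(matches))


sample = (
    "Try gemini-3-flash-preview, gemini-3.1-flash-lite-preview, "
    "or gemma-3-27b-it. The gemma-3-27b-it model is instruction-tuned."
)
ids = discover_model_ids(sample)
```

De-duplicating in first-seen order keeps the output stable between scrapes, which makes diffs of the generated model list easier to review.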
…ocker
- Add a GET /version route returning {"name": "llmproxy", "version": ...}.
Clients and uptime probes poll /version; without an explicit route Flask
404s, since the /v1/<path> pass-through only covers /v1/* paths.
- Pass an explicit, writable worker_tmp_dir to gunicorn. Gunicorn writes a
per-worker heartbeat file to the system temp dir; in containers run
read-only or with an arbitrary --user, Python's tempfile resolution can
fall through to the CWD (/app), which the non-root user cannot write,
crashing startup with a PermissionError. Prefer /dev/shm (gunicorn's
documented Docker recommendation), then the system temp dir.
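The directory preference described above could be sketched like this. The helper name `pick_worker_tmp_dir` is an assumption for illustration; only the `worker_tmp_dir` setting name comes from gunicorn itself, and the actual selection logic in this commit may differ.

```python
import os
import tempfile


def pick_worker_tmp_dir(candidates=("/dev/shm",)) -> str:
    """Return the first candidate that exists and is writable,
    falling back to the system temp dir otherwise."""
    for path in candidates:
        # W_OK | X_OK: the process must be able to create files in the dir.
        if os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK):
            return path
    return tempfile.gettempdir()


# In a gunicorn config file, the chosen directory would be assigned to the
# worker_tmp_dir setting so heartbeat files never land in a read-only CWD.
worker_tmp_dir = pick_worker_tmp_dir()
```

Checking writability up front turns a startup crash into a predictable fallback, which matters most when the container runs read-only or under an arbitrary uid.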
Adds a /version smoke test.
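For illustration, a route and smoke test along these lines might look like the sketch below. It assumes Flask is installed; the `VERSION` placeholder and the stub `/v1/<path>` handler are stand-ins, not the project's real code.

```python
from flask import Flask, jsonify

app = Flask(__name__)
VERSION = "0.0.0"  # placeholder; the real version string is not shown in this PR


@app.route("/v1/<path:subpath>")
def passthrough(subpath):
    # Stand-in for the real pass-through: it only ever matches /v1/* paths.
    return jsonify({"proxied": subpath})


@app.route("/version", methods=["GET"])
def version():
    # Explicit route, so probes polling /version no longer get a 404.
    return jsonify({"name": "llmproxy", "version": VERSION})


def test_version_smoke():
    client = app.test_client()
    resp = client.get("/version")
    assert resp.status_code == 200
    body = resp.get_json()
    assert body["name"] == "llmproxy"
    assert "version" in body
```

The smoke test uses Flask's built-in test client, so it needs no running server or network access.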
- Dockerfile: add a default non-root USER (uid 1000, gid 0) with group-writable /config and home dirs, so `docker run` works without the --user flag and also under an arbitrary `--user <uid>:0` (OpenShift-style). HOME is fixed so Path.home() resolves for unmapped uids. Update the header examples and the named-volume mount path accordingly.
- README: add /version to the API endpoints table, note the image is non-root by default (--user now optional, used for bind-mount ownership), and update the named-volume config path to the non-root user's home.
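The reason a fixed HOME matters can be shown with a small sketch. It assumes a POSIX system, where `Path.home()` consults the `HOME` environment variable before falling back to the password database (which has no entry for an unmapped uid). The helper name `resolve_home` and the example path are hypothetical.

```python
import os
from pathlib import Path


def resolve_home(env_home: str) -> Path:
    """Set HOME for this process and return what Path.home() resolves to.

    Assumption: POSIX semantics, where HOME takes precedence over the
    passwd lookup that would fail for a uid with no passwd entry.
    """
    os.environ["HOME"] = env_home
    return Path.home()


home = resolve_home("/tmp/example-home")  # illustrative path only
```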