LCORE-2282: Normalize Vertex AI model IDs to work around llama-stack 0.6.x bug #1818
anik120 wants to merge 1 commit into
Walkthrough
This PR adds Vertex AI model ID normalization to the Responses API integration.
42e92b3 to 5ce7bf4
….6.x bug

Fixes 500 error when using Vertex AI models with llama-stack 0.6.x.

Root cause: llama-stack 0.6.x inline::meta-reference responses provider normalizes model IDs before checking allowed_models, but doesn't normalize the allowed_models list itself. This causes validation to fail:
- Model registered as: publishers/google/models/gemini-2.5-flash
- llama-stack strips to: google/gemini-2.5-flash internally
- Checks against allowed list with full prefix
- Mismatch → 500 error

Solution: Strip publishers/google/models/ prefix before passing model ID to llama-stack, matching what it expects internally.

This workaround can be removed when upgrading to llama-stack 0.7.0+ which fixes the underlying bug via ogx-ai/ogx#5169

Signed-off-by: Anik Bhattacharjee <anbhatta@redhat.com>
        Normalized model ID with Vertex AI prefix stripped if present
    """
    if model_id.startswith("publishers/google/models/"):
        return model_id.replace("publishers/google/models/", "google/", 1)
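A self-contained version of the helper in this hunk might look like the sketch below. The function name `normalize_vertex_model_id` and the two constant names are assumptions, since the excerpt shows only the tail of the docstring and the body; the prefix and replacement strings are taken from the diff. The final fall-through `return model_id` is also assumed from the "if present" wording in the docstring.

```python
# Hypothetical names; only the two string literals come from the diff above.
VERTEX_PREFIX = "publishers/google/models/"
NORMALIZED_PREFIX = "google/"


def normalize_vertex_model_id(model_id: str) -> str:
    """Return model_id with the Vertex AI prefix replaced by "google/".

    IDs that do not start with the Vertex AI prefix are returned unchanged.
    """
    if model_id.startswith(VERTEX_PREFIX):
        # Slice off the leading prefix so only the start of the ID is touched.
        return NORMALIZED_PREFIX + model_id[len(VERTEX_PREFIX):]
    return model_id


print(normalize_vertex_model_id("publishers/google/models/gemini-2.5-flash"))
# prints: google/gemini-2.5-flash
```

Because the result no longer starts with `publishers/google/models/`, calling the helper twice gives the same ID as calling it once.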
@anik120 is "publishers/google/models" a consistent prefix? I thought the fix would be more general (namely, replacing any model prefix with google/ rather than hard-coding one).
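One possible reading of the more general approach raised in this comment is sketched below. It assumes Vertex AI IDs always have the shape `publishers/<publisher>/models/<model>` and rewrites them to `<publisher>/<model>`; the function name and the regular expression are hypothetical, and the PR does not say whether llama-stack 0.6.x normalizes non-Google publishers the same way, so this is illustrative only.

```python
import re

# Assumption: Vertex AI IDs look like publishers/<publisher>/models/<model>.
_VERTEX_ID = re.compile(r"^publishers/(?P<publisher>[^/]+)/models/(?P<model>.+)$")


def normalize_publisher_model_id(model_id: str) -> str:
    """Rewrite publishers/<publisher>/models/<model> to <publisher>/<model>."""
    match = _VERTEX_ID.match(model_id)
    if match is None:
        # Not in the assumed shape; leave the ID untouched.
        return model_id
    return f"{match['publisher']}/{match['model']}"
```

For the Google case this produces the same result as the hard-coded prefix in the diff; the trade-off is that it also rewrites IDs for publishers whose handling has not been verified.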
Description
Fixes 500 error when using Vertex AI models with llama-stack 0.6.x.
Root cause: llama-stack 0.6.x inline::meta-reference responses provider normalizes model IDs before checking allowed_models, but doesn't normalize the allowed_models list itself. This causes validation to fail:
- Model registered as: publishers/google/models/gemini-2.5-flash
- llama-stack strips to: google/gemini-2.5-flash internally
- Checks against allowed list with full prefix
- Mismatch → 500 error
Solution: Replace the publishers/google/models/ prefix with google/ before passing the model ID to llama-stack, matching what it expects internally.
This workaround can be removed when upgrading to llama-stack 0.7.0+, which fixes the underlying bug via ogx-ai/ogx#5169.
Signed-off-by: Anik Bhattacharjee <anbhatta@redhat.com>
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
Bug Fixes
Tests