Skip to content

LCORE-2308: LlamaStack Pydantic AI Provider#1806

Open
jrobertboos wants to merge 6 commits into
lightspeed-core:mainfrom
jrobertboos:lcore-2308
Open

LCORE-2308: LlamaStack Pydantic AI Provider#1806
jrobertboos wants to merge 6 commits into
lightspeed-core:mainfrom
jrobertboos:lcore-2308

Conversation

@jrobertboos
Copy link
Copy Markdown
Contributor

@jrobertboos jrobertboos commented May 27, 2026

Description

Added Pydantic AI provider to use llama stack as a provider for Pydantic AI models.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Dependencies

    • Added pydantic-ai>=1.99.0 dependency to support new integrations
  • New Features

    • Llama Stack provider enabling Pydantic AI integration with Llama Stack systems
    • Flexible initialization supporting both library-mode and server-mode configurations
    • Full OpenAI-compatible API support including streaming operations
  • Tests

    • Comprehensive unit tests validating provider properties, configuration modes, and request handling

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 27, 2026

Review Change Stack

Walkthrough

This PR adds a Pydantic AI provider that integrates with Llama Stack, enabling OpenAI-compatible API calls to be routed through in-process Llama Stack route handlers or remote Llama Stack servers. The implementation includes a dual-mode LlamaStackProvider, a custom httpx transport for request routing, comprehensive test coverage, and the required pydantic-ai dependency.

Changes

Pydantic AI Llama Stack Integration

Layer / File(s) Summary
Project dependencies and package structure
pyproject.toml, src/pydantic_ai_lightspeed/__init__.py, src/pydantic_ai_lightspeed/llamastack/__init__.py, tests/unit/pydantic_ai_lightspeed/*
Added pydantic-ai>=1.99.0 dependency, root package docstring, and public API exports with LlamaStackProvider as the primary export from the llamastack subpackage.
LlamaStackProvider implementation
src/pydantic_ai_lightspeed/llamastack/_provider.py
Implements LlamaStackProvider(Provider[AsyncOpenAI]) supporting two initialization modes: library mode accepts AsyncLlamaStackAsLibraryClient and builds an in-process transport; server mode accepts base_url and optional api_key. Exposes name, base_url, and client properties, static model_profile() lookup, __repr__() for debugging, and _set_http_client() for HTTP client mutation.
LlamaStackLibraryTransport
src/pydantic_ai_lightspeed/llamastack/_transport.py
Custom httpx.AsyncBaseTransport intercepts OpenAI-compatible requests and routes them through Llama Stack in-process handlers; handles both non-streaming JSON responses and streaming server-sent events, with provider-data header injection and context preservation across streaming operations.
LlamaStackProvider test suite
tests/unit/pydantic_ai_lightspeed/llamastack/test_provider.py
Five test classes covering provider properties, server-mode initialization variants, library-mode initialization, mutual-exclusion constraints between configuration modes, and HTTP client replacement behavior.
LlamaStackLibraryTransport test suite
tests/unit/pydantic_ai_lightspeed/llamastack/test_transport.py
Five test classes validating async byte stream iteration, transport initialization, request dispatching, provider-data header injection, path parameter merging, DELETE response handling, and SSE frame formatting for streaming responses.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • radofuchs
  • tisnik
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 79.59% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly identifies the main change: adding a LlamaStack Pydantic AI Provider implementation, which is directly reflected in the new provider class, transport, and comprehensive test suite.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
✨ Simplify code
  • Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread src/pydantic_ai_lightspeed/llamastack/_provider.py Fixed
Comment thread src/pydantic_ai_lightspeed/llamastack/_provider.py Fixed
Comment thread src/pydantic_ai_lightspeed/llamastack/_provider.py Fixed
Comment thread src/pydantic_ai_lightspeed/llamastack/_provider.py Fixed
Comment thread src/pydantic_ai_lightspeed/llamastack/_transport.py Fixed
Comment thread src/pydantic_ai_lightspeed/llamastack/_transport.py Fixed
Comment thread src/pydantic_ai_lightspeed/llamastack/_provider.py Fixed
Comment thread src/pydantic_ai_lightspeed/llamastack/_provider.py Fixed
Comment thread src/pydantic_ai_lightspeed/llamastack/_provider.py Fixed
Comment thread src/pydantic_ai_lightspeed/llamastack/_provider.py Fixed
Comment thread src/pydantic_ai_lightspeed/llamastack/_transport.py Fixed
@jrobertboos jrobertboos marked this pull request as ready for review June 1, 2026 14:16
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

♻️ Duplicate comments (3)
src/pydantic_ai_lightspeed/llamastack/_provider.py (2)

85-87: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Configure a timeout on the library-mode httpx.AsyncClient.

Although this transport dispatches in-process, the client is constructed without an explicit timeout (also flagged by static analysis). Set an explicit timeout to keep behavior predictable and silence the scanner.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/pydantic_ai_lightspeed/llamastack/_provider.py` around lines 85 - 87, The
AsyncClient instantiation for lib_http_client is missing an explicit timeout;
update the httpx.AsyncClient(...) call (where lib_http_client is created,
passing the existing transport and base_url) to include a timeout parameter
(e.g., timeout=httpx.Timeout(10.0) or another sensible duration) so the client
has a bounded request timeout and the static scanner is satisfied.

74-81: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Replace assert with explicit exceptions for argument validation.

These mutual-exclusion checks are stripped when Python runs with -O, so the constraint silently disappears in optimized builds. Raise a proper error instead.

🛡️ Proposed fix
-        if library_client is not None:
-            assert (
-                base_url is None
-            ), "Cannot provide both `library_client` and `base_url`"
-            assert api_key is None, "Cannot provide both `library_client` and `api_key`"
-            assert (
-                http_client is None
-            ), "Cannot provide both `library_client` and `http_client`"
+        if library_client is not None:
+            if base_url is not None:
+                raise ValueError("Cannot provide both `library_client` and `base_url`")
+            if api_key is not None:
+                raise ValueError("Cannot provide both `library_client` and `api_key`")
+            if http_client is not None:
+                raise ValueError("Cannot provide both `library_client` and `http_client`")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/pydantic_ai_lightspeed/llamastack/_provider.py` around lines 74 - 81,
Replace the runtime-optimized asserts in the mutual-exclusion checks with
explicit exceptions so validation cannot be stripped under -O: in the block
checking library_client against base_url, api_key, and http_client (symbols:
library_client, base_url, api_key, http_client) raise a clear exception (e.g.,
ValueError or TypeError) for each conflict instead of using assert, preserving
the existing messages like "Cannot provide both `library_client` and `base_url`"
etc.; ensure each check uses raise with the same descriptive text so callers get
deterministic validation errors regardless of Python optimization flags.
src/pydantic_ai_lightspeed/llamastack/_transport.py (1)

100-100: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid assert for runtime invariants.

assert self._client.route_impls is not None is removed under python -O. Since handle_async_request already raises RuntimeError when route_impls is None, narrow the type explicitly or raise instead of asserting. Same applies to Line 133.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/pydantic_ai_lightspeed/llamastack/_transport.py` at line 100, Replace the
runtime `assert` checks on `self._client.route_impls` with an explicit
None-check and raise a RuntimeError (or use typing.cast after the check) so the
invariant holds even under -O; specifically, in handle_async_request (and the
similar occurrence at line 133) change `assert self._client.route_impls is not
None` to an explicit `if self._client.route_impls is None: raise
RuntimeError("route_impls is not initialized")` and then assign `route_impls =
self._client.route_impls` (or `route_impls = cast(ConcreteType,
self._client.route_impls)`) to narrow the type for subsequent code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/pydantic_ai_lightspeed/llamastack/__init__.py`:
- Line 3: Replace the relative import in the package initializer by importing
the provider using its absolute module path: change the import that references
LlamaStackProvider (from ._provider import LlamaStackProvider) to use the full
package path pydantic_ai_lightspeed.llamastack._provider so that the symbol
LlamaStackProvider is imported via absolute import.

In `@src/pydantic_ai_lightspeed/llamastack/_provider.py`:
- Around line 111-112: Add a descriptive docstring to the _set_http_client
method explaining its purpose and behavior: state that it injects an
httpx.AsyncClient into the underlying client by setting self._client._client
(note the protected attribute access), describe expected parameter http_client:
httpx.AsyncClient and that the method returns None, and mention any side effects
or threading/async considerations if relevant; place this docstring immediately
above def _set_http_client to satisfy project conventions.
- Line 14: The import of LlamaStackLibraryTransport uses a relative import;
change it to an absolute import so internal modules follow the codebase
guideline — replace "from ._transport import LlamaStackLibraryTransport" with an
absolute import referencing the package path that exposes _transport (import the
LlamaStackLibraryTransport symbol via its full module path) and ensure any tests
or import consumers still resolve the new absolute name.

In `@src/pydantic_ai_lightspeed/llamastack/_transport.py`:
- Line 105: The code mutates the caller's body dict via the in-place union
operator (body |= path_params); instead, create a new merged mapping and use
that local variable (e.g., merged_body = {**(body or {}), **(path_params or
{})}) so the original body parameter is not modified, and replace both
occurrences of the in-place mutation (the body |= path_params at the shown
location and the similar one later) with assignment to a new dict variable used
for downstream processing.
- Around line 25-30: Add concise docstrings for the _AsyncByteStream class
methods: document __init__(self, gen: AsyncGenerator[bytes, None]) to explain
that it stores an async generator that yields raw bytes (mention the parameter
name "gen" and its type/purpose), and document async def __aiter__(self) ->
AsyncIterator[bytes] to state that it returns an async iterator yielding bytes
chunks for streaming (matching httpx.AsyncByteStream contract). Follow project
convention style (short summary line, param description for gen, and return
description indicating an async iterator of bytes).
- Around line 93-99: Both helper methods _handle_non_streaming and
_handle_streaming are missing docstrings; add descriptive docstrings to each
function that summarize purpose, list parameters with types (request:
httpx.Request, method: str, path: str, body: dict[str, Any] for
_handle_non_streaming and the appropriate params for _handle_streaming),
describe the return type (httpx.Response or async stream/response behavior), and
mention possible exceptions/errors raised. Place the docstrings immediately
below each def, using triple-quoted strings and concise sentences following the
project's docstring style so linters and readers can understand intent and
types.

In `@tests/unit/pydantic_ai_lightspeed/llamastack/test_provider.py`:
- Around line 70-75: The test currently only checks that provider.client is not
None, which won't detect if the constructor ignored the provided http_client;
update test_custom_http_client to assert identity against the provided instance
(e.g., assert provider.client is custom_client) so it verifies
LlamaStackProvider(http_client=custom_client) actually wires the same client
into provider.client.
- Around line 72-73: The tests leak httpx.AsyncClient instances because
LlamaStackProvider (and its _set_http_client path that assigns to AsyncOpenAI)
never closes a provided client; update the failing tests to either (a) replace
real httpx.AsyncClient() with a mock/stub (e.g., monkeypatch or a simple dummy
object) when testing non-I/O logic, or (b) convert the tests that create
httpx.AsyncClient() into async tests (use pytest.mark.asyncio) and explicitly
close the client after use with await custom_client.aclose() (or use "async with
httpx.AsyncClient() as custom_client:"), ensuring any use of LlamaStackProvider,
_set_http_client, or AsyncOpenAI in those tests does not leave open connections.

---

Duplicate comments:
In `@src/pydantic_ai_lightspeed/llamastack/_provider.py`:
- Around line 85-87: The AsyncClient instantiation for lib_http_client is
missing an explicit timeout; update the httpx.AsyncClient(...) call (where
lib_http_client is created, passing the existing transport and base_url) to
include a timeout parameter (e.g., timeout=httpx.Timeout(10.0) or another
sensible duration) so the client has a bounded request timeout and the static
scanner is satisfied.
- Around line 74-81: Replace the runtime-optimized asserts in the
mutual-exclusion checks with explicit exceptions so validation cannot be
stripped under -O: in the block checking library_client against base_url,
api_key, and http_client (symbols: library_client, base_url, api_key,
http_client) raise a clear exception (e.g., ValueError or TypeError) for each
conflict instead of using assert, preserving the existing messages like "Cannot
provide both `library_client` and `base_url`" etc.; ensure each check uses raise
with the same descriptive text so callers get deterministic validation errors
regardless of Python optimization flags.

In `@src/pydantic_ai_lightspeed/llamastack/_transport.py`:
- Line 100: Replace the runtime `assert` checks on `self._client.route_impls`
with an explicit None-check and raise a RuntimeError (or use typing.cast after
the check) so the invariant holds even under -O; specifically, in
handle_async_request (and the similar occurrence at line 133) change `assert
self._client.route_impls is not None` to an explicit `if
self._client.route_impls is None: raise RuntimeError("route_impls is not
initialized")` and then assign `route_impls = self._client.route_impls` (or
`route_impls = cast(ConcreteType, self._client.route_impls)`) to narrow the type
for subsequent code.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b16d8b64-3aea-4572-a5f6-de9532c70e71

📥 Commits

Reviewing files that changed from the base of the PR and between 9700c70 and 7d77ef4.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • pyproject.toml
  • src/pydantic_ai_lightspeed/__init__.py
  • src/pydantic_ai_lightspeed/llamastack/__init__.py
  • src/pydantic_ai_lightspeed/llamastack/_provider.py
  • src/pydantic_ai_lightspeed/llamastack/_transport.py
  • tests/unit/pydantic_ai_lightspeed/__init__.py
  • tests/unit/pydantic_ai_lightspeed/llamastack/__init__.py
  • tests/unit/pydantic_ai_lightspeed/llamastack/test_provider.py
  • tests/unit/pydantic_ai_lightspeed/llamastack/test_transport.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

  • tests/unit/pydantic_ai_lightspeed/llamastack/__init__.py
  • tests/unit/pydantic_ai_lightspeed/__init__.py
  • tests/unit/pydantic_ai_lightspeed/llamastack/test_transport.py
  • tests/unit/pydantic_ai_lightspeed/llamastack/test_provider.py
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

  • src/pydantic_ai_lightspeed/__init__.py
  • src/pydantic_ai_lightspeed/llamastack/__init__.py
  • src/pydantic_ai_lightspeed/llamastack/_transport.py
  • src/pydantic_ai_lightspeed/llamastack/_provider.py
src/**/__init__.py

📄 CodeRabbit inference engine (AGENTS.md)

Package __init__.py files must contain brief package descriptions

Files:

  • src/pydantic_ai_lightspeed/__init__.py
  • src/pydantic_ai_lightspeed/llamastack/__init__.py
🔇 Additional comments (10)
src/pydantic_ai_lightspeed/__init__.py (1)

1-1: LGTM!

src/pydantic_ai_lightspeed/llamastack/__init__.py (2)

1-1: LGTM!


5-5: LGTM!

tests/unit/pydantic_ai_lightspeed/__init__.py (1)

1-1: LGTM!

tests/unit/pydantic_ai_lightspeed/llamastack/__init__.py (1)

1-1: LGTM!

pyproject.toml (1)

82-83: ⚡ Quick win

Confirm pydantic-ai>=1.99.0 is valid.

pydantic-ai has a published release 1.99.0 on PyPI, so the dependency constraint pydantic-ai>=1.99.0 is not based on a non-existent version.

src/pydantic_ai_lightspeed/llamastack/_provider.py (1)

31-49: LGTM!

src/pydantic_ai_lightspeed/llamastack/_transport.py (1)

10-19: Guard against private Llama Stack internals breakage

src/pydantic_ai_lightspeed/llamastack/_transport.py imports several non-public Llama Stack internals (convert_pydantic_to_json_value, PROVIDER_DATA_VAR, request_provider_data_context, find_matching_route, preserve_contexts_async_generator, plus later route_impls/_convert_body). The repo pins llama-stack==0.6.0 (and related *-client/*-api), but the unit tests mostly patch these symbols rather than exercising real imports from the pinned package. Add a small CI smoke test that installs the pinned llama-stack and imports/uses these symbols, and consider narrowing the version range to reduce silent breakage when internals change.

tests/unit/pydantic_ai_lightspeed/llamastack/test_transport.py (1)

18-344: LGTM!

tests/unit/pydantic_ai_lightspeed/llamastack/test_provider.py (1)

1-69: LGTM!

Also applies to: 77-141, 146-151, 155-157

Comment thread src/pydantic_ai_lightspeed/llamastack/__init__.py Outdated
Comment thread src/pydantic_ai_lightspeed/llamastack/_provider.py Outdated
Comment thread src/pydantic_ai_lightspeed/llamastack/_provider.py
Comment thread src/pydantic_ai_lightspeed/llamastack/_transport.py
Comment thread src/pydantic_ai_lightspeed/llamastack/_transport.py
Comment thread src/pydantic_ai_lightspeed/llamastack/_transport.py Outdated
Comment thread tests/unit/pydantic_ai_lightspeed/llamastack/test_provider.py Outdated
Comment thread tests/unit/pydantic_ai_lightspeed/llamastack/test_provider.py Outdated
@jrobertboos
Copy link
Copy Markdown
Contributor Author

Please Review:

Comment on lines +86 to +87
lib_http_client = httpx.AsyncClient(
transport=transport,
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants