Skip to content

feat: implement user agent handling across providers and normalize headers#8445

Open
Soulter wants to merge 1 commit into
masterfrom
feat/user-agent
Open

feat: implement user agent handling across providers and normalize headers#8445
Soulter wants to merge 1 commit into
masterfrom
feat/user-agent

Conversation

@Soulter
Copy link
Copy Markdown
Member

@Soulter Soulter commented May 30, 2026

Modifications / 改动点

  • This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果


Checklist / 检查清单

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
    / 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
    / 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
    / 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到 requirements.txtpyproject.toml 文件相应位置。

  • 😮 My changes do not introduce malicious code.
    / 我的更改没有引入恶意代码。

Summary by Sourcery

Standardize HTTP header handling across providers and ensure a consistent astrbot user agent is applied by default while preserving user-specified headers.

New Features:

  • Introduce a shared astrbot user agent string derived from the core version for use in outbound HTTP requests.

Enhancements:

  • Add reusable utilities for normalizing HTTP headers and applying default headers in a case-insensitive, non-destructive way.
  • Update OpenAI, Anthropic, Gemini, and Kimi Code providers to derive and propagate custom headers via the common utilities and use the default astrbot user agent when none is supplied.
  • Align default configuration for Anthropic-based providers to use the shared astrbot user agent instead of hard-coded values.
  • Ensure Gemini HTTP clients and underlying SDK client consistently receive the resolved custom headers, including the user agent.

Tests:

  • Add provider-level tests to verify default and custom user agent behavior for OpenAI, Anthropic, and Gemini providers.

@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. labels May 30, 2026
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • The new normalize_headers helper currently only accepts dict and silently drops other Mapping types; consider widening the type check to Mapping to avoid surprising behavior when callers pass e.g. CaseInsensitiveDict or similar mapping implementations.
  • Header casing and normalization logic are now spread across providers (e.g. User-Agent vs user-agent handling in OpenAI, Anthropic, and Gemini); it may be worth centralizing this into a single case-insensitive header utility so that all providers apply and expose headers in a consistent way.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The new `normalize_headers` helper currently only accepts `dict` and silently drops other `Mapping` types; consider widening the type check to `Mapping` to avoid surprising behavior when callers pass e.g. `CaseInsensitiveDict` or similar mapping implementations.
- Header casing and normalization logic are now spread across providers (e.g. `User-Agent` vs `user-agent` handling in OpenAI, Anthropic, and Gemini); it may be worth centralizing this into a single case-insensitive header utility so that all providers apply and expose headers in a consistent way.

## Individual Comments

### Comment 1
<location path="tests/test_openai_source.py" line_range="71-80" />
<code_context>
     )


+def test_openai_provider_uses_astrbot_default_user_agent(monkeypatch):
+    monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI)
+
+    provider = _make_provider()
+
+    assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT}
+    assert provider.client.kwargs["default_headers"] == {
+        "User-Agent": ASTRBOT_USER_AGENT,
+    }
+
</code_context>
<issue_to_address>
**suggestion (testing):** Consider a test where a custom user agent is provided but empty/whitespace to ensure the default UA is applied.

We already test the default UA with no `custom_headers` and the case where a non-empty custom UA is preserved. Please also add a case where `custom_headers` includes `"User-Agent"` with an empty or whitespace value (e.g., `"   "`), and assert that `apply_default_headers` replaces it with `ASTRBOT_USER_AGENT` to cover this malformed-config fallback behavior.

```suggestion
def test_openai_provider_uses_astrbot_default_user_agent(monkeypatch):
    monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI)

    provider = _make_provider()

    assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT}
    assert provider.client.kwargs["default_headers"] == {
        "User-Agent": ASTRBOT_USER_AGENT,
    }


def test_openai_provider_uses_astrbot_default_user_agent_for_blank_custom_header(
    monkeypatch,
):
    """
    When a custom User-Agent header is configured but is empty/whitespace,
    ensure apply_default_headers replaces it with ASTRBOT_USER_AGENT.
    """
    monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI)

    provider = _make_provider(
        overrides={
            "custom_headers": {
                "User-Agent": "   ",  # malformed/blank UA should be replaced
            },
        }
    )

    # The provider should normalize the header to the Astrbot default UA
    assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT}
    assert provider.client.kwargs["default_headers"] == {
        "User-Agent": ASTRBOT_USER_AGENT,
    }
```
</issue_to_address>

### Comment 2
<location path="tests/test_anthropic_kimi_code_provider.py" line_range="20-29" />
<code_context>
         return None


+def test_anthropic_provider_uses_astrbot_default_user_agent(monkeypatch):
+    monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)
+
+    provider = anthropic_source.ProviderAnthropic(
+        provider_config={
+            "id": "anthropic-test",
+            "type": "anthropic_chat_completion",
+            "model": "claude-test",
+            "key": ["test-key"],
+        },
+        provider_settings={},
+    )
+
+    assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT}
+    assert provider.client.kwargs["default_headers"] == {
+        "User-Agent": ASTRBOT_USER_AGENT,
+    }
</code_context>
<issue_to_address>
**suggestion (testing):** Extend Anthropics tests to cover custom headers with/without user agent and type normalization.

Consider adding a couple of nearby tests to fully cover the new header behavior:

- One where `provider_config.custom_headers` includes its own `User-Agent` plus extra headers, to verify a non-empty custom UA is preserved (and the default UA is not also applied) and that additional headers remain.
- Optionally, one where `custom_headers` includes non-string values (e.g., integers) to confirm normalization still coerces values to strings as before.

This will better demonstrate that existing Anthropics header behavior is preserved with the new shared helpers.

Suggested implementation:

```python
    assert provider.client.kwargs["default_headers"] == {
        "User-Agent": ASTRBOT_USER_AGENT,
    }


def test_anthropic_provider_preserves_custom_user_agent_and_headers(monkeypatch):
    monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)

    custom_headers = {
        "User-Agent": "MyApp/1.0 (test)",
        "X-Extra-Header": "extra-value",
    }

    provider = anthropic_source.ProviderAnthropic(
        provider_config={
            "id": "anthropic-test",
            "type": "anthropic_chat_completion",
            "model": "claude-test",
            "key": ["test-key"],
            "custom_headers": custom_headers,
        },
        provider_settings={},
    )

    # Custom non-empty User-Agent should be preserved, and the default ASTRBOT_USER_AGENT
    # should not be added on top.
    assert provider.custom_headers == custom_headers
    assert provider.client.kwargs["default_headers"] == custom_headers


def test_anthropic_provider_normalizes_custom_header_values_to_strings(monkeypatch):
    monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)

    custom_headers = {
        "User-Agent": "MyApp/2.0 (test)",
        "X-Int-Header": 123,
        "X-Bool-Header": True,
    }

    provider = anthropic_source.ProviderAnthropic(
        provider_config={
            "id": "anthropic-test",
            "type": "anthropic_chat_completion",
            "model": "claude-test",
            "key": ["test-key"],
            "custom_headers": custom_headers,
        },
        provider_settings={},
    )

    expected_headers = {
        "User-Agent": "MyApp/2.0 (test)",
        "X-Int-Header": "123",
        "X-Bool-Header": "True",
    }

    assert provider.custom_headers == expected_headers
    assert provider.client.kwargs["default_headers"] == expected_headers


def test_anthropic_provider_passes_custom_headers_via_default_headers(monkeypatch):
    monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)

```

If `ProviderAnthropic` currently expects `custom_headers` to be passed via `provider_settings` instead of `provider_config`, adjust the tests to move the `custom_headers` dict from `provider_config={...}` into `provider_settings={"custom_headers": custom_headers}` to match the existing convention in the rest of the file.
</issue_to_address>

### Comment 3
<location path="astrbot/core/provider/sources/gemini_source.py" line_range="81" />
<code_context>
         if isinstance(self.timeout, str):
             self.timeout = int(self.timeout)
         self.thinking_config = provider_config.get("anth_thinking_config", {})
-        self.custom_headers = self._resolve_custom_headers(provider_config)
+        self.custom_headers = self._resolve_custom_headers(
+            provider_config,
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying header handling by computing normalized headers once and reusing them everywhere instead of mutating Gemini client internals after creation.

You can simplify this without losing any functionality by:

1. Building the final, normalized headers once.
2. Passing them into both `HttpOptions` and `httpx.AsyncClient`.
3. Dropping the post-hoc mutation of Gemini client internals.

That removes indirection and reflection and keeps the header behavior centralized.

### 1. Simplify `_resolve_custom_headers`

Normalize keys (including `user-agent`) and apply defaults in one pass:

```python
@staticmethod
def _resolve_custom_headers(provider_config: dict) -> dict[str, str]:
    # Normalize keys/values
    raw = normalize_headers(provider_config.get("custom_headers", {}))

    # Normalize the user-agent key while building the dict
    normalized = {
        ("user-agent" if key.lower() == "user-agent" else key): value
        for key, value in raw.items()
    }

    # Apply defaults with the normalized UA key
    return apply_default_headers(
        normalized,
        {"user-agent": ASTRBOT_USER_AGENT},
    )
```

This removes the extra comprehension after `apply_default_headers`.

### 2. Remove `_set_gemini_user_agent` and avoid private internals

You already pass headers into `HttpOptions`, which are used by the Gemini client. If you ensure `self.custom_headers["user-agent"]` is always the final value (as above), you don’t need to mutate `http_options.headers` via private attributes:

```python
# Delete this helper entirely
# @staticmethod
# def _set_gemini_user_agent(...):
#     ...
```

Then simplify `_init_client` to rely only on the precomputed, final headers:

```python
def _init_client(self) -> None:
    """初始化Gemini客户端"""
    proxy = self.provider_config.get("proxy", "")

    http_options = types.HttpOptions(
        base_url=self.api_base,
        headers=self.custom_headers,
        timeout=self.timeout * 1000,  # 毫秒
    )

    async_client_kwargs: dict = {
        "base_url": self.api_base,
        "headers": self.custom_headers,
        "timeout": self.timeout,
    }
    if proxy:
        async_client_kwargs["proxy"] = proxy
        async_client_kwargs["trust_env"] = False
        logger.info("[Gemini] 使用代理")
    else:
        async_client_kwargs["trust_env"] = True

    if self._http_client is not None:
        self._stale_http_clients = [self._http_client]

    self._http_client = httpx.AsyncClient(**async_client_kwargs)
    http_options.httpx_async_client = self._http_client

    self.client = genai.Client(
        api_key=self.chosen_api_key,
        http_options=http_options,
    ).aio
```

This keeps:

- Single source of truth for headers (`self.custom_headers`).
- Consistent headers between `httpx.AsyncClient` and `HttpOptions`.
- No reliance on private `_api_client` / `_http_options`.
- No triple handling or post-hoc header mutation.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +71 to +80
def test_openai_provider_uses_astrbot_default_user_agent(monkeypatch):
monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI)

provider = _make_provider()

assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT}
assert provider.client.kwargs["default_headers"] == {
"User-Agent": ASTRBOT_USER_AGENT,
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider a test where a custom user agent is provided but empty/whitespace to ensure the default UA is applied.

We already test the default UA with no custom_headers and the case where a non-empty custom UA is preserved. Please also add a case where custom_headers includes "User-Agent" with an empty or whitespace value (e.g., " "), and assert that apply_default_headers replaces it with ASTRBOT_USER_AGENT to cover this malformed-config fallback behavior.

Suggested change
def test_openai_provider_uses_astrbot_default_user_agent(monkeypatch):
monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI)
provider = _make_provider()
assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT}
assert provider.client.kwargs["default_headers"] == {
"User-Agent": ASTRBOT_USER_AGENT,
}
def test_openai_provider_uses_astrbot_default_user_agent(monkeypatch):
monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI)
provider = _make_provider()
assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT}
assert provider.client.kwargs["default_headers"] == {
"User-Agent": ASTRBOT_USER_AGENT,
}
def test_openai_provider_uses_astrbot_default_user_agent_for_blank_custom_header(
monkeypatch,
):
"""
When a custom User-Agent header is configured but is empty/whitespace,
ensure apply_default_headers replaces it with ASTRBOT_USER_AGENT.
"""
monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI)
provider = _make_provider(
overrides={
"custom_headers": {
"User-Agent": " ", # malformed/blank UA should be replaced
},
}
)
# The provider should normalize the header to the Astrbot default UA
assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT}
assert provider.client.kwargs["default_headers"] == {
"User-Agent": ASTRBOT_USER_AGENT,
}

Comment on lines +20 to +29
def test_anthropic_provider_uses_astrbot_default_user_agent(monkeypatch):
monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)

provider = anthropic_source.ProviderAnthropic(
provider_config={
"id": "anthropic-test",
"type": "anthropic_chat_completion",
"model": "claude-test",
"key": ["test-key"],
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Extend Anthropics tests to cover custom headers with/without user agent and type normalization.

Consider adding a couple of nearby tests to fully cover the new header behavior:

  • One where provider_config.custom_headers includes its own User-Agent plus extra headers, to verify a non-empty custom UA is preserved (and the default UA is not also applied) and that additional headers remain.
  • Optionally, one where custom_headers includes non-string values (e.g., integers) to confirm normalization still coerces values to strings as before.

This will better demonstrate that existing Anthropics header behavior is preserved with the new shared helpers.

Suggested implementation:

    assert provider.client.kwargs["default_headers"] == {
        "User-Agent": ASTRBOT_USER_AGENT,
    }


def test_anthropic_provider_preserves_custom_user_agent_and_headers(monkeypatch):
    monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)

    custom_headers = {
        "User-Agent": "MyApp/1.0 (test)",
        "X-Extra-Header": "extra-value",
    }

    provider = anthropic_source.ProviderAnthropic(
        provider_config={
            "id": "anthropic-test",
            "type": "anthropic_chat_completion",
            "model": "claude-test",
            "key": ["test-key"],
            "custom_headers": custom_headers,
        },
        provider_settings={},
    )

    # Custom non-empty User-Agent should be preserved, and the default ASTRBOT_USER_AGENT
    # should not be added on top.
    assert provider.custom_headers == custom_headers
    assert provider.client.kwargs["default_headers"] == custom_headers


def test_anthropic_provider_normalizes_custom_header_values_to_strings(monkeypatch):
    monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)

    custom_headers = {
        "User-Agent": "MyApp/2.0 (test)",
        "X-Int-Header": 123,
        "X-Bool-Header": True,
    }

    provider = anthropic_source.ProviderAnthropic(
        provider_config={
            "id": "anthropic-test",
            "type": "anthropic_chat_completion",
            "model": "claude-test",
            "key": ["test-key"],
            "custom_headers": custom_headers,
        },
        provider_settings={},
    )

    expected_headers = {
        "User-Agent": "MyApp/2.0 (test)",
        "X-Int-Header": "123",
        "X-Bool-Header": "True",
    }

    assert provider.custom_headers == expected_headers
    assert provider.client.kwargs["default_headers"] == expected_headers


def test_anthropic_provider_passes_custom_headers_via_default_headers(monkeypatch):
    monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)

If ProviderAnthropic currently expects custom_headers to be passed via provider_settings instead of provider_config, adjust the tests to move the custom_headers dict from provider_config={...} into provider_settings={"custom_headers": custom_headers} to match the existing convention in the rest of the file.

if self.api_base and self.api_base.endswith("/"):
self.api_base = self.api_base[:-1]

self.custom_headers = self._resolve_custom_headers(provider_config)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider simplifying header handling by computing normalized headers once and reusing them everywhere instead of mutating Gemini client internals after creation.

You can simplify this without losing any functionality by:

  1. Building the final, normalized headers once.
  2. Passing them into both HttpOptions and httpx.AsyncClient.
  3. Dropping the post-hoc mutation of Gemini client internals.

That removes indirection and reflection and keeps the header behavior centralized.

1. Simplify _resolve_custom_headers

Normalize keys (including user-agent) and apply defaults in one pass:

@staticmethod
def _resolve_custom_headers(provider_config: dict) -> dict[str, str]:
    # Normalize keys/values
    raw = normalize_headers(provider_config.get("custom_headers", {}))

    # Normalize the user-agent key while building the dict
    normalized = {
        ("user-agent" if key.lower() == "user-agent" else key): value
        for key, value in raw.items()
    }

    # Apply defaults with the normalized UA key
    return apply_default_headers(
        normalized,
        {"user-agent": ASTRBOT_USER_AGENT},
    )

This removes the extra comprehension after apply_default_headers.

2. Remove _set_gemini_user_agent and avoid private internals

You already pass headers into HttpOptions, which are used by the Gemini client. If you ensure self.custom_headers["user-agent"] is always the final value (as above), you don’t need to mutate http_options.headers via private attributes:

# Delete this helper entirely
# @staticmethod
# def _set_gemini_user_agent(...):
#     ...

Then simplify _init_client to rely only on the precomputed, final headers:

def _init_client(self) -> None:
    """初始化Gemini客户端"""
    proxy = self.provider_config.get("proxy", "")

    http_options = types.HttpOptions(
        base_url=self.api_base,
        headers=self.custom_headers,
        timeout=self.timeout * 1000,  # 毫秒
    )

    async_client_kwargs: dict = {
        "base_url": self.api_base,
        "headers": self.custom_headers,
        "timeout": self.timeout,
    }
    if proxy:
        async_client_kwargs["proxy"] = proxy
        async_client_kwargs["trust_env"] = False
        logger.info("[Gemini] 使用代理")
    else:
        async_client_kwargs["trust_env"] = True

    if self._http_client is not None:
        self._stale_http_clients = [self._http_client]

    self._http_client = httpx.AsyncClient(**async_client_kwargs)
    http_options.httpx_async_client = self._http_client

    self.client = genai.Client(
        api_key=self.chosen_api_key,
        http_options=http_options,
    ).aio

This keeps:

  • Single source of truth for headers (self.custom_headers).
  • Consistent headers between httpx.AsyncClient and HttpOptions.
  • No reliance on private _api_client / _http_options.
  • No triple handling or post-hoc header mutation.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request standardizes the User-Agent header across different LLM providers (Anthropic, Gemini, OpenAI, and Kimi) by introducing a default ASTRBOT_USER_AGENT and a helper utility for normalizing and applying default headers. It also adds corresponding unit tests. The review feedback highlights two key improvements: first, handling None values in normalize_headers to prevent sending literal "None" strings as header values, and second, adding a defensive type check in _set_gemini_user_agent to ensure http_options.headers is a dictionary before modifying it.

Comment on lines +4 to +7
def normalize_headers(headers: object) -> dict[str, str]:
if not isinstance(headers, dict):
return {}
return {str(key): str(value) for key, value in headers.items()}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In normalize_headers, if a header value is None (for example, if custom_headers contains {"User-Agent": null} in JSON), str(value) will convert it to the literal string "None". This can lead to invalid headers like User-Agent: None being sent to LLM providers, which might cause request failures or unexpected behavior.

We should defensively convert None values to an empty string "" so that apply_default_headers can correctly recognize them as empty and fall back to the default header values.

def normalize_headers(headers: object) -> dict[str, str]:
    if not isinstance(headers, dict):
        return {}
    return {
        str(key): str(value) if value is not None else ""
        for key, value in headers.items()
    }

Comment on lines +100 to +104
def _set_gemini_user_agent(client: object, user_agent: str) -> None:
api_client = getattr(client, "_api_client", None)
http_options = getattr(api_client, "_http_options", None)
if http_options is None or http_options.headers is None:
return
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent potential AttributeError or type mismatch issues if the internal structure of the Gemini SDK client changes in future versions, we should add a defensive type check to ensure http_options.headers is indeed a mutable dictionary before attempting to modify it.

Suggested change
def _set_gemini_user_agent(client: object, user_agent: str) -> None:
api_client = getattr(client, "_api_client", None)
http_options = getattr(api_client, "_http_options", None)
if http_options is None or http_options.headers is None:
return
@staticmethod
def _set_gemini_user_agent(client: object, user_agent: str) -> None:
api_client = getattr(client, "_api_client", None)
http_options = getattr(api_client, "_http_options", None)
if http_options is None or not isinstance(http_options.headers, dict):
return

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant