feat: implement user agent handling across providers and normalize headers#8445
feat: implement user agent handling across providers and normalize headers#8445Soulter wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- The new
normalize_headershelper currently only acceptsdictand silently drops otherMappingtypes; consider widening the type check toMappingto avoid surprising behavior when callers pass e.g.CaseInsensitiveDictor similar mapping implementations. - Header casing and normalization logic are now spread across providers (e.g.
User-Agentvsuser-agenthandling in OpenAI, Anthropic, and Gemini); it may be worth centralizing this into a single case-insensitive header utility so that all providers apply and expose headers in a consistent way.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `normalize_headers` helper currently only accepts `dict` and silently drops other `Mapping` types; consider widening the type check to `Mapping` to avoid surprising behavior when callers pass e.g. `CaseInsensitiveDict` or similar mapping implementations.
- Header casing and normalization logic are now spread across providers (e.g. `User-Agent` vs `user-agent` handling in OpenAI, Anthropic, and Gemini); it may be worth centralizing this into a single case-insensitive header utility so that all providers apply and expose headers in a consistent way.
## Individual Comments
### Comment 1
<location path="tests/test_openai_source.py" line_range="71-80" />
<code_context>
)
+def test_openai_provider_uses_astrbot_default_user_agent(monkeypatch):
+ monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI)
+
+ provider = _make_provider()
+
+ assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT}
+ assert provider.client.kwargs["default_headers"] == {
+ "User-Agent": ASTRBOT_USER_AGENT,
+ }
+
</code_context>
<issue_to_address>
**suggestion (testing):** Consider a test where a custom user agent is provided but empty/whitespace to ensure the default UA is applied.
We already test the default UA with no `custom_headers` and the case where a non-empty custom UA is preserved. Please also add a case where `custom_headers` includes `"User-Agent"` with an empty or whitespace value (e.g., `" "`), and assert that `apply_default_headers` replaces it with `ASTRBOT_USER_AGENT` to cover this malformed-config fallback behavior.
```suggestion
def test_openai_provider_uses_astrbot_default_user_agent(monkeypatch):
monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI)
provider = _make_provider()
assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT}
assert provider.client.kwargs["default_headers"] == {
"User-Agent": ASTRBOT_USER_AGENT,
}
def test_openai_provider_uses_astrbot_default_user_agent_for_blank_custom_header(
monkeypatch,
):
"""
When a custom User-Agent header is configured but is empty/whitespace,
ensure apply_default_headers replaces it with ASTRBOT_USER_AGENT.
"""
monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI)
provider = _make_provider(
overrides={
"custom_headers": {
"User-Agent": " ", # malformed/blank UA should be replaced
},
}
)
# The provider should normalize the header to the Astrbot default UA
assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT}
assert provider.client.kwargs["default_headers"] == {
"User-Agent": ASTRBOT_USER_AGENT,
}
```
</issue_to_address>
### Comment 2
<location path="tests/test_anthropic_kimi_code_provider.py" line_range="20-29" />
<code_context>
return None
+def test_anthropic_provider_uses_astrbot_default_user_agent(monkeypatch):
+ monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)
+
+ provider = anthropic_source.ProviderAnthropic(
+ provider_config={
+ "id": "anthropic-test",
+ "type": "anthropic_chat_completion",
+ "model": "claude-test",
+ "key": ["test-key"],
+ },
+ provider_settings={},
+ )
+
+ assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT}
+ assert provider.client.kwargs["default_headers"] == {
+ "User-Agent": ASTRBOT_USER_AGENT,
+ }
</code_context>
<issue_to_address>
**suggestion (testing):** Extend Anthropics tests to cover custom headers with/without user agent and type normalization.
Consider adding a couple of nearby tests to fully cover the new header behavior:
- One where `provider_config.custom_headers` includes its own `User-Agent` plus extra headers, to verify a non-empty custom UA is preserved (and the default UA is not also applied) and that additional headers remain.
- Optionally, one where `custom_headers` includes non-string values (e.g., integers) to confirm normalization still coerces values to strings as before.
This will better demonstrate that existing Anthropics header behavior is preserved with the new shared helpers.
Suggested implementation:
```python
assert provider.client.kwargs["default_headers"] == {
"User-Agent": ASTRBOT_USER_AGENT,
}
def test_anthropic_provider_preserves_custom_user_agent_and_headers(monkeypatch):
monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)
custom_headers = {
"User-Agent": "MyApp/1.0 (test)",
"X-Extra-Header": "extra-value",
}
provider = anthropic_source.ProviderAnthropic(
provider_config={
"id": "anthropic-test",
"type": "anthropic_chat_completion",
"model": "claude-test",
"key": ["test-key"],
"custom_headers": custom_headers,
},
provider_settings={},
)
# Custom non-empty User-Agent should be preserved, and the default ASTRBOT_USER_AGENT
# should not be added on top.
assert provider.custom_headers == custom_headers
assert provider.client.kwargs["default_headers"] == custom_headers
def test_anthropic_provider_normalizes_custom_header_values_to_strings(monkeypatch):
monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)
custom_headers = {
"User-Agent": "MyApp/2.0 (test)",
"X-Int-Header": 123,
"X-Bool-Header": True,
}
provider = anthropic_source.ProviderAnthropic(
provider_config={
"id": "anthropic-test",
"type": "anthropic_chat_completion",
"model": "claude-test",
"key": ["test-key"],
"custom_headers": custom_headers,
},
provider_settings={},
)
expected_headers = {
"User-Agent": "MyApp/2.0 (test)",
"X-Int-Header": "123",
"X-Bool-Header": "True",
}
assert provider.custom_headers == expected_headers
assert provider.client.kwargs["default_headers"] == expected_headers
def test_anthropic_provider_passes_custom_headers_via_default_headers(monkeypatch):
monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)
```
If `ProviderAnthropic` currently expects `custom_headers` to be passed via `provider_settings` instead of `provider_config`, adjust the tests to move the `custom_headers` dict from `provider_config={...}` into `provider_settings={"custom_headers": custom_headers}` to match the existing convention in the rest of the file.
</issue_to_address>
### Comment 3
<location path="astrbot/core/provider/sources/gemini_source.py" line_range="81" />
<code_context>
if isinstance(self.timeout, str):
self.timeout = int(self.timeout)
self.thinking_config = provider_config.get("anth_thinking_config", {})
- self.custom_headers = self._resolve_custom_headers(provider_config)
+ self.custom_headers = self._resolve_custom_headers(
+ provider_config,
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying header handling by computing normalized headers once and reusing them everywhere instead of mutating Gemini client internals after creation.
You can simplify this without losing any functionality by:
1. Building the final, normalized headers once.
2. Passing them into both `HttpOptions` and `httpx.AsyncClient`.
3. Dropping the post-hoc mutation of Gemini client internals.
That removes indirection and reflection and keeps the header behavior centralized.
### 1. Simplify `_resolve_custom_headers`
Normalize keys (including `user-agent`) and apply defaults in one pass:
```python
@staticmethod
def _resolve_custom_headers(provider_config: dict) -> dict[str, str]:
# Normalize keys/values
raw = normalize_headers(provider_config.get("custom_headers", {}))
# Normalize the user-agent key while building the dict
normalized = {
("user-agent" if key.lower() == "user-agent" else key): value
for key, value in raw.items()
}
# Apply defaults with the normalized UA key
return apply_default_headers(
normalized,
{"user-agent": ASTRBOT_USER_AGENT},
)
```
This removes the extra comprehension after `apply_default_headers`.
### 2. Remove `_set_gemini_user_agent` and avoid private internals
You already pass headers into `HttpOptions`, which are used by the Gemini client. If you ensure `self.custom_headers["user-agent"]` is always the final value (as above), you don’t need to mutate `http_options.headers` via private attributes:
```python
# Delete this helper entirely
# @staticmethod
# def _set_gemini_user_agent(...):
# ...
```
Then simplify `_init_client` to rely only on the precomputed, final headers:
```python
def _init_client(self) -> None:
"""初始化Gemini客户端"""
proxy = self.provider_config.get("proxy", "")
http_options = types.HttpOptions(
base_url=self.api_base,
headers=self.custom_headers,
timeout=self.timeout * 1000, # 毫秒
)
async_client_kwargs: dict = {
"base_url": self.api_base,
"headers": self.custom_headers,
"timeout": self.timeout,
}
if proxy:
async_client_kwargs["proxy"] = proxy
async_client_kwargs["trust_env"] = False
logger.info("[Gemini] 使用代理")
else:
async_client_kwargs["trust_env"] = True
if self._http_client is not None:
self._stale_http_clients = [self._http_client]
self._http_client = httpx.AsyncClient(**async_client_kwargs)
http_options.httpx_async_client = self._http_client
self.client = genai.Client(
api_key=self.chosen_api_key,
http_options=http_options,
).aio
```
This keeps:
- Single source of truth for headers (`self.custom_headers`).
- Consistent headers between `httpx.AsyncClient` and `HttpOptions`.
- No reliance on private `_api_client` / `_http_options`.
- No triple handling or post-hoc header mutation.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| def test_openai_provider_uses_astrbot_default_user_agent(monkeypatch): | ||
| monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI) | ||
|
|
||
| provider = _make_provider() | ||
|
|
||
| assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT} | ||
| assert provider.client.kwargs["default_headers"] == { | ||
| "User-Agent": ASTRBOT_USER_AGENT, | ||
| } | ||
|
|
There was a problem hiding this comment.
suggestion (testing): Consider a test where a custom user agent is provided but empty/whitespace to ensure the default UA is applied.
We already test the default UA with no custom_headers and the case where a non-empty custom UA is preserved. Please also add a case where custom_headers includes "User-Agent" with an empty or whitespace value (e.g., " "), and assert that apply_default_headers replaces it with ASTRBOT_USER_AGENT to cover this malformed-config fallback behavior.
| def test_openai_provider_uses_astrbot_default_user_agent(monkeypatch): | |
| monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI) | |
| provider = _make_provider() | |
| assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT} | |
| assert provider.client.kwargs["default_headers"] == { | |
| "User-Agent": ASTRBOT_USER_AGENT, | |
| } | |
| def test_openai_provider_uses_astrbot_default_user_agent(monkeypatch): | |
| monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI) | |
| provider = _make_provider() | |
| assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT} | |
| assert provider.client.kwargs["default_headers"] == { | |
| "User-Agent": ASTRBOT_USER_AGENT, | |
| } | |
| def test_openai_provider_uses_astrbot_default_user_agent_for_blank_custom_header( | |
| monkeypatch, | |
| ): | |
| """ | |
| When a custom User-Agent header is configured but is empty/whitespace, | |
| ensure apply_default_headers replaces it with ASTRBOT_USER_AGENT. | |
| """ | |
| monkeypatch.setattr(openai_source_module, "AsyncOpenAI", _FakeAsyncOpenAI) | |
| provider = _make_provider( | |
| overrides={ | |
| "custom_headers": { | |
| "User-Agent": " ", # malformed/blank UA should be replaced | |
| }, | |
| } | |
| ) | |
| # The provider should normalize the header to the Astrbot default UA | |
| assert provider.custom_headers == {"User-Agent": ASTRBOT_USER_AGENT} | |
| assert provider.client.kwargs["default_headers"] == { | |
| "User-Agent": ASTRBOT_USER_AGENT, | |
| } |
| def test_anthropic_provider_uses_astrbot_default_user_agent(monkeypatch): | ||
| monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic) | ||
|
|
||
| provider = anthropic_source.ProviderAnthropic( | ||
| provider_config={ | ||
| "id": "anthropic-test", | ||
| "type": "anthropic_chat_completion", | ||
| "model": "claude-test", | ||
| "key": ["test-key"], | ||
| }, |
There was a problem hiding this comment.
suggestion (testing): Extend Anthropics tests to cover custom headers with/without user agent and type normalization.
Consider adding a couple of nearby tests to fully cover the new header behavior:
- One where
provider_config.custom_headersincludes its ownUser-Agentplus extra headers, to verify a non-empty custom UA is preserved (and the default UA is not also applied) and that additional headers remain. - Optionally, one where
custom_headersincludes non-string values (e.g., integers) to confirm normalization still coerces values to strings as before.
This will better demonstrate that existing Anthropics header behavior is preserved with the new shared helpers.
Suggested implementation:
assert provider.client.kwargs["default_headers"] == {
"User-Agent": ASTRBOT_USER_AGENT,
}
def test_anthropic_provider_preserves_custom_user_agent_and_headers(monkeypatch):
monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)
custom_headers = {
"User-Agent": "MyApp/1.0 (test)",
"X-Extra-Header": "extra-value",
}
provider = anthropic_source.ProviderAnthropic(
provider_config={
"id": "anthropic-test",
"type": "anthropic_chat_completion",
"model": "claude-test",
"key": ["test-key"],
"custom_headers": custom_headers,
},
provider_settings={},
)
# Custom non-empty User-Agent should be preserved, and the default ASTRBOT_USER_AGENT
# should not be added on top.
assert provider.custom_headers == custom_headers
assert provider.client.kwargs["default_headers"] == custom_headers
def test_anthropic_provider_normalizes_custom_header_values_to_strings(monkeypatch):
monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)
custom_headers = {
"User-Agent": "MyApp/2.0 (test)",
"X-Int-Header": 123,
"X-Bool-Header": True,
}
provider = anthropic_source.ProviderAnthropic(
provider_config={
"id": "anthropic-test",
"type": "anthropic_chat_completion",
"model": "claude-test",
"key": ["test-key"],
"custom_headers": custom_headers,
},
provider_settings={},
)
expected_headers = {
"User-Agent": "MyApp/2.0 (test)",
"X-Int-Header": "123",
"X-Bool-Header": "True",
}
assert provider.custom_headers == expected_headers
assert provider.client.kwargs["default_headers"] == expected_headers
def test_anthropic_provider_passes_custom_headers_via_default_headers(monkeypatch):
monkeypatch.setattr(anthropic_source, "AsyncAnthropic", _FakeAsyncAnthropic)If ProviderAnthropic currently expects custom_headers to be passed via provider_settings instead of provider_config, adjust the tests to move the custom_headers dict from provider_config={...} into provider_settings={"custom_headers": custom_headers} to match the existing convention in the rest of the file.
| if self.api_base and self.api_base.endswith("/"): | ||
| self.api_base = self.api_base[:-1] | ||
|
|
||
| self.custom_headers = self._resolve_custom_headers(provider_config) |
There was a problem hiding this comment.
issue (complexity): Consider simplifying header handling by computing normalized headers once and reusing them everywhere instead of mutating Gemini client internals after creation.
You can simplify this without losing any functionality by:
- Building the final, normalized headers once.
- Passing them into both
HttpOptionsandhttpx.AsyncClient. - Dropping the post-hoc mutation of Gemini client internals.
That removes indirection and reflection and keeps the header behavior centralized.
1. Simplify _resolve_custom_headers
Normalize keys (including user-agent) and apply defaults in one pass:
@staticmethod
def _resolve_custom_headers(provider_config: dict) -> dict[str, str]:
# Normalize keys/values
raw = normalize_headers(provider_config.get("custom_headers", {}))
# Normalize the user-agent key while building the dict
normalized = {
("user-agent" if key.lower() == "user-agent" else key): value
for key, value in raw.items()
}
# Apply defaults with the normalized UA key
return apply_default_headers(
normalized,
{"user-agent": ASTRBOT_USER_AGENT},
)This removes the extra comprehension after apply_default_headers.
2. Remove _set_gemini_user_agent and avoid private internals
You already pass headers into HttpOptions, which are used by the Gemini client. If you ensure self.custom_headers["user-agent"] is always the final value (as above), you don’t need to mutate http_options.headers via private attributes:
# Delete this helper entirely
# @staticmethod
# def _set_gemini_user_agent(...):
# ...Then simplify _init_client to rely only on the precomputed, final headers:
def _init_client(self) -> None:
"""初始化Gemini客户端"""
proxy = self.provider_config.get("proxy", "")
http_options = types.HttpOptions(
base_url=self.api_base,
headers=self.custom_headers,
timeout=self.timeout * 1000, # 毫秒
)
async_client_kwargs: dict = {
"base_url": self.api_base,
"headers": self.custom_headers,
"timeout": self.timeout,
}
if proxy:
async_client_kwargs["proxy"] = proxy
async_client_kwargs["trust_env"] = False
logger.info("[Gemini] 使用代理")
else:
async_client_kwargs["trust_env"] = True
if self._http_client is not None:
self._stale_http_clients = [self._http_client]
self._http_client = httpx.AsyncClient(**async_client_kwargs)
http_options.httpx_async_client = self._http_client
self.client = genai.Client(
api_key=self.chosen_api_key,
http_options=http_options,
).aioThis keeps:
- Single source of truth for headers (
self.custom_headers). - Consistent headers between
httpx.AsyncClientandHttpOptions. - No reliance on private
_api_client/_http_options. - No triple handling or post-hoc header mutation.
There was a problem hiding this comment.
Code Review
This pull request standardizes the User-Agent header across different LLM providers (Anthropic, Gemini, OpenAI, and Kimi) by introducing a default ASTRBOT_USER_AGENT and a helper utility for normalizing and applying default headers. It also adds corresponding unit tests. The review feedback highlights two key improvements: first, handling None values in normalize_headers to prevent sending literal "None" strings as header values, and second, adding a defensive type check in _set_gemini_user_agent to ensure http_options.headers is a dictionary before modifying it.
| def normalize_headers(headers: object) -> dict[str, str]: | ||
| if not isinstance(headers, dict): | ||
| return {} | ||
| return {str(key): str(value) for key, value in headers.items()} |
There was a problem hiding this comment.
In normalize_headers, if a header value is None (for example, if custom_headers contains {"User-Agent": null} in JSON), str(value) will convert it to the literal string "None". This can lead to invalid headers like User-Agent: None being sent to LLM providers, which might cause request failures or unexpected behavior.
We should defensively convert None values to an empty string "" so that apply_default_headers can correctly recognize them as empty and fall back to the default header values.
def normalize_headers(headers: object) -> dict[str, str]:
if not isinstance(headers, dict):
return {}
return {
str(key): str(value) if value is not None else ""
for key, value in headers.items()
}| def _set_gemini_user_agent(client: object, user_agent: str) -> None: | ||
| api_client = getattr(client, "_api_client", None) | ||
| http_options = getattr(api_client, "_http_options", None) | ||
| if http_options is None or http_options.headers is None: | ||
| return |
There was a problem hiding this comment.
To prevent potential AttributeError or type mismatch issues if the internal structure of the Gemini SDK client changes in future versions, we should add a defensive type check to ensure http_options.headers is indeed a mutable dictionary before attempting to modify it.
| def _set_gemini_user_agent(client: object, user_agent: str) -> None: | |
| api_client = getattr(client, "_api_client", None) | |
| http_options = getattr(api_client, "_http_options", None) | |
| if http_options is None or http_options.headers is None: | |
| return | |
| @staticmethod | |
| def _set_gemini_user_agent(client: object, user_agent: str) -> None: | |
| api_client = getattr(client, "_api_client", None) | |
| http_options = getattr(api_client, "_http_options", None) | |
| if http_options is None or not isinstance(http_options.headers, dict): | |
| return |
Modifications / 改动点
Screenshots or Test Results / 运行截图或测试结果
Checklist / 检查清单
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in
requirements.txtandpyproject.toml./ 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到
requirements.txt和pyproject.toml文件相应位置。😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。
Summary by Sourcery
Standardize HTTP header handling across providers and ensure a consistent astrbot user agent is applied by default while preserving user-specified headers.
New Features:
Enhancements:
Tests: