BigQueryAgentAnalyticsPlugin: parent_span_id Points to Unlogged OTel Spans #5310

# GitHub Issue: BigQuery Plugin `parent_span_id` Points to Unlogged OTel Spans

## 🔴 Required Information

**Describe the Bug:**

When OpenTelemetry instrumentation is active (e.g., `opentelemetry-instrumentation-google-genai`), the `BigQueryAgentAnalyticsPlugin` writes `parent_span_id` values that reference framework-internal OTel spans (`call_llm`, `execute_tool`, `send_data`) which are **never logged to the BigQuery table**. This makes the `parent_span_id` column unusable for building parent-child relationships within BigQuery — it points to spans that don't exist in the table.

The consequence is that any SQL JOIN like `LEFT JOIN agent_events_view A ON T.parent_span_id = A.span_id` returns NULL for the agent context, making it impossible to correlate LLM/tool events with their enclosing agent within BigQuery alone.

In our dataset, **92.8% of LLM_RESPONSE events** and **10.3% of TOOL events** have phantom `parent_span_id` values.

**Steps to Reproduce:**

1. Create an ADK agent with sub-agents and tools (see minimal reproduction below)
2. Install `opentelemetry-instrumentation-google-genai` (or any package that activates OTel tracing)
3. Configure the `BigQueryAgentAnalyticsPlugin` and run the agent
4. Query the BQ table to find `parent_span_id` values that never appear as a `span_id`:

```sql
-- Find "phantom" parent_span_ids (referenced but never logged)
WITH logged_spans AS (
  SELECT DISTINCT span_id FROM `project.dataset.table`
),
parent_refs AS (
  SELECT DISTINCT parent_span_id 
  FROM `project.dataset.table`
  WHERE parent_span_id IS NOT NULL
)
-- The WHERE clause keeps only unmatched parents, so every returned row is a phantom
SELECT p.parent_span_id, 'PHANTOM' AS status
FROM parent_refs p
LEFT JOIN logged_spans s ON p.parent_span_id = s.span_id
WHERE s.span_id IS NULL;
-- Returns dozens/hundreds of phantom span_ids
```

5. Attempt to JOIN tool or LLM events to their parent agent:

```sql
SELECT T.tool_name, T.parent_span_id, A.agent_name, A.duration_ms
FROM tool_events_view T
LEFT JOIN agent_events_view A ON T.parent_span_id = A.span_id;
-- A.agent_name and A.duration_ms are NULL for all phantom cases
```
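The check in step 4 can also be run client-side on rows already exported from the table. The following is a minimal sketch, assuming each row is a dict with `span_id` and `parent_span_id` keys. The helper name `find_phantom_parents` is hypothetical, and the sample rows are a subset of the example trace shown under "Observed Behavior".

```python
from typing import Iterable


def find_phantom_parents(rows: Iterable[dict]) -> set:
    """Return parent_span_id values that never appear as any row's span_id."""
    rows = list(rows)
    logged = {r["span_id"] for r in rows if r.get("span_id")}
    referenced = {r["parent_span_id"] for r in rows if r.get("parent_span_id")}
    return referenced - logged


# Subset of the example trace from this issue (illustrative input only).
rows = [
    {"event_type": "INVOCATION_START", "span_id": "351611af4c2a9ca5",
     "parent_span_id": None},
    {"event_type": "AGENT_STARTING", "span_id": "40fad0f4b5d8203c",
     "parent_span_id": "351611af4c2a9ca5"},
    {"event_type": "LLM_REQUEST", "span_id": "5179bd626c6d4fb9",
     "parent_span_id": "40fad0f4b5d8203c"},
    {"event_type": "TOOL_STARTING", "span_id": "fb9761a38d9251ba",
     "parent_span_id": "84a19a6ed0f23b06"},
]

phantoms = find_phantom_parents(rows)
print(sorted(phantoms))
```

For these four rows the only parent that is referenced but never logged is the one on the `TOOL_STARTING` event.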

**Expected Behavior:**

Every `parent_span_id` written to BigQuery should reference a `span_id` that also exists in the same BigQuery table, forming a self-consistent span tree. This would allow users to traverse the parent-child relationship to build execution trees, correlate tool/LLM latency with agent context, and construct dashboards that show the full execution hierarchy.

**Observed Behavior:**

`parent_span_id` frequently points to internal ADK framework spans that are only visible in Cloud Trace but never written to BigQuery. The BQ span graph has "dangling pointers" that break any parent-child JOIN.

Concrete example from a single trace (`c7f3ed279e620d0e26df69f9165406e8`):

```
Trace raw events in BigQuery:
timestamp                  event_type         agent                  span_id              parent_span_id
──────────────────────────────────────────────────────────────────────────────────────────────────────────
2026-04-09 21:32:40.762    USER_MSG_RECEIVED  knowledge_supervisor   351611af4c2a9ca5     (null)
2026-04-09 21:32:40.764    INVOCATION_START   knowledge_supervisor   351611af4c2a9ca5     (null)
2026-04-09 21:32:40.765    AGENT_STARTING     knowledge_supervisor   40fad0f4b5d8203c     351611af4c2a9ca5  ← OK
2026-04-09 21:32:40.766    LLM_REQUEST        knowledge_supervisor   5179bd626c6d4fb9     40fad0f4b5d8203c  ← OK
2026-04-09 21:32:45.825    LLM_RESPONSE       knowledge_supervisor   5179bd626c6d4fb9     40fad0f4b5d8203c  ← OK
2026-04-09 21:32:45.826    TOOL_STARTING      knowledge_supervisor   fb9761a38d9251ba     84a19a6ed0f23b06  ← PHANTOM!
2026-04-09 21:32:45.826    TOOL_COMPLETED     knowledge_supervisor   fb9761a38d9251ba     84a19a6ed0f23b06  ← PHANTOM!
2026-04-09 21:32:45.827    AGENT_STARTING     internal_docs_agent    b6000853c649fd03     84a19a6ed0f23b06  ← PHANTOM!
2026-04-09 21:32:45.829    LLM_REQUEST        internal_docs_agent    70f56dd9f82afc16     b6000853c649fd03  ← OK
2026-04-09 21:32:49.502    LLM_RESPONSE       internal_docs_agent    70f56dd9f82afc16     b6000853c649fd03  ← OK
2026-04-09 21:32:49.503    TOOL_STARTING      internal_docs_agent    3dab867f3003b630     9b7b334a7c4d42ad  ← PHANTOM!
2026-04-09 21:32:50.288    TOOL_COMPLETED     internal_docs_agent    3dab867f3003b630     9b7b334a7c4d42ad  ← PHANTOM!
2026-04-09 21:32:53.837    AGENT_COMPLETED    internal_docs_agent    b6000853c649fd03     84a19a6ed0f23b06  ← PHANTOM!
2026-04-09 21:32:53.838    AGENT_COMPLETED    knowledge_supervisor   40fad0f4b5d8203c     351611af4c2a9ca5  ← OK
2026-04-09 21:32:53.838    INVOCATION_COMPL   knowledge_supervisor   351611af4c2a9ca5     (null)

Parent span cross-reference:
  parent_span_id=351611af4c2a9ca5  →  EXISTS in BQ  (invocation span)
  parent_span_id=40fad0f4b5d8203c  →  EXISTS in BQ  (agent span)
  parent_span_id=b6000853c649fd03  →  EXISTS in BQ  (sub-agent span)
  parent_span_id=84a19a6ed0f23b06  →  PHANTOM       (framework 'execute_tool' span — not in BQ)
  parent_span_id=9b7b334a7c4d42ad  →  PHANTOM       (framework 'execute_tool' span — not in BQ)
```

The phantom span `84a19a6ed0f23b06` is the ADK framework's `execute_tool transfer_to_agent` OTel span created at `flows/llm_flows/functions.py:588`. The phantom span `9b7b334a7c4d42ad` is the framework's `execute_tool search_internal_docs` OTel span. Both exist in Cloud Trace but are never written to BQ.

Across our dataset, the breakdown of resolved versus phantom parents by event type is:
```
event_type                     parent_status      count
──────────────────────────────────────────────────────────
AGENT_COMPLETED                phantom              131
AGENT_COMPLETED                resolved            3446
AGENT_STARTING                 phantom              131
AGENT_STARTING                 resolved            3446
LLM_REQUEST                    resolved            3554
LLM_RESPONSE                   phantom             3299   ← 92.8% phantom!
LLM_RESPONSE                   resolved             255
TOOL_COMPLETED                 phantom              175   ← 10.3% phantom
TOOL_COMPLETED                 resolved            1520
TOOL_STARTING                  phantom              175
TOOL_STARTING                  resolved            1520
```
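The percentages quoted in this issue follow directly from the counts above. A short sketch of the arithmetic, using only the numbers in the table (the `phantom_rate` helper name is illustrative):

```python
# Counts copied from the table above (one entry per distinct pattern).
counts = {
    "AGENT_STARTING": {"phantom": 131, "resolved": 3446},
    "LLM_RESPONSE": {"phantom": 3299, "resolved": 255},
    "TOOL_STARTING": {"phantom": 175, "resolved": 1520},
}


def phantom_rate(bucket: dict) -> float:
    """Percentage of events whose parent_span_id is not logged in BQ."""
    total = bucket["phantom"] + bucket["resolved"]
    return 100.0 * bucket["phantom"] / total


rates = {event: round(phantom_rate(b), 1) for event, b in counts.items()}
for event, rate in rates.items():
    print(f"{event}: {rate}% phantom")
```

This gives 3.7% for agent events, 92.8% for `LLM_RESPONSE`, and 10.3% for tool events.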

**Environment Details:**

- ADK Library Version: `google-adk` 1.28.1
- Desktop OS: Linux (Ubuntu 24.04)
- Python Version: 3.11

**Model Information:**

- Are you using LiteLLM: No
- Which model is being used: `gemini-2.5-pro` (via Vertex AI)

---

## 🟡 Optional Information

**Regression:**
Unknown — this may have existed since OTel support was added. The plugin comments reference issues #4561 and #4645 which drove the current `_resolve_ids()` layered approach.

**Logs:**
N/A — this is not a runtime error. The plugin runs without errors; it simply writes `parent_span_id` values that reference OTel spans not present in BQ.

**Root Cause Analysis:**

The issue is in `_resolve_ids()` (`bigquery_agent_analytics_plugin.py`, ~line 2520):

```python
# --- Layer 2: ambient OTel span ---
ambient = trace.get_current_span()
ambient_ctx = ambient.get_span_context()
if ambient_ctx.is_valid:
    trace_id = format(ambient_ctx.trace_id, "032x")
    span_id = format(ambient_ctx.span_id, "016x")        # ← takes ambient span_id
    parent_span_id = None
    parent_ctx = getattr(ambient, "parent", None)
    if parent_ctx is not None and parent_ctx.span_id:
        parent_span_id = format(parent_ctx.span_id, "016x")  # ← takes ambient parent
```

When OTel instrumentation is active, the ADK framework wraps callbacks in internal spans:

| Framework Location | `start_as_current_span()` Name | Logged to BQ? |
|---|---|---|
| `runners.py:546` | `invocation` | No |
| `base_agent.py:288` | `invoke_agent {name}` | No |
| `base_llm_flow.py:1126` | `call_llm` | No |
| `functions.py:588` | `execute_tool {name}` | No |
| `base_llm_flow.py:505` | `send_data` | No |

When `before_tool_callback` fires, the ambient OTel span is `execute_tool search_internal_docs`. Layer 2 picks up this span's `span_id` and its parent's `span_id`. Since these framework spans are never written to BQ by the plugin, the resulting `parent_span_id` in the BQ row is a dangling reference.

The plugin's own internal span stack (Layer 3, via `TraceManager`) would produce correct, self-consistent parent_span_ids. But Layer 2 overrides Layer 3 whenever an ambient OTel span is present.

In `after_tool_callback`, the plugin explicitly decides NOT to override when ambient OTel is present:

```python
has_ambient = trace.get_current_span().get_span_context().is_valid
event_data = EventData(
    span_id_override=None if has_ambient else span_id,           # ← None when OTel active
    parent_span_id_override=None if has_ambient else parent_span_id,  # ← None when OTel active
)
```

This means Layer 1 (explicit overrides) is intentionally skipped, and Layer 2 (ambient OTel) takes full control of span_id/parent_span_id.
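The interaction between the three layers can be illustrated with a small stand-alone model. This is a simplified sketch, not the plugin's code: `SpanRef`, `resolve_ids`, and the argument names are hypothetical stand-ins for the behaviour described above. The ambient values mirror the `TOOL_STARTING` row of the example trace, and the plugin-stack parent used here (the enclosing agent span) is an assumption about what Layer 3 would report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanRef:
    """Hypothetical stand-in for a span: its own ID and its parent's ID."""
    span_id: str
    parent_span_id: Optional[str]


def resolve_ids(override: Optional[SpanRef],
                ambient: Optional[SpanRef],
                plugin_stack_top: Optional[SpanRef]) -> Optional[SpanRef]:
    """Model of the layered lookup: the first layer that has a value wins."""
    if override is not None:      # Layer 1: explicit overrides
        return override
    if ambient is not None:       # Layer 2: ambient OTel span
        return ambient
    return plugin_stack_top       # Layer 3: plugin's own span stack


# Span IDs already written to BQ (invocation span and agent span).
logged_span_ids = {"351611af4c2a9ca5", "40fad0f4b5d8203c"}

# What the ambient OTel context reports: the parent is a framework span.
ambient_view = SpanRef("fb9761a38d9251ba", "84a19a6ed0f23b06")
# What the plugin stack would report (assumed): the parent is the agent span.
plugin_view = SpanRef("fb9761a38d9251ba", "40fad0f4b5d8203c")

# after_tool_callback passes no override when an ambient span is present.
with_otel = resolve_ids(None, ambient_view, plugin_view)
without_otel = resolve_ids(None, None, plugin_view)

print(with_otel.parent_span_id in logged_span_ids)
print(without_otel.parent_span_id in logged_span_ids)
```

In this model the parent resolves to a logged span only when no ambient span is present, which matches the dangling references seen in the table.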

**Consequences:**

1. **Parent-child JOINs in BQ are broken.** Any query like `LEFT JOIN agent_events_view ON tool.parent_span_id = agent.span_id` returns NULL for the agent columns. This makes it impossible to answer basic questions like "which agent was this tool running inside?" or "what was the agent's total latency for this tool call?" using only BigQuery data.

2. **Execution tree reconstruction is impossible.** Building a tree of agent → sub-agent → tool → LLM from the BQ data requires a connected parent-child graph. Phantom spans create disconnected subtrees.

3. **BQ analytics and dashboards are degraded.** Any dashboard that computes "tool latency as % of parent agent latency" or "agent error status for failed tool calls" gets NULL values for the vast majority of records.

4. **Inconsistency between BQ and Cloud Trace.** The same trace looks correct in Cloud Trace (all spans present) but broken in BQ (dangling parent references). Users expect the BQ data to be self-consistent.
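To make the second consequence concrete, the sketch below counts how many disconnected subtrees the example trace produces when the tree is rebuilt from BQ rows alone. It assumes the data has been reduced to a `span_id -> parent_span_id` mapping; `find_subtree_roots` is a hypothetical helper, and the mapping is copied from the example trace above.

```python
from typing import Dict, List, Optional


def find_subtree_roots(spans: Dict[str, Optional[str]]) -> List[str]:
    """Spans with no parent, or with a parent that is not itself logged."""
    return sorted(
        span_id for span_id, parent in spans.items()
        if parent is None or parent not in spans
    )


# span_id -> parent_span_id, one entry per distinct span in the example trace.
spans = {
    "351611af4c2a9ca5": None,                # invocation
    "40fad0f4b5d8203c": "351611af4c2a9ca5",  # knowledge_supervisor agent
    "5179bd626c6d4fb9": "40fad0f4b5d8203c",  # supervisor LLM call
    "fb9761a38d9251ba": "84a19a6ed0f23b06",  # supervisor tool (phantom parent)
    "b6000853c649fd03": "84a19a6ed0f23b06",  # internal_docs_agent (phantom parent)
    "70f56dd9f82afc16": "b6000853c649fd03",  # sub-agent LLM call
    "3dab867f3003b630": "9b7b334a7c4d42ad",  # sub-agent tool (phantom parent)
}

roots = find_subtree_roots(spans)
print(len(roots), "disconnected subtrees")
```

A fully connected trace would have a single root (the invocation span). Here each span with a phantom parent becomes the root of its own fragment.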

**Proposed Fix (in the BQ plugin):**

The fix would be to always use the plugin's internal TraceManager stack for `parent_span_id`, even when ambient OTel is present. The ambient span should only influence `trace_id` (to maintain correlation with Cloud Trace) and optionally `span_id`, but `parent_span_id` should always come from the plugin's stack (Layer 3) since that's the only layer guaranteed to produce span_ids that are actually logged to BQ.

Conceptually:

```python
# In _resolve_ids():
if ambient_ctx.is_valid:
    trace_id = format(ambient_ctx.trace_id, "032x")
    span_id = format(ambient_ctx.span_id, "016x")
    # DON'T override parent_span_id from ambient — keep plugin stack value
    # parent_span_id stays as plugin_parent_span_id from Layer 3
```

And in the `after_*_callback` methods, always pass the popped span's parent as an override:

```python
# In after_tool_callback:
event_data = EventData(
    span_id_override=None if has_ambient else span_id,
    parent_span_id_override=parent_span_id,  # ← ALWAYS override parent, not just when no ambient
)
```

This would keep `trace_id` and `span_id` aligned with Cloud Trace (for cross-referencing) while ensuring `parent_span_id` always forms a valid, self-consistent graph within BQ.
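A stand-alone model of the proposed rule, for illustration only. `Ids` and `resolve_ids_proposed` are hypothetical names and do not exist in the plugin. The plugin-stack parent (the enclosing agent span) is an assumption, while the ambient values mirror the `TOOL_STARTING` row of the example trace.

```python
from typing import NamedTuple, Optional


class Ids(NamedTuple):
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]


def resolve_ids_proposed(ambient: Optional[Ids], plugin: Ids) -> Ids:
    """Proposed rule: trace_id/span_id from ambient OTel, parent from the plugin stack."""
    if ambient is None:
        return plugin
    return Ids(ambient.trace_id, ambient.span_id, plugin.parent_span_id)


TRACE = "c7f3ed279e620d0e26df69f9165406e8"
logged_span_ids = {"351611af4c2a9ca5", "40fad0f4b5d8203c"}

ambient = Ids(TRACE, "fb9761a38d9251ba", "84a19a6ed0f23b06")  # parent is a framework span
plugin = Ids(TRACE, "fb9761a38d9251ba", "40fad0f4b5d8203c")   # parent is the agent span (assumed)

resolved = resolve_ids_proposed(ambient, plugin)
print(resolved.parent_span_id in logged_span_ids)
```

Under this rule the row keeps the ambient `trace_id` and `span_id`, so it can still be matched against Cloud Trace, while its parent always refers to a span the plugin has logged.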

**Workaround (SQL-level):**

Until a plugin fix is available, consumers can JOIN on `trace_id + agent_name` instead of `parent_span_id`:

```sql
-- Instead of:
LEFT JOIN agent_events_view A ON T.parent_span_id = A.span_id

-- Use:
LEFT JOIN agent_events_view A ON T.trace_id = A.trace_id AND T.agent_name = A.agent_name
```

This works because both `tool_events_view` and `llm_events_view` carry an `agent_name` column (from the raw `agent` field) that identifies which agent the event belongs to.
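The same workaround can be applied to rows already pulled out of BigQuery. The sketch below is a rough in-memory equivalent of the `LEFT JOIN` on `trace_id` and `agent_name`; the function name and the dict-shaped rows are assumptions for illustration, with values taken from the example trace.

```python
from collections import defaultdict


def left_join_on_trace_and_agent(tool_events, agent_events):
    """Pair each tool event with every agent row sharing trace_id and agent_name."""
    index = defaultdict(list)
    for agent in agent_events:
        index[(agent["trace_id"], agent["agent_name"])].append(agent)
    joined = []
    for tool in tool_events:
        # Keep the tool row even when no agent row matches (LEFT JOIN semantics).
        matches = index.get((tool["trace_id"], tool["agent_name"])) or [None]
        joined.extend((tool, match) for match in matches)
    return joined


TRACE = "c7f3ed279e620d0e26df69f9165406e8"
agent_events = [
    {"trace_id": TRACE, "agent_name": "knowledge_supervisor",
     "span_id": "40fad0f4b5d8203c"},
    {"trace_id": TRACE, "agent_name": "internal_docs_agent",
     "span_id": "b6000853c649fd03"},
]
tool_events = [
    {"trace_id": TRACE, "agent_name": "internal_docs_agent",
     "tool_name": "search_internal_docs", "parent_span_id": "9b7b334a7c4d42ad"},
]

joined = left_join_on_trace_and_agent(tool_events, agent_events)
for tool, agent in joined:
    print(tool["tool_name"], "->", agent["span_id"] if agent else None)
```

Note that if the same agent runs more than once within a trace, this key matches every run, so the result may need to be narrowed further (for example by timestamp).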

**Minimal Reproduction Code:**

```python
from google.adk.agents import Agent, LlmAgent
from google.adk.apps import App
from google.adk.models import Gemini
from google.adk.plugins.bigquery_agent_analytics_plugin import (
    BigQueryLoggerConfig,
    BigQueryAgentAnalyticsPlugin,
)

# A tool that the sub-agent will call
def search_docs(query: str) -> str:
    """Searches documents."""
    return "No results found."

# Sub-agent with a tool
sub_agent = LlmAgent(
    name="docs_agent",
    model="gemini-2.5-flash",
    description="Searches documents",
    instruction="Use search_docs to answer questions.",
    tools=[search_docs],
)

# Root supervisor agent
root = Agent(
    name="supervisor",
    model=Gemini(model="gemini-2.5-pro"),
    description="Routes queries",
    instruction="Route document questions to docs_agent.",
    sub_agents=[sub_agent],
)

bq_plugin = BigQueryAgentAnalyticsPlugin(
    project_id="your-project",
    dataset_id="your-dataset",
    table_id="your-table",
    config=BigQueryLoggerConfig(enabled=True, batch_size=1),
    location="us-central1",
)

app = App(root_agent=root, name="test_app", plugins=[bq_plugin])

# Run a query that triggers: supervisor → LLM → transfer → docs_agent → LLM → search_docs → LLM
# Then query BQ and check parent_span_id references
```

After running, execute the diagnostic SQL from "Steps to Reproduce" step 4. You will find TOOL_STARTING/COMPLETED events for `search_docs` with `parent_span_id` values that don't exist as any row's `span_id` in the table.

**How often has this issue occurred?:**

- Always (100%) — every trace with OTel instrumentation active exhibits phantom parent_span_ids. The exact percentage varies by event type (92.8% for LLM_RESPONSE, 10.3% for TOOL events, 3.7% for AGENT events in our dataset).

