[Bug]: compress feedback loop burns 738K tokens in one session — summary grows monotonically while accepting stale endId #573

@js-sknk

Summary

Across a single 454-message session on @tarquinen/opencode-dcp@3.1.13, the compress pipeline entered a self-reinforcing loop in which each new block re-absorbed the previous block's (bN) placeholder plus a tiny new tail of raw messages, producing a summary larger than the content it replaced. The MAX CONTEXT LIMIT REACHED system reminder fired 45 times, forced 71 compress blocks, and burned 738,738 prune tokens while the on-disk summary grew monotonically to ~68K tokens (~234K chars).

User reaction mid-session (verbatim, Korean):

  • 15:16 — "뭘 이렇게 compress 하는 건가요?" ("Why is it compressing this much?")
  • 23:53 — "또 무한 compress 인가요?" ("Infinite compress again?")

This is distinct from #572 (compression runs but UI window doesn't shrink) and #551/#555 (stale boundary IDs) — those are upstream causes; this report documents the downstream cost when those failures combine with no size guard, and includes a quantitative growth chart.

Environment

  • @tarquinen/opencode-dcp: 3.1.13 (latest on npm as of 2026-06-18)
  • opencode: 1.17.8
  • Platform: macOS (darwin)
  • Config: dcp.jsonc was effectively empty (only $schema); all defaults applied. Specifically manualMode.enabled: false, compress.permission: "allow", compress.maxContextLimit: 100000, compress.nudgeFrequency: 5.
  • Session id: ses_12cf90792ffen6cfatZJzBARA5 (Hephaestus / Prometheus / Atlas — start-work plan + execute on Claude Opus 4.7)

State file: ~/.local/share/opencode/storage/plugin/dcp/ses_12cf90792ffen6cfatZJzBARA5.json (8.3 MB; evidence preserved locally and available on request).

Quantitative evidence

From stats and nudges in the dcp state file:

| Metric | Value |
| --- | --- |
| stats.totalPruneTokens | 738,738 |
| nudges.contextLimitAnchors | 45 entries (all *_prefill_recovery) |
| nextBlockId | 72 → 71 compress blocks in one session |

Two monotonic growth waves are visible in blocksById; selected blocks are shown below (Wave 2 begins after a cold-reset block at B34):

Wave 1

| Block | Range | summaryTokens | summaryChars |
| --- | --- | --- | --- |
| 1 | m0001..m0020 | 12,389 | 41,712 |
| 5 | m0127..m0188 | 6,022 | 22,858 |
| 11 | b10..m0190 | 14,595 | 53,252 |
| 20 | b19..m0217 | 37,326 | 131,252 |
| 30 | b29..m0228 | 61,616 | 210,704 |
| 33 | b32..m0237 | 68,286 | 233,645 |

Wave 2 (after B34 reset)

| Block | Range | summaryTokens | summaryChars |
| --- | --- | --- | --- |
| 34 | m0002..m0003 | 1,604 | 6,046 |
| 47 | b46..m0033 | 21,017 | 72,415 |
| 64 | b63..m0073 | 48,297 | 166,555 |
| 71 | b70..m0345 | 56,025 | 192,384 |

Mid-session example seen in a single block: 2.6K tokens removed, 19.3K tokens of summary added, for a net increase of 16.7K tokens of context (i.e. compressing made things worse).

Suspicious endId chains

The range field on several blocks shows endId message numbers lower than the start block's already-covered range — the model fed compress stale <dcp-message-id> values from earlier turns, and the tool accepted them:

| Block | range | Note |
| --- | --- | --- |
| 23 | b22..m0085 | b22 already covers through ~m0223 |
| 41 | b40..m0018 | endId older than start block by hundreds of msgs |
| 47 | b46..m0033 | same pattern |
| 50 | b49..m0040 | same |
| 53 | b52..m0046 | same |
| 64 | b63..m0073 | same |

Compress proceeded each time, and the resulting summary effectively re-summarized content already inside (bN-1) plus a small new tail.
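
The check that would have rejected rows like block 23 can be sketched as follows. This is a minimal illustration, not the plugin's code: `checkEndId`, `parseMessageId`, and the `prevCoveredThrough` parameter are hypothetical names, and it assumes message IDs always have the form `mNNNN` and that the tool knows the highest message index the previous block covers.

```typescript
// Hypothetical helper names; not part of the opencode-dcp API.
type EndIdCheck =
  | { ok: true; endIndex: number }
  | { ok: false; reason: string };

// Parse "m0085" -> 85. Returns null for anything that is not an mNNNN ID.
function parseMessageId(id: string): number | null {
  const match = /^m(\d+)$/.exec(id.trim());
  return match ? Number(match[1]) : null;
}

// Accept endId only if it lies strictly after the last message index that
// the previous block already covers (prevCoveredThrough is an assumption
// about what the tool can look up in its own state).
function checkEndId(prevCoveredThrough: number, endId: string): EndIdCheck {
  const endIndex = parseMessageId(endId);
  if (endIndex === null) {
    return { ok: false, reason: `endId "${endId}" is not an mNNNN message ID` };
  }
  if (endIndex <= prevCoveredThrough) {
    return {
      ok: false,
      reason:
        `endId ${endId} is already covered; ` +
        `the previous block covers through message ${prevCoveredThrough}`,
    };
  }
  return { ok: true, endIndex };
}
```

With the approximate figures from the table, `checkEndId(223, "m0085")` is refused, while an endId past the covered range such as `checkEndId(223, "m0228")` is accepted.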

Three compounding failures

  1. Stale endId accepted. The model picks message IDs from <dcp-message-id> tags injected in earlier responses (already inside the previous block's range) instead of from the latest active tail. This is the same family as #551 ("compress" boundary IDs orphaned after auto-compaction; error wording misleads the agent), #555 (mXXXX</parameter> hallucinated at end of responses; not stripped, creates feedback loop), #542 (compressed block summaries retain stale mNNNN message ID tags; model copies stale IDs), and #541 (model uses stale mNNNN IDs from nudges/summaries; compress fails with "startId not available"). Here, however, the tool does not refuse the call: it produces a block whose endId is older than the previous block's coverage.
  2. No net-compaction size guard. Compressions are committed even when summaryTokens > removedTokens. There is no check like if summaryTokens >= removedTokens * threshold then refuse. As a result Wave 1 added context across 33 successive operations.
  3. System-reminder feedback loop. Once context fills, every assistant turn triggers MAX CONTEXT LIMIT REACHED, which forces another compress, which produces another oversized summary, which keeps context full → next reminder. This fired 45 times in a single session and is the proximate cause of the token burn the user observed.

These three interact: (1) feeds the loop wrong boundaries, (2) lets each iteration add more context than it removes, and (3) repeatedly retriggers (1) and (2). Fixing any one of them likely breaks the loop.
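
The missing size check from failure 2 can be sketched as below. This is an illustrative sketch only: `netCompactionGuard` and `maxRatio` are hypothetical names, and the 0.7 default is simply the example ratio suggested later in this report, not an existing plugin setting.

```typescript
interface GuardDecision {
  commit: boolean;
  netTokens: number; // positive = context grows, negative = context shrinks
  reason: string;
}

// Refuse a compress block whose summary is too large relative to what it
// removes. maxRatio is a hypothetical setting (summaryTokens / removedTokens).
function netCompactionGuard(
  removedTokens: number,
  summaryTokens: number,
  maxRatio: number = 0.7,
): GuardDecision {
  const netTokens = summaryTokens - removedTokens;
  if (removedTokens <= 0) {
    return { commit: false, netTokens, reason: "nothing would be removed" };
  }
  const ratio = summaryTokens / removedTokens;
  if (ratio > maxRatio) {
    return {
      commit: false,
      netTokens,
      reason: `summary is ${ratio.toFixed(2)}x the removed content (limit ${maxRatio})`,
    };
  }
  return { commit: true, netTokens, reason: "net reduction within limit" };
}
```

Applied to the mid-session example above, `netCompactionGuard(2600, 19300)` refuses the commit and reports `netTokens` of 16700, i.e. the block would have grown context by 16.7K tokens.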

Reproduction (observational)

I cannot give a deterministic repro recipe — this happened during a long agent-driven start-work session on a Korean-language project — but the state file makes the pattern reproducible offline. Read ~/.local/share/opencode/storage/plugin/dcp/ses_12cf90792ffen6cfatZJzBARA5.json and inspect blocksById[*].range, blocksById[*].summaryTokens, nudges.contextLimitAnchors, and stats.totalPruneTokens.
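
A minimal sketch of that offline inspection is shown below. The `DcpState` shape is an assumption inferred from the field names listed above (the real state file may nest or name things differently), and `summarizeState` is a hypothetical helper, not part of the plugin.

```typescript
// Assumed (hypothetical) shape of the fields named in this report.
interface DcpBlock {
  range?: string;
  summaryTokens?: number;
}

interface DcpState {
  blocksById?: { [id: string]: DcpBlock };
  nudges?: { contextLimitAnchors?: unknown[] };
  stats?: { totalPruneTokens?: number };
}

interface BlockRow {
  id: number;
  range: string;
  summaryTokens: number;
  grewVsPrevious: boolean; // true when larger than the previous block's summary
}

interface StateSummary {
  rows: BlockRow[];
  anchorCount: number;
  totalPruneTokens: number;
}

function summarizeState(state: DcpState): StateSummary {
  const blocks = state.blocksById || {};
  // Assumes block IDs are numeric keys ("1", "2", ...); other keys are skipped.
  const entries = Object.keys(blocks)
    .map((key) => ({ id: Number(key), block: blocks[key] }))
    .filter((entry) => !isNaN(entry.id))
    .sort((a, b) => a.id - b.id);

  const rows: BlockRow[] = [];
  let previousTokens: number | null = null;
  for (const entry of entries) {
    const summaryTokens = entry.block.summaryTokens || 0;
    rows.push({
      id: entry.id,
      range: entry.block.range || "",
      summaryTokens,
      grewVsPrevious: previousTokens !== null && summaryTokens > previousTokens,
    });
    previousTokens = summaryTokens;
  }

  const anchors = (state.nudges && state.nudges.contextLimitAnchors) || [];
  const totalPruneTokens = (state.stats && state.stats.totalPruneTokens) || 0;
  return { rows, anchorCount: anchors.length, totalPruneTokens };
}
```

In Node, the state can be loaded with `JSON.parse(readFileSync(path, "utf8"))` from `node:fs` and passed to `summarizeState`; a long run of rows with `grewVsPrevious` set to true corresponds to one of the growth waves.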

Conditions that appear to drive it:

  • Long session (400+ messages) on a high-context model (Claude Opus 4.7)
  • Heavy tool-call output filling context faster than compress can prune
  • Default nudgeFrequency: 5 and default maxContextLimit: 100000
  • Once *_prefill_recovery anchors start firing, the loop self-reinforces

Suggested fixes (any one would help)

  1. Validate endId against state. When committing a compress block, refuse if parseEndId(endId) is not strictly after the previous block's endId. Return an error message that tells the agent which IDs are valid (see also the error-wording problem reported in #551).
  2. Net-compaction guard. Refuse to commit a block where summaryTokens >= removedTokens (or where summaryTokens / removedTokens > configurable_ratio, default e.g. 0.7). On refusal, emit a different nudge so the agent does not immediately retry the same compress.
  3. Anchor backoff. Track *_prefill_recovery anchor count per session; after N (e.g. 3) recoveries that did not actually shrink context, switch to manual mode automatically and surface a warning rather than continuing to nudge.
  4. Apply the iteration-nudge cap to recovery anchors. iterationNudgeThreshold (default 15) already caps iteration nudges, but *_prefill_recovery anchors appear to bypass it: they fired 45 times in one session.
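
The anchor backoff in fix 3 could look roughly like this. It is a sketch under stated assumptions: `RecoveryTracker` and `recordRecovery` are hypothetical names, context size is assumed to be measurable before and after each recovery, and resetting the counter after a recovery that does shrink context is a design choice of this sketch rather than something the plugin does today.

```typescript
interface RecoveryTracker {
  failedRecoveries: number; // consecutive recoveries that did not shrink context
  manualMode: boolean;      // true once auto-nudging should stop
}

// Record one *_prefill_recovery outcome and decide whether to back off.
// maxFailed is a hypothetical setting; 3 is the example value from fix 3.
function recordRecovery(
  tracker: RecoveryTracker,
  contextBefore: number,
  contextAfter: number,
  maxFailed: number = 3,
): RecoveryTracker {
  if (tracker.manualMode) {
    return tracker; // already backed off; stay in manual mode
  }
  const shrank = contextAfter < contextBefore;
  const failedRecoveries = shrank ? 0 : tracker.failedRecoveries + 1;
  return {
    failedRecoveries,
    manualMode: failedRecoveries >= maxFailed,
  };
}
```

Three consecutive recoveries that leave context the same size or larger flip `manualMode` to true, at which point the caller would surface a warning instead of issuing another nudge.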

Local mitigation in place (FYI)

For users hitting this today, setting manualMode.enabled: true in dcp.jsonc stops auto-nudges entirely while keeping deduplication and purgeErrors strategies active. Combined with raising compress.maxContextLimit and compress.nudgeFrequency, this cleanly broke the loop in subsequent sessions.

{
  "$schema": "https://raw.githubusercontent.com/Opencode-DCP/opencode-dynamic-context-pruning/master/dcp.schema.json",
  "manualMode": { "enabled": true, "automaticStrategies": true },
  "compress": {
    "maxContextLimit": "85%",
    "minContextLimit": "60%",
    "nudgeFrequency": 20,
    "nudgeForce": "soft"
  }
}

Related issues

This report adds quantitative monotonic-growth evidence and the net-negative compaction observation that the prior issues did not capture.

Artifacts available on request

  • Full ses_12cf90792ffen6cfatZJzBARA5.json (8.3 MB)
  • All 71 blocksById entries with range / summaryTokens / summaryChars
  • All 45 contextLimitAnchors with timestamps and trigger reasons

Happy to attach or share via a private channel if useful.
