feat(doctor): surface prunable-memory backlog in memory_integrity probe (#429) #430
Merged
…be (#429)

The `memory_integrity` doctor probe reported the memory store's decisions/rejected/episodes/skills breakdown and returned PASS whenever the store opened cleanly, with no visibility into the *prunable* backlog: the active, non-load-bearing episodic items that accumulate one-per-issue (epic #426) until they drown the load-bearing decisions in boot context. A loop that has run 300 issues showed `episodes=600` and a green light, giving a maestro resuming after context loss no signal that compaction was overdue.

This enriches the existing probe (no new probe / CHECK_NAMES / CLI command):

- `PRUNABLE_MEMORY_WARN_THRESHOLD` and `COMPACT_REMEDIATION` are declared once alongside the existing named `*_REMEDIATION` / verdict constants (no bare literal at the call site).
- The detail line additionally reports `prunable=<n>` (active items where `is_load_bearing_memory(item)` is False), computed over the already-fetched `active` list (no extra DB round-trip). Existing fields are unchanged.
- Prunable strictly > threshold -> WARN naming the count + compaction, with `COMPACT_REMEDIATION`; <= threshold -> unchanged PASS / remediation None.
- The absent (WARN), corrupt (FAIL) and unreadable (FAIL) branches are untouched; the prunable logic runs only on the clean-open path.

Read-only.

Dependency on sibling #427: `is_load_bearing_memory` is NOT yet merged. Per the ticket's "do not re-implement the classification" constraint (also manifesto Q7), the probe consumes the predicate via injection and resolves it dynamically in production (`_resolve_load_bearing_predicate`). Until #427 lands the symbol is absent, so the probe degrades, reporting `prunable=unmeasured` and staying PASS, exactly as `mutation_survivors_check` degrades without its #379 checker (declared-degrade per Q11). When #427 merges the WARN gate lights up with no further change here.

Tests inject a fake predicate to exercise the full PASS/WARN/boundary matrix now, and monkeypatch the module symbol to drive the real CLI path end-to-end.

Tests (tests/test_doctor_control_plane.py): below-threshold PASS, above-threshold WARN, strict `>` boundary (at-threshold PASS, +1 WARN), existing-fields-preserved, unavailable-predicate degrade, corrupt-store FAIL not masked, plus `doctor --json` WARN integration and human-table rendering of the prunable count + remediation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
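The verdict rule described in the commit message (count over the already-fetched `active` list, strict `>` gate, degrade when the predicate is unavailable) can be sketched roughly as follows. This is an illustration only, not the code in `doctor.py`: `ProbeResult`, `prunable_verdict`, the threshold value of 50, and the remediation text are all assumptions, since the PR names the constants but does not show their values.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence

# Assumed placeholder values; the PR does not state the real ones.
PRUNABLE_MEMORY_WARN_THRESHOLD = 50
COMPACT_REMEDIATION = "compact the memory store"


@dataclass(frozen=True)
class ProbeResult:
    """Hypothetical result shape: verdict, detail fragment, remediation."""
    verdict: str
    detail: str
    remediation: Optional[str]


def prunable_verdict(
    active: Sequence[Any],
    is_load_bearing: Optional[Callable[[Any], bool]],
    threshold: int = PRUNABLE_MEMORY_WARN_THRESHOLD,
) -> ProbeResult:
    """Classify the prunable backlog on the clean-open path only."""
    if is_load_bearing is None:
        # Predicate not available yet: degrade without raising a WARN.
        return ProbeResult("PASS", "prunable=unmeasured", None)

    # Count over the list the probe already holds; no extra DB query.
    prunable = sum(1 for item in active if not is_load_bearing(item))

    if prunable > threshold:  # strict '>', so at-threshold stays PASS
        detail = f"prunable={prunable} exceeds {threshold}; compaction overdue"
        return ProbeResult("WARN", detail, COMPACT_REMEDIATION)
    return ProbeResult("PASS", f"prunable={prunable}", None)
```

Passing the predicate as an argument is what lets tests inject a fake classification while production resolves the real one.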
hadamrd
commented
Jun 8, 2026
hadamrd
left a comment
Owner
Author
Minimal path to green (must-fix to merge)
Nothing blocks merge.
Optional follow-ups (do NOT block merge)
- [sev3/tests] Full `pytest` suite was not run to completion in the worker worktree (slow-fsync disk); only targeted tests + ruff + pyright were run. The change is additive and well-isolated so this is low-risk, but confirm CI runs the full suite green before merge.
- [sev3/tests] tests/test_doctor_control_plane.py:143: `_episodic_is_prunable` encodes an assumed semantics of #427's not-yet-merged `is_load_bearing_memory`. The seam test will stay green even if #427 ships different classification. When #427 lands, replace the monkeypatched fake with a real end-to-end assertion against the actual predicate.
- [sev3/style] tests/test_doctor_control_plane.py:165: `_seed_memory_backlog` reaches into the private `store._connection` to set `PRAGMA synchronous=OFF`. Acceptable for a throwaway test db, but if this seeding pattern spreads, consider a small store/test helper rather than touching a private attribute.
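One possible shape for the "small store/test helper" suggested in the last item is sketched below. It opens its own `sqlite3` connection to the throwaway database file rather than reaching into a store's private attribute. The helper names, the `memory` table, and its `kind` column are hypothetical; the real store schema is not shown in this PR.

```python
import os
import sqlite3
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def fast_seed_connection(db_path: str) -> Iterator[sqlite3.Connection]:
    """Open a connection with fsync disabled, for throwaway test dbs only."""
    conn = sqlite3.connect(db_path)
    try:
        # Trades durability for speed; never use on a real store.
        conn.execute("PRAGMA synchronous=OFF")
        yield conn
        conn.commit()
    finally:
        conn.close()


def seed_episodes(db_path: str, count: int) -> None:
    """Bulk-insert `count` episodic rows into a hypothetical schema."""
    with fast_seed_connection(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS memory "
            "(id INTEGER PRIMARY KEY, kind TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO memory (kind) VALUES (?)", [("episode",)] * count
        )


def count_rows(db_path: str) -> int:
    """Read back the row count through a fresh connection."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM memory").fetchone()[0]
    finally:
        conn.close()


def demo() -> int:
    """Seed 600 rows into a temporary db and return the observed count."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "memory.db")
        seed_episodes(db_path, 600)
        return count_rows(db_path)
```

Whether a separate connection is appropriate depends on how the real store manages its database file, so treat this as a starting point rather than a drop-in replacement for `_seed_memory_backlog`.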
Summary

Enriches the existing `memory_integrity` doctor probe (`src/forge_loop/control/doctor.py`) so it surfaces the prunable-memory backlog: the active, non-load-bearing episodic items that accumulate one-per-issue (parent epic #426) until they drown the load-bearing decisions in boot context. Previously the probe printed `episodes=600` and returned PASS, giving a maestro resuming after context loss no signal that compaction was overdue.

No new probe, `CHECK_NAMES` entry, or CLI command: the existing probe is enriched only. Read-only: no writes to `.forge`.

Dependency on #427: which path I took

`is_load_bearing_memory` (the pure predicate) is #427, which is OPEN / not merged with no PR at the time of writing. Per the ticket's "Do not re-implement the classification here" constraint (and manifesto Q7), this probe consumes the predicate via injection and resolves it dynamically in production (`_resolve_load_bearing_predicate` → `getattr(forge_loop.memory, "is_load_bearing_memory", None)`).

- Until #427 lands the symbol is absent, so the probe degrades: it reports `prunable=unmeasured` and stays PASS, exactly as `mutation_survivors_check` degrades to WARN without its #379 checker ("Add a scoped `mutation-check` command over one configured high-risk module"; declared-degrade per manifesto Q11). It never raises a WARN it cannot substantiate, and never crashes `doctor`.
- Tests inject a fake predicate to exercise the PASS/WARN/boundary matrix, and monkeypatch the module symbol to drive the real `doctor --json` / human-table CLI path end-to-end (a seam test, so the consumer is proven to actually receive the classification).

Acceptance criteria & how they're tested
- `PRUNABLE_MEMORY_WARN_THRESHOLD` and `COMPACT_REMEDIATION` declared alongside `*_REMEDIATION`. Tested via `== COMPACT_REMEDIATION`.
- `prunable=<n>` added to the breakdown. Tests: `test_prunable_backlog_reported_below_threshold_passes`, `test_warn_detail_preserves_existing_breakdown_fields`.
- `> threshold` → WARN naming count + compaction, non-None remediation (`>` branch). Test: `test_prunable_backlog_above_threshold_warns`.
- `<= threshold` → unchanged PASS, remediation None; `>` (not `>=`) boundary. Test: `test_prunable_threshold_boundary_uses_strict_greater_than`.
- Absent/corrupt/unreadable branches untouched. Tests: `test_prunable_logic_does_not_mask_corrupt_store_fail`, existing `test_fails_cleanly_on_corrupt_db`.
- `doctor --json` over above-threshold store → `warn` + non-null remediation; human table renders count + remediation. Tests: `test_json_memory_integrity_warns_on_above_threshold_prunable_backlog`, `test_human_table_renders_prunable_count_and_compaction_remediation`.

Prunable count is computed over the already-fetched `active` list (no extra DB round-trip; manifesto Q9).

Test plan

- `ruff check src tests`: passes
- `pyright src/forge_loop`: 0 errors in the changed `doctor.py`. 2 errors remain in `runner/dispatch.py` and `runner/tick_checks.py`, both pre-existing on `trunk` in files this PR does not touch (verified by checking out the `trunk` versions).
- Targeted tests (`TestMemoryIntegrity` + the two cross-boundary tests).

Out of scope (respected)

- `is_load_bearing_memory` (#427, "Add a pure `is_load_bearing_memory(item)` predicate for curated memory").

Fixes #429
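For reference, the dynamic resolution described under "Dependency on #427" might look roughly like the sketch below. The PR only states that `_resolve_load_bearing_predicate` uses `getattr(forge_loop.memory, "is_load_bearing_memory", None)`; the `importlib` lookup, the parameters, the `callable` guard, and the public function name here are assumptions added so the example runs without `forge_loop` installed.

```python
import importlib
from typing import Any, Callable, Optional


def resolve_load_bearing_predicate(
    module_name: str = "forge_loop.memory",
    attr: str = "is_load_bearing_memory",
) -> Optional[Callable[[Any], bool]]:
    """Return the predicate if the symbol exists, else None (degrade)."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module missing entirely: same outcome as a missing symbol.
        return None
    candidate = getattr(module, attr, None)
    # Guard against a non-callable attribute with the same name.
    return candidate if callable(candidate) else None


def describe(predicate: Optional[Callable[[Any], bool]]) -> str:
    """Show which detail fragment the probe would be able to report."""
    return "prunable=<n>" if predicate is not None else "prunable=unmeasured"
```

Returning `None` instead of raising keeps the caller on the PASS path until the symbol exists, which matches the declared-degrade behaviour the description refers to.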