feat(query): opt-in --recency flag to down-weight stale facts#1665

Open
TPAteeq wants to merge 2 commits into Graphify-Labs:v8 from TPAteeq:feat/query-recency-weighting
TPAteeq commented Jul 4, 2026

Contributor

What

Adds an opt-in --recency flag to graphify query (and the equivalent
recency / half_life_days fields on the MCP query_graph tool) that
multiplies each matched node's search score by a time-decay factor, so newer
facts rank ahead of otherwise-equal stale ones. Default query output is
byte-for-byte unchanged when the flag is off.

This is the small, low-risk slice of #1650. The larger, riskier piece, fact
supersession / temporal-validity invalidation (valid_from/valid_to,
marking an old fact as superseded by a newer one), is deferred to a
follow-up and is intentionally not implemented here.

How

  • Reuses the existing half-life math. graphify reflect already has a pure
    _decay() (halves every _DEFAULT_HALF_LIFE_DAYS = 30); serve.py now
    imports those two pure names and decays query scores on the same curve. No
    coupling to the reflect learning sidecar.
  • Recency signal precedence (_node_recency_weight): captured_at (ISO
    datetime, present only on ingested docs) → else the source_file's on-disk
    mtime resolved under the repo root → else 1.0 (neutral). Code/AST nodes
    carry neither, so recency is a no-op for them.
  • Threading: _query_graph_text(..., recency, half_life_days, now,
    source_root) → _score_nodes(...). The decay multiplier is applied only
    inside the if recency: branch; when the flag is off, every code path is
    identical to the pre-#1650 scorer (verified by tests that assert
    byte-identical output).
  • Determinism / testability: a query is inherently "now"-relative, which is
    exactly why this stays behind an opt-in flag. _score_nodes /
    _query_graph_text accept an optional explicit now anchor so tests inject
    ages instead of depending on the wall clock.
  • _pick_seeds' per-term coverage guarantee is left age-neutral on purpose: a
    single old match for a query term shouldn't be starved out just for being old.
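
The decay curve and the precedence order described above can be sketched as follows. This is a minimal illustration, not the code in serve.py: decay, node_recency_weight, and apply_recency are hypothetical stand-ins for _decay, _node_recency_weight, and the if recency: branch of _score_nodes. The node-as-dict shape, the treatment of naive timestamps as UTC, and the fall-through from an unparseable captured_at to the mtime are assumptions.

```python
import os
from datetime import datetime, timezone

DEFAULT_HALF_LIFE_DAYS = 30.0  # the 30-day default named in the description


def decay(age_days, half_life_days=DEFAULT_HALF_LIFE_DAYS):
    """Half-life decay: 1.0 at age 0, 0.5 after one half-life."""
    if half_life_days <= 0:
        return 1.0  # assumption: a non-positive half-life means "no decay"
    # max(..., 0.0) clamps future-dated stamps to age 0 (weight 1.0)
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def node_recency_weight(node, now, source_root,
                        half_life_days=DEFAULT_HALF_LIFE_DAYS):
    """Precedence: captured_at, else source_file mtime, else 1.0 (neutral)."""
    stamp = None
    captured = node.get("captured_at")
    if isinstance(captured, str):
        try:
            stamp = datetime.fromisoformat(captured)
        except ValueError:
            stamp = None  # assumption: bad value falls through to the mtime
    if stamp is None and node.get("source_file"):
        path = os.path.join(source_root, node["source_file"])
        try:
            stamp = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
        except OSError:
            stamp = None  # file missing on disk
    if stamp is None:
        return 1.0  # code/AST nodes carry neither signal
    if stamp.tzinfo is None:
        stamp = stamp.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    age_days = (now - stamp).total_seconds() / 86400.0
    return decay(age_days, half_life_days)


def apply_recency(scored, *, recency=False, now=None, source_root=".",
                  half_life_days=DEFAULT_HALF_LIFE_DAYS):
    """scored is a list of (score, node) pairs; returned as-is when the flag is off."""
    if not recency:
        return scored
    now = now or datetime.now(timezone.utc)
    return [(score * node_recency_weight(node, now, source_root, half_life_days), node)
            for score, node in scored]
```

With a 30-day half-life, a node captured 30 days before now keeps half of its score, while a node with neither captured_at nor source_file keeps all of it.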

CLI / MCP surface

graphify query "<question>" [--dfs] [--context C] [--budget N] \
    [--recency] [--half-life-days N] [--graph path]

MCP query_graph gains recency (boolean, default false) and
half_life_days (number, default 30, only used when recency=true).

Tests

tests/test_serve.py and tests/test_query_cli.py:

  • flag off → _score_nodes / _query_graph_text output byte-identical to
    today, even when captured_at is present;
  • flag on → the newer of two equally-matching nodes is promoted (ranking
    and the Start: seed order shift toward newer);
  • captured_at precedence, on-disk mtime fallback, neutral weight when neither
    is present, future-date clamp, and _source_root_for path derivation;
  • CLI --recency flip and --half-life-days parsing/validation.

Ages are injected via captured_at + an explicit now anchor (unit tests) or
decades-apart dates (CLI tests, which have no now-injection), so nothing is
wall-clock dependent.
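
The now-injection pattern can be illustrated with a self-contained test. This is a sketch under stated assumptions rather than a copy of the tests in tests/test_serve.py: recency_weight is a hypothetical stand-in for the scorer's decay factor, and the dates are arbitrary.

```python
from datetime import datetime, timedelta, timezone


def recency_weight(captured_at, now, half_life_days=30.0):
    """Hypothetical stand-in: half-life decay of the age between captured_at and now."""
    age_days = (now - datetime.fromisoformat(captured_at)).total_seconds() / 86400.0
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def test_newer_node_outranks_equal_match():
    now = datetime(2026, 7, 4, tzinfo=timezone.utc)  # fixed anchor, no wall clock
    fresh = (now - timedelta(days=1)).isoformat()
    stale = (now - timedelta(days=90)).isoformat()
    base = 1.0  # both nodes match the query equally well
    assert base * recency_weight(fresh, now) > base * recency_weight(stale, now)
    assert recency_weight(stale, now) == 0.125  # 90 days is three half-lives
    # A future-dated stamp clamps to age 0 and stays at full weight.
    future = (now + timedelta(days=10)).isoformat()
    assert recency_weight(future, now) == 1.0
```

Because every age is computed from the injected now, the assertions give the same result whenever the suite runs.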

Run:

uv run pytest tests/test_serve.py tests/test_query_cli.py \
    tests/test_serve_http.py tests/test_reflect.py tests/test_querylog.py -q

Result: 165 passed, 1 skipped.

Deferred (follow-up, not in this PR)

  • Fact supersession / temporal-validity invalidation (valid_from/valid_to,
    superseded-by links). Only the query-time recency weighting is implemented here.

Refs #1650

TPAteeq and others added 2 commits July 5, 2026 00:01
…fy-Labs#1650, partial)

Add an opt-in `--recency` flag to `graphify query` (and `recency` /
`half_life_days` fields on the MCP `query_graph` tool) that multiplies each
matched node's search score by a time-decay factor, so newer facts rank ahead
of otherwise-equal stale ones. Default output is byte-for-byte unchanged when
the flag is off.

Recency signal precedence: `captured_at` (ingested docs) -> else `source_file`
mtime -> else 1.0 (neutral, so code/AST nodes are unaffected). Decay reuses the
reflect sidecar's pure half-life math (`_decay`, 30-day default) without
coupling to the learning sidecar. `_score_nodes` / `_query_graph_text` accept
an optional explicit `now` anchor so tests inject ages instead of depending on
the wall clock; the decay is applied only inside `if recency:` so the flag-off
path is identical to the pre-change scorer.

This is the small slice of Graphify-Labs#1650; fact supersession / temporal-validity
invalidation is deferred to a follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ry (Graphify-Labs#1650)

Follow-up to review of the opt-in `--recency` query weighting:

- CHANGELOG: add the required `## Unreleased` Feat bullet for the flag.
- serve.py: guard the MCP `half_life_days` parse. A non-numeric MCP argument
  previously hit an unguarded `float(...)` and crashed the `query_graph`
  handler with `ValueError`, whereas the CLI degrades gracefully. Extracted
  `_recency_args` coerces the payload and falls back to the 30-day default on a
  bad value, so the two entry points are consistent (and the parse is unit-
  testable without the `mcp` package installed).
- serve.py: normalize a trailing `Z` on `captured_at` to `+00:00` in the
  recency path before decay. `datetime.fromisoformat` only accepts bare `Z` on
  Python >= 3.11, so external frontmatter written as `...Z` silently degraded to
  neutral weight on 3.10. reflect._parse_dt is left untouched (the reflect Q&A
  path keeps its semantics).
- tests: lock malformed/null/non-string `captured_at` -> neutral 1.0;
  `half_life_days <= 0` -> recency disabled with no div-by-zero; `_recency_args`
  threading + bad-value fallback (no crash); and the `Z` normalization.
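
A rough sketch of the two serve.py guards described above, for illustration only: recency_args and parse_captured_at are hypothetical stand-ins for `_recency_args` and the `Z` normalization, the arguments dict shape is an assumption, and disabling recency inside the coercion helper when `half_life_days <= 0` is one possible placement, not necessarily where the real check lives.

```python
from datetime import datetime

DEFAULT_HALF_LIFE_DAYS = 30.0


def recency_args(arguments):
    """Coerce an MCP-style payload into (recency, half_life_days) without raising."""
    recency = bool(arguments.get("recency", False))
    try:
        half_life = float(arguments.get("half_life_days", DEFAULT_HALF_LIFE_DAYS))
    except (TypeError, ValueError):
        half_life = DEFAULT_HALF_LIFE_DAYS  # bad value: fall back, do not crash
    if half_life <= 0:
        recency = False  # assumption: non-positive half-life disables recency
    return recency, half_life


def parse_captured_at(value):
    """Parse an ISO timestamp, accepting a trailing Z on Python 3.10 as well."""
    if not isinstance(value, str):
        return None  # null / non-string values yield no timestamp
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"  # fromisoformat rejects bare Z before 3.11
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None  # malformed values yield no timestamp
```

For example, recency_args({"recency": True, "half_life_days": "abc"}) returns (True, 30.0) instead of raising.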

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>