Version: graphifyy 0.9.5. Context: living, CRM-like documentary corpus (people join/leave, RFPs open/close), 289 real queries logged over one month.
Problem
serve._query_graph_text ranks purely by term-overlap + IDF + confidence — no time factor at all. On a living corpus, stale facts weigh exactly as much as fresh ones: a query about a person returns their old assignment and their current one with equal standing, and nothing marks which edge is superseded.
reflect.py already has half-life mechanics (30 days) but applies them only to Q&A reputation, not to graph facts.
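To make the idea concrete, here is a minimal sketch of how a half-life decay could be folded into a ranking score. It is not graphifyy's actual code: `recency_weight`, `time_aware_score`, and `base_score` are hypothetical names, and the 30-day constant simply mirrors the half-life mentioned above.

```python
from datetime import datetime, timezone

# Assumption: reuse the 30-day half-life that reflect.py is said to use.
HALF_LIFE_DAYS = 30.0


def recency_weight(captured_at: datetime, now: datetime,
                   half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Multiplier in (0, 1]: 1.0 for a fact captured now, 0.5 after one half-life."""
    # Clamp negative ages (timestamps in the future) to zero.
    age_days = max((now - captured_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def time_aware_score(base_score: float, captured_at: datetime, now: datetime) -> float:
    """Hypothetical combination: existing relevance score scaled by recency."""
    return base_score * recency_weight(captured_at, now)


if __name__ == "__main__":
    now = datetime(2025, 1, 31, tzinfo=timezone.utc)
    old = datetime(2025, 1, 1, tzinfo=timezone.utc)  # exactly 30 days earlier
    print(time_aware_score(2.0, old, now))
```

A multiplicative weight keeps the existing term-overlap + IDF + confidence ordering among facts of the same age and only demotes older ones; whether that is the right blend for this scorer is an open design question.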
Proposal
Three composable pieces:
1. Optional temporal fields: valid_from / valid_to / superseded_by on nodes/edges, populated by extraction when the source states them.
2. A --recency flag on query that weights results by captured_at / mtime of the source_file.
3. Invalidation mechanism: when a new extraction states "X left Y", mark the existing employment edge as closed (don't delete it; history stays queryable).
Even just (2) would be a big win: today we have to re-check every time-sensitive answer against the raw source files.
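As an illustration of the temporal fields and the invalidation step, here is a rough sketch under stated assumptions. The `Edge` dataclass, `close_edges`, and the `works_at` relation name are all hypothetical and do not reflect graphifyy's real data model; only the field names valid_from, valid_to, and superseded_by come from the proposal.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Edge:
    # Hypothetical edge shape; only the three temporal fields come from the proposal.
    source: str
    target: str
    relation: str
    valid_from: Optional[date] = None
    valid_to: Optional[date] = None
    superseded_by: Optional[str] = None  # id of the replacing edge, if any

    def is_current(self, on: date) -> bool:
        """True if the edge is valid on the given day (valid_to is exclusive)."""
        started = self.valid_from is None or self.valid_from <= on
        not_ended = self.valid_to is None or on < self.valid_to
        return started and not_ended


def close_edges(edges: List[Edge], source: str, relation: str, target: str,
                ended_on: date, superseded_by: Optional[str] = None) -> List[Edge]:
    """Mark matching open edges as closed instead of deleting them."""
    closed = []
    for edge in edges:
        matches = (edge.source, edge.relation, edge.target) == (source, relation, target)
        if matches and edge.valid_to is None:
            edge.valid_to = ended_on
            edge.superseded_by = superseded_by
            closed.append(edge)
    return closed


if __name__ == "__main__":
    graph = [Edge("alice", "acme", "works_at", valid_from=date(2023, 1, 1))]
    # A new extraction states "alice left acme" on 2025-03-01.
    close_edges(graph, "alice", "works_at", "acme", ended_on=date(2025, 3, 1))
    print(graph[0].is_current(date(2025, 4, 1)), len(graph))
```

Because the closed edge stays in the graph, a query can still answer "where did alice work in 2024?" while a default query could filter on `is_current`.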