docs(chat-bot-memory): TurboQuant inspiration + sync to v4 memory state by thedeutschmark · Pull Request #1 · thedeutschmark/engineering-notes

thedeutschmark · 2026-05-21T05:18:11Z

What

Adds an Inspiration: TurboQuant section to the chat-bot-memory note, and brings the note up to date with the current engine.

TurboQuant (arXiv:2504.19874)

Read the full paper, not the press summaries (which incorrectly describe it as "PolarQuant + QJL" — PolarQuant is a separate method TurboQuant beats as a baseline). Actual pipeline: random rotation → per-coordinate Lloyd–Max scalar quantizer → 1-bit QJL residual.

Honest verdict baked into the section: the machinery buys this app nothing at its scale (retrieval is a sort over dozens of rows; reply LLM is cloud-hosted → no local KV cache). What's worth taking:

- QJL one-bit sketch as a semantic-dedup signal in the hygiene pass: catches "plays drums" vs "is a drummer" (lexical Jaccard scores these ~0.33 and misses them), with no stored embeddings, off the hot path.
- Outlier-channel bit allocation, generalized (the "+1"): spend retention + token budget where the variance/signal is, not uniformly. Reframes confidence-weighted selection as a measurable survival-weighted retrieval quality metric.
- TurboQuant documented as the escape hatch that de-risks the deferred embeddings step (packed bit-blobs + brute-force Hamming) without reordering the roadmap.
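
For readers unfamiliar with the "packed bit-blobs + brute-force Hamming" idea, a minimal sketch follows. It is an illustration only, not code from the engine: the `hamming` and `nearest` helpers, the row ids, and the 8-bit blobs are hypothetical, and real sketches would be longer bit-strings produced by a sign-projection.

```python
from typing import Dict, List, Tuple


def hamming(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length packed bit-blobs."""
    if len(a) != len(b):
        raise ValueError("sketches must have the same length")
    diff = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return bin(diff).count("1")


def nearest(query: bytes, rows: Dict[str, bytes], k: int = 3) -> List[Tuple[str, int]]:
    """Brute-force scan: score every row, return the k closest (id, distance) pairs."""
    scored = [(row_id, hamming(query, blob)) for row_id, blob in rows.items()]
    # Sort by distance, then by id so ties resolve deterministically.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:k]


# Hypothetical one-byte sketches standing in for stored memory rows.
rows = {
    "row_a": bytes([0b11110000]),
    "row_b": bytes([0b11110001]),
    "row_c": bytes([0b00001111]),
}
closest = nearest(bytes([0b11110000]), rows, k=2)
print(closest)
```

At dozens of rows a full scan like this is effectively free, which is why no index, codebook, or vector server is needed at that scale.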

Drift fixes (the "up to date" part)

- Cross-stream session recaps (v4 stream_sessions / "this stream" vs "last stream").
- Provenance render-tags ([said]/[reported]/[guess]) surfaced to the model.

Roadmap bullets updated to fold in the two concrete ideas. Deliberately scoped out bait-decisiveness work — that's a reply/action concern, not memory.

Note

This documents thinking and sharpens the roadmap; it does not commit to building a quantizer. The forgetmenot code changes (QJL dedup, signal-weighted budget) are a separate effort.

… state

Add an "Inspiration: TurboQuant" section drawing on Google Research's TurboQuant paper (arXiv:2504.19874), read in full rather than from the press summaries, which conflate it with the separate PolarQuant method.

The honest verdict: TurboQuant validates the "compress, don't hoard" thesis but its machinery buys nothing at this scale (retrieval is a sort over dozens of rows; the reply LLM is cloud-hosted, so there's no local KV cache to compress). Two things are worth taking:

- The QJL primitive (1-bit sign-projection, unbiased angle estimate) as a semantic-dedup signal in the hygiene pass, with no stored embeddings.
- The outlier-channel bit-allocation idea, generalized: spend retention and token budget where the variance/signal is, not uniformly. Reframes confidence-weighted selection as a measurable "survival-weighted retrieval quality" metric.

TurboQuant is also documented as the escape hatch that de-risks the deferred embeddings step (packed bit-blobs + brute-force Hamming, no codebook/server) without reordering the roadmap.

Also sync the note to current code: cross-stream session recaps (v4 stream_sessions) and provenance render-tags ([said]/[reported]/[guess]).

…l framing

Independent audit found two inaccuracies introduced/exposed by the implementation work:

- The note claimed a 1-bit sketch would catch zero-overlap synonyms ('plays drums' / 'is a drummer'). It won't: over bag-of-token features a sign-projection sketch is ~equivalent to Jaccard. The prototype was built and removed; reframed as the honest lesson, with true synonym dedup waiting for embeddings (QJL makes those cheap).
- The 'uniform / recency-rank-alone' framing was stale: retrieval already blends confidence x recency, and stale-decay now scales with confidence. Reframed so the genuinely-open frontier is distinctiveness + measuring whether the weighting helps.
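
The first audit finding can be illustrated with a small experiment. The sketch below is not the removed prototype. It is a hypothetical SimHash-style sign-projection over whitespace-split token counts, standing in for the QJL primitive; the names `sign_sketch`, `estimated_cosine`, the 256-bit width, and the tokenizer are all assumptions. Because the features are tokens, two phrases with no shared tokens project to independent signs, so the sketch sees them as unrelated, just as Jaccard does.

```python
import math
import random
from collections import Counter

BITS = 256  # sketch width; an arbitrary choice for this illustration


def tokens(text: str) -> list:
    """Plain whitespace tokenizer (an assumption; the engine's may differ)."""
    return text.lower().split()


def jaccard(a: str, b: str) -> float:
    sa, sb = set(tokens(a)), set(tokens(b))
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 1.0


def token_row(token: str) -> list:
    """Deterministic Gaussian row per token, i.e. one column of a random projection."""
    rng = random.Random(token)
    return [rng.gauss(0.0, 1.0) for _ in range(BITS)]


def sign_sketch(text: str) -> int:
    """Project the bag-of-tokens vector and keep only the sign of each output."""
    acc = [0.0] * BITS
    for tok, count in Counter(tokens(text)).items():
        for i, g in enumerate(token_row(tok)):
            acc[i] += count * g
    bits = 0
    for i, value in enumerate(acc):
        if value >= 0.0:
            bits |= 1 << i
    return bits


def estimated_cosine(a: str, b: str) -> float:
    """Fraction of differing sign bits estimates angle / pi between the two vectors."""
    differing = bin(sign_sketch(a) ^ sign_sketch(b)).count("1")
    return math.cos(math.pi * differing / BITS)


for pair in [("plays drums", "is a drummer"), ("plays drums", "plays the drums")]:
    print(pair, round(jaccard(*pair), 2), round(estimated_cosine(*pair), 2))
```

Under these assumptions the zero-overlap pair gets a Jaccard of 0 and an estimated cosine near 0, while the pair that shares tokens scores well on both. The sketch only becomes a semantic signal once the input vectors are embeddings rather than token counts.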

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 65f94109d9

chatgpt-codex-connector · 2026-05-23T17:38:25Z

+| 10 | ~500 | ~2,000 | 99.93% |
+| 100 | ~5,000 | ~20,000 | 99.34% |
+| 1,000 | ~50,000 | ~200,000 | 93% |
+| 10,000 | ~500,000 | ~2M | 34% |


Recalculate 10k-user headroom against stated plan limits

The 10,000-user row reports ~500,000 KV reads/day and ~2M worker req/day while also claiming 34% headroom, but §4.1 states Workers Paid includes 10M/month KV reads and 10M/month worker requests (~333k/day each). At these daily rates, both resources are already well over the included monthly quotas (about 15M KV reads/month and 60M worker requests/month), so this headroom value is directionally wrong and will mislead capacity/cost planning.
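
The arithmetic behind this comment can be checked with a short script. This is a sketch under stated assumptions: a 30-day month, and "headroom" defined as the unused fraction of the included quota (the note's own definition is not shown here). The quota and daily figures are the ones quoted above; the variable and function names are illustrative.

```python
DAYS_PER_MONTH = 30  # assumption: flat 30-day month

# Included monthly quotas as quoted from §4.1 in the comment above.
INCLUDED_PER_MONTH = {"kv_reads": 10_000_000, "worker_requests": 10_000_000}

# Daily rates from the 10,000-user row.
DAILY_AT_10K_USERS = {"kv_reads": 500_000, "worker_requests": 2_000_000}


def headroom(daily: int, included_monthly: int, days: int = DAYS_PER_MONTH) -> float:
    """Unused fraction of the included quota; negative means over quota."""
    return 1.0 - (daily * days) / included_monthly


report = {
    name: headroom(DAILY_AT_10K_USERS[name], INCLUDED_PER_MONTH[name])
    for name in INCLUDED_PER_MONTH
}
for name, value in report.items():
    print(f"{name}: {value:+.0%}")
```

With these inputs, KV reads come out at -50% (15M used against 10M included) and worker requests at -500% (60M against 10M), so a positive 34% cannot hold under this definition.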

thedeutschmark added 2 commits May 21, 2026 01:17

thedeutschmark force-pushed the docs/turboquant-inspiration branch from 65f9410 to cb4023a on May 23, 2026 17:37

chatgpt-codex-connector Bot reviewed May 23, 2026

thedeutschmark wants to merge 2 commits into main from docs/turboquant-inspiration
