Skip to content

fix: escape interior double-quotes in FTS5 search to prevent 500s#462

Closed
Mubashirrrr wants to merge 1 commit into
Gentleman-Programming:mainfrom
Mubashirrrr:fix/fts-search-embedded-quote-crash
Closed

fix: escape interior double-quotes in FTS5 search to prevent 500s#462
Mubashirrrr wants to merge 1 commit into
Gentleman-Programming:mainfrom
Mubashirrrr:fix/fts-search-embedded-quote-crash

Conversation

@Mubashirrrr

Copy link
Copy Markdown

Problem

GET /search (and GET /prompts/search) return HTTP 500 for ordinary search input that contains a double-quote inside a word, e.g. foo"bar.

$ curl -s 'http://127.0.0.1:PORT/search?q=foo%22bar'
{"error":"search: SQL logic error: unterminated string (1)"}

Root cause

sanitizeFTS (internal/store/store.go) wraps each whitespace-separated token in double-quotes so FTS5 treats it as a phrase. It strips surrounding quotes with strings.Trim(w, "\"") but leaves interior quotes untouched:

w = strings.Trim(w, `"`)   // only edges
words[i] = `"` + w + `"`    // re-wrap

So foo"bar becomes the phrase "foo"bar". That is an unbalanced FTS5 string literal, and SQLite aborts the whole MATCH query with unterminated string. Store.Search returns that error and the HTTP handler maps it to 500. Whether a given input crashes is data-dependent (an even number of interior quotes happens to balance, an odd number does not), which makes it an easy-to-hit, hard-to-explain 500.

Fix

Escape any interior double-quotes by doubling them, which is how a literal " is represented inside an FTS5 string literal:

w = strings.Trim(w, `"`)            // strip surrounding quotes (unchanged)
w = strings.ReplaceAll(w, `"`, `""`) // escape remaining interior quotes
words[i] = `"` + w + `"`

foo"bar now sanitizes to the valid phrase "foo""bar", which both runs without error and correctly matches text containing the literal token foo"bar. Plain queries are unaffected: fix auth bug"fix" "auth" "bug" and "quoted phrase""quoted" "phrase", identical to before. One line of logic changed; surrounding-quote behavior preserved.

Test (fail-before → pass-after)

Added two tests in internal/store/store_test.go:

  • TestSearchWithEmbeddedQuoteDoesNotCrash — stores an observation containing foo"bar, then runs Search with several embedded-quote inputs (foo"bar, a"b"c, ", he said "hi). Asserts none error, and that searching foo"bar returns the matching observation.
  • TestSanitizeFTSPreservesPlainQueries — locks in the unchanged output for plain/edge-quoted queries (backward-compat guard).
# with the fix reverted (fail-before):
--- FAIL: TestSearchWithEmbeddedQuoteDoesNotCrash
    store_test.go:8320: Search("foo\"bar") returned error: search: SQL logic error: unterminated string (1)
--- FAIL: TestSanitizeFTSPreservesPlainQueries
    store_test.go:8352: sanitizeFTS("foo\"bar") = "\"foo\"bar\"", want "\"foo\"\"bar\""

# with the fix (pass-after):
ok  github.com/Gentleman-Programming/engram/internal/store

Full go test ./internal/store/ and go test ./internal/server/ pass; go vet ./internal/store/ is clean.

Repro steps

  1. Start the engram HTTP server and create at least one observation.
  2. curl -s 'http://127.0.0.1:<port>/search?q=foo%22bar'
  3. Before this change: 500 with search: SQL logic error: unterminated string. After: 200 with normal results.

🤖 Generated with Claude Code

A search token containing a double-quote inside the word (e.g. foo"bar)
was wrapped by sanitizeFTS into the unbalanced FTS5 phrase "foo"bar".
SQLite rejects this with "unterminated string", so Store.Search returns
an error and the GET /search and GET /prompts/search handlers respond
with HTTP 500 on ordinary user input.

Escape any interior double-quotes by doubling them, as required for an
FTS5 string literal. Surrounding-quote stripping is preserved, so plain
queries like `fix auth bug` and `"quoted phrase"` produce identical
output to before.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@Alan-TheGentleman

Copy link
Copy Markdown
Collaborator

This looks like a plausible FTS5 bug, but I am closing the PR because it skips the issue-first process and the PR body includes generated-tool attribution we do not carry in contributions. Open a clear bug issue with the reproduction, get it approved, then re-file a clean PR linked with Closes #N. The underlying fix can be reviewed once the process is clean.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants