Skip to content

fix(pgtype): scan "char" (OID 18) into *string in binary format (#2314)#2571

Open
luongs3 wants to merge 1 commit into
jackc:masterfrom
luongs3:fix/2314-qchar-binary-scan
Open

fix(pgtype): scan "char" (OID 18) into *string in binary format (#2314)#2571
luongs3 wants to merge 1 commit into
jackc:masterfrom
luongs3:fix/2314-qchar-binary-scan

Conversation

@luongs3
Copy link
Copy Markdown

@luongs3 luongs3 commented May 28, 2026

Fixes #2314.

Problem

Scanning the PostgreSQL "char" type (OID 18, single-byte system type) into a Go *string worked in TextFormat but failed in BinaryFormat:

cannot scan char (OID 18) in binary format into *string

Minimal repro from the issue:

var confupdtype string
err := conn.QueryRow(ctx, `SELECT confupdtype FROM pg_constraint LIMIT 1`).Scan(&confupdtype)
// err: cannot scan char (OID 18) in binary format into *string

This hits anyone reading common system catalog columns (relkind, contype, confupdtype, ...) because binary is the default wire format for OID 18.

Why text worked: pgtype.Map.planScan short-circuits any text-format *string target to scanPlanString before consulting the codec. The binary branch only does that for TextOID / VarcharOID, so for OID 18 it falls through to QCharCodec.PlanScan — which only handled *byte and *rune.

Fix

  • pgtype/qchar.go: add a *string case to QCharCodec.PlanScan, returning a new scanPlanQcharCodecString that mirrors scanPlanQcharCodecByte / scanPlanQcharCodecRune (single-byte payload, empty src → zero value, multi-byte src → error).

Both wire formats now produce the same result for the same target type.

Tests

  • pgtype/qchar_test.go: new TestQcharCodecPlanScanString exercising the codec directly (no DB needed) — covers text format, binary format, empty payload, and over-long payload. The binary + empty-binary subtests fail on master and pass on this branch.
  • TestQcharTranscode: added *string round-trip cases ("a", "z", "0", " ") so the fix is exercised end-to-end against a real Postgres server when one is available.
  • gofmt and go vet clean on touched files.

What does NOT change

  • *byte and *rune scan paths are untouched.
  • Encode path is untouched.
  • DecodeValue / DecodeDatabaseSQLValue are untouched.
  • No new exported symbols; scanPlanQcharCodecString is package-private, same as its sibling plans.

Co-authored-by: Claude Opus 4.7 (1M context) noreply@anthropic.com

…c#2314)

The "char" codec only registered scan plans for *byte and *rune targets.
In the text format, scanning into *string happened to work because
pgtype.Map.planScan short-circuits any text-format *string target to
scanPlanString before consulting the codec. The binary path has no such
short-circuit, so the same scan failed with:

    cannot scan char (OID 18) in binary format into *string

That broke a very common pattern when reading system catalogs:

    SELECT confupdtype FROM pg_constraint LIMIT 1

(or any other pg_*.relkind / contype / etc.) because the default wire
format for OID 18 is binary.

This change adds a *string scan plan to QCharCodec that mirrors the
*byte and *rune plans (single-byte payload, empty src -> zero value,
multi-byte src -> error), so both wire formats produce the same result
for the same target type.

Tests:
- pgtype/qchar_test.go: new TestQcharCodecPlanScanString covering text +
  binary + empty payload + too-long payload, exercising the codec
  directly so it runs without a Postgres connection.
- TestQcharTranscode: added *string round-trip cases for "a", "z", "0",
  " " so the fix is exercised end-to-end against a real server when PG
  is available.

Local verification:
- Reproduced the issue: PlanScan(QCharOID, BinaryFormatCode, *string)
  returned *scanPlanFail before the fix; returns scanPlanQcharCodecString
  after.
- TestQcharCodecPlanScanString fails on master (binary + empty-binary
  subtests) and passes on this branch.
- gofmt and go vet clean on touched files.

Fixes jackc#2314

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jackc
Copy link
Copy Markdown
Owner

jackc commented May 31, 2026

Thanks for the PR! The diagnosis and the routing analysis (text *string short-circuits to scanPlanString for any OID, binary only for Text/VarcharOID) are spot on, and the fix is correct for the common ASCII catalog case. But I think the conversion isn't quite right for the full range of the type.

(Also: I gather this was co-authored by Claude — and I reviewed it with Claude too. So this is largely my Claude leaving notes for your Claude. 🤖🤝🤖)

What "char" actually is

OID 18 is a single raw byte (C char), not a Unicode character — it can hold any value 0–255, and PostgreSQL treats it as a byte. The catalogs happen to use ASCII, but the type itself is byte-oriented.

The issue

The new plan does *p = string(src[0]). In Go, string(x) for an integer-typed x (here byte) doesn't copy the byte — it interprets it as a Unicode code point and returns its UTF-8 encoding. For any byte ≥ 128 that's two bytes. The text-format *string path, by contrast, does string(src) — a raw byte copy. So they diverge:

wire format scan plan result bytes len
text scanPlanStringstring(src) "\xc8" c8 1
binary this PR → string(src[0]) "È" c3 88 2

(reproduced by scanning the same single-byte "char" payload []byte{200} through Map.Scan in each format.)

That means the PR's headline invariant — "Both wire formats now produce the same result for the same target type" — doesn't hold for bytes 128–255, and the binary path no longer matches the raw-byte semantics a text-format *string scan already has.

Suggested fix

Copy the raw byte instead, mirroring the text path exactly:

p := dst.(*string)
// Copy the raw byte so the result matches the text-format *string scan path
// (string(src)) byte-for-byte. Using string(src[0]) would instead UTF-8 encode
// the byte as a code point, diverging for byte values >= 128.
*p = string(src)

This also lets the len(src) == 0 special case go away, since string([]byte{}) is already "". With it, text and binary agree for a"a", 200"\xc8", 0"\x00", []""; your TestQcharCodecPlanScanString, TestQcharTranscode, the full pgtype suite, gofmt, and vet all stay green.

It'd be worth extending TestQcharCodecPlanScanString with a non-ASCII byte (e.g. 0xC8) asserting text and binary produce identical bytes, so this stays locked down.

One genuinely irreducible case to be aware of (not something to fix here): the NUL char. charout emits an empty string for byte 0, so text yields "" while binary yields "\x00". That's rooted in PostgreSQL's text output, and no conversion choice can reconcile it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Scanning of char into string possible in TextFormat but not in BinaryFormat

2 participants