Skip to content

fix(nhost): correctly handle text column size for varchar vs text types#192

Open
deepshekhardas wants to merge 1 commit into
utopia-php:mainfrom
deepshekhardas:main
Open

fix(nhost): correctly handle text column size for varchar vs text types#192
deepshekhardas wants to merge 1 commit into
utopia-php:mainfrom
deepshekhardas:main

Conversation

@deepshekhardas
Copy link
Copy Markdown

Problem

When importing from Supabase (which uses the NHost source), all text columns are imported with size 10485760 regardless of their actual type:

  • For �archar(n) columns: character_maximum_length is correctly set to
    .
  • For ext columns: both character_maximum_length and character_octet_length are null, so the hardcoded fallback 10485760 is used.
  • Additionally, character_octet_length returns the byte length (e.g., 1020 for �archar(255) in UTF-8), not the character length, so using it as a fallback before character_maximum_length is misleading.

Fix

The fix checks data_type === 'text' separately so that:

  1. �archar(n) columns: use character_maximum_length =

  2. ext columns: uses 10485760 as a reasonable default for unlimited text

  3. character_octet_length is only used for other edge-case types

Related

Fixes appwrite/appwrite#7182

Previously text columns (unlimited length) fell through to 10485760
default, and character_octet_length (byte length) was used as a
secondary fallback which can differ from character_maximum_length
for multi-byte encodings.

Now properly distinguishes text type columns and avoids using
character_octet_length as a substitute for character_maximum_length.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 31, 2026

Greptile Summary

This PR corrects the size parameter for text-like columns in the NHost source's convertColumn method. Previously, the code used character_octet_length (byte length) as an intermediate fallback before the hardcoded 10485760, causing all unbounded text columns to be imported with the correct size coincidentally, but via a misleading path.

  • The new logic short-circuits to 10485760 immediately when data_type === 'text' and character_maximum_length is null, avoiding the byte-length fallback for that type.
  • varchar(n) columns continue to use character_maximum_length correctly as the first priority.
  • The character_octet_length fallback is retained only for other edge-case types that reach the default branch.

Confidence Score: 4/5

The change is targeted and safe to merge; the corrected logic produces the right size value for all standard scalar column types, and the one edge case (text[] arrays) also arrives at the correct final value through the existing fallback.

The fix correctly handles text vs varchar(n) for non-array columns. For text[] arrays, data_type is ARRAY so the new explicit text branch is bypassed, but the fallback still yields 10485760 since character_octet_length is null for array text columns in PostgreSQL. The observable behavior is identical to the intended fix for arrays today, but the asymmetry in code paths could silently diverge if PostgreSQL ever surfaces a non-null character_octet_length for array types.

Only src/Migration/Sources/NHost.php changed; the array-column path in the default branch is worth a second look for the text[] case.

Important Files Changed

Filename Overview
src/Migration/Sources/NHost.php Fixes the size calculation in the default string/object column branch: text columns now correctly use the 10485760 sentinel rather than relying on the (null) character_octet_length. A minor inconsistency remains for text[] arrays, which reach the same final value (10485760) via the fallback path rather than the explicit text branch.

Reviews (1): Last reviewed commit: "fix(nhost): handle text column size corr..." | Re-trigger Greptile

default: $column['column_default'],
array: $isArray,
size: $column['character_maximum_length'] ?? $column['character_octet_length'] ?? 10485760,
size: $column['character_maximum_length'] ?? ($column['data_type'] === 'text' ? 10485760 : ($column['character_octet_length'] ?? 10485760)),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 text[] arrays bypass the explicit text branch

For array columns with an underlying text type (e.g. text[]), $column['data_type'] is 'ARRAY', not 'text'. The switch expression uses str_replace('_', '', $column['udt_name']) (so _texttext) which still falls to default, but inside default the condition $column['data_type'] === 'text' evaluates to false. The code then reaches $column['character_octet_length'] ?? 10485760, and since PostgreSQL also leaves character_octet_length null for text arrays, the final value is still 10485760 — so the observable result is correct today. However, the logic is silently inconsistent: text[] takes a different code path than scalar text and would behave differently if character_octet_length were ever populated for arrays. A safer check would test the underlying udt_name as well, e.g. $column['data_type'] === 'text' || ltrim($column['udt_name'] ?? '', '_') === 'text'.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

🐛 Bug Report: ​Importing from supabase

1 participant