20 changes: 20 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,25 @@
# Changelog

### v1.10.1

#### Features

- Add `dbt_sqlserver_enable_safe_type_expansion` behaviour flag to allow safe column type widening during schema expansion: `varchar` → `nvarchar`, integer family promotions (`bit` → `tinyint` → `smallint` → `int` → `bigint`), and `numeric`/`decimal` precision/scale upgrades. Gated by the per-model `column_type_expansion_max_rows` config (default 1,000,000 rows). See [#699](https://github.com/dbt-msft/dbt-sqlserver/issues/699).
- Add `prefer_single_alter_column` model config to use a single `ALTER COLUMN` statement instead of the add+update+drop+rename pattern when altering column types on tables.
- Add `string_type_instance()` to preserve the NVARCHAR/NCHAR type family during column expansion, fixing incorrect promotion of NVARCHAR/NCHAR to VARCHAR.
- Add `tinyint` and `bit` to the `is_integer()` type list for correct type detection.

#### Bugfixes

- Fix catalog generation for NVARCHAR/NCHAR columns: use `user_type_id` instead of `system_type_id` in catalog.sql, preventing them from appearing as `SYSNAME` in `dbt docs`. [#637](https://github.com/dbt-msft/dbt-sqlserver/issues/637)
- Fix `is_numeric()` to exclude `money`/`smallmoney` (these are now covered by the new `is_fixed_numeric()`), preventing incorrect type expansion for fixed-precision money types.
- Fix seed table ingestion of empty numeric cells by inlining `null` literals instead of binding parameters. [#425](https://github.com/dbt-msft/dbt-sqlserver/issues/425)
- Fix integer-to-numeric safe expansion to require sufficient precision (e.g. `int` → `numeric(10,0)` minimum), avoiding data-loss risk.
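
The minimum precisions behind the integer-to-numeric rule can be derived from the value ranges of the SQL Server integer types. The following is a minimal sketch, not the adapter's code: `INT_RANGES`, `min_numeric_precision` and `int_fits_numeric` are hypothetical names, and subtracting the scale is an added assumption (the check in this change compares precision only).

```python
# Illustrative only: derive the number of decimal digits needed to hold
# every value of a SQL Server integer type.
INT_RANGES = {
    "bit": (0, 1),
    "tinyint": (0, 255),
    "smallint": (-2**15, 2**15 - 1),
    "int": (-2**31, 2**31 - 1),
    "bigint": (-2**63, 2**63 - 1),
}


def min_numeric_precision(int_type: str) -> int:
    """Digits needed for the widest value (by magnitude) of int_type."""
    low, high = INT_RANGES[int_type.lower()]
    return max(len(str(abs(low))), len(str(abs(high))))


def int_fits_numeric(int_type: str, precision: int, scale: int = 0) -> bool:
    """True when numeric(precision, scale) keeps enough integer digits.

    Assumption: digits reserved for the scale are not available for the
    integer part, so they are subtracted from the precision.
    """
    return precision - scale >= min_numeric_precision(int_type)


for name in INT_RANGES:
    print(name, min_numeric_precision(name))
```

Under these assumptions `int` needs at least `numeric(10,0)`, which agrees with the example in the bullet above.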

#### Migration note

- `money` and `smallmoney` columns are no longer classified as `is_numeric()`. If you have custom code or macros that depend on `money` being numeric, use `is_number()` (which covers all numeric types) or `is_fixed_numeric()` for money types specifically.
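
To make the migration note concrete, here is a small stand-in for the type predicates after this change. It is a sketch that assumes the classification described above; `ColumnSketch` is a hypothetical class, not `SQLServerColumn`, and its integer list is reduced to the SQL Server integer family.

```python
from dataclasses import dataclass


@dataclass
class ColumnSketch:
    """Hypothetical stand-in that only models the type predicates."""

    dtype: str

    def is_integer(self) -> bool:
        return self.dtype.lower() in {"bit", "tinyint", "smallint", "int", "bigint"}

    def is_numeric(self) -> bool:
        # money/smallmoney are no longer listed here
        return self.dtype.lower() in {"numeric", "decimal"}

    def is_fixed_numeric(self) -> bool:
        return self.dtype.lower() in {"money", "smallmoney"}

    def is_float(self) -> bool:
        return self.dtype.lower() in {"float", "real"}

    def is_number(self) -> bool:
        # covers every numeric family, including the money types
        return any(
            [self.is_integer(), self.is_numeric(), self.is_float(), self.is_fixed_numeric()]
        )


money = ColumnSketch("money")
print(money.is_numeric(), money.is_fixed_numeric(), money.is_number())
```

Code that previously branched on `is_numeric()` for money columns would, under this sketch, need to branch on `is_number()` or `is_fixed_numeric()` instead.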

### v1.10.0

#### Features
38 changes: 38 additions & 0 deletions README.md
@@ -148,6 +148,44 @@ vars:

*(default: `pyodbc`)* Set to `mssql-python` in a profile target to use the `mssql-python` backend instead of `pyodbc`. The adapter fails if the required backend package (Python dependency), such as `pyodbc` or `mssql-python`, is not installed.

### `dbt_sqlserver_enable_safe_type_expansion`

*(default: `false`)* When enabled, allows the adapter to widen column types during incremental model schema expansion beyond same-family string resizes. Supported safe expansions include:

- **Cross-family string**: `varchar`/`char` → `nvarchar`/`nchar` (same or larger size)
- **Integer family**: `bit` → `tinyint` → `smallint` → `int` → `bigint`
- **Integer → numeric**: `int` → `numeric` (with sufficient precision to hold the integer range)
- **Numeric precision/scale**: `numeric(p,s)` → `numeric(p2,s2)` where neither precision nor scale decreases and at least one of them increases
- **Fixed-money**: `smallmoney` → `money`, `money` → `numeric` (with sufficient precision)

Safe expansions are further gated by `column_type_expansion_max_rows` (default 1,000,000 rows) to avoid long-running operations on large tables.

```yaml
# dbt_project.yml
flags:
  dbt_sqlserver_enable_safe_type_expansion: true
```
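
The integer-family and cross-family string rules listed above can be sketched as two small predicates. This is an illustration that follows the rules as documented here, not the adapter's implementation; the function names are hypothetical.

```python
# Ordered from narrowest to widest, as listed in the README.
INT_FAMILY = ("bit", "tinyint", "smallint", "int", "bigint")


def is_integer_promotion(current: str, wanted: str) -> bool:
    """True when `wanted` sits strictly later in the integer family."""
    current, wanted = current.lower(), wanted.lower()
    if current not in INT_FAMILY or wanted not in INT_FAMILY:
        return False
    return INT_FAMILY.index(wanted) > INT_FAMILY.index(current)


def is_unicode_promotion(current: str, current_size: int, wanted: str, wanted_size: int) -> bool:
    """True for varchar/char -> nvarchar/nchar of the same or a larger size."""
    return (
        current.lower() in ("varchar", "char")
        and wanted.lower() in ("nvarchar", "nchar")
        and wanted_size >= current_size
    )


print(is_integer_promotion("smallint", "bigint"))
print(is_unicode_promotion("varchar", 50, "nvarchar", 50))
```

Narrowing (for example `bigint` to `int`, or `nvarchar` to `varchar`) is rejected by both predicates.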

### `column_type_expansion_max_rows`

*(default: `1000000`)* Per-model config that limits when safe type expansion runs. When the target table has more rows than this limit, safe type expansion is skipped (basic same-family string resizes still proceed). Set to `0` to always skip safe type expansion, or to `-1` to disable the row-count check entirely.

```sql
-- In an incremental model
{{ config(materialized='incremental', unique_key='id',
          column_type_expansion_max_rows=500000) }}
```
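
The row-count gate can be summarised as a small decision function. This is a sketch under the assumption that it mirrors `expand_column_types` in this change (including the `0` case, which always skips safe expansion); `safe_expansion_allowed` and `count_rows` are hypothetical names.

```python
from typing import Callable


def safe_expansion_allowed(
    flag_enabled: bool, max_rows: int, count_rows: Callable[[], int]
) -> bool:
    """Decide whether safe type expansion may run for one target table.

    count_rows is only called when a row count is actually needed, so the
    -1 and 0 settings never trigger a count query.
    """
    if not flag_enabled:
        return False  # behaviour flag off: only same-family resizes run
    if max_rows == -1:
        return True  # check disabled
    if max_rows == 0:
        return False  # safe expansion always skipped
    return count_rows() <= max_rows


print(safe_expansion_allowed(True, 500000, lambda: 600000))
print(safe_expansion_allowed(True, -1, lambda: 600000))
```

A table with exactly `max_rows` rows is still expanded; only tables strictly above the limit are skipped.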

### `prefer_single_alter_column`

*(default: `false`)* Model-level config that controls how `alter_column_type` changes column types on tables. When `false` (the default), the adapter uses the safer approach: add a temporary column, copy the data, drop the original column, and rename the temporary one. When `true`, the adapter issues a single `ALTER COLUMN` statement, which is faster on small and medium-sized tables and instant for safe type expansions, but may fail for types that cannot be implicitly converted.

```sql
-- In an incremental model
{{ config(materialized='incremental', unique_key='id',
          prefer_single_alter_column=true) }}
```
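
To illustrate the difference between the two strategies, the sketch below builds the kind of T-SQL each one would need. It is an assumption about the shape of the statements, not the SQL emitted by the adapter's `alter_column_type` macro (which is not part of this change); `alter_statements` and the temporary column name are hypothetical.

```python
from typing import List


def alter_statements(table: str, column: str, new_type: str, single: bool) -> List[str]:
    """Return illustrative T-SQL for changing a column's type."""
    if single:
        # One statement; relies on SQL Server converting the data in place.
        return [f"ALTER TABLE {table} ALTER COLUMN [{column}] {new_type};"]

    tmp = f"{column}__dbt_alter"  # hypothetical temporary column name
    return [
        f"ALTER TABLE {table} ADD [{tmp}] {new_type} NULL;",
        f"UPDATE {table} SET [{tmp}] = [{column}];",
        f"ALTER TABLE {table} DROP COLUMN [{column}];",
        f"EXEC sp_rename '{table}.{tmp}', '{column}', 'COLUMN';",
    ]


for statement in alter_statements("dbo.orders", "amount", "bigint", single=False):
    print(statement)
```

The four-step variant rewrites every row through the `UPDATE`, which is why the single-statement option can be attractive when the conversion is known to be safe.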

## Contributing

[![Unit tests](https://github.com/dbt-msft/dbt-sqlserver/actions/workflows/unit-tests.yml/badge.svg)](https://github.com/dbt-msft/dbt-sqlserver/actions/workflows/unit-tests.yml)
118 changes: 117 additions & 1 deletion dbt/adapters/sqlserver/sqlserver_adapter.py
@@ -15,14 +15,17 @@
from dbt.adapters.base.meta import available
from dbt.adapters.base.relation import BaseRelation
from dbt.adapters.capability import Capability, CapabilityDict, CapabilitySupport, Support
-from dbt.adapters.events.types import SchemaCreation
+from dbt.adapters.events.logging import AdapterLogger
+from dbt.adapters.events.types import ColTypeChange, SchemaCreation
from dbt.adapters.reference_keys import _make_ref_key_dict
from dbt.adapters.sql.impl import CREATE_SCHEMA_MACRO_NAME, SQLAdapter
from dbt.adapters.sqlserver.sqlserver_column import SQLServerColumn, SQLServerColumnNative
from dbt.adapters.sqlserver.sqlserver_configs import SQLServerConfigs
from dbt.adapters.sqlserver.sqlserver_connections import SQLServerConnectionManager
from dbt.adapters.sqlserver.sqlserver_relation import SQLServerRelation

+logger = AdapterLogger("SQLServer")


class SQLServerAdapter(SQLAdapter):
"""
@@ -99,6 +102,16 @@ def _behavior_flags(self) -> List[BehaviorFlag]:
"The new behaviour is intended to become the default in a future release."
),
},
+            {
+                "name": "dbt_sqlserver_enable_safe_type_expansion",
+                "default": False,
+                "description": (
+                    "Allow the SQL Server adapter to widen column types during schema expansion. "
+                    "This enables promotions like varchar -> nvarchar, "
+                    "bit -> tinyint -> smallint -> int -> bigint, "
+                    "and numeric(p,s) -> numeric(p2,s2) using alter column."
+                ),
+            },
]

@available.parse(lambda *a, **k: [])
@@ -288,6 +301,109 @@ def render_model_constraint(cls, constraint: ModelLevelConstraint) -> Optional[s
else:
return None

+    def _get_row_count(self, relation) -> int:
+        """Return the number of rows in the given relation."""
+        sql = f"SELECT COUNT_BIG(*) FROM {relation}"
+        _, cursor = self.connections.add_select_query(sql)
+        row = cursor.fetchone()
+        return int(row[0]) if row else 0

+    def expand_column_types(self, goal, current, max_rows: int = 1000000):
+        """Override to ensure we preserve nvarchar/nchar type family during
+        column expansion. Necessary same-family resizes (e.g. varchar size)
+        always proceed. Safe type expansions (cross-family promotions like
+        varchar -> nvarchar) are guarded by column_type_expansion_max_rows.
+        enable_safe_type_expansion is the future approach for widening."""
+
+        reference_columns = {c.name: c for c in self.get_columns_in_relation(goal)}
+        target_columns = {c.name: c for c in self.get_columns_in_relation(current)}
+
+        enable_safe = self.behavior.dbt_sqlserver_enable_safe_type_expansion

+        row_count_exceeds = False
+        if enable_safe and max_rows != -1:
+            if max_rows == 0:
+                row_count_exceeds = True
+                # f-strings avoid relying on the logger's placeholder style
+                logger.info(
+                    f"Safe type expansion skipped for {current}: "
+                    "column_type_expansion_max_rows is 0."
+                )
+            else:
+                row_count = self._get_row_count(current)
+                if row_count > max_rows:
+                    row_count_exceeds = True
+                    logger.warning(
+                        f"Safe type expansion skipped for {current}: "
+                        f"row count {row_count} exceeds "
+                        f"column_type_expansion_max_rows ({max_rows}). "
+                        "Set column_type_expansion_max_rows=-1 to disable "
+                        "this check, or increase the limit."
+                    )

+        for column_name, reference_column in reference_columns.items():
+            target_column = target_columns.get(column_name)
+            if target_column is None:
+                continue
+
+            if target_column.can_expand_to(reference_column):
+                pass
+            elif (
+                enable_safe
+                and not row_count_exceeds
+                and target_column.can_expand_safe(reference_column)
+            ):
+                pass
+            else:
+                continue
+
+            if reference_column.is_string():
+                col_string_size = reference_column.string_size()
+                new_type = reference_column.string_type_instance(col_string_size)
+            else:
+                new_type = reference_column.data_type
+            fire_event(
+                ColTypeChange(
+                    orig_type=target_column.data_type,
+                    new_type=new_type,
+                    table=_make_ref_key_dict(current),
+                )
+            )
+            self.alter_column_type(current, column_name, new_type)

+    @available.parse_none
+    def expand_target_column_types(
+        self, from_relation: BaseRelation, to_relation: BaseRelation, max_rows: int = 1000000
+    ) -> None:
+        if not isinstance(from_relation, self.Relation):
+            from dbt.adapters.base.impl import MacroArgTypeError
+
+            raise MacroArgTypeError(
+                method_name="expand_target_column_types",
+                arg_name="from_relation",
+                got_value=from_relation,
+                expected_type=self.Relation,
+            )
+        if not isinstance(to_relation, self.Relation):
+            from dbt.adapters.base.impl import MacroArgTypeError
+
+            raise MacroArgTypeError(
+                method_name="expand_target_column_types",
+                arg_name="to_relation",
+                got_value=to_relation,
+                expected_type=self.Relation,
+            )
+        self.expand_column_types(from_relation, to_relation, max_rows)

+    def alter_column_type(self, relation, column_name, new_column_type):
+        kwargs = {
+            "relation": relation,
+            "column_name": column_name,
+            "new_column_type": new_column_type,
+        }
+        self.execute_macro("alter_column_type", kwargs=kwargs)


COLUMNS_EQUAL_SQL = """
with diff_count as (
94 changes: 85 additions & 9 deletions dbt/adapters/sqlserver/sqlserver_column.py
@@ -37,6 +37,23 @@ class SQLServerColumn(Column):

     @classmethod
     def string_type(cls, size: int) -> str:
+        """Class-level string_type used by SQLAdapter.expand_column_types.
+
+        Return a VARCHAR default for the SQLAdapter path; this keeps behaviour
+        consistent with the rest of dbt where class-level string_type is
+        generic and not instance-aware.
+        """
+        return f"varchar({size if size > 0 else '8000'})"
+
+    def string_type_instance(self, size: int) -> str:
+        """Instance-level string type selection that respects NVARCHAR/NCHAR."""
+        dtype = (self.dtype or "").lower()
+        if dtype == "nvarchar":
+            return f"nvarchar({size if size > 0 else '4000'})"
+        if dtype == "nchar":
+            return f"nchar({size if size > 0 else '1'})"
+        if dtype == "char":
+            return f"char({size if size > 0 else '1'})"
         return f"varchar({size if size > 0 else '8000'})"

def literal(self, value: Any) -> str:
@@ -48,42 +65,47 @@ def data_type(self) -> str:
if self.dtype.lower() == "datetime2":
return "datetime2(6)"
if self.is_string():
-            return self.string_type(self.string_size())
+            return self.string_type_instance(self.string_size())
elif self.is_numeric():
return self.numeric_type(self.dtype, self.numeric_precision, self.numeric_scale)
else:
return self.dtype

def is_string(self) -> bool:
-        return self.dtype.lower() in ["varchar", "char"]
+        return self.dtype.lower() in ["varchar", "char", "nvarchar", "nchar"]

def is_number(self):
-        return any([self.is_integer(), self.is_numeric(), self.is_float()])
+        return any(
+            [self.is_integer(), self.is_numeric(), self.is_float(), self.is_fixed_numeric()]
+        )

def is_float(self):
return self.dtype.lower() in ["float", "real"]

def is_integer(self) -> bool:
return self.dtype.lower() in [
# real types
"smallint",
"integer",
"bigint",
"smallserial",
"serial",
"bigserial",
# aliases
"int2",
"int4",
"int8",
"serial2",
"serial4",
"serial8",
"int",
"tinyint",
"bit",
]

def is_numeric(self) -> bool:
-        return self.dtype.lower() in ["numeric", "decimal", "money", "smallmoney"]
+        return self.dtype.lower() in ["numeric", "decimal"]
+
+    def is_fixed_numeric(self) -> bool:
+        return self.dtype.lower() in ["money", "smallmoney"]

def string_size(self) -> int:
if not self.is_string():
@@ -93,10 +115,64 @@ def string_size(self) -> int:
else:
return int(self.char_size)

-    def can_expand_to(self, other_column: "SQLServerColumn") -> bool:
-        if not self.is_string() or not other_column.is_string():
+    def can_expand_to(self, other_column: "Column") -> bool:
+        self_dtype = self.dtype.lower()
+        other_dtype = other_column.dtype.lower()
+        if self.is_string() and other_column.is_string():
+            self_size = self.string_size()
+            other_size = other_column.string_size()
+            if other_size > self_size and self_dtype == other_dtype:
+                return True
+        return False

+    def can_expand_safe(self, other_column: "SQLServerColumn") -> bool:
+        self_dtype = self.dtype.lower()
+        other_dtype = other_column.dtype.lower()
+
+        if self.is_string() and other_column.is_string():
+            self_size = self.string_size()
+            other_size = other_column.string_size()
+            if self_dtype in ("varchar", "char") and other_dtype in ("nvarchar", "nchar"):
+                return other_size >= self_size
             return False
-        return other_column.string_size() > self.string_size()
+
+        if not self.is_number() or not other_column.is_number():
+            return False
+
+        int_family = ("bit", "tinyint", "smallint", "int", "bigint")
+        if self_dtype in int_family and other_dtype in int_family:
+            return int_family.index(other_dtype) > int_family.index(self_dtype)
+
+        self_prec = int(self.numeric_precision or 0)
+        other_prec = int(other_column.numeric_precision or 0)
+
+        if self.is_integer() and other_column.is_numeric():
+            minimum_int_precision: int
+            if self_dtype in ("tinyint",):
+                minimum_int_precision = 3
+            elif self_dtype in ("smallint", "int2"):
+                minimum_int_precision = 5
+            elif self_dtype in ("bigint", "int8", "bigserial", "serial8"):
+                minimum_int_precision = 19
+            elif self_dtype in ("bit",):
+                minimum_int_precision = 1
+            else:
+                minimum_int_precision = 10
+            effective_self_prec = max(self_prec, minimum_int_precision)
+            if other_prec >= effective_self_prec:
+                return True
+
+        if (self.is_numeric() or self.is_fixed_numeric()) and (
+            other_column.is_numeric() or other_column.is_fixed_numeric()
+        ):
+            self_scale = int(self.numeric_scale or 0)
+            other_scale = int(other_column.numeric_scale or 0)
+
+            if other_prec >= self_prec and other_scale >= self_scale:
+                if other_prec > self_prec or other_scale > self_scale or self_dtype != other_dtype:
+                    return True
+
+        return False


class SQLServerColumnNative(SQLServerColumn):
2 changes: 2 additions & 0 deletions dbt/adapters/sqlserver/sqlserver_configs.py
@@ -7,3 +7,5 @@
@dataclass
class SQLServerConfigs(AdapterConfig):
     auto_provision_aad_principals: Optional[bool] = False
+    prefer_single_alter_column: Optional[bool] = False
+    column_type_expansion_max_rows: Optional[int] = None
4 changes: 2 additions & 2 deletions dbt/include/sqlserver/macros/adapters/catalog.sql
@@ -96,7 +96,7 @@
c.column_id as column_index,
t.name as column_type
from sys.columns as c {{ information_schema_hints() }}
-left join sys.types as t {{ information_schema_hints() }} on c.system_type_id = t.system_type_id
+left join sys.types as t {{ information_schema_hints() }} on c.user_type_id = t.user_type_id
)

select
@@ -226,7 +226,7 @@
c.column_id as column_index,
t.name as column_type
from sys.columns as c {{ information_schema_hints() }}
-left join sys.types as t on c.system_type_id = t.system_type_id
+left join sys.types as t on c.user_type_id = t.user_type_id
)

select
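
Why the join key in catalog.sql matters: in `sys.types`, the built-in alias type `sysname` shares its `system_type_id` (231) with `nvarchar` but has its own `user_type_id` (256). The sketch below simulates both joins over a tiny in-memory copy of `sys.types`; it is an illustration rather than adapter code, and `SYS_TYPES` and `matching_type_names` are hypothetical names.

```python
from typing import List, Tuple

# (name, system_type_id, user_type_id) for a few built-in types
SYS_TYPES: List[Tuple[str, int, int]] = [
    ("varchar", 167, 167),
    ("nvarchar", 231, 231),
    ("nchar", 239, 239),
    ("sysname", 231, 256),
]


def matching_type_names(column_type_id: int, join_key: str) -> List[str]:
    """Names of the sys.types rows that a join on join_key would match."""
    position = {"system_type_id": 1, "user_type_id": 2}[join_key]
    return [row[0] for row in SYS_TYPES if row[position] == column_type_id]


# An nvarchar column has system_type_id = 231 and user_type_id = 231.
print(matching_type_names(231, "system_type_id"))  # ['nvarchar', 'sysname']
print(matching_type_names(231, "user_type_id"))    # ['nvarchar']
```

Joining on `system_type_id` yields two candidate rows for an `nvarchar` column, so the reported type can end up as `sysname`; joining on `user_type_id` yields exactly one.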