Airtable source consistently throwing dlt.common.schema.exceptions.UnboundColumnException using provided pipeline example #679

@jakegibb

Description

dlt version

1.20.0

Describe the problem

I'm trying to load in a minimum viable Airtable base following the guide for Airtable on dltHub. For every Airtable base I've tried to load, I get an error that there is no data in my primary field, despite the fact that this data exists.

Expected behavior

I expected the tables to load without error. Instead, I consistently receive the following, despite confirming that the primary field in Airtable contains data that should be loaded:

<class 'dlt.common.schema.exceptions.UnboundColumnException'>
In schema `airtable_source`: The following columns in table `tasks` did not receive any data during this load:
  - task (marked as non-nullable primary key and must have values)

This can happen if you specify columns manually, for example, using the `merge_key`, `primary_key` or `columns` argument but they do not exist in the data.
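As the message suggests, the exception fires when a column that schema hints declare non-nullable never appears in the extracted rows at all. The check can be illustrated conceptually like this (a sketch, not dlt's actual implementation; the column name "task" is taken from the error above):

```python
# Conceptual sketch of the check behind UnboundColumnException.
# This is an illustration, NOT dlt's actual implementation.
def unbound_columns(rows, columns):
    """Return declared non-nullable columns that no row supplied a value for."""
    declared = {
        name for name, hints in columns.items() if not hints.get("nullable", True)
    }
    seen = {key for row in rows for key, value in row.items() if value is not None}
    return declared - seen

# The Airtable source marks the table's primary field ("task" here) as a
# non-nullable primary key.
schema_columns = {"task": {"nullable": False, "primary_key": True}}
rows = [{"name": "write report"}]  # "task" never appears in the extracted data
print(sorted(unbound_columns(rows, schema_columns)))  # ['task']
```

Note that, under this reading, the column can be "unbound" even when the Airtable base itself has values in every record, for example if the extracted records expose the primary field under a different column name than the hint expects.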

I've made no changes to the provided script, am not specifying columns manually, and have taken the following steps to confirm that dlt should be able to load and normalize the data without issue:

  1. Ensured that there is a non-null value in each primary key field.
  2. Attempted loading multiple bases.
  3. Checked scopes for my PAT to confirm that there are read privileges for "data records" and "schema bases", as well as the base I'm working with.
  4. Created a new minimally viable base to reduce other potential errors.
  5. Added a pipeline.drop() call to confirm the error wasn't caused by a schema left over from a previous run.
  6. Called the Airtable API using pyairtable to confirm that it is returning data in other instances.
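For the last step, a small helper over pyairtable's record shape (`{"id": ..., "fields": {...}}`) can flag records whose primary field is genuinely empty. The field name "task" is specific to this issue's base and is an assumption:

```python
# Helper over pyairtable's record shape ({"id": ..., "fields": {...}}).
# The primary-field name "task" is specific to this issue's base.
def records_missing_primary(records, primary_field="task"):
    """Return ids of records whose primary field is absent or empty."""
    return [r["id"] for r in records if not r.get("fields", {}).get(primary_field)]

# With live data this would be fed from pyairtable, e.g.:
#   records = Api(token).table(base_id, table_id).all()
sample = [
    {"id": "rec1", "fields": {"task": "Write report"}},
    {"id": "rec2", "fields": {}},  # Airtable omits empty fields entirely
]
print(records_missing_primary(sample))  # ['rec2']
```

If this returns an empty list against the live table, the data side looks fine and the hint or extraction side becomes the suspect.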

Steps to reproduce

The error can be reproduced by cloning the following repository and Airtable base and running airtable_pipeline.py. You will need to create your own Airtable personal access token and secrets.toml file.

GitHub Repository: https://github.com/jakegibb/dbt_test_airtable

Airtable Base: https://airtable.com/app4gfva87b0CJcUo/shrRZNnqLfKhIPgB1

from typing import List

import dlt

from airtable import airtable_source


def load_select_tables_from_base_by_id(base_id: str, table_names: List[str]) -> None:
    # configure the pipeline with your destination details
    pipeline = dlt.pipeline(
        pipeline_name="airtable", destination="duckdb", dataset_name="airtable_data"
    )

    airtables = airtable_source(
        base_id=base_id,
        table_names=table_names,
    )

    load_info = pipeline.run(airtables, write_disposition="replace")
    print(load_info)


if __name__ == "__main__":
    pipeline = dlt.pipeline("airtable")
    pipeline.drop()

    load_select_tables_from_base_by_id(
        base_id="app4gfva87b0CJcUo",
        table_names=["tblfmJnnZS78IKmDT", "tblBxrkbWecewTMVJ"],
    )

Operating system

macOS

Runtime environment

Local

Python version

3.12

dlt data source

Airtable (dlt verified source)

dlt destination

DuckDB

Other deployment details

Using uv to manage dependencies and virtual environments.

Additional information

No response

Metadata

Assignees

No one assigned

    Labels

    question (Further information is requested)

    Milestone

    No milestone
