Navigate multi-table datasets with metadata using VisiData. Originally developed for ABCD data, adaptable to any similar structure.
# Install VisiData
pip install visidata
# Open metadata file with config
vd --config abcd-visidatarc dd.parquet
# Navigate to a row and press:
# zo - Open data table
# zO - Open and focus on specific column
# zm - Merge selected columns into one table

| Key | Action | Description |
|---|---|---|
| `zo` | Open table | Opens data table from current metadata row |
| `zO` | Open + focus | Opens table and jumps to specific column |
| `gzo` | Batch open | Opens all selected rows' tables |
| `zi` | Show info | Display table info in status bar |
| `zm` | Merge columns | Merge selected columns into single table |

- Auto key detection: Sets `participant_id`/`subject_id` and `session_id` as join keys
- Format support: Works with both `.tsv` and `.parquet` files
- Smart merging: Replicates static data (no `session_id`) across sessions when merging
- ID mapping: Automatically maps `subject_id` ↔ `participant_id`
- Column type detection: Automatically applies appropriate types based on column name patterns:
  - `_age`, `_hrs`, `_height`, `_weight`, `_bmi`, `_pct`, `_score` → float
  - `_year`, `_count`, `_num`, `grade` → integer
  - `_dt`, `_date` → date (YYYY-MM-DD)
  - `_dtt`, `_datetime` → datetime (ISO 8601)

Customize via environment variables to work with any dataset structure:
# Data directory (default: 'data')
export ABCD_TABLES_DIR=my_data
# Metadata column names (defaults shown)
export ABCD_TABLE_COL=table_name # Column containing table name
export ABCD_COLUMN_COL=name # Column containing column/variable name
export ABCD_ID_COLS_COL=identifier_columns # Column listing join keys
# Default identifier columns (default: 'participant_id|session_id')
# Quote the value: an unquoted | would be parsed by the shell as a pipe
export ABCD_DEFAULT_IDS='subject_id|visit_id'

Example: Custom dataset
export ABCD_TABLES_DIR=raw_data
export ABCD_TABLE_COL=source_table
export ABCD_COLUMN_COL=variable
vd --config abcd-visidatarc metadata.tsv

vd --config abcd-visidatarc dd.parquet
/ # Search for table name
zo # Open the table
vd --config abcd-visidatarc dd.parquet
g/ # Global search for variable
zO # Open table at that column
vd --config abcd-visidatarc dd.parquet
s # Select first row (column to merge)
s # Select second row
s # Select more rows...
zm # Merge into single table
The merge automatically:
- Joins on participant_id and session_id
- Replicates static data (tables without session_id) across all sessions
- Orders columns: participant_id first, session_id second, then data columns
vd --config abcd-visidatarc dd.parquet
s s s # Select tables you want
gzo # Open all selected tables
S # View sheets list
s s # Select data sheets to join
& # Join (VisiData built-in)
vd --config abcd-visidatarc dd.parquet

ln -s "$(pwd)/abcd-visidatarc" ~/.visidatarc

# Add to ~/.bashrc or ~/.zshrc
alias vd-abcd='vd --config /path/to/abcd-visidatarc'
# Usage
vd-abcd dd.parquet

Your metadata file should have columns for:
| Column | Purpose | Example Value |
|---|---|---|
| Table name | Points to data file | demographics |
| Column name | Variable in table | demo_age |
| Identifier cols | Join keys (optional) | `participant_id\|session_id` |
| Label | Description (optional) | Age at visit |
Example metadata row:
table_name: demographics
name: demo_age
identifier_columns: participant_id | session_id
label: Age at visit in years
Corresponding data file: data/demographics.tsv or .parquet
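
The lookup from a metadata row to its data file can be sketched as follows. This is a minimal illustration, not the code in `abcd-visidatarc`: the helper name `find_table_file` is hypothetical, and trying `.parquet` before `.tsv` is an assumption. Only the `ABCD_TABLES_DIR` variable and its `data` default come from this README.

```python
import os
from pathlib import Path
from typing import Optional


def find_table_file(table_name: str, tables_dir: Optional[str] = None) -> Optional[Path]:
    """Return the first existing data file for a table, or None if none is found.

    Hypothetical helper for illustration only.
    """
    # Fall back to ABCD_TABLES_DIR, then to the documented default 'data'.
    base = Path(tables_dir or os.environ.get("ABCD_TABLES_DIR", "data"))
    # Assumption: .parquet is preferred over .tsv when both exist.
    for ext in (".parquet", ".tsv"):
        candidate = base / f"{table_name}{ext}"
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_table_file("demographics")` from the project root would return `data/demographics.tsv` if only the TSV file exists, and `None` if neither format is present.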
Tables are automatically formatted based on column name patterns:
| Pattern | Type | Example | VisiData Type |
|---|---|---|---|
| `*_age`, `*_hrs`, `*_height`, `*_weight`, `*_bmi` | Float | `demo_age` | FloatColumn |
| `*_pct`, `*_percent`, `*_score`, `*_rate`, `*_ratio` | Float | `accuracy_pct` | FloatColumn |
| `*_year`, `*_count`, `*_num`, `grade` | Integer | `birth_year` | IntColumn |
| `*_dt`, `*_date` | Date | `visit_dt` | DateColumn (YYYY-MM-DD) |
| `*_dtt`, `*_datetime` | DateTime | `scan_dtt` | DateColumn (ISO 8601) |
| Everything else | String | `participant_id` | Default |

This enables:
- Proper numeric sorting and aggregation
- Date arithmetic and formatting
- Type-specific operations in VisiData
Example: Opening demographics.tsv automatically formats demo_age as float, allowing you to compute statistics (mean, median) with VisiData's aggregation features.
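
The pattern matching can be sketched in plain Python. The suffix lists are copied from the table above. The function name `guess_type`, the plain-string return values, and the rule that `grade` must match the whole column name are assumptions; the actual configuration may match differently.

```python
# Suffix rules taken from the README table; the matching logic is an assumption.
FLOAT_SUFFIXES = ("_age", "_hrs", "_height", "_weight", "_bmi",
                  "_pct", "_percent", "_score", "_rate", "_ratio")
INT_SUFFIXES = ("_year", "_count", "_num")
DATE_SUFFIXES = ("_dt", "_date")
DATETIME_SUFFIXES = ("_dtt", "_datetime")


def guess_type(column_name: str) -> str:
    """Map a column name to 'float', 'int', 'date', 'datetime', or 'str'."""
    if column_name.endswith(DATETIME_SUFFIXES):
        return "datetime"
    if column_name.endswith(DATE_SUFFIXES):
        return "date"
    if column_name.endswith(FLOAT_SUFFIXES):
        return "float"
    # Assumption: 'grade' is matched as a full column name, not as a suffix.
    if column_name.endswith(INT_SUFFIXES) or column_name == "grade":
        return "int"
    return "str"
```

With these rules, `guess_type("demo_age")` gives `"float"`, `guess_type("scan_dtt")` gives `"datetime"`, and `guess_type("participant_id")` falls through to `"str"`.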
The merge function handles mixed time-varying and static data:
Example:
# demographics.tsv (has session_id)
participant_id session_id demo_age
NDAR001 baseline 10
NDAR001 year_1 11
# static_demo.tsv (no session_id)
participant_id birth_year
NDAR001 2015
# After merge (zm):
participant_id session_id demo_age birth_year
NDAR001 baseline 10 2015 # Replicated
NDAR001 year_1 11 2015 # Replicated
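
The replication behaviour in this example can be reproduced with a short standard-library sketch. It is an illustration of the idea, not the merge code shipped in `abcd-visidatarc`: the function name `merge_tables` is hypothetical, rows are plain dictionaries, session-level tables are combined as an outer join, and at least one input is assumed to contain `session_id`.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def merge_tables(tables: List[List[Row]]) -> List[Row]:
    """Merge row lists on participant_id and, where present, session_id."""
    # Separate session-level tables from static ones (no session_id column).
    varying = [t for t in tables if t and "session_id" in t[0]]
    static = [t for t in tables if t and "session_id" not in t[0]]

    merged: Dict[Tuple[str, str], Row] = {}
    for table in varying:
        for row in table:
            key = (row["participant_id"], row["session_id"])
            # Seeding the dict with the keys keeps them as the first two columns.
            out = merged.setdefault(
                key, {"participant_id": key[0], "session_id": key[1]}
            )
            out.update(row)

    # Copy each participant's static values onto every one of their sessions.
    for table in static:
        by_participant = {row["participant_id"]: row for row in table}
        for (participant, _session), out in merged.items():
            if participant in by_participant:
                out.update(by_participant[participant])
    return list(merged.values())


demographics = [
    {"participant_id": "NDAR001", "session_id": "baseline", "demo_age": "10"},
    {"participant_id": "NDAR001", "session_id": "year_1", "demo_age": "11"},
]
static_demo = [{"participant_id": "NDAR001", "birth_year": "2015"}]

for merged_row in merge_tables([demographics, static_demo]):
    print(merged_row)
```

Both printed rows carry `birth_year` set to `2015`, with `participant_id` and `session_id` as the first two keys, matching the merged table shown above.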
.
├── abcd-visidatarc # Main configuration file
├── dd.parquet # Metadata file
├── data/ # Data tables directory
│ ├── demographics.tsv
│ ├── cognitive.tsv
│ └── ...
└── tests/ # Test suite
    ├── test_merge.py
    └── test_visidata_integration.py
Key binding not working?
- Ensure you're using `--config abcd-visidatarc` or have symlinked to `~/.visidatarc`
- Check you're in the metadata sheet, not a data table

Table not opening?
- Verify `ABCD_TABLES_DIR` points to correct directory
- Check file exists: `ls $ABCD_TABLES_DIR/{table_name}.*`

Merge creates wrong structure?
- Verify your metadata has correct `identifier_columns` values
- Check VisiData status bar for config on startup

Wrong columns being used?
- Set appropriate env vars: `ABCD_TABLE_COL`, `ABCD_COLUMN_COL`, etc.
- Status bar shows current config when VisiData loads

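
When debugging configuration problems outside VisiData, it can help to print the settings that would be in effect. The sketch below uses the variable names and defaults documented in this README. The function name `effective_config` is hypothetical, and splitting `ABCD_DEFAULT_IDS` on `|` is an assumption based on the format of its default value.

```python
import os

# Variable names and defaults as documented in this README.
DEFAULTS = {
    "ABCD_TABLES_DIR": "data",
    "ABCD_TABLE_COL": "table_name",
    "ABCD_COLUMN_COL": "name",
    "ABCD_ID_COLS_COL": "identifier_columns",
    "ABCD_DEFAULT_IDS": "participant_id|session_id",
}


def effective_config(environ=os.environ):
    """Return each setting, using the environment value when one is set."""
    config = {name: environ.get(name, default) for name, default in DEFAULTS.items()}
    # Assumption: identifier columns are pipe-separated, as in the default value.
    config["ABCD_DEFAULT_IDS"] = [
        part.strip() for part in config["ABCD_DEFAULT_IDS"].split("|") if part.strip()
    ]
    return config
```

Running `print(effective_config())` in the same shell session used to launch `vd` shows whether an exported variable is actually visible to child processes.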
Useful keys for this workflow:
| Key | Action |
|---|---|
| `/` `g/` | Search in column / globally |
| `n` | Next search result |
| `s` `u` | Select / unselect row |
| `!` | Mark column as key |
| `&` | Join sheets |
| `S` | View all sheets |
| `F` | Frequency table |
| `Ctrl+S` | Save/export |
| `q` | Quit/close sheet |

See QUICK_REFERENCE.md for complete key bindings reference.
Run the test suite:
python3 tests/test_merge.py
python3 tests/test_visidata_integration.py

- Quick Reference: QUICK_REFERENCE.md - Key bindings cheat sheet
- VisiData Docs: https://www.visidata.org/docs/
- VisiData Tutorial: https://jsvine.github.io/intro-to-visidata/

- VisiData >= 2.0 (`pip install visidata`)
- Python >= 3.7
- Optional: `pyarrow` for better Parquet support

This configuration is provided as-is for navigating multi-table datasets.