Force-categorical toggle for numeric metadata columns (cluster IDs) #35

@diegospa99

Goal

Allow the user to treat a numeric metadata column as categorical (per-value editable colors + legend) instead of continuous (gradient + lo/hi sliders).

Why this matters

Many Seurat / single-cell metadata columns are integer-coded categorical variables:

  • Cluster IDs (`seurat_clusters`, `*_snn_res.X.X`, `neighborhoodPCA.SNN.clusters`, etc.)
  • Phenotype codes
  • Bin assignments

Today the frontend auto-routes based on dtype: strings → categorical palette; ints/floats → continuous viridis. As a result, cluster IDs (which arrive as integers from `fwrite(..., row.names=TRUE)` on a Seurat `@meta.data`) are rendered as a viridis gradient with lo/hi sliders. To pick out cluster 4 you have to set lo=4, hi=4, which is fragile and loses the categorical editable-palette UX.
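To make the routing rule concrete, here is a minimal sketch of dtype-based routing as I understand it from the outside. It is not the frontend's actual code: the function name `infer_color_mode` and the sample values are made up for illustration, and the real parser may treat empty cells or mixed columns differently.

```python
from typing import Sequence


def infer_color_mode(values: Sequence[str]) -> str:
    """Return 'continuous' if every non-empty cell parses as a number,
    otherwise 'categorical'. Hypothetical stand-in for the frontend rule."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        # Assumption: an all-empty column falls back to categorical.
        return "categorical"
    for v in non_empty:
        try:
            float(v)
        except ValueError:
            return "categorical"
    return "continuous"


# Made-up sample cells as they would appear in the exported CSV.
cluster_column = ["0", "4", "4", "12"]   # integer-coded cluster IDs
string_column = ["typeA", "typeB", "typeA"]

print(infer_color_mode(cluster_column))  # continuous
print(infer_color_mode(string_column))   # categorical
```

Under a rule like this there is no way for an integer-coded column to reach the categorical branch, which is the gap this issue is about.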

Symptoms / repro

  1. Export `@meta.data` from a Seurat object as a CSV via `ExportMetaDataforTissuePlex.R` or equivalent
  2. Drop into the dataset folder; reload
  3. Color cells by a cluster column (e.g., `neighborhoodPCA.SNN.clusters`) → continuous viridis
  4. Color cells by `Type.6` (string column) → discrete legend with editable per-category colors ✓

Same problem hits any int-coded categorical: `Xenium_snn_res.0.3`, `*_snn_res.0.6`, etc.

Workaround in use today

Post-process the CSV after `fwrite` to prefix the cluster IDs with a letter so they're string-typed (`4` → `N4`). This works, but it loses numeric ordering (e.g., for sorting in the legend) and requires a custom script per export.
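For reference, the workaround amounts to something like the sketch below. This is an illustrative Python version, not the script actually in use; `prefix_column`, the `cell_id` header, and the sample rows are assumptions.

```python
import csv
import io


def prefix_column(csv_text: str, column: str, prefix: str = "N") -> str:
    """Prefix every non-empty cell of `column` so it is read as a string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if column not in fieldnames:
        raise KeyError(f"column not found: {column}")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row[column]:  # leave empty or missing cells untouched
            row[column] = prefix + row[column]
        writer.writerow(row)
    return out.getvalue()


# Made-up two-row export; the real file has many more columns.
exported = "cell_id,neighborhoodPCA.SNN.clusters\nc1,4\nc2,10\nc3,2\n"
print(prefix_column(exported, "neighborhoodPCA.SNN.clusters"))

# The ordering problem: string sort puts N10 before N2.
print(sorted(["N4", "N10", "N2"]))  # ['N10', 'N2', 'N4']
```

The last line shows why the prefix trick is unsatisfying: once the IDs are strings, a plain sort no longer follows cluster number.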

Proposed UI

In the "Color cells" panel, when a numeric column is selected, show a small "Treat as categorical" toggle alongside the colormap dropdown.

  • Off (default): current behavior — continuous gradient + lo/hi sliders
  • On: switch to categorical UI — distinct color per unique value, editable per-value palette, legend with checkbox visibility per category
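A rough sketch of how the toggle could feed into mode selection and the legend state. Everything here is hypothetical: `resolve_color_mode`, `build_legend`, and the default colors are placeholders for illustration, not names or values from the codebase.

```python
from typing import Dict, Iterable, List

# Hypothetical default palette; the viewer's real categorical palette
# is not specified in this issue.
DEFAULT_COLORS: List[str] = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"]


def resolve_color_mode(is_numeric: bool, treat_as_categorical: bool) -> str:
    """String columns stay categorical; numeric columns follow the toggle."""
    if not is_numeric:
        return "categorical"
    return "categorical" if treat_as_categorical else "continuous"


def build_legend(values: Iterable[float],
                 colors: List[str] = DEFAULT_COLORS) -> Dict[float, Dict[str, object]]:
    """One legend entry per unique value, each with an editable color
    and a visibility flag (the per-category checkbox)."""
    legend: Dict[float, Dict[str, object]] = {}
    for i, value in enumerate(sorted(set(values))):
        legend[value] = {"color": colors[i % len(colors)], "visible": True}
    return legend


clusters = [4, 0, 4, 2, 0]  # made-up cluster IDs
print(resolve_color_mode(is_numeric=True, treat_as_categorical=False))  # continuous
print(resolve_color_mode(is_numeric=True, treat_as_categorical=True))   # categorical

legend = build_legend(clusters)
legend[4]["color"] = "#000000"   # user edits the color for cluster 4
legend[0]["visible"] = False     # user unchecks cluster 0
print(legend)
```

With the toggle off, nothing changes from today's behavior, so the default stays backward compatible.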

Optional behavior:

  • Auto-suggest "Treat as categorical" when the column has fewer than N (e.g., 50) unique integer values
  • Preserve numeric sort order in the legend (cluster 1, 2, 3, …) rather than alphabetical / hash order
  • Persist the choice per-column within a session
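The three optional behaviors could look roughly like this. It is a sketch under assumptions: the function names, the `session_overrides` dict, and the threshold default of 50 (taken from the example above) are all illustrative rather than a proposed API.

```python
from typing import Dict, List, Sequence


def should_suggest_categorical(values: Sequence[float], max_unique: int = 50) -> bool:
    """Suggest the toggle when the column has fewer than `max_unique`
    unique values and all of them are whole numbers."""
    unique = set(values)
    if not unique or len(unique) >= max_unique:
        return False
    return all(float(v).is_integer() for v in unique)


def legend_order(raw_values: Sequence[str]) -> List[str]:
    """Order legend entries by numeric value rather than alphabetically."""
    return sorted(set(raw_values), key=float)


# Hypothetical in-memory store: column name -> toggle state for this session.
session_overrides: Dict[str, bool] = {}


def set_override(column: str, treat_as_categorical: bool) -> None:
    session_overrides[column] = treat_as_categorical


def get_override(column: str) -> bool:
    # Default is off, i.e. the current continuous behavior.
    return session_overrides.get(column, False)


print(should_suggest_categorical([0, 1, 2, 4, 4]))   # True
print(should_suggest_categorical([0.13, 2.7, 5.0]))  # False
print(legend_order(["10", "2", "1", "2"]))           # ['1', '2', '10']
print(sorted({"10", "2", "1"}))                      # ['1', '10', '2']

set_override("neighborhoodPCA.SNN.clusters", True)
print(get_override("neighborhoodPCA.SNN.clusters"))  # True
print(get_override("Xenium_snn_res.0.3"))            # False
```

The suggestion is only a hint; the user's explicit toggle state should win whenever one has been stored for the column.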

Out of scope

  • Per-value reordering in legend (separate enhancement)
  • Numeric → binned categorical ("discretize a continuous variable") — different feature

Use case driving this

Manuscript-level visual audit of LR mechanisms in neighborhood-specific tissue regions. Need to color cells by the lab's `neighborhoodPCA.SNN.clusters` column to find N4 (cluster 4 = fibrotic-mimetic) patches and overlay LR-mechanism connectivity. The continuous gradient makes this much harder than it needs to be.
