Hi,
I'm using ScaleSC to process a large-scale spatial transcriptomics dataset with its GPU support. My data is ~2.7 million cells x 6,175 genes. However, whenever I run PCA, it fails with a CUDA out-of-memory error. I suspect the failure happens at the step where the sparse data is converted to dense. Please see the version details and error message below. Does the team have any insight into how to fix this?
Environment
- ScaleSC version: 0.1.0
- CuPy version: 13.6.0
- GPU: NVIDIA L40S (47.67GB VRAM × 2)
- CUDA: 13.0, Driver: 580.159.03
Dataset
- 2,702,045 cells × 6,157 genes
- Sparse CSR float32
- 3,000 HVGs selected via seurat_v3
Note on initialization
I'm using a custom wrapper to initialize ScaleSC due to a separate shape reader issue on large datasets:
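import scalesc as ssc  # import alias assumed from the ssc.ScaleSC call below

def init_scalesc_with_fix(data_dir, **kwargs):
    # Build the object as usual, then overwrite the reader's shape fields
    # with the shape reported by the underlying AnnData object.
    ssc_obj = ssc.ScaleSC(data_dir=data_dir, **kwargs)
    adata = ssc_obj.reader._get_anndata_obj()
    n_obs, n_vars = adata.shape
    ssc_obj.reader.n_cell = n_obs
    ssc_obj.reader.n_gene = n_vars
    ssc_obj.reader.n_cell_origin = n_obs
    ssc_obj.reader.n_gene_origin = n_vars
    return ssc_obj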
ssc_obj = init_scalesc_with_fix(
    data_dir=SCALESC_DIR,
    preload_on_cpu=True,
    preload_on_gpu=False,
    gpus=[0, 1],
    save_after_each_step=True,
    max_cell_batch=25000,
)
normalize_log1p and highly_variable_genes both complete successfully. The full pipeline also ran without issues on smaller datasets.
Problem
pca() consistently fails with:
MemoryError: std::bad_alloc: out_of_memory: CUDA error at: /gscratch/stf/wz34/envs/ScaleSC/include/rmm/mr/device/cuda_memory_resource.hpp
What I noticed
1 — allocation is always exactly ~17GB
The allocation request is consistently 17383920128B (~17GB) regardless of max_cell_batch value (tested with 100,000, 50,000, 25,000, and 1,000). This suggests the ~17GB is a fixed overhead unrelated to batch size, making max_cell_batch ineffective as a workaround.
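To help narrow down where the ~17GB request comes from, here is a rough back-of-the-envelope sketch that compares the reported request with the size of a few dense arrays built from the shapes above. The candidate shapes and the helper name `dense_bytes` are my own guesses for illustration; I have not confirmed which array, if any, ScaleSC actually allocates at this point.

```python
# Compare the reported allocation with the dense size of a few candidate arrays.
# The candidates are assumptions for illustration, not ScaleSC internals.

REPORTED_REQUEST = 17_383_920_128  # bytes, taken from the error message

N_CELLS = 2_702_045
N_GENES = 6_157
N_HVG = 3_000
MAX_CELL_BATCH = 25_000
FLOAT32_BYTES = 4


def dense_bytes(n_rows, n_cols, itemsize):
    """Size in bytes of a dense n_rows x n_cols array with the given item size."""
    return n_rows * n_cols * itemsize


def to_gib(n_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return n_bytes / 2**30


candidates = {
    "all cells x HVGs (float32)": dense_bytes(N_CELLS, N_HVG, FLOAT32_BYTES),
    "all cells x all genes (float32)": dense_bytes(N_CELLS, N_GENES, FLOAT32_BYTES),
    "one batch x HVGs (float32)": dense_bytes(MAX_CELL_BATCH, N_HVG, FLOAT32_BYTES),
}

print(f"reported request: {REPORTED_REQUEST} B = {to_gib(REPORTED_REQUEST):.2f} GiB")
for label, n_bytes in candidates.items():
    print(f"{label}: {n_bytes} B = {to_gib(n_bytes):.2f} GiB")

# How many float32 values per cell the reported request would correspond to.
print("float32 values per cell:", REPORTED_REQUEST / FLOAT32_BYTES / N_CELLS)
```

If the request lined up with one of these sizes, that would point to a full-matrix densification rather than a per-batch buffer; if not, the allocation is probably something else (for example, a buffer proportional to the number of non-zeros).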
2 — only GPU 0 is used
GPU memory monitoring confirms GPU 1 is completely unused despite gpus=[0, 1]:
GPU 0: free=30.69 GB, total=47.67 GB ← all allocations here
GPU 1: free=47.18 GB, total=47.67 GB ← completely idle
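For reference, numbers like the ones above can be collected with a small CuPy loop along these lines. This is a minimal sketch rather than my exact monitoring script: the helper names `format_mem` and `report_all_gpus` are made up for illustration, and the "GB" figures are computed as bytes / 2**30.

```python
# Minimal per-GPU memory report using CuPy's CUDA runtime bindings.
# Helper names are hypothetical; only the cupy.cuda calls are real API.

try:
    import cupy as cp
except ImportError:  # allows the formatting helper to be used without CuPy
    cp = None


def format_mem(device_id, free_bytes, total_bytes):
    """Format free/total device memory in the same style as the report above."""
    gib = 2**30
    return (
        f"GPU {device_id}: free={free_bytes / gib:.2f} GB, "
        f"total={total_bytes / gib:.2f} GB"
    )


def report_all_gpus():
    """Return one formatted line per visible GPU (requires CuPy and a CUDA device)."""
    if cp is None:
        raise RuntimeError("CuPy is not installed")
    lines = []
    for device_id in range(cp.cuda.runtime.getDeviceCount()):
        # memGetInfo() reports on the current device, so switch to it first.
        with cp.cuda.Device(device_id):
            free_bytes, total_bytes = cp.cuda.runtime.memGetInfo()
        lines.append(format_mem(device_id, free_bytes, total_bytes))
    return lines


# Usage on a GPU machine:
#     for line in report_all_gpus():
#         print(line)
```

Calling `report_all_gpus()` right before and right after `pca()` fails should show whether any memory is ever allocated on GPU 1.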
Any insights or help would be greatly appreciated. Thank you!