NOVA

An implementation of NOVA (Non-Contrastive Vision-Language Learning with Predictive Embedding Alignment), built on top of stable-pretraining.

NOVA trains a randomly initialized ViT image encoder to predict embeddings from a frozen ClinicalBERT text encoder. It uses MSE alignment to the text anchor plus joint SIGReg regularization over all image-view predictions and the text embedding:

loss = (1 - lambda) * MSE(image_views, text_anchor) + lambda * SIGReg([image_views, text_anchor])
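As a rough illustration of how the two terms combine, here is a minimal NumPy sketch. It is not the code in forwards.py: the function names, the (V, B, D) array layout, the choice to concatenate views and text before regularizing, and the simplified SIGReg stand-in (random 1-D projections scored by a characteristic-function distance to a standard normal) are all assumptions made for this example.

```python
import numpy as np


def mse_alignment(image_views: np.ndarray, text_anchor: np.ndarray) -> float:
    """MSE between every image-view prediction and the text anchor.

    image_views: (V, B, D) predictions for V views; text_anchor: (B, D).
    """
    return float(np.mean((image_views - text_anchor[None, :, :]) ** 2))


def sigreg_sketch(z: np.ndarray, num_slices: int = 32, num_t: int = 17, seed: int = 0) -> float:
    """Simplified stand-in for SIGReg (assumption, not the paper's exact form).

    Projects z (N, D) onto random unit directions and compares the empirical
    characteristic function of each projection with that of N(0, 1).
    """
    rng = np.random.default_rng(seed)
    directions = rng.standard_normal((z.shape[1], num_slices))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)
    proj = z @ directions                      # (N, S)
    t = np.linspace(-3.0, 3.0, num_t)          # (T,) evaluation points
    phase = proj[:, :, None] * t               # (N, S, T)
    ecf_real = np.cos(phase).mean(axis=0)      # (S, T)
    ecf_imag = np.sin(phase).mean(axis=0)      # (S, T)
    target = np.exp(-0.5 * t ** 2)             # characteristic function of N(0, 1)
    sq_err = (ecf_real - target) ** 2 + ecf_imag ** 2
    return float((sq_err * target).mean())     # Gaussian-weighted average


def nova_loss(image_views: np.ndarray, text_anchor: np.ndarray, lam: float = 0.02) -> float:
    """(1 - lam) * MSE + lam * regularizer over views and text stacked together."""
    v, b, d = image_views.shape
    stacked = np.concatenate([image_views.reshape(v * b, d), text_anchor], axis=0)
    return (1.0 - lam) * mse_alignment(image_views, text_anchor) + lam * sigreg_sketch(stacked)
```

With lam=0 the loss reduces to plain MSE alignment. The regularizer term is what penalizes collapsed embeddings, which score far from the Gaussian target.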

The code follows the neighboring LeVLJEPA-release structure, but replaces the learnable text encoder/cross-prediction setup with the paper's frozen ClinicalBERT target stack and MIMIC-style radiology datasets.

Layout

main.py - Hydra/stable-pretraining training entry point
forwards.py - NOVA, InfoNCE, and SigLIP forward/loss functions
callbacks.py - gradient clipping, embedding diagnostics, checkpointing, zero-shot eval
utils/dataset.py - MIMIC/CheXpert/ChestX-ray14 manifest datasets and augmentations
utils/eval.py - binary prompt zero-shot AUC evaluation
configs/ - ViT-S/ViT-B and objective configs

Install

uv sync

For local framework development, install the parent checkout instead:

uv pip install -e ../stable-pretraining

Data Manifests

Training is manifest-driven so protected datasets stay outside the repo. A training manifest (CSV, Parquet, or JSONL) needs at least these columns:

image_path,impression,ViewPosition
p10/p10000032/s50414267/xxx.jpg,"No acute cardiopulmonary abnormality.",PA
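Before launching a run it can help to confirm that a CSV manifest has the expected header. The helper below is a hypothetical standalone check using only the standard library. It is not part of utils/dataset.py, and the name missing_columns is made up for this example.

```python
import csv
import io

# Column names taken from the example manifest above.
REQUIRED_COLUMNS = ("image_path", "impression", "ViewPosition")


def missing_columns(manifest_file, required=REQUIRED_COLUMNS):
    """Return the required column names absent from a CSV manifest's header."""
    reader = csv.DictReader(manifest_file)
    header = reader.fieldnames or []
    return [name for name in required if name not in header]


# Usage with an in-memory manifest; pass open(path, newline="") for a real file.
sample = io.StringIO(
    "image_path,impression,ViewPosition\n"
    'p10/a.jpg,"No acute cardiopulmonary abnormality.",PA\n'
)
print(missing_columns(sample))  # prints []
```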

If impression is missing, set data.report_col to a full radiology report column and the loader extracts the IMPRESSION section.
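To give a sense of what extracting the IMPRESSION section can look like, here is a minimal regex-based sketch. The function name extract_impression and the section-header pattern are assumptions for illustration. The loader in utils/dataset.py may parse reports differently.

```python
import re

# Capture text after "IMPRESSION:" up to the next all-caps section header
# (e.g. "NOTIFICATION:") or the end of the report.
_IMPRESSION_RE = re.compile(
    r"IMPRESSION\s*:(.*?)(?=\n\s*[A-Z][A-Z /]+:|\Z)",
    re.DOTALL,
)


def extract_impression(report: str) -> str:
    """Return the IMPRESSION section with whitespace collapsed, or "" if absent."""
    match = _IMPRESSION_RE.search(report)
    if match is None:
        return ""
    return " ".join(match.group(1).split())
```

Returning an empty string for reports without an IMPRESSION header lets the caller decide whether to drop the sample or fall back to another section.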

Evaluation manifests need an image-path column and one binary label column per class. CheXpert-style uncertain labels (-1) are treated as negative by default.
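The default handling of uncertain labels can be pictured with the small sketch below. The function name binarize_label and the uncertain_as parameter are hypothetical. Mapping every value other than 1 (including blanks parsed as NaN) to 0 is an assumption of this example, not a documented behavior of the repo.

```python
def binarize_label(value: float, uncertain_as: int = 0) -> int:
    """Map a CheXpert-style label to {0, 1}.

    1 -> positive, -1 (uncertain) -> uncertain_as (negative by default),
    anything else -> negative (assumption for this sketch).
    """
    if value == -1:
        return uncertain_as
    return int(value == 1)
```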

Train NOVA

python main.py \
  data.train_manifest=/path/to/mimic_train.csv \
  data.image_root=/path/to/images \
  run_name=nova_vitb

ViT-S:

python main.py model=small run_name=nova_vits

Multi-GPU is handled by Lightning:

python main.py devices=8 batch_size=256

Paper Defaults

The default configs/nova.yaml matches the paper setup:

frozen emilyalsentzer/Bio_ClinicalBERT
ViT-B/16 from scratch
embedding dimension 64
predictor hidden width 2048
2 global crops at 224, 6 local crops at 96
AdamW, learning rate cosine-decayed 1e-4 -> 1e-5
batch size 256, 100 epochs
lambda=0.02, gradient clipping 1.0, bf16 mixed precision
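For reference, a cosine decay between the two learning rates listed above can be written as follows. This is a generic sketch of the standard cosine schedule: the function name is hypothetical, and it assumes no warmup phase, which the actual config may or may not include.

```python
import math


def cosine_lr(step: int, total_steps: int, lr_max: float = 1e-4, lr_min: float = 1e-5) -> float:
    """Cosine decay from lr_max at step 0 to lr_min at total_steps."""
    progress = min(max(step / max(total_steps, 1), 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))
```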

Zero-Shot Evaluation

Add datasets under evals in a config or CLI override. Example:

evals:
  - name: chexpert
    enabled: true
    manifest: /path/to/chexpert_test.csv
    image_root: /path/to/CheXpert-v1.0
    image_col: image_path
    label_cols: [Atelectasis, Cardiomegaly, Edema, Pleural Effusion, Consolidation]
    positive_prompts: [atelectasis, cardiomegaly, edema, pleural effusion, consolidation]
    negative_prompts: [no atelectasis, no cardiomegaly, no edema, no pleural effusion, no consolidation]

The callback reports per-label AUC and macro AUC every eval_every_n_steps training steps.
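The sketch below shows one common way to turn a positive/negative prompt pair into a score and an AUC. It is an assumption about the general recipe, not a copy of utils/eval.py: the function names are hypothetical, and the cosine-similarity difference used as the score may differ from what the repo computes.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def binary_prompt_scores(image_emb: np.ndarray, pos_emb: np.ndarray, neg_emb: np.ndarray) -> np.ndarray:
    """Score = cos(image, positive prompt) - cos(image, negative prompt).

    image_emb: (N, D); pos_emb and neg_emb: (D,) text embeddings for one label.
    """
    img = l2_normalize(image_emb)
    return img @ l2_normalize(pos_emb) - img @ l2_normalize(neg_emb)


def roc_auc(labels, scores) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    if len(pos) == 0 or len(neg) == 0:
        return float("nan")  # undefined when one class is missing
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))
```

Macro AUC is then the mean of the per-label values, skipping labels whose AUC is undefined.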

Baselines

The same frozen ClinicalBERT + ViT stack can train the comparison objectives:

python main.py --config-name infonce
python main.py --config-name siglip
python main.py --config-name medclip

These are intentionally single-crop objectives, matching the paper's distinction from NOVA's multi-crop training.
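For readers unfamiliar with the contrastive baselines, here is a minimal NumPy sketch of the standard symmetric InfoNCE and SigLIP losses on one batch of paired embeddings. The function names and the temperature, scale, and bias defaults are assumptions for this example (common values from the literature), not values read from forwards.py or the configs.

```python
import numpy as np


def _normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def _log_softmax(logits: np.ndarray, axis: int) -> np.ndarray:
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def infonce_loss(img: np.ndarray, txt: np.ndarray, temperature: float = 0.07) -> float:
    """Symmetric InfoNCE: row i of img is paired with row i of txt."""
    logits = _normalize(img) @ _normalize(txt).T / temperature   # (B, B)
    diag = np.arange(len(img))
    image_to_text = -_log_softmax(logits, axis=1)[diag, diag].mean()
    text_to_image = -_log_softmax(logits, axis=0)[diag, diag].mean()
    return float(0.5 * (image_to_text + text_to_image))


def siglip_loss(img: np.ndarray, txt: np.ndarray, scale: float = 10.0, bias: float = -10.0) -> float:
    """Pairwise sigmoid loss: +1 target on the diagonal, -1 elsewhere."""
    n = len(img)
    logits = scale * (_normalize(img) @ _normalize(txt).T) + bias
    signs = 2.0 * np.eye(n) - 1.0
    # -log(sigmoid(x)) == log(1 + exp(-x)) == logaddexp(0, -x)
    return float(np.logaddexp(0.0, -signs * logits).sum() / n)
```

Both losses need negatives from other samples in the batch, whereas the NOVA objective aligns each image only to its own text anchor.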
