Added subcaption, summary and modality label generation scripts #38

Open
saidul-islam98 wants to merge 10 commits into main from subcaption-summary-generation

Conversation

@saidul-islam98 (Collaborator) commented Jan 19, 2026

PR Type

Feature

Short Description

Adds a complete vLLM inference pipeline for Open-PMC-18M, including Python inference scripts, Slurm bash launch scripts, and a README documenting the full 3-stage workflow for subcaption extraction, image-context summary generation, and modality labeling.

Python Scripts Added

  • generate_subcaption_vllm.py
  • generate_summary_vllm.py
  • generate_modality_labels_vllm.py

Slurm Scripts Added

  • run_vllm_subcaption_inference.sh
  • run_vllm_summary_inference.sh
  • run_vllm_modality_inference.sh

Tests Added

None



echo "Module Loaded and Environment Activated!"

# Specify which GPUs to use
CUDA_VISIBLE_DEVICES=0,1 \
Collaborator

Remove the trailing whitespace.

Collaborator Author

updated

import argparse
import re
from tqdm import tqdm
from tqdm.auto import tqdm
Collaborator

Two imports of tqdm.

Collaborator Author

updated

max_new_tokens=args.max_new_tokens,
)

fdf.to_csv(data_path, index=False)
Collaborator

`process_data_batched_vllm` already saves to CSV at the end, so this looks redundant?

Collaborator Author

updated

enc = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

if len(enc) > max_length:
    enc = enc[:max_length]
Collaborator

Since `tokenize=False` above, I believe the output here is text, so `max_length` applies to the character count, not the token count.
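
To illustrate the difference, here is a minimal sketch of truncating by token count instead of by characters. It is not the code in this PR: `truncate_by_tokens`, `toy_encode`, and `toy_decode` are hypothetical names, and the whitespace tokenizer is only a stand-in so the example runs without downloading a model.

```python
from typing import Callable, List


def truncate_by_tokens(
    text: str,
    encode: Callable[[str], List[int]],
    decode: Callable[[List[int]], str],
    max_tokens: int,
) -> str:
    """Return text cut to at most max_tokens tokens (hypothetical helper)."""
    ids = encode(text)
    if len(ids) <= max_tokens:
        return text
    return decode(ids[:max_tokens])


# Toy stand-in tokenizer (assumption): one token per whitespace-separated word.
_vocab: List[str] = []


def toy_encode(text: str) -> List[int]:
    ids = []
    for word in text.split():
        if word not in _vocab:
            _vocab.append(word)
        ids.append(_vocab.index(word))
    return ids


def toy_decode(ids: List[int]) -> str:
    return " ".join(_vocab[i] for i in ids)


prompt = "describe the modality of this subfigure image"
char_cut = prompt[:5]  # slicing a string counts characters
token_cut = truncate_by_tokens(prompt, toy_encode, toy_decode, 5)  # counts tokens
```

With a Hugging Face tokenizer, `encode` could be `lambda s: tokenizer.encode(s, add_special_tokens=False)` and `decode` could be `tokenizer.decode`. Note that cutting the end of a chat-templated prompt also drops the generation prompt appended by `add_generation_prompt=True`, so truncating the user content before applying the template may be the safer choice.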

Collaborator Author

updated


* **Stage 1 (Subcaption extraction, VLM):** `Qwen2.5-VL-32B-Instruct` generates a *verbatim* subfigure caption from a full figure caption + subfigure image.
* **Stage 2 (Context summary, LLM):** `Qwen2.5-14B-Instruct` generates a focused summary of the context passage relevant to the subcaption.
* **Stage 2 (Modality Labeling, VLM):** `Qwen2.5-VL-32B-Instruct` generates L2 labels, then L1 and L0 labels are inferred from a predefined set based on the generated L2 label.
Collaborator

This should be Stage 3.

Collaborator Author

updated
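
For readers of the README excerpt above, the modality-labeling step (generate an L2 label, then infer L1 and L0 from a predefined set) can be sketched as a plain lookup. This is an assumption about the general shape only: the taxonomy entries, `L2_TO_PARENTS`, `normalize`, and `infer_parent_labels` are all hypothetical and are not taken from the PR.

```python
from typing import Dict, Tuple

# Hypothetical taxonomy, invented for illustration: L2 label -> (L1 label, L0 label).
L2_TO_PARENTS: Dict[str, Tuple[str, str]] = {
    "ct scan": ("radiology", "medical image"),
    "bar chart": ("chart", "illustration"),
}


def normalize(label: str) -> str:
    """Lowercase the label and collapse runs of whitespace."""
    return " ".join(label.strip().lower().split())


def infer_parent_labels(
    l2_label: str,
    default: Tuple[str, str] = ("unknown", "unknown"),
) -> Tuple[str, str]:
    """Look up (L1, L0) for a generated L2 label, falling back to a default."""
    return L2_TO_PARENTS.get(normalize(l2_label), default)
```

Keeping the hierarchy in a fixed table means only the L2 label depends on model output; a generated label that falls outside the predefined set surfaces as the default instead of producing an inconsistent L1/L0 pair.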

@afkanpour requested a review from Negiiiin on May 14, 2026, 20:46
@afkanpour (Collaborator) left a comment

:lgtm:

@afkanpour partially reviewed 12 files, made 2 comments, and resolved 3 discussions.
Reviewable status: 7 of 12 files reviewed, 3 unresolved discussions (waiting on Negiiiin and saidul-islam98).


.DS_Store line 0 at r1:
Please delete all .DS_Store files.
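
One possible way to locate the stray files is sketched below, assuming a local checkout. `find_ds_store` and `delete_ds_store` are hypothetical helper names, and files already tracked by git would still need to be removed from the index and ignored via `.gitignore`.

```python
from pathlib import Path
from typing import List


def find_ds_store(root: str) -> List[Path]:
    """Return every .DS_Store file under root, sorted for stable output."""
    return sorted(Path(root).rglob(".DS_Store"))


def delete_ds_store(root: str, dry_run: bool = True) -> List[Path]:
    """List the .DS_Store files under root and delete them unless dry_run is set."""
    found = find_ds_store(root)
    if not dry_run:
        for path in found:
            path.unlink()
    return found
```

Calling `delete_ds_store(".")` only lists the matches; passing `dry_run=False` actually deletes them.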


3 participants