Added subcaption, summary and modality label generation scripts#38
Open
saidul-islam98 wants to merge 10 commits into
afkanpour
reviewed
May 14, 2026
echo "Module Loaded and Environment Activated!"

# Specify which GPUs to use
CUDA_VISIBLE_DEVICES=0,1 \
Collaborator
Remove the trailing whitespace.
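The trailing whitespace matters because a shell treats a backslash as a line continuation only when the newline follows it immediately. A minimal sketch of a check for this case, in Python; the helper name `find_broken_continuations` and the sample script text are hypothetical and not part of the PR:

```python
import re

# Hypothetical helper (not from the PR): flag lines where a backslash is
# followed by spaces or tabs before the end of the line, so the shell
# would not treat it as a line continuation.
_BROKEN_CONTINUATION = re.compile(r"\\[ \t]+$")


def find_broken_continuations(script_text):
    """Return 1-based numbers of lines whose trailing backslash is followed by whitespace."""
    return [
        number
        for number, line in enumerate(script_text.splitlines(), start=1)
        if _BROKEN_CONTINUATION.search(line)
    ]


# Placeholder script text: line 1 has a space after the backslash, line 2 does not.
sample = "CUDA_VISIBLE_DEVICES=0,1 \\ \npython script.py \\\n    --flag value\n"
print(find_broken_continuations(sample))  # [1]
```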
import argparse
import re
from tqdm import tqdm
from tqdm.auto import tqdm
max_new_tokens=args.max_new_tokens,
)

fdf.to_csv(data_path, index=False)
Collaborator
process_data_batched_vllm already saves to CSV at the end, so this extra to_csv call looks redundant?
enc = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

if len(enc) > max_length:
    enc = enc[:max_length]
Collaborator
Since tokenize=False above, I believe the output here is text, so max_length applies to the character count, not the token count.
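One way to make the limit apply to tokens is to encode first, slice the token ids, and decode back. The sketch below is illustrative only: `truncate_by_tokens` and the whitespace "tokenizer" are hypothetical stand-ins so the example runs without downloading a model. With a Hugging Face tokenizer one would presumably pass its `encode` and `decode` methods instead, taking care with special tokens; that wiring is an assumption and is not shown in the PR.

```python
from typing import Callable, List


def truncate_by_tokens(
    text: str,
    encode: Callable[[str], List[int]],
    decode: Callable[[List[int]], str],
    max_tokens: int,
) -> str:
    """Truncate text to at most max_tokens tokens rather than characters."""
    token_ids = encode(text)
    if len(token_ids) <= max_tokens:
        return text
    return decode(token_ids[:max_tokens])


# Stand-in whitespace tokenizer (hypothetical): one token per word.
_vocab: List[str] = []


def toy_encode(text: str) -> List[int]:
    ids = []
    for word in text.split():
        if word not in _vocab:
            _vocab.append(word)
        ids.append(_vocab.index(word))
    return ids


def toy_decode(ids: List[int]) -> str:
    return " ".join(_vocab[i] for i in ids)


prompt = "describe the modality of this subfigure in one word"
print(prompt[:4])                                            # character slice: "desc"
print(truncate_by_tokens(prompt, toy_encode, toy_decode, 4))  # token slice: "describe the modality of"
```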
* **Stage 1 (Subcaption extraction, VLM):** `Qwen2.5-VL-32B-Instruct` generates a *verbatim* subfigure caption from a full figure caption + subfigure image.
* **Stage 2 (Context summary, LLM):** `Qwen2.5-14B-Instruct` generates a focused summary of the context passage relevant to the subcaption.
* **Stage 3 (Modality Labeling, VLM):** `Qwen2.5-VL-32B-Instruct` generates L2 labels, then L1 and L0 labels are inferred from a predefined set based on the generated L2 label.
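The modality-labeling step, where L1 and L0 labels are inferred from the generated L2 label, could be a plain lookup. A minimal sketch, assuming each L2 label has exactly one L1 and one L0 parent; the label names and the helper `infer_parent_labels` are placeholders and not the predefined set used in the PR:

```python
# Hypothetical taxonomy: each L2 label maps to its (L1, L0) parents.
# These names are placeholders, not the predefined set from the PR.
L2_TO_PARENTS = {
    "ct": ("radiology", "medical"),
    "mri": ("radiology", "medical"),
    "bar chart": ("plot", "non-medical"),
}


def infer_parent_labels(l2_label):
    """Return (L1, L0) for a generated L2 label, or (None, None) if it is unknown."""
    key = l2_label.strip().lower()  # normalize model output before the lookup
    return L2_TO_PARENTS.get(key, (None, None))


print(infer_parent_labels(" MRI "))   # ('radiology', 'medical')
print(infer_parent_labels("sketch"))  # (None, None)
```

Returning `(None, None)` for an unrecognized L2 label keeps out-of-set generations visible for later inspection instead of silently assigning a parent.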
afkanpour
approved these changes
May 14, 2026
Collaborator
afkanpour
left a comment
@afkanpour partially reviewed 12 files, made 2 comments, and resolved 3 discussions.
Reviewable status: 7 of 12 files reviewed, 3 unresolved discussions (waiting on Negiiiin and saidul-islam98).
.DS_Store line 0 at r1 (raw file):
Please delete all .DS_Store files
Negiiiin
approved these changes
May 19, 2026
PR Type
Feature
Short Description
Adds a complete vLLM inference pipeline for Open-PMC-18M, including Python inference scripts, Slurm bash launch scripts, and a README documenting the full 3-stage workflow for subcaption extraction, image-context summary generation, and modality labeling.
Python Scripts Added
generate_subcaption_vllm.py
generate_summary_vllm.py
generate_modality_labels_vllm.py
Slurm Scripts Added
Tests Added
None