Feature Request: Incremental Adapter Composition (Add / Remove / Update)
Problem
Currently `GraniteSwitchComposer.from_base_and_adapters()` only supports building a fresh Granite Switch model from a base model + a full set of adapters. There is no way to take an already-composed model and add, remove, or update individual adapters without repeating the entire composition from scratch.
This makes iteration cumbersome — updating an adapter in an existing model requires a full rebuild, which may not even be possible if the original composition setup is no longer available.
Proposed Capability
Support three incremental operations on an existing composed model:
| Operation | Input | Effect |
|-----------|-------|--------|
| Add | Existing composed model + new adapter(s) | Appends adapter slot(s), expands stacked tensors, adds control tokens |
| Remove | Existing composed model + adapter name(s) | Removes adapter slot(s), compacts tensors, removes control tokens |
| Update | Existing composed model + updated adapter(s) | Overwrites adapter weights in-place, no structural change |
Current Architecture Constraints
The compose pipeline makes several assumptions that tie it to fresh builds:
- Fixed-size stacked tensors — LoRA weights are stored as `[num_adapters, 1, ...]` tensors. Adding/removing adapters requires reshaping every LoRA parameter in the model (see the sketch after this list).
- Positional adapter indices — Adapter `i` always occupies slot `i` in stacked tensors, `adapter_token_ids[i]`, `adapter_names[i]`, etc. Removing an adapter from the middle creates index gaps.
- Config is static — `GraniteSwitchConfig` is built once; `num_adapters`, `adapter_token_ids`, `hiding_groups`, and `hiding_policy` are all computed at compose time.
- Tokenizer is additive — Control tokens are added but never removed. The tokenizer vocabulary only grows.
- Base weights are re-transferred — `transfer_base_weights` loads the upstream base model every time, even though base weights are identical across compositions.
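To make the fixed-size and positional-index constraints concrete, here is a minimal illustration (all shapes, adapter names, and token ids below are made up for the example, not taken from the codebase):

```python
import torch

# Illustration only: shapes and values are assumptions, not the project's real
# layer names, ranks, or vocabulary ids.
num_adapters, rank, hidden = 3, 16, 4096

# One stacked LoRA "A" parameter: slot i along dim 0 belongs to adapter i.
lora_A = torch.zeros(num_adapters, 1, rank, hidden)

# Positional bookkeeping written into the config at compose time.
adapter_names = ["sql", "summarize", "classify"]   # adapter_names[i] -> slot i
adapter_token_ids = [49152, 49153, 49154]          # control token for slot i

# The implicit registry is just name -> slot index.
registry = {name: i for i, name in enumerate(adapter_names)}
assert lora_A.shape[0] == len(adapter_names) == len(adapter_token_ids)
```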
High-Level Implementation Approach
Phase 1: Load Existing Composed Model
Add a class method to load an already-composed checkpoint:
`GraniteSwitchComposer.from_existing(model_path) -> (model, config, tokenizer, adapter_registry)`
This loads the model via `GraniteSwitchForCausalLM.from_pretrained()` and reconstructs the adapter registry (name → slot index mapping) from the config.
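A rough sketch of what `from_existing` might look like, assuming the composed checkpoint exposes `adapter_names` and `adapter_token_ids` on its config (the exact field names and the registry shape are open to discussion):

```python
from transformers import AutoTokenizer

# GraniteSwitchForCausalLM and GraniteSwitchComposer are the project's own
# classes; this method would be added to the existing composer.

class GraniteSwitchComposer:
    @classmethod
    def from_existing(cls, model_path: str):
        """Load an already-composed checkpoint and rebuild the adapter registry."""
        model = GraniteSwitchForCausalLM.from_pretrained(model_path)
        tokenizer = AutoTokenizer.from_pretrained(model_path)
        config = model.config

        # Reconstruct name -> slot index mapping from fields written at compose
        # time (field names here are assumptions).
        adapter_registry = {
            name: {"slot": i, "token_id": config.adapter_token_ids[i]}
            for i, name in enumerate(config.adapter_names)
        }
        return model, config, tokenizer, adapter_registry
```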
Phase 3: Implement Operations
Add Adapter
- Load existing model + registry
- Discover and validate new adapter(s) — rank must be ≤ `max_lora_rank` (or resize all if larger)
- Expand stacked LoRA tensors along dim 0: `[N, 1, ...] → [N+K, 1, ...]` (sketched after this list)
- Remap and insert new adapter weights into the new slots
- Add control tokens to tokenizer, resize embeddings
- Update config: `num_adapters`, `adapter_token_ids`, `adapter_names`, `hiding_groups`, `hiding_policy`
- Re-validate and save
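The tensor-expansion step referenced above could look roughly like the following; how stacked LoRA parameters are discovered inside the model, and the exact remapping/rank-padding of incoming adapter weights, are assumptions left out of the sketch:

```python
import torch

@torch.no_grad()
def expand_stacked_lora(param: torch.nn.Parameter, k_new: int) -> torch.nn.Parameter:
    """Grow a stacked LoRA tensor [N, 1, ...] -> [N+K, 1, ...]; new slots are zero-initialised."""
    new_rows = param.data.new_zeros((k_new,) + tuple(param.shape[1:]))
    return torch.nn.Parameter(torch.cat([param.data, new_rows], dim=0))

@torch.no_grad()
def write_adapter_slot(param: torch.nn.Parameter, slot: int, weights: torch.Tensor) -> None:
    """Copy one adapter's weights (already remapped/rank-padded) into its slot."""
    param.data[slot].copy_(weights.to(dtype=param.dtype, device=param.device))

# Usage sketch: every stacked LoRA parameter is expanded by K, then each new
# adapter's tensors are written into slots N .. N+K-1.
```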
Remove Adapter
- Load existing model + registry
- Resolve adapter name(s) to slot indices
- Delete rows from stacked LoRA tensors and compact the remaining slots (sketched after this list)
- Remap indices in remaining config fields
- Remove control tokens from tokenizer (or mark as unused — see open question)
- Update config
- Re-validate and save
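A sketch of the row-deletion and index-remapping steps under the same assumptions (locating the stacked parameters and the exact config field names are not pinned down here):

```python
import torch

@torch.no_grad()
def compact_stacked_lora(param: torch.nn.Parameter, remove_slots: set) -> torch.nn.Parameter:
    """Drop rows for removed adapters and compact the remaining slots."""
    keep = [i for i in range(param.shape[0]) if i not in remove_slots]
    return torch.nn.Parameter(param.data[keep].clone())

def remap_positional_fields(config, remove_slots: set) -> None:
    """Rebuild positional config lists so remaining adapters keep gap-free indices."""
    keep = [i for i in range(config.num_adapters) if i not in remove_slots]
    config.adapter_names = [config.adapter_names[i] for i in keep]
    config.adapter_token_ids = [config.adapter_token_ids[i] for i in keep]
    config.num_adapters = len(keep)
    # hiding_groups / hiding_policy would need the same old-slot -> new-slot remapping.
```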
Update Adapter
- Load existing model + registry
- Resolve adapter name to slot index
- Load new adapter weights, validate rank/modules match existing slot
- Zero the slot in all stacked tensors, then write new weights in-place (sketched after this list)
- Update registry metadata (source path, timestamp)
- Re-validate and save
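The in-place overwrite could be as simple as the sketch below; the rank/module validation is shown only as an assert, and locating the right stacked parameter per layer is assumed:

```python
import torch

@torch.no_grad()
def update_adapter_slot(param: torch.nn.Parameter, slot: int, new_weights: torch.Tensor) -> None:
    """Overwrite one adapter slot in a stacked LoRA tensor; no structural change."""
    # Rank/module compatibility check: the replacement must fit the existing slot.
    assert new_weights.shape == param.data[slot].shape, "rank/module mismatch for this slot"
    param.data[slot].zero_()  # clear stale values first
    param.data[slot].copy_(new_weights.to(dtype=param.dtype, device=param.device))
```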
Phase 4: CLI Integration Proposal
Extend the `compose_granite_switch` CLI:
```bash
# Fresh build (existing behavior, unchanged)
python -m granite_switch.composer.compose_granite_switch \
    --adapters adapter1 adapter2

# Add adapter(s) to existing model
python -m granite_switch.composer.compose_granite_switch \
    --model ./existing-composed-model \
    --add-adapters new_adapter1 new_adapter2

# Remove adapter(s)
python -m granite_switch.composer.compose_granite_switch \
    --model ./existing-composed-model \
    --remove-adapters adapter_name

# Update adapter(s) in-place
python -m granite_switch.composer.compose_granite_switch \
    --model ./existing-composed-model \
    --update-adapters adapter_name=/path/to/new/weights
```
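One possible way to wire the new flags into the CLI, using a mutually exclusive argument group so fresh builds and incremental operations cannot be mixed (this is a sketch, not the existing parser):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="compose_granite_switch")
    parser.add_argument("--model", help="Path to an existing composed model")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--adapters", nargs="+", help="Fresh build from these adapters")
    mode.add_argument("--add-adapters", nargs="+", help="Add adapter(s) to --model")
    mode.add_argument("--remove-adapters", nargs="+", help="Remove adapter(s) from --model")
    mode.add_argument("--update-adapters", nargs="+", metavar="NAME=PATH",
                      help="Overwrite adapter weights in --model")
    return parser

# e.g. build_parser().parse_args(
#     ["--model", "./existing-composed-model", "--add-adapters", "new_adapter1"])
```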