feat(aggregation): Add MoDoWeighting #717

Open
KhusPatel4450 wants to merge 3 commits into SimplexLab:main from KhusPatel4450:feat/modo-weighting

Conversation

@KhusPatel4450
Contributor

Adds MoDoWeighting from "Three-Way Trade-Off in Multi-Objective Learning: Optimization, Generalization and Conflict-Avoidance" (JMLR 2024).

It's a stateful Weighting[PSDMatrix] implementing the λ-update from Algorithm 2:

  • λ_{t+1} = softmax(λ_t − γ·(G·λ_t + ρ·λ_t))
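For readers who want to check the recurrence by hand, here is a minimal pure-Python sketch of one λ-step. It is not the code in `_modo.py`; the function names `softmax` and `modo_lambda_step` and the toy Gramian are assumptions made for illustration only.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def modo_lambda_step(gramian, lambd, gamma, rho):
    """One step of lambda_{t+1} = softmax(lambda_t - gamma * (G @ lambda_t + rho * lambda_t))."""
    # Matrix-vector product G @ lambda.
    g_lambda = [sum(g * l for g, l in zip(row, lambd)) for row in gramian]
    # Add the rho-regularization term.
    grad = [gl + rho * l for gl, l in zip(g_lambda, lambd)]
    # Softmax replaces the hard projection onto the simplex.
    return softmax([l - gamma * g for l, g in zip(lambd, grad)])


# Toy 2x2 Gramian (hypothetical values): the second objective has the larger gradient norm.
gramian = [[1.0, 0.0], [0.0, 4.0]]
new_lambd = modo_lambda_step(gramian, [0.5, 0.5], gamma=0.1, rho=0.0)
```

Because the output of a softmax is always strictly positive and sums to 1, the new weights stay inside the simplex without any explicit projection. In this toy case the weight of the objective with the larger gradient norm decreases.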

Per the discussion with @PierreQuinton and @ValerianRey on Discord, this follows the official LibMTL implementation, which uses a softmax rather than the paper's hard simplex projection.

Designed to be composed with autogram.Engine in a two-batch training loop so that MoDo's double-sampling property is preserved (Gramian comes from batch 1; backward uses batch 2).
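The two-batch structure described above can be sketched on a toy problem. This is a pure-Python illustration under stated assumptions: it does not use torchjd or autogram.Engine, the helper names (`objective_grads`, `gramian_of`, `modo_step`, `sample_batch`) are hypothetical, and the model is a 2-parameter linear predictor with two squared-error objectives. The only point is to show which batch feeds which step.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def objective_grads(w, batch):
    """Mean gradient of each toy objective over the batch (one row per objective)."""
    num_objectives = len(batch[0][1])
    grads = [[0.0] * len(w) for _ in range(num_objectives)]
    for x, targets in batch:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        for k, y in enumerate(targets):
            for i, xi in enumerate(x):
                # d/dw_i of 0.5 * (pred - y)^2, averaged over the batch
                grads[k][i] += (pred - y) * xi / len(batch)
    return grads


def gramian_of(rows):
    """Gramian of the per-objective gradients: entry (i, j) is <g_i, g_j>."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]


def modo_step(w, lambd, batch1, batch2, lr, gamma, rho):
    # Batch 1 is used only to build the Gramian and update lambda.
    gram = gramian_of(objective_grads(w, batch1))
    grad_l = [sum(g * l for g, l in zip(row, lambd)) + rho * li
              for row, li in zip(gram, lambd)]
    lambd = softmax([l - gamma * g for l, g in zip(lambd, grad_l)])
    # Batch 2 is an independent sample used for the parameter update.
    grads2 = objective_grads(w, batch2)
    direction = [sum(lambd[k] * grads2[k][i] for k in range(len(lambd)))
                 for i in range(len(w))]
    return [wi - lr * di for wi, di in zip(w, direction)], lambd


rng = random.Random(0)


def sample_batch(n):
    """Draw n toy samples; each objective regresses onto a different input feature."""
    batch = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        batch.append((x, [x[0], x[1]]))
    return batch


w, lambd = [0.0, 0.0], [0.5, 0.5]
for _ in range(50):
    w, lambd = modo_step(w, lambd, sample_batch(8), sample_batch(8),
                         lr=0.1, gamma=0.1, rho=0.0)
```

Drawing the two batches independently keeps the weights used in the parameter update from depending on the same samples as the gradients they multiply.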

Test plan

  • Unit tests in tests/unit/aggregation/test_modo.py (12 functions, 72 cases — structural, reset, parameter validation, softmax boundary cases, recurrence verification)
  • Full unit suite passes: 3098 passed, 66 skipped, 33 xfailed
  • ty check passes on _modo.py
  • Sphinx doctest: 97 tests, 0 failures
  • HTML build clean with -W --keep-going -n

@PierreQuinton added the cc: feat (conventional commit type for new features) and package: aggregation labels on May 29, 2026
Contributor

@PierreQuinton left a comment

Looking good. It still needs a few changes, but once this is merged, I think it will make #676 easier to merge.

Comment thread src/torchjd/aggregation/_modo.py Outdated
Comment thread src/torchjd/aggregation/_modo.py
Comment thread src/torchjd/aggregation/_modo.py Outdated
Comment thread src/torchjd/aggregation/_modo.py Outdated
Comment thread src/torchjd/aggregation/_modo.py Outdated

with torch.no_grad():
    grad = gramian @ lambd + self._rho * lambd
    lambd = torch.softmax(lambd - self._gamma * grad, dim=-1)
Contributor

So in the end, this is a softmax. @rkhosrowshahi I think this means that MoCo is essentially just a composition with this weighting: you give yy_t to it, and then multiply yy_t by the obtained weights. Is that correct? If so, I think we should change #676 accordingly.

from ._weighting_bases import _GramianWeighting


class MoDoWeighting(_GramianWeighting, Stateful, _NonDifferentiable):
Contributor

/opencode:Plan is the inheritance order correct here?

Contributor Author

It matches _gradvac.py, and the warning in the docstring of _NonDifferentiable states: "Placing this mixin before the primary base will cause it to shadow the primary class's call signature in generated documentation."

So yes, I believe it is
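As a generic illustration of why base-class order matters here, the sketch below uses hypothetical stand-in classes (`PrimaryBase`, `SignatureShadowingMixin`), not torchjd's actual `_GramianWeighting` or `_NonDifferentiable`. It assumes a mixin that defines its own pass-through `__call__`, and shows how Python's method resolution order decides which signature introspection tools see.

```python
import inspect


class PrimaryBase:
    """Hypothetical primary base with the documented call signature."""

    def __call__(self, gramian):
        return gramian


class SignatureShadowingMixin:
    """Hypothetical mixin whose generic __call__ forwards to the next class in the MRO."""

    def __call__(self, *args, **kwargs):
        return super().__call__(*args, **kwargs)


class PrimaryFirst(PrimaryBase, SignatureShadowingMixin):
    pass


class MixinFirst(SignatureShadowingMixin, PrimaryBase):
    pass


# Attribute lookup follows the MRO, so the first base defining __call__ wins.
primary_first_sig = str(inspect.signature(PrimaryFirst.__call__))
mixin_first_sig = str(inspect.signature(MixinFirst.__call__))
print(primary_first_sig)  # (self, gramian)
print(mixin_first_sig)    # (self, *args, **kwargs)
```

With the primary base listed first, documentation generators that rely on this kind of introspection would pick up the specific `(self, gramian)` signature rather than the generic one.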

Contributor

Thanks!

Contributor

@PierreQuinton left a comment

For me this is ready, but let's still wait for @ValerianRey's review.
