Add MultiViewLightGBM model #430

Open
tereshchuk1 wants to merge 2 commits into daisybio:development from tereshchuk1:add-lightgbm

@tereshchuk1

Contributor

Description

Add the MultiViewLightGBM model. It supports gene expression, methylation, mutation, copy number variation, and proteomics features, plus drug fingerprints.

Changes

  • This comment contains a description of changes (with reason)
  • If you've fixed a bug or added code that should be tested, add tests!

New features

  • Added MultiViewLightGBM model
  • Registered model in drevalpy/models/__init__.py
  • Added hyperparameters to hyperparameters.yaml
  • Added lightgbm as optional dependency in pyproject.toml
  • Added lightgbm to nox test session extras in noxfile.py
  • Added MultiViewLightGBM to baseline tests
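
Since lightgbm is added as an optional dependency, code that uses it typically has to cope with the package being absent. The following is a minimal sketch of one common pattern, not necessarily what this PR does; the helper names and the `extra` argument are hypothetical and exist only for illustration.

```python
import importlib.util


def optional_dependency_available(module_name: str) -> bool:
    """Return True if the top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


def require_optional(module_name: str, extra: str) -> None:
    """Raise a readable error when an optional package is missing.

    `extra` is a hypothetical extras name, used only in the error message.
    """
    if not optional_dependency_available(module_name):
        raise ImportError(
            f"'{module_name}' is not installed; "
            f"install the '{extra}' extra to use this model."
        )


if __name__ == "__main__":
    # Prints True or False depending on the local environment.
    print(optional_dependency_available("lightgbm"))
```

Checking with `importlib.util.find_spec` avoids importing the package just to learn whether it exists, which keeps the check cheap at module import time.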

tereshchuk1 force-pushed the add-lightgbm branch 2 times, most recently from df5be1f to 61e15af on June 12, 2026 04:10
codecov-commenter commented Jun 12, 2026

Codecov Report

❌ Patch coverage is 87.50000% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.92%. Comparing base (7d24cd6) to head (89f0cab).
⚠️ Report is 13 commits behind head on development.

Files with missing lines Patch % Lines
drevalpy/models/baselines/multi_view_lightgbm.py 87.00% 13 Missing ⚠️
Additional details and impacted files
@@               Coverage Diff               @@
##           development     #430      +/-   ##
===============================================
+ Coverage        80.34%   80.92%   +0.58%     
===============================================
  Files              101      103       +2     
  Lines             8171     8336     +165     
===============================================
+ Hits              6565     6746     +181     
+ Misses            1606     1590      -16     

Comment on lines +281 to +293
max_depth:
- 10
num_leaves:
- 63
- 127
subsample:
- 0.8
colsample_bytree:
- 0.6
- 0.8
reg_alpha:
- 0
- 1

Contributor

Comment by Claude because I'm not an expert on LightGBM and its standard parameters, but maybe worth looking into:

num_leaves vs max_depth
LightGBM grows trees leaf-wise, so num_leaves is the primary complexity control; max_depth is secondary and mainly used as a guardrail. The rule of thumb is:
num_leaves < 2^max_depth
With max_depth: 10, a tree could hold up to 2^10 = 1024 leaves. Your num_leaves values of 63 and 127 are well below that, so there's no hard conflict. But if you intend max_depth: 10 to constrain complexity, it never limits the leaf count here; at most it stops a single lopsided branch from growing past 10 levels. You'd be better off either:

  • Dropping max_depth and just tuning num_leaves, or
  • Setting max_depth to something tighter like 6 or 7 if you actually want it to bite
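
The rule of thumb above can be checked numerically for the values discussed in this thread. This is a small sketch, not part of the PR; the helper names are hypothetical, and it looks only at the leaf count, not at how deep an individual branch may grow.

```python
from typing import Optional


def leaf_cap_from_depth(max_depth: int) -> Optional[int]:
    """Most leaves a binary tree of depth `max_depth` can hold; None if depth is unlimited."""
    if max_depth <= 0:  # LightGBM treats max_depth <= 0 as "no limit"
        return None
    return 2 ** max_depth


def depth_caps_leaf_count(num_leaves: int, max_depth: int) -> bool:
    """True when 2**max_depth is smaller than num_leaves, so depth limits the leaf count."""
    cap = leaf_cap_from_depth(max_depth)
    return cap is not None and cap < num_leaves


if __name__ == "__main__":
    for max_depth in (10, 7, 6):
        for num_leaves in (63, 127):
            print(max_depth, num_leaves, depth_caps_leaf_count(num_leaves, max_depth))
```

With these values, only max_depth=6 combined with num_leaves=127 returns True (64 < 127); at max_depth=7 the cap is 128, which still exceeds both num_leaves settings.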

num_leaves: 127 can be aggressive
127 leaves is fairly complex — fine for large datasets, but if your dataset is small/medium you may be giving the model too much capacity. Worth adding a smaller value like 31 to the search.

reg_lambda: 0.1 only
You're searching reg_alpha over two values but reg_lambda (L2) over only one. LightGBM defaults reg_lambda to 0.0, so 0.1 already adds regularization — but it's worth also trying 0 and maybe 1 here for symmetry with how you're treating reg_alpha.
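
To gauge what the suggested additions would cost, the sketch below counts grid combinations before and after widening. It assumes only the parameters visible in this thread; the real hyperparameters.yaml entry may contain further keys, so the totals are illustrative rather than the actual search size.

```python
import math
from itertools import product

# Values quoted in this review thread (other keys in the YAML are not shown here).
current_grid = {
    "max_depth": [10],
    "num_leaves": [63, 127],
    "subsample": [0.8],
    "colsample_bytree": [0.6, 0.8],
    "reg_alpha": [0, 1],
    "reg_lambda": [0.1],
}

# Widened as suggested: add num_leaves=31 and try reg_lambda in {0, 0.1, 1}.
widened_grid = dict(current_grid, num_leaves=[31, 63, 127], reg_lambda=[0, 0.1, 1])


def grid_size(grid):
    """Number of combinations in a full Cartesian grid search."""
    return math.prod(len(values) for values in grid.values())


def expand(grid):
    """Materialize every combination as a dict of parameter name to value."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*grid.values())]


if __name__ == "__main__":
    print(grid_size(current_grid), grid_size(widened_grid))
```

Under these assumptions the grid grows from 8 to 36 combinations, which is worth weighing against the training time per configuration.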

Comment thread pyproject.toml Outdated
Comment on lines 53 to 57
xgboost = { version = "^3.2.0", optional = true }
lightgbm = { version = "^4.0.0", optional = true }
typer = ">=0.26,<0.27"
rich = "^15.0.0"
gseapy = { version = "^1.1.0", optional = true }

Contributor

Just seeing this now: any particular reason why the xgboost, lightgbm, and gseapy requirements HAVE to be this version? I'd prefer >= for easier maintenance.
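
For context on the constraint syntax: assuming this pyproject.toml follows Poetry's caret semantics, `^4.0.0` is a range rather than an exact pin. It allows any release up to, but not including, the next change of the leftmost non-zero component. The sketch below spells that rule out for three-part versions; the function names are hypothetical and it is not Poetry's own implementation.

```python
def parse(version: str):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (three parts required)."""
    parts = tuple(int(p) for p in version.split("."))
    if len(parts) != 3:
        raise ValueError("this sketch only handles MAJOR.MINOR.PATCH versions")
    return parts


def caret_bounds(version: str):
    """Return (inclusive lower, exclusive upper) bounds for a caret constraint."""
    major, minor, patch = parse(version)
    if major > 0:
        upper = (major + 1, 0, 0)
    elif minor > 0:
        upper = (0, minor + 1, 0)
    else:
        upper = (0, 0, patch + 1)
    return (major, minor, patch), upper


def caret_allows(candidate: str, constraint: str) -> bool:
    """True if `candidate` falls inside the range implied by ^constraint."""
    lower, upper = caret_bounds(constraint)
    return lower <= parse(candidate) < upper


if __name__ == "__main__":
    print(caret_bounds("4.0.0"))
    print(caret_allows("4.6.0", "4.0.0"), caret_allows("5.0.0", "4.0.0"))
```

So `^4.0.0` behaves like `>=4.0.0,<5.0.0`; switching to a bare `>=4.0.0` would additionally admit future major releases.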

tereshchuk1 changed the title from "Add MultiViewLightGBM model and include lightgbm in nox test session" to "Add MultiViewLightGBM model" on Jun 14, 2026
@PascalIversen

Collaborator

Thanks! Could you also run the model and add it to the leaderboard?
