Commit 60fde3b

Merge pull request #12 from BioinfoMachineLearning/0.6.0
0.6.0
2 parents 0f87ff2 + f569076 commit 60fde3b

469 files changed

Lines changed: 143322 additions & 5199 deletions

.gitignore

Lines changed: 6 additions & 1 deletion
@@ -147,6 +147,7 @@ dmypy.json
 # PoseBench
 configs/local/default.yaml
 /*cache_dir/
+/*alignment_viz/
 /casp15_ligand_scoring/
 /data/
 /ensemble_generation_scripts/
@@ -163,6 +164,7 @@ configs/local/default.yaml
 
 # Forks
 /workdir/
+/forks/alphafold3/*prediction_outputs/
 /forks/chai-lab/chai-lab/
 /forks/chai-lab/prediction_inputs/
 /forks/chai-lab/prediction_outputs/
@@ -174,10 +176,13 @@ configs/local/default.yaml
 /forks/DynamicBind/workdir/
 /forks/FABind/ckpt/best_model.bin
 /forks/FABind/FABind/
+/forks/FlowDock/FlowDock/
+/forks/FlowDock/checkpoints/
 /forks/NeuralPLexer/NeuralPLexer/
 /forks/NeuralPLexer/**/neuralplexermodels*
+/forks/NeuralPLexer*/prediction_inputs/
 /forks/P2Rank/
-/forks/*/inference*/
+/forks/*/*inference*/
 /forks/RoseTTAFold-All-Atom/blast-2.2.26
 /forks/RoseTTAFold-All-Atom/rf2aa/config/inference/*_rfaa_inference.yaml
 /forks/RoseTTAFold-All-Atom/csblast-2.2.3
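
The one modified pattern in this hunk widens what is ignored under each fork: `/forks/*/inference*/` only matches directories whose names start with `inference`, while `/forks/*/*inference*/` also matches names that merely contain it. The sketch below illustrates the difference with a simplified, segment-by-segment matcher. It is not Git's implementation: it ignores `**`, negation, and escaping, and the example directory names are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_anchored_dir_pattern(pattern: str, path: str) -> bool:
    """Simplified match for an anchored gitignore-style directory pattern.

    Assumption: only handles patterns shaped like the ones in this diff
    (leading '/', trailing '/', '*' inside a single path segment).
    """
    pattern_parts = pattern.strip("/").split("/")
    path_parts = path.strip("/").split("/")
    if len(path_parts) < len(pattern_parts):
        return False
    # A matched directory also covers everything beneath it, so only the
    # leading segments of the path need to match the pattern segments.
    return all(fnmatchcase(seg, pat) for pat, seg in zip(pattern_parts, path_parts))


OLD_PATTERN = "/forks/*/inference*/"
NEW_PATTERN = "/forks/*/*inference*/"

# Hypothetical directory names, chosen only to show the difference.
for path in ("forks/DiffDock/inference_out/", "forks/FlowDock/flowdock_inference_outputs/"):
    print(
        path,
        matches_anchored_dir_pattern(OLD_PATTERN, path),
        matches_anchored_dir_pattern(NEW_PATTERN, path),
    )
```

To check how Git itself treats a given path, `git check-ignore -v <path>` reports the pattern that matched.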

.pre-commit-config.yaml

Lines changed: 16 additions & 16 deletions
@@ -5,7 +5,7 @@ exclude: "^forks/"
 
 repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v4.4.0
+    rev: v5.0.0
     hooks:
       # list of supported hooks: https://pre-commit.com/hooks.html
       - id: trailing-whitespace
@@ -18,32 +18,32 @@ repos:
       - id: check-toml
       - id: check-case-conflict
       - id: check-added-large-files
-        args: ["--maxkb=15000"]
+        args: ["--maxkb=40000"]
 
   # python code formatting
   - repo: https://github.com/psf/black
-    rev: 23.1.0
+    rev: 24.10.0
     hooks:
       - id: black
         args: [--line-length, "99"]
 
   # python import sorting
   - repo: https://github.com/PyCQA/isort
-    rev: 5.12.0
+    rev: 5.13.2
     hooks:
       - id: isort
         args: ["--profile", "black", "--filter-files"]
 
   # python upgrading syntax to newer version
   - repo: https://github.com/asottile/pyupgrade
-    rev: v3.3.1
+    rev: v3.19.1
     hooks:
       - id: pyupgrade
         args: [--py38-plus]
 
   # python docstring formatting
   - repo: https://github.com/myint/docformatter
-    rev: v1.7.4
+    rev: v1.7.5
     hooks:
       - id: docformatter
         args:
@@ -57,7 +57,7 @@ repos:
 
   # python docstring coverage checking
   - repo: https://github.com/econchick/interrogate
-    rev: 1.5.0 # or master if you're bold
+    rev: 1.7.0 # or master if you're bold
     hooks:
       - id: interrogate
         args:
@@ -74,7 +74,7 @@ repos:
 
   # python check (PEP8), programming errors and code complexity
   - repo: https://github.com/PyCQA/flake8
-    rev: 6.0.0
+    rev: 7.1.1
     hooks:
       - id: flake8
         args:
@@ -88,28 +88,28 @@ repos:
 
   # python security linter
   - repo: https://github.com/PyCQA/bandit
-    rev: "1.7.5"
+    rev: "1.8.0"
     hooks:
       - id: bandit
         args: ["-s", "B101"]
 
   # yaml formatting
   - repo: https://github.com/pre-commit/mirrors-prettier
-    rev: v3.0.0-alpha.6
+    rev: v4.0.0-alpha.8
     hooks:
       - id: prettier
         types: [yaml]
         exclude: "environment.yaml"
 
   # shell scripts linter
   - repo: https://github.com/shellcheck-py/shellcheck-py
-    rev: v0.9.0.2
+    rev: v0.10.0.1
     hooks:
       - id: shellcheck
 
   # md formatting
   - repo: https://github.com/executablebooks/mdformat
-    rev: 0.7.16
+    rev: 0.7.21
     hooks:
       - id: mdformat
         args: ["--number"]
@@ -122,22 +122,22 @@ repos:
 
   # word spelling linter
   - repo: https://github.com/codespell-project/codespell
-    rev: v2.2.4
+    rev: v2.3.0
     hooks:
       - id: codespell
         args:
-          - --skip=logs/**,data/**,*.ipynb,posebench/utils/data_utils.py,posebench/utils/residue_utils.py,posebench/data/components/protein_fasta_preparation.py,posebench/models/minimize_energy.py,posebench/data/components/create_casp15_ensemble_input_csv.py,posebench/analysis/casp15_ligand_scoring/casp_parser.py
+          - --skip=logs/**,data/**,*.ipynb,posebench/utils/data_utils.py,posebench/utils/residue_utils.py,posebench/data/components/fasta_preparation.py,posebench/models/minimize_energy.py,posebench/data/components/create_casp15_ensemble_input_csv.py,posebench/analysis/casp15_ligand_scoring/casp_parser.py,*Components-smiles-stereo-oe.smi,notebooks/pdb_reports/transferase/*
           # - --ignore-words-list=abc,def
 
   # jupyter notebook cell output clearing
   - repo: https://github.com/kynan/nbstripout
-    rev: 0.6.1
+    rev: 0.8.1
     hooks:
       - id: nbstripout
 
   # jupyter notebook linting
   - repo: https://github.com/nbQA-dev/nbQA
-    rev: 1.6.3
+    rev: 1.9.1
     hooks:
       - id: nbqa-black
         args: ["--line-length=99"]
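
Among the changes to this file, the `check-added-large-files` limit rises from `--maxkb=15000` to `--maxkb=40000`. As a rough illustration of what that limit means in practice, the sketch below flags files whose size, rounded up to whole kilobytes, exceeds `maxkb`. The helper name `oversized_files` and the sample file names and sizes are hypothetical, and the rounding rule is an assumption about how the hook measures size, not a copy of its code.

```python
import math


def oversized_files(sizes_in_bytes, maxkb):
    """Return the names of files whose size in KB (rounded up) exceeds maxkb."""
    return sorted(
        name
        for name, size in sizes_in_bytes.items()
        if math.ceil(size / 1024) > maxkb
    )


# Hypothetical staged files and their sizes in bytes.
sizes = {
    "small.csv": 2_000_000,
    "weights.ckpt": 30_000_000,
    "huge.tar": 50_000_000,
}

print(oversized_files(sizes, maxkb=15000))  # old limit
print(oversized_files(sizes, maxkb=40000))  # new limit
```

Under these assumptions, a file of roughly 30 MB would be rejected at the old limit and accepted at the new one.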

CHANGELOG.md

Lines changed: 26 additions & 0 deletions
@@ -1,3 +1,29 @@
+### 0.6.0 - 02/09/2025
+
+**Additions**:
+
+- Added new baseline methods (AlphaFold 3, Chai-1 with multiple sequence alignments (MSAs))
+- Added new binding site-focused implementation of `complex_alignment.py` based on PyMOL's `align` command, which in many cases yields 3x better docking evaluation scores for baseline methods
+- Added new script for analyzing baseline methods' protein conformational changes w.r.t. input (e.g., AlphaFold) protein structures and the corresponding reference (crystal) protein structures
+- Added the new centroid RMSD and **PLIF-EMD/WM** metrics (n.b., see new arXiv preprint for more details)
+- Added a failure mode analysis notebook (n.b., see new arXiv preprint for more details)
+
+**Changes**:
+
+- Introducing **DockGen-E**, a new version of the DockGen benchmark dataset featuring enhanced biomolecular context for docking and co-folding predictions - namely, now all DockGen complexes represent the first (biologically relevant) bioassembly of the corresponding PDB structure
+- For the single-ligand datasets (i.e., Astex Diverse, PoseBusters Benchmark, and DockGen), now providing each baseline method with primary *and cofactor* ligand SMILES strings for prediction, to enhance the biomolecular context of these methods' predicted structures - as a result, for these single-ligand datasets, now the predicted ligand *most similar* to the primary ligand (in terms of both Tanimoto and structural similarity) is selected for scoring (which adds an additional layer of challenges for baseline methods)
+- Updated Chai-1's inference code to commit `44375d5d4ea44c0b5b7204519e63f40b063e4a7c`, and ran it also with standardized (paired) MSAs
+- Replaced all AlphaFold 3 server predictions of each dataset's protein structures with predictions from AlphaFold 3's local inference code
+
+**Deprecations**:
+
+- Pocket-only benchmarking has been deprecated
+
+**Results**:
+
+- With all the above changed in place, simplified, re-ran, and re-analyzed all baseline methods for each benchmark dataset, and updated the baseline predictions and datasets (now containing standardized MSAs) hosted on Zenodo
+- **NOTE**: The updated arXiv preprint should be publicly available by 02/12/2025
+
 ### 0.5.0 - 09/30/2024
 
 - Added results with AlphaFold 3 predicted structures (now the default)
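
The 0.6.0 changelog entry for the single-ligand datasets says that the predicted ligand most similar to the primary ligand is the one selected for scoring. A minimal sketch of the Tanimoto half of that selection follows, assuming each ligand is represented by the set of on-bit indices of a molecular fingerprint. The function names and toy fingerprints are hypothetical, and the structural-similarity criterion that the changelog also mentions is not modeled here.

```python
from __future__ import annotations


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_primary_like(primary: set[int], predicted: dict[str, set[int]]) -> str:
    """Return the name of the predicted ligand most similar to the primary ligand."""
    return max(predicted, key=lambda name: tanimoto(primary, predicted[name]))


# Toy fingerprints (hypothetical): one primary-like ligand and one cofactor.
primary = {1, 4, 7, 9}
predicted = {
    "ligand_A": {1, 4, 7, 8},
    "cofactor_B": {2, 3, 9},
}

print(select_primary_like(primary, predicted))
```

In a real pipeline the bit sets would come from a cheminformatics toolkit's fingerprints computed from the SMILES strings; plain sets are used here only to keep the sketch self-contained.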
