This release brings `ir_mul` layout support to OpenEquivariance. Pass the parameter `layout='ir_mul'` to any `TPProblem` instance to use a transposed layout for the input and output irreps. To transpose input and output irreps, use `oeq.transpose_irreps` or `oeq.jax.transpose_irreps`; see our API page for usage details.
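A minimal sketch, assuming e3nn-style `Irreps` and instruction arguments for `TPProblem` (those arguments are illustrative; only the `layout='ir_mul'` keyword is the new feature shown):

```python
import openequivariance as oeq

# Illustrative irreps and a single "uvu" instruction (assumptions).
X_ir, Y_ir, Z_ir = oeq.Irreps("32x1e"), oeq.Irreps("1x1e"), oeq.Irreps("32x1e")
problem = oeq.TPProblem(
    X_ir, Y_ir, Z_ir,
    [(0, 0, 0, "uvu", True)],
    layout="ir_mul",  # transposed (ir, mul) layout for inputs and outputs
)
tp = oeq.TensorProduct(problem)
```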
Bugfix: added missing MLIR lowerings for a pair of JAX primitives (thanks @teddykoker!)
OpenEquivariance v0.6.3 brings long-needed improvements to the PyTorch frontend. We strongly encourage all users to upgrade to PyTorch 2.10 and OEQ v0.6.3.
Added:
- OpenEquivariance triggers a build of the CUDA extension module at `pip install` time and uses this precompiled extension if the user has PyTorch >=2.10 installed. If PyTorch <2.10 is installed, the JIT-compiled extension is used instead.
- PyTorch ABI support for the C++ backend, using new features in PyTorch 2.10 to provide stable, forward-compatible ahead-of-time extensions.
- Dropped support for TorchBind classes in favor of a new kernel cache, which greatly improves flexibility for automatic mixed precision and AOTI compilation. An inference test in C++ is included.
- `openequivariance_extjax` now has a version number that synchronizes with the main `openequivariance` package; ensure the two packages stay in sync.
Fixed:
- `torch.to()` is now called when either `TensorProduct` or `TensorProductConv` is a submodule of another PyTorch module.
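For example, a minimal sketch (the wrapper module is hypothetical; `problem` is an `oeq.TPProblem` as in the sketch above):

```python
import torch
import openequivariance as oeq

class Wrapper(torch.nn.Module):
    def __init__(self, problem):
        super().__init__()
        self.tp = oeq.TensorProduct(problem)  # OEQ module nested as a submodule

model = Wrapper(problem)
model = model.to("cuda:0")  # .to() now reaches the nested TensorProduct
```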
Improvements to the JAX frontend.
Added:
- Jacobian-vector products (JVP) for both `TensorProduct` and `TensorProductConv` via custom primitives, in addition to VJP.
- Arbitrary higher-order derivatives in JAX; see the sketch after this list.
- JAX JIT support; in particular, support for Phonon Fine Tuning in Nequix.
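A minimal sketch of differentiation through the JAX frontend; the constructor name, input shapes, and the `problem` attribute names used below are assumptions, so consult the API page for the exact interface:

```python
import jax
import jax.numpy as jnp
import openequivariance as oeq

tp = oeq.jax.TensorProduct(problem)  # assumed JAX-frontend constructor

batch = 128
# Hypothetical shapes; the attribute names are assumptions.
x = jnp.ones((batch, problem.irreps_in1.dim))
y = jnp.ones((batch, problem.irreps_in2.dim))
w = jnp.ones((batch, problem.weight_numel))

def loss(x, y, w):
    return jnp.sum(tp(x, y, w))

g = jax.grad(loss, argnums=(0, 1, 2))(x, y, w)  # reverse mode (VJP)
_, t = jax.jvp(loss, (x, y, w), (x, y, w))      # forward mode (JVP)
h = jax.jit(jax.hessian(loss))(x, y, w)         # higher-order derivatives and JIT compose
```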
Fixed:
- Zeroed all output buffers in the backward and double-backward implementations of convolution before calling kernels.
Minor bugfixes related to packaging and JAX.
JAX support is now available in OpenEquivariance for BOTH NVIDIA and AMD GPUs! See the documentation and README.md for instructions on installation and usage.
Minor changes:
- Error reporting when CUDA is not available is now deferred to the first library usage rather than occurring at library load.
Minor update that fixes a bug when loading JIT-compiled modules with PyTorch 2.9.
This release adds a benchmark against FlashTP, exposes weight reordering functions for e3nn compatibility, adds input validation, and provides rudimentary support for PyTorch automatic mixed precision (AMP). Our fused, JIT-compiled kernels exhibit up to 2x speedup over FlashTP!
Added:
- Both `TensorProduct` and `TensorProductConv` now have the methods `reorder_weights_from_e3nn` and `reorder_weights_to_e3nn`, which convert the buffer of trainable weights from / to e3nn's canonical ordering. See the API page for usage details, and the sketch after this list.
- If you have FlashTP installed, see our documentation ("Tests and Benchmarks" page) to benchmark FlashTP against OpenEquivariance.
- Tensor product inputs with incorrect sizes or datatypes now trigger clear errors before execution.
- OpenEquivariance now has some support for automatic mixed precision (AMP), but only if `TensorProduct` / `TensorProductConv` objects are constructed with `float32` precision for both `irrep_dtype` and `weight_dtype`.
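A minimal sketch of both features together, assuming `problem` was built with `float32` dtypes; the input shapes, attribute names, and the reorder-method signature are assumptions:

```python
import torch
import openequivariance as oeq

# `problem` must use float32 irrep_dtype and weight_dtype for AMP (per the note above).
tp = oeq.TensorProduct(problem)

batch = 128
x = torch.randn(batch, problem.irreps_in1.dim, device="cuda")
y = torch.randn(batch, problem.irreps_in2.dim, device="cuda")
w_e3nn = torch.randn(batch, problem.weight_numel, device="cuda")

w = tp.reorder_weights_from_e3nn(w_e3nn)  # e3nn ordering -> OEQ ordering (assumed signature)

with torch.autocast(device_type="cuda", dtype=torch.float16):
    out = tp(x, y, w)  # runs under AMP
```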
Fixed / Enhanced:
- Added additional fake functions to remove warnings from TorchBind.
- Removed bloat from benchmarking code.
This release includes bugfixes and new opaque operations that compose with `torch.compile` for PT2.4-2.7. These will be unnecessary for PT2.8+.
Added:
- Opaque variants of major operations via PyTorch `custom_op` declarations. These functions cannot be traced through and fail for JITScript / AOTI; they are shims that enable composition with `torch.compile` pre-PT2.8.
- `torch.load` / `torch.save` functionality that, without `torch.compile`, is portable across GPU architectures (see the sketch after this list).
- `.to()` support to move `TensorProduct` and `TensorProductConv` between devices or change datatypes.
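A minimal sketch under assumptions (how the opaque variants are selected is not shown here; the calls below are standard PyTorch, with `problem` as in earlier sketches):

```python
import torch
import openequivariance as oeq

tp = oeq.TensorProduct(problem).to("cuda")

# Without torch.compile, the saved module is portable across GPU architectures.
torch.save(tp, "tp.pt")
tp_loaded = torch.load("tp.pt", weights_only=False)  # full-module pickle

# On PT 2.4-2.7, the opaque custom_op shims let the module compose with
# torch.compile; they cannot be traced through, so JITScript / AOTI fail on them.
compiled = torch.compile(tp_loaded)
```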
Fixed:
- Gracefully records an error if `libpython.so` is not linked against the C++ extension.
- Resolves Kahan summation and various other bugs on HIP at the O3 compiler-optimization level.
- Prevents multiple contexts from spawning on GPU 0 when multiple devices are used.
- Zero-initialized gradient buffers to prevent garbage accumulation in the backward pass.
Our first stable release, v0.2.0, introduces several new features. Highlights include:
- Full HIP support for all kernels.
- Support for `torch.compile`, JITScript, and export, with preliminary support for AOTI.
- Faster double-backward performance for training.
- Ability to install versioned releases from PyPI.
- Support for CUDA streams and multiple devices; see the sketch after this list.
- An extensive test suite and newly released documentation.
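A minimal sketch of multi-device usage with CUDA streams; the shapes, attribute names, and `problem` are assumptions carried over from the earlier sketches:

```python
import torch
import openequivariance as oeq

# One module per device, with work issued on separate CUDA streams.
tps = [oeq.TensorProduct(problem).to(f"cuda:{i}") for i in range(2)]
streams = [torch.cuda.Stream(device=i) for i in range(2)]

for i, (tp, stream) in enumerate(zip(tps, streams)):
    xi = torch.randn(128, problem.irreps_in1.dim, device=f"cuda:{i}")
    yi = torch.randn(128, problem.irreps_in2.dim, device=f"cuda:{i}")
    wi = torch.randn(128, problem.weight_numel, device=f"cuda:{i}")
    with torch.cuda.stream(stream):
        out = tp(xi, yi, wi)

torch.cuda.synchronize()  # wait for all streams to finish
```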
If you successfully run OpenEquivariance on a GPU model not listed here, let us know! We can add your name to the list.
Known issues:
- Kahan summation is broken on HIP; a fix is planned.
- FX + Export + Compile has trouble with PyTorch dynamo; a fix is planned.
- AOTI is broken on PT <2.8; you need the nightly build due to incomplete TorchBind support in prior versions.
Initial GitHub release with preprint.