feat(s3): add optional S3-over-RDMA (cuObject) data plane #87
Draft
harshavardhana wants to merge 1 commit into
Add an `rdma` option to the `s3` storage provider that routes object transfers through NVIDIA cuObject (libcuobjclient) instead of the HTTP body. When enabled, a contiguous host buffer is registered with cuObject and its RDMA descriptor is carried to the endpoint as the signed `x-amz-rdma-token` header; the RDMA-capable endpoint transfers the payload directly into or out of the buffer, leaving the HTTP body empty. This offloads the bulk transfer from the CPU and the HTTP/TLS path.

The option mirrors the existing `rust_client` sub-option: it swaps only the data plane and is mutually exclusive with it. Enabling RDMA forces the empty-body wire contract (unsigned payload, checksums only `when_required`) and single-shot put/get, since multipart does not apply to a single registered-buffer transfer.

cuObject is a C++ library, so a thin `extern "C"` shim (`providers/cuobj_shim.cpp`) wraps a process-wide `cuObjClient` and is loaded over ctypes (`providers/_cuobj.py`); the module is import-safe without the native library present, mirroring the `torch.cuda.cuobj` / `BotoCuObjClient` split in the PyTorch checkpoint backend.

Test Plan:
- Unit (native engine mocked), run with the boto3 extra: `uv run --extra boto3 pytest tests/test_multistorageclient/unit/providers/test_s3_rdma.py`
- Regression on existing S3/schema unit tests: passing.
- End-to-end against a live RDMA endpoint: `python examples/rdma_roundtrip.py` (see the script header for the required cuObject runtime and environment).
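The option rules described above (mutual exclusivity with `rust_client`, the forced empty-body wire contract, and single-shot transfers) can be sketched roughly as follows. This is an illustration only, not the code in `providers/s3.py`: the helper name `resolve_s3_data_plane`, the returned keys, and the mapping of "checksums only `when_required`" onto `request_checksum_calculation` / `response_checksum_validation` are assumptions made for the example.

```python
from typing import Any, Optional


def resolve_s3_data_plane(options: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: validate sub-options and derive transfer settings.

    `options` stands in for the s3 provider options; only the `rdma` and
    `rust_client` keys come from the PR description.
    """
    rdma: Optional[dict] = options.get("rdma")  # e.g. {} enables RDMA with defaults
    rust_client: Optional[dict] = options.get("rust_client")

    # The two sub-options each swap the data plane, so only one may be set.
    if rdma is not None and rust_client is not None:
        raise ValueError("'rdma' and 'rust_client' are mutually exclusive")

    if rdma is None:
        # No RDMA: leave the existing HTTP (or Rust client) path untouched.
        return {"data_plane": "rust" if rust_client is not None else "http"}

    # RDMA enabled: the HTTP body is empty, so the payload is unsigned,
    # checksums are computed only when required, and multipart is disabled.
    return {
        "data_plane": "rdma",
        "payload_signing_enabled": False,
        "request_checksum_calculation": "when_required",  # assumed mapping
        "response_checksum_validation": "when_required",  # assumed mapping
        "multipart": False,  # single-shot put/get
    }
```

Calling `resolve_s3_data_plane({"rdma": {}})` selects the RDMA plane with the forced settings, while passing both `rdma` and `rust_client` raises `ValueError`.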
/ok to test 72f8703
Description
Add an optional S3-over-RDMA data plane to the `s3` storage provider, backed by NVIDIA cuObject (libcuobjclient). A new `rdma` provider option routes object transfers directly into/out of a registered host buffer over RDMA instead of through the HTTP body: the buffer is registered with cuObject and its RDMA descriptor is carried to the endpoint as the signed `x-amz-rdma-token` header, leaving the HTTP body empty. This offloads bulk transfer from the CPU and the HTTP/TLS path for RDMA-capable endpoints (e.g. MinIO AIStor), benefiting checkpoint and data-loading workloads.

Design notes:
- Mirrors the existing `rust_client` sub-option: it swaps only the data plane (inheriting all metadata/list/credentials/error handling) and is mutually exclusive with `rust_client`. No new provider type; `type: s3` is unchanged.
- Enabling RDMA forces the empty-body wire contract (`payload_signing_enabled=False`, checksums only `when_required`) and single-shot put/get, since multipart does not apply to a single registered-buffer transfer. Addressing style stays user-controlled via the `s3` option.
- cuObject is a C++ library, so a thin `extern "C"` shim (`providers/cuobj_shim.cpp`) wraps a process-wide `cuObjClient` and is loaded over ctypes (`providers/_cuobj.py`). The module is import-safe without the native library present; the provider only instantiates the engine when `rdma` is configured. This follows the `torch.cuda.cuobj` / `BotoCuObjClient` split used in the PyTorch cuObject checkpoint backend.

Files:
- `providers/_cuobj.py` (primitives + `CuObjEngine` control plane)
- `providers/cuobj_shim.cpp` (build instructions in the header)
- `providers/s3.py` (option wiring + `_rdma_put` / `_rdma_get`)
- `schema.py` (`rdma` option)
- `examples/rdma_roundtrip.py` (E2E)
- unit tests

Validated

`examples/rdma_roundtrip.py` with `rdma: {}` round-trips 1 / 64 / 256 MiB byte-identical over RDMA, and the `x-amz-rdma-reply` check confirms the payload moved over RDMA rather than the standard path. (Host-memory data plane.)

Checklist
- `.release_notes/.unreleased.md`