Add storage operations to et_def.proto#216
Conversation
|
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
|
recheck |
| COMM_RECV_NODE = 6; | ||
| COMM_COLL_NODE = 7; | ||
| STORAGE_NODE = 8; | ||
| BATCH_NODE = 9; |
There was a problem hiding this comment.
qq why is the batch node separate? or is it better named as DATALOADER_NODE?
There was a problem hiding this comment.
We discussed this internally and decided that we will remove the BATCH_NODE and CHECKPOINT_NODE types by consolidating these two types under STORAGE_NODE in the following way:
We want to add either STORAGE_LEVEL_FRAMEWORK or STORAGE_LEVEL_MODEL to the StorageLevel enum.
STORAGE_LEVEL_MODEL/FRAMEWORK will have sub-types to differentiate between BATCH, CHECKPOINT, or KV-CACHE operations:
- This sub-type may be a part of the opaque
Node.attr - The sub-type may be represented in a more structured way
- This will be determined after getting feedback from all involved parties
Like STORAGE_LEVEL_DEVICE, we intend for there not to be a corresponding field in the detail oneof in StorageInfo. This implies load-bearing attributes... we can share the attributes that our replayer implementation converges on to aid in Oana's replayer development.
There was a problem hiding this comment.
I think we should have a separate node type for framework level ops like DATALOADER, CHECKPOINT, KVCACHE nodes. and STORAGE_NODE.
Reasoning being datal-oading / checkpointing could have different semantics than purely storage nodes. For example data loader node may have the batch item information and could represent some pre-processing.
I think different node types will make replay more easier as well.
Just my thought on this.
briancoutinho
left a comment
There was a problem hiding this comment.
Looks good overall, added some questions and suggestions
| // int sync_file_range(int fd, off_t offset, off_t nbytes, unsigned int flags); | ||
|
|
||
| message FsOp { | ||
| enum Op { |
There was a problem hiding this comment.
should we call it OpType or so as it sounds close to FsOp
There was a problem hiding this comment.
We agree with this change for FsOp, PageOp, and BlockOp.
| int32 fd = 3; | ||
| uint64 offset = 4; // byte offset within the file | ||
| uint64 size_bytes = 5; // bytes requested / transferred | ||
| uint32 flags = 6; |
There was a problem hiding this comment.
just wanted to ask if 32bit bit field will be sufficient for the future
There was a problem hiding this comment.
The use of 32bit fields was based on the data types of the corresponding parameters in the listed Linux syscalls. If those function signatures ever change and the capturing system uses that new version of Linux, we believe a breaking proto change is acceptable because it makes that difference in the underlying system more explicit.
There was a problem hiding this comment.
Yea I agree it is unlikely to change so should be fine
| // Two readahead variants distinguished by who triggered them. | ||
| // SYNC: cache-miss-driven during a read syscall — the user needs these | ||
| // pages now, the kernel widens the request to also prefetch ahead. | ||
| PAGE_OP_READAHEAD_SYNC = 3; // kprobe: page_cache_sync_ra |
There was a problem hiding this comment.
Is this closely tied to Linux Page cache management code today? Just curious how we will handle this on future changees. IMO should be ok to add more enums if there is a semantic change
There was a problem hiding this comment.
Yes. I would employ a similar philosophy as above; make the proto changes breaking if the underlying Linux page cache changes are breaking rather than try to maintain backwards compatibility. The tracing model is inherently tied to the underlying system, so trying to maintain proto compatibility only introduces the possibility for subtle bugs down the line.
There was a problem hiding this comment.
(Resolved, just getting an error when I try to resolve it)
| string device = 6; // Tensor object device location. | ||
| } | ||
|
|
||
| // ============================================================================= |
There was a problem hiding this comment.
How about moving the storage structure ops in storage.proto and include it in this file?
##Summary
Extends the Chakra execution-trace schema (schema/protobuf/et_def.proto) with a typed model for storage operations so traces can represent I/O alongside compute and communication nodes.