Our current volatile APIs do not let people write semantically correct MMIO with a DMA device. The interaction with such a device typically looks as follows:
- set up some data somewhere in regular memory using regular stores
- emit a kind of release fence
- write a single value to an MMIO register to tell the device that the data is ready
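With today's APIs, the steps above might be sketched roughly as follows. This is only an illustration: `submit_with_plain_volatile`, `DOORBELL_READY`, and `demo` are hypothetical names, and an ordinary local variable stands in for the MMIO register so that the snippet can run without hardware.

```rust
use std::ptr;
use std::sync::atomic::{fence, Ordering};

/// Hypothetical doorbell value; a real device defines its own protocol.
const DOORBELL_READY: u32 = 1;

/// Fills `buf`, emits a release fence, then rings the doorbell.
///
/// # Safety
/// `doorbell` must be valid for a 4-byte volatile write.
unsafe fn submit_with_plain_volatile(buf: &mut [u8], doorbell: *mut u32) {
    // 1. Set up the data using regular (non-volatile, non-atomic) stores.
    for (i, b) in buf.iter_mut().enumerate() {
        *b = i as u8;
    }
    // 2. Emit a release fence.
    fence(Ordering::Release);
    // 3. Tell the device that the data is ready. This store is volatile but
    //    not atomic, so the fence above has no atomic write to attach to.
    unsafe { ptr::write_volatile(doorbell, DOORBELL_READY) };
}

/// Runs the sequence against a stand-in "register" (assumption: a plain local).
fn demo() -> ([u8; 4], u32) {
    let mut buf = [0u8; 4];
    let mut fake_register = 0u32;
    // SAFETY: `fake_register` is a live local, valid for a 4-byte write.
    unsafe { submit_with_plain_volatile(&mut buf, &mut fake_register) };
    (buf, fake_register)
}
```

In a single-threaded run like `demo` the result looks right, which is part of why the problem is easy to miss: the concern is about what a device reading the buffer concurrently is guaranteed to see.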
If the final write is just volatile but not atomic, then the data writes and the MMIO write are not guaranteed to happen in any particular order in the real (concrete) execution. A volatile write is observably-ordered wrt all other observable events (volatile accesses and I/O), but not observably-ordered wrt non-volatile memory accesses. A release fence only does something when there's a data write before the fence and an atomic write after the fence. So to get the desired semantics, we need to make that write atomic.
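For comparison, here is a minimal sketch of the case where a release fence does have an effect: a data write before the fence and an atomic write after it. It models the "device" as a second thread and the doorbell as an `AtomicU32`. Both are assumptions made purely for illustration; the names `Shared`, `produce`, `consume`, and `handoff` are hypothetical. A real DMA device is not a thread, and a plain atomic store is not volatile, so this is not a substitute for the operation proposed here.

```rust
use std::cell::UnsafeCell;
use std::hint::spin_loop;
use std::sync::atomic::{fence, AtomicU32, Ordering};
use std::thread;

/// Hypothetical shared state: a data buffer plus a doorbell flag.
struct Shared {
    data: UnsafeCell<[u8; 4]>,
    doorbell: AtomicU32,
}

// SAFETY: `data` is written only before the doorbell store and read only
// after the doorbell has been observed, so accesses never overlap.
unsafe impl Sync for Shared {}

impl Shared {
    fn produce(&self) {
        // Data write before the fence (a regular, non-atomic store).
        unsafe { *self.data.get() = [1, 2, 3, 4] };
        fence(Ordering::Release);
        // Atomic write after the fence: this is what the fence pairs with.
        self.doorbell.store(1, Ordering::Relaxed);
    }

    fn consume(&self) -> [u8; 4] {
        // Reading the doorbell value with acquire ordering synchronizes
        // with the release fence in `produce`.
        while self.doorbell.load(Ordering::Acquire) == 0 {
            spin_loop();
        }
        // SAFETY: the producer's data write happens-before this read.
        unsafe { *self.data.get() }
    }
}

/// Runs producer and consumer on two scoped threads.
fn handoff() -> [u8; 4] {
    let shared = Shared {
        data: UnsafeCell::new([0; 4]),
        doorbell: AtomicU32::new(0),
    };
    thread::scope(|s| {
        s.spawn(|| shared.produce());
        s.spawn(|| shared.consume()).join().unwrap()
    })
}
```

If the doorbell store in `produce` were a non-atomic volatile write instead, the memory model would give no such guarantee, which is the gap described above.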
Note that here I am using "atomic" in the sense of the C++ memory model. This is separate from the question of whether the write can tear or not. An atomic operation must be compiled in a way that avoids tearing, but the converse does not hold -- an operation can be non-torn while still also being non-atomic.
Another way to view this: a volatile store is basically a syscall, so the compiler treats it kind of like an opaque function call -- except that the compiler makes a lot of assumptions about the possible side-effects of that call: it cannot read or write any Rust-accessible memory except for the locations that the store overwrites. It also cannot read or write the "synchronization state" of the current thread, i.e. the set of memory events that that thread has observed or that can participate in fence-based synchronization. (In LLVM terms, volatile accesses are `nosync`.) However, to explain what happens in the DMA setting, we want the volatile MMIO write to interact with the synchronization state (specifically, the set of memory events in the "release fence" state of the current thread is made available to anyone reading from this volatile write). We need to tell the compiler that this operation may synchronize (i.e., we need to get rid of the `nosync`). The way we typically do this is by making an operation atomic, so we should have atomic volatile accesses.
So, I propose that t-opsem should decide that Rust shall have atomic volatile accesses: operations that are both observable (i.e., like I/O) and participate in the concurrency memory model. The exact API is left to t-libs-api; there is an ACP for it at rust-lang/libs-team#801.