Refactor big mutex to reduce contention#917
Conversation
f6131ee to
a1a3c07
Compare
a1a3c07 to
170914e
Compare
|
Very exciting! Nice work. I gave it a quick try and found it would reliably deadlock in our setup. Full trace logs here: The flow is running under https://github.com/istio/ztunnel, opening 4 parallel iperf streams on the same connection. usually lasts for 1-2s before it starts to hang. Ill work on investigating some more later just need to shift to something else right now and wanted to give quick feedback |
|
Extremely helpful, thanks! I'm scouring the traces now, hopefully I'll have a fix. 🤓 |
167cbdf to
1c670de
Compare
|
I took a look, and I'm hopeful I've fixed it (I found a lost waker bug). Any further benchmarking is greatly appreciated! |
|
With 1c670de I get the same symptoms unfortunately. The repro is a bit annoying but: clone https://github.com/istio/ztunnel Run In another terminal: In a third terminal: usually runs for 2-3s before it dies on my machine |
|
Actually just |
1c670de to
e159e34
Compare
This comment was marked as resolved.
This comment was marked as resolved.
|
Thanks! stress_test_multiple_source does still seem to fail (or rather, hangs forever) and the iperf test still does reproduce. So seems possible one fix was made but other issues remain |
e159e34 to
d58e405
Compare
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
|
np! If it helps I am on a amd 9950x (16c, 32 threads) on linux 7.0.9 x86 |
aa2cc3c to
51155df
Compare
51155df to
b9b304a
Compare
|
Did some more review for any lost wakers or lock ordering. CI is happy, and local ztunnel is happy. Meanwhile, I'm going to try pushing up a new commit to checkout ztunnel in our CI since GH actions runners have 4 cores. I can ditch the commit once all is happy. |
|
Well, the CI job checking ztunnel test suite passed. |
This refactor puts the big internal shared mutex on a diet. The goal is to reduce contention, providing performance improvements when
h2is used on multi-threaded runtimes.Warning
This is a large internal refactor, and it hasn't yet been sufficiently tested in a production deployment. While I tried to keep the behavior exactly the same, it's possible there's subtle changes or bugs.
If you do test this, let me know. I hope to have this tested thoroughly before possibly landing.
Architectural changes
There is now unique ownership of much of the connection state, owned solely by the "connection task".
Most conceptual things were split between an owned handle and shared values such as
counts::Countsandcounts::Shared. TheStoreis now uniquely owned by the connection, and theStreams stored in it contain anArc<stream::Shared>for the parts that need to be shared with a "stream handle".Stream handle operations now do very little. For receiving, it locks the shared connection
recvbuffer, pops a frame, and unlocks.Likewise, when sending data, (or any other status-updating operation), it briefly locks a connection
pending_streamsqueue, inserts itself, and unlocks. All actual changes are written on the stream'spending_opsfield. The connection task includes a loop to briefly lock the queue, pop a stream, unlock, and then process the streams pending ops.So, while this does introduce more locks, they are finer-grained, and should only be held very briefly. Longer "work" is no longer done while holding any lock.
This should result in tasks needing to wait less time for another task that previously was holding the world-lock.
cc #531