Commit a018b87

8.6 Support HOTKEYS (#3008)
* propose API for HOTKEYS
* stubs
* untested stab at message impl
* untested result processor
* basic integration tests
* more integration tests
* release notes and [Experimental]
* github link
* sample ration default is 1, not zero
* - RESP3
  - don't expose raw arrays
  - expose API-shaped ms/us accessors
  - reuse shared all-slots array
* validate/fix cluster slot filter
* validate duration
* docs; more tests and compensation
* make SharedAllSlots lazy; explicitly track empty cpu/network/slots
* More docs
* "wow"?
* more words
* update meaning of count
* expose a bunch of values that are conditionally present
* tests on the sampled/slot-filtered metrics
* - naming in HotKeysResult
  - prefer Nullable<T> when not-always-present
* pre-empt typo fix
* CI: use internal 8.6 preview build
* additional validation on conditional members
* CI image update
* stabilize CI for Windows Server
* be explicit about per-protocol/collection on cluster
1 parent 0aa6005 commit a018b87

18 files changed

Lines changed: 1148 additions & 8 deletions

docs/HotKeys.md

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+Hot Keys
+===
+
+The `HOTKEYS` command allows for server-side profiling of CPU and network usage by key. It is available in Redis 8.6 and later.
+
+This command is available via the `IServer.HotKeys*` methods:
+
+``` c#
+// Get the server instance.
+IConnectionMultiplexer muxer = ... // connect to Redis 8.6 or later
+var server = muxer.GetServer(endpoint); // or muxer.GetServer(key)
+
+// Start the capture; you can specify a duration, or manually use the HotKeysStop[Async] method; specifying
+// a duration is recommended, so that the profiler will not be left running in the case of failure.
+// Optional parameters allow you to specify the metrics to capture, the sample ratio, and the key slots to include;
+// by default, all metrics are captured, every command is sampled, and all key slots are included.
+await server.HotKeysStartAsync(duration: TimeSpan.FromSeconds(30));
+
+// Now either do some work ourselves, or wait for some other activity to happen:
+await Task.Delay(TimeSpan.FromSeconds(35)); // whatever happens: happens
+
+// Fetch the results; note that this does not stop the capture, and you can fetch the results multiple times
+// either while it is running, or after it has completed - but only a single capture can be active at a time.
+var result = await server.HotKeysGetAsync();
+
+// ...investigate the results...
+
+// Optional: discard the active capture data at the server, if any.
+await server.HotKeysResetAsync();
+```
+
+The `HotKeysResult` class (our `result` value above) contains the following properties:
+
+- `Metrics`: The metrics captured during this profiling session.
+- `TrackingActive`: Indicates whether the capture is currently active.
+- `SampleRatio`: Profiling frequency; effectively: measure every Nth command. (also: `IsSampled`)
+- `SelectedSlots`: The key slots active for this profiling session.
+- `CollectionStartTime`: The start time of the capture.
+- `CollectionDuration`: The duration of the capture.
+- `AllCommandsAllSlotsTime`: The total CPU time measured for all commands in all slots, without any sampling or filtering applied.
+- `AllCommandsAllSlotsNetworkBytes`: The total network usage measured for all commands in all slots, without any sampling or filtering applied.
+
+When slot filtering is used, the following properties are also available:
+
+- `AllCommandsSelectedSlotsTime`: The total CPU time measured for all commands in the selected slots.
+- `AllCommandsSelectedSlotsNetworkBytes`: The total network usage measured for all commands in the selected slots.
+
+When slot filtering *and* sampling are used, the following properties are also available:
+
+- `SampledCommandsSelectedSlotsTime`: The total CPU time measured for the sampled commands in the selected slots.
+- `SampledCommandsSelectedSlotsNetworkBytes`: The total network usage measured for the sampled commands in the selected slots.
+
+If CPU metrics were captured, the following properties are also available:
+
+- `TotalCpuTimeUser`: The total user CPU time measured in the profiling session.
+- `TotalCpuTimeSystem`: The total system CPU time measured in the profiling session.
+- `TotalCpuTime`: The total CPU time measured in the profiling session.
+- `CpuByKey`: Hot keys, as measured by CPU activity; for each:
+  - `Key`: The key observed.
+  - `Duration`: The time taken.
+
+If network metrics were captured, the following properties are also available:
+
+- `TotalNetworkBytes`: The total network data measured in the profiling session.
+- `NetworkBytesByKey`: Hot keys, as measured by network activity; for each:
+  - `Key`: The key observed.
+  - `Bytes`: The network activity, in bytes.
+
+Note: to use slot-based filtering, you must be connected to a Redis Cluster instance. The
+`IConnectionMultiplexer.HashSlot(RedisKey)` method can be used to determine the slot for a given key. The key
+can also be used in place of an endpoint when using `GetServer(...)` to get the `IServer` instance for a given key.
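
The note above refers to key slots. As a rough illustration of what a slot is, the sketch below computes one the way the Redis Cluster specification describes it: CRC-16/XMODEM of the key (or of its `{hash tag}`, if one is present) modulo 16384. `HashSlotSketch` and its members are hypothetical names used only for this example. It is not the library's implementation, and real code should call `IConnectionMultiplexer.HashSlot(RedisKey)` as the documentation says.

```csharp
using System;
using System.Text;

// Illustrative only: HashSlotSketch is a hypothetical helper, not part of StackExchange.Redis.
public static class HashSlotSketch
{
    private const int SlotCount = 16384; // cluster slots are numbered 0..16383

    // CRC-16/XMODEM: polynomial 0x1021, initial value 0, no bit reflection.
    public static ushort Crc16(ReadOnlySpan<byte> data)
    {
        ushort crc = 0;
        foreach (byte b in data)
        {
            crc ^= (ushort)(b << 8);
            for (int bit = 0; bit < 8; bit++)
            {
                // Shift left by one; fold in the polynomial when the top bit was set.
                crc = (crc & 0x8000) != 0
                    ? unchecked((ushort)((crc << 1) ^ 0x1021))
                    : unchecked((ushort)(crc << 1));
            }
        }
        return crc;
    }

    public static int HashSlot(string key)
    {
        ReadOnlySpan<byte> span = Encoding.UTF8.GetBytes(key);

        // Hash tags: if the key contains "{...}" with a non-empty body, only the
        // bytes between the first '{' and the next '}' are hashed.
        int open = span.IndexOf((byte)'{');
        if (open >= 0)
        {
            int close = span.Slice(open + 1).IndexOf((byte)'}');
            if (close > 0) span = span.Slice(open + 1, close);
        }
        return Crc16(span) % SlotCount;
    }
}
```

`HashSlotSketch.Crc16` returns `0x31C3` for the ASCII bytes of `123456789`, the standard check value for this CRC. Keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, map to the same slot.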

docs/ReleaseNotes.md

Lines changed: 3 additions & 3 deletions
@@ -6,12 +6,12 @@ Current package versions:
 | ------------ | ----------------- | ----- |
 | [![StackExchange.Redis](https://img.shields.io/nuget/v/StackExchange.Redis.svg)](https://www.nuget.org/packages/StackExchange.Redis/) | [![StackExchange.Redis](https://img.shields.io/nuget/vpre/StackExchange.Redis.svg)](https://www.nuget.org/packages/StackExchange.Redis/) | [![StackExchange.Redis MyGet](https://img.shields.io/myget/stackoverflow/vpre/StackExchange.Redis.svg)](https://www.myget.org/feed/stackoverflow/package/nuget/StackExchange.Redis) |
 
-## unreleased
-
+## 2.11.unreleased
 
+- Add support for `HOTKEYS` ([#3008 by mgravell](https://github.com/StackExchange/StackExchange.Redis/pull/3008))
 - Add support for keyspace notifications ([#2995 by mgravell](https://github.com/StackExchange/StackExchange.Redis/pull/2995))
+- Add support for idempotent stream entry (`XADD IDMP[AUTO]`) ([#3006 by mgravell](https://github.com/StackExchange/StackExchange.Redis/pull/3006))
 - (internals) split AMR out to a separate options provider ([#2986 by NickCraver and philon-msft](https://github.com/StackExchange/StackExchange.Redis/pull/2986))
-- Implement idempotent stream entry (IDMP) support ([#3006 by mgravell](https://github.com/StackExchange/StackExchange.Redis/pull/3006))
 
 ## 2.10.14
 
docs/index.md

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ Documentation
 - [Events](Events) - the events available for logging / information purposes
 - [Pub/Sub Message Order](PubSubOrder) - advice on sequential and concurrent processing
 - [Pub/Sub Key Notifications](KeyspaceNotifications) - how to use keyspace and keyevent notifications
+- [Hot Keys](HotKeys) - how to use `HOTKEYS` profiling
 - [Using RESP3](Resp3) - information on using RESP3
 - [ServerMaintenanceEvent](ServerMaintenanceEvent) - how to listen and prepare for hosted server maintenance (e.g. Azure Cache for Redis)
 - [Streams](Streams) - how to use the Stream data type

src/StackExchange.Redis/ClusterConfiguration.cs

Lines changed: 5 additions & 0 deletions
@@ -45,6 +45,11 @@ private SlotRange(short from, short to)
         /// </summary>
         public int To => to;
 
+        internal const int MinSlot = 0, MaxSlot = 16383;
+
+        private static SlotRange[]? s_SharedAllSlots;
+        internal static SlotRange[] SharedAllSlots => s_SharedAllSlots ??= [new(MinSlot, MaxSlot)];
+
         /// <summary>
         /// Indicates whether two ranges are not equal.
         /// </summary>
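
The `SharedAllSlots` property in the hunk above allocates its single-element array on first use and then reuses it. The same lazy `??=` pattern in isolation might look like this; `LazySharedSketch` is a hypothetical name, and a plain `int[]` holding the minimum and maximum slot stands in for `SlotRange[]`.

```csharp
#nullable enable

// Illustrative only: LazySharedSketch is a hypothetical stand-in, not library code.
public static class LazySharedSketch
{
    private static int[]? s_shared; // stays null until the first read of Shared

    // The first access allocates; later accesses return the cached instance.
    public static int[] Shared => s_shared ??= new[] { 0, 16383 };
}
```

Because `??=` is not atomic, two threads racing on the first access may each allocate an array and briefly observe different instances. That is harmless when the contents are identical and treated as read-only.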

src/StackExchange.Redis/CommandMap.cs

Lines changed: 2 additions & 2 deletions
@@ -41,7 +41,7 @@ public sealed class CommandMap
 
                 RedisCommand.BGREWRITEAOF, RedisCommand.BGSAVE, RedisCommand.CLIENT, RedisCommand.CLUSTER, RedisCommand.CONFIG, RedisCommand.DBSIZE,
                 RedisCommand.DEBUG, RedisCommand.FLUSHALL, RedisCommand.FLUSHDB, RedisCommand.INFO, RedisCommand.LASTSAVE, RedisCommand.MONITOR, RedisCommand.REPLICAOF,
-                RedisCommand.SAVE, RedisCommand.SHUTDOWN, RedisCommand.SLAVEOF, RedisCommand.SLOWLOG, RedisCommand.SYNC, RedisCommand.TIME,
+                RedisCommand.SAVE, RedisCommand.SHUTDOWN, RedisCommand.SLAVEOF, RedisCommand.SLOWLOG, RedisCommand.SYNC, RedisCommand.TIME, RedisCommand.HOTKEYS,
             });
 
         /// <summary>
@@ -65,7 +65,7 @@ public sealed class CommandMap
 
                 RedisCommand.BGREWRITEAOF, RedisCommand.BGSAVE, RedisCommand.CLIENT, RedisCommand.CLUSTER, RedisCommand.CONFIG, RedisCommand.DBSIZE,
                 RedisCommand.DEBUG, RedisCommand.FLUSHALL, RedisCommand.FLUSHDB, RedisCommand.INFO, RedisCommand.LASTSAVE, RedisCommand.MONITOR, RedisCommand.REPLICAOF,
-                RedisCommand.SAVE, RedisCommand.SHUTDOWN, RedisCommand.SLAVEOF, RedisCommand.SLOWLOG, RedisCommand.SYNC, RedisCommand.TIME,
+                RedisCommand.SAVE, RedisCommand.SHUTDOWN, RedisCommand.SLAVEOF, RedisCommand.SLOWLOG, RedisCommand.SYNC, RedisCommand.TIME, RedisCommand.HOTKEYS,
 
                 // supported by envoy but not enabled by stack exchange
                 // RedisCommand.BITFIELD,

src/StackExchange.Redis/Enums/RedisCommand.cs

Lines changed: 2 additions & 0 deletions
@@ -81,6 +81,7 @@ internal enum RedisCommand
     HLEN,
     HMGET,
     HMSET,
+    HOTKEYS,
     HPERSIST,
     HPEXPIRE,
     HPEXPIREAT,
@@ -432,6 +433,7 @@ internal static bool IsPrimaryOnly(this RedisCommand command)
             case RedisCommand.HKEYS:
             case RedisCommand.HLEN:
             case RedisCommand.HMGET:
+            case RedisCommand.HOTKEYS:
             case RedisCommand.HPEXPIRETIME:
             case RedisCommand.HPTTL:
             case RedisCommand.HRANDFIELD:
Lines changed: 217 additions & 0 deletions
@@ -0,0 +1,217 @@
+namespace StackExchange.Redis;
+
+public sealed partial class HotKeysResult
+{
+    internal static readonly ResultProcessor<HotKeysResult?> Processor = new HotKeysResultProcessor();
+
+    private sealed class HotKeysResultProcessor : ResultProcessor<HotKeysResult?>
+    {
+        protected override bool SetResultCore(PhysicalConnection connection, Message message, in RawResult result)
+        {
+            if (result.IsNull)
+            {
+                SetResult(message, null);
+                return true;
+            }
+
+            // an array with a single element that *is* an array/map that is the results
+            if (result is { Resp2TypeArray: ResultType.Array, ItemsCount: 1 })
+            {
+                ref readonly RawResult inner = ref result[0];
+                if (inner is { Resp2TypeArray: ResultType.Array, IsNull: false })
+                {
+                    var hotKeys = new HotKeysResult(in inner);
+                    SetResult(message, hotKeys);
+                    return true;
+                }
+            }
+
+            return false;
+        }
+    }
+
+    private HotKeysResult(in RawResult result)
+    {
+        var metrics = HotKeysMetrics.None; // we infer this from the keys present
+        var iter = result.GetItems().GetEnumerator();
+        while (iter.MoveNext())
+        {
+            ref readonly RawResult key = ref iter.Current;
+            if (!iter.MoveNext()) break; // lies about the length!
+            ref readonly RawResult value = ref iter.Current;
+            var hash = key.Payload.Hash64();
+            long i64;
+            switch (hash)
+            {
+                case tracking_active.Hash when tracking_active.Is(hash, key):
+                    TrackingActive = value.GetBoolean();
+                    break;
+                case sample_ratio.Hash when sample_ratio.Is(hash, key) && value.TryGetInt64(out i64):
+                    SampleRatio = i64;
+                    break;
+                case selected_slots.Hash when selected_slots.Is(hash, key) & value.Resp2TypeArray is ResultType.Array:
+                    var len = value.ItemsCount;
+                    if (len == 0)
+                    {
+                        _selectedSlots = [];
+                        continue;
+                    }
+
+                    var items = value.GetItems().GetEnumerator();
+                    var slots = len == 1 ? null : new SlotRange[len];
+                    for (int i = 0; i < len && items.MoveNext(); i++)
+                    {
+                        ref readonly RawResult pair = ref items.Current;
+                        if (pair.Resp2TypeArray is ResultType.Array)
+                        {
+                            long from = -1, to = -1;
+                            switch (pair.ItemsCount)
+                            {
+                                case 1 when pair[0].TryGetInt64(out from):
+                                    to = from; // single slot
+                                    break;
+                                case 2 when pair[0].TryGetInt64(out from) && pair[1].TryGetInt64(out to):
+                                    break;
+                            }
+
+                            if (from < SlotRange.MinSlot)
+                            {
+                                // skip invalid ranges
+                            }
+                            else if (len == 1 & from == SlotRange.MinSlot & to == SlotRange.MaxSlot)
+                            {
+                                // this is the "normal" case when no slot filter was applied
+                                slots = SlotRange.SharedAllSlots; // avoid the alloc
+                            }
+                            else
+                            {
+                                slots ??= new SlotRange[len];
+                                slots[i] = new((int)from, (int)to);
+                            }
+                        }
+                    }
+                    _selectedSlots = slots;
+                    break;
+                case all_commands_all_slots_us.Hash when all_commands_all_slots_us.Is(hash, key) && value.TryGetInt64(out i64):
+                    AllCommandsAllSlotsMicroseconds = i64;
+                    break;
+                case all_commands_selected_slots_us.Hash when all_commands_selected_slots_us.Is(hash, key) && value.TryGetInt64(out i64):
+                    AllCommandSelectedSlotsMicroseconds = i64;
+                    break;
+                case sampled_command_selected_slots_us.Hash when sampled_command_selected_slots_us.Is(hash, key) && value.TryGetInt64(out i64):
+                case sampled_commands_selected_slots_us.Hash when sampled_commands_selected_slots_us.Is(hash, key) && value.TryGetInt64(out i64):
+                    SampledCommandsSelectedSlotsMicroseconds = i64;
+                    break;
+                case net_bytes_all_commands_all_slots.Hash when net_bytes_all_commands_all_slots.Is(hash, key) && value.TryGetInt64(out i64):
+                    AllCommandsAllSlotsNetworkBytes = i64;
+                    break;
+                case net_bytes_all_commands_selected_slots.Hash when net_bytes_all_commands_selected_slots.Is(hash, key) && value.TryGetInt64(out i64):
+                    NetworkBytesAllCommandsSelectedSlotsRaw = i64;
+                    break;
+                case net_bytes_sampled_commands_selected_slots.Hash when net_bytes_sampled_commands_selected_slots.Is(hash, key) && value.TryGetInt64(out i64):
+                    NetworkBytesSampledCommandsSelectedSlotsRaw = i64;
+                    break;
+                case collection_start_time_unix_ms.Hash when collection_start_time_unix_ms.Is(hash, key) && value.TryGetInt64(out i64):
+                    CollectionStartTimeUnixMilliseconds = i64;
+                    break;
+                case collection_duration_ms.Hash when collection_duration_ms.Is(hash, key) && value.TryGetInt64(out i64):
+                    CollectionDurationMicroseconds = i64 * 1000; // ms vs us is in question: support both, and abstract it from the caller
+                    break;
+                case collection_duration_us.Hash when collection_duration_us.Is(hash, key) && value.TryGetInt64(out i64):
+                    CollectionDurationMicroseconds = i64;
+                    break;
+                case total_cpu_time_sys_ms.Hash when total_cpu_time_sys_ms.Is(hash, key) && value.TryGetInt64(out i64):
+                    metrics |= HotKeysMetrics.Cpu;
+                    TotalCpuTimeSystemMicroseconds = i64 * 1000; // ms vs us is in question: support both, and abstract it from the caller
+                    break;
+                case total_cpu_time_sys_us.Hash when total_cpu_time_sys_us.Is(hash, key) && value.TryGetInt64(out i64):
+                    metrics |= HotKeysMetrics.Cpu;
+                    TotalCpuTimeSystemMicroseconds = i64;
+                    break;
+                case total_cpu_time_user_ms.Hash when total_cpu_time_user_ms.Is(hash, key) && value.TryGetInt64(out i64):
+                    metrics |= HotKeysMetrics.Cpu;
+                    TotalCpuTimeUserMicroseconds = i64 * 1000; // ms vs us is in question: support both, and abstract it from the caller
+                    break;
+                case total_cpu_time_user_us.Hash when total_cpu_time_user_us.Is(hash, key) && value.TryGetInt64(out i64):
+                    metrics |= HotKeysMetrics.Cpu;
+                    TotalCpuTimeUserMicroseconds = i64;
+                    break;
+                case total_net_bytes.Hash when total_net_bytes.Is(hash, key) && value.TryGetInt64(out i64):
+                    metrics |= HotKeysMetrics.Network;
+                    TotalNetworkBytesRaw = i64;
+                    break;
+                case by_cpu_time_us.Hash when by_cpu_time_us.Is(hash, key) & value.Resp2TypeArray is ResultType.Array:
+                    metrics |= HotKeysMetrics.Cpu;
+                    len = value.ItemsCount / 2;
+                    if (len == 0)
+                    {
+                        _cpuByKey = [];
+                        continue;
+                    }
+
+                    var cpuTime = new MetricKeyCpu[len];
+                    items = value.GetItems().GetEnumerator();
+                    for (int i = 0; i < len && items.MoveNext(); i++)
+                    {
+                        var metricKey = items.Current.AsRedisKey();
+                        if (items.MoveNext() && items.Current.TryGetInt64(out var metricValue))
+                        {
+                            cpuTime[i] = new(metricKey, metricValue);
+                        }
+                    }
+
+                    _cpuByKey = cpuTime;
+                    break;
+                case by_net_bytes.Hash when by_net_bytes.Is(hash, key) & value.Resp2TypeArray is ResultType.Array:
+                    metrics |= HotKeysMetrics.Network;
+                    len = value.ItemsCount / 2;
+                    if (len == 0)
+                    {
+                        _networkBytesByKey = [];
+                        continue;
+                    }
+
+                    var netBytes = new MetricKeyBytes[len];
+                    items = value.GetItems().GetEnumerator();
+                    for (int i = 0; i < len && items.MoveNext(); i++)
+                    {
+                        var metricKey = items.Current.AsRedisKey();
+                        if (items.MoveNext() && items.Current.TryGetInt64(out var metricValue))
+                        {
+                            netBytes[i] = new(metricKey, metricValue);
+                        }
+                    }
+
+                    _networkBytesByKey = netBytes;
+                    break;
+            } // switch
+        } // while
+        Metrics = metrics;
+    }
+
+#pragma warning disable SA1134, SA1300
+    // ReSharper disable InconsistentNaming
+    [FastHash] internal static partial class tracking_active { }
+    [FastHash] internal static partial class sample_ratio { }
+    [FastHash] internal static partial class selected_slots { }
+    [FastHash] internal static partial class all_commands_all_slots_us { }
+    [FastHash] internal static partial class all_commands_selected_slots_us { }
+    [FastHash] internal static partial class sampled_command_selected_slots_us { }
+    [FastHash] internal static partial class sampled_commands_selected_slots_us { }
+    [FastHash] internal static partial class net_bytes_all_commands_all_slots { }
+    [FastHash] internal static partial class net_bytes_all_commands_selected_slots { }
+    [FastHash] internal static partial class net_bytes_sampled_commands_selected_slots { }
+    [FastHash] internal static partial class collection_start_time_unix_ms { }
+    [FastHash] internal static partial class collection_duration_ms { }
+    [FastHash] internal static partial class collection_duration_us { }
+    [FastHash] internal static partial class total_cpu_time_user_ms { }
+    [FastHash] internal static partial class total_cpu_time_user_us { }
+    [FastHash] internal static partial class total_cpu_time_sys_ms { }
+    [FastHash] internal static partial class total_cpu_time_sys_us { }
+    [FastHash] internal static partial class total_net_bytes { }
+    [FastHash] internal static partial class by_cpu_time_us { }
+    [FastHash] internal static partial class by_net_bytes { }
+
+    // ReSharper restore InconsistentNaming
+#pragma warning restore SA1134, SA1300
+}
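
The constructor above walks the reply as alternating name/value items and accepts both millisecond and microsecond spellings of a field, exposing a single microsecond value to the caller. A minimal standalone sketch of that idea follows. `PairwiseReplySketch`, the `object` list standing in for `RawResult`, and the hyphenated field names are assumptions made for illustration; they are not the library's types or confirmed wire names.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: these names are hypothetical and do not exist in StackExchange.Redis.
public static class PairwiseReplySketch
{
    // Walks a flat [name, value, name, value, ...] list and returns the collection
    // duration in microseconds, or null if neither spelling of the field is present.
    public static long? ReadCollectionDurationMicroseconds(IReadOnlyList<object> flat)
    {
        long? micros = null;
        for (int i = 0; i + 1 < flat.Count; i += 2) // a trailing unpaired item is ignored
        {
            // Skip pairs that are not (string name, long value).
            if (flat[i] is not string name || flat[i + 1] is not long value) continue;

            switch (name)
            {
                case "collection-duration-ms": // assumed field name
                    micros = value * 1000;     // normalize milliseconds to microseconds
                    break;
                case "collection-duration-us": // assumed field name
                    micros = value;
                    break;
            }
        }
        return micros;
    }
}
```

Passing `new object[] { "collection-duration-ms", 30L }` yields `30000`, and a list containing neither field yields `null`. Unknown names are skipped, which mirrors how the `switch` above ignores keys it does not recognize.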
