Commit 29a4a65

Add Vis Spec tutorials. (#157)
* Initial commit.
  Signed-off-by: Hannah Troisi <htroisi@pixielabs.ai>
* Add Vis Spec tutorials.
  Signed-off-by: Hannah Troisi <htroisi@pixielabs.ai>
* TEMP
  Signed-off-by: Hannah Troisi <htroisi@pixielabs.ai>
* Latest updates
  Signed-off-by: Hannah Troisi <htroisi@pixielabs.ai>
* Remove unused image.
  Signed-off-by: Hannah Troisi <htroisi@pixielabs.ai>
* Update images.
  Signed-off-by: Hannah Troisi <htroisi@pixielabs.ai>
1 parent fc071a7 commit 29a4a65

11 files changed

Lines changed: 1299 additions & 16 deletions

content/en/04-tutorials/02-pxl-scripts/01-write-pxl-scripts/02-custom-pxl-scripts-2.md

Lines changed: 2 additions & 0 deletions
@@ -183,3 +183,5 @@ This script could be used to:

- Examine the balance of a `pod`'s incoming vs outgoing traffic.
- Investigate if `pods` under the same `service` receive a similar amount of traffic or if there is an imbalance in traffic received.

In [Tutorial #3](/tutorials/pxl-scripts/write-pxl-scripts/custom-pxl-scripts-3) we will learn how to add more visualizations for this script.

content/en/04-tutorials/02-pxl-scripts/01-write-pxl-scripts/03-custom-pxl-scripts-3.md

Lines changed: 376 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 343 additions & 0 deletions
@@ -0,0 +1,343 @@
---
title: "Tutorial #4: Add a Timeseries chart to your Vis Spec"
metaTitle: "Tutorials | PxL Scripts | Write Custom Scripts | Tutorial #4: Add a Timeseries chart to your Vis Spec"
metaDescription: ""
order: 4
---

In [Tutorial #3](/tutorials/pxl-scripts/write-pxl-scripts/custom-pxl-scripts-3) you learned how to write a Vis Spec to visualize a PxL script's query output as a table in the Live UI.

In this tutorial, we will add two time series charts to our Vis Spec:

<svg title='Live View with table and time series chart widgets.' src='pxl-scripts/first-vis-spec-7.png'/>

## Setting up the Scratch Pad

We will continue to use the Live UI's [`Scratch Pad`](/using-pixie/using-live-ui/#write-your-own-pxl-scripts-use-the-scratch-pad) to develop our scripts. Let's set it up with the first version of the PxL Script and Vis Spec we developed in [Tutorial #3](/tutorials/pxl-scripts/write-pxl-scripts/custom-pxl-scripts-3):

1. Open Pixie's [Live UI](/using-pixie/using-live-ui/).

2. Select the `Scratch Pad` script from the `script` drop-down menu in the top left.

3. Open the script editor using the keyboard shortcut: `ctrl+e` (Windows, Linux) or `cmd+e` (Mac).

4. Replace the contents of the `PxL Script` tab with the following:

```python:numbers
# Import Pixie's module for querying data
import px

def network_traffic_per_pod(start_time: str):

    # Load the `conn_stats` table into a Dataframe.
    df = px.DataFrame(table='conn_stats', start_time=start_time)

    # Each record contains contextual information that can be accessed by reading `ctx`.
    df.pod = df.ctx['pod']
    df.service = df.ctx['service']

    # Calculate connection stats for each process for each unique pod.
    df = df.groupby(['service', 'pod', 'upid']).agg(
        # The fields below are counters per UPID, so we take
        # the min (starting value) and the max (ending value) to subtract them.
        bytes_sent_min=('bytes_sent', px.min),
        bytes_sent_max=('bytes_sent', px.max),
        bytes_recv_min=('bytes_recv', px.min),
        bytes_recv_max=('bytes_recv', px.max),
    )

    # Calculate connection stats over the time window.
    df.bytes_sent = df.bytes_sent_max - df.bytes_sent_min
    df.bytes_recv = df.bytes_recv_max - df.bytes_recv_min

    # Calculate connection stats for each unique pod. Since there
    # may be multiple processes per pod we perform an additional aggregation to
    # consolidate those into one entry.
    df = df.groupby(['service', 'pod']).agg(
        bytes_sent=('bytes_sent', px.sum),
        bytes_recv=('bytes_recv', px.sum),
    )

    # Filter out connections that don't have their service identified.
    df = df[df.service != '']

    return df
```

5. Replace the contents of the `Vis Spec` tab with the following:

```json:numbers
{
  "variables": [
    {
      "name": "start_time",
      "type": "PX_STRING",
      "description": "The relative start time of the window. Current time is assumed to be now",
      "defaultValue": "-5m"
    }
  ],
  "widgets": [
    {
      "name": "Network Traffic per Pod",
      "position": {
        "x": 0,
        "y": 0,
        "w": 12,
        "h": 3
      },
      "func": {
        "name": "network_traffic_per_pod",
        "args": [
          {
            "name": "start_time",
            "variable": "start_time"
          }
        ]
      },
      "displaySpec": {
        "@type": "types.px.dev/px.vispb.Table"
      }
    }
  ],
  "globalFuncs": []
}
```

6. Make sure the script runs by clicking the `RUN` button or using the keyboard shortcut: `ctrl+enter` (Windows, Linux) or `cmd+enter` (Mac).

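To make the link between the Vis Spec's `variables` and a widget's `func.args` concrete, here is a minimal plain-Python sketch of how the argument values could be resolved. It is an illustration only, not Pixie's implementation: `resolve_args` is a hypothetical helper, the `overrides` parameter is an assumption standing in for a value supplied in place of the default, and the `vis_spec` dict is a trimmed copy of the JSON above.

```python
# Illustrative only: `resolve_args` is a hypothetical helper, not part of Pixie.
vis_spec = {
    "variables": [
        {"name": "start_time", "type": "PX_STRING", "defaultValue": "-5m"},
    ],
    "widgets": [
        {
            "name": "Network Traffic per Pod",
            "func": {
                "name": "network_traffic_per_pod",
                "args": [{"name": "start_time", "variable": "start_time"}],
            },
        },
    ],
}


def resolve_args(spec, widget, overrides=None):
    """Map each function argument to the value of the variable it references."""
    # Start from each variable's default, then apply any supplied overrides.
    values = {v["name"]: v.get("defaultValue") for v in spec["variables"]}
    values.update(overrides or {})
    return {arg["name"]: values[arg["variable"]] for arg in widget["func"]["args"]}


table_widget = vis_spec["widgets"][0]
default_args = resolve_args(vis_spec, table_widget)
custom_args = resolve_args(vis_spec, table_widget, {"start_time": "-15m"})
print(default_args)
print(custom_args)
```

With no override, the function receives the variable's `defaultValue`; with an override, it receives the supplied value instead.
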
## Adding a Function to the PxL Script

Our PxL script contains a single `network_traffic_per_pod()` function. This function calculates two values for each pod in our cluster: total bytes sent and total bytes received (for the selected time window).

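The counter arithmetic inside `network_traffic_per_pod()` can be sketched in plain Python. This is a rough illustration with made-up sample rows, not PxL: `bytes_sent_per_pod` is a hypothetical helper, and it assumes (as the script's comments state) that `bytes_sent` is a cumulative counter per UPID.

```python
from collections import defaultdict

# Made-up samples: (pod, upid, cumulative bytes_sent counter).
samples = [
    ("default/pod-a", "upid-1", 100),
    ("default/pod-a", "upid-1", 250),
    ("default/pod-a", "upid-2", 40),
    ("default/pod-a", "upid-2", 90),
    ("default/pod-b", "upid-3", 500),
    ("default/pod-b", "upid-3", 520),
]


def bytes_sent_per_pod(rows):
    # Step 1: per (pod, upid), track the smallest and largest counter value.
    extremes = {}
    for pod, upid, counter in rows:
        lo, hi = extremes.get((pod, upid), (counter, counter))
        extremes[(pod, upid)] = (min(lo, counter), max(hi, counter))

    # Step 2: max - min is the traffic of one process; sum the processes per pod.
    totals = defaultdict(int)
    for (pod, _upid), (lo, hi) in extremes.items():
        totals[pod] += hi - lo
    return dict(totals)


totals = bytes_sent_per_pod(samples)
print(totals)
```

The two steps correspond to the two `groupby` calls in the script: the first works per process, the second consolidates processes into one entry per pod.
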
In order to add time series charts to our Live View, we'll need to calculate time series data for each metric (bytes sent and bytes received). Let's add a second function to our PxL script to do that:

1. Replace the contents of the `PxL Script` tab with the following:

```python:numbers
# Import Pixie's module for querying data
import px

def network_traffic_per_pod(start_time: str):

    # Load the `conn_stats` table into a Dataframe.
    df = px.DataFrame(table='conn_stats', start_time=start_time)

    # Each record contains contextual information that can be accessed by reading `ctx`.
    df.pod = df.ctx['pod']
    df.service = df.ctx['service']

    # Calculate connection stats for each process for each unique pod.
    df = df.groupby(['service', 'pod', 'upid']).agg(
        # The fields below are counters per UPID, so we take
        # the min (starting value) and the max (ending value) to subtract them.
        bytes_sent_min=('bytes_sent', px.min),
        bytes_sent_max=('bytes_sent', px.max),
        bytes_recv_min=('bytes_recv', px.min),
        bytes_recv_max=('bytes_recv', px.max),
    )

    # Calculate connection stats over the time window.
    df.bytes_sent = df.bytes_sent_max - df.bytes_sent_min
    df.bytes_recv = df.bytes_recv_max - df.bytes_recv_min

    # Calculate connection stats for each unique pod. Since there
    # may be multiple processes per pod we perform an additional aggregation to
    # consolidate those into one entry.
    df = df.groupby(['service', 'pod']).agg(
        bytes_sent=('bytes_sent', px.sum),
        bytes_recv=('bytes_recv', px.sum),
    )

    # Filter out connections that don't have their service identified.
    df = df[df.service != '']

    return df

def network_traffic_timeseries(start_time: str):

    # Load the `conn_stats` table into a Dataframe.
    df = px.DataFrame(table='conn_stats', start_time=start_time)

    # Each record contains contextual information that can be accessed by reading `ctx`.
    df.pod = df.ctx['pod']

    # Window size to use on time_ column for bucketing.
    ns_per_s = 1000 * 1000 * 1000
    window_ns = px.DurationNanos(10 * ns_per_s)
    df.timestamp = px.bin(df.time_, window_ns)

    # Calculate connection stats for each unique pod / upid / timestamp pair.
    df = df.groupby(['pod', 'upid', 'timestamp']).agg(
        # The fields below are counters per UPID, so we take
        # the min (starting value) and the max (ending value) to subtract them.
        bytes_sent_min=('bytes_sent', px.min),
        bytes_sent_max=('bytes_sent', px.max),
        bytes_recv_min=('bytes_recv', px.min),
        bytes_recv_max=('bytes_recv', px.max),
    )

    # Calculate connection stats over the time window.
    df.bytes_sent = df.bytes_sent_max - df.bytes_sent_min
    df.bytes_recv = df.bytes_recv_max - df.bytes_recv_min

    # Calculate connection stats for each unique pod / timestamp pair. Since there
    # may be multiple processes per pod we perform an additional aggregation to
    # consolidate those into one entry.
    df = df.groupby(['pod', 'timestamp']).agg(
        bytes_sent=('bytes_sent', px.sum),
        bytes_recv=('bytes_recv', px.sum),
    )

    # The timeseries chart widget expects a `time_` column
    df.time_ = df.timestamp
    df = df.drop('timestamp')

    return df
```

> On `line 40` we define a new function called `network_traffic_timeseries()`.

> On `line 43` we create a DataFrame and populate it with data from the same [`conn_stats`](/reference/datatables/conn_stats/) telemetry data table.

> On `line 46` we use the [`ctx`](/reference/pxl/operators/dataframe.ctx.__getitem__/) function to add a `pod` column which contains the name of the pod that initiated the traced connection.

> On `lines 49-51` we use the [`bin`](/reference/pxl/udf/bin/) function to create a `timestamp` column from the `time_` column. The `timestamp` column contains the values in the `time_` column rounded down to the nearest multiple of 10 seconds.

> On `lines 54-73` we group and aggregate the connection stats according to unique pod and timestamp pairs.

> The time series chart widget expects a `time_` column in the DataFrame, so on `lines 76-77` we rename the `timestamp` column to `time_`.

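The bucketing step can be pictured with a small plain-Python sketch. It assumes only what is described above, namely that binning rounds each timestamp down to the nearest multiple of the window size; `bin_down` is a hypothetical stand-in used for illustration, not the `px.bin` implementation, and the timestamps are made up.

```python
NS_PER_S = 1000 * 1000 * 1000
WINDOW_NS = 10 * NS_PER_S  # 10-second window, as in the script


def bin_down(time_ns, window_ns=WINDOW_NS):
    """Round a nanosecond timestamp down to the nearest multiple of the window."""
    return time_ns - (time_ns % window_ns)


# Made-up timestamps in nanoseconds: 3 s, 9.5 s, 10 s and 27 s.
times = [3 * NS_PER_S, 9_500_000_000, 10 * NS_PER_S, 27 * NS_PER_S]

# Express each bucket start in seconds for readability.
buckets = [bin_down(t) // NS_PER_S for t in times]
print(buckets)
```

Records whose timestamps fall in the same 10-second window end up with the same bucket value, which is what lets the later `groupby` on `timestamp` produce one data point per pod per window.
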
## Adding the Time Series Charts to the Vis Spec

Let's add two time series chart widgets to our Vis Spec:

1. Replace the contents of the `Vis Spec` tab with the following:

```json:numbers
{
  "variables": [
    {
      "name": "start_time",
      "type": "PX_STRING",
      "description": "The relative start time of the window. Current time is assumed to be now",
      "defaultValue": "-5m"
    }
  ],
  "widgets": [
    {
      "name": "Network Traffic per Pod",
      "position": {
        "x": 0,
        "y": 0,
        "w": 12,
        "h": 3
      },
      "func": {
        "name": "network_traffic_per_pod",
        "args": [
          {
            "name": "start_time",
            "variable": "start_time"
          }
        ]
      },
      "displaySpec": {
        "@type": "types.px.dev/px.vispb.Table",
        "gutterColumn": "status"
      }
    },
    {
      "name": "Bytes Sent",
      "position": {
        "x": 0,
        "y": 3,
        "w": 6,
        "h": 3
      },
      "globalFuncOutputName": "resource_timeseries",
      "displaySpec": {
        "@type": "types.px.dev/px.vispb.TimeseriesChart",
        "timeseries": [
          {
            "value": "bytes_sent",
            "mode": "MODE_LINE",
            "series": "pod"
          }
        ],
        "title": "",
        "yAxis": {
          "label": "Bytes sent"
        },
        "xAxis": null
      }
    },
    {
      "name": "Bytes Received",
      "position": {
        "x": 6,
        "y": 3,
        "w": 6,
        "h": 3
      },
      "globalFuncOutputName": "resource_timeseries",
      "displaySpec": {
        "@type": "types.px.dev/px.vispb.TimeseriesChart",
        "timeseries": [
          {
            "value": "bytes_recv",
            "mode": "MODE_LINE",
            "series": "pod"
          }
        ],
        "title": "",
        "yAxis": {
          "label": "Bytes received"
        },
        "xAxis": null
      }
    }
  ],
  "globalFuncs": [
    {
      "outputName": "resource_timeseries",
      "func": {
        "name": "network_traffic_timeseries",
        "args": [
          {
            "name": "start_time",
            "variable": "start_time"
          }
        ]
      }
    }
  ]
}
```

> On `lines 85-96` we add our new `network_traffic_timeseries()` function to the `globalFuncs` list. This function will be used by both of the time series chart widgets that we will add next.

> On `lines 33-57` we add a new time series chart widget named "Bytes Sent":

> - The time series widget contains the same `name` and `position` fields as the table widget.

> - Instead of using the `func` field to define the function inline (as we did with the table widget), we use the `globalFuncOutputName` field to reference our global function.

> - In the `displaySpec` field we use the `timeseries` field to define the `value` and `series`. This chart will plot the `bytes_sent` values for each `pod` series.

> On `lines 58-82` we add a widget named "Bytes Received" that is identical to the "Bytes Sent" chart, but instead plots the `bytes_recv` column of values from the `resource_timeseries` global function output.

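Because both chart widgets depend on a matching `outputName` in `globalFuncs`, a mistyped name is an easy mistake to make. The following plain-Python sketch shows one way such a mismatch could be caught before running the script. It is an assumption-laden illustration: `unresolved_outputs` is a hypothetical helper, not something Pixie provides, and `vis_spec` is a trimmed copy of the JSON above.

```python
# Illustrative only: `unresolved_outputs` is a hypothetical helper, not part of Pixie.
vis_spec = {
    "widgets": [
        {"name": "Network Traffic per Pod", "func": {"name": "network_traffic_per_pod"}},
        {"name": "Bytes Sent", "globalFuncOutputName": "resource_timeseries"},
        {"name": "Bytes Received", "globalFuncOutputName": "resource_timeseries"},
    ],
    "globalFuncs": [
        {"outputName": "resource_timeseries", "func": {"name": "network_traffic_timeseries"}},
    ],
}


def unresolved_outputs(spec):
    """Return the names of widgets whose globalFuncOutputName matches no global function."""
    known = {g["outputName"] for g in spec.get("globalFuncs", [])}
    return [
        w["name"]
        for w in spec.get("widgets", [])
        if "globalFuncOutputName" in w and w["globalFuncOutputName"] not in known
    ]


missing = unresolved_outputs(vis_spec)

# The same spec with the global function removed leaves both charts dangling.
broken_spec = dict(vis_spec, globalFuncs=[])
missing_in_broken = unresolved_outputs(broken_spec)

print(missing)
print(missing_in_broken)
```

The table widget is never reported because it defines its function inline with `func` rather than referencing a global function.
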
<Alert variant="outlined" severity="info">
For a detailed description of every Vis Spec field, please refer to the <a href="https://github.com/pixie-io/pixie/blob/bdae78cc266a078e73db2d9be205fc3ce5cc823b/src/api/proto/vispb/vis.proto">Vis Spec proto</a>.
</Alert>

2. Run the script using the keyboard shortcut: `ctrl+enter` (Windows, Linux) or `cmd+enter` (Mac).

> Your Live UI output should now contain two charts in addition to the table:

<svg title='Live View with table and time series chart widgets.' src='pxl-scripts/first-vis-spec-7.png'/>

## Conclusion

Congratulations, you edited your PxL script and Vis Spec to produce time series charts in the Live UI!

Tables and time series charts are useful for visualizing your observability data, but graphs can help you make sense of what's happening with your Kubernetes applications even more quickly. In [Tutorial #5](/tutorials/pxl-scripts/write-pxl-scripts/custom-pxl-scripts-5) we'll add a graph to our Live View.
