Skip to content

Commit 444f303

Browse files
Jonathan Woollett-LightJonathanWoollett-Light
Jonathan Woollett-Light
authored andcommitted
docs: Tracing doc
Add `tracing.md` to document how to use the instrumentation to trace the Firecracker process. Signed-off-by: Jonathan Woollett-Light <[email protected]>
1 parent 8a297ad commit 444f303

File tree

1 file changed

+264
-0
lines changed

1 file changed

+264
-0
lines changed

docs/tracing.md

Lines changed: 264 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,264 @@
1+
# Tracing
2+
3+
## Introduction
4+
5+
Firecracker implements a framework for instrumentation based tracing.
6+
7+
Instrumentation based tracing as defined by
8+
[Sheng Liang on usenix.org](https://www.usenix.org/legacy/publications/library/proceedings/coots99/full_papers/liang/liang_html/node9.html):
9+
10+
> There are two ways to obtain profiling information: either statistical
11+
> sampling or code instrumentation. Statistical sampling is less
12+
> disruptive to program execution, but cannot provide completely
13+
> accurate information. Code instrumentation, on the other hand, may be
14+
> more disruptive, but allows the profiler to record all the events it
15+
> is interested in. Specifically in CPU time profiling, statistical
16+
> sampling may reveal, for example, the relative percentage of time
17+
> spent in frequently-called methods, whereas code instrumentation can
18+
> report the exact number of time each method is invoked.
19+
20+
The focus with tracing in Firecracker is to improve debug-ability.
21+
22+
Enabling tracing adds logs output on each functions entry and exit.
23+
This assists debugging problems that relate to deadlocks or high
24+
latencies by quickly identifying elongated function calls.
25+
26+
## Implementation
27+
28+
Firecracker implements instrumentation based tracing via
29+
[`log`](https://github.com/rust-lang/log) and
30+
[`log_instrument`](https://github.com/JonathanWoollett-Light/log-instrument),
31+
outputting a `Trace` level log when entering and exiting every function.
32+
33+
It is disabled by default at compile-time. Tracing functionality has no
34+
impact on the release binary.
35+
36+
To enable tracing in Firecracker, add instrumentation with:
37+
38+
```
39+
clippy-tracing \
40+
--action fix \
41+
--path ./src \
42+
--exclude vmm_config/logger.rs,virtio/gen,bindings.rs,net/gen,benches,logger/,signal_handler.rs,time.rs
43+
```
44+
45+
and re-compile with `--features tracing`:
46+
47+
```
48+
cargo build --features tracing
49+
```
50+
51+
This will result in an increase in the binary size (~100kb) and a
52+
significant regression in performance (>10x). To mitigate the
53+
performance impact you can filter the tracing output as described in the
54+
next section.
55+
56+
## Filtering
57+
58+
You can filter tracing output both at run-time and compile-time.
59+
This can be used to mitigate the performance impact of logging many
60+
traces.
61+
62+
Run-time filtering is implemented with the `/logger` API call, this can
63+
significantly mitigate the impact on execution time but cannot mitigate
64+
the impact on memory usage. Execution time impact is mitigated by
65+
avoiding constructing and writing the trace log, it still needs to check
66+
the condition in every place it would output a log. Memory usage impact
67+
is not mitigated as the instrumentation remains in the binary unchanged.
68+
69+
Compile-time filtering is a manual process using the
70+
[`clippy-tracing`](https://github.com/JonathanWoollett-Light/clippy-tracing)
71+
tool. This can almost entirely mitigate the impact on execution time and
72+
the impact on memory usage.
73+
74+
### Run-time
75+
76+
You can filter by module path and/or file path at runtime, e.g.:
77+
78+
```bash
79+
curl -X PUT --unix-socket "${API_SOCKET}" \
80+
--data "{
81+
\"level\": \"Trace\",
82+
\"module\": \"api_server::request\",
83+
}" \
84+
"http://localhost/logger"
85+
```
86+
87+
Instrumentation logs are `Trace` level logs, at runtime the level must
88+
be set to `Trace` to see them. The module filter applied here ensures
89+
only logs from the `request` modules within the `api_server` crate will
90+
be output.
91+
92+
This will mitigate most of the performance regression.
93+
94+
### Compile-time
95+
96+
Specific environments can restrict run-time configuration. In these
97+
environments it becomes necessary to support targeted tracing without
98+
run-time re-configuration, for this compile-time filtering must be used.
99+
100+
To reproduce the same filtering as run-time at compile-time, you can use
101+
[`clippy-tracing`](../src/clippy-tracing)
102+
at compile-time like:
103+
104+
```bash
105+
# Remove all instrumentation.
106+
clippy-tracing --action strip --path ./src
107+
# Adds instrumentation to the specific file/s.
108+
clippy-tracing --action fix --path ./src/api_server/src/request
109+
# Build Firecracker.
110+
cargo build --features tracing
111+
```
112+
113+
Then at run-time:
114+
115+
```bash
116+
curl -X PUT --unix-socket "${API_SOCKET}" \
117+
--data "{
118+
\"level\": \"Trace\",
119+
}" \
120+
"http://localhost/logger"
121+
```
122+
123+
The instrumentation has been stripped from all files other than those at
124+
`./src/api_server/src/request` so we do not need to apply a run-time
125+
filter. Runtime filtering could be applied but in this case it yields no
126+
additional benefit.
127+
128+
## Example
129+
130+
In this example we start Firecracker with tracing then make a simple API
131+
call.
132+
133+
### API call
134+
135+
```
136+
~/Projects/firecracker$ sudo curl -X GET --unix-socket "/run/firecracker.socket" "http://localhost/"
137+
{"id":"anonymous-instance","state":"Not started","vmm_version":"1.6.0-dev","app_name":"Firecracker"}
138+
```
139+
140+
### Firecracker
141+
142+
```
143+
~/Projects/firecracker$ sudo ./firecracker/build/cargo_target/release/firecracker --level Trace
144+
2023-10-13T14:15:38.851263983 [anonymous-instance:main] Running Firecracker v1.6.0-dev
145+
2023-10-13T14:15:38.851316122 [anonymous-instance:main] ThreadId(1)::main::main_exec>>single_value
146+
2023-10-13T14:15:38.851322264 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value>>value_of
147+
2023-10-13T14:15:38.851325119 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value<<value_of
148+
2023-10-13T14:15:38.851328776 [anonymous-instance:main] ThreadId(1)::main::main_exec<<single_value
149+
2023-10-13T14:15:38.851331351 [anonymous-instance:main] ThreadId(1)::main::main_exec>>flag_present
150+
2023-10-13T14:15:38.851335809 [anonymous-instance:main] ThreadId(1)::main::main_exec::flag_present>>value_of
151+
2023-10-13T14:15:38.851338254 [anonymous-instance:main] ThreadId(1)::main::main_exec::flag_present<<value_of
152+
2023-10-13T14:15:38.851342091 [anonymous-instance:main] ThreadId(1)::main::main_exec<<flag_present
153+
2023-10-13T14:15:38.851345638 [anonymous-instance:main] ThreadId(1)::main::main_exec>>single_value
154+
2023-10-13T14:15:38.851349245 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value>>value_of
155+
2023-10-13T14:15:38.851352721 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value<<value_of
156+
2023-10-13T14:15:38.851355827 [anonymous-instance:main] ThreadId(1)::main::main_exec<<single_value
157+
2023-10-13T14:15:38.851359444 [anonymous-instance:main] ThreadId(1)::main::main_exec>>from_args
158+
2023-10-13T14:15:38.851362931 [anonymous-instance:main] ThreadId(1)::main::main_exec<<from_args
159+
2023-10-13T14:15:38.851366207 [anonymous-instance:main] ThreadId(1)::main::main_exec>>get_filters
160+
2023-10-13T14:15:38.851368401 [anonymous-instance:main] ThreadId(1)::main::main_exec::get_filters>>get_default_filters
161+
2023-10-13T14:15:38.851372068 [anonymous-instance:main] ThreadId(1)::main::main_exec::get_filters::get_default_filters>>deserialize_binary
162+
2023-10-13T14:15:38.851380033 [anonymous-instance:main] ThreadId(1)::main::main_exec::get_filters::get_default_filters<<deserialize_binary
163+
2023-10-13T14:15:38.851383990 [anonymous-instance:main] ThreadId(1)::main::main_exec::get_filters::get_default_filters>>filter_thread_categories
164+
2023-10-13T14:15:38.851388098 [anonymous-instance:main] ThreadId(1)::main::main_exec::get_filters::get_default_filters<<filter_thread_categories
165+
2023-10-13T14:15:38.851391845 [anonymous-instance:main] ThreadId(1)::main::main_exec::get_filters<<get_default_filters
166+
2023-10-13T14:15:38.851394360 [anonymous-instance:main] ThreadId(1)::main::main_exec<<get_filters
167+
2023-10-13T14:15:38.851398077 [anonymous-instance:main] ThreadId(1)::main::main_exec>>single_value
168+
2023-10-13T14:15:38.851400462 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value>>value_of
169+
2023-10-13T14:15:38.851403507 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value<<value_of
170+
2023-10-13T14:15:38.851410961 [anonymous-instance:main] ThreadId(1)::main::main_exec<<single_value
171+
2023-10-13T14:15:38.851414107 [anonymous-instance:main] ThreadId(1)::main::main_exec>>single_value
172+
2023-10-13T14:15:38.851417955 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value>>value_of
173+
2023-10-13T14:15:38.851420650 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value<<value_of
174+
2023-10-13T14:15:38.851426130 [anonymous-instance:main] ThreadId(1)::main::main_exec<<single_value
175+
2023-10-13T14:15:38.851428434 [anonymous-instance:main] ThreadId(1)::main::main_exec>>flag_present
176+
2023-10-13T14:15:38.851430949 [anonymous-instance:main] ThreadId(1)::main::main_exec::flag_present>>value_of
177+
2023-10-13T14:15:38.851434766 [anonymous-instance:main] ThreadId(1)::main::main_exec::flag_present<<value_of
178+
2023-10-13T14:15:38.851438133 [anonymous-instance:main] ThreadId(1)::main::main_exec<<flag_present
179+
2023-10-13T14:15:38.851440577 [anonymous-instance:main] ThreadId(1)::main::main_exec>>flag_present
180+
2023-10-13T14:15:38.851444575 [anonymous-instance:main] ThreadId(1)::main::main_exec::flag_present>>value_of
181+
2023-10-13T14:15:38.851447941 [anonymous-instance:main] ThreadId(1)::main::main_exec::flag_present<<value_of
182+
2023-10-13T14:15:38.851450005 [anonymous-instance:main] ThreadId(1)::main::main_exec<<flag_present
183+
2023-10-13T14:15:38.851453772 [anonymous-instance:main] ThreadId(1)::main::main_exec>>arguments
184+
2023-10-13T14:15:38.851456488 [anonymous-instance:main] ThreadId(1)::main::main_exec<<arguments
185+
2023-10-13T14:15:38.851458902 [anonymous-instance:main] ThreadId(1)::main::main_exec>>single_value
186+
2023-10-13T14:15:38.851462679 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value>>value_of
187+
2023-10-13T14:15:38.851466587 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value<<value_of
188+
2023-10-13T14:15:38.851470324 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value>>as_single_value
189+
2023-10-13T14:15:38.851473239 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value<<as_single_value
190+
2023-10-13T14:15:38.851476896 [anonymous-instance:main] ThreadId(1)::main::main_exec<<single_value
191+
2023-10-13T14:15:38.851479521 [anonymous-instance:main] ThreadId(1)::main::main_exec>>arguments
192+
2023-10-13T14:15:38.851485062 [anonymous-instance:main] ThreadId(1)::main::main_exec<<arguments
193+
2023-10-13T14:15:38.851488398 [anonymous-instance:main] ThreadId(1)::main::main_exec>>single_value
194+
2023-10-13T14:15:38.851491925 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value>>value_of
195+
2023-10-13T14:15:38.851494900 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value<<value_of
196+
2023-10-13T14:15:38.851496934 [anonymous-instance:main] ThreadId(1)::main::main_exec<<single_value
197+
2023-10-13T14:15:38.851499689 [anonymous-instance:main] ThreadId(1)::main::main_exec>>single_value
198+
2023-10-13T14:15:38.851502374 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value>>value_of
199+
2023-10-13T14:15:38.851504629 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value<<value_of
200+
2023-10-13T14:15:38.851507234 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value>>as_single_value
201+
2023-10-13T14:15:38.851508897 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value<<as_single_value
202+
2023-10-13T14:15:38.851510630 [anonymous-instance:main] ThreadId(1)::main::main_exec<<single_value
203+
2023-10-13T14:15:38.851513576 [anonymous-instance:main] ThreadId(1)::main::main_exec>>single_value
204+
2023-10-13T14:15:38.851515559 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value>>value_of
205+
2023-10-13T14:15:38.851517503 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value<<value_of
206+
2023-10-13T14:15:38.851520068 [anonymous-instance:main] ThreadId(1)::main::main_exec<<single_value
207+
2023-10-13T14:15:38.851521731 [anonymous-instance:main] ThreadId(1)::main::main_exec>>single_value
208+
2023-10-13T14:15:38.851525628 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value>>value_of
209+
2023-10-13T14:15:38.851529045 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value<<value_of
210+
2023-10-13T14:15:38.851533153 [anonymous-instance:main] ThreadId(1)::main::main_exec<<single_value
211+
2023-10-13T14:15:38.851536339 [anonymous-instance:main] ThreadId(1)::main::main_exec>>single_value
212+
2023-10-13T14:15:38.851538883 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value>>value_of
213+
2023-10-13T14:15:38.851542330 [anonymous-instance:main] ThreadId(1)::main::main_exec::single_value<<value_of
214+
2023-10-13T14:15:38.851544704 [anonymous-instance:main] ThreadId(1)::main::main_exec<<single_value
215+
2023-10-13T14:15:38.851548572 [anonymous-instance:main] ThreadId(1)::main::main_exec>>run_with_api
216+
2023-10-13T14:15:38.851664621 [anonymous-instance:main] ThreadId(1)::main::main_exec::run_with_api>>new
217+
2023-10-13T14:15:38.851672586 [anonymous-instance:main] ThreadId(1)::main::main_exec::run_with_api<<new
218+
2023-10-13T14:15:38.851677876 [anonymous-instance:main] ThreadId(1)::main::main_exec::run_with_api>>init
219+
2023-10-13T14:15:38.851684739 [anonymous-instance:main] ThreadId(1)::main::main_exec::run_with_api<<init
220+
2023-10-13T14:15:38.851724064 [anonymous-instance:main] ThreadId(1)::main::main_exec::run_with_api>>build_microvm_from_requests
221+
2023-10-13T14:15:38.851728171 [anonymous-instance:main] ThreadId(1)::main::main_exec::run_with_api::build_microvm_from_requests>>default
222+
2023-10-13T14:15:38.851731888 [anonymous-instance:main] ThreadId(1)::main::main_exec::run_with_api::build_microvm_from_requests<<default
223+
2023-10-13T14:15:38.851734634 [anonymous-instance:main] ThreadId(1)::main::main_exec::run_with_api::build_microvm_from_requests>>new
224+
2023-10-13T14:15:38.851737830 [anonymous-instance:main] ThreadId(1)::main::main_exec::run_with_api::build_microvm_from_requests<<new
225+
2023-10-13T14:15:38.851748550 [anonymous-instance:fc_api] ThreadId(2)>>new
226+
2023-10-13T14:15:38.851761404 [anonymous-instance:fc_api] ThreadId(2)<<new
227+
2023-10-13T14:15:38.851764861 [anonymous-instance:fc_api] ThreadId(2)>>run
228+
2023-10-13T14:15:38.851775200 [anonymous-instance:fc_api] ThreadId(2)::run>>apply_filter
229+
2023-10-13T14:15:38.851823462 [anonymous-instance:fc_api] ThreadId(2)::run<<apply_filter
230+
2023-10-13T14:15:55.422397039 [anonymous-instance:fc_api] ThreadId(2)::run>>handle_request
231+
2023-10-13T14:15:55.422417909 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request>>try_from
232+
2023-10-13T14:15:55.422420554 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::try_from>>describe
233+
2023-10-13T14:15:55.422424551 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::try_from<<describe
234+
2023-10-13T14:15:55.422426395 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::try_from>>log_received_api_request
235+
2023-10-13T14:15:55.422429270 [anonymous-instance:fc_api] The API server received a Get request on "/".
236+
2023-10-13T14:15:55.422431354 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::try_from<<log_received_api_request
237+
2023-10-13T14:15:55.422433298 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::try_from>>parse_get_instance_info
238+
2023-10-13T14:15:55.422435211 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::try_from::parse_get_instance_info>>new_sync
239+
2023-10-13T14:15:55.422437165 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::try_from::parse_get_instance_info::new_sync>>new
240+
2023-10-13T14:15:55.422439289 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::try_from::parse_get_instance_info::new_sync<<new
241+
2023-10-13T14:15:55.422441123 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::try_from::parse_get_instance_info<<new_sync
242+
2023-10-13T14:15:55.422444459 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::try_from<<parse_get_instance_info
243+
2023-10-13T14:15:55.422446833 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request<<try_from
244+
2023-10-13T14:15:55.422448837 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request>>into_parts
245+
2023-10-13T14:15:55.422450921 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request<<into_parts
246+
2023-10-13T14:15:55.422453967 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request>>serve_vmm_action_request
247+
2023-10-13T14:15:55.422472552 [anonymous-instance:main] ThreadId(1)::main::main_exec::run_with_api::build_microvm_from_requests>>handle_preboot_request
248+
2023-10-13T14:15:55.422480477 [anonymous-instance:main] ThreadId(1)::main::main_exec::run_with_api::build_microvm_from_requests<<handle_preboot_request
249+
2023-10-13T14:15:55.422488963 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::serve_vmm_action_request>>convert_to_response
250+
2023-10-13T14:15:55.422492289 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::serve_vmm_action_request::convert_to_response>>success_response_with_data
251+
2023-10-13T14:15:55.422493983 [anonymous-instance:fc_api] The request was executed successfully. Status code: 200 OK.
252+
2023-10-13T14:15:55.422498331 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::serve_vmm_action_request::convert_to_response::success_response_with_data>>serialize
253+
2023-10-13T14:15:55.422501387 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::serve_vmm_action_request::convert_to_response::success_response_with_data::serialize>>fmt
254+
2023-10-13T14:15:55.422506086 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::serve_vmm_action_request::convert_to_response::success_response_with_data::serialize<<fmt
255+
2023-10-13T14:15:55.422509171 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::serve_vmm_action_request::convert_to_response::success_response_with_data<<serialize
256+
2023-10-13T14:15:55.422511776 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::serve_vmm_action_request::convert_to_response<<success_response_with_data
257+
2023-10-13T14:15:55.422514371 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request::serve_vmm_action_request<<convert_to_response
258+
2023-10-13T14:15:55.422516385 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request<<serve_vmm_action_request
259+
2023-10-13T14:15:55.422518719 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request>>take_deprecation_message
260+
2023-10-13T14:15:55.422520533 [anonymous-instance:fc_api] ThreadId(2)::run::handle_request<<take_deprecation_message
261+
2023-10-13T14:15:55.422522847 [anonymous-instance:fc_api] ThreadId(2)::run<<handle_request
262+
2023-10-13T14:15:55.422525422 [anonymous-instance:fc_api] Total previous API call duration: 132 us.
263+
264+
```

0 commit comments

Comments
 (0)