
Have Firecracker tell UFFD handler about page size during initial handshake #4449


Merged 4 commits on Feb 15, 2024
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -31,6 +31,10 @@ and this project adheres to
supported snapshot version format. This change renders all previous
Firecracker snapshots (up to Firecracker version v1.6.0) incompatible with the
current Firecracker version.
+- [#4449](https://github.com/firecracker-microvm/firecracker/pull/4449): Added
+information about page size to the payload Firecracker sends to the UFFD
+handler. Each memory region object now contains a `page_size_kib` field. See
+also the [hugepages documentation](docs/hugepages.md).

### Fixed

5 changes: 5 additions & 0 deletions docs/hugepages.md
@@ -33,6 +33,11 @@ microVMs backed with huge pages can only be restored via UFFD. Lastly, note that
even for guests backed by huge pages, differential snapshots will always track
write accesses to guest memory at 4K granularity.

+When restoring snapshots via UFFD, Firecracker will send the configured page
+size (in KiB) for each memory region as part of the initial handshake, as
+described in our documentation on
+[UFFD-assisted snapshot-restore](snapshotting/handling-page-faults-on-snapshot-resume.md).
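
As a rough illustration of how a handler might consume this value, the sketch below converts a page size in KiB to bytes and rounds a faulting address down to the start of its page. The helper names `page_size_bytes` and `page_floor` are hypothetical and not part of Firecracker. The sketch assumes the field really is expressed in KiB, as documented; the test fixtures in this PR set `page_size_kib` to 4096, so check the units against the Firecracker version in use.

```rust
/// Converts a page size given in KiB (as the `page_size_kib` handshake
/// field is documented) into bytes. Returns `None` on overflow.
/// Hypothetical helper, not part of Firecracker.
pub fn page_size_bytes(page_size_kib: usize) -> Option<usize> {
    page_size_kib.checked_mul(1024)
}

/// Rounds `addr` down to the start of the page that contains it.
/// `page_size` is in bytes and must be a power of two.
/// Hypothetical helper, not part of Firecracker.
pub fn page_floor(addr: u64, page_size: u64) -> u64 {
    assert!(page_size.is_power_of_two(), "page size must be a power of two");
    addr & !(page_size - 1)
}
```

With a 2 MiB huge page, for instance, `page_floor` maps any address inside the page to the page's base, which is the granularity at which a handler would have to populate memory.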

## Known Limitations

Currently, hugetlbfs support is mutually exclusive with the following
@@ -91,7 +91,8 @@ Firecracker and the page fault handler.
![](../images/uffd_flow3.png)

- Firecracker passes the userfault file descriptor and the guest memory layout
-to the page fault handler process through the socket.
+(e.g. dimensions of each memory region, and their [page size](../hugepages.md)
+in KiB) to the page fault handler process through the socket.

![](../images/uffd_flow4.png)

@@ -106,7 +107,7 @@ Firecracker and the page fault handler.
happens, the page fault handler issues `UFFDIO_COPY` to load the previously
mmaped file contents into the correspondent memory region.

-After Firecracker sends the payload (i.e mem mappings and file descriptor), no
+After Firecracker sends the payload (i.e. mem mappings and file descriptor), no
other communication happens on the UDS socket (or otherwise) between Firecracker
and the page fault handler process.
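
To make the shape of the memory-mapping part of the payload concrete, here is a minimal sketch that renders a list of regions as a JSON array. The field names follow `GuestRegionUffdMapping` as it appears in this PR; the struct name `RegionMapping`, the function `render_payload`, the `u64` type of `base_host_virt_addr`, and the exact field order and whitespace are assumptions made for illustration. The example handler parses this JSON with `serde_json`; the sketch is hand-rolled only to stay free of external crates.

```rust
/// Mirrors the fields of `GuestRegionUffdMapping` shown in this PR.
/// The struct name and the `u64` type of `base_host_virt_addr` are
/// assumptions for illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RegionMapping {
    pub base_host_virt_addr: u64,
    pub size: usize,
    pub offset: u64,
    pub page_size_kib: usize,
}

/// Renders the mappings as a JSON array of objects, one per region.
/// Hand-rolled formatting; a real implementation would use a JSON library.
pub fn render_payload(regions: &[RegionMapping]) -> String {
    let objects: Vec<String> = regions
        .iter()
        .map(|r| {
            format!(
                r#"{{"base_host_virt_addr":{},"size":{},"offset":{},"page_size_kib":{}}}"#,
                r.base_host_virt_addr, r.size, r.offset, r.page_size_kib
            )
        })
        .collect();
    format!("[{}]", objects.join(","))
}
```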

@@ -161,7 +162,7 @@ connect/send data.
### Example

An example of a handler process can be found
-[here](../../src/firecracker/examples/uffd/valid_4k_handler.rs). The process is
+[here](../../src/firecracker/examples/uffd/valid_handler.rs). The process is
designed to tackle faults on a certain address by loading into memory the entire
region that the address belongs to, but users can choose any other behavior that
suits their use case best.
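
The "load the entire region that the address belongs to" strategy described above can be sketched as a lookup from a faulting address to the copy that would satisfy it. This is a simplified illustration, not the example handler's code: `Region`, `find_region`, and `whole_region_copy` are hypothetical names, and the actual population of memory (via `UFFDIO_COPY`) is left out.

```rust
/// Simplified view of one guest memory region (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Region {
    /// Host virtual address where the region starts.
    pub base: u64,
    /// Length of the region in bytes.
    pub size: u64,
    /// Offset of the region contents in the backing file.
    pub offset: u64,
}

/// Finds the region whose address range contains `addr`, if any.
pub fn find_region(regions: &[Region], addr: u64) -> Option<&Region> {
    regions
        .iter()
        .find(|r| addr >= r.base && addr - r.base < r.size)
}

/// Returns `(destination address, backing-file offset, length)` for a copy
/// that covers the whole region containing `fault_addr`.
pub fn whole_region_copy(regions: &[Region], fault_addr: u64) -> Option<(u64, u64, u64)> {
    find_region(regions, fault_addr).map(|r| (r.base, r.offset, r.size))
}
```

A handler that prefers lower memory use over fewer faults could instead copy a single page per fault, using the page size received in the handshake.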
12 changes: 4 additions & 8 deletions src/firecracker/Cargo.toml
@@ -50,16 +50,12 @@ serde_json = "1.0.113"
tracing = ["log-instrument", "seccompiler/tracing", "utils/tracing", "vmm/tracing"]

[[example]]
-name = "uffd_malicious_4k_handler"
-path = "examples/uffd/malicious_4k_handler.rs"
+name = "uffd_malicious_handler"
+path = "examples/uffd/malicious_handler.rs"

[[example]]
-name = "uffd_valid_4k_handler"
-path = "examples/uffd/valid_4k_handler.rs"
-
-[[example]]
-name = "uffd_valid_2m_handler"
-path = "examples/uffd/valid_2m_handler.rs"
+name = "uffd_valid_handler"
+path = "examples/uffd/valid_handler.rs"

[[example]]
name = "uffd_fault_all_handler"
8 changes: 1 addition & 7 deletions src/firecracker/examples/uffd/fault_all_handler.rs
@@ -11,7 +11,6 @@ use std::fs::File;
use std::os::unix::net::UnixListener;

use uffd_utils::{Runtime, UffdHandler};
-use utils::get_page_size;

fn main() {
let mut args = std::env::args();
@@ -24,13 +23,8 @@ fn main() {
let listener = UnixListener::bind(uffd_sock_path).expect("Cannot bind to socket path");
let (stream, _) = listener.accept().expect("Cannot listen on UDS socket");

-// Populate a single page from backing memory file.
-// This is just an example, probably, with the worst-case latency scenario,
-// of how memory can be loaded in guest RAM.
-let len = get_page_size().unwrap(); // page size does not matter, we fault in everything on the first fault
-
let mut runtime = Runtime::new(stream, file);
-runtime.run(len, |uffd_handler: &mut UffdHandler| {
+runtime.run(|uffd_handler: &mut UffdHandler| {
// Read an event from the userfaultfd.
let event = uffd_handler
.read_event()
@@ -23,7 +23,7 @@ fn main() {
let (stream, _) = listener.accept().expect("Cannot listen on UDS socket");

let mut runtime = Runtime::new(stream, file);
-runtime.run(4096, |uffd_handler: &mut UffdHandler| {
+runtime.run(|uffd_handler: &mut UffdHandler| {
// Read an event from the userfaultfd.
let event = uffd_handler
.read_event()
20 changes: 10 additions & 10 deletions src/firecracker/examples/uffd/uffd_utils.rs
@@ -30,6 +30,8 @@ pub struct GuestRegionUffdMapping {
pub size: usize,
/// Offset in the backend file/buffer where the region contents are.
pub offset: u64,
+/// The configured page size for this memory region.
+pub page_size_kib: usize,
}

#[derive(Debug, Clone, Copy)]
@@ -49,18 +51,13 @@ pub struct MemRegion {
#[derive(Debug)]
pub struct UffdHandler {
pub mem_regions: Vec<MemRegion>,
-page_size: usize,
+pub page_size: usize,
backing_buffer: *const u8,
uffd: Uffd,
}

impl UffdHandler {
-pub fn from_unix_stream(
-stream: &UnixStream,
-page_size: usize,
-backing_buffer: *const u8,
-size: usize,
-) -> Self {
+pub fn from_unix_stream(stream: &UnixStream, backing_buffer: *const u8, size: usize) -> Self {
let mut message_buf = vec![0u8; 1024];
let (bytes_read, file) = stream
.recv_with_fd(&mut message_buf[..])
@@ -73,6 +70,8 @@ impl UffdHandler {
let mappings = serde_json::from_str::<Vec<GuestRegionUffdMapping>>(&body)
.expect("Cannot deserialize memory mappings.");
let memsize: usize = mappings.iter().map(|r| r.size).sum();
+// Page size is the same for all memory regions, so just grab the first one
+let page_size = mappings.first().unwrap().page_size_kib;

// Make sure memory size matches backing data size.
assert_eq!(memsize, size);
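
The hunk above relies on every region reporting the same page size and unwraps the first entry. A handler that prefers to check that assumption explicitly could do something along these lines; `uniform_page_size` is a hypothetical helper, not part of this PR, and it operates on a plain slice of `page_size_kib` values to stay self-contained.

```rust
/// Returns the page size shared by all regions, or an error if the list is
/// empty or the regions disagree. Hypothetical helper, not part of this PR.
pub fn uniform_page_size(page_sizes_kib: &[usize]) -> Result<usize, String> {
    // An empty handshake has no page size to report.
    let (first, rest) = page_sizes_kib
        .split_first()
        .ok_or_else(|| "handshake contained no memory regions".to_string())?;

    // Report the first region that disagrees with the first entry, if any.
    match rest.iter().find(|&&p| p != *first) {
        Some(other) => Err(format!("mixed page sizes: {first} KiB and {other} KiB")),
        None => Ok(*first),
    }
}
```

Returning an error instead of panicking lets the caller decide whether a malformed handshake should abort the handler or only be logged.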
@@ -214,7 +213,7 @@ impl Runtime {
/// When uffd is polled, page fault is handled by
/// calling `pf_event_dispatch` with corresponding
/// uffd object passed in.
-pub fn run(&mut self, page_size: usize, pf_event_dispatch: impl Fn(&mut UffdHandler)) {
+pub fn run(&mut self, pf_event_dispatch: impl Fn(&mut UffdHandler)) {
let mut pollfds = vec![];

// Poll the stream for incoming uffds
@@ -249,7 +248,6 @@ impl Runtime {
// Handle new uffd from stream
let handler = UffdHandler::from_unix_stream(
&self.stream,
-page_size,
self.backing_memory,
self.backing_memory_size,
);
@@ -330,7 +328,7 @@ mod tests {
let (stream, _) = listener.accept().expect("Cannot listen on UDS socket");
// Update runtime with actual runtime
let runtime = uninit_runtime.write(Runtime::new(stream, file));
-runtime.run(4096, |_: &mut UffdHandler| {});
+runtime.run(|_: &mut UffdHandler| {});
});

// wait for runtime thread to initialize itself
@@ -343,6 +341,7 @@ mod tests {
base_host_virt_addr: 0,
size: 0x1000,
offset: 0,
+page_size_kib: 4096,
}];
let dummy_memory_region_json = serde_json::to_string(&dummy_memory_region).unwrap();

@@ -375,6 +374,7 @@ mod tests {
base_host_virt_addr: 0,
size: 0,
offset: 0,
+page_size_kib: 4096,
}];
let error_memory_region_json = serde_json::to_string(&error_memory_region).unwrap();
stream
51 changes: 0 additions & 51 deletions src/firecracker/examples/uffd/valid_2m_handler.rs

This file was deleted.

@@ -11,7 +11,6 @@ use std::fs::File;
use std::os::unix::net::UnixListener;

use uffd_utils::{MemPageState, Runtime, UffdHandler};
-use utils::get_page_size;

fn main() {
let mut args = std::env::args();
@@ -24,13 +23,8 @@ fn main() {
let listener = UnixListener::bind(uffd_sock_path).expect("Cannot bind to socket path");
let (stream, _) = listener.accept().expect("Cannot listen on UDS socket");

-// Populate a single page from backing memory file.
-// This is just an example, probably, with the worst-case latency scenario,
-// of how memory can be loaded in guest RAM.
-let len = get_page_size().unwrap();
-
let mut runtime = Runtime::new(stream, file);
-runtime.run(len, |uffd_handler: &mut UffdHandler| {
+runtime.run(|uffd_handler: &mut UffdHandler| {
// Read an event from the userfaultfd.
let event = uffd_handler
.read_event()
@@ -40,7 +34,9 @@ fn main() {
// We expect to receive either a Page Fault or Removed
// event (if the balloon device is enabled).
match event {
-userfaultfd::Event::Pagefault { addr, .. } => uffd_handler.serve_pf(addr.cast(), len),
+userfaultfd::Event::Pagefault { addr, .. } => {
+uffd_handler.serve_pf(addr.cast(), uffd_handler.page_size)
+}
userfaultfd::Event::Remove { start, end } => uffd_handler.update_mem_state_mappings(
start as u64,
end as u64,