
machine: enable nested virt on libkrun by default #25922


Open
wants to merge 1 commit into base: main

Conversation

jakecorrenti
Member

@jakecorrenti jakecorrenti commented Apr 18, 2025

With the recent release of krunkit 0.2.0, a CLI option was added to enable nested virtualization on macOS hosts with an M3 or higher. Enable this by default on supported hosts.

Does this PR introduce a user-facing change?

For users running Podman Machines with krunkit on an M3 or newer host running macOS 15+, nested virtualization is enabled by default.

Contributor

openshift-ci bot commented Apr 18, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jakecorrenti
Once this PR has been reviewed and has the lgtm label, please assign ashley-cui for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jakecorrenti jakecorrenti added the No New Tests (Allow PR to proceed without adding regression tests) and machine labels and removed the machine label Apr 18, 2025
@jakecorrenti
Member Author

jakecorrenti commented Apr 18, 2025

Disclaimer: I only have an M2, so I can't verify that nested virtualization actually works. I was able to confirm, however, that the VM fails to start if I modify the condition to include the M2.

CC @slp
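
As an aside, a table-driven test is another way to sanity-check that gate without M3 hardware; the PR itself was exercised manually by temporarily adding "M2" to the condition, and the helper name below is hypothetical, not code from this PR:

package machine

import (
	"strings"
	"testing"
)

// supportsNestedVirt is a stand-in for the PR's brand-string check
// (only M3 and newer chips qualify).
func supportsNestedVirt(cpuBrandString string) bool {
	return strings.Contains(cpuBrandString, "M3") || strings.Contains(cpuBrandString, "M4")
}

func TestSupportsNestedVirt(t *testing.T) {
	cases := map[string]bool{
		"Apple M2":     false, // the disclaimer case: M2 hosts must not get --nested
		"Apple M3 Pro": true,
		"Apple M4":     true,
	}
	for brand, want := range cases {
		if got := supportsNestedVirt(brand); got != want {
			t.Errorf("supportsNestedVirt(%q) = %v, want %v", brand, got, want)
		}
	}
}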

@mheon
Member

mheon commented Apr 18, 2025

LGTM

@baude
Member

baude commented Apr 18, 2025

is there any downside to doing this @slp or @jakecorrenti ?

@jakecorrenti
Member Author

jakecorrenti commented Apr 18, 2025

There are no immediate downsides AFAIK. There's some inherent risk since it's a new feature from Apple, but if there are any issues we can always just remove the use of the flag.

@slp
Contributor

slp commented Apr 21, 2025

LGTM, thanks @jakecorrenti

@mheon
Member

mheon commented Apr 21, 2025

@baude Want to merge, or have concerns?

@baude
Member

baude commented Apr 21, 2025

What happens if the Mac or the OS is not capable of nested virt?

@jakecorrenti
Member Author

On the libkrun side, if the hardware isn't capable of nested virt, it returns an error and doesn't try to start the VM. I avoid that in this PR by checking the CPU model (via the brand string) to ensure it's a new-enough chip.

With regard to the OS version, the feature requires macOS 15. If you see it as necessary/beneficial, I could add an OS check in Podman to make sure it's not enabled below that. I don't have the right hardware, so I'm not sure what would happen if you have the right chip but the wrong OS version.
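
For illustration, a minimal sketch of what such host-side checks (chip model and macOS version) could look like; the sysctl names are standard macOS ones, but the helper and its logic are illustrative, not the code in this PR:

//go:build darwin

package main

import (
	"fmt"
	"strconv"
	"strings"

	"golang.org/x/sys/unix"
)

// hostLooksNestedCapable approximates the checks discussed above:
// an Apple M3/M4 chip and macOS 15 or newer. Illustrative only.
func hostLooksNestedCapable() (bool, error) {
	brand, err := unix.Sysctl("machdep.cpu.brand_string") // e.g. "Apple M3 Pro"
	if err != nil {
		return false, err
	}
	osVer, err := unix.Sysctl("kern.osproductversion") // e.g. "15.4.1"
	if err != nil {
		return false, err
	}
	major, err := strconv.Atoi(strings.SplitN(osVer, ".", 2)[0])
	if err != nil {
		return false, err
	}
	chipOK := strings.Contains(brand, "M3") || strings.Contains(brand, "M4")
	return chipOK && major >= 15, nil
}

func main() {
	ok, err := hostLooksNestedCapable()
	fmt.Println(ok, err)
}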

@slp
Contributor

slp commented Apr 22, 2025


Hmm... I hadn't thought of the OS version. When running on macOS < 15, libkrun does detect the missing feature and errors out with EINVAL. The caller could retry, but ideally this should be detected beforehand (@jakecorrenti).

I have macOS 14 in a VM, so I can give it a try once this PR is updated.

Comment on lines 237 to 238
// only M3 processors and newer can support nested virtualization
if strings.Contains(cpuBrandString, "M3") || strings.Contains(cpuBrandString, "M4") {
Member

This looks like a constant maintenance cost; if there is an M5 chip, it would be fair to assume it would work there as well, I think. And I'd rather not have to update this every year because we forget until someone reports a bug.

(And in general, trying to match the logic from libkrun here just introduces the possibility of drift between the two implementations. I'd rather have some smart feature flag in krunkit that enables nested virt, but only if possible. Any reason this is not the default in libkrun?)

Contributor

So far we've tried to avoid adding policy in libkrun, leaving that to the caller. What we can do is add a new API that allows callers to check whether nested virtualization is supported on the platform. WDYT @jakecorrenti?

Member Author

I think that's a good compromise. I can look into that today.

Member

Shouldn't nested be on by default, with libkrun figuring out whether it is supported? This seems like an issue on the libkrun side. Is there ever a reason a user would want to run with --nested=false?

Member Author

@jakecorrenti jakecorrenti May 13, 2025

Now that we've bumped krunkit to 0.2.1, it includes the krun_check_nested_virt API from the latest libkrun release. Podman can now always pass --nested; if the host isn't supported, krunkit ignores the argument and continues starting the VM as normal. This spares Podman the maintenance burden of tracking supported OS and CPU versions.
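
As a rough sketch of the resulting Podman-side behavior under this approach (names are illustrative, not Podman's actual code):

package main

import "fmt"

// withNestedVirt appends krunkit's --nested flag unconditionally. With
// krunkit >= 0.2.1 this is safe: on hosts that can't do nested virt (chip or
// macOS version), krunkit ignores the flag and the VM starts as usual, so
// Podman needs no CPU/OS allow-list of its own.
func withNestedVirt(krunkitArgs []string) []string {
	return append(krunkitArgs, "--nested")
}

func main() {
	args := []string{"--cpus", "4", "--memory", "2048"}
	fmt.Println(withNestedVirt(args))
}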

@jakecorrenti
Member Author

Update: krunkit has cut a release with a fix, but I'm still waiting on libkrun to cut a release. Once that happens, I'll update the PR.

With the recent release of krunkit 0.2.0, a CLI option was added to
enable nested virtualization on macOS hosts with an M3 or higher. Enable
this by default. If the host does not support this feature, krunkit will
ignore the argument and continue starting the VM.

Signed-off-by: Jake Correnti <[email protected]>
@antreos

antreos commented May 16, 2025

Hi there,

not sure if I did something wrong, but I tested this with macOS 15.4.1 & macOS 15.5 on an M4 chip.
Using podman machine --log-level=debug init --disk-size=25 --image-path docker://quay.io/podman/machine-os:5.6
I get a machine created as expected.

If I now start it with podman machine --log-level=debug start, the boot hangs and I don't get any further (screenshots attached).

If I kill the krunkit process, copy everything from inside the [ ] of
DEBU[0000] helper command-line: [/opt/homebrew/bin/krunkit --cpus 6 --memory 2048 --bootloader efi,variable-store=/Users/andreas/.local/share/containers/podman/machine/libkrun/efi-bl-podman-machine-default,create --device virtio-blk,path=/Users/andreas/.local/share/containers/podman/machine/libkrun/podman-machine-default-arm64.raw --device virtio-rng --device virtio-vsock,port=1025,socketURL=/var/folders/5r/nrh75dld0052v9262wyfz1040000gn/T/podman/podman-machine-default.sock,listen --device virtio-net,unixSocketPath=/var/folders/5r/nrh75dld0052v9262wyfz1040000gn/T/podman/podman-machine-default-gvproxy.sock,mac=5a:94:ef:e4:0c:ee --device virtio-fs,sharedDir=/Users,mountTag=a2a0ee2c717462feb1de2f5afd59de5fd2d8 --device virtio-fs,sharedDir=/private,mountTag=71708eb255bc230cd7c91dd26f7667a7b938 --device virtio-fs,sharedDir=/var/folders,mountTag=a0bb3a2c8b0b02ba5958b0576f0d6530e104 --restful-uri tcp://localhost:50742 --device virtio-gpu,width=800,height=600 --device virtio-input,pointing --device virtio-input,keyboard --gui --nested --device virtio-vsock,port=1024,socketURL=/Users/andreas/.local/share/containers/podman/machine/libkrun/podman-machine-default-ignition.sock,listen]

and paste it into a fresh terminal, just without the --nested flag, everything boots up (screenshot attached).

I'm not on krunkit 0.2.1, but based on the changes this shouldn't make a difference.

I built podman from jakecorrenti:krunkit-cmdline yesterday.

podmanbuild/podman 
$ git log -1 --oneline
97609a0a3 (HEAD -> krunkit-cmdline, origin/krunkit-cmdline) machine: enable nested virt on libkrun by default

If there is anything else I can provide let me know. If this isn't the place for this, let me know.

@jakecorrenti
Member Author


Hi @antreos, thanks for taking a look. Just to cover all our bases, can you try updating krunkit to 0.2.1 to see if that fixes anything? I don't have M3+ hardware, so I can't try to reproduce this myself.

@baude @slp any chance you could try and reproduce this?

@antreos

antreos commented May 19, 2025


Hi @jakecorrenti

thanks for the quick response! I updated to 0.2.1 via brew upgrade. Unfortunately it's the same behavior, meaning: without --nested it works again; with --nested it stalls at the same point during boot (after "EFI stub: Using"). I tried various configurations, including --rootful, higher memory, and only 1 CPU, but no luck (screenshot attached).

If I should try a different image/configuration let me know. I'd really like to see nested virtualization working, as it would be quite useful for me.

@jakecorrenti
Member Author

@antreos could you try using the krunkit command line to boot a different image, such as a Fedora .raw image, and use the --nested option?

@Luap99
Member

Luap99 commented May 19, 2025

@antreos Due to a Rosetta bug in applehv, our images are held back a bit, so the kernel is a bit behind.

We have a new build in containers/podman-machine-os#116, so if you download the image from there you could try that; it should have a newer kernel, in case this is kernel-related:
https://api.cirrus-ci.com/v1/artifact/build/6008012294324224/image_build/image/podman-machine.aarch64.applehv.raw.zst

Once that is downloaded you could use it with podman machine init --image /path/to/image.raw.zst.

@antreos

antreos commented May 19, 2025


@jakecorrenti - I hope the command is correct:
krunkit --cpus 2 --memory 4096 --device virtio-blk,path=/Users/andreas/Downloads/Fedora-Server-Host-Generic-42-1.1.aarch64.raw

Without --nested I can configure the root password and get into a shell (Fedora-Server-Host-Generic-42-1.1.aarch64.raw.xz; screenshot attached).

With --nested I get a bit further than with podman, but it seems to hang as well; krunkit also gets to 100% again (screenshot attached).

Edit:
I've attached a --krun-log-level trace with the --nested flag set.
In the end it just loops these two lines forever (only the time is getting updated)
[2025-05-19T18:13:47Z DEBUG utils::macos::epoll] ret: 0
[2025-05-19T18:13:47Z DEBUG utils::macos::epoll] kevs len: 128
krunkit-boot-trace-fedoraserver42.txt

@antreos

antreos commented May 19, 2025


Hi @Luap99

thanks for the newer image. I had to use this URL: https://api.cirrus-ci.com/v1/artifact/task/6008012294324224/image/podman-machine.aarch64.applehv.raw.zst as your URL gave me a 404 error.

Unfortunately the result seems to be the same (screenshot attached).

Edit: I've attached a --krun-log-level trace with the --nested flag set.
Here as well, it just loops the two lines forever with only the time getting updated.
[2025-05-19T19:01:24Z DEBUG utils::macos::epoll] ret: 0
[2025-05-19T19:01:24Z DEBUG utils::macos::epoll] kevs len: 128

krunkit-trace_podman-machine.aarch64.applehv.raw.zst.txt

@Honny1
Member

Honny1 commented May 21, 2025

With this change, I was able to run the podman machine on an Apple M3 Pro.

Command:

 CONTAINERS_MACHINE_PROVIDER=libkrun ./bin/darwin/podman machine --log-level=debug init --now --disk-size=25 --image-path docker://quay.io/podman/machine-os:5.6

Machine info:

host:
    arch: arm64
    currentmachine: podman-machine-default
    defaultmachine: ""
    eventsdir: /var/folders/49/slxl_wsj6yxcht2lfhkh9lf00000gn/T/storage-run-501/podman
    machineconfigdir: /Users/jrodak/.config/containers/podman/machine/libkrun
    machineimagedir: /Users/jrodak/.local/share/containers/podman/machine/libkrun
    machinestate: Running
    numberofmachines: 1
    os: darwin
    vmtype: libkrun
version:
    apiversion: 5.6.0-dev
    version: 5.6.0-dev
    goversion: go1.24.3
    gitcommit: 97609a0a334ad10dd4796c7ab82cbeb720a20962
    builttime: Wed May 21 17:29:59 2025
    built: 1747841399
    osarch: darwin/arm64
    os: darwin

Host OS version:

ProductName:		macOS
ProductVersion:		15.5
BuildVersion:		24F74

Labels
machine · No New Tests (Allow PR to proceed without adding regression tests) · release-note

8 participants