hw: number of vCPUS exceeding number of host cores triggers pager error #5442
Comments
@skalk I have noticed that this error message is expected in |
While running
So it's still worth investigating, but not a regression in 2728853. |
Running the same scenario with 4 vCPUs on nova does not cause the error message. |
The issue is present since at least 24.11 and seems to be a race condition (i.e., does not always trigger). |
@alex-ab addressed vCPU race issues in 2024-09/10 in genode-world/seoul. How do genodelabs/genode-world@0cb6a8c and genodelabs/genode-world@5d1f087 affect this issue? |
I previously tested with an up-to-date genode-world repo. Without the commits, I reliably get the panic they fix on nova, even with 2 vCPUs, but on hw the commits don't appear to make a difference. When reverting the commits in genode-world and testing against Genode 24.11,
And the guest subsequently fails to bring up CPU#3. As I mentioned before, prior to 2728853 I had not seen the pager error when running on qemu. |
I crafted the debug commit cbdf3d4, which triggers the same symptom in vmm_x86 (which is way simpler to understand than seoul) with hw, so the cause is not specific to seoul at all.
|
That's great, thanks @alex-ab! I wonder if |
It does trigger. |
Thanks for the debug commit @alex-ab! As you probably figured already, |
Could you please check that the following two lines produce the same result on base-hw?
Entrypoint ep1 { env, Component::stack_size(), "ep", Affinity::Location() };
Entrypoint ep2 { env, Component::stack_size(), "ep", env.cpu().affinity_space().location_of_index(0) }; |
No, they don't cause the error message. |
@atopia sorry for the late response. Just for completeness: this error message in general is a page-fault message. Within |
Commit 2728853 causes the following error when running run/seoul-auto on hw:
Initially it appeared that the error only occurred on the qemu target. However, closer inspection revealed that the error is triggered when the number of vCPUs reaches the number of CPUs in the system. For example, when running with the default of vcpus_to_be_used 2 and adding -smp 3 to the qemu command line, the error does not trigger. When running with -smp 2, it does.
Similarly, running the scenario on a Lenovo X260 with 4 logical cores works fine with vcpus_to_be_used 3 but triggers the error with vcpus_to_be_used 4 (or more).
The runscript run/vmm_x86 works fine despite spawning two vCPUs per core on qemu with -smp 2.
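For reference, the observations reported for run/seoul-auto on hw can be written down as a small consistency check. This is only a sketch: the names Observation, error_expected, and consistent_with_reports are hypothetical and not part of Genode, and the predicate merely restates the reported condition (number of vCPUs reaching the number of host CPUs). It says nothing about the root cause, it does not describe run/vmm_x86, and since the problem appears to be a race, the condition should be read as necessary rather than sufficient.

```cpp
// Hypothetical helper types, not part of Genode. They only encode the
// data points reported in this issue for run/seoul-auto on hw.
struct Observation
{
	unsigned vcpus;      // value of vcpus_to_be_used
	unsigned host_cpus;  // qemu -smp value or logical cores of the machine
	bool     error_seen; // whether the pager error was reported
};

// Reported trigger condition: the number of vCPUs reaches (or exceeds)
// the number of host CPUs.
constexpr bool error_expected(unsigned vcpus, unsigned host_cpus)
{
	return vcpus >= host_cpus;
}

// Data points taken from the issue description.
constexpr Observation reported[] = {
	{ 2, 3, false }, // qemu -smp 3, default vcpus_to_be_used 2
	{ 2, 2, true  }, // qemu -smp 2, default vcpus_to_be_used 2
	{ 3, 4, false }, // Lenovo X260, 4 logical cores
	{ 4, 4, true  }, // Lenovo X260, 4 logical cores
};

// True if the predicate agrees with every reported data point.
constexpr bool consistent_with_reports()
{
	for (Observation const &o : reported)
		if (error_expected(o.vcpus, o.host_cpus) != o.error_seen)
			return false;
	return true;
}

// Checked at compile time (requires C++14 or later).
static_assert(consistent_with_reports(),
              "predicate contradicts a reported observation");
```

Adding further rows to the table (for example from other machines) makes it easy to see whether the "vCPUs reach host CPUs" description still holds.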