pitfalls of using rdtsc and possible improvements of Trace::timestamp() #5430
Comments
atopia added a commit to atopia/genode that referenced this issue (Jan 21, 2025)
While implementing TSC calibration in genodelabs#5215, the issue of properly serializing TSC reads came up. Some learnings of the discussion were noted in genodelabs#5430. Using `cpuid` for serialization as in Trace::timestamp() is portable, but will cause VM exits on VMX and SVM and is therefore unsuitable to retain a roughly working calibration loop while running virtualized. On the other hand on most AMD systems, dispatch serializing `lfence` needs to be explicitly enabled via a non-architectural MSR. Enable setting up dispatch serializing lfence on AMD systems and always serialize rdtsc accesses in Hw::Tsc::rdtsc() for maximum reliability. Issues genodelabs#5215, genodelabs#5430
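As a rough illustration of the MSR setup this commit message describes, the sketch below computes the register value with the `lfence` bit set. The MSR number `C001_1029` and bit 1 are taken from the issue text; the constant and function names are made up for this example and are not the names used in the actual commit. The privileged `rdmsr`/`wrmsr` access itself is only indicated in a comment.

```cpp
#include <cstdint>

// MSR C001_1029, bit 1, as named in the issue text.
// The identifiers below are hypothetical, chosen for illustration only.
constexpr std::uint32_t AMD_LFENCE_CFG_MSR   = 0xC0011029u;
constexpr std::uint64_t LFENCE_SERIALIZE_BIT = 1ull << 1;

// Return the MSR value with dispatch-serializing lfence enabled.
// All other bits are preserved, so this is a read-modify-write helper.
constexpr std::uint64_t with_dispatch_serializing_lfence(std::uint64_t msr_value)
{
    return msr_value | LFENCE_SERIALIZE_BIT;
}

// In kernel or bootstrap code (ring 0 only), the usage would look roughly like:
//   std::uint64_t v = rdmsr(AMD_LFENCE_CFG_MSR);          // hypothetical accessor
//   wrmsr(AMD_LFENCE_CFG_MSR,
//         with_dispatch_serializing_lfence(v));           // hypothetical accessor
```

Keeping the bit manipulation separate from the privileged access makes the logic checkable outside the kernel; whether the MSR exists at all still has to be decided by the caller based on the CPU vendor and family.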
atopia referenced this issue in atopia/genode (Jan 21, 2025)
To get the Time Stamp Counter's frequency, hw relied on a complex and incomplete algorithm. Since this is a one-time initialization issue, move TSC calibration to bootstrap and implement it using the ACPI timer. Issue genodelabs#5215
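The arithmetic behind calibrating the TSC against the ACPI timer can be sketched as follows. This is not the code from the referenced commit, only a minimal illustration under stated assumptions: the ACPI PM timer ticks at its specified 3.579545 MHz, the counter is 24 bits wide unless the platform reports a 32-bit timer, and all function names are hypothetical.

```cpp
#include <cstdint>

// The ACPI power-management timer runs at a fixed 3.579545 MHz.
constexpr std::uint64_t PM_TIMER_HZ = 3579545;

// Elapsed PM-timer ticks between two counter reads, tolerating one wrap.
// 'bits' is the counter width (24 by default, 32 on some platforms).
constexpr std::uint32_t pm_ticks_elapsed(std::uint32_t start, std::uint32_t end,
                                         unsigned bits = 24)
{
    return (end - start) & (bits >= 32 ? 0xffffffffu : ((1u << bits) - 1u));
}

// TSC frequency in Hz from a TSC delta measured over 'pm_ticks' PM-timer ticks.
// Returns 0 if no ticks elapsed. The 64-bit product is fine for short
// calibration windows but would overflow for extremely large TSC deltas.
constexpr std::uint64_t tsc_freq_hz(std::uint64_t tsc_delta, std::uint32_t pm_ticks)
{
    return pm_ticks ? tsc_delta * PM_TIMER_HZ / pm_ticks : 0;
}
```

A calibration loop would read the TSC and the PM timer at the start and end of a short busy-wait and feed both deltas into `tsc_freq_hz()`; the accuracy then depends on how tightly the two reads are paired, which is where the serialization questions from this issue come in.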
atopia added a commit to atopia/genode that referenced this issue (Jan 22, 2025)
Since rdtsc() provides ordered timestamps now, we should prevent reordering of statements by the compiler too. Issues genodelabs#5215, genodelabs#5430
chelmuth pushed a commit that referenced this issue (Jan 22, 2025)
chelmuth pushed a commit that referenced this issue (Jan 30, 2025)
The discussion at atopia@272e77a#r151401339 has brought up some possible room for improvement regarding the `Trace::timestamp()` method and highlighted some pitfalls regarding the non-serializing nature of `rdtsc` that I want to outline here, mostly for documentation:

- `rdtsc` is non-serializing. `cpuid` or `lfence` can be used for serialization.
- `cpuid` has the downside of unconditionally causing VM exits when running virtualized on VMX (cf. Intel SDM September 2023 Vol. 3C 26.1.2 Instructions That Cause VM Exits Unconditionally) and SVM (cf. AMD64 Architecture Programmer’s Manual November 2021 Volume 2 15.9 Instruction Intercepts). Given the unreliable nature of TSC-based measurements when running virtualized (not least because the `rdtsc` instruction will itself cause VM exits depending on hypervisor settings), this is probably of little practical concern.
- `lfence` is only dispatch serializing on AMD hardware when MSR C001_1029[1]=1 (see the previously mentioned https://hadibrais.wordpress.com/2018/05/14/the-significance-of-the-x86-lfence-instruction/ ).
- An additional `lfence` instruction after `rdtsc` may be needed (see the previously mentioned https://sites.utexas.edu/jdm4372/2018/07/23/comments-on-timing-short-code-sections-on-intel-processors/ ), but from the discussion at https://stackoverflow.com/a/59766404 , this should usually not be a problem.

Seeing that, contrary to the Linux kernel (cf. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=be261ffce6f13229dad50f59c5e491f933d3167f ), we don't set `lfence` to be dispatch serializing on AMD across our supported kernels, and given the limited usefulness of `rdtsc` for measuring virtualized code execution, the current use of `cpuid` for serialization in `Trace::timestamp()` is probably fine. If at some point we need a more precise measurement, we could set the MSR on AMD and switch to a full `lfence; rdtsc; lfence` (or alternatively `rdtscp; lfence`) sequence for maximum reliability with minimal measurement overhead.
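
For reference, the `lfence; rdtsc; lfence` and `rdtscp; lfence` sequences mentioned above could look roughly like this with GCC/Clang inline assembly. This is a sketch and not the current `Trace::timestamp()` implementation; the function names are hypothetical, and on AMD it assumes `lfence` has been made dispatch serializing via the MSR. The `"memory"` clobber additionally keeps the compiler from moving memory accesses across the read.

```cpp
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)

// lfence; rdtsc; lfence
// The first lfence waits for earlier instructions to complete before the
// TSC is read; the second keeps later instructions from starting before it.
static inline std::uint64_t rdtsc_fenced()
{
    std::uint32_t lo, hi;
    asm volatile ("lfence\n\t"
                  "rdtsc\n\t"
                  "lfence"
                  : "=a" (lo), "=d" (hi)
                  :
                  : "memory");  // compiler barrier for memory accesses
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// rdtscp; lfence
// rdtscp itself waits for earlier instructions, so only the trailing
// lfence is needed. It also writes IA32_TSC_AUX to ecx, ignored here.
// Requires a CPU that supports rdtscp.
static inline std::uint64_t rdtscp_fenced()
{
    std::uint32_t lo, hi, aux;
    asm volatile ("rdtscp\n\t"
                  "lfence"
                  : "=a" (lo), "=d" (hi), "=c" (aux)
                  :
                  : "memory");
    (void)aux;
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

#endif
```

Neither variant executes `cpuid`, so the unconditional VM exit described above is avoided; whether `rdtsc` itself exits still depends on the hypervisor configuration.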