Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? - Mailing list pgsql-hackers
| From | Lukas Fittl |
|---|---|
| Subject | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |
| Date | |
| Msg-id | CAP53PkwR8gEteMDTK0=hGx5YmLMUhW3aFXAergr_VWgmBFFBig@mail.gmail.com |
| In response to | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? (Andres Freund <andres@anarazel.de>) |
| Responses | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |
| List | pgsql-hackers |
On Thu, Apr 9, 2026 at 9:02 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2026-04-08 21:36:48 -0700, Lukas Fittl wrote:
> >
> > And that is indeed the problem. Looks like we can't trust CPUID
> > 0x15/0x16 when we're under a Hypervisor and it's not KVM or VMware.
>
> Why you'd report a TSC frequency but populate it with a distinct frequency
> from the actual tsc is beyond me, but oh well, we gotta deal.
>
> Pushed the fix.

Thanks!

> > > What do you think about making pg_test_timing warn and return 1 if there is a
> > > tsc clocksource but the calibrated frequency differs by more than, idk, 10%?
> > > I'm worried that there might be other problems like this lurking and we
> > > wouldn't know about them unless the issue is of a similar magnitude.
> >
> > Yeah, that seems like a good idea. If I understand correctly you're
> > thinking we could tell the user to switch to
> > timing_clock_source=system in that case? (i.e. this is only a
> > pg_test_timing notice, not something "smarter" in the backend itself)
>
> I'd even just say "investigate your system and/or report a bug to postgres" :)
>

Sure, seems reasonable. I went ahead and added that in the attached v27
(squashed with your other change).

Example of how that looks (tested without the fix in place):

---
TSC frequency source: x86, hypervisor, cpuid 0x15
TSC frequency in use: 7 kHz
TSC frequency from calibration: 2500260 kHz
WARNING: Calibrated TSC frequency differs by 35717900.0% from the TSC frequency in use
HINT: Consider setting timing_clock_source to 'system'. Report bugs to <pgsql-bugs@lists.postgresql.org>.

TSC clock source will be used by default, unless timing_clock_source is set to 'system'.
---

I also added the extra newline before the "will be used by default"
message, because I felt it's too much information bunched together
otherwise.

> > Attached 0001 fixes the issue for me on my test instance, and
> > presumably will fix drongo as well.
> >
> > 0002 is the updated version of emitting the additional debug info. I
> > think this is certainly less critical to have in 19 now, but could
> > still be useful if there are any future oddities.
>
> I think we should do something, probably together with the test enhancement I
> described, because otherwise we won't actually find potential breakage before
> it hits production environments.

Ack, makes sense to me.

> Any reason you didn't include the hypervisor like in the prior version? Just
> simplicity?
>
> I think this actually ends up getting overwritten if
> x86_hypervisor_tsc_frequency_khz() then "fails" to detect a frequency. Feels
> like it'd be good to continue reporting that it's in a hypervisor, because
> hypervisors can set tsc frequency multipliers and stuff.
>

Agreed, that seems reasonable.

> What do you think about the attached incremental patch?
>
> If I e.g. intentionally force the hypervisor path being taken, on a non-VM, I
> get:
> TSC frequency source: x86, hypervisor, cpuid 0x40000010, calibration
> TSC frequency in use: 2497902 kHz
> TSC frequency from calibration: 2497902 kHz
> TSC clock source will be used by default, unless timing_clock_source is set to 'system'.
>
> And if rdtscp is not available:
> TSC frequency source: x86, no rdtscp
> TSC frequency in use: 0 kHz
> TSC frequency from calibration: 2500040 kHz
> TSC clock source is not usable. Likely unable to determine TSC frequency. Are you running in an unsupported virtualized environment?
>
> It's not perfect, but seems like it might be good enough?

Yeah, I think that looks good.

On an m4.xlarge instance (Linux / xen) with its very slow clock I get
the following:

---
System clock source: clock_gettime (CLOCK_MONOTONIC)
Average loop time including overhead: 570.09 ns
Histogram of timing durations:
...

TSC frequency source: x86, not invariant
TSC frequency in use: 0 kHz
TSC frequency from calibration: 2299714 kHz
TSC clock source is not usable. Likely unable to determine TSC frequency. Are you running in an unsupported virtualized environment?
---

FWIW, Linux has current_clocksource "xen" instead of "tsc" on that instance.

I assume we're okay with not reporting "hypervisor" in the source
string in the early failure case? If we wanted to, it'd make the diff
a bit larger since we'd need an extra hypervisor feature check.

> Note to future self: Need to consider updating the sgml docs example. Probably
> just fudge it, to avoid having to update the numbers too.

Yeah, I wouldn't update the numbers in the docs. I've added an example
of the new output in the attached.

Thanks,
Lukas

-- 
Lukas Fittl
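Editorial note for readers of the archive: the percentage in the WARNING
example above is consistent with taking the absolute difference between the
calibrated frequency and the frequency in use, relative to the frequency in
use. The following is a minimal sketch of such a check, not the code from the
attached patch. The function names are hypothetical, and the threshold is
passed in by the caller (10% is the figure floated in the quoted discussion).
Treating a frequency of 0 kHz as "nothing to compare" is likewise an
assumption made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: percentage by which the calibrated TSC frequency
 * differs from the TSC frequency in use, relative to the frequency in use.
 * The caller must pass a non-zero in_use_khz.
 */
static double
tsc_freq_diff_pct(uint32_t in_use_khz, uint32_t calibrated_khz)
{
	double		diff;

	assert(in_use_khz > 0);

	diff = (double) calibrated_khz - (double) in_use_khz;
	if (diff < 0)
		diff = -diff;

	return diff / (double) in_use_khz * 100.0;
}

/*
 * Hypothetical helper: true if both frequencies are known and they differ
 * by more than threshold_pct. A frequency of 0 kHz is treated as unknown
 * (assumption), so no mismatch is reported in that case.
 */
static bool
tsc_freq_mismatch(uint32_t in_use_khz, uint32_t calibrated_khz,
				  double threshold_pct)
{
	if (in_use_khz == 0 || calibrated_khz == 0)
		return false;

	return tsc_freq_diff_pct(in_use_khz, calibrated_khz) > threshold_pct;
}
```

With the values from the example, tsc_freq_diff_pct(7, 2500260) works out to
(2500260 - 7) / 7 * 100 = 35717900.0, matching the WARNING line, while the
forced-hypervisor case (2497902 kHz for both values) reports no mismatch.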