On Fri, Apr 10, 2026 at 12:29 AM Lukas Fittl <lukas@fittl.com> wrote:
On Fri, Apr 10, 2026 at 12:12 AM Lukas Fittl <lukas@fittl.com> wrote: > > On Thu, Apr 9, 2026 at 9:02 AM Andres Freund <andres@anarazel.de> wrote: > > > > On 2026-04-08 21:36:48 -0700, Lukas Fittl wrote: > > > > > > > What do you think about making pg_test_timing warn and return 1 if there is a > > > > tsc clocksource but the calibrated frequency differs by more than, idk, 10%? > > > > I'm worried that there might be other problems like this lurking and we > > > > wouldn't know about them unless the issue is of a similar magnitude. > > > > > > Yeah, that seems like a good idea. If I understand correctly you're > > > thinking we could tell the user to switch to > > > timing_clock_source=system in that case? (i.e. this is only a > > > pg_test_timing notice, not something "smarter" in the backend itself) > > > > I'd even just say "investigate your system an/or report a bug to postgres" :) > > > > Sure, seems reasonable. I went ahead and added that in the attached > v27 (squashed with your other change). > > Example how that looks like (tested without the fix in place): > > --- > > TSC frequency source: x86, hypervisor, cpuid 0x15 > TSC frequency in use: 7 kHz > TSC frequency from calibration: 2500260 kHz > WARNING: Calibrated TSC frequency differs by 35717900.0% from the TSC > frequency in use > HINT: Consider setting timing_clock_source to 'system'. Report bugs to > <pgsql-bugs@lists.postgresql.org>. > > TSC clock source will be used by default, unless timing_clock_source > is set to 'system'. > > --- > > I also added the extra newline before the "will be used by default" > message, because I felt its too much information bunched together > otherwise.
I just realized that you initially suggested to do a "return 1", but I did not include that in the version I sent. I think we could easily do an "exit(1)" after that warning is printed - seems sensible to get CI/buildfarm to fail clearly.
Thanks, Lukas
-- Lukas Fittl
Hi Lukas, Thanks for the patch. I may be missing something, but I wonder if the new debug path can still end up calling pg_tsc_calibrate_frequency() after tsc_detect_frequency() already bailed out with no rdtscp or not invariant. If so, it seems the diagnostics path is no longer following the same gate as the normal TSC-usability path, and could still execute pg_rdtscp() while only trying to print debug info. Maybe I am overlooking a guard somewhere, but I think it would be safer either to use the same prerequisites here, or keep this helper purely passive. Best,