Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?
Msg-id znzsoci7b3crunqxf66xdylnllijbqij4vafos7yskmrbtqxhs@6hegy3qeu7e7
In response to Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?  (Lukas Fittl <lukas@fittl.com>)
Responses Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?
List pgsql-hackers
Hi,

On 2026-04-06 20:41:46 -0700, Lukas Fittl wrote:
> On Mon, Apr 6, 2026 at 5:40 PM Andres Freund <andres@anarazel.de> wrote:
> > I wonder if the cpuid tests should be a bit further abstracted into
> > pg_cpu_x86.c.
> >
> > E.g. instead of tsc_detect_frequency() checking for PG_RDTSCP,
> > PG_TSC_INVARIANT, PG_TSC_ADJUST we could have
> >
> > PG_TSC_AVAILABLE /* RDTSCP & INVARIANT */
> > PG_TSC_KNOWN_RELIABLE /* PG_TSC_AVAILABLE && PG_TSC_ADJUST */
> > PG_TSC_FREQUENCY_KNOWN /* x86_tsc_frequency_khz works */
> >
> > and always run all of that during set_x86_features().
> 
> I think that could work, but I kept the flags in features closer to
> being direct mappings to CPUID bits since that seemed to be the intent
> of how John designed the facility originally.
> 
> John, do you have thoughts on this? (I've not changed it for now)

I'm ok either way.


> FWIW, I don't think having PG_TSC_KNOWN_RELIABLE makes sense in any
> case, because that would tie together x86_tsc_frequency_khz and
> set_x86_features, i.e. you'd either have the frequency return function
> modify X86Features later, or always run x86_tsc_frequency_khz when
> setting features (and that'd then require you to put the frequency
> value somewhere, etc.)

I was thinking the latter.
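
To make "the latter" concrete, here is a rough sketch of what always running
the frequency detection while setting features could look like. The flag names
come from this thread; the bit positions, the struct, and
sketch_set_x86_features() are hypothetical stand-ins, and detected_khz
represents whatever x86_tsc_frequency_khz() would return (0 meaning "unknown").
It is not the patch's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flags that map directly to CPUID bits (bit positions are made up). */
#define PG_RDTSCP				(1u << 0)
#define PG_TSC_INVARIANT		(1u << 1)
#define PG_TSC_ADJUST			(1u << 2)

/* Derived flags, as proposed upthread. */
#define PG_TSC_AVAILABLE		(1u << 3)	/* RDTSCP & INVARIANT */
#define PG_TSC_KNOWN_RELIABLE	(1u << 4)	/* AVAILABLE && TSC_ADJUST */
#define PG_TSC_FREQUENCY_KNOWN	(1u << 5)	/* frequency detection worked */

/* Hypothetical container: the frequency has to live somewhere. */
typedef struct SketchX86Features
{
	uint32_t	flags;
	uint32_t	tsc_frequency_khz;	/* 0 when unknown */
} SketchX86Features;

/*
 * Compute raw and derived flags in one pass.  The boolean arguments stand
 * in for the CPUID checks; detected_khz stands in for the result of
 * x86_tsc_frequency_khz().
 */
static SketchX86Features
sketch_set_x86_features(bool rdtscp, bool invariant, bool tsc_adjust,
						uint32_t detected_khz)
{
	SketchX86Features f = {0, 0};

	if (rdtscp)
		f.flags |= PG_RDTSCP;
	if (invariant)
		f.flags |= PG_TSC_INVARIANT;
	if (tsc_adjust)
		f.flags |= PG_TSC_ADJUST;

	if (rdtscp && invariant)
		f.flags |= PG_TSC_AVAILABLE;
	if ((f.flags & PG_TSC_AVAILABLE) && tsc_adjust)
		f.flags |= PG_TSC_KNOWN_RELIABLE;

	if (detected_khz > 0)
	{
		f.flags |= PG_TSC_FREQUENCY_KNOWN;
		f.tsc_frequency_khz = detected_khz;
	}

	return f;
}
```

With that shape callers only test the derived flags and read the stored
frequency, at the cost of the feature flags no longer being a pure mirror of
CPUID bits.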

> I've gone ahead and rewritten that whole paragraph for clarity, and
> also split it into two. Feedback welcome:
> 
> <para>
>   If enabled, the TSC clock source will use specialized CPU instructions
>   when measuring time intervals. This lowers timing overhead compared to
>   reading the OS system clock, and reduces the measurement error on top
>   of the actual runtime, for example with EXPLAIN ANALYZE.
> </para>


> <para>
>   On x86-64 CPUs the TSC clock source utilizes the Time-Stamp Counter (TSC)

It's a bit weird that the third use of TSC in these paragraphs introduces
Time-Stamp Counter. I can see how you get there, but ...

Now I wonder if we should rename 'tsc' to 'cpu'...


>   of the CPU. The RDTSC instruction is used to read the TSC for EXPLAIN ANALYZE.
>   For timings that require higher precision the RDTSCP instruction is used,
>   which avoids inaccuracies due to CPU instruction re-ordering. Use of
>   RDTSC/RDTSCP is not supported on older x86-64 CPUs or hypervisors that don't
>   pass the TSC frequency to guest VMs, and is not advised on systems that

s/guest VMs/virtual machines/?

>   utilize an emulated TSC. The TSC clock source is currently not supported on
>   other architectures.

The "not supported" bit about hypervisors isn't quite right though? We even
use it automatically if TSC_ADJUST is set (and the calibration loop succeeds).
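
For readers less familiar with the two instructions the paragraph describes,
a minimal sketch of the difference, assuming GCC or Clang on x86 and their
__rdtsc()/__rdtscp() intrinsics. All sketch_* names are hypothetical and the
tick-to-nanosecond helper is only an illustration of why the frequency is
needed; none of this is the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#include <x86intrin.h>
#define SKETCH_HAVE_X86 1
#else
#define SKETCH_HAVE_X86 0
#endif

/* True if CPUID reports RDTSCP (leaf 0x80000001, EDX bit 27). */
static bool
sketch_have_rdtscp(void)
{
#if SKETCH_HAVE_X86
	unsigned int eax, ebx, ecx, edx;

	if (__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
		return (edx & (1u << 27)) != 0;
#endif
	return false;
}

/* Cheap read: RDTSC may be reordered relative to nearby instructions. */
static inline uint64_t
sketch_ticks_fast(void)
{
#if SKETCH_HAVE_X86
	return __rdtsc();
#else
	return 0;					/* no TSC on this architecture */
#endif
}

/*
 * More precise read: RDTSCP waits until all earlier instructions have
 * executed.  Callers should check sketch_have_rdtscp() first.
 */
static inline uint64_t
sketch_ticks(void)
{
#if SKETCH_HAVE_X86
	unsigned int aux;			/* receives IA32_TSC_AUX, unused here */

	return __rdtscp(&aux);
#else
	return 0;
#endif
}

/* Convert a tick delta to nanoseconds given the TSC frequency in kHz. */
static inline uint64_t
sketch_ticks_to_ns(uint64_t ticks, uint32_t freq_khz)
{
	assert(freq_khz > 0);
	/* Split the division so ticks * 1000000 cannot overflow early. */
	return (ticks / freq_khz) * UINT64_C(1000000) +
		(ticks % freq_khz) * UINT64_C(1000000) / freq_khz;
}
```

The conversion is the reason a hypervisor hiding the TSC frequency matters:
without a known (or calibrated) frequency the raw tick counts cannot be turned
into time.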



> </para>
> <para>
>   To help decide which clock source to use you can run the
> <application>pg_test_timing</application>
>   utility to check TSC availability, and perform timing measurements.
> </para>

How about a link to the pg_test_timing page?  Hm, I guess that should also
be updated with new output.


I'd also sprinkle a few <acronym> and <command>s around.


Wonder if it's worth adding something like
        <indexterm><primary><acronym>RDTSC</acronym></primary></indexterm>
        <indexterm>
         <primary>Time-Stamp Counter</primary>
         <see><acronym>TSC</acronym></see>
        </indexterm>
        <indexterm><primary><acronym>TSC</acronym></primary></indexterm>

otherwise somebody seeing one of these in logs, pg_test_timing output or
whatever has even less of a chance to figure it out within our docs. They're
not exactly hard-to-search-for terms, so ...


> I've also marked pg_get_ticks(_fast) as pg_attribute_always_inline,
> per an off-list comment from Andres that he observed GCC not fully
> inlining that function in pg_test_timing, presumably due to the
> likely(..) in it.

It's not the likely(), I reproduced it even without that.  I mouthed off about
compilers on Mastodon and was kindly asked to just open a bug report :)

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124795
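
As an illustration of what forcing the inlining looks like, a small sketch
assuming GCC or Clang. The sketch_* macros and functions are hypothetical
stand-ins for pg_attribute_always_inline, likely() and pg_get_ticks(); the
real function bodies are not shown in this thread.

```c
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
/* Ask the compiler to inline regardless of its own cost heuristics. */
#define sketch_always_inline __attribute__((always_inline)) inline
#define sketch_likely(x) __builtin_expect((x) != 0, 1)
#else
#define sketch_always_inline inline
#define sketch_likely(x) ((x) != 0)
#endif

/* Hypothetical switch between a fast and a fallback clock source. */
static int	sketch_use_fast_path = 1;

static uint64_t
sketch_fast_ticks(void)
{
	return 2;					/* placeholder for a TSC read */
}

static uint64_t
sketch_slow_ticks(void)
{
	return 1;					/* placeholder for an OS clock read */
}

/* The hot-path wrapper that should always end up inlined into callers. */
static sketch_always_inline uint64_t
sketch_get_ticks(void)
{
	if (sketch_likely(sketch_use_fast_path))
		return sketch_fast_ticks();
	return sketch_slow_ticks();
}
```

The attribute only changes code generation, not behavior, so it is cheap
insurance while the compiler-side heuristic is being looked at.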


I think discussing which indexterms should be added signals that this is
pretty close.

Greetings,

Andres Freund


