Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? - Mailing list pgsql-hackers

From Lukas Fittl
Subject Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?
Date
Msg-id CAP53PkzLstzEHfZkXb_G7gpqDOWQa9bpG69HDmTSEA0ovRyiXw@mail.gmail.com
In response to Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?  (David Geier <geidav.pg@gmail.com>)
Responses Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?
List pgsql-hackers
On Tue, Nov 18, 2025 at 11:26 PM David Geier <geidav.pg@gmail.com> wrote:
On 22.10.2025 15:32, Andres Freund wrote:
> On 2025-09-01 12:36:24 +0200, David Geier wrote:
>>> Open questions I have:
>>> - Could we rely on checking whether the TSC timesource is invariant (via
>>> CPUID), instead of relying on Linux choosing it as a clocksource?
>>
>> Why do you want to do that? Are you concerned that Linux might pick a
>> different clock source even though invariant TSC is available?
>
> Not sure about Lukas, but I'm slightly concerned about making this a linux
> specific mechanism unnecessarily.
>

Considering [1], Lukas seems to share my concern that building our own
check has the risk of missing cases.

I had an off-list discussion with Andres about this at PGConf.EU, and one idea that was floated is that we could keep the Linux-specific mechanism when on Linux, but skip this check on other platforms, so as not to affect portability.
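
To make that idea concrete, here is a rough sketch of what such a platform-gated check could look like. This is illustration only, not code from the patch: the function names (clocksource_name_is_tsc, cpu_has_invariant_tsc, tsc_timing_looks_usable) are made up, and combining the CPUID invariant-TSC bit (CPUID.80000007H:EDX[8]) with the Linux clocksource check is my assumption about how the pieces might fit together. It relies on <cpuid.h> from GCC/Clang on x86.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>				/* GCC/Clang helper for the CPUID instruction */
#endif

/* Hypothetical helper: does a sysfs clocksource line (e.g. "tsc\n") name the TSC? */
static bool
clocksource_name_is_tsc(const char *line)
{
	size_t		len;

	assert(line != NULL);
	len = strcspn(line, "\r\n");	/* length without the trailing newline */
	return len == 3 && strncmp(line, "tsc", 3) == 0;
}

/* Hypothetical helper: invariant TSC is advertised in CPUID.80000007H:EDX[8]. */
static bool
cpu_has_invariant_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 if the CPU does not support this leaf */
	if (!__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx))
		return false;
	return (edx & (1u << 8)) != 0;
#else
	return false;				/* no TSC on non-x86 platforms */
#endif
}

/*
 * Hypothetical entry point: on Linux, additionally require that the kernel
 * itself chose the TSC as its clocksource; elsewhere, skip that check.
 */
static bool
tsc_timing_looks_usable(void)
{
	if (!cpu_has_invariant_tsc())
		return false;
#ifdef __linux__
	{
		char		buf[64];
		bool		ok;
		FILE	   *f = fopen("/sys/devices/system/clocksource/clocksource0/current_clocksource", "r");

		if (f == NULL)
			return false;
		ok = fgets(buf, sizeof(buf), f) != NULL && clocksource_name_is_tsc(buf);
		fclose(f);
		return ok;
	}
#else
	return true;				/* no OS-level check on other platforms */
#endif
}
```

On Linux this stays as conservative as the kernel's own choice of clocksource, while other platforms would depend on the CPUID bit alone, which is exactly where the "missing cases" concern applies.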
 
>> We could code our own check but looking at the Linux kernel code, this
>> is a bit more involved if we want to do it completely right. They check
>> e.g. if the TSC is also synchronized across different CPUs, which is not
>> the case if they're on different chassis (see unsynchronized_tsc() ->
>> apic_is_clustered_box()).
>
> I think Linux has higher fidelity requirements than our instrumentation usage
> - with linux an inaccurate clock would lead to broken timers, wrong wall clock
> etc, whereas for us it's just a skewed instrumentation result.

That's true. As long as we use the RDTSCP-based code only in places where
it doesn't affect "correctness", it's not the end of the world if the
timings are skewed.

I think my general worry here is that we basically give the user no escape hatch - you might end up with a case where Postgres gives you unusable EXPLAIN timings and you can't do anything to fix that.

Overall, I'm still thinking a GUC might be the way to go, but I don't think anyone else was enthusiastic about that idea :)

Thanks for working on an updated patch!

Thanks,
Lukas

--
Lukas Fittl
