Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?
Date
Msg-id 20230116023639.rn36vf6ajqmfciua@awork3.anarazel.de
Whole thread Raw
In response to Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi,

On 2023-01-13 11:55:47 -0800, Andres Freund wrote:
> Does anybody see a reason to not move forward with this aspect? We do a fair
> amount of INSTR_TIME_ACCUM_DIFF() etc, and that gets a good bit cheaper by
> just using nanoseconds. We'd also save memory in BufferUsage (144-122 bytes),
> Instrumentation (16 bytes saved in Instrumentation itself, 32 via
> BufferUsage).

This actually under-counted the benefits, because we have two BufferUsage and
two WalUsage in Instrumentation.

Before:
        /* size: 448, cachelines: 7, members: 20 */
        /* sum members: 445, holes: 1, sum holes: 3 */
After
        /* size: 368, cachelines: 6, members: 20 */
        /* sum members: 365, holes: 1, sum holes: 3 */


The difference in the number of instructions in InstrStopNode is astounding:
1016 instructions with timespec, 96 instructions with nanoseconds. Some of
that is the simpler data structure, some because the compiler now can
auto-vectorize the four INSTR_TIME_ACCUM_DIFF in BufferUsageAccumDiff into
one.

We probably should convert Instrumentation->firsttuple to a instr_time now as
well, no point in having the code for conversion to double in the hot routine,
that can easily happen in explain. But that's for a later patch.


I suggested downthread that we should convert the win32 implementation to be
more similar to the unix-nanoseconds representation. A blind conversion looks
good, and lets us share a number of macros.


I wonder if we should deprecate INSTR_TIME_IS_ZERO()/INSTR_TIME_SET_ZERO() and
allow 0 to be used instead. Not needing INSTR_TIME_SET_ZERO() allows variable
definitions to initialize the value, which does avoid some unnecessarily
awkward code.  Alternatively we could introduce INSTR_TIME_ZERO() for that
purpose?


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Fix GUC_NO_SHOW_ALL test scenario in 003_check_guc.pl
Next
From: Michael Paquier
Date:
Subject: Re: Lazy allocation of pages required for verifying FPI consistency