Hi,
On 2023-01-13 11:55:47 -0800, Andres Freund wrote:
> Does anybody see a reason to not move forward with this aspect? We do a fair
> amount of INSTR_TIME_ACCUM_DIFF() etc, and that gets a good bit cheaper by
just using nanoseconds. We'd also save memory in BufferUsage (144->112 bytes),
> Instrumentation (16 bytes saved in Instrumentation itself, 32 via
> BufferUsage).
This actually under-counted the benefits, because we have two BufferUsage and
two WalUsage in Instrumentation.
Before:
/* size: 448, cachelines: 7, members: 20 */
/* sum members: 445, holes: 1, sum holes: 3 */
After:
/* size: 368, cachelines: 6, members: 20 */
/* sum members: 365, holes: 1, sum holes: 3 */
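For reference, the underlying representational change is roughly the
following (a simplified sketch; the real instr_time.h has more portability
scaffolding around it):

    /* before, on unix: a 16 byte struct with seconds and nanoseconds */
    typedef struct timespec instr_time;

    /* after: a single 8 byte nanosecond counter */
    typedef struct instr_time
    {
        int64       ticks;      /* nanoseconds */
    } instr_time;

Each instr_time shrinking by 8 bytes, times the ten instr_time fields
reachable from Instrumentation (two in Instrumentation itself, four in each
of the two BufferUsage), accounts for the 80 bytes above.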
The difference in the number of instructions in InstrStopNode is astounding:
1016 instructions with timespec, 96 instructions with nanoseconds. Some of
that is due to the simpler data structure, and some is because the compiler
can now auto-vectorize the four INSTR_TIME_ACCUM_DIFF() calls in
BufferUsageAccumDiff() into one.
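To illustrate why that vectorizes: with the int64 representation each
INSTR_TIME_ACCUM_DIFF() boils down to plain integer arithmetic, so the four
accumulations in BufferUsageAccumDiff() are effectively (a sketch of the
expanded code, not the literal macro output):

    dst->blk_read_time.ticks +=
        add->blk_read_time.ticks - sub->blk_read_time.ticks;
    dst->blk_write_time.ticks +=
        add->blk_write_time.ticks - sub->blk_write_time.ticks;
    dst->temp_blk_read_time.ticks +=
        add->temp_blk_read_time.ticks - sub->temp_blk_read_time.ticks;
    dst->temp_blk_write_time.ticks +=
        add->temp_blk_write_time.ticks - sub->temp_blk_write_time.ticks;

That's four independent add/subtract chains over adjacent int64 fields,
which the compiler can turn into a couple of SIMD loads/adds/stores. The
timespec version needs carry handling between tv_sec and tv_nsec for each
field, which defeats that.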
We probably should convert Instrumentation->firsttuple to an instr_time now
as well; there's no point in having the conversion to double in the hot
routine when it can easily happen in explain. But that's for a later patch.
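I.e. roughly this (a hypothetical sketch of that later change, with
firsttuple's type switched from double to instr_time):

    /* InstrStopNode() currently does, simplified: */
    if (!instr->running)
    {
        instr->running = true;
        instr->firsttuple = INSTR_TIME_GET_DOUBLE(instr->counter);
    }

    /* with an instr_time firsttuple the assignment becomes a plain copy,
     * and explain.c would call INSTR_TIME_GET_DOUBLE() once, at output
     * time */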
I suggested downthread that we should convert the win32 implementation to be
more similar to the unix-nanoseconds representation. A blind conversion looks
good, and lets us share a number of macros.
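I.e. something like this on the win32 side (a sketch with a hypothetical
helper name; the actual conversion to nanoseconds would use
QueryPerformanceFrequency() in the GET_* macros):

    static inline instr_time
    pg_query_performance_counter(void)
    {
        LARGE_INTEGER now;
        instr_time  t;

        QueryPerformanceCounter(&now);
        t.ticks = now.QuadPart;     /* raw ticks, same int64 field as unix */
        return t;
    }

That keeps the representation a single int64 on both platforms, so macros
like INSTR_TIME_ACCUM_DIFF() can be shared.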
I wonder if we should deprecate INSTR_TIME_IS_ZERO()/INSTR_TIME_SET_ZERO()
and allow plain 0 to be used instead. Not needing INSTR_TIME_SET_ZERO() would
allow variable definitions to initialize the value, which avoids some
unnecessarily awkward code. Alternatively, we could introduce
INSTR_TIME_ZERO() for that purpose?
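The latter could be as simple as (hypothetical, relying on a C99 compound
literal):

    #define INSTR_TIME_ZERO() ((instr_time) {0})

    /* currently: */
    instr_time  duration;
    INSTR_TIME_SET_ZERO(duration);

    /* with INSTR_TIME_ZERO(), initialization at the definition: */
    instr_time  duration = INSTR_TIME_ZERO();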
Greetings,
Andres Freund