PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size - Mailing list pgsql-hackers

Hi,

Mark Callaghan reported a regression in 15, in the post linked here (with
comments in the twitter thread, hence linked here)
https://twitter.com/MarkCallaghanDB/status/1537475430227161098

A differential flame graph shows increased time spent doing memory
allocations, below ExecInitExpr():
https://mdcallag.github.io/reports/22_06_06_ibench.20m.pg.all/l.i0.0.143.15b1.n.svg

Quite the useful comparison.


That lead to me to look at the size of ExprEvalStep - and indeed, it has
*exploded* in 15. 10-13: 64 bytes, 14: 320 bytes.

The comment above the union for data fields says:
    /*
     * Inline data for the operation.  Inline data is faster to access, but
     * also bloats the size of all instructions.  The union should be kept to
     * no more than 40 bytes on 64-bit systems (so that the entire struct is
     * no more than 64 bytes, a single cacheline on common systems).
     */

However, jsonexpr/EEOP_JSONEXPR is 296 bytes, and
hashedscalararrayop/EEOP_HASHED_SCALARARRAYOP is 64 bytes, even though the
limit is 40 bytes.

The EEOP_JSONEXPR stuff was added during 15 development in:

commit 1a36bc9dba8eae90963a586d37b6457b32b2fed4
Author: Andrew Dunstan <andrew@dunslane.net>
Date:   2022-03-03 13:11:14 -0500

    SQL/JSON query functions


the EEOP_HASHED_SCALARARRAYOP stuff was added during 14 development in:

commit 50e17ad281b8d1c1b410c9833955bc80fbad4078
Author: David Rowley <drowley@postgresql.org>
Date:   2021-04-08 23:51:22 +1200

    Speedup ScalarArrayOpExpr evaluation


Unfortunately ExprEvalStep is public, so I don't think we can fix the 14
regression. If somebody installed updated server packages while the server is
running, we could end up loading extensions referencing ExprEvalStep (at least
plpgsql and LLVMJIT).

It's not great to have an ABI break at this point of the 15 cycle, but I don't
think we have a choice. Exploding the size of ExprEvalStep by ~4x is bad -
both for memory overhead (expressions often have dozens to hundreds of steps)
and expression evaluation performance.


The Hashed SAO case can perhaps be squeezed sufficiently to fit inline, but
clearly that's not going to happen for the json case. So we should just move
that out of line.


Maybe it's worth sticking a StaticAssert() for the struct size somewhere. I'm
a bit wary about that being too noisy, there are some machines with odd
alignment requirements. Perhaps worth restricting the assertion to x86-64 +
armv8 or such?


It very well might be that this isn't the full explanation of the regression
Mark observed. E.g. the difference in DecodeDateTime() looks more likely to be
caused by 591e088dd5b - but we need to fix the above issue, whether it's the
cause of the regression Mark observed or not.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: better page-level checksums
Next
From: Tom Lane
Date:
Subject: Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size