[COMMITTERS] pgsql: Redesign get_attstatsslot()/free_attstatsslot() for moresafety - Mailing list pgsql-committers

From Tom Lane
Subject [COMMITTERS] pgsql: Redesign get_attstatsslot()/free_attstatsslot() for moresafety
Date
Msg-id E1d9cV3-0004FI-0w@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Redesign get_attstatsslot()/free_attstatsslot() for more safety and speed.

The mess cleaned up in commit da0759600 is clear evidence that it's a
bug hazard to expect the caller of get_attstatsslot()/free_attstatsslot()
to provide the correct type OID for the array elements in the slot.
Moreover, we weren't even getting any performance benefit from that,
since get_attstatsslot() was extracting the real type OID from the array
anyway.  So we ought to get rid of that requirement; indeed, it would
make more sense for get_attstatsslot() to pass back the type OID it found,
in case the caller isn't sure what to expect, which is likely in binary-
compatible-operator cases.

Another problem with the current implementation is that if the stats array
element type is pass-by-reference, we incur a palloc/memcpy/pfree cycle
for each element.  That seemed acceptable when the code was written because
we were targeting O(10) array sizes --- but these days, stats arrays are
almost always bigger than that, sometimes much bigger.  We can save a
significant number of cycles by doing one palloc/memcpy/pfree of the whole
array.  Indeed, in the now-probably-common case where the array is toasted,
that happens anyway so this method is basically free.  (Note: although the
catcache code will inline any out-of-line toasted values, it doesn't
decompress them.  At the other end of the size range, it doesn't expand
short-header datums either.  In either case, DatumGetArrayTypeP would have
to make a copy.  We do end up using an extra array copy step if the element
type is pass-by-value and the array length is neither small enough for a
short header nor large enough to have suffered compression.  But that
seems like a very acceptable price for winning in pass-by-ref cases.)

Hence, redesign to take these insights into account.  While at it,
convert to an API in which we fill a struct rather than passing a bunch
of pointers to individual output arguments.  That will make it less
painful if we ever want further expansion of what get_attstatsslot can
pass back.

It's certainly arguable that this is new development and not something to
push post-feature-freeze.  However, I view it as primarily bug-proofing
and therefore something that's better to have sooner not later.  Since
we aren't quite at beta phase yet, let's put it in.

Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/9aab83fc5039d83e84144b7bed3fb1d62a74ae78

Modified Files
--------------
contrib/intarray/_int_selfuncs.c            |  31 +-
src/backend/executor/nodeHash.c             |  25 +-
src/backend/tsearch/ts_selfuncs.c           |  18 +-
src/backend/utils/adt/array_selfuncs.c      |  85 ++----
src/backend/utils/adt/network_selfuncs.c    | 214 ++++++--------
src/backend/utils/adt/rangetypes_selfuncs.c |  58 ++--
src/backend/utils/adt/selfuncs.c            | 419 +++++++++++-----------------
src/backend/utils/cache/lsyscache.c         | 149 +++++-----
src/include/utils/lsyscache.h               |  34 ++-
src/include/utils/selfuncs.h                |   4 +-
10 files changed, 436 insertions(+), 601 deletions(-)


pgsql-committers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: [COMMITTERS] pgsql: Use a better way of skipping all subscriptiontests on Windows
Next
From: Tom Lane
Date:
Subject: Re: [COMMITTERS] pgsql: Fix multi-column range partitioning constraints.