pgsql: Fix get_actual_variable_range() to cope with broken HOT chains. - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Fix get_actual_variable_range() to cope with broken HOT chains.
Date
Msg-id E1hm26T-0006KR-KI@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Fix get_actual_variable_range() to cope with broken HOT chains.

Commit 3ca930fc3 modified get_actual_variable_range() to use a new
"SnapshotNonVacuumable" snapshot type for selecting tuples that it
would consider valid.  However, because that snapshot type can accept
recently-dead tuples, this caused a bug when using a recently-created
index: we might accept a recently-dead tuple that is an early member
of a broken HOT chain and does not actually match the index entry.
Then, the data extracted from the heap tuple would not necessarily be
an endpoint value of the column; it could even be NULL, leading to
get_actual_variable_range() itself reporting "found unexpected null
value in index".  Even without an error, this could lead to poor
plan choices due to an erroneous notion of the endpoint value.

We can improve matters by changing the code to use the index-only
scan technique (which didn't exist when get_actual_variable_range was
originally written).  If any of the tuples in a HOT chain are live
enough to satisfy SnapshotNonVacuumable, we take the data from the
index entry, ignoring what is in the heap.  This fixes the problem
without changing the live-vs-dead-tuple behavior from what was
intended by commit 3ca930fc3.

A side benefit is that for static tables we might not have to touch
the heap at all (when the extremal value is in an all-visible page).
In addition, we can save some overhead by not having to create a
complete ExecutorState, and we don't need to run FormIndexDatum,
avoiding more cycles as well as the possibility of failure for
indexes on expressions.  (I'm not sure that this code would ever
be used to determine the extreme value of an expression, in the
current state of the planner; but it's definitely possible that
lower-order columns of the selected index could be expressions.
So one could construct perhaps-artificial examples in which the
old code unexpectedly failed due to trying to compute an
expression's value for a now-dead row.)

Per report from Manuel Rigger.  Back-patch to v11 where commit
3ca930fc3 came in.

Discussion: https://postgr.es/m/CA+u7OA7W4NWEhCvftdV6_8bbm2vgypi5nuxfnSEJQqVKFSUoMg@mail.gmail.com

Branch
------
REL_11_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/5c1b7edc23a01afedc10702f06b6e0da7b0c56f6

Modified Files
--------------
src/backend/utils/adt/selfuncs.c | 282 +++++++++++++++++++++++----------------
1 file changed, 170 insertions(+), 112 deletions(-)


pgsql-committers by date:

Previous
From: David Rowley
Date:
Subject: pgsql: Fix RANGE partition pruning with multiple boolean partitionkeys
Next
From: Thomas Munro
Date:
Subject: pgsql: Warn if wal_level is too low when creating a publication.