pgsql: Fix pathological nbtree split point choice issue. - Mailing list pgsql-committers

From Peter Geoghegan
Subject pgsql: Fix pathological nbtree split point choice issue.
Date
Msg-id E1hn7Ra-0003QU-A5@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Fix pathological nbtree split point choice issue.

Specific ever-decreasing insertion patterns could cause successive
unbalanced nbtree page splits.  Problem cases involve a large group of
duplicates to the left, and ever-decreasing insertions to the right.

To fix, detect the situation by considering the newitem offset before
performing a split using nbtsplitloc.c's "many duplicates" strategy.  If
the new item was inserted just to the right of our provisional "many
duplicates" split point, infer ever-decreasing insertions and fall back
on a 50:50 (space delta optimal) split.  This seems to barely affect
cases that already had acceptable space utilization.

An alternative fix also seems possible.  Instead of changing
nbtsplitloc.c split choice logic, we could instead teach _bt_truncate()
to generate a new value for new high keys by interpolating from the
lastleft and firstright key values.  That would certainly be a more
elegant fix, but it isn't suitable for backpatching.

Discussion: https://postgr.es/m/CAH2-WznCNvhZpxa__GqAa1fgQ9uYdVc=_apArkW2nc-K3O7_NA@mail.gmail.com
Backpatch: 12-, where the nbtree page split enhancements were introduced.

Branch
------
REL_12_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/2f23453e492955c32d952bc1cff651e66657d32b

Modified Files
--------------
src/backend/access/nbtree/nbtsplitloc.c | 90 +++++++++++++++++++++++----------
1 file changed, 64 insertions(+), 26 deletions(-)


pgsql-committers by date:

Previous
From: Tom Lane
Date:
Subject: pgsql: Represent Lists as expansible arrays, not chains of cons-cells.
Next
From: Peter Geoghegan
Date:
Subject: pgsql: Correct nbtsplitloc.c comment.