pgsql: Consider outliers in split interval calculation. - Mailing list pgsql-committers

From Peter Geoghegan
Subject pgsql: Consider outliers in split interval calculation.
Date
Msg-id E1jQwFb-0001CQ-QI@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Consider outliers in split interval calculation.

Commit 0d861bbb, which introduced deduplication to nbtree, added some
logic to take large posting list tuples into account when choosing a
split point.  We subtract firstright posting list overhead from the
projected new high key size when calculating leftfree/rightfree values
for an affected candidate split point.  Posting list tuples aren't
special to nbtsplitloc.c, but taking them into account like this makes a
huge difference in practice.  Posting list tuples are frequently tuple
size outliers.

However, commit 0d861bbb missed a closely related issue: split interval
itself is calculated based on the assumption that tuples on the page
being split are roughly equisized.  That assumption was acceptable back
when commit fab25024 taught the logic for choosing a split point about
suffix truncation, but it's pretty questionable now that very large
tuple sizes are common.  This oversight led to unbalanced page splits in
low cardinality multi-column indexes when deduplication was used: page
splits that don't give sufficient weight to how unbalanced the split is
when the interval happens to include some large posting list tuples (and
when most other tuples on the page are not so large).

Nail this down by calculating an initial split interval in a way that's
attuned to the actual cost that we want to keep under control (not a
fuzzy proxy for the cost): apply a leftfree + rightfree evenness test to
each candidate split point that actually gets included in the split
interval (for the default strategy).  This replaces logic that used a
percentage of all legal split points for the page as the basis of the
initial split interval.

Discussion: https://postgr.es/m/CAH2-WznJt5aT2uUB2Bs+JBLdwe0XTX67+xeLFcaNvCKxO=QBVQ@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/1542e16f2ca257656717ab8ea263bc310f1b6d51

Modified Files
--------------
src/backend/access/nbtree/nbtsplitloc.c | 104 +++++++++++++++++++++++++-------
1 file changed, 83 insertions(+), 21 deletions(-)


pgsql-committers by date:

Previous
From: Tom Lane
Date:
Subject: pgsql: Allow matchingsel() to be used with operators that might return
Next
From: Alvaro Herrera
Date:
Subject: pgsql: Fix detaching partitions with cloned row triggers