pgsql: Harmonize nbtree page split point code. - Mailing list pgsql-committers

From Peter Geoghegan
Subject pgsql: Harmonize nbtree page split point code.
Msg-id E1jO8gb-0002f6-Oq@gemulon.postgresql.org
List pgsql-committers
Harmonize nbtree page split point code.

An nbtree split point can be thought of as a point between two adjoining
tuples from an imaginary version of the page being split that includes
the incoming/new item (in addition to the items that really are on the
page).  These adjoining tuples are called the lastleft and firstright
tuples.

The variables that represent split points contained a field called
firstright, which was the offset number of the first data item from the
original page that goes on the new right page.  The corresponding tuple
from origpage was usually the same thing as the actual firstright tuple,
but not always: the firstright tuple is sometimes the new/incoming item
instead.  This situation seems unnecessarily confusing.

Make things clearer by renaming the origpage offset returned by
_bt_findsplitloc() to "firstrightoff".  We now have a firstright tuple
and a firstrightoff offset number which are comparable to the
newitem/lastleft tuples and the newitemoff/lastleftoff offset numbers
respectively.  Also make sure that we are consistent about how we
describe nbtree page split point state.
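
The relationship between the firstrightoff offset and the lastleft and
firstright tuples can be sketched with a toy model.  This is only an
illustration, not the nbtree code: tuples are plain ints, and the
SplitPoint and SplitTuples structs and the resolve_split_tuples()
function are hypothetical names made up for the example.  Only the
terms firstrightoff, newitemoff, newitemonleft, lastleft and firstright
come from the commit message and the nbtree sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a split point: not the real nbtree struct. */
typedef struct SplitPoint
{
	int			firstrightoff;	/* 1-based origpage offset of the first
								 * original item that goes right */
	bool		newitemonleft;	/* does the incoming item go left? */
} SplitPoint;

/* Hypothetical result type: the two tuples that enclose the split point. */
typedef struct SplitTuples
{
	int			lastleft;
	int			firstright;
	bool		lastleft_is_newitem;
	bool		firstright_is_newitem;
} SplitTuples;

/*
 * Find the lastleft and firstright tuples for a split point without
 * materializing the imaginary page that includes the new item.
 *
 * origpage holds nitems keys; offsets are 1-based.  newitemoff is the
 * offset at which the new item would be inserted (1 .. nitems + 1).
 */
static SplitTuples
resolve_split_tuples(const int *origpage, int nitems,
					 int newitem, int newitemoff, SplitPoint sp)
{
	SplitTuples res = {0, 0, false, false};

	/* The new item must land on the side the split point says it does */
	if (sp.newitemonleft)
		assert(newitemoff <= sp.firstrightoff);
	else
		assert(newitemoff >= sp.firstrightoff);

	if (!sp.newitemonleft && newitemoff == sp.firstrightoff)
	{
		/* incoming item becomes firstright */
		res.firstright = newitem;
		res.firstright_is_newitem = true;
	}
	else
	{
		assert(sp.firstrightoff >= 1 && sp.firstrightoff <= nitems);
		res.firstright = origpage[sp.firstrightoff - 1];
	}

	if (sp.newitemonleft && newitemoff == sp.firstrightoff)
	{
		/* incoming item becomes lastleft */
		res.lastleft = newitem;
		res.lastleft_is_newitem = true;
	}
	else
	{
		assert(sp.firstrightoff >= 2 && sp.firstrightoff <= nitems + 1);
		res.lastleft = origpage[sp.firstrightoff - 2];
	}

	return res;
}
```

In this model, an original page holding 10, 20, 30, 40 with an incoming
item 25 at newitemoff 3 gives the imaginary page 10, 20, 25, 30, 40.
With firstrightoff 3, the firstright tuple is the new item when
newitemonleft is false, and is the original item 30 when newitemonleft
is true (in which case the new item is lastleft).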

Push the responsibility for dealing with pg_upgrade'd !heapkeyspace
indexes down to lower level code, relieving _bt_split() from dealing
with it directly.  This means that we always have a palloc'd left page
high key on the leaf level, no matter what.  This enables simplifying
some of the code (and code comments) within _bt_split().

Finally, restructure the page split code to make it clearer why suffix
truncation (which only takes place during leaf page splits) is
completely different to the first data item truncation that takes place
during internal page splits.  Tuples are marked as having fewer
attributes stored in both cases, and the firstright tuple is truncated
in both cases, so it's easy to imagine somebody missing the distinction.
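
The distinction can be sketched with a toy model.  This is a simplified
illustration under stated assumptions, not the nbtree implementation:
ModelTuple and every function name below are hypothetical, key
attributes are plain ints, and the heap TID tiebreaker that real leaf
splits can fall back on when all key attributes are equal is left out.

```c
#include <assert.h>

#define MAX_ATTS 4

/* Hypothetical tuple model: a few int key attributes plus a downlink. */
typedef struct ModelTuple
{
	int			natts;			/* number of key attributes stored */
	int			atts[MAX_ATTS];
	int			downlink;		/* child block; internal pages only */
} ModelTuple;

/*
 * Leaf split: count how many leading attributes of firstright are
 * needed to tell it apart from lastleft.
 */
static int
model_keep_natts(const ModelTuple *lastleft, const ModelTuple *firstright)
{
	int			keep = 1;

	for (int i = 0; i < lastleft->natts && i < firstright->natts; i++)
	{
		if (lastleft->atts[i] != firstright->atts[i])
			break;
		keep++;
	}
	return keep;
}

/*
 * Leaf split (suffix truncation): the left page's new high key is a
 * copy of firstright that keeps only the distinguishing prefix.  The
 * firstright tuple that goes on the right page is not modified.
 */
static ModelTuple
model_leaf_high_key(const ModelTuple *lastleft, const ModelTuple *firstright)
{
	ModelTuple	hikey = *firstright;
	int			keep = model_keep_natts(lastleft, firstright);

	/* Assumption of this sketch: the tuples differ on some attribute */
	assert(keep <= firstright->natts);
	hikey.natts = keep;
	for (int i = keep; i < MAX_ATTS; i++)
		hikey.atts[i] = 0;
	return hikey;
}

/*
 * Internal split: the left page's high key keeps firstright's full key,
 * while the copy stored as the right page's first data item drops all
 * key attributes and keeps only its downlink.
 */
static ModelTuple
model_internal_high_key(const ModelTuple *firstright)
{
	return *firstright;
}

static ModelTuple
model_internal_first_data_item(const ModelTuple *firstright)
{
	ModelTuple	item = *firstright;

	item.natts = 0;
	for (int i = 0; i < MAX_ATTS; i++)
		item.atts[i] = 0;
	return item;
}
```

In both cases a tuple derived from firstright ends up with fewer
attributes, but in the leaf case the number kept depends on comparing
lastleft with firstright, whereas in the internal case the first data
item always keeps zero key attributes regardless of its neighbors.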

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/bc3087b626d1073c9b7c9687b334785909ca2237

Modified Files
--------------
contrib/amcheck/verify_nbtree.c         |   2 +-
src/backend/access/nbtree/nbtinsert.c   | 343 ++++++++++++++++++--------------
src/backend/access/nbtree/nbtsort.c     |  42 ++--
src/backend/access/nbtree/nbtsplitloc.c | 202 ++++++++++---------
src/backend/access/nbtree/nbtutils.c    |  11 +-
src/backend/access/nbtree/nbtxlog.c     |  13 +-
src/backend/access/rmgrdesc/nbtdesc.c   |   4 +-
src/include/access/nbtxlog.h            |   8 +-
8 files changed, 334 insertions(+), 291 deletions(-)