Thread: GiST secondary split

GiST secondary split

From
Peter Griggs
Date:
I am hacking on some GiST code for a research project and wanted clarification about what exactly a secondary split is in GiST. More specifically, I am wondering why the supportSecondarySplit function (in src/backend/access/gist/gistsplit.c) can assume that the data is currently on the left side in order to swap it.

/*
 * Clean up when we did a secondary split but the user-defined PickSplit
 * method didn't support it (leaving spl_ldatum_exists or spl_rdatum_exists
 * true).
 *
 * We consider whether to swap the left and right outputs of the secondary
 * split; this can be worthwhile if the penalty for merging those tuples into
 * the previously chosen sets is less that way.
 *
 * In any case we must update the union datums for the current column by
 * adding in the previous union keys (oldL/oldR), since the user-defined
 * PickSplit method didn't do so.
 */
static void
supportSecondarySplit(Relation r, GISTSTATE *giststate, int attno,
                      GIST_SPLITVEC *sv, Datum oldL, Datum oldR)
{

Best,
Peter

--
Peter Griggs
Masters of Engineering (Meng) in Computer Science
Massachusetts Institute of Technology | 2020

Re: GiST secondary split

From
Alexander Korotkov
Date:
Hi, Peter!

On Sat, Mar 21, 2020 at 12:36 AM Peter Griggs <petergriggs33@gmail.com> wrote:
> I am hacking some GIST code for a research project and wanted clarification about what exactly a secondary split is in GIST.

Secondary split in GiST is the split by the second and subsequent
columns of a multicolumn GiST index.  In general it works as follows.
The split by the first column produces two union keys.  It might happen
that some first-column values are contained in both union keys.
If so, the corresponding tuples are subject to a secondary split.
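
To make the selection step concrete, here is a toy sketch in C. It is
not PostgreSQL code: the Interval type, contains() and
find_secondary_candidates() are hypothetical names, and a 1-D interval
stands in for whatever the first column's opclass actually stores.  It
only illustrates the idea of picking out tuples whose first-column key
fits inside both union keys.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy 1-D interval standing in for a first-column key (assumption,
 * not a PostgreSQL type). */
typedef struct
{
    double lo;
    double hi;
} Interval;

/* True when `inner` lies entirely within `outer`. */
static bool
contains(Interval outer, Interval inner)
{
    return outer.lo <= inner.lo && inner.hi <= outer.hi;
}

/*
 * Store in `out` the indexes of the keys that are contained in both the
 * left and the right union key, and return how many were found.  In this
 * toy model these are the tuples the first column cannot tell apart, so
 * they would be handed to a split on the next column.
 * `out` must have room for n entries.
 */
static size_t
find_secondary_candidates(const Interval *keys, size_t n,
                          Interval unionL, Interval unionR,
                          size_t *out)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (contains(unionL, keys[i]) && contains(unionR, keys[i]))
            out[count++] = i;
    }
    return count;
}
```

With union keys [0, 6] and [4, 10], a key such as [4, 5] lies inside
both, so it would be a candidate, while [0, 2] belongs only to the left
side.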

> More specifically I am wondering why the supportSecondarySplit function (which is in src/backend/access/gist/gistsplit.c) can assume that the data is currently on the left side in order to swap it.

I don't think it assumes that all the data is currently on the left
side.  There are left and right sides of the primary split, and at
the same time there are left and right sides of the secondary split.
They might be unioned straight or crosswise.  The name of the
leaveOnLeft variable might be confusing: leaveOnLeft == true means
straight union, while leaveOnLeft == false means crosswise union.
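
As an illustration of the straight versus crosswise choice, here is a
toy sketch in C.  It is not the code from gistsplit.c: the Interval
type and the union_of(), penalty() and leave_on_left() helpers are
hypothetical, and the penalty here is simply how much an interval has
to grow, whereas the real code relies on the opclass penalty function.
The sketch follows the idea in the function's header comment: pick
whichever pairing has the lower total penalty for merging the
secondary sides into the previously chosen sets.

```c
#include <stdbool.h>

/* Toy 1-D interval standing in for a union key (assumption, not a
 * PostgreSQL type). */
typedef struct
{
    double lo;
    double hi;
} Interval;

/* Smallest interval covering both a and b. */
static Interval
union_of(Interval a, Interval b)
{
    Interval u;

    u.lo = (a.lo < b.lo) ? a.lo : b.lo;
    u.hi = (a.hi > b.hi) ? a.hi : b.hi;
    return u;
}

/* Toy penalty: how much `orig` must grow in length to also cover `add`. */
static double
penalty(Interval orig, Interval add)
{
    Interval u = union_of(orig, add);

    return (u.hi - u.lo) - (orig.hi - orig.lo);
}

/*
 * Return true for a straight union (secondary left joins primary left,
 * secondary right joins primary right) and false for a crosswise union
 * (the secondary sides are swapped first).
 */
static bool
leave_on_left(Interval oldL, Interval oldR, Interval secL, Interval secR)
{
    double straight = penalty(oldL, secL) + penalty(oldR, secR);
    double crosswise = penalty(oldL, secR) + penalty(oldR, secL);

    /* Swap only when crosswise is strictly cheaper. */
    return !(straight > crosswise);
}
```

For example, with primary keys [0, 4] and [6, 10], a secondary split
that returns [7, 8] as its left side and [1, 2] as its right side is
cheaper to union crosswise, so leave_on_left() returns false.  Nothing
assumes the data starts on the left; "left" and "right" of the
secondary split are just labels that may or may not line up with the
primary ones.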

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company