Neil Conway <neilc@samurai.com> writes:
> I'm not sure the right way to fix it (at least without significant
> changes to the GiST API). At present, the PickSplit() method is passed a
> vector of GISTENTRYs and fills in a GIST_SPLITVEC. The GISTENTRYs
> correspond to the first attributes of all the tuples in the node to be
> split. There is no provision for the GiST extension to be informed about
> any additional attributes in the index tuples. Even if we changed the
> API to allow that, the GiST extension would have a hard time making a
> reasonable decision in the multi-attribute case: the additional index
> attributes could well be implemented using a different GiST extension.

Right, when you consider attributes of different types it gets tough.

I'm not familiar with the details of the GiST code, but would it work to
generalize PickSplit to return a three-way classification? That is,
instead of actually splitting the node, have it identify each item as
"definitely left", "definitely right", or "don't care" (think "less",
"greater", or "equal to" the desired split value). If there is another
index column, you then invoke its PickSplit on just the "don't care" set
to try to classify them; and so on. Once you are out of columns, any
remaining "don't care" set is split using some standardized method like
trying to equalize the physical space.
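
To make the idea concrete, here is a rough sketch in C of the cascade described above. It is not the real GiST API: the Item struct, the Side enum, and both functions are hypothetical names made up for illustration, the columns are plain ints, and the per-column rule (compare against the midpoint of the undecided values) is only a stand-in for whatever a real PickSplit would decide. The final fallback balances item counts rather than physical space, and null inputs are not handled.

```c
#include <assert.h>
#include <stddef.h>

#define NCOLS 2                 /* illustrative: two index columns */

/* Hypothetical three-way verdict; not part of the existing GiST API. */
typedef enum { DONT_CARE, LEFT, RIGHT } Side;

/* Stand-in for an index tuple: one int per column. */
typedef struct { int col[NCOLS]; } Item;

/*
 * Hypothetical per-column classifier.  Looks only at items still marked
 * DONT_CARE and compares column c against the midpoint of their values:
 * less goes left, greater goes right, equal stays DONT_CARE.
 */
void
classify_column(const Item *items, Side *side, size_t n, int c)
{
    int lo = 0, hi = 0, seen = 0;

    assert(c >= 0 && c < NCOLS);
    for (size_t i = 0; i < n; i++) {
        if (side[i] != DONT_CARE)
            continue;
        int v = items[i].col[c];
        if (!seen || v < lo) lo = v;
        if (!seen || v > hi) hi = v;
        seen = 1;
    }
    if (!seen)
        return;                 /* nothing left to classify */

    int mid = lo + (hi - lo) / 2;
    for (size_t i = 0; i < n; i++) {
        if (side[i] != DONT_CARE)
            continue;
        if (items[i].col[c] < mid)
            side[i] = LEFT;
        else if (items[i].col[c] > mid)
            side[i] = RIGHT;
    }
}

/*
 * Driver: try each column in turn on the remaining DONT_CARE set, then
 * hand out whatever is still undecided so the two sides stay balanced.
 */
void
three_way_split(const Item *items, Side *side, size_t n)
{
    size_t nleft = 0, nright = 0;

    for (size_t i = 0; i < n; i++)
        side[i] = DONT_CARE;
    for (int c = 0; c < NCOLS; c++)
        classify_column(items, side, n, c);

    for (size_t i = 0; i < n; i++) {
        if (side[i] == LEFT) nleft++;
        else if (side[i] == RIGHT) nright++;
    }
    for (size_t i = 0; i < n; i++) {
        if (side[i] != DONT_CARE)
            continue;
        if (nleft <= nright) { side[i] = LEFT; nleft++; }
        else                 { side[i] = RIGHT; nright++; }
    }
}
```

With four items whose first column is always 5 and whose second column is 10, 30, 20, 20, the first column decides nothing, the second sends 10 left and 30 right, and the two remaining ties are dealt out one to each side.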

Of course this requires API changes for PickSplit (not only a more
complex result, but it will have to deal with null inputs). But it
seems like it would be localized otherwise. Also it might allow removal
of duplicated logic to deal with splitting equal values sanely.

			regards, tom lane