I'm sorry if I missed something, but ISTM this is beginning to look a
lot like GiST. This was pointed out by Robert Haas last year.
On Wed, Jun 18, 2014 at 12:09:42PM -0300, Claudio Freire wrote:
> So, you have:
>
> An aggregate to generate a "compressed set" from several values
Which GiST does by calling 'compress' on each value, and the 'unions' the
results together.
> A function which adds a new value to the "compressed set" and returns
> the new "compressed set"
Again, 'compress' + 'union'
> A function which tests if a value is in a "compressed set"
Which GiST does using 'compress' +'consistant'
> A function which tests if a "compressed set" overlaps another
> "compressed set" of equal type
Which GiST calls 'consistant'
So I'm wondering why you can't just reuse the btree_gist functions we
already have in contrib. It seems to me that these MinMax indexes are
in fact a variation on GiST that indexes the pages of a table based
upon the 'union' of all the elements in a page. By reusing the GiST
operator class you get support for many datatypes for free.
> If you can define different compressed sets, you can use this to
> generate both min/max indexes as well as bloom filter indexes. Whether
> we'd want to have both is perhaps questionable, but having the ability
> to is probably desirable.
You could implement bloom filter in GiST too. It's been discussed
before but I can't find any implementation. Probably because the filter
needs to be parameterised and if you store the bloom filter for each
element it gets expensive very quickly. However, hooked into a minmax
structure which only indexes whole pages it could be quite efficient.
> One problem with such a generalized implementation would be, that I'm
> not sure in-place modification of the "compressed set" on-disk can be
> assumed to be safe on all cases. Surely, for strictly-enlarging sets
> it would, but while min/max and bloom filters both fit the bill, it's
> not clear that one can assume this for all structures.
I think GiST has already solved this problem.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts. -- Arthur Schopenhauer