Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence - Mailing list pgsql-hackers

From Anastasia Lubennikova
Subject Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence
Msg-id 771d014f-1b78-2770-4e97-c2413b889e77@postgrespro.ru
In response to Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On 31.12.2019 01:40, Peter Geoghegan wrote:
> On Mon, Dec 30, 2019 at 9:45 AM Robert Haas <robertmhaas@gmail.com> wrote:
>>> For example, float and numeric types are "never bitwise equal", while array,
>>> text, and other container types are "maybe bitwise equal". An array of
>>> integers or text with C collation can be treated as bitwise equal attributes,
>>> and it would be too harsh to restrict them from deduplication.
> We might as well support container types (like array) in the first
> Postgres version that has nbtree deduplication, I suppose. Even still,
> I don't think that it actually matters much to users. B-Tree indexes
> on arrays are probably very rare. Note that I don't consider text to
> be a container type here -- obviously btree/text_ops is a very
> important opclass for the deduplication feature. It may be the most
> important opclass overall.
>
> Recursively invoking a support function for the "contained" data type
> in the btree/array_ops support function seems like it might be messy.
> Not sure about that, though.
>
>>> What bothers me is that this option will unlikely be helpful on its own
>>> and we should also provide some kind of recheck function along with opclass,
>>> which complicates this idea even further and doesn't seem very clear.
>> It seems like the simplest thing might be to forget about the 'char'
>> column and just have a support function which can be used to assess
>> whether a given opclass's notion of equality is bitwise.
> I like the idea of relying only on a support function.

Attached is a WIP patch that adds a support function for btree opclasses.
Before continuing, I want to make sure that I have understood the
discussion above correctly.

The current version of the patch adds:

1) New syntax that allows a support function to be specified:

CREATE OPERATOR CLASS int4_ops_test
FOR TYPE int4 USING btree AS
         OPERATOR 1 =(int4, int4),
         FUNCTION 1 btint4cmp(int4, int4),
         SUPPORT datum_image_eqisbitwise;

We can probably add more words to specify the purpose of the support
function.
Do you have any other objections to the placement of this new element in
the CreateOpClass syntax structure?

2) A trivial support function, 'datum_image_eqisbitwise', that always
returns true.
It is named after 'datum_image_eq', because we define this support
function via its behavior.
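
To illustrate the distinction this support function is meant to capture,
here is a minimal standalone C sketch. It is not taken from the patch: the
function names float8_op_equal and float8_image_equal are hypothetical, and
the operator comparison is simplified (it ignores the special NaN handling
that the real float8 comparison performs). It shows why an opclass on a
float type could not report that its equality is bitwise: 0.0 and -0.0
compare equal, yet their byte images differ.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Simplified stand-in for an opclass "=" operator on float8.
 * Assumption: plain C comparison, without PostgreSQL's NaN rules.
 */
bool
float8_op_equal(double a, double b)
{
    return a == b;
}

/*
 * Byte-image comparison, in the spirit of datum_image_eq for a
 * fixed-length pass-by-value type: equal only if every byte matches.
 */
bool
float8_image_equal(double a, double b)
{
    return memcmp(&a, &b, sizeof(double)) == 0;
}
```

With IEEE 754 doubles, float8_op_equal(0.0, -0.0) is true while
float8_image_equal(0.0, -0.0) is false, so operator equality does not imply
image equality for such a type. For int4, by contrast, the two notions
coincide, which is why a trivially-true support function fits that opclass.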

If this prototype is fine, I will continue this work: add support
functions for other opclasses, and update pg_dump and the documentation.

Thoughts?

