Re: Building infrastructure for B-Tree deduplication that recognizeswhen opclass equality is also equivalence - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Building infrastructure for B-Tree deduplication that recognizeswhen opclass equality is also equivalence
Date
Msg-id CAH2-Wz=3pmbGrtBf_oFc0wS=aDxdsTCink7yKbRSw0gu92P5Vw@mail.gmail.com
Whole thread Raw
In response to Re: Building infrastructure for B-Tree deduplication that recognizeswhen opclass equality is also equivalence  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Building infrastructure for B-Tree deduplication that recognizeswhen opclass equality is also equivalence  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Jan 2, 2020 at 6:42 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Dec 30, 2019 at 6:58 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > I propose that we adopt the following definition: For an operator
> > class to be safe, its equality operator has to always agree with
> > datum_image_eq() (i.e. two datums must be bitwise equal after
> > detoasting).
>
> I suggested using datumIsEqual() as the canonical definition. (I
> wonder why datum_image_eq() does not reuse that function?)

The difference between datum_image_eq() and datumIsEqual() is that
only the former will consider two datums equal when they happen to
have different TOAST input states -- we need that here. datumIsEqual()
avoids doing this because sometimes it needs to work for callers
operating within an aborted transaction.  datum_image_eq() was
originally used for the "*=, *<>, *<, *<=, *>, and *>=" rowtype B-Tree
operator class needed by REFRESH MATERIALIZED VIEW CONCURRENTLY.
(Actually, that's not quite true, since datum_image_eq() is a spin-off
of the rowtype code that was added much more recently to fix a bug in
foreign keys.)

The B-Tree code and amcheck need to be tolerant of inconsistent TOAST
input states. This isn't particularly likely to happen, but it would
be hard to revoke the general assumption that that's okay now. Also,
it's not that hard to deal with it directly. For example, we're not
reliant on equal index tuples all being the same size in the
deduplication patch.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: Greatest Common Divisor
Next
From: rmrodriguez@carto.com
Date:
Subject: Re: avoid some calls to memset with array initializer