Re: [PATCH] nbtree: Do not show debugmessage if deduplication is disabled - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [PATCH] nbtree: Do not show debugmessage if deduplication is disabled
Date
Msg-id CAH2-Wz=FM=oAK9jOwb1Rz9CuPLQRD=fapAK7Cgq4XwjdSUPbQw@mail.gmail.com
In response to [PATCH] nbtree: Do not show debugmessage if deduplication is disabled  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On Wed, Dec 16, 2020 at 5:28 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> Even though the message literally says whether the index "can safely" or
> "cannot" use deduplication, the function specifically avoids the debug message
> for system columns, so I think it also makes sense to hide it when
> deduplication is turned off.

I disagree. The point of the message is to advertise whether
deduplication is possible in principle for indexes where support is
not precluded by a significant design issue that will almost certainly
not change in the future. The debug message should apply only to
indexes that have no INCLUDE (non-key) columns and that are not system
catalog indexes.
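
The distinction above can be sketched as a small standalone C model. This is
not the nbtree source: the IndexModel struct, its fields, the DedupReport enum,
and report_dedup_safety() are all hypothetical names chosen for illustration,
and the message wording only follows the "can safely"/"cannot" phrasing quoted
above. The sketch shows the message being skipped only for the two structurally
excluded cases, with the storage parameter playing no part in the decision.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified description of an index (not a PostgreSQL type). */
typedef struct IndexModel
{
	const char *name;
	int			nkeyatts;		/* number of key columns */
	int			natts;			/* key columns plus INCLUDE columns */
	bool		is_system_index;	/* system catalog index? */
	bool		equalimage;		/* all opclasses: equality implies image equality */
	bool		deduplicate_items;	/* storage parameter; deliberately not consulted */
} IndexModel;

typedef enum DedupReport
{
	REPORT_NONE,				/* structurally excluded: no message at all */
	REPORT_SAFE,				/* deduplication is possible in principle */
	REPORT_UNSAFE				/* some opclass cannot guarantee image equality */
} DedupReport;

/* Decide which message (if any) to show, and print it to stdout. */
static DedupReport
report_dedup_safety(const IndexModel *idx)
{
	/* Indexes with INCLUDE columns are excluded by design: stay silent. */
	if (idx->natts != idx->nkeyatts)
		return REPORT_NONE;

	/* System catalog indexes are likewise excluded: stay silent. */
	if (idx->is_system_index)
		return REPORT_NONE;

	/* Everything else is reported, whatever deduplicate_items says. */
	if (idx->equalimage)
	{
		printf("index \"%s\" can safely use deduplication\n", idx->name);
		return REPORT_SAFE;
	}

	printf("index \"%s\" cannot use deduplication\n", idx->name);
	return REPORT_UNSAFE;
}
```

In this model an index with deduplicate_items set to false but with suitable
opclasses still yields REPORT_SAFE, because the message describes what is
possible in principle, not what the user has asked for.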

In general, I think of the storage parameter as advisory. If it weren't
advisory, we'd have no way of rescinding support for deduplication
should an opclass somehow get the "equality implies image equality"
question wrong. We might also end up raising an error when the user
explicitly asks for deduplication that isn't possible -- which might
break somebody's pg_restore workflow.

Even when deduplication is both the safe and the desired behavior,
there is at least one case where it's applied selectively. We do this
in unique indexes, where deduplication can only help with version
churn duplicates and so we only try to deduplicate when that appears
to be a factor. By the same token, when the user disables
deduplication via the storage parameter (presumably due to the
performance trade-off somehow not seeming useful), they cannot expect
to get back to an on-disk representation without posting list tuples,
unless and until they REINDEX.
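
A minimal sketch of how these conditions might combine, again as a standalone C
model and not the actual nbtree code. The DedupContext struct, its fields, and
should_try_dedup() are hypothetical names; in particular,
version_churn_suspected stands in for whatever heuristic decides that version
churn "appears to be a factor".

```c
#include <stdbool.h>

/* Hypothetical inputs to the decision; none of these are PostgreSQL types. */
typedef struct DedupContext
{
	bool		safe;				/* deduplication is possible in principle */
	bool		deduplicate_items;	/* advisory storage parameter */
	bool		is_unique;			/* unique index? */
	bool		version_churn_suspected;	/* assumed heuristic result */
} DedupContext;

/* Should a deduplication pass be attempted on a page that is about to fill? */
static bool
should_try_dedup(const DedupContext *ctx)
{
	/*
	 * The parameter is advisory: asking for deduplication on an index where
	 * it is unsafe quietly yields "no" here, never an error.
	 */
	if (!ctx->safe || !ctx->deduplicate_items)
		return false;

	/* Unique indexes: only worthwhile when version churn seems to be a factor. */
	if (ctx->is_unique && !ctx->version_churn_suspected)
		return false;

	return true;
}
```

Note that this models only whether new deduplication passes are attempted.
Returning false does nothing to posting list tuples already on disk, which is
why a REINDEX is needed to get rid of them.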

-- 
Peter Geoghegan


