Re: Enabling B-Tree deduplication by default - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Enabling B-Tree deduplication by default
Msg-id CAH2-Wzm1u8HmCamGj2LmtvUudzai5qDJryTotu++JLLD9KVMRw@mail.gmail.com
In response to Re: Enabling B-Tree deduplication by default  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Enabling B-Tree deduplication by default
List pgsql-hackers
On Thu, Jan 30, 2020 at 11:40 AM Peter Geoghegan <pg@bowt.ie> wrote:
> I think that I should commit the patch without the GUC tentatively.
> Just have the storage parameter, so that everyone gets the
> optimization without asking for it. We can then review the decision to
> enable deduplication generally after the feature has been in the tree
> for several months.

This is how things work in the committed patch (commit 0d861bbb):
There is a B-Tree storage parameter named deduplicate_items, which is
enabled by default. In general, users will get deduplication unless
they opt out, including in unique indexes (though note that we're more
selective about triggering a deduplication pass in unique indexes
[1]).
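
For anyone who wants to see what opting out looks like, here is a
minimal sketch. The table and index names (orders, orders_customer_idx)
are hypothetical; only the deduplicate_items storage parameter comes
from the committed patch.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE orders (id bigint, customer_id bigint);

-- Deduplication is on by default, so this index gets it without asking.
CREATE INDEX orders_customer_idx ON orders (customer_id);

-- Opt out at creation time for one specific index.
CREATE INDEX orders_customer_nodedup_idx ON orders (customer_id)
    WITH (deduplicate_items = off);

-- Opt out on an existing index. This only stops future insertions from
-- triggering deduplication; existing posting list tuples stay as they are
-- until the index is rebuilt.
ALTER INDEX orders_customer_idx SET (deduplicate_items = off);
REINDEX INDEX orders_customer_idx;
```

The REINDEX is only needed if you also want to get rid of posting list
tuples that were created before the parameter was turned off.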

> There is no need to make a final decision about whether or not the
> optimization gets enabled before committing the patch.

It's now time to make a final decision on this. Does anyone have any
reason to believe that leaving deduplication enabled by default is the
wrong way to go?

Note that using deduplication isn't strictly better than not using
deduplication for all indexes in all workloads; that's why it's
possible to disable the optimization. This thread has lots of
information about the reasons why enabling deduplication by default
seems appropriate, all of which still apply. Note that there have been
no bug reports involving deduplication since it was committed on
February 26th, with the exception of some minor issues that I reported
and fixed.

The view of the RMT is that the feature should remain enabled by
default (i.e. no changes are required). Of course, I am a member of
the RMT this year, as well as one of the authors of the patch. I am
hardly an impartial voice here. I believe that this did not sway the
decision-making process of the RMT in this instance. If there are no
objections in the next week or so, then I'll close out the relevant
open item.

[1] https://www.postgresql.org/docs/devel/btree-implementation.html#BTREE-DEDUPLICATION
-- See "Tip"
-- 
Peter Geoghegan


