Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date
Msg-id CAH2-WzmqHsBW6t3TwDRygbzgMJFtukNwNcX9B+w_OrYXbftaPQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Bruce Momjian <bruce@momjian.us>)
Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Tue, Dec 3, 2019 at 12:13 PM Peter Geoghegan <pg@bowt.ie> wrote:
> The new criteria/heuristic for unique indexes is very simple: If a
> unique index has an existing item that is a duplicate on the incoming
> item at the point that we might have to split the page, then apply
> deduplication. Otherwise (when the incoming item has no duplicates),
> don't apply deduplication at all -- just accept that we'll have to
> split the page. We already cache the bounds of our initial binary
> search in insert state, so we can reuse that information within
> _bt_findinsertloc() when considering deduplication in unique indexes.

Attached is v26, which adds this new criteria/heuristic for unique
indexes. We now seem to consistently get good results with unique
indexes.

Other changes:

* A commit message is now included for the main patch/commit.

* The btree_deduplication GUC is now a boolean, since it is no longer
up to the user to indicate when deduplication is appropriate in unique
indexes (the new heuristic does that instead). The GUC now only
affects non-unique indexes.

* Simplified the user docs. They now only mention deduplication of
unique indexes in passing, in line with the general idea that
deduplication in unique indexes is an internal optimization.

* Fixed bug that made backwards scans that touch posting lists fail to
set LP_DEAD bits when that was possible (i.e. the kill_prior_tuple
optimization wasn't always applied there with posting lists, for no
good reason). Also documented the assumptions made by the new code in
_bt_readpage()/_bt_killitems() -- if that was clearer in the first
place, then the LP_DEAD/kill_prior_tuple bug might never have
happened.

* Fixed some memory leaks in nbtree VACUUM.

Still waiting for some review of the first patch, to get it out of the
way. Anastasia?

-- 
Peter Geoghegan

Attachment

pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: shared tempfile was not removed on statement_timeout(unreproducible)
Next
From: Amit Kapila
Date:
Subject: Re: Wrong assert in TransactionGroupUpdateXidStatus