Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Msg-id CAH2-Wz=_XUu4j=vqGLmU=Re=caPy49yG4kwvyaeQiWKug4djKw@mail.gmail.com
In response to Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Robert Haas <robertmhaas@gmail.com>)
Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Fri, Nov 8, 2019 at 10:35 AM Peter Geoghegan <pg@bowt.ie> wrote:
> There is more bitrot, so I attach v22.

The patch has stopped applying once again, so I attach v23.

One reason for the bitrot is that I pushed preparatory commits,
including today's "Make _bt_keep_natts_fast() use datum_image_eq()"
commit. Good to get that out of the way.

Other changes:

* Decided to go back to turning deduplication on by default with
non-unique indexes, and off by default with unique indexes.

Unique indexes regressed enough with INSERT-heavy workloads that I
was put off, despite my initial enthusiasm for enabling deduplication
everywhere.

* Disabled deduplication in system catalog indexes by deeming it
generally unsafe.

I realized that it would be impossible to provide a way to disable
deduplication in system catalog indexes if it was enabled at all. The
reason for this is simple: in general, it's not possible to set
storage parameters for system catalog indexes.

While I think that deduplication should work with system catalog
indexes on general principle, this is about an existing limitation.
Deduplication in catalog indexes can be revisited if and when somebody
figures out a way to make storage parameters work with system catalog
indexes.
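
The defaults described in the two items above can be summarized as a
small decision function. This is only an illustrative Python sketch,
not code from the patch: IndexInfo, deduplication_enabled, and the
user_setting argument (standing in for an explicit storage parameter
value) are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IndexInfo:
    """Hypothetical summary of the facts the default depends on."""
    is_unique: bool
    is_system_catalog: bool


def deduplication_enabled(index: IndexInfo,
                          user_setting: Optional[bool] = None) -> bool:
    """Return whether deduplication would be used for this index.

    user_setting models an explicit storage parameter value; None means
    the user did not set one, so the default applies.
    """
    # System catalog indexes cannot take storage parameters, so there
    # would be no way to opt out; deduplication is simply disabled.
    if index.is_system_catalog:
        return False
    # An explicit setting on a user index overrides the default.
    if user_setting is not None:
        return user_setting
    # Default: on for non-unique indexes, off for unique indexes.
    return not index.is_unique
```

For example, deduplication_enabled(IndexInfo(is_unique=True,
is_system_catalog=False)) returns False unless the caller passes
user_setting=True.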

* Basic user documentation -- this still needs work, but the basic
shape is now in place. I think that we should outline how the feature
works by describing the internals, including details of the data
structures. This provides guidance to users on when they should
disable or enable the feature.

The feature is discussed in the existing chapter on B-Tree internals.
This felt natural because it's similar to how GIN explains its
compression-related features -- the discussion of the storage
parameters in the CREATE INDEX page of the docs links to a description
of GIN internals from "66.4. Implementation [of GIN]".
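
To give a rough idea of the kind of data structure the docs would need
to describe, here is a minimal Python sketch of merging duplicate
index entries into one entry per key that carries a list of heap TIDs
(a "posting list"). It is a conceptual model only, under the
assumption that leaf items arrive sorted by key and then by TID; the
names TID and deduplicate are hypothetical, and the sketch ignores
page layout and any size limit on a single tuple.

```python
from itertools import groupby
from typing import List, Tuple

# Assumed model of a heap TID: (block number, offset number).
TID = Tuple[int, int]


def deduplicate(leaf_items: List[Tuple[str, TID]]) -> List[Tuple[str, List[TID]]]:
    """Merge runs of equal keys into (key, [tids]) entries.

    leaf_items must already be sorted by key, then by TID, which is
    what lets groupby() see every duplicate of a key as one run.
    """
    merged = []
    for key, run in groupby(leaf_items, key=lambda item: item[0]):
        # Keep the key once and collect the TIDs that shared it.
        merged.append((key, [tid for _, tid in run]))
    return merged


items = [("apple", (1, 1)), ("apple", (1, 2)), ("apple", (3, 7)),
         ("pear", (2, 4))]
posting_lists = deduplicate(items)
```

Here the three "apple" entries collapse into a single entry with three
TIDs, so the key is stored once rather than three times.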

* The nbtdedup.c "single value" strategy now accounts for the
contribution of the page high key when deciding how to deduplicate, so
that nbtsplitloc.c's "single value" strategy has a usable split point
that helps it to hit its target free space. Not a very important
detail, but it's nice to be consistent with the corresponding code
within nbtsplitloc.c.

* Worked through all remaining XXX/TODO/FIXME comments, except one:
the one that talks about the need for opclass infrastructure to deal
with cases like btree/numeric_ops, or text with a nondeterministic
collation.

The user docs now reference the BITWISE opclass stuff that we're
discussing over on the other thread. That's the only really notable
open item now IMV.
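
To illustrate why an opclass like numeric_ops is a problem here, the
Python sketch below uses decimal.Decimal as a stand-in for numeric. It
assumes the concern is values that the opclass treats as equal even
though their stored representations differ (for example, in display
scale), so that keeping only one copy of the key would lose
information. The function names are hypothetical and comparing str()
output is just a proxy for an image comparison.

```python
from decimal import Decimal


def equal_by_opclass(x: Decimal, y: Decimal) -> bool:
    """Equality as a comparison operator would see it."""
    return x == y


def equal_by_image(x: Decimal, y: Decimal) -> bool:
    """Proxy for comparing stored representations exactly."""
    return str(x) == str(y)


def safe_to_merge(x: Decimal, y: Decimal) -> bool:
    """Merging two keys into one is only lossless if they are identical."""
    return equal_by_opclass(x, y) and equal_by_image(x, y)


a = Decimal("1.0")
b = Decimal("1.00")

# a and b compare as equal, yet they are not interchangeable: storing
# only one of them would change what is returned for the other.
same_value = equal_by_opclass(a, b)
same_image = equal_by_image(a, b)
mergeable = safe_to_merge(a, b)
```

With these inputs same_value is true while same_image and mergeable
are false, which is the situation the opclass infrastructure would
have to detect.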

--
Peter Geoghegan
