Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date
Msg-id CAH2-Wz=1FaPa2TroKYxpG7mqUJrssOfmNQOPW+vTVmG4vyXD7Q@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Responses Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
List pgsql-hackers
On Wed, Sep 25, 2019 at 8:05 AM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
> Attached is v18. In this version bt_dedup_one_page() is refactored so that:
> - no temp page is used, all updates are applied to the original page.
> - each posting tuple wal logged separately.
> This also allowed to simplify btree_xlog_dedup significantly.

This looks great! Even if it isn't faster than using a temp page
buffer, the flexibility seems like an important advantage. We can do
things like have the _bt_dedup_one_page() caller hint that
deduplication should start at a particular offset number. If that
doesn't work out by the time the end of the page is reached (whatever
"works out" may mean), then we can just start at the beginning of the
page, and work through the items we skipped over initially.

> > We still haven't added an "off" switch to deduplication, which seems
> > necessary. I suppose that this should look like GIN's "fastupdate"
> > storage parameter.

> Why is it necessary to save this information somewhere but rel->rd_options,
> while we can easily access this field from _bt_findinsertloc() and
> _bt_load().

Maybe, but we also need to access a flag that says it's safe to use
deduplication. Obviously deduplication is not safe for datatypes like
numeric and text with a nondeterministic collation. The "is
deduplication safe for this index?" mechanism will probably work by
doing several catalog lookups. This doesn't seem like something we
want to do very often, especially with a buffer lock held -- ideally
it will be somewhere that's convenient to access.

Do we want to do that separately, and have a storage parameter that
says "I would like to use deduplication in principle, if it's safe"?
Or, do we store both pieces of information together, and forbid
setting the storage parameter to on when it's known to be unsafe for
the underlying opclasses used by the index? I don't know.

I think that you can start working on this without knowing exactly how
we'll do those catalog lookups. What you come up with has to work with
that before the patch can be committed, though.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [PATCH][PROPOSAL] Add enum releation option type
Next
From: Laurenz Albe
Date:
Subject: Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)