Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date
Msg-id CAH2-WzkkCa4ES9BTt4VCe1RygOqf8ZZppvnSg_URaW7EFa1KDQ@mail.gmail.com
Whole thread Raw
In response to Fwd: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Rafia Sabih <rafia.pghackers@gmail.com>)
Responses Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
List pgsql-hackers
On Thu, Jul 11, 2019 at 8:34 AM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
> I tried this patch and found the improvements impressive. However,
> when I tried with multi-column indexes it wasn't giving any
> improvement, is it the known limitation of the patch?

It'll only deduplicate full duplicates. It works with multi-column
indexes, provided the entire set of values in duplicated -- not just a
prefix. Prefix compression is possible, but it's more complicated. It
seems to generally require the DBA to specify a prefix length,
expressed as a number of prefix columns.

> I am surprised to find that such a patch is on radar since quite some
> years now and not yet committed.

The v12 work on nbtree (making heap TID a tiebreaker column) seems to
have made the general approach a lot more effective. Compression is
performed lazily, not eagerly, which seems to work a lot better.

> + elog(DEBUG4, "insert_itupprev_to_page. compressState->ntuples %d
> IndexTupleSize %zu free %zu",
> + compressState->ntuples, IndexTupleSize(to_insert), PageGetFreeSpace(page));
> +
> and other such DEBUG4 statements are meant to be removed, right...?

I hope so too.

> /*
> * If we have only 10 uncompressed items on the full page, it probably
> * won't worth to compress them.
> */
> if (maxoff - n_posting_on_page < 10)
> return;
>
> Is this a magic number...?

I think that this should be a constant or something.

> /*
> * We do not expect to meet any DEAD items, since this function is
> * called right after _bt_vacuum_one_page(). If for some reason we
> * found dead item, don't compress it, to allow upcoming microvacuum
> * or vacuum clean it up.
> */
> if (ItemIdIsDead(itemId))
> continue;
>
> This makes me wonder about those 'some' reasons.

I think that this is just defensive. Note that _bt_vacuum_one_page()
is prepared to find no dead items, even when the BTP_HAS_GARBAGE flag
is set for the page.

> Caller is responsible for checking BTreeTupleIsPosting to ensure that
> + * he will get what he expects
>
> This can be re-framed to make the caller more gender neutral.

Agreed. I also don't like anthropomorphizing code like this.

> Other than that, I am curious about the plans for its backward compatibility.

Me too. There is something about a new version 5 in comments in
nbtree.h, but the version number isn't changed. I think that we may be
able to get away with not increasing the B-Tree version from 4 to 5,
actually. Deduplication is performed lazily when it looks like we
might have to split the page, so there isn't any expectation that
tuples will either be compressed or uncompressed in any context.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Next
From: Dave Cramer
Date:
Subject: Re: let's make the list of reportable GUCs configurable (was Re: Add%r substitution for psql prompts to show recovery status)