Fwd: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. - Mailing list pgsql-hackers

From Rafia Sabih
Subject Fwd: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date
Msg-id CA+FpmFcRxkikQJQk7QOSEJdTM_KZ0osGP8907QiuKFCfHwAgBw@mail.gmail.com
Whole thread Raw
In response to [PROPOSAL] Effective storage of duplicates in B-tree index.  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Responses Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Sun, 7 Jul 2019 at 01:08, Peter Geoghegan <pg@bowt.ie> wrote:

> * Maybe we could do compression with unique indexes when inserting
> values with NULLs? Note that we now treat an insertion of a tuple with
+1

I tried this patch and found the improvements impressive. However,
when I tried with multi-column indexes it wasn't giving any
improvement, is it the known limitation of the patch?
I am surprised to find that such a patch is on radar since quite some
years now and not yet committed.

Going through the patch, here are a few comments from me,

 /* Add the new item into the page */
+ offnum = OffsetNumberNext(offnum);
+
+ elog(DEBUG4, "insert_itupprev_to_page. compressState->ntuples %d
IndexTupleSize %zu free %zu",
+ compressState->ntuples, IndexTupleSize(to_insert), PageGetFreeSpace(page));
+
and other such DEBUG4 statements are meant to be removed, right...?
Just because I didn't find any other such statements in this API and
there are many in this patch, so not sure how much are they needed.

/*
* If we have only 10 uncompressed items on the full page, it probably
* won't worth to compress them.
*/
if (maxoff - n_posting_on_page < 10)
return;

Is this a magic number...?

/*
* We do not expect to meet any DEAD items, since this function is
* called right after _bt_vacuum_one_page(). If for some reason we
* found dead item, don't compress it, to allow upcoming microvacuum
* or vacuum clean it up.
*/
if (ItemIdIsDead(itemId))
continue;

This makes me wonder about those 'some' reasons.

Caller is responsible for checking BTreeTupleIsPosting to ensure that
+ * he will get what he expects

This can be re-framed to make the caller more gender neutral.

Other than that, I am curious about the plans for its backward compatibility.

--
Regards,
Rafia Sabih



pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: SQL/JSON path issues/questions
Next
From: Tomas Vondra
Date:
Subject: Re: accounting for memory used for BufFile during hash joins