Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. - Mailing list pgsql-hackers

From Anastasia Lubennikova
Subject Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date
Msg-id 4c772f2c-36bc-d5df-c643-aa655493b098@postgrespro.ru
In response to Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
List pgsql-hackers
11.07.2019 21:19, Peter Geoghegan wrote:
> On Thu, Jul 11, 2019 at 8:34 AM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
Hi,
Peter, Rafia, thanks for the review. A new version is attached.

>> + elog(DEBUG4, "insert_itupprev_to_page. compressState->ntuples %d IndexTupleSize %zu free %zu",
>> + compressState->ntuples, IndexTupleSize(to_insert), PageGetFreeSpace(page));
>> +
>> and other such DEBUG4 statements are meant to be removed, right...?
> I hope so too.
Yes, these messages are only for debugging.
I haven't deleted them since this is still work in progress,
and it's handy to be able to print inner details.
Maybe I should also write a patch for pageinspect.

>> /*
>> * If we have only 10 uncompressed items on the full page, it probably
>> * won't worth to compress them.
>> */
>> if (maxoff - n_posting_on_page < 10)
>> return;
>>
>> Is this a magic number...?
> I think that this should be a constant or something.
Fixed. It is now a constant in nbtree.h. I'm not 100% sure about the value;
once the code stabilizes, we can benchmark it and find the optimal value.
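For illustration, a minimal sketch of what that constant and check could look
like (the name BT_COMPRESS_THRESHOLD and its exact value here are my
assumptions, not necessarily what the patch uses):

    /* nbtree.h: threshold replacing the magic number */
    #define BT_COMPRESS_THRESHOLD	10

    /* compression path: skip pages with too few uncompressed items */
    if (maxoff - n_posting_on_page < BT_COMPRESS_THRESHOLD)
        return;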

>> /*
>> * We do not expect to meet any DEAD items, since this function is
>> * called right after _bt_vacuum_one_page(). If for some reason we
>> * found dead item, don't compress it, to allow upcoming microvacuum
>> * or vacuum clean it up.
>> */
>> if (ItemIdIsDead(itemId))
>> continue;
>>
>> This makes me wonder about those 'some' reasons.
> I think that this is just defensive. Note that _bt_vacuum_one_page()
> is prepared to find no dead items, even when the BTP_HAS_GARBAGE flag
> is set for the page.
You are right; currently it is impossible to encounter dead items in this function.
That could change in the future if, for example, _bt_vacuum_one_page()
starts behaving lazily.
So this is just a sanity check. Maybe it's worth turning it into an Assert.
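If we go that way, the skip in the quoted hunk could simply become
(a sketch only, reusing the variable names from the quoted code):

    /* _bt_vacuum_one_page() ran just before us, so no dead items should remain */
    Assert(!ItemIdIsDead(itemId));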

>
>> Caller is responsible for checking BTreeTupleIsPosting to ensure that
>> + * he will get what he expects
>>
>> This can be re-framed to make the caller more gender neutral.
> Agreed. I also don't like anthropomorphizing code like this.

Fixed.

>> Other than that, I am curious about the plans for its backward compatibility.
> Me too. There is something about a new version 5 in comments in
> nbtree.h, but the version number isn't changed. I think that we may be
> able to get away with not increasing the B-Tree version from 4 to 5,
> actually. Deduplication is performed lazily when it looks like we
> might have to split the page, so there isn't any expectation that
> tuples will either be compressed or uncompressed in any context.
The current implementation is backward compatible.
To distinguish posting tuples, it adds only one new flag combination,
which was never possible before. The comment about version 5 has been deleted.
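To sketch what such a "new flag combination" can look like in terms of the
existing on-disk format: an alternative-TID tuple (INDEX_ALT_TID_MASK set in
t_info, as pivot tuples already do) whose t_tid offset carries a status bit
that plain tuples never set. The bit name and value below are my assumptions
for illustration, not necessarily the patch's definitions:

    /* assumed status bit stored in the otherwise-unused t_tid offset */
    #define BT_IS_POSTING	0x2000

    static inline bool
    BTreeTupleIsPosting(IndexTuple itup)
    {
        if ((itup->t_info & INDEX_ALT_TID_MASK) == 0)
            return false;
        return (ItemPointerGetOffsetNumberNoCheck(&itup->t_tid) & BT_IS_POSTING) != 0;
    }

Since no pre-patch tuple ever has this combination, old indexes remain readable
without a version bump.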

I also added a patch for amcheck.

There is one major issue left: preserving TID order in posting lists.
As a first step, I added a sort to the BTreeFormPostingTuple() function.
It turned out not to be very helpful, because we cannot check this
invariant lazily.
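The sort itself is simple; roughly (a sketch with assumed array/variable
names, using the existing ItemPointerCompare()):

    static int
    bt_itemptr_cmp(const void *a, const void *b)
    {
        return ItemPointerCompare((ItemPointer) a, (ItemPointer) b);
    }

    /* inside BTreeFormPostingTuple(), before packing the TID array: */
    qsort(ipd, nipd, sizeof(ItemPointerData), bt_itemptr_cmp);

The hard part is not sorting at tuple-formation time but keeping the order
when new TIDs arrive for an existing posting list, which is what the
insertion changes below are about.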

I am now working on patching _bt_binsrch_insert() and _bt_insertonpg() to
implement insertion into the middle of a posting list. I'll send a new version
this week.
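To make the goal concrete, here is a toy sketch of what "insertion into the
middle of a posting list" means at the TID-array level. The names are mine,
and the real code additionally has to deal with page space, WAL logging and
possible posting-list splits:

    /* caller guarantees room for ntids + 1 entries in tids[] */
    static void
    posting_list_add_tid(ItemPointer tids, int ntids, ItemPointerData newtid)
    {
        int		low = 0,
    			high = ntids;

        /* binary search for the slot that keeps heap TID order */
        while (low < high)
        {
            int		mid = (low + high) / 2;

            if (ItemPointerCompare(&tids[mid], &newtid) < 0)
                low = mid + 1;
            else
                high = mid;
        }

        /* shift the tail and place the new TID */
        memmove(&tids[low + 1], &tids[low],
                (ntids - low) * sizeof(ItemPointerData));
        tids[low] = newtid;
    }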

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


