Re: GiST VACUUM - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: GiST VACUUM
Date
Msg-id d7e89e1b-efd4-1464-f03d-4ba4ea9e1835@iki.fi
Whole thread Raw
In response to Re: GiST VACUUM  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: GiST VACUUM  (Heikki Linnakangas <hlinnaka@iki.fi>)
Re: GiST VACUUM  (Andrey Borodin <x4mmm@yandex-team.ru>)
List pgsql-hackers
On 28/10/2018 19:32, Andrey Borodin wrote:
> Hi everyone!
> 
>> 2 окт. 2018 г., в 6:14, Michael Paquier <michael@paquier.xyz> написал(а):
>> Andrey, your latest patch does not apply.  I am moving this to the next
>> CF, waiting for your input.
> 
> I'm doing preps for CF.
> Here's rebased version.

Thanks, I had another look at these.

In patch #1, to do the vacuum scan in physical order:

* The starting NSN was not acquired correctly for unlogged and temp 
relations. They don't use WAL, so their NSN values are based on the 
'unloggedLSN' counter, rather than current WAL insert pointer. So must 
use gistGetFakeLSN() rather than GetInsertRecPtr() for them. Fixed that.

* Adjusted comments a little bit, mostly by copy-pasting the 
better-worded comments from the corresponding nbtree code, and ran pgindent.

I think this is ready to be committed, except that I didn't do any 
testing. We discussed the need for testing earlier. Did you write some 
test scripts for this, or how have you been testing?


Patch #2:

* Bitmapset stores 32 bit signed integers, but BlockNumber is unsigned. 
So this will fail with an index larger than 2^31 blocks. That's perhaps 
academical, I don't think anyone will try to create a 16 TB GiST index 
any time soon. But it feels wrong to introduce an arbitrary limitation 
like that.

* I was surprised that the bms_make_empty() function doesn't set the 
'nwords' field to match the allocated size. Perhaps that was 
intentional, so that you don't need to scan the empty region at the end, 
when you scan through all matching bits? Still, seems surprising, and 
needs a comment at least.

* We're now scanning all internal pages in the 2nd phase. Not only those 
internal pages that contained downlinks to empty leaf pages. That's 
probably OK, but the comments need updating on that.

* In general, comments need to be fixed and added in several places. For 
example, there's no clear explanation on what the "delete XID", stored 
in pd_prune_xid, means. (I know what it is, because I'm familiar with 
the same mechanism in B-tree, but it's not clear from the code itself.)

These can be fixed, they're not show-stoppers, but patch #2 isn't quite 
ready yet.

- Heikki


pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: explain plans with information about (modified) gucs
Next
From: Heikki Linnakangas
Date:
Subject: Re: GiST VACUUM