Unsplitting btree index leaf pages - Mailing list pgsql-hackers

From Simon Riggs
Subject Unsplitting btree index leaf pages
Date
Msg-id 1135209627.2964.351.camel@localhost.localdomain
Whole thread Raw
Responses Re: Unsplitting btree index leaf pages
List pgsql-hackers
When we discussed online REINDEX recently we focused on the REINDEX
command itself rather than look at alternative approaches.

One reason to REINDEX is because of index page splits getting things out
of sequence and generally bloating the index.

When we VACUUM, each index is scanned in logical order.

While we scan, if we found two adjacent pages, both of which have less
than (say) 40% rows, we could re-join or "unsplit" those pages together.
The index blocks are fully locked during the read anyway and there is no
MVCC problem with moving index rows between blocks. All we have to do is
to lock both blocks, having locked them in the correct order.

The rows would always be moved to the lowest physical block id, so that
data would naturally migrate towards the start of the index file. Blocks
would then be marked half-dead just as if they had just had their last
index row removed by the vacuum.

We could start the scan by locking block 1 AND block2, then scan forward
always holding 2 locks as we go. That way we would not need to unlock
and relock the blocks to lock two blocks. The concurrency loss would not
be that great, but we would gain the ability to unsplit the two blocks
into one.

If we do this, we would could possibly avoid the need to full REINDEX
entirely.

If this method checks out we could do one of:
- make VACUUM do this always
- add an option for REINDEX: CLEAN/COMPRESS/VACUUM etc to do this upon
command only

Best Regards, Simon Riggs



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: problem with nasty latin2 sorting
Next
From: Tom Lane
Date:
Subject: Re: Unsplitting btree index leaf pages