On Tue, Jul 9, 2019 at 11:27 AM John Lumby <johnlumby@hotmail.com> wrote:
> And the point of the REINDEX at that point (below) is to remove dead tuple keys-tids
> and reorganize those split pages back into physical order without losing the freespace.
VACUUM already removes the tuples, accounting for all overhead.
You are right that it would be possible for us to "defragment" the
pages, so that they'd be in sequential order on disk from the point of
view of a whole index scan -- this is what the "leaf_fragmentation"
statistic from pgstatindex() reports on. We could in principle come up
with a way of moving pages around, which would have some modest
benefit for certain types of queries (it wouldn't improve the
heap/index correlation, though, which is far more important). That
would either necessitate that the command acquire a disruptive lock on
the index (i.e. no writes, just like regular REINDEX), or that we
drastically rearchitect the B-Tree code to make it support this.
Neither of which seem particularly appealing.
I believe that this is a lot more important in systems that generally
use clustered indexes, such as MS SQL Server. This kind of
"fragmentation" isn't usually much of a problem when using Postgres.
--
Peter Geoghegan