Re: Lock-free compaction. Why not? - Mailing list pgsql-hackers

From Ahmed Yarub Hani Al Nuaimi
Subject Re: Lock-free compaction. Why not?
Date
Msg-id CAF239vpZ3zn0zZqBEiP-hBXKorxumCbFifKjc9E-A==a7b6TXA@mail.gmail.com
In response to Re: Lock-free compaction. Why not?  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Lock-free compaction. Why not?
List pgsql-hackers
That clearly explains the problem. But this got me thinking: what if we do both index and heap optimization at the same time?
Meaning that whenever a heap tuple is moved to compact/defragment heap pages, the index would be updated right after it: a new index tuple is created at the right place in the index data files (which have already had their dead tuples removed and been internally defragmented, i.e. vacuumed), and the old index tuple is deleted. Deleting the old index tuple could be done immediately after moving the heap tuple. I think this could both solve the bloat problem and keep both the table heap and the indexes in optimum shape, with all of it done lazily so that these operations only run when the server is not overwhelmed (or just using whatever logic our lazy vacuuming uses). What do you think?

On Sat, Jul 20, 2024 at 10:52 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Sun, 21 Jul 2024 at 04:00, Ahmed Yarub Hani Al Nuaimi
<ahmedyarubhani@gmail.com> wrote:
> 2- Can you point me to a resource explaining why this might lead to index bloating?

No resource links, but if you move a tuple to another page then you
must also adjust the index.  If you have no exclusive lock on the
table, then you must assume older transactions still need the old
tuple version, so you need to create another index entry rather than
re-pointing the existing index entry's ctid to the new tuple version.
It's not hard to imagine that would cause the index to become larger
if you had to move some decent portion of the tuples to other pages.
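
As a rough illustration of the point above, here is a toy model in Python. It is only a sketch under stated assumptions: the "index" is a plain list of (key, ctid) pairs, and ToyIndex, compact and build_index are hypothetical names, not PostgreSQL internals. It compares re-pointing an existing entry (only safe under an exclusive lock) with adding a second entry (needed when older transactions may still follow the old one).

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """Toy index: a flat list of (key, ctid) entries. Not PostgreSQL's real structure."""
    entries: list = field(default_factory=list)

    def insert(self, key, ctid):
        self.entries.append((key, ctid))

    def repoint(self, key, old_ctid, new_ctid):
        # Overwrite the existing entry in place; the entry count does not change.
        i = self.entries.index((key, old_ctid))
        self.entries[i] = (key, new_ctid)


def compact(index, moves, exclusive_lock):
    """Apply tuple moves given as (key, old_ctid, new_ctid); return the entry count.

    With an exclusive lock nobody else can need the old tuple version, so the
    existing entry is re-pointed. Without it the old entry has to stay for
    older transactions, so a second entry is added instead.
    """
    for key, old_ctid, new_ctid in moves:
        if exclusive_lock:
            index.repoint(key, old_ctid, new_ctid)
        else:
            index.insert(key, new_ctid)
    return len(index.entries)


def build_index(n):
    # Assumed layout for the toy: 10 tuples per page, offsets starting at 1.
    idx = ToyIndex()
    for k in range(n):
        idx.insert(k, (k // 10, k % 10 + 1))
    return idx


# Move the last 30 of 100 tuples into made-up free slots on the first three pages.
moves = [(k, (k // 10, k % 10 + 1), (k % 3, 11 + k // 3)) for k in range(70, 100)]

locked = compact(build_index(100), moves, exclusive_lock=True)
unlocked = compact(build_index(100), moves, exclusive_lock=False)
print(locked, unlocked)  # prints "100 130"
```

In this toy, moving 30% of the tuples without the lock leaves the index 30% larger until the old entries can be cleaned up, which is the growth described above.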

FWIW, I think it would be good if we had some easier way to compact
tables without blocking concurrent users.  My primary interest in TID
Range Scans was to allow easier identification of tuples near the end
of the heap that could be manually UPDATEd after a vacuum to allow the
heap to be shrunk during the next vacuum.
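
To make the manual approach a bit more concrete, here is a small Python sketch that only builds the SQL text for such a TID range predicate. The helper names, the table name my_table, the column id and the block number 1000 are all hypothetical, and identifiers are interpolated without quoting, so this assumes trusted input. Whether the UPDATE actually places the new tuple version on an earlier page is decided by PostgreSQL's free-space handling and is not guaranteed by this statement.

```python
def tail_tid_predicate(first_block: int) -> str:
    """Predicate matching every tuple stored on block first_block or later."""
    if first_block < 0:
        raise ValueError("block number must be non-negative")
    return f"ctid >= '({first_block},0)'::tid"


def select_tail_sql(table: str, first_block: int) -> str:
    """Query listing the ctids of tuples near the end of the heap."""
    return f"SELECT ctid FROM {table} WHERE {tail_tid_predicate(first_block)}"


def touch_tail_sql(table: str, column: str, first_block: int) -> str:
    """No-op UPDATE that creates new versions of the tuples in the tail blocks."""
    return (
        f"UPDATE {table} SET {column} = {column} "
        f"WHERE {tail_tid_predicate(first_block)}"
    )


print(select_tail_sql("my_table", 1000))
print(touch_tail_sql("my_table", "id", 1000))
```

The intended sequence would be roughly: vacuum, run the UPDATE against the tail blocks, then vacuum again so that now-empty pages at the end of the heap can be truncated.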

David
