Andres Freund <andres@anarazel.de> wrote:
> On 2021-04-21 11:21:31 -0400, Robert Haas wrote:
> > This scheme adds a lot of complexity, which is a concern, but it seems
> > to me that it might have several benefits. One is concurrency. You
> > could have one process gathering dead TIDs and adding them to the
> > dead-TID fork while another process is vacuuming previously-gathered
> > TIDs from some index.
>
> I think it might even open the door to using multiple processes
> gathering dead TIDs for the same relation.

I think the possible concurrency improvements are themselves a valid reason to
do the decoupling. Or rather, it's hard to imagine how the current
implementation of VACUUM could involve parallel workers in gathering the dead
heap TIDs efficiently. Currently, a single backend gathers the heap TIDs and
can then launch several parallel workers to remove those TIDs from the
indexes. If parallel workers gathered the heap TIDs instead, then (without the
decoupling) the parallel index processing would be a problem, because a
parallel worker cannot launch other parallel workers.

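
To make the argument a bit more concrete, here is a toy model in Python. It is
only a sketch of the idea, not PostgreSQL code: DeadTidStore, gather_dead_tids
and Index are hypothetical names, the "heap" is a list of lists of booleans,
and threads stand in for worker processes. The point it tries to show is that
once the dead TIDs live in a shared append-only store, several gatherers can
fill it for the same relation, and each index can consume it at its own pace
without any gatherer having to launch the index processing.

```python
# Toy model only: none of these names exist in PostgreSQL.
import threading
from concurrent.futures import ThreadPoolExecutor


class DeadTidStore:
    """Hypothetical stand-in for the proposed dead-TID fork: an append-only log."""

    def __init__(self):
        self._tids = []
        self._lock = threading.Lock()

    def append(self, tids):
        with self._lock:
            self._tids.extend(tids)

    def read_from(self, offset):
        """Return the TIDs appended since `offset`, plus the new offset."""
        with self._lock:
            chunk = self._tids[offset:]
            return chunk, offset + len(chunk)


def gather_dead_tids(heap, block_range, store):
    """One gatherer scans its own block range and records the dead TIDs."""
    for block in block_range:
        dead = [(block, off) for off, is_dead in enumerate(heap[block]) if is_dead]
        store.append(dead)


class Index:
    """Each index keeps its own cursor, so indexes can progress independently."""

    def __init__(self, entries):
        self.entries = set(entries)
        self.cursor = 0

    def vacuum(self, store):
        chunk, self.cursor = store.read_from(self.cursor)
        self.entries.difference_update(chunk)


def run_demo():
    # heap[block][offset] is True when the tuple at that TID is dead.
    heap = [[False, True], [True, True], [False, False], [True, False]]
    all_tids = [(b, o) for b, page in enumerate(heap) for o in range(len(page))]
    live = {(b, o) for (b, o) in all_tids if not heap[b][o]}
    store = DeadTidStore()
    indexes = [Index(all_tids), Index(all_tids)]

    # Two gatherers work on disjoint block ranges of the same relation.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(gather_dead_tids, heap, r, store)
                   for r in (range(0, 2), range(2, 4))]
        for f in futures:
            f.result()

    # Index vacuuming is a separate step; no gatherer had to launch it.
    indexes[0].vacuum(store)
    indexes[0].vacuum(store)  # a second pass finds nothing new
    indexes[1].vacuum(store)
    return live, [ix.entries for ix in indexes]
```

Calling run_demo() returns the set of live TIDs and the entries left in each
index; in this model both indexes end up holding exactly the live TIDs, even
though they were vacuumed separately and a different number of times.
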
> > In fact, every index could be getting vacuumed at the same time, and
> > different indexes could be removing different TID ranges.
>
> We kind of have this feature right now, due to parallel vacuum...
--
Antonin Houska
Web: https://www.cybertec-postgresql.com