parallel vacuum comments - Mailing list pgsql-hackers

From Andres Freund
Subject parallel vacuum comments
Date
Msg-id 20211030212101.ae3qcouatwmy7tbr@alap3.anarazel.de
Whole thread Raw
Responses Re: parallel vacuum comments
Re: parallel vacuum comments
List pgsql-hackers
Hi,

Due to bug #17245: [1] I spent a considerably amount of time looking at vacuum
related code. And I found a few things that I think could stand improvement:

- There's pretty much no tests. This is way way too complicated feature for
  that. If there had been tests for the obvious edge case of some indexes
  being too small to be handled in parallel, but others needing parallelism,
  the mistake leading to #17245 would have been caught during development.


- There should be error check verifying that all indexes have actually been
  vacuumed. It's way too easy to have bugs leading to index vacuuming being
  skipped.


- The state machine is complicated. It's very unobvious that an index needs to
  be processed serially by the leader if shared_indstats == NULL.


- I'm very confused by the existance of LVShared->bitmap. Why is it worth
  saving 7 bits per index for something like this (compared to a simple
  array of bools)? Nor does the naming explain what it's for.

  The presence of the bitmap requires stuff like SizeOfLVShared(), which
  accounts for some of the bitmap size, but not all?

  But even though we have this space optimized bitmap thing, we actually need
  more memory allocated for each index, making this whole thing pointless.


- Imo it's pretty confusing to have functions like
  lazy_parallel_vacuum_indexes() (in 13, renamed in 14) that "Perform index
  vacuum or index cleanup with parallel workers.", based on
  lps->lvshared->for_cleanup.


- I don't like some of the new names introduced in 14. E.g.
  "do_parallel_processing" is way too generic.


- On a higher level, a lot of this actually doesn't seem to belong into
  vacuumlazy.c, but should be somewhere more generic. Pretty much none of this
  code is heap specific. And vacuumlazy.c is large enough without the parallel
  code.


Greetings,

Andres Freund

[1] https://postgr.es/m/17245-ddf06aaf85735f36%40postgresql.org



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Add additional information to src/test/ssl/README
Next
From: Alvaro Herrera
Date:
Subject: Re: Predefined role pg_maintenance for VACUUM, ANALYZE, CHECKPOINT.