Hi,
Due to bug #17245: [1] I spent a considerably amount of time looking at vacuum
related code. And I found a few things that I think could stand improvement:
- There's pretty much no tests. This is way way too complicated feature for
that. If there had been tests for the obvious edge case of some indexes
being too small to be handled in parallel, but others needing parallelism,
the mistake leading to #17245 would have been caught during development.
- There should be error check verifying that all indexes have actually been
vacuumed. It's way too easy to have bugs leading to index vacuuming being
skipped.
- The state machine is complicated. It's very unobvious that an index needs to
be processed serially by the leader if shared_indstats == NULL.
- I'm very confused by the existance of LVShared->bitmap. Why is it worth
saving 7 bits per index for something like this (compared to a simple
array of bools)? Nor does the naming explain what it's for.
The presence of the bitmap requires stuff like SizeOfLVShared(), which
accounts for some of the bitmap size, but not all?
But even though we have this space optimized bitmap thing, we actually need
more memory allocated for each index, making this whole thing pointless.
- Imo it's pretty confusing to have functions like
lazy_parallel_vacuum_indexes() (in 13, renamed in 14) that "Perform index
vacuum or index cleanup with parallel workers.", based on
lps->lvshared->for_cleanup.
- I don't like some of the new names introduced in 14. E.g.
"do_parallel_processing" is way too generic.
- On a higher level, a lot of this actually doesn't seem to belong into
vacuumlazy.c, but should be somewhere more generic. Pretty much none of this
code is heap specific. And vacuumlazy.c is large enough without the parallel
code.
Greetings,
Andres Freund
[1] https://postgr.es/m/17245-ddf06aaf85735f36%40postgresql.org