> First of all, I don't think we need to declare ParallelVacuumProgress
> in vacuum.c since it's set and used only in vacuumparallel.c. But I
> don't even think it's a good idea to declare it in vacuumparallel.c as
> a static variable. The primary reason is that it adds things we need
> to care about. For example, what if we raise an error during index
> vacuum? The transaction aborts but ParallelVacuumProgress still refers
> to something old. Suppose further that the next parallel vacuum
> doesn't launch any workers, the leader process would still end up
> accessing the old value pointed by ParallelVacuumProgress, which
> causes a SEGV. So we need to reset it anyway at the beginning of the
> parallel vacuum. It's easy to fix at this time but once the parallel
> vacuum code gets more complex, it could forget to care about it.
> IMO VacuumSharedCostBalance and VacuumActiveNWorkers have a different
> story. They are set in vacuumparallel.c and are used in vacuum.c for
> vacuum delay. If they weren't global variables, we would need to pass
> them to every function that could eventually call the vacuum delay
> function. So it makes sense to me to have them as global variables.On
> the other hand, for ParallelVacuumProgress, it's a common pattern that
> ambulkdelete(), amvacuumcleanup() or a common index scan routine like
> btvacuumscan() checks the progress. I don't think index AM needs to
> pass the value down to many of its functions. So it makes sense to me
> to pass it via IndexVacuumInfo.
Thanks for the detailed explanation and especially clearing up
my understanding of VacuumSharedCostBalance and VacuumActiveNWorker.
I do now think that passing ParallelVacuumState in IndexVacuumInfo is
a more optimal choice.
Attached version addresses the above and the previous comments.
Thanks
Sami Imseih
Amazon Web Services (AWS)