Re: Add index scan progress to pg_stat_progress_vacuum - Mailing list pgsql-hackers

From Imseih (AWS), Sami
Subject Re: Add index scan progress to pg_stat_progress_vacuum
Date
Msg-id 20F7ED07-4D32-4231-A81C-432EB111F0B2@amazon.com
Whole thread Raw
In response to Re: Add index scan progress to pg_stat_progress_vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Add index scan progress to pg_stat_progress_vacuum  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
>    First of all, I don't think we need to declare ParallelVacuumProgress
>    in vacuum.c since it's set and used only in vacuumparallel.c. But I
>    don't even think it's a good idea to declare it in vacuumparallel.c as
>    a static variable. The primary reason is that it adds things we need
>   to care about. For example, what if we raise an error during index
>    vacuum? The transaction aborts but ParallelVacuumProgress still refers
>    to something old. Suppose further that the next parallel vacuum
>    doesn't launch any workers, the leader process would still end up
>    accessing the old value pointed by ParallelVacuumProgress, which
>    causes a SEGV. So we need to reset it anyway at the beginning of the
>    parallel vacuum. It's easy to fix at this time but once the parallel
>   vacuum code gets more complex, it could forget to care about it.

>    IMO VacuumSharedCostBalance and VacuumActiveNWorkers have a different
>    story. They are set in vacuumparallel.c and are used in vacuum.c for
>    vacuum delay. If they weren't global variables, we would need to pass
>    them to every function that could eventually call the vacuum delay
>    function. So it makes sense to me to have them as global variables.On
>    the other hand, for ParallelVacuumProgress, it's a common pattern that
>    ambulkdelete(), amvacuumcleanup() or a common index scan routine like
>    btvacuumscan() checks the progress. I don't think index AM needs to
>    pass the value down to many of its functions. So it makes sense to me
>    to pass it via IndexVacuumInfo.

Thanks for the detailed explanation and especially clearing up
my understanding of VacuumSharedCostBalance and VacuumActiveNWorker.

I do now think that passing ParallelVacuumState in IndexVacuumInfo is
a more optimal choice.

Attached version addresses the above and the previous comments.


Thanks

Sami Imseih
Amazon Web Services (AWS)


Attachment

pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Schema variables - new implementation for Postgres 15
Next
From: Erik Rijkers
Date:
Subject: Re: Schema variables - new implementation for Postgres 15 (typo)