Thread: Call lazy_check_wraparound_failsafe earlier for parallel vacuum
While looking through the vacuum code, I noticed that, unlike
non-parallel vacuum, parallel vacuum only gets a failsafe check after
an entire index cycle completes. In vacuumlazy.c,
lazy_check_wraparound_failsafe is called after each index is vacuumed
in the non-parallel case, but in the parallel case it is only called
after the entire index cycle has completed:
    if (!ParallelVacuumIsActive(vacrel))
    {
        for (int idx = 0; idx < vacrel->nindexes; idx++)
        {
            Relation    indrel = vacrel->indrels[idx];
            IndexBulkDeleteResult *istat = vacrel->indstats[idx];

            vacrel->indstats[idx] =
                lazy_vacuum_one_index(indrel, istat, vacrel->old_live_tuples,
                                      vacrel);

            /*
             * Done vacuuming an index. Increment the indexes completed
             */
            pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
                                         idx + 1);

            if (lazy_check_wraparound_failsafe(vacrel))
            {
                /* Wraparound emergency -- end current index scan */
                allindexes = false;
                break;
            }
        }
    }
    else
    {
        /* Outsource everything to parallel variant */
        parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, vacrel->old_live_tuples,
                                            vacrel->num_index_scans);

        /*
         * Do a postcheck to consider applying wraparound failsafe now. Note
         * that parallel VACUUM only gets the precheck and this postcheck.
         */
        if (lazy_check_wraparound_failsafe(vacrel))
            allindexes = false;
    }
When a user is running a parallel vacuum and the vacuum is long
running due to many large indexes, it would make sense to check for
the failsafe earlier. Also, checking after every index in the parallel
case would provide the same failsafe behavior for both parallel and
non-parallel vacuums.

To make this work, lazy_check_wraparound_failsafe could be called
inside parallel_vacuum_process_unsafe_indexes and
parallel_vacuum_process_safe_indexes in vacuumparallel.c, along the
lines of the sketch below.
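A rough sketch of the shape this could take (hypothetical: note that
vacuumparallel.c has no direct access to the heap vacuum's LVRelState
today, so the pvs->heaprel_state field below is invented purely for
illustration):

    for (int i = 0; i < pvs->nindexes; i++)
    {
        /* ... existing per-index bulk-delete/cleanup work ... */

        /* Hypothetical: mirror the serial path's per-index failsafe check */
        if (lazy_check_wraparound_failsafe(pvs->heaprel_state))
        {
            /* Wraparound emergency -- stop processing further indexes */
            break;
        }
    }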
Regards,
Sami Imseih
Amazon Web Services (AWS)
On Wed, Nov 9, 2022 at 6:29 AM Imseih (AWS), Sami <simseih@amazon.com> wrote:
> When a user is running a parallel vacuum and the vacuum is long
> running due to many large indexes, it would make sense to check for
> the failsafe earlier.

It makes sense to prefer consistency here, I suppose. The reason why
we're not consistent is because it was easier not to be, which isn't
exactly the best reason (nor the worst).

I don't think that it's obvious that we need to call the failsafe at
any particular frequency. There is probably an argument to be made for
the idea that we're not checking frequently enough (even in the common
serial VACUUM case), just as there is an argument to be made for the
opposite idea. It's not like there is some simple linear relationship
(or any kind of relationship) between the amount of physical work
performed by one VACUUM operation and the rate of XID consumption by
the system as a whole. And so the details of how we do it have plenty
to do with what we can afford to do.

My gut instinct is that the most important thing is to at least call
lazy_check_wraparound_failsafe() once per index scan. Multiple index
scans are disproportionately involved in VACUUMs that take far longer
than expected, which are presumably the kind of VACUUMs that tend to
be running when the failsafe actually triggers. We don't really expect
the failsafe to trigger, so presumably when it actually does, things
haven't been going well for some time. (Index corruption that prevents
forward progress on one particular index is another example.)

That said, one thing that does bother me in this area occurs to me: we
call lazy_check_wraparound_failsafe() from lazy_scan_heap() (before we
get to the index scans that you're talking about) at an interval that
is based on how many heap pages we've either processed (and recorded
as a scanned_pages page) *or* have skipped over using the visibility
map. In other words we use blkno here, when we should really be using
scanned_pages instead:

    if (blkno - next_failsafe_block >= FAILSAFE_EVERY_PAGES)
    {
        lazy_check_wraparound_failsafe(vacrel);
        next_failsafe_block = blkno;
    }

This code effectively treats pages skipped using the visibility map as
equivalent to pages physically scanned (scanned_pages), even though
skipping requires essentially no work at all. That just isn't logical,
and feels like something worth fixing. The fundamental unit of work in
lazy_scan_heap() is a *scanned* heap page.

--
Peter Geoghegan
> It makes sense to prefer consistency here, I suppose. The reason why
> we're not consistent is because it was easier not to be, which isn't
> exactly the best reason (nor the worst).

Consistency is the key point here. It is odd that a serial vacuum may
skip the remainder of the indexes if the failsafe kicks in, but in the
parallel case it will go through the entire index cycle.

> My gut instinct is that the most important thing is to at least call
> lazy_check_wraparound_failsafe() once per index scan.

I agree. And this should happen in both the serial and parallel cases.

> That said, one thing that does bother me in this area occurs to me: we
> call lazy_check_wraparound_failsafe() from lazy_scan_heap() (before we
> get to the index scans that you're talking about) at an interval that
> is based on how many heap pages we've either processed (and recorded
> as a scanned_pages page) *or* have skipped over using the visibility
> map. In other words we use blkno here, when we should really be using
> scanned_pages instead:
>
>     if (blkno - next_failsafe_block >= FAILSAFE_EVERY_PAGES)
>     {
>         lazy_check_wraparound_failsafe(vacrel);
>         next_failsafe_block = blkno;
>     }
>
> This code effectively treats pages skipped using the visibility map as
> equivalent to pages physically scanned (scanned_pages), even though
> skipping requires essentially no work at all. That just isn't logical,
> and feels like something worth fixing. The fundamental unit of work in
> lazy_scan_heap() is a *scanned* heap page.

It makes perfect sense to use scanned_pages instead.

Regards,

Sami Imseih
Amazon Web Services (AWS)
On Thu, Nov 10, 2022 at 10:20 AM Imseih (AWS), Sami <simseih@amazon.com> wrote:
> Consistency is the key point here. It is odd that a serial vacuum may
> skip the remainder of the indexes if the failsafe kicks in, but in the
> parallel case it will go through the entire index cycle.

Yeah, it's a little inconsistent.

> > My gut instinct is that the most important thing is to at least call
> > lazy_check_wraparound_failsafe() once per index scan.
>
> I agree. And this should happen in both the serial and parallel cases.

I meant that there should definitely be a check between each round of
index scans (one index scan here affects each and every index). Doing
more than a single index scan is presumably rare, but multiple scans
are likely common among VACUUM operations that take an unusually long
time -- which is where the failsafe is relevant. I'm just highlighting
that multiple index scans (rather than just 0 or 1 index scans) are by
far the primary risk factor leading to a VACUUM that takes way longer
than is typical. (The other notable risk comes from aggressive VACUUMs
that freeze a great deal of heap pages all at once, which I'm
currently addressing by getting rid of the whole concept of discrete
aggressive mode VACUUM operations.)

> > This code effectively treats pages skipped using the visibility map as
> > equivalent to pages physically scanned (scanned_pages), even though
> > skipping requires essentially no work at all. That just isn't logical,
> > and feels like something worth fixing. The fundamental unit of work in
> > lazy_scan_heap() is a *scanned* heap page.
>
> It makes perfect sense to use scanned_pages instead.

Want to have a go at writing a patch for that?

--
Peter Geoghegan
> Yeah, it's a little inconsistent.

Yes, this should be corrected by calling the failsafe inside the
parallel vacuum loops and handling the case by exiting the loop and
the parallel vacuum if the failsafe kicks in.

> I meant that there should definitely be a check between each round of
> index scans (one index scan here affects each and every index). Doing
> more than a single index scan is presumably rare, but multiple scans
> are likely common among VACUUM operations that take an unusually long
> time -- which is where the failsafe is relevant.

Ah, OK. I was confused by the terminology. I read "index scans" as a
single index scan rather than an index scan cycle. FWIW, even in the
parallel case, the failsafe is checked after every index scan cycle.

> Want to have a go at writing a patch for that?

Yes, I can.

Regards,

Sami Imseih
Amazon Web Services (AWS)
Attached is a patch to check scanned pages rather than blockno.

Regards,

Sami Imseih
Amazon Web Services (AWS)
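Presumably the gist of the attached change is along these lines (a
sketch only; the exact committed form may differ slightly):

    -    if (blkno - next_failsafe_block >= FAILSAFE_EVERY_PAGES)
    -    {
    -        lazy_check_wraparound_failsafe(vacrel);
    -        next_failsafe_block = blkno;
    -    }
    +    if (vacrel->scanned_pages % FAILSAFE_EVERY_PAGES == 0)
    +        lazy_check_wraparound_failsafe(vacrel);

This keys the check interval to pages actually scanned, so pages
skipped via the visibility map no longer count toward it.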
On Sat, Nov 12, 2022 at 12:28 AM Imseih (AWS), Sami <simseih@amazon.com> wrote:
> > Yeah, it's a little inconsistent.
>
> Yes, this should be corrected by calling the failsafe inside the
> parallel vacuum loops and handling the case by exiting the loop and
> the parallel vacuum if the failsafe kicks in.

I agree it's better to be consistent, but I think we cannot simply
call lazy_check_wraparound_failsafe() inside the parallel vacuum
loops. IIUC the failsafe is heap (or lazy vacuum) specific, whereas
parallel vacuum is common infrastructure for doing index vacuuming in
parallel. We should not break this design. For example, we would need
to have a callback for the index scan loop so that the caller (i.e.,
lazy vacuum) can do its work.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
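To illustrate the kind of callback being described here (a minimal
sketch; the typedef, the abort_check field, and the wrapper are all
invented for illustration and do not exist in vacuumparallel.c):

    /*
     * Hypothetical callback: lets the caller (vacuumlazy.c) decide whether
     * the parallel index-processing loop should stop early, without
     * vacuumparallel.c knowing anything about heap-specific failsafe logic.
     */
    typedef bool (*ParallelVacuumAbortCheck) (void *arg);

    /* Inside vacuumparallel.c's per-index loop, roughly: */
    for (int i = 0; i < pvs->nindexes; i++)
    {
        /* ... process one index, as today ... */

        if (pvs->abort_check != NULL &&
            pvs->abort_check(pvs->abort_check_arg))
            break;              /* caller requested early exit */
    }

vacuumlazy.c would then register a small wrapper around
lazy_check_wraparound_failsafe() as the callback, keeping the layering
intact.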
On Wed, Dec 21, 2022 at 2:44 AM Imseih (AWS), Sami <simseih@amazon.com> wrote:
> Attached is a patch to check scanned pages rather than blockno.

Thank you for the patch. It looks good to me.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Tue, Dec 20, 2022 at 9:44 AM Imseih (AWS), Sami <simseih@amazon.com> wrote:
> Attached is a patch to check scanned pages rather than blockno.

Pushed, thanks!

I adjusted the FAILSAFE_EVERY_PAGES comments, which now point out that
FAILSAFE_EVERY_PAGES is a power of two. The implication is that the
compiler is all but guaranteed to be able to reduce the modulo
division into a shift in the lazy_scan_heap loop, at the point of the
failsafe check. I doubt that it would really matter if the compiler
had to generate a DIV instruction, but it seems like a good idea to
avoid it on general principle, at least in performance-sensitive code.

--
Peter Geoghegan
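The power-of-two point is easy to demonstrate in isolation: for
unsigned operands, x % N with N a power of two reduces to a bitwise
AND, so no DIV instruction is needed. A standalone toy example (the
constant 524288 stands in for FAILSAFE_EVERY_PAGES, i.e. 4GB worth of
default 8KB pages; the names here are not from the PostgreSQL tree):

    #include <stdint.h>
    #include <stdio.h>

    /* Stand-in for FAILSAFE_EVERY_PAGES: 4GB of 8KB pages = 2^19 */
    #define EVERY_PAGES ((uint32_t) 524288)

    int
    main(void)
    {
        uint32_t scanned_pages = 1572864;   /* arbitrary test value */

        /*
         * For unsigned x and power-of-two N, x % N == (x & (N - 1)), so
         * the compiler can emit an AND instead of a DIV.
         */
        printf("%u %u\n",
               scanned_pages % EVERY_PAGES,
               scanned_pages & (EVERY_PAGES - 1));
        return 0;
    }

Both expressions print 0 here, and with optimization enabled the
compiler emits no division instruction for either form.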
> I adjusted the FAILSAFE_EVERY_PAGES comments, which now point out that
> FAILSAFE_EVERY_PAGES is a power of two. The implication is that the
> compiler is all but guaranteed to be able to reduce the modulo
> division into a shift in the lazy_scan_heap loop, at the point of the
> failsafe check. I doubt that it would really matter if the compiler
> had to generate a DIV instruction, but it seems like a good idea to
> avoid it on general principle, at least in performance-sensitive code.

Thank you!

Regards,

Sami Imseih
Amazon Web Services (AWS)