Re: Call lazy_check_wraparound_failsafe earlier for parallel vacuum - Mailing list pgsql-hackers

From Imseih (AWS), Sami
Subject Re: Call lazy_check_wraparound_failsafe earlier for parallel vacuum
Date
Msg-id 1F2117A6-83F1-4C4E-8DAC-5A3C2B5BC7A0@amazon.com
Whole thread Raw
In response to Re: Call lazy_check_wraparound_failsafe earlier for parallel vacuum  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Call lazy_check_wraparound_failsafe earlier for parallel vacuum
List pgsql-hackers
>    It makes sense to prefer consistency here, I suppose. The reason why
>    we're not consistent is because it was easier not to be, which isn't
>    exactly the best reason (nor the worst).

Consistency is the key point here. It is odd that a serial
vacuum may skip the remainder of the indexes if failsafe
kicks-in, but in the parallel case it will go through the entire index
cycle.

>    My gut instinct is that the most important thing is to at least call
>    lazy_check_wraparound_failsafe() once per index scan. 

I agree. And this should happen in the serial and parallel case.

> That said, one thing that does bother me in this area occurs to me: we
> call lazy_check_wraparound_failsafe() from lazy_scan_heap() (before we
> get to the index scans that you're talking about) at an interval that
> is based on how many heap pages we've either processed (and recorded
> as a scanned_pages page) *or* have skipped over using the visibility
> map. In other words we use blkno here, when we should really be using
> scanned_pages instead:

>            if (blkno - next_failsafe_block >= FAILSAFE_EVERY_PAGES)
>            {
>                lazy_check_wraparound_failsafe(vacrel);
>                next_failsafe_block = blkno;
>            }

>    This code effectively treats pages skipped using the visibility map as
>    equivalent to pages physically scanned (scanned_pages), even though
>    skipping requires essentially no work at all. That just isn't logical,
>    and feels like something worth fixing. The fundamental unit of work in
>    lazy_scan_heap() is a *scanned* heap page.

It makes perfect sense to use the scanned_pages instead.

Regards,

Sami imseih
Amazon Web Services (AWS)


pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: Document parameter count limit
Next
From: "Regina Obe"
Date:
Subject: RE: Ability to reference other extensions by schema in extension scripts