Re: Parallel heap vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Parallel heap vacuum
Date
Msg-id CAD21AoA9eJ0Qx=3h77__K5ssj8R3KoVY3Uw5P7vux8HmJMRKBg@mail.gmail.com
In response to Re: Parallel heap vacuum  (Andres Freund <andres@anarazel.de>)
Responses Re: Parallel heap vacuum
List pgsql-hackers
On Sat, Apr 5, 2025 at 1:32 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2025-04-04 14:34:53 -0700, Masahiko Sawada wrote:
> > On Fri, Apr 4, 2025 at 11:05 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > >
> > > On Tue, Apr 1, 2025 at 5:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > >
> > > > I've attached the new version patch. There are no major changes; I
> > > > fixed some typos, improved the comment, and removed duplicated codes.
> > > > Also, I've updated the commit messages.
> > >
> > > I haven't looked closely at this version but I did notice that you do
> > > not document that parallel vacuum disables eager scanning. Imagine you
> > > are a user who has set the eager freeze related table storage option
> > > (vacuum_max_eager_freeze_failure_rate) and you schedule a regular
> > > parallel vacuum. Now that table storage option does nothing.
> >
> > Good point. That restriction should be mentioned in the documentation.
> > I'll update the patch.
>
> I don't think we commonly accept that a new feature B regresses a pre-existing
> feature A, particularly not if feature B is enabled by default. Why would that
> be OK here?

The eager freeze scan is the pre-existing feature, but it's pretty new
code that was pushed just a couple of months ago. I didn't want to
further complicate newly introduced code within a single major release,
especially in the vacuum area. While I agree that disabling eager
freeze scans during parallel heap vacuum is not very user-friendly,
there are still many cases where parallel heap vacuum helps even
without the eager freeze scan. FYI, the parallel heap scan can be
disabled by setting min_parallel_table_scan_size. So I thought we
could improve this part incrementally.

>
>
> The justification in the code:
> +        * One might think that it would make sense to use the eager scanning even
> +        * during parallel lazy vacuum, but parallel vacuum is available only in
> +        * VACUUM command and would not be something that happens frequently,
> +        * which seems not fit to the purpose of the eager scanning. Also, it
> +        * would require making the code complex. So it would make sense to
> +        * disable it for now.
>
> feels not at all convincing to me. There e.g. are lots of places that run
> nightly vacuums. I don't think it's ok to just disable eager scanning in such
> a case, as it would mean that the "freeze cliff" would end up being *higher*
> because of the nightly vacuums than if just plain autovacuum would have been
> used.

That's a fair argument.

> I think it was already a mistake to allow the existing vacuum parallelism to
> be introduced without integrating it with autovacuum. I don't think we should
> go further down that road.

Okay, I think we can consider how to proceed with this patch, including
the above point, during v19 development.


Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


