Re: Trigger more frequent autovacuums of heavy insert tables - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Trigger more frequent autovacuums of heavy insert tables
Date
Msg-id CAAKRu_bKcmQaY2=KgwRvar7nedFZ9bJvsGgBChsuUgTzRiv1NA@mail.gmail.com
Whole thread Raw
In response to Re: Trigger more frequent autovacuums of heavy insert tables  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Feb 25, 2025 at 1:52 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Feb 25, 2025 at 11:03 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > This does however leave me with the question of how to handle the
> > original question of whether or not to cap the proposed relallfrozen
> > to the value of relallvisible when updating stats at the end of
> > vacuum. The current code in heap_vacuum_rel() caps relallvisible to
> > relpages, so capping relallfrozen to relallvisible would follow that
> > pattern. However, the other places relallvisible is updated do no such
> > capping (do_analyze_rel(), index_update_stats()). It doesn't seem like
> > there is a good reason to do it one place and not the others. So, I
> > suggest either removing all the caps and adding a WARNING or capping
> > the value in all places. Because users can now manually update these
> > values in pg_class, there wouldn't be a way to detect the difference
> > between a bogus relallfrozen value due to VM corruption or a bogus
> > value due to manual statistics intervention. This led me to think that
> > a WARNING and no cap would be more effective for heap_vacuum_rel().
>
> I mean, does it really make any difference one way or the other?
>
> Given that users could manually update the catalog, we have to be able
> to tolerate bad data in the catalogs without the world ending. If that
> code has to exist anyway, then it's not mandatory to cap. On the other
> hand, there's no great virtue in refusing to correct data that we know
> to be wrong. Unless there is some other consideration which makes one
> way better than the other, this feels like author's choice.

I realized that whether or not we add a WARNING is an independent
question from whether or not we cap these values. In these instances,
we happen to have just read the whole VM and so we can tell you if it
is broken in a particular way. If I want to write a patch to warn
users of visibility map corruption after calling
visibilitymap_count(), I could do that and it might be a good idea,
but it should probably be a separate commit anyway.

- Melanie



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Next
From: Masahiko Sawada
Date:
Subject: Re: Parallel heap vacuum