Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
Date
Msg-id 20200828172916.4jzfpbo72mt6mwlf@alap3.anarazel.de
In response to Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
List pgsql-hackers
Hi,

On 2020-08-28 12:37:17 -0400, Robert Haas wrote:
> On Mon, Jul 20, 2020 at 4:30 PM Andres Freund <andres@anarazel.de> wrote:
> > If we really were to do something like this the option would need to be
> > called vacuum_allow_making_corruption_worse or such. It needs to be
> > *exceedingly* clear that it will likely lead to making everything much
> > worse.
> 
> I don't really understand this objection. How does letting VACUUM
> continue after problems have been detected make anything worse?

It can break HOT chains, plain ctid chains, etc., for example. And if
the earlier / follower tuples are removed, that breakage can't be
detected anymore at a later time.


> The point is that when you make VACUUM fail, you not only don't
> advance relfrozenxid/relminmxid, but also don't remove dead tuples. In
> the long run, either thing will kill you, but it is not difficult to
> have a situation where failing to remove dead tuples kills you a lot
> faster. The table can just bloat until performance tanks, and then the
> application goes down, even if you still had 100+ million XIDs before
> you needed a wraparound vacuum.
> 
> Honestly, I wonder why continuing (but without advancing relfrozenxid
> or relminmxid) shouldn't be the default behavior. I mean, if it
> actually corrupts your data, then it clearly shouldn't be, and
> probably shouldn't even be an optional behavior, but I still don't see
> why it would do that.

I think it's an EXTREMELY bad idea to enable anything like this by
default. It'll make bugs entirely undiagnosable, because we'll remove a
lot of the evidence of what the problem is. And we've had many
long-standing bugs in this area, several only found because we actually
started to bleat about them. And quite evidently, we have more bugs to
fix in the area.

Greetings,

Andres Freund


