Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
Date
Msg-id 20200720203046.rjltg2pxr3c4qjx6@alap3.anarazel.de
Whole thread Raw
In response to Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING  (Dilip Kumar <dilipbalaut@gmail.com>)
Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING  (Robert Haas <robertmhaas@gmail.com>)
Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2020-07-17 16:16:23 +0530, Dilip Kumar wrote:
> The attached patch allows the vacuum to continue by emitting WARNING
> for the corrupted tuple instead of immediately error out as discussed
> at [1].
> 
> Basically, it provides a new GUC called vacuum_tolerate_damage, to
> control whether to continue the vacuum or to stop on the occurrence of
> a corrupted tuple.  So if the vacuum_tolerate_damage is set then in
> all the cases in heap_prepare_freeze_tuple where the corrupted xid is
> detected, it will emit a warning and return that nothing is changed in
> the tuple and the 'tuple_totally_frozen' will also be set to false.
> Since we are returning false the caller will not try to freeze such
> tuple and the tuple_totally_frozen is also set to false so that the
> page will not be marked to all frozen even if all other tuples in the
> page are frozen.

I'm extremely doubtful this is a good idea. In all likelihood this will
just exascerbate corruption.

You cannot just stop freezing tuples, that'll lead to relfrozenxid
getting *further* out of sync with the actual table contents. And you
cannot just freeze such tuples, because that has a good chance of making
deleted tuples suddenly visible, leading to unique constraint violations
etc. Which will then subsequently lead to clog lookup errors and such.

At the very least you'd need to signal up that relfrozenxid/relminmxid
cannot be increased. Without that I think it's entirely unacceptable to
do this.


If we really were to do something like this the option would need to be
called vacuum_allow_making_corruption_worse or such. Its need to be
*exceedingly* clear that it will likely lead to making everything much
worse.


> @@ -6123,6 +6124,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,
>      frz->t_infomask = tuple->t_infomask;
>      frz->xmax = HeapTupleHeaderGetRawXmax(tuple);

I don't think it can be right to just update heap_prepare_freeze_tuple()
but not FreezeMultiXactId().

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Local visibility with logical decoding
Next
From: Tom Lane
Date:
Subject: Re: pg_subscription.subslotname is wrongly marked NOT NULL