Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING |
Date | |
Msg-id | CAFiTN-vVJ_hx6M4ufonh9EiK4E_mnjFoS-jh+G-Smw7=cB+aVA@mail.gmail.com Whole thread Raw |
In response to | Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
|
List | pgsql-hackers |
On Tue, Jul 21, 2020 at 2:00 AM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2020-07-17 16:16:23 +0530, Dilip Kumar wrote: > > The attached patch allows the vacuum to continue by emitting WARNING > > for the corrupted tuple instead of immediately error out as discussed > > at [1]. > > > > Basically, it provides a new GUC called vacuum_tolerate_damage, to > > control whether to continue the vacuum or to stop on the occurrence of > > a corrupted tuple. So if the vacuum_tolerate_damage is set then in > > all the cases in heap_prepare_freeze_tuple where the corrupted xid is > > detected, it will emit a warning and return that nothing is changed in > > the tuple and the 'tuple_totally_frozen' will also be set to false. > > Since we are returning false the caller will not try to freeze such > > tuple and the tuple_totally_frozen is also set to false so that the > > page will not be marked to all frozen even if all other tuples in the > > page are frozen. > > I'm extremely doubtful this is a good idea. In all likelihood this will > just exascerbate corruption. > > You cannot just stop freezing tuples, that'll lead to relfrozenxid > getting *further* out of sync with the actual table contents. And you > cannot just freeze such tuples, because that has a good chance of making > deleted tuples suddenly visible, leading to unique constraint violations > etc. Which will then subsequently lead to clog lookup errors and such. I agree with the point. But, if we keep giving the ERROR in such cases then also the situation is not any better. Basically, we are not freezing such tuple as well as we can not advance the relfrozenxid. So if we follow the same rule that we don't freeze those tuples and also don't advance the relfrozenxid. The only difference is, allow the vacuum to continue with other tuples. > At the very least you'd need to signal up that relfrozenxid/relminmxid > cannot be increased. Without that I think it's entirely unacceptable to > do this. I agree with that point. I was just confused that shall we disallow to advance the relfrozenxid in all such cases or in some cases where the xid already precedes the relfrozenxid, we can allow it to advance as it can not become any worse. But, as Alvaro pointed out that there is no point in optimizing such cases. I will update the patch to stop advancing the relfrozenxid if we find any corrupted xid during tuple freeze. > If we really were to do something like this the option would need to be > called vacuum_allow_making_corruption_worse or such. Its need to be > *exceedingly* clear that it will likely lead to making everything much > worse. > Maybe we can clearly describe this in the document. > > @@ -6123,6 +6124,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple, > > frz->t_infomask = tuple->t_infomask; > > frz->xmax = HeapTupleHeaderGetRawXmax(tuple); > > I don't think it can be right to just update heap_prepare_freeze_tuple() > but not FreezeMultiXactId(). oh, I missed this part. I will fix it. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: