Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING |
Date | |
Msg-id | CAFiTN-sWhf1gJBm=Q7X=gx0ZWaD2TYuujQHFOuzAjLnzD_Og6g@mail.gmail.com Whole thread Raw |
In response to | Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
|
List | pgsql-hackers |
On Tue, Jul 21, 2020 at 4:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Jul 21, 2020 at 11:00 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Tue, Jul 21, 2020 at 2:00 AM Andres Freund <andres@anarazel.de> wrote: > > > > > > Hi, > > > > > > On 2020-07-17 16:16:23 +0530, Dilip Kumar wrote: > > > > The attached patch allows the vacuum to continue by emitting WARNING > > > > for the corrupted tuple instead of immediately error out as discussed > > > > at [1]. > > > > > > > > Basically, it provides a new GUC called vacuum_tolerate_damage, to > > > > control whether to continue the vacuum or to stop on the occurrence of > > > > a corrupted tuple. So if the vacuum_tolerate_damage is set then in > > > > all the cases in heap_prepare_freeze_tuple where the corrupted xid is > > > > detected, it will emit a warning and return that nothing is changed in > > > > the tuple and the 'tuple_totally_frozen' will also be set to false. > > > > Since we are returning false the caller will not try to freeze such > > > > tuple and the tuple_totally_frozen is also set to false so that the > > > > page will not be marked to all frozen even if all other tuples in the > > > > page are frozen. > > > > > > I'm extremely doubtful this is a good idea. In all likelihood this will > > > just exascerbate corruption. > > > > > > You cannot just stop freezing tuples, that'll lead to relfrozenxid > > > getting *further* out of sync with the actual table contents. And you > > > cannot just freeze such tuples, because that has a good chance of making > > > deleted tuples suddenly visible, leading to unique constraint violations > > > etc. Which will then subsequently lead to clog lookup errors and such. > > > > I agree with the point. But, if we keep giving the ERROR in such > > cases then also the situation is not any better. Basically, we are > > not freezing such tuple as well as we can not advance the > > relfrozenxid. So if we follow the same rule that we don't freeze > > those tuples and also don't advance the relfrozenxid. The only > > difference is, allow the vacuum to continue with other tuples. > > > > > At the very least you'd need to signal up that relfrozenxid/relminmxid > > > cannot be increased. Without that I think it's entirely unacceptable to > > > do this. > > > > I agree with that point. I was just confused that shall we disallow > > to advance the relfrozenxid in all such cases or in some cases where > > the xid already precedes the relfrozenxid, we can allow it to advance > > as it can not become any worse. But, as Alvaro pointed out that there > > is no point in optimizing such cases. I will update the patch to > > stop advancing the relfrozenxid if we find any corrupted xid during > > tuple freeze. > > > > > If we really were to do something like this the option would need to be > > > called vacuum_allow_making_corruption_worse or such. Its need to be > > > *exceedingly* clear that it will likely lead to making everything much > > > worse. > > > > > Maybe we can clearly describe this in the document. > > > > > > @@ -6123,6 +6124,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple, > > > > frz->t_infomask = tuple->t_infomask; > > > > frz->xmax = HeapTupleHeaderGetRawXmax(tuple); > > > > > > I don't think it can be right to just update heap_prepare_freeze_tuple() > > > but not FreezeMultiXactId(). > > > > oh, I missed this part. I will fix it. > > Please find the updated patch. In this version, we don't allow the > relfrozenxid and relminmxid to advance if the corruption is detected > and also added the handling in FreezeMultiXactId. In the previous version, the feature was enabled for cluster/vacuum full command as well. in the attached patch I have enabled it only if we are running vacuum command. It will not be enabled during a table rewrite. If we think that it should be enabled for the 'vacuum full' then we might need to pass a flag from the cluster_rel, all the way down to the heap_freeze_tuple. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Attachment
pgsql-hackers by date: