Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
Date
Msg-id CAFiTN-sWhf1gJBm=Q7X=gx0ZWaD2TYuujQHFOuzAjLnzD_Og6g@mail.gmail.com
Whole thread Raw
In response to Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
List pgsql-hackers
On Tue, Jul 21, 2020 at 4:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Jul 21, 2020 at 11:00 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Jul 21, 2020 at 2:00 AM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > Hi,
> > >
> > > On 2020-07-17 16:16:23 +0530, Dilip Kumar wrote:
> > > > The attached patch allows the vacuum to continue by emitting WARNING
> > > > for the corrupted tuple instead of immediately error out as discussed
> > > > at [1].
> > > >
> > > > Basically, it provides a new GUC called vacuum_tolerate_damage, to
> > > > control whether to continue the vacuum or to stop on the occurrence of
> > > > a corrupted tuple.  So if the vacuum_tolerate_damage is set then in
> > > > all the cases in heap_prepare_freeze_tuple where the corrupted xid is
> > > > detected, it will emit a warning and return that nothing is changed in
> > > > the tuple and the 'tuple_totally_frozen' will also be set to false.
> > > > Since we are returning false the caller will not try to freeze such
> > > > tuple and the tuple_totally_frozen is also set to false so that the
> > > > page will not be marked to all frozen even if all other tuples in the
> > > > page are frozen.
> > >
> > > I'm extremely doubtful this is a good idea. In all likelihood this will
> > > just exascerbate corruption.
> > >
> > > You cannot just stop freezing tuples, that'll lead to relfrozenxid
> > > getting *further* out of sync with the actual table contents. And you
> > > cannot just freeze such tuples, because that has a good chance of making
> > > deleted tuples suddenly visible, leading to unique constraint violations
> > > etc. Which will then subsequently lead to clog lookup errors and such.
> >
> > I agree with the point.  But, if we keep giving the ERROR in such
> > cases then also the situation is not any better.  Basically, we are
> > not freezing such tuple as well as we can not advance the
> > relfrozenxid.  So if we follow the same rule that we don't freeze
> > those tuples and also don't advance the relfrozenxid.  The only
> > difference is, allow the vacuum to continue with other tuples.
> >
> > > At the very least you'd need to signal up that relfrozenxid/relminmxid
> > > cannot be increased. Without that I think it's entirely unacceptable to
> > > do this.
> >
> > I agree with that point.  I was just confused that shall we disallow
> > to advance the relfrozenxid in all such cases or in some cases where
> > the xid already precedes the relfrozenxid, we can allow it to advance
> > as it can not become any worse.  But, as Alvaro pointed out that there
> > is no point in optimizing such cases.   I will update the patch to
> > stop advancing the relfrozenxid if we find any corrupted xid during
> > tuple freeze.
> >
> > > If we really were to do something like this the option would need to be
> > > called vacuum_allow_making_corruption_worse or such. Its need to be
> > > *exceedingly* clear that it will likely lead to making everything much
> > > worse.
> > >
> > Maybe we can clearly describe this in the document.
> >
> > > > @@ -6123,6 +6124,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,
> > > >       frz->t_infomask = tuple->t_infomask;
> > > >       frz->xmax = HeapTupleHeaderGetRawXmax(tuple);
> > >
> > > I don't think it can be right to just update heap_prepare_freeze_tuple()
> > > but not FreezeMultiXactId().
> >
> > oh, I missed this part.  I will fix it.
>
> Please find the updated patch.  In this version, we don't allow the
> relfrozenxid and relminmxid to advance if the corruption is detected
> and also added the handling in FreezeMultiXactId.

In the previous version, the feature was enabled for cluster/vacuum
full command as well.   in the attached patch I have enabled it only
if we are running vacuum command.  It will not be enabled during a
table rewrite.  If we think that it should be enabled for the 'vacuum
full' then we might need to pass a flag from the cluster_rel, all the
way down to the heap_freeze_tuple.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Postgres-native method to identify if a tuple is frozen
Next
From: Surafel Temesgen
Date:
Subject: Re: WIP: System Versioned Temporal Table