Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
Date
Msg-id CAFiTN-ubHT+QwK58iWy2=y59nxRUTh5X6vwbWw6_O089rcGDBg@mail.gmail.com
In response to Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
List pgsql-hackers
On Fri, Jul 17, 2020 at 4:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> The attached patch allows vacuum to continue by emitting a WARNING
> for the corrupted tuple instead of immediately erroring out, as
> discussed at [1].
>
> Basically, it provides a new GUC called vacuum_tolerate_damage to
> control whether vacuum continues or stops when it encounters a
> corrupted tuple.  If vacuum_tolerate_damage is set, then in all the
> cases in heap_prepare_freeze_tuple where a corrupted xid is detected,
> it will emit a warning, return that nothing has changed in the tuple,
> and set 'tuple_totally_frozen' to false.  Since we return false, the
> caller will not try to freeze such a tuple, and since
> tuple_totally_frozen is false, the page will not be marked all-frozen
> even if all other tuples on the page are frozen.
>
> Alternatively, we could try to freeze the other XIDs in the tuple that
> are not corrupted, but I don't think we would gain anything from this,
> because if either xmin or xmax is wrong, then the next time we run
> vacuum we are going to get the same WARNING or ERROR.
> Are there any other opinions on this?
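
To make the quoted behaviour concrete, here is a rough standalone sketch of
the xmin checks only.  It is not the patch itself: sketch_prepare_freeze,
FreezeOutcome and xid_precedes are hypothetical names invented for
illustration, the XID comparison is a simplified model that ignores the
special XIDs, and the real heap_prepare_freeze_tuple also handles xmax and
multixacts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

/* Simplified wraparound-aware comparison; special XIDs are ignored here. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* Hypothetical result type: what the caller would see for one tuple. */
typedef struct FreezeOutcome
{
    bool changed;          /* would the tuple be frozen? */
    bool totally_frozen;   /* may the page still become all-frozen? */
    bool hard_error;       /* would vacuum have stopped with ERROR? */
} FreezeOutcome;

/*
 * Sketch of the xmin handling described above.  tolerate_damage stands in
 * for the proposed vacuum_tolerate_damage GUC.
 */
FreezeOutcome
sketch_prepare_freeze(TransactionId xmin, bool xmin_committed,
                      TransactionId relfrozenxid, TransactionId cutoff_xid,
                      bool tolerate_damage)
{
    FreezeOutcome out = {false, true, false};
    bool damaged;

    /* The freeze cutoff is assumed never to precede relfrozenxid. */
    assert(!xid_precedes(cutoff_xid, relfrozenxid));

    /* Damage: xmin older than relfrozenxid, or old but uncommitted. */
    damaged = xid_precedes(xmin, relfrozenxid) ||
        (xid_precedes(xmin, cutoff_xid) && !xmin_committed);

    if (damaged)
    {
        /* Never freeze the tuple, never let the page be all-frozen. */
        out.totally_frozen = false;
        if (tolerate_damage)
            fprintf(stderr, "WARNING: found corrupted xmin %u\n",
                    (unsigned) xmin);
        else
            out.hard_error = true;
        return out;
    }

    if (xid_precedes(xmin, cutoff_xid))
        out.changed = true;          /* old and committed: freeze it */
    else
        out.totally_frozen = false;  /* too recent to freeze */

    return out;
}
```

With relfrozenxid = 100 and cutoff_xid = 200, an xmin of 90 is reported as
damaged and left untouched when tolerate_damage is true, whereas a committed
xmin of 150 is simply frozen.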

Robert mentioned at [1] that we should probably refuse to update
'relfrozenxid/relminmxid' when we encounter such a tuple and emit a
WARNING instead of an error.  I think we should do that in some cases,
but IMHO it's not a good idea in all cases.  Basically, if the xmin
precedes the relfrozenxid, then we should probably still allow the
relfrozenxid to be updated, whereas if the xmin precedes the cutoff xid
and is still uncommitted, then we should probably prevent relfrozenxid
from being updated so that CLOG does not get truncated.  Shall I make
these changes if we agree with the idea?  Or should we keep it simple
and never allow 'relfrozenxid/relminmxid' to be updated in such cases?
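
The two options can be sketched roughly as below.  This is only an
illustration of the decision rule, not code from the patch: XminDamage,
classify_xmin and may_advance_relfrozenxid are hypothetical names, the XID
comparison ignores the special XIDs, and relminmxid is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Simplified wraparound-aware comparison; special XIDs are ignored here. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* Hypothetical classification of the two xmin damage cases above. */
typedef enum XminDamage
{
    XMIN_OK,
    XMIN_BEFORE_RELFROZENXID,
    XMIN_UNCOMMITTED_BEFORE_CUTOFF
} XminDamage;

XminDamage
classify_xmin(TransactionId xmin, bool xmin_committed,
              TransactionId relfrozenxid, TransactionId cutoff_xid)
{
    /* The freeze cutoff is assumed never to precede relfrozenxid. */
    assert(!xid_precedes(cutoff_xid, relfrozenxid));

    if (xid_precedes(xmin, relfrozenxid))
        return XMIN_BEFORE_RELFROZENXID;
    if (xid_precedes(xmin, cutoff_xid) && !xmin_committed)
        return XMIN_UNCOMMITTED_BEFORE_CUTOFF;
    return XMIN_OK;
}

/*
 * keep_it_simple = true models "never update relfrozenxid on damage";
 * false models the selective rule: only the uncommitted case blocks the
 * update, so that the CLOG it still needs is not truncated.
 */
bool
may_advance_relfrozenxid(XminDamage damage, bool keep_it_simple)
{
    if (damage == XMIN_OK)
        return true;
    if (keep_it_simple)
        return false;
    return damage == XMIN_BEFORE_RELFROZENXID;
}
```

Under the selective rule, only the uncommitted-xmin case holds relfrozenxid
back; under the simple rule, any damaged tuple does.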

[1] http://postgr.es/m/CA+TgmoaZwZHtFFU6NUJgEAp6adDs-qWfNOXpZGQpZMmm0VTDfg@mail.gmail.com

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


