Re: recovering from "found xmin ... from before relfrozenxid ..." - Mailing list pgsql-hackers

From Ashutosh Sharma
Subject Re: recovering from "found xmin ... from before relfrozenxid ..."
Date
Msg-id CAE9k0PkBurw1L96n6D4qvu=-L73WqV5rwhF2h_3rTHw=_ejA-g@mail.gmail.com
Whole thread Raw
In response to Re: recovering from "found xmin ... from before relfrozenxid ..."  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: recovering from "found xmin ... from before relfrozenxid ..."  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Aug 7, 2020 at 9:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Aug 6, 2020 at 9:23 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> There's a certain inconsistency to these messages:
>
> rhaas=# create table foo (a int);
> CREATE TABLE
> rhaas=# insert into foo values (1);
> INSERT 0 1
> rhaas=# select heap_force_kill('foo'::regclass, array['(0,2)'::tid]);
> NOTICE:  skipping tid (0, 2) because it contains an invalid offset
>  heap_force_kill
> -----------------
>
> (1 row)
>
> rhaas=# select heap_force_kill('foo'::regclass, array['(1,0)'::tid]);
> ERROR:  invalid item pointer
> LOCATION:  tids_same_page_fetch_offnums, heap_surgery.c:347
> rhaas=# select heap_force_kill('foo'::regclass, array['(1,1)'::tid]);
> ERROR:  block number 1 is out of range for relation "foo"
>
> From a user perspective it seems like I've made three very similar
> mistakes: in the first case, the offset is too high, in the second
> case it's too low, and in the third case the block number is out of
> range. But in one case I get a NOTICE and in the other two cases I get
> an ERROR. In one case I get the relation name and in the other two
> cases I don't. The two complaints about an invalid offset are phrased
> completely differently from each other. For example, suppose you do
> this:
>
> ERROR: tid (%u, %u) is invalid for relation "%s" because the block
> number is out of range (%u..%u)
> ERROR: tid (%u, %u) is invalid for relation "%s" because the item
> number is out of range for this block (%u..%u)
> ERROR: tid (%u, %u) is invalid for relation "%s" because the item is unused
> ERROR: tid (%u, %u) is invalid for relation "%s" because the item is dead
>

Thank you for your suggestions. To make this consistent, I am planning
to do the following changes:

Remove the error message to report "invalid item pointer" from
tids_same_page_fetch_offnums() and expand the if-check to detect any
invalid offset number in the CRITICAL section such that it not just
checks if the offset number is > maxoffset, but also checks if the
offset number is equal to 0. That way it would also do the job that
"if (!ItemPointerIsValid)" was doing for us.

Further, if any invalid block number is detected, then I am planning
to skip all the tids associated with this block and move to the next
block. Hence, instead of reporting the error I would report the NOTICE
message to the user.

The other two messages for reporting unused items and dead items
remain the same. Hence, with above change, we would be reporting the
following 4 messages:

NOTICE:  skipping all the tids in block %u for relation "%s" because
the block number is out of range

NOTICE:  skipping tid (%u, %u) for relation "%s" because the item
number is out of range for this block

NOTICE:  skipping tid (%u, %u) for relation "%s" because it is marked dead

NOTICE:  skipping tid (%u, %u) for relation "%s" because it is marked unused

Please let me know if you are okay with the above changes or not?

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: massive FPI_FOR_HINT load after promote
Next
From: "Smith, Peter"
Date:
Subject: RE: Libpq support to connect to standby server as priority