Re: recovering from "found xmin ... from before relfrozenxid ..." - Mailing list pgsql-hackers

From Ashutosh Sharma
Subject Re: recovering from "found xmin ... from before relfrozenxid ..."
Date
Msg-id CAE9k0PnRnBk0nbmh=pdmZWO+xjc=mJx7oJCA3VqJRMqwPT62vg@mail.gmail.com
Whole thread Raw
In response to Re: recovering from "found xmin ... from before relfrozenxid ..."  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Aug 28, 2020 at 1:44 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Aug 26, 2020 at 10:26 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > Okay, point noted.
>
> I spent some time today working on this patch. I'm fairly happy with
> it now and intend to commit it if nobody sees a big problem with that.
> Per discussion, I do not intend to back-patch at this time. The two
> most significant changes I made to your version are:
>
> 1. I changed things around to avoid using any form of ereport() in a
> critical section. I'm not actually sure whether it is project policy
> to avoid ereport(NOTICE, ...) or similar in a critical section, but it
> seems prudent, because if anything fails in a critical section, we
> will PANIC, so doing fewer things there seems prudent.
>
> 2. I changed the code so that it does not try to follow redirected
> line pointers; instead, it skips them with an appropriate message, as
> we were already doing for dead and unused line pointers. I think the
> way you had it coded might've been my suggestion originally, but the
> more I looked into it the less I liked it. One problem is that it
> didn't match the docs. A second is that following a corrupted line
> pointer might index off the end of the line pointer array, and while
> that probably shouldn't happen, we are talking about corruption
> recovery here. Then I realized that, as you coded it, if the line
> pointer was redirected to a line pointer that is in turn dead (or
> unused, if there's corruption) the user would get a NOTICE complaining
> about a TID they hadn't specified, which seems like it would be very
> confusing. I thought about trying to fix all that stuff, but it just
> didn't seem worth it, because I can't think of a good reason to pass
> this function the TID of a redirected line pointer in the first place.
> If you're doing surgery, you should probably specify the exact thing
> upon which you want to operate, not some other thing that points to
> it.
>
> Here is a list of other changes I made:
>
> * Added a .gitignore file.
> * Renamed the regression test file from pg_surgery to heap_surgery to
> match the name of the single C source file we currently have.
> * Capitalized TID in a few places.
> * Ran pgindent.
> * Adjusted various comments.
> * Removed the check for an empty TID array. I don't see any reason why
> this should be an error case and I don't see much precedent for having
> such a check.
> * Fixed the code to work properly with that change and added a test case.
> * Added a check that the array is not multi-dimensional.
> * Put the AM type check before the relkind check, following existing precedent.
> * Adjusted the check to use the AM OID rather than the handler OID,
> following existing precedent. Fixed the message wording accordingly.
> * Changed the documentation wording to say less about specific
> recovery procedures and focus more on the general idea that this is
> dangerous.
> * Removed all but one of the test cases that checked what happens if
> you use this on a non-heap; three tests for basically the same thing
> seemed excessive.
> * Added some additional tests to improve code coverage. There are now
> only a handful of lines not covered.
> * Reorganized the test cases somewhat.
>
> New patch attached.
>

Thank you Robert for the patch. I've looked into the changes you've
made to the v8 patch and they all look good to me.

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: Clang UndefinedBehaviorSanitize (Postgres14) Detected undefined-behavior
Next
From: Ashutosh Sharma
Date:
Subject: Re: recovering from "found xmin ... from before relfrozenxid ..."