Re: recovering from "found xmin ... from before relfrozenxid ..." - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: recovering from "found xmin ... from before relfrozenxid ..." |
Date | |
Msg-id | CA+fd4k67ke0uj4ioK03o7bepxHadyM64c+veop+np_GKZk3TsA@mail.gmail.com Whole thread Raw |
In response to | Re: recovering from "found xmin ... from before relfrozenxid ..." (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Fri, 28 Aug 2020 at 23:39, Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Aug 28, 2020 at 4:07 AM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > You've removed the description about executing VACUUM with > > DISABLE_PAGE_SKIPPING option on the target relation after using > > pg_surgery functions from the doc but I guess it’s better to recommend > > that in the doc for safety. Could you please tell me the reason for > > removing that? > > Well, I think that was added because there wasn't code to clear the > visibility map bits, either page-level in the map, but we added code > for that, so now I don't really see why it's necessary or even > desirable. > > Here are a few example scenarios: > > 1. My table is not corrupt. For no particular reason, I force-freeze > or force-kill a tuple which is neither dead nor all-visible. > Concurrent queries might return wrong answers, but the table is not > corrupt. It does not require VACUUM and would not benefit from it. > Actually, it doesn't need anything at all. > > 2. My table is not corrupt. For no particular reason, I force-freeze a > tuple which is dead. I believe it's possible that the index entries > for that tuple might be gone already, but VACUUM won't fix that. > REINDEX or a table rewrite would, though. It's also possible, if the > dead tuple was added by an aborted transaction which added columns to > the table, that the tuple might have been created using a tuple > descriptor that differs from the table's current tuple descriptor. If > so, I think scanning the table could produce a crash. VACUUM won't fix > this, either. I would need to delete or force-kill the offending > tuple. > > 3. I have one or more tuples in my table that are intact except that > they have garbage values for xmin, resulting in VACUUM failure or > possibly even SELECT failure if the CLOG entries are also missing. I > force-kill or force-freeze them. If by chance the affected tuples were > also omitted from one or more indexes, a REINDEX or table rewrite is > needed to fix them, but a VACUUM will not help. On the other hand, if > those tuples are present in the indexes, there's no remaining problem > and VACUUM is not needed for the purpose of restoring the integrity of > the table. If the problem has been ongoing for a while, VACUUM might > be needed to advance relfrozenxid, but that doesn't require > DISABLE_PAGE_SKIPPING. > > 4. I have some pages in my table that have incorrect visibility map > bits. In this case, I need VACUUM (DISABLE_PAGE_SKIPPING). However, I > don't need the functions we're talking about here at all unless I also > have tuples with corrupted visibility information. If I do happen to > have both tuples with corrupted visibility information and also pages > with incorrect visibility map bits, then I suppose I need both these > tools and also VACUUM (DISABLE_PAGE_SKIPPING). Probably, I'll want to > do the VACUUM second. But, if I happened to do the VACUUM first and > then use these functions afterward, the worst thing that could happen > is that I might end up with a some dead tuples that could've gotten > removed faster if I'd switched the order. And that's not a disaster. > > Basically, I can see no real reason to recommend VACUUM > (DISABLE_PAGE_SKIPPING) here. There are problems that can be fixed > with that command, and there are problems that can be fixed by this > method, but they are mostly independent of each other. We should not > recommend that people run VACUUM "just in case." That kind of fuzzy > thinking seems relatively prevalent already, and it leads to people > spending a lot of time running slow maintenance commands that do > nothing to help them, and which occasionally make things worse. > Thank you for your explanation. That very makes sense to me. If vacuum could fix the particular kind of problem by using together with pg_surgery we could recommend using vacuum. But I agree that the corruption of heap table is not the case. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: