Re: Bug in VACUUM FULL ? - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Bug in VACUUM FULL ? |
Date | |
Msg-id | 3413.1173476446@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Bug in VACUUM FULL ? ("Pavan Deolasee" <pavan.deolasee@enterprisedb.com>) |
Responses |
Re: Bug in VACUUM FULL ?
Re: Bug in VACUUM FULL ? Re: Bug in VACUUM FULL ? Re: Bug in VACUUM FULL ? |
List | pgsql-hackers |
"Pavan Deolasee" <pavan.deolasee@enterprisedb.com> writes: > The problem mentioned before is hard to reproduce with the > suggested change, but its not completely gone away. I have > seen that again on CVS HEAD with the patch applied. > I am facing another issue with VACUUM FULL. This > problem gets reproduced with HOT very easily, but takes > few attempts to reproduce with CVS HEAD, but it > certainly exists. I've been banging away on this since yesterday, and I think I've achieved a full understanding of what's going on. There are three or four different-looking pathologies but they all seem to arise from the same problem: the update-chain-moving code assumes that RECENTLY_DEAD tuples will never have update successors that are entirely DEAD (according to HeapTupleSatisfiesVacuum). When faced with an update chain in which that does happen, it can move the chain multiple times, neglect to remove index entries for tuples that get truncated away, crash on an Assert, or other interesting stuff. Here's an example from some debug printouts I inserted into repair_frag: chain forward branches 17/174 to 17/183 (x 1993046 1993057) RECENTLY_DEAD chain forward branches 17/183 to 15/109 (x 1993057 1993055) RECENTLY_DEAD chain forward branches 15/109 to 15/111 (x 1993055 1993045) DEAD chain forward branches 15/111 to 15/114 (x 1993045 1993025) DEAD chain forward branches 15/114 to 15/116 (x 1993025 1993096) RECENTLY_DEAD chain forward branches 15/116 to 15/119 (x 1993096 1993107) RECENTLY_DEAD chain forward branches 15/119 to 15/121 (x 1993107 1993120) RECENTLY_DEAD chain forward branches 15/121 to 15/125 (x 1993120 1993121) RECENTLY_DEAD chain forward branches 15/125 to 15/128 (x 1993121 1993122) RECENTLY_DEAD chain forward branches 15/128 to 15/131 (x 1993122 1993092) RECENTLY_DEAD chain forward branches 15/131 to 15/133 (x 1993092 1993145) RECENTLY_DEAD chain forward branches 15/133 to 15/139 (x 1993145 1993182) RECENTLY_DEAD chain forward branches 15/139 to 15/141 (x 1993182 1993183) RECENTLY_DEAD chain forward branches 15/141 to 15/147 (x 1993183 1993155) RECENTLY_DEAD chain forward branches 15/147 to 15/150 (x 1993155 1993167) LIVE chain back stops at branches 15/114: xmin 1993025 < 1993050 moved branches 15/150 to 0/69; next 0/69 moved branches 15/147 to 0/70; next 0/69 moved branches 15/141 to 0/71; next 0/70 moved branches 15/139 to 0/72; next 0/71 moved branches 15/133 to 0/73; next 0/72 moved branches 15/131 to 0/74; next 0/73 moved branches 15/128 to 0/75; next 0/74 moved branches 15/125 to 0/76; next 0/75 moved branches 15/121 to 0/77; next 0/76 moved branches 15/119 to 0/78; next 0/77 moved branches 15/116 to 0/79; next 0/78 moved branches 15/114 to 0/80; next 0/79 Since TIDs 17/174 and 17/183 didn't get moved, when the repair_frag search arrives at 17/183 it will copy this chain again, leading to duplicate copies of the LIVE tuple at the chain end, leading to trouble. It's not surprising that tuples could have xmax less than xmin, since transactions can commit in orders different than they start; when using READ COMMITTED updates it's not at all surprising that a transaction might update rows after a later-numbered transaction does. However, in looking at this code previously I'd assumed that the OldestXmin cutoff could never fall between two such transactions, and so the above scenario wouldn't happen. I'm not real sure why I thought that. For the cases that VACUUM FULL is interested in, both XIDs mentioned in a DEAD tuple must have committed before OldestXmin was computed, but there doesn't seem to be a compelling reason why OldestXmin might not have been determined by an unrelated third transaction with a number between those two. I believe it's the case that any update chain members appearing before a DEAD entry must in fact also be dead (ie, not listed as live in any active snapshot) but a test based on OldestXmin hasn't got enough resolution to prove that they are dead. Does anyone want to argue that this is an error in the calculation of OldestXmin (and if so, how would you propose to fix it)? If not, I'll set to work on fixing the chain-moving logic. I think the correct behavior is that we shouldn't consider DEAD tuples part of a movable chain, and so in the above example there are two separate chains to move not one. Alternatively we could try to recognize that the older part of the chain is really dead and removable, but that seems complicated and likely to introduce new bugs. I wonder whether this has any implications for HOT ... regards, tom lane
pgsql-hackers by date: