Re: [BUG] Panic due to incorrect missingContrecPtr after promotion - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [BUG] Panic due to incorrect missingContrecPtr after promotion |
Date | |
Msg-id | CA+TgmoaNPKgeYfgvK1u-tP9mB60GVK8KCDJ7mQ4qAGnPf955WQ@mail.gmail.com Whole thread Raw |
In response to | Re: [BUG] Panic due to incorrect missingContrecPtr after promotion (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: [BUG] Panic due to incorrect missingContrecPtr after promotion
|
List | pgsql-hackers |
On Mon, Jun 20, 2022 at 7:28 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > Hmm. I have not looked at that in depth, but if the intention is to > > check that the database is able to write WAL, looking at > > XLogCtl->SharedRecoveryState would be the way to go because that's the > > flip switching between crash recovery, archive recovery and the end of > > recovery (when WAL can be safely written). > > What we are checking there is "we are going to write WAL", not "we are > writing". > > (!StanbyMode && StandbyModeRequested) means the server have been > running crash-recovery before starting archive recovery. In that > case, the server is supposed to continue with archived WAL without > insering a record. However, if no archived WAL available and the > server promoted, the server needs to insert an "aborted contrecord" > record again. (I'm not sure how that case happens in the field, > though.) > > So I don't think !StandbyModeRequested is the right thing > here. Actually the attached test fails with the fix. It seems to me that what we want to do is: if we're about to start allowing WAL writes, then consider whether to insert an aborted contrecord record. Now, if we are about to start allowing WAL write, we must determine the LSN at which the first such record will be written. Then there are two possibilities: either that LSN is on an existing record boundary, or it isn't. In the former case, no aborted contrecord record is required. In the latter case, we're writing at LSN which would have been in the middle of a previous record, except that the record in question was never finished or at least we don't have all of it. So the first WAL we insert at our chosen starting LSN must be an aborted contrecord record. I'm not quite sure I understand the nature of the remaining bug here, but this logic seems quite sketchy to me: /* * When not in standby mode we find that WAL ends in an incomplete * record, keep track of that record. After recovery is done, * we'll write a record to indicate to downstream WAL readers that * that portion is to be ignored. */ if (!StandbyMode && !XLogRecPtrIsInvalid(xlogreader->abortedRecPtr)) { abortedRecPtr = xlogreader->abortedRecPtr; missingContrecPtr = xlogreader->missingContrecPtr; } I don't see what StandbyMode has to do with anything here. The question is whether the place where we're going to begin writing WAL is on an existing record boundary or not. If it so happens that when StandbyMode is turned on we can never decide to promote in the middle of a record, then maybe there's no need to track this information when StandbyMode = true, but I don't see a good reason why we should make that assumption. What if in the future somebody added a command that says "don't keep trying to read more WAL, just promote RIGHT HERE?". I think this logic would surely be incorrect in that case. It feels to me like the right thing to do is to always keep track if WAL ends in an incomplete record, and then when we promote, we write an aborted contrecord record if WAL ended in an incomplete record, full stop. Now, we do need to keep in mind that, while in StandbyMode, we might reach the end of WAL more than once, because we have a polling loop that looks for more WAL. So it does not work to just set the values once and then assume that's the whole truth forever. But why not handle that by storing the abortedRecPtr and missingContrecPtr unconditionally after every call to XLogPrefetcherReadRecord()? They will go back and forth between XLogRecPtrIsInvalid and other values many times but the last value should -- I think -- be however things ended up at the point where we decided to stop reading WAL. I haven't really dived into this issue much so I might be totally off base, but it just doesn't feel right to me that this should depend on whether we're in standby mode. That seems at best incidentally related to whether an aborted contrecord record is required. P.S. "aborted contrecord record" is really quite an awkward turn of phrase! -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: