Thread: pg_stop_backup wait bug fix
Minor bug fix for pg_stop_backup() to prevent it waiting longer than necessary in certain circumstances. Was originally part of recovery_infrastructure patch. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Attachment
Simon Riggs wrote: > Minor bug fix for pg_stop_backup() to prevent it waiting longer than > necessary in certain circumstances. As far as I can tell, this patch needs to be applied to HEAD, 8.3 and 8.2 (when xlog switch was implemented), and in fact it applies cleanly to them. Please confirm. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Fri, 2008-11-07 at 18:32 -0300, Alvaro Herrera wrote: > Simon Riggs wrote: > > Minor bug fix for pg_stop_backup() to prevent it waiting longer than > > necessary in certain circumstances. > > As far as I can tell, this patch needs to be applied to HEAD, 8.3 and > 8.2 (when xlog switch was implemented), and in fact it applies cleanly > to them. Please confirm. There are other patches that we should consider along with this one. This particular patch only has meaning when applied with the changes to pg_stop_backup() earlier in 8.4dev cycle. It isn't a bug in xlog switch, it is a bug in how the new waiting code interprets the return value from xlog switch. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
Hi, On Wed, Oct 8, 2008 at 10:23 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > Minor bug fix for pg_stop_backup() to prevent it waiting longer than > necessary in certain circumstances. Why don't you use XLByteToPrevSeg like pg_xlogfile_name? I think that we should uniform the logic as much as possible. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Fujii Masao wrote: > On Wed, Oct 8, 2008 at 10:23 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Minor bug fix for pg_stop_backup() to prevent it waiting longer than >> necessary in certain circumstances. > > Why don't you use XLByteToPrevSeg like pg_xlogfile_name? > I think that we should uniform the logic as much as possible. Agreed, should use XLByteToPrevSeg. But I wonder if we can just replace the current XLByteToSeg call with XLByteToPrevSeg? That would offset the return value of the function by one byte as well, as well as the value printed to the backup history file. In fact, I think the original patch got that wrong; it would return the location of the *beginning* of the last xlog file. I also noticed that the 2nd BackupHistoryFileName call in that function is useless; histfilepath variable is already filled in earlier. How does the attached patch look to you? Do you have an easy way to test this? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com Index: src/backend/access/transam/xlog.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/access/transam/xlog.c,v retrieving revision 1.322 diff -c -r1.322 xlog.c *** src/backend/access/transam/xlog.c 9 Nov 2008 17:51:15 -0000 1.322 --- src/backend/access/transam/xlog.c 2 Dec 2008 20:10:03 -0000 *************** *** 6674,6679 **** --- 6674,6680 ---- char histfilepath[MAXPGPATH]; char startxlogfilename[MAXFNAMELEN]; char stopxlogfilename[MAXFNAMELEN]; + char lastxlogfilename[MAXFNAMELEN]; uint32 _logId; uint32 _logSeg; FILE *lfp; *************** *** 6711,6716 **** --- 6712,6720 ---- XLByteToSeg(stoppoint, _logId, _logSeg); XLogFileName(stopxlogfilename, ThisTimeLineID, _logId, _logSeg); + XLByteToPrevSeg(stoppoint, _logId, _logSeg); + XLogFileName(lastxlogfilename, ThisTimeLineID, _logId, _logSeg); + /* Use the log timezone here, not the session timezone */ stamp_time = (pg_time_t) time(NULL); pg_strftime(strfbuf, sizeof(strfbuf), *************** *** 6801,6813 **** * we assume the admin wanted his backup to work completely. If you * don't wish to wait, you can set statement_timeout. */ - BackupHistoryFileName(histfilepath, ThisTimeLineID, _logId, _logSeg, - startpoint.xrecoff % XLogSegSize); - seconds_before_warning = 60; waits = 0; ! while (XLogArchiveIsBusy(stopxlogfilename) || XLogArchiveIsBusy(histfilepath)) { CHECK_FOR_INTERRUPTS(); --- 6805,6814 ---- * we assume the admin wanted his backup to work completely. If you * don't wish to wait, you can set statement_timeout. */ seconds_before_warning = 60; waits = 0; ! while (XLogArchiveIsBusy(lastxlogfilename) || XLogArchiveIsBusy(histfilepath)) { CHECK_FOR_INTERRUPTS();
On Wed, Dec 3, 2008 at 5:13 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > Fujii Masao wrote: >> >> On Wed, Oct 8, 2008 at 10:23 PM, Simon Riggs <simon@2ndquadrant.com> >> wrote: >>> >>> Minor bug fix for pg_stop_backup() to prevent it waiting longer than >>> necessary in certain circumstances. >> >> Why don't you use XLByteToPrevSeg like pg_xlogfile_name? >> I think that we should uniform the logic as much as possible. > > Agreed, should use XLByteToPrevSeg. But I wonder if we can just replace the > current XLByteToSeg call with XLByteToPrevSeg? That would offset the return > value of the function by one byte as well, as well as the value printed to > the backup history file. In fact, I think the original patch got that wrong; > it would return the location of the *beginning* of the last xlog file. You're right. As you say, the value (stopxlogfilename) printed to the backup history file is wrong. But, since the value is not used fortunately, any troubles have not come up. So, I think that we can just replace them. > > I also noticed that the 2nd BackupHistoryFileName call in that function is > useless; histfilepath variable is already filled in earlier. Somewhat confusingly, BackupHistoryFileName is called only once. Isn't 1st (which probably you thought) BackupHistoryFilePath? In order to prevent confusion, we should add new local variable (histfilename) for the backup history file name? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Fujii Masao wrote: > On Wed, Dec 3, 2008 at 5:13 AM, Heikki Linnakangas > <heikki.linnakangas@enterprisedb.com> wrote: >> Agreed, should use XLByteToPrevSeg. But I wonder if we can just replace the >> current XLByteToSeg call with XLByteToPrevSeg? That would offset the return >> value of the function by one byte as well, as well as the value printed to >> the backup history file. In fact, I think the original patch got that wrong; >> it would return the location of the *beginning* of the last xlog file. > > You're right. As you say, the value (stopxlogfilename) printed to the backup > history file is wrong. But, since the value is not used fortunately, > any troubles > have not come up. So, I think that we can just replace them. Changing the return value doesn't seem like a good idea. If nothing else, it would be complicated to explain what it returns. I committed a patch that changes the waiting behavior, but not the return value or what's written into the backup label file, >> I also noticed that the 2nd BackupHistoryFileName call in that function is >> useless; histfilepath variable is already filled in earlier. > > Somewhat confusingly, BackupHistoryFileName is called only once. Isn't 1st > (which probably you thought) BackupHistoryFilePath? Ouch, you're right. That's subtle. > In order to prevent > confusion, we should add new local variable (histfilename) for the backup > history file name? Agreed. I included that in the patch. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Hi, I have looked at the patch and it looks OK to me. BTW I am not too much familiar with this area of code, so I am not at the position to argue that patch -:) . I haven't found an easy way to test the patch. On Wed, Dec 3, 2008 at 1:24 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > Fujii Masao wrote: >> >> On Wed, Dec 3, 2008 at 5:13 AM, Heikki Linnakangas >> <heikki.linnakangas@enterprisedb.com> wrote: >>> >>> Agreed, should use XLByteToPrevSeg. But I wonder if we can just replace >>> the >>> current XLByteToSeg call with XLByteToPrevSeg? That would offset the >>> return >>> value of the function by one byte as well, as well as the value printed >>> to >>> the backup history file. In fact, I think the original patch got that >>> wrong; >>> it would return the location of the *beginning* of the last xlog file. >> >> You're right. As you say, the value (stopxlogfilename) printed to the >> backup >> history file is wrong. But, since the value is not used fortunately, >> any troubles >> have not come up. So, I think that we can just replace them. > > Changing the return value doesn't seem like a good idea. If nothing else, it > would be complicated to explain what it returns. I committed a patch that > changes the waiting behavior, but not the return value or what's written > into the backup label file, > >>> I also noticed that the 2nd BackupHistoryFileName call in that function >>> is >>> useless; histfilepath variable is already filled in earlier. >> >> Somewhat confusingly, BackupHistoryFileName is called only once. Isn't 1st >> (which probably you thought) BackupHistoryFilePath? > > Ouch, you're right. That's subtle. > >> In order to prevent >> confusion, we should add new local variable (histfilename) for the backup >> history file name? > > Agreed. I included that in the patch. > > -- > Heikki Linnakangas > EnterpriseDB http://www.enterprisedb.com > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Ibrar Ahmed EnterpriseDB http://www.enterprisedb.com