Re: PITR - Some data is not recovered. - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: PITR - Some data is not recovered. |
Date | |
Msg-id | 1690.1092173467@sss.pgh.pa.us Whole thread Raw |
In response to | PITR - Some data is not recovered. (OKADA Satoshi <okada.satoshi@lab.ntt.co.jp>) |
Responses |
Re: PITR - Some data is not recovered.
|
List | pgsql-hackers |
OKADA Satoshi <okada.satoshi@lab.ntt.co.jp> writes: > I'm testing PITR using pgbench and postgresql ver.8.0bata. > I think that result of recovery is wrong. Many thanks for this bug report. I have developed a patch (attached) that seems to fix it for me. Please test and see if you can still cause the problem. BTW, I found that it was much easier to cause the bug to happen if you force a long commit delay. To do this you need to havefsync = oncommit_siblings = 1commit_delay = 100000 and while the pgbench run is going, start a new psql and dobegin;create table foo(f1 int); (or any other command that modifies the database). If you are watching things with "top" you should notice a drastic decrease in the CPU consumption of the backend connected to pgbench as soon as you do this. Then proceed with taking the on-line backup. You can end the extra psql session after the backup is done, to allow pgbench to go back to normal speed. Doing it this way I got a failure just about every time (3 failures in 3 tries) whereas without, I had seen only one failure in four tries. regards, tom lane Index: src/backend/access/transam/xact.c =================================================================== RCS file: /cvsroot/pgsql-server/src/backend/access/transam/xact.c,v retrieving revision 1.177 diff -c -r1.177 xact.c *** src/backend/access/transam/xact.c 3 Aug 2004 15:57:26 -0000 1.177 --- src/backend/access/transam/xact.c 10 Aug 2004 20:55:48 -0000 *************** *** 574,586 **** START_CRIT_SECTION(); /* ! * We only need to log the commit in XLOG if the transaction made ! * any transaction-controlled XLOG entries or will delete files. * (If it made no transaction-controlledXLOG entries, its XID * appears nowhere in permanent storage, so no one else will ever care ! * if it committed.) */ madeTCentries = (MyLastRecPtr.xrecoff != 0); if (madeTCentries|| nrels > 0) { XLogRecData rdata[3]; --- 574,601 ---- START_CRIT_SECTION(); /* ! * If our transaction made any transaction-controlled XLOG entries, ! * we need to lock out checkpoint start between writing our XLOG ! * record and updating pg_clog. Otherwise it is possible for the ! * checkpoint to set REDO after the XLOG record but fail to flush the ! * pg_clog update to disk, leading to loss of the transaction commit ! * if we crash a little later. Slightly klugy fix for problem ! * discovered 2004-08-10. ! * * (If it made no transaction-controlled XLOG entries, its XID * appears nowhere in permanentstorage, so no one else will ever care ! * if it committed; so it doesn't matter if we lose the commit flag.) ! * ! * Note we only need a shared lock. */ madeTCentries = (MyLastRecPtr.xrecoff != 0); + if (madeTCentries) + LWLockAcquire(CheckpointStartLock, LW_SHARED); + + /* + * We only need to log the commit in XLOG if the transaction made + * any transaction-controlled XLOG entries or will delete files. + */ if (madeTCentries || nrels > 0) { XLogRecData rdata[3]; *************** *** 668,673 **** --- 683,692 ---- TransactionIdCommitTree(nchildren, children); } + /* Unlock checkpoint lock if we acquired it */ + if (madeTCentries) + LWLockRelease(CheckpointStartLock); + END_CRIT_SECTION(); } *************** *** 850,855 **** --- 869,876 ---- * * We do not flush XLOG to disk unless deleting files, since the * default assumptionafter a crash would be that we aborted, anyway. + * For the same reason, we don't need to worry about interlocking + * against checkpoint start. */ if (MyLastRecPtr.xrecoff != 0 || nrels > 0) { Index: src/backend/access/transam/xlog.c =================================================================== RCS file: /cvsroot/pgsql-server/src/backend/access/transam/xlog.c,v retrieving revision 1.158 diff -c -r1.158 xlog.c *** src/backend/access/transam/xlog.c 9 Aug 2004 16:26:01 -0000 1.158 --- src/backend/access/transam/xlog.c 10 Aug 2004 20:55:49 -0000 *************** *** 4699,4704 **** --- 4699,4713 ---- checkPoint.ThisTimeLineID = ThisTimeLineID; checkPoint.time = time(NULL); + /* + * We must hold CheckpointStartLock while determining the checkpoint + * REDO pointer. This ensures that any concurrent transaction commits + * will be either not yet logged, or logged and recorded in pg_clog. + * See notes in RecordTransactionCommit(). + */ + LWLockAcquire(CheckpointStartLock, LW_EXCLUSIVE); + + /* And we need WALInsertLock too */ LWLockAcquire(WALInsertLock, LW_EXCLUSIVE); /* *************** *** 4731,4736 **** --- 4740,4746 ---- ControlFile->checkPointCopy.redo.xrecoff) { LWLockRelease(WALInsertLock); + LWLockRelease(CheckpointStartLock); LWLockRelease(CheckpointLock); END_CRIT_SECTION(); return; *************** *** 4789,4794 **** --- 4799,4807 ---- * GetSnapshotData needs to get XidGenLock while holding SInvalLock, * so there's a risk of deadlock.Need to find a better solution. See * pgsql-hackers discussion of 17-Dec-01. + * + * XXX actually, the whole UNDO code is dead code and unlikely to ever + * be revived, so the lack of a good solution here is not troubling. */ #ifdef NOT_USED checkPoint.undo =GetUndoRecPtr(); *************** *** 4798,4808 **** #endif /* ! * Now we can release insert lock, allowing other xacts to proceed ! * even while we are flushing disk buffers. */ LWLockRelease(WALInsertLock); /* * Get the otherinfo we need for the checkpoint record. */ --- 4811,4823 ---- #endif /* ! * Now we can release insert lock and checkpoint start lock, allowing ! * other xacts to proceed even while we are flushing disk buffers. */ LWLockRelease(WALInsertLock); + LWLockRelease(CheckpointStartLock); + /* * Get the other info we need for the checkpoint record. */ Index: src/include/storage/lwlock.h =================================================================== RCS file: /cvsroot/pgsql-server/src/include/storage/lwlock.h,v retrieving revision 1.12 diff -c -r1.12 lwlock.h *** src/include/storage/lwlock.h 11 Jun 2004 16:43:24 -0000 1.12 --- src/include/storage/lwlock.h 10 Aug 2004 20:55:49 -0000 *************** *** 36,41 **** --- 36,42 ---- WALWriteLock, ControlFileLock, CheckpointLock, + CheckpointStartLock, RelCacheInitLock, BgWriterCommLock,
pgsql-hackers by date: