Re: txid failed epoch increment, again, aka 6291 - Mailing list pgsql-hackers
From | Noah Misch |
---|---|
Subject | Re: txid failed epoch increment, again, aka 6291 |
Date | |
Msg-id | 20120904150425.GA28600@tornado.leadboat.com |
In response to | txid failed epoch increment, again, aka 6291 (Daniel Farina <daniel@heroku.com>) |
Responses | Re: txid failed epoch increment, again, aka 6291 |
List | pgsql-hackers |
On Tue, Sep 04, 2012 at 02:07:30AM -0700, Daniel Farina wrote:
> It seems like this has reproduced once more. And once again, there
> doesn't appear to be any funny business in pg_control (but the structs
> are pasted here for your re-check), and there are successful sensical
> updates to it.

The primary is running 9.0.6. What version is the standby running?

> However, we do have a new piece of data: there was a very brief period
> where txid_snapshot did report an xmin greater than 2^33, our next
> epoch boundary, by a few thousand transactions. That could be because
> the reporting function GetNextXidAndEpoch does its own epoch
> calculation before the checkpoint, and then after a checkpoint that
> forgets to increment the epoch there is no need to post-facto adjust
> the epoch anymore.

That makes sense.

> I've been reviewing the mechanism in CreateCheckPoint for this on and
> off for a couple of days, but so far I haven't come up with a
> convincing mechanism. However, given that it seems historically that
> this bug is more likely to surface than *not* surface on this system,
> perhaps we can try for a sometimes-reproducing test case. I'm still
> struggling for a hint of a solution, though, so toss your thoughts
> here.

The cause is not apparent to me, either, and simple tests here do not
reproduce the problem.

I am suspicious of xlog_redo() updating ControlFile->checkPointCopy
without acquiring ControlFileLock. If a restartpoint is finishing at
about the same time, ControlFile->checkPointCopy.nextXid might come
from the just-read checkpoint record while
ControlFile->checkPointCopy.nextXidEpoch bears the value from the older
checkpoint just adopted as a restartpoint. The resulting inconsistency
would, however, vanish at the next ControlFile->checkPointCopy update.
This does not explain the symptoms you have reported, and it cannot
explain much of anything on a primary.

> (gdb) p *ControlFile
> $4 = {system_identifier = 5613733157253676693, pg_control_version =
> 903, catalog_version_no = 201008051,
>   state = DB_IN_ARCHIVE_RECOVERY, time = 1346746898, checkPoint =

This capture, it seems, is from a standby.

>   {xlogid = 18751, xrecoff = 1072693824}, prevCheckPoint = {
>     xlogid = 18751, xrecoff = 1072693824}, checkPointCopy = {redo =
> {xlogid = 18751, xrecoff = 602482536}, ThisTimeLineID = 6,
>     nextXidEpoch = 1, nextXid = 9904084, nextOid = 2047524, nextMulti
> = 1119, nextMultiOffset = 3115, oldestXid = 4115479553,
>     oldestXidDB = 1, time = 1346746496, oldestActiveXid = 9558248},

You expected checkPointCopy = { ... nextXidEpoch = 2, ... }, correct?
(Well, nextXidEpoch = 3 if you count the previous missed increment.)
Does pg_control on the primary also bear epoch 1 where epoch 2 is
expected?

Your last restartpoint was exactly five minutes before your last
checkpoint, so there's no evidence of a dearth of safe restartpoint
opportunities.

nm
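
To make the epoch arithmetic above concrete, here is a minimal
standalone C sketch (not PostgreSQL source) of how a
GetNextXidAndEpoch-style reporting path and a CreateCheckPoint-style
epoch bump interact. The type and function names (CheckpointXidState,
report_txid, take_checkpoint) and the sample xid values are invented
for illustration; they are chosen only to land near the 2^33 boundary
Daniel mentioned.

    #include <stdint.h>
    #include <stdio.h>

    typedef uint32_t TransactionId;

    /* Epoch/xid pair saved by the last completed checkpoint (illustrative). */
    typedef struct
    {
        uint32_t      nextXidEpoch;
        TransactionId nextXid;
    } CheckpointXidState;

    /* How a GetNextXidAndEpoch-style reporting path derives a 64-bit txid. */
    static uint64_t
    report_txid(CheckpointXidState ckpt, TransactionId currentXid)
    {
        uint32_t epoch = ckpt.nextXidEpoch;

        /*
         * currentXid is logically >= the checkpoint's xid, so if it compares
         * numerically smaller, the 32-bit counter must have wrapped since the
         * checkpoint; report the next epoch on the fly.
         */
        if (currentXid < ckpt.nextXid)
            epoch++;

        return ((uint64_t) epoch << 32) | currentXid;
    }

    /* The adjustment a CreateCheckPoint-style path is expected to persist. */
    static CheckpointXidState
    take_checkpoint(CheckpointXidState prev, TransactionId currentXid)
    {
        CheckpointXidState next = { prev.nextXidEpoch, currentXid };

        if (currentXid < prev.nextXid)
            next.nextXidEpoch++;    /* the increment that apparently went missing */
        return next;
    }

    int
    main(void)
    {
        /* Last checkpoint was taken in epoch 1, shortly before the xid wrap. */
        CheckpointXidState ckpt = { 1, 4294000000u };
        TransactionId xid = 9904084;    /* has since wrapped into epoch 2 */

        printf("reported txid before next checkpoint: %llu (2^33 = %llu)\n",
               (unsigned long long) report_txid(ckpt, xid),
               (unsigned long long) (UINT64_C(1) << 33));

        /* A correct checkpoint bumps the epoch to 2 ... */
        CheckpointXidState good = take_checkpoint(ckpt, xid);
        printf("epoch after correct checkpoint: %u\n", good.nextXidEpoch);

        /*
         * ... but a checkpoint that records the new xid with the old epoch
         * makes the on-the-fly adjustment stop firing, so reports fall back
         * below 2^33.
         */
        CheckpointXidState buggy = { 1, xid };
        printf("reported txid after buggy checkpoint: %llu\n",
               (unsigned long long) report_txid(buggy, xid + 1000));
        return 0;
    }

Once a checkpoint records the post-wrap xid with the un-bumped epoch,
the reporting path's on-the-fly adjustment stops firing, which would
produce exactly the brief window of xmin values above 2^33 followed by
values back below it.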
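
The suspected xlog_redo()/restartpoint interaction can likewise be
pictured with a small standalone C sketch. It flattens one possible
interleaving into sequential stores so the outcome is deterministic,
assuming the redo side stores the epoch before the xid while the
process finishing the restartpoint (the bgwriter, in 9.0) overwrites
the whole pair under ControlFileLock; the names CheckPointXids and
controlCopy are illustrative, not the server's.

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative stand-in for the epoch/xid fields of checkPointCopy. */
    typedef struct
    {
        uint32_t nextXidEpoch;
        uint32_t nextXid;
    } CheckPointXids;

    int
    main(void)
    {
        CheckPointXids controlCopy;

        /* Older checkpoint just adopted as a restartpoint. */
        const CheckPointXids older = { 1, 4294000000u };
        /* Newer checkpoint record the startup process has just replayed. */
        const CheckPointXids newer = { 2, 9904084 };

        /* startup process (xlog_redo, no ControlFileLock): first store */
        controlCopy.nextXidEpoch = newer.nextXidEpoch;

        /* bgwriter (restartpoint, holding ControlFileLock): whole-struct copy */
        controlCopy = older;

        /* startup process: second store lands after the restartpoint's copy */
        controlCopy.nextXid = newer.nextXid;

        /* Outcome: the new xid paired with the old epoch. */
        printf("checkPointCopy ends as epoch %u, nextXid %u\n",
               controlCopy.nextXidEpoch, controlCopy.nextXid);
        return 0;
    }

As noted above, such a mix would be transient: the next consistent
update of checkPointCopy overwrites both fields together.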