Thread: A problem
I have a problem partly caused by stupidity, which I desperately need to fix if at all possible. Me and a colleague have tried fixing it and may've made it worse... I have a reasonably big db (21GB on disk, fully vaccuumed). I upgraded from 7.3.2 to 7.3.3, forgetting to shut it down cleanly first. It then got killed, for reasons best forgotten, with signal 15 to th main postmaster. On attempting to restart it, an error with the initscript, which came in teh source RPM, led to it partially stomping some bits of the DB, including pg_control in global. I have the old xlogs, clogs, everything. I removed the 'new' data subdirectory. The init script was pointed at /db/data when it should've been just /db, I still have the old /data/base dir, all that has been altered in there (recent mtime) is base/1175459/pg_internal.init Attemtping to use the pg_resetxlog command (having backed up pg_control and all the xlogs), we managed to start the server again - however, on connecting to it with psql, it still had the right DB in teh cluster ('stats') but it claimed this db was completely empty. I have kept everything I've tried reversible as far as my and my colleagues knowledge allowed. I hope someone can suggest something - I'm desperate! It'll take weeks to re-input that data automatically. Thanks -- Sam Barnett-Cormack Software Developer | Student of Physics & Maths UK Mirror Service (http://www.mirror.ac.uk) | Lancaster University
Sam Barnett-Cormack <s.barnett-cormack@lancaster.ac.uk> writes: > [ accidentally clobbered pg_control ] It sounds like the main problem is too small a value in the XID (transaction ID) counter, so that all your data tuples appear to be in the future. pg_resetxlog has an option to force an initial XID but you need to use it. You can guess a suitable XID by looking at the names of the old clog files --- you want something that will be past the end of the newest clog file, keeping in mind that there are 1M (1024^2) XIDs per clog file and that the clog file names are in hex. (I am not sure, but you might also need to provide a zero-filled clog file spanning the point you start XID at.) You will also need to be sure that the WAL file position is greater than the end of the old WAL --- again, look at the existing WAL files to see the largest number, and pick something bigger. BTW, make real sure that pg_control is set up with the same locale values as you were using before, or you'll be in a world of hurt with corrupt indexes. Use pg_controldata to look at what you have put into pg_control. (If you're not sure what was there before, get an old copy of pg_control off your last backup tape ... you do have a filesystem backup someplace, I trust ...) > I still have the old > /data/base dir, all that has been altered in there (recent mtime) is > base/1175459/pg_internal.init You can and probably should blow away pg_internal.init --- it's just a cache file and can be rebuilt. I'd not trust it at this point. regards, tom lane
Late as it is, I'll try this monday. I hope it works, other suggestions still welcome of course, thanks very much. On Fri, 27 Jun 2003, Tom Lane wrote: > Sam Barnett-Cormack <s.barnett-cormack@lancaster.ac.uk> writes: > > [ accidentally clobbered pg_control ] > > It sounds like the main problem is too small a value in the XID > (transaction ID) counter, so that all your data tuples appear to be in > the future. pg_resetxlog has an option to force an initial XID but > you need to use it. You can guess a suitable XID by looking at the > names of the old clog files --- you want something that will be past > the end of the newest clog file, keeping in mind that there are 1M > (1024^2) XIDs per clog file and that the clog file names are in hex. > (I am not sure, but you might also need to provide a zero-filled clog > file spanning the point you start XID at.) > > You will also need to be sure that the WAL file position > is greater than the end of the old WAL --- again, look at the existing > WAL files to see the largest number, and pick something bigger. > > BTW, make real sure that pg_control is set up with the same locale > values as you were using before, or you'll be in a world of hurt > with corrupt indexes. Use pg_controldata to look at what you have put > into pg_control. (If you're not sure what was there before, get an > old copy of pg_control off your last backup tape ... you do have a > filesystem backup someplace, I trust ...) > > > I still have the old > > /data/base dir, all that has been altered in there (recent mtime) is > > base/1175459/pg_internal.init > > You can and probably should blow away pg_internal.init --- it's just a > cache file and can be rebuilt. I'd not trust it at this point. > > regards, tom lane > -- Sam Barnett-Cormack Software Developer | Student of Physics & Maths UK Mirror Service (http://www.mirror.ac.uk) | Lancaster University