Thread: Can't restart postmaster!
I had postgresql die this morning because of lack of disk space on /var (base grew to 422MB fairly quickly, and I'm not sure why yet... pg_xlog is at 16450). So I thought I'd go in and look at some table sizes to try and figure out what's so large, but I can't restart postmaster to look at the tables! Here's the log output:

========================================
postmaster successfully started
DEBUG: database system was shut down at 2001-06-01 07:49:04 MST
DEBUG: CheckPoint record at (0, 821998612)
DEBUG: Redo record at (0, 821998612); Undo record at (0, 0); Shutdown TRUE
DEBUG: NextTransactionId: 251410; NextOid: 2612961
FATAL 2: ZeroFill(/var/lib/pgsql/data/pg_xlog/xlogtemp.16747) failed: No such file or directory
/usr/bin/postmaster: Startup proc 16747 exited with status 512 - abort
========================================

Sure enough, there is no xlogtemp.16747 in pg_xlog, just a file "0000000000000030".

Is there any way to recover from this? There are some tables I can safely remove and reconstruct, but other tables that I'd really like to preserve. However, with the new file naming scheme I can't tell what's what without running pgsql (or equivalent), which requires a running postmaster.

Thanks for any help!!
--
Steve Wampler - SOLIS Project, National Solar Observatory
swampler@noao.edu
Steve Wampler <swampler@noao.edu> writes:
> FATAL 2: ZeroFill(/var/lib/pgsql/data/pg_xlog/xlogtemp.16747) failed: No such file or directory

The error message is bogus --- almost certainly, the real problem is not enough free space to create another 16-MB WAL segment. (I have a TODO item about making sure that this case doesn't return a misleading error code...)

As for recovery, if you can't free any space elsewhere, try removing the oldest (lowest-numbered) WAL segment file. If the thing refuses to start after you do that, you might have to resort to applying contrib/pg_resetxlog, but don't do that unless you have to.

BTW, if you have WAL_FILES set higher than zero, set it back to zero until you're out of the woods.

regards, tom lane
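The free-space diagnosis above can be checked before attempting a restart. The following is a minimal sketch, not part of PostgreSQL: the function name `can_fit_wal_segment` is hypothetical, and the 16 MB figure is taken from the message above. It only reports whether the filesystem holding a given directory has room for one more segment.

```python
import shutil

# Segment size as stated in the thread (16 MB); treat it as an assumption
# for any other installation.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def can_fit_wal_segment(path, segment_bytes=WAL_SEGMENT_BYTES):
    """Return True if the filesystem holding `path` has at least
    `segment_bytes` of free space (hypothetical helper)."""
    free = shutil.disk_usage(path).free
    return free >= segment_bytes
```

For the installation in this thread, the directory to check would be /var/lib/pgsql/data/pg_xlog, for example `can_fit_wal_segment("/var/lib/pgsql/data/pg_xlog")`.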
Steve Wampler <swampler@noao.edu> writes:
> Are the WAL segment files the ones located in pg_xlog?

Right, the ones with sixteen-hex-digit filenames. Sorry for not being perfectly clear.

regards, tom lane
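To pick out the segment files described above, one option is to match names that are exactly sixteen hex digits and order them numerically, so the lowest-numbered (oldest) segment comes first. This is a sketch under that assumption; `wal_segments` is a hypothetical helper, and it only lists files, it does not remove anything.

```python
import os
import re

# Exactly sixteen hexadecimal digits, as described in the thread.
WAL_NAME = re.compile(r"[0-9A-Fa-f]{16}")


def wal_segments(xlog_dir):
    """Return the sixteen-hex-digit filenames in `xlog_dir`,
    lowest-numbered first (hypothetical helper)."""
    names = [n for n in os.listdir(xlog_dir) if WAL_NAME.fullmatch(n)]
    # Interpret each name as a hex number so ordering is numeric.
    return sorted(names, key=lambda n: int(n, 16))
```

With only the single file "0000000000000030" present, as in the original report, the list would contain just that one name, so there would be no older segment to remove.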
> Steve Wampler <swampler@noao.edu> writes:
> > Are the WAL segment files the ones located in pg_xlog?
>
> Right, the ones with sixteen-hex-digit filenames. Sorry for not being
> perfectly clear.

Am I correct that with no UNDO in 7.1.X, we should only be keeping one WAL file around, or maybe two?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> > Am I correct that with no UNDO in 7.1.X, we should only be keeping one
> > WAL file around, or maybe two?
>
> Not necessarily. How much do you do between checkpoints?
>
> But yeah, there's no reason to save data further back than one or maybe
> two checkpoints, as long as UNDO isn't there.

I am thinking of long transactions that keep extra WAL files around.

--
  Bruce Momjian
> Am I correct that with no UNDO in 7.1.X, we should only be keeping one
> WAL file around, or maybe two?

Not necessarily. How much do you do between checkpoints?

But yeah, there's no reason to save data further back than one or maybe two checkpoints, as long as UNDO isn't there.

regards, tom lane