Thread: Can't restart postmaster!

Can't restart postmaster!

From
Steve Wampler
Date:
I had postgresql die this morning because of lack of disk space
on /var (base grew to 422MB fairly quickly, and I'm not sure why
yet); pg_xlog is at 16450.

So, I thought I'd go in and look at some table sizes to try and
figure out what's so large, but I can't restart postmaster to
look at the tables!  Here's the log output:
========================================
postmaster successfully started
DEBUG:  database system was shut down at 2001-06-01 07:49:04 MST
DEBUG:  CheckPoint record at (0, 821998612)
DEBUG:  Redo record at (0, 821998612); Undo record at (0, 0); Shutdown TRUE
DEBUG:  NextTransactionId: 251410; NextOid: 2612961
FATAL 2:  ZeroFill(/var/lib/pgsql/data/pg_xlog/xlogtemp.16747) failed: No such file or directory
/usr/bin/postmaster: Startup proc 16747 exited with status 512 - abort
========================================
Sure enough, there is no xlogtemp.16747 in pg_xlog, just a file
"0000000000000030".

Is there any way to recover from this?  There are some tables I can
safely remove and reconstruct, but other tables that I'd really
like to preserve.  However, with the new file naming scheme I can't
tell what's what without running pgsql (or equivalent), which
requires a running postmaster.
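
As a side check, the one file present in pg_xlog appears consistent with
the checkpoint position in the log above.  The sketch below assumes 16 MB
WAL segments and assumes that a segment file name is eight hex digits of
log id followed by eight hex digits of segment number; the function name
is made up for illustration.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: 16 MB per WAL segment


def wal_segment_name(log_id: int, offset: int,
                     segment_size: int = SEGMENT_SIZE) -> str:
    """Return the assumed 16-hex-digit segment file name for a WAL position."""
    # The segment number is how many whole segments precede this offset.
    segment_number = offset // segment_size
    return f"{log_id:08X}{segment_number:08X}"


# Position taken from the "CheckPoint record at (0, 821998612)" log line.
print(wal_segment_name(0, 821998612))  # prints 0000000000000030
```

Under those assumptions the checkpoint falls in segment 48 (hex 30), which
matches the file that is actually there.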

Thanks for any help!!

--
Steve Wampler-  SOLIS Project, National Solar Observatory
swampler@noao.edu

Re: Can't restart postmaster!

From
Tom Lane
Date:
Steve Wampler <swampler@noao.edu> writes:
> FATAL 2:  ZeroFill(/var/lib/pgsql/data/pg_xlog/xlogtemp.16747) failed: No such file or directory

The error message is bogus --- almost certainly, the real problem is not
enough free space to create another 16-MB WAL segment.  (I have a TODO
item about making sure that this case doesn't return a misleading error
code...)
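
One way to confirm the free-space theory before touching anything is to
compare the space left on the filesystem against one segment's size.  This
is only a rough sketch: the function name is hypothetical, and the path is
a placeholder to be replaced with the real pg_xlog directory.

```python
import shutil

SEGMENT_SIZE = 16 * 1024 * 1024  # one 16 MB WAL segment


def can_create_segment(directory: str,
                       segment_size: int = SEGMENT_SIZE) -> bool:
    """Report whether the filesystem holding `directory` has room for a segment."""
    # disk_usage() returns (total, used, free) in bytes for that filesystem.
    free_bytes = shutil.disk_usage(directory).free
    return free_bytes >= segment_size


# "." is a stand-in; point this at the pg_xlog directory in practice.
print(can_create_segment("."))
```

If this reports False for the pg_xlog directory, the ZeroFill failure is
most likely a disk-full condition rather than a missing file.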

As for recovery, if you can't free any space elsewhere, try removing the
oldest (lowest-numbered) WAL segment file.  If the thing refuses to
start after you do that, you might have to resort to applying
contrib/pg_resetxlog, but don't do that unless you have to.
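
To pick out the oldest segment without guessing, something like the
following could be used.  It is a sketch that assumes segment files are
exactly the sixteen-hex-digit names in pg_xlog; the function name and the
demo file names are made up, and it only reports the candidate, it does
not delete anything.

```python
import os
import re
import tempfile

# Assumption: a WAL segment file name is exactly sixteen hex digits.
SEGMENT_NAME = re.compile(r"[0-9A-Fa-f]{16}")


def oldest_wal_segment(xlog_dir):
    """Return the lowest-numbered segment file name, or None if there is none."""
    segments = [name for name in os.listdir(xlog_dir)
                if SEGMENT_NAME.fullmatch(name)]
    # Compare by numeric value so the result does not depend on letter case.
    return min(segments, key=lambda name: int(name, 16), default=None)


# Demonstration against a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("000000000000002F", "0000000000000030", "xlogtemp.16747"):
        open(os.path.join(demo_dir, name), "w").close()
    print(oldest_wal_segment(demo_dir))  # prints 000000000000002F
```

Moving the reported file to another filesystem, rather than deleting it
outright, keeps the option of putting it back.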

BTW, if you have WAL_FILES set higher than zero, set it back to zero
until you're out of the woods.

            regards, tom lane

Re: Can't restart postmaster!

From
Tom Lane
Date:
Steve Wampler <swampler@noao.edu> writes:
> Are the WAL segment files the ones located in pg_xlog?

Right, the ones with sixteen-hex-digit filenames.  Sorry for not being
perfectly clear.

            regards, tom lane

Re: Can't restart postmaster!

From
Bruce Momjian
Date:
> Steve Wampler <swampler@noao.edu> writes:
> > Are the WAL segment files the ones located in pg_xlog?
>
> Right, the ones with sixteen-hex-digit filenames.  Sorry for not being
> perfectly clear.

Am I correct that with no UNDO in 7.1.X, we should only be keeping one
WAL file around, or maybe two?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Can't restart postmaster!

From
Bruce Momjian
Date:
> > Am I correct that with no UNDO in 7.1.X, we should only be keeping one
> > WAL file around, or maybe two?
>
> Not necessarily.  How much do you do between checkpoints?
>
> But yeah, there's no reason to save data further back than one or maybe
> two checkpoints, as long as UNDO isn't there.

I am thinking of long transactions that keep extra WAL files around.


Re: Can't restart postmaster!

From
Tom Lane
Date:
> Am I correct that with no UNDO in 7.1.X, we should only be keeping one
> WAL file around, or maybe two?

Not necessarily.  How much do you do between checkpoints?

But yeah, there's no reason to save data further back than one or maybe
two checkpoints, as long as UNDO isn't there.
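
To put rough numbers on "how much do you do between checkpoints", here is
a back-of-envelope sketch.  The model is an assumption made for
illustration, not how the server decides what to keep: it takes the WAL
written per checkpoint interval, multiplies by how many checkpoints back
must be retained, and gives a loose upper bound on the 16 MB segments that
byte range can touch.

```python
import math

SEGMENT_SIZE = 16 * 1024 * 1024  # one 16 MB WAL segment
MB = 1024 * 1024


def max_segments_spanned(wal_bytes_per_checkpoint: int,
                         checkpoints_back: int = 2,
                         segment_size: int = SEGMENT_SIZE) -> int:
    """Loose upper bound on segments covering the retained WAL range."""
    retained_bytes = wal_bytes_per_checkpoint * checkpoints_back
    # A byte range can straddle a segment boundary, hence the extra one.
    return math.ceil(retained_bytes / segment_size) + 1


print(max_segments_spanned(4 * MB))   # light load: prints 2
print(max_segments_spanned(40 * MB))  # heavy load: prints 6
```

With little activity between checkpoints the answer stays at one or two
files; a busy system can legitimately need more.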

            regards, tom lane