Thread: 7.1b6 - pg_xlog filled fs, postmaster won't start

7.1b6 - pg_xlog filled fs, postmaster won't start

From
"Gordon A. Runkle"
Date:
Yes, I was loading a large table.  :-)

The filesystem with pg_xlog filled up, and the
backend (all backends) died abnormally.  I can't
restart postmaster, either.

There are no stray IPC resources left allocated.

Is it OK to delete the files from pg_xlog?  What
will be the result?

Will I be able to avoid this problem by splitting
the load data into multiple files?

Here's some information from the log:

-------------
DEBUG:  copy: line 1853131, XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  copy: line 1867494, XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  copy: line 1884676, XLogWrite: new log file created - consider increasing WAL_FILES
FATAL 2:  copy: line 1897094, ZeroFill(logfile 0 seg 196) failed: No space left on device
Server process (pid 7867) exited with status 512 at Tue Mar 20 18:16:27 2001
Terminating any active server processes...
NOTICE:  Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend  died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am       going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
Server processes were terminated at Tue Mar 20 18:16:27 2001
Reinitializing shared memory and semaphores
DEBUG:  database system was interrupted at 2001-03-20 18:16:03 EST
DEBUG:  CheckPoint record at (0, 3274567656)
DEBUG:  Redo record at (0, 3271658024); Undo record at (0, 1464223896); Shutdown FALSE
DEBUG:  NextTransactionId: 1639; NextOid: 4000032
DEBUG:  database system was not properly shut down; automatic recovery in progress...
DEBUG:  redo starts at (0, 3271658024)
DEBUG:  open(logfile 0 seg 196) failed: No such file or directory
DEBUG:  redo done at (0, 3288327848)
FATAL 2:  ZeroFill(logfile 0 seg 196) failed: No space left on device
/opt/postgresql/bin/postmaster: Startup proc 7922 exited with status 512 - abort

-------------

Thanks,

Gordon.


--
It doesn't get any easier, you just go faster.
   -- Greg LeMond

RE: 7.1b6 - pg_xlog filled fs, postmaster won't start

From
"Mikheev, Vadim"
Date:
> >> Is it OK to delete the files from pg_xlog?  What will be
> >> the result?
> > It's not Ok. Though you could remove files numbered from
> > 00000000000000 to 0000000000012 (in hex), if any.
>
> OK, thanks.  Is there any documentation on these files, and what
> our options are if something like this happens?

DEBUG:  Redo record at (FileID, Offset)...

tells you the oldest file required: FileID gives the first (leftmost)
8 chars of the 16-char file name, and Offset/(16*1024*1024) gives the
last 8 chars (don't forget to convert the numbers to hex).
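
A minimal sketch of that arithmetic in Python, applied to the Redo record
from the log earlier in this thread. The function name is hypothetical, and
the zero-padded uppercase hex formatting is an assumption about how the
pg_xlog files are named; check it against the names actually present in
your pg_xlog directory before removing anything.

```python
# Segment size used in the formula above: 16 * 1024 * 1024 bytes.
SEGMENT_SIZE = 16 * 1024 * 1024


def oldest_required_segment(file_id: int, offset: int) -> str:
    """Return the 16-char pg_xlog file name that contains (file_id, offset).

    Hypothetical helper: the first 8 chars are the FileID in hex, the last
    8 chars are Offset / SEGMENT_SIZE in hex (uppercase is an assumption).
    """
    segment = offset // SEGMENT_SIZE
    return f"{file_id:08X}{segment:08X}"


# "Redo record at (0, 3271658024)" from the log in this thread.
print(oldest_required_segment(0, 3271658024))  # 00000000000000C3
```

Under these assumptions, files numbered below the printed name are the ones
that recovery no longer needs; the named file and everything after it must
stay.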

> With other RDBMS products I use, DB2 and Sybase, there
> are options in the import/load/bcp utilities which commit
> every n records, selectable by the user.  I think having
> a feature like this in COPY would greatly facilitate
> data migrations (which is what I'm doing, and the reason
> for such a big file).  What do you think?

It wouldn't help in 7.1, where transaction rollback using the log
is not implemented; in any case, we need a checkpoint in the log
to restart.

Vadim

Re: 7.1b6 - pg_xlog filled fs, postmaster won't start

From
Tom Lane
Date:
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
> DEBUG:  Redo record at (FileID, Offset)...

> tells you the oldest file required: FileID gives the first (leftmost)
> 8 chars of the 16-char file name, and Offset/(16*1024*1024) gives the
> last 8 chars (don't forget to convert the numbers to hex).

BTW, I've been thinking that it'd make more sense if the debug messages
displayed XLOG locations in hex.  It'd be a lot easier to mentally
associate them with segment files that way.

            regards, tom lane
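
To illustrate the point about hex, here is a small Python sketch that prints
a log location both ways. The function name and the output layout are
hypothetical, not what any PostgreSQL release prints. Because a segment is
16*1024*1024 = 0x1000000 bytes, the leading two hex digits of an 8-digit
offset are the segment number.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # 0x1000000 bytes per segment


def show_location(file_id: int, offset: int) -> str:
    """Format an XLOG location as (FileID, Offset) with both parts in hex.

    Hypothetical layout, chosen only to show how the hex offset lines up
    with the segment file name.
    """
    return f"({file_id:X}, {offset:08X})"


# "Redo record at (0, 3271658024)" from the log in this thread.
offset = 3271658024
print(show_location(0, offset))        # (0, C3018A28)
print(f"{offset // SEGMENT_SIZE:X}")   # C3
```

The segment number C3 can be read straight off the front of C3018A28, with
no division needed.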