Thread: Postgre 7.3.2 pg_clog error

Postgre 7.3.2 pg_clog error

From
"Rao Kumar"
Date:
We have been running into a pg_clog problem since we recently upgraded to
version 7.3.2. The error log reports:

PANIC:  open of /usr/local/pgsql/data/pg_clog/0002 failed: No such file or
directory
2003-03-19 05:46:40 LOG:  recycled transaction log file 00000001000000B8
2003-03-19 05:51:02 PANIC:  open of /usr/local/pgsql/data/pg_clog/0020
failed: No such file or directory

Following the error, the database shuts down and goes into recovery mode,
sometimes corrupting the index/data files and making the database
unrecoverable.

To give you some quick background: we run an OLTP application on Postgres
with a large transaction volume. We were able to test our app successfully
on 7.1.3 with simulators producing these high transaction volumes. But ever
since we upgraded the database to 7.3.2, we have been encountering the above
pg_clog error during periods of high transaction volume/activity.

If you need further information (such as the .conf file), please email me
and I will be more than glad to provide it.

Thank You

Rao Kumar

2003-03-17 01:32:32 FATAL:  The database system is starting up
2003-03-17 01:32:32 FATAL:  The database system is starting up
2003-03-17 01:32:32 FATAL:  The database system is starting up
2003-03-17 01:32:33 FATAL:  The database system is starting up
2003-03-17 01:32:33 FATAL:  The database system is starting up
2003-03-17 01:32:33 FATAL:  The database system is starting up
2003-03-17 01:32:33 LOG:  recycled transaction log file 0000001700000092
2003-03-17 01:32:33 LOG:  recycled transaction log file 0000001700000093
2003-03-17 01:32:33 LOG:  database system is ready
2003-03-17 01:32:44 PANIC:  open of /usr/local/pgsql/data/pg_clog/0000
failed: No such file or directory
2003-03-17 01:32:44 LOG:  server process (pid 39313) was terminated by
signal 6
2003-03-17 01:32:44 LOG:  terminating any other active server processes
2003-03-17 01:32:44 LOG:  all server processes terminated; reinitializing
shared memory and semaphores
2003-03-17 01:32:44 FATAL:  The database system is starting up
2003-03-17 01:32:44 LOG:  database system was interrupted at 2003-03-17
01:32:33 EST
2003-03-17 01:32:44 LOG:  checkpoint record is at 17/96B42798
2003-03-17 01:32:44 LOG:  redo record is at 17/96B42798; undo record is at
0/0; shutdown TRUE
2003-03-17 01:32:44 LOG:  next transaction id: 3584774; next oid: 5063218
2003-03-17 01:32:44 LOG:  database system was not properly shut down;
automatic recovery in progress
2003-03-17 01:32:44 FATAL:  The database system is starting up
2003-03-17 01:32:45 FATAL:  The database system is starting up
2003-03-17 01:32:45 FATAL:  The database system is starting up
2003-03-17 01:32:45 LOG:  redo starts at 17/96B427D8
2003-03-17 01:32:45 FATAL:  The database system is starting up
2003-03-17 01:32:45 FATAL:  The database system is starting up
2003-03-17 01:32:45 LOG:  ReadRecord: record with zero length at 17/96F60DF4
2003-03-17 01:32:45 LOG:  redo done at 17/96F60DD0
2003-03-17 01:32:45 FATAL:  The database system is starting up
2003-03-17 01:32:45 FATAL:  The database system is starting up
2003-03-17 01:32:45 FATAL:  The database system is starting up
2003-03-17 01:32:47 FATAL:  The database system is starting up
2003-03-17 01:32:47 FATAL:  The database system is starting up
2003-03-17 01:32:48 FATAL:  The database system is starting up
2003-03-17 01:32:48 FATAL:  The database system is starting up
2003-03-17 01:32:48 LOG:  recycled transaction log file 0000001700000095
2003-03-17 01:32:48 LOG:  recycled transaction log file 0000001700000094
2003-03-17 01:32:48 LOG:  database system is ready
2003-03-17 01:32:54 PANIC:  open of /usr/local/pgsql/data/pg_clog/0000
failed: No such file or directory
2003-03-17 01:32:54 LOG:  server process (pid 39349) was terminated by
signal 6
2003-03-17 01:32:54 LOG:  terminating any other active server processes
2003-03-17 01:32:54 LOG:  all server processes terminated; reinitializing
shared memory and semaphores
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:55 LOG:  database system was interrupted at 2003-03-17
01:32:49 EST
2003-03-17 01:32:55 LOG:  checkpoint record is at 17/96F60DF4
2003-03-17 01:32:55 LOG:  redo record is at 17/96F60DF4; undo record is at
0/0; shutdown TRUE
2003-03-17 01:32:55 LOG:  next transaction id: 3584892; next oid: 5071410
2003-03-17 01:32:55 LOG:  database system was not properly shut down;
automatic recovery in progress
2003-03-17 01:32:55 LOG:  redo starts at 17/96F60E34
2003-03-17 01:32:55 LOG:  ReadRecord: record with zero length at 17/97202654
2003-03-17 01:32:55 LOG:  redo done at 17/97202630
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:55 FATAL:  The database system is starting up
2003-03-17 01:32:56 FATAL:  The database system is starting up
2003-03-17 01:32:56 FATAL:  The database system is starting up
2003-03-17 01:32:56 FATAL:  The database system is starting up
2003-03-17 01:32:56 FATAL:  The database system is starting up
2003-03-17 01:32:56 FATAL:  The database system is starting up
2003-03-17 01:32:56 FATAL:  The database system is starting up
2003-03-17 01:32:56 FATAL:  The database system is starting up
2003-03-17 01:32:56 FATAL:  The database system is starting up
2003-03-17 01:32:56 FATAL:  The database system is starting up
2003-03-17 01:32:57 FATAL:  The database system is starting up
2003-03-17 01:32:57 FATAL:  The database system is starting up
2003-03-17 01:32:57 FATAL:  The database system is starting up
2003-03-17 01:32:57 FATAL:  The database system is starting up
2003-03-17 01:32:57 FATAL:  The database system is starting up
2003-03-17 01:32:57 FATAL:  The database system is starting up
2003-03-17 01:32:57 FATAL:  The database system is starting up
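
One way to read the log above is to compare the segment names in the PANIC
messages with the "next transaction id" values printed during recovery. The
sketch below assumes the commit-log layout of PostgreSQL releases of this
era: 0x100000 (1,048,576) transactions per pg_clog segment file, with files
named as four hex digits. That constant is an assumption, not something
stated in this thread, and the function names are hypothetical.

```python
from typing import Tuple

# Assumed layout (not stated in the thread): each pg_clog segment file
# covers 0x100000 transaction ids and is named with four hex digits.
XACTS_PER_SEGMENT = 0x100000


def clog_segment_for_xid(xid: int) -> str:
    """Return the pg_clog file name expected to hold the status of xid."""
    return "%04X" % (xid // XACTS_PER_SEGMENT)


def xid_range_for_segment(name: str) -> Tuple[int, int]:
    """Return the first and last transaction id covered by a segment file."""
    segment = int(name, 16)
    first = segment * XACTS_PER_SEGMENT
    return first, first + XACTS_PER_SEGMENT - 1


if __name__ == "__main__":
    # "next transaction id" values taken from the recovery log above.
    for next_xid in (3584774, 3584892):
        print(next_xid, "->", clog_segment_for_xid(next_xid))
    # Segment names taken from the PANIC messages.
    for name in ("0000", "0002", "0020"):
        print(name, "covers xids", xid_range_for_segment(name))
```

Under these assumptions, a next transaction id of about 3.58 million falls
in segment 0003, while segment 0020 would only be reached after more than
33 million transactions. A request for a segment that far ahead of the
counter may point to a damaged transaction id in a tuple rather than a file
that was genuinely removed, but that is a hypothesis to check, not a
diagnosis.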





Re: Postgre 7.3.2 pg_clog error

From
Michael Brusser
Date:
I just observed a very similar problem with Postgres 7.2.1 on Linux.
It happened while running a transaction that inserts a large number of
records.

From the log-file:
... ...
2003-03-19 17:00:18 DEBUG:  recycled transaction log file 0000000000000000
2003-03-19 17:09:56 DEBUG:  XLogWrite: new log file created - consider
increasing WAL_FILES
2003-03-19 17:10:22 DEBUG:  recycled transaction log file 0000000000000001
2003-03-19 17:15:25 DEBUG:  recycled transaction log file 0000000000000002
2003-03-19 17:23:40 DEBUG:  XLogWrite: new log file created - consider
increasing WAL_FILES
... ...
Rao, what platform are you running on?

Mike

> -----Original Message-----
> From: pgsql-admin-owner@postgresql.org
> [mailto:pgsql-admin-owner@postgresql.org]On Behalf Of Rao Kumar
> Sent: Friday, March 21, 2003 10:12 AM
> To: pgsql-admin@postgresql.org
> Subject: [ADMIN] Postgre 7.3.2 pg_clog error
>
>
> We are running into a pg_clog problem since we recently upgraded to 7.3.2
> version. The error log reports:
>
> PANIC:  open of /usr/local/pgsql/data/pg_clog/0002 failed: No such file or
> directory
> 2003-03-19 05:46:40 LOG:  recycled transaction log file 00000001000000B8
> 2003-03-19 05:51:02 PANIC:  open of /usr/local/pgsql/data/pg_clog/0020
> failed: No such file or directory
>
> Following the error, the database shuts down and goes into recovery mode,
> sometimes
> corrupting the index/data files making the database irrecoverable.
>
> To give you a quick background, We run an OLTP application on
> postgres with
> large transaction volume. We were able to successfully test our app with
> simulators producing
> these high transaction volumes with 7.1.3. But ever since we upgraded the
> database to
> 7.3.2, we are encountering the above pg_clog error during high transaction
> volume/activity.
>
> If you need further information ( such as the .conf file), please email me
> and I will be more than glad to provide you.
>
> Thank You
>
> Rao Kumar



Re: Postgre 7.3.2 pg_clog error

From
Tom Lane
Date:
"Rao Kumar" <raokumar@netwolves.com> writes:
> We are running into a pg_clog problem since we recently upgraded to 7.3.2
> version. The error log reports:

> PANIC:  open of /usr/local/pgsql/data/pg_clog/0002 failed: No such file or
> directory
> 2003-03-19 05:46:40 LOG:  recycled transaction log file 00000001000000B8
> 2003-03-19 05:51:02 PANIC:  open of /usr/local/pgsql/data/pg_clog/0020
> failed: No such file or directory

What are the actual names, mod dates, and sizes of the files in the
pg_clog directory?
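
A minimal sketch of one way to collect that listing, for anyone who prefers
a script to `ls -l`. The default path is the one shown in the PANIC
messages; the function names are hypothetical and nothing here is specific
to PostgreSQL.

```python
import os
import time


def describe_clog_dir(path):
    """Return (name, size_in_bytes, mtime) for each regular file, sorted by name."""
    rows = []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if os.path.isfile(full):
            st = os.stat(full)
            rows.append((name, st.st_size, st.st_mtime))
    return rows


def format_rows(rows):
    """Render the rows as text lines: name, size, local modification time."""
    lines = []
    for name, size, mtime in rows:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        lines.append("%s  %10d  %s" % (name, size, stamp))
    return lines


if __name__ == "__main__":
    # Path taken from the PANIC messages; adjust for other installations.
    clog_dir = "/usr/local/pgsql/data/pg_clog"
    if os.path.isdir(clog_dir):
        for line in format_rows(describe_clog_dir(clog_dir)):
            print(line)
    else:
        print("not a directory:", clog_dir)
```

Run it as the user that owns the data directory, since other users usually
cannot read it.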

The PANICs should produce core dumps --- can you get stack traces from
the crashed backends?  (If they don't, fix the postmaster's environment
--- it's probably being started with ulimit -c 0.)
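
As a rough check of the core-file limit, the sketch below reads the limit
for the process that runs it, using Python's standard `resource` module
(Unix only). It does not inspect an already running postmaster. Resource
limits are inherited by child processes, so the result is only meaningful
when the script is run from the same environment (same user, same startup
script) that launches the postmaster. The function names are hypothetical.

```python
import resource


def core_dumps_enabled():
    """Return True if this process's soft core-file size limit is non-zero."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft != 0


def describe_core_limit():
    """Return the soft and hard core-file limits as a readable string."""
    def show(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    return "soft=%s hard=%s" % (show(soft), show(hard))


if __name__ == "__main__":
    print(describe_core_limit())
    if not core_dumps_enabled():
        print("core dumps are disabled here (equivalent to ulimit -c 0)")
```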

            regards, tom lane