Thread: Version 7.2.3 unrecoverable crash on missing pg_clog

Version 7.2.3 unrecoverable crash on missing pg_clog

From: Andy Osborne

All,

One of our databases crashed yesterday with a bug that looks
a lot like the non superuser vacuum issue that 7.2.3 was
intended to fix, although we do our vacuum with a user that
has usesuper=t in pg_user so I guess it's not that simple.
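
(For reference, checking that flag just amounts to something like the
following; the role name here is only a placeholder:)

  # check that the vacuuming role really is a superuser (role name is a placeholder)
  psql template1 -c "SELECT usename, usesuper FROM pg_user WHERE usename = 'vacuum_user'"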

From the logs:

DEBUG:  pq_recvbuf: unexpected EOF on client connection
FATAL 2:  open of /u0/pgdata/pg_clog/0726 failed: No such file or directory
DEBUG:  server process (pid 4232) exited with exit code 2
DEBUG:  terminating any other active server processes
NOTICE:  Message from PostgreSQL backend:
         The Postmaster has informed me that some other backend
         died abnormally and possibly corrupted shared memory.
         I have rolled back the current transaction and am
         going to terminate your database system connection and exit.
         Please reconnect to the database system and repeat your query.
[backend message repeated]
FATAL 1:  The database system is in recovery mode
[repeated]
DEBUG:  all server processes terminated; reinitializing shared memory and semaphores
DEBUG:  database system was interrupted at 2003-01-08 20:14:06 GMT
DEBUG:  checkpoint record is at 69/74D200E4
DEBUG:  redo record is at 69/74D0DA14; undo record is at 0/0; shutdown FALSE
DEBUG:  next transaction id: 221940405; next oid: 281786728
DEBUG:  database system was not properly shut down; automatic recovery in progress
DEBUG:  redo starts at 69/74D0DA14
DEBUG:  ReadRecord: record with zero length at 69/74D2634C
DEBUG:  redo done at 69/74D26328
FATAL 1:  The database system is starting up
[repeated]
DEBUG:  database system is ready

Then almost immediately it went down again:

FATAL 2:  open of /u0/pgdata/pg_clog/0656 failed: No such file or directory
DEBUG:  server process (pid 13054) exited with exit code 2
DEBUG:  terminating any other active server processes
NOTICE:  Message from PostgreSQL backend:
         The Postmaster has informed me that some other backend
         died abnormally and possibly corrupted shared memory.
         I have rolled back the current transaction and am
         going to terminate your database system connection and exit.
         Please reconnect to the database system and repeat your query.
[repeated]
FATAL 1:  The database system is in recovery mode
FATAL 1:  The database system is in recovery mode
FATAL 1:  The database system is in recovery mode
DEBUG:  all server processes terminated; reinitializing shared memory and semaphores
DEBUG:  database system was interrupted at 2003-01-08 20:16:12 GMT
DEBUG:  checkpoint record is at 69/74D2634C
DEBUG:  redo record is at 69/74D2634C; undo record is at 0/0; shutdown TRUE
DEBUG:  next transaction id: 221940709; next oid: 281786728
DEBUG:  database system was not properly shut down; automatic recovery in progress
FATAL 1:  The database system is starting up
[repeated]
DEBUG:  redo starts at 69/74D2638C
DEBUG:  ReadRecord: record with zero length at 69/754828E8
DEBUG:  redo done at 69/754828C4
FATAL 1:  The database system is starting up
[repeated]
DEBUG:  database system is ready

and again:

FATAL 2:  open of /u0/pgdata/pg_clog/0452 failed: No such file or directory
DEBUG:  server process (pid 13451) exited with exit code 2
DEBUG:  terminating any other active server processes
NOTICE:  Message from PostgreSQL backend:
         The Postmaster has informed me that some other backend
         died abnormally and possibly corrupted shared memory.
         I have rolled back the current transaction and am
         going to terminate your database system connection and exit.
         Please reconnect to the database system and repeat your query.

and so on until we shut it down.

Platform is a Dell 6650 Quad Xeon 1.4GHz with hyperthreading
switched on.  2GB RAM.  Running RedHat 7.3 with their kernel
"2.4.18-10smp #1 SMP Wed Aug 7 11:17:48 EDT 2002 i686 unknown".
We built our postgresql from source with:

  ./configure  --with-perl --with-openssl --enable-syslog

and with NAMEDATALEN = 64 in postgres_ext.h.  select version()
reports PostgreSQL 7.2.3 on i686-pc-linux-gnu, compiled by GCC 2.96
and we are building against perl "perl5 (revision 5.0 version 6
subversion 1)" which is from the RedHat rpm but rebuilt to have
a shared libperl.

With postgres up in single user mode, anything that touched
one particular table (called news - very active, about 450MB
in size and roughly 83k rows) caused postgres to fail as above.
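
(By single user mode I mean the standalone backend, started as the
postgres user along roughly these lines; the database name below is
a placeholder:)

  # standalone backend; queries are typed at the backend> prompt (db name is a placeholder)
  postgres -D /u0/pgdata somedb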

In the end we dropped this table, vacuumed (full) the database
and put the table back from a backup that was about 3 hrs old.
The database has been ok since.  We vacuum every night and
vacuum --full once a week.  The database cluster has six databases
(8 incl template[01]) of which five are very active.  Typically
150 or so connections are active.  The postgresql.conf options
we've altered from the defaults are:

max_connections = 512
shared_buffers = 8192
wal_buffers = 12
sort_mem = 32768
vacuum_mem = 32768
wal_files = 8

This is the only time we've seen this happen and I can't
reproduce it on our test machines.  Pretty scary nonetheless!

Has anyone else had similar problems with 7.2.3?  Any clues?

Andy


--
Andy Osborne      ****************      "Vertical B2B Communities"
Senior Internet Engineer
Sift Group                    100 Victoria Street, Bristol BS1 6HZ
tel:+44 117 915 9600  fax:+44 117 915 9630   http://www.sift.co.uk

Re: Version 7.2.3 unrecoverable crash on missing pg_clog

From: Tom Lane

Andy Osborne <andy@sift.co.uk> writes:
> One of our databases crashed yesterday with a bug that looks
> a lot like the non superuser vacuum issue that 7.2.3 was
> intended to fix, although we do our vacuum with a user that
> has usesuper=t in pg_user so I guess it's not that simple.

> FATAL 2:  open of /u0/pgdata/pg_clog/0726 failed: No such file or directory

What range of file names do you actually see in pg_clog?
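
(A plain directory listing on the server is all that's needed, e.g.:)

  ls /u0/pgdata/pg_clog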

The fixes in 7.2.3 were for cases that would try to access
already-removed clog segments (file numbers less than what's present).
In this case the accessed file name is large enough that I'm thinking
the problem is due to a garbage transaction number being passed to the
transaction-status-check code.  So my bet is on physical data corruption
in the table that was causing the problem.  It turns out that the first
detectable symptom of a trashed tuple header is often a failure like
this :-(.

You didn't happen to make a physical copy of the news table before
dropping it, did you?  It'd be interesting to examine the remains.
So far, the cases I have seen like this all seem to be due to hardware
faults, but we've seen it just often enough to make me wonder if there
is a software issue too.

            regards, tom lane

Re: Version 7.2.3 unrecoverable crash on missing pg_clog

From: Andy Osborne

Tom Lane wrote:
> Andy Osborne <andy@sift.co.uk> writes:
>
>>One of our databases crashed yesterday with a bug that looks
>>a lot like the non superuser vacuum issue that 7.2.3 was
>>intended to fix, although we do our vacuum with a user that
>>has usesuper=t in pg_user so I guess it's not that simple.
>
>
>>FATAL 2:  open of /u0/pgdata/pg_clog/0726 failed: No such file or directory
>
>
> What range of file names do you actually see in pg_clog?

Currently 0000 to 00D6. I don't know what it was last night.

> The fixes in 7.2.3 were for cases that would try to access
> already-removed clog segments (file numbers less than what's present).
> In this case the accessed file name is large enough that I'm thinking
> the problem is due to a garbage transaction number being passed to the
> transaction-status-check code.  So my bet is on physical data corruption
> in the table that was causing the problem.  It turns out that the first
> detectable symptom of a trashed tuple header is often a failure like
> this :-(.

/u0 is a Linux software RAID using RAID 1 on three disks
with two raid-disks and one spare-disk.  The previous backup
(a pg_dump in plain text SQL) ran ok and produced a clean backup
that we were able to reload.  We do this every three hours and
the next backup was running when the database crashed.  Any
attempt to access the table crashed it again.  I don't know if
it helps, but a select * from news where <conditional on a field
with an index> was ok, but if the where clause was on an unindexed
field and resulted in a table scan, it crashed.
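
(To make the pattern concrete, it behaved roughly like this; the
database name, column names and values are made up for illustration:)

  psql newsdb -c "SELECT * FROM news WHERE id = 12345"          # indexed lookup: fine
  psql newsdb -c "SELECT * FROM news WHERE title LIKE '%foo%'"  # table scan: backend dies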

While I wouldn't rule out data corruption, the kernel message
ring has no errors for the md driver, SCSI host adapter or the
disks, which I would expect to see if we had bad blocks appearing on a
disk or somesuch.  The machine has been running with v7.2.3 for
about two months and hasn't shown this problem before.  My gut
feeling is that it's something else.
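
(For reference, the kind of checks meant here are roughly the
following; generic commands rather than a record of exactly what
was run:)

  cat /proc/mdstat                  # RAID1 array state: active disks up, spare idle
  dmesg | egrep -i 'md|scsi|error'  # nothing logged for md, the SCSI adapter or the disks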

> You didn't happen to make a physical copy of the news table before
> dropping it, did you?  It'd be interesting to examine the remains.
> So far, the cases I have seen like this all seem to be due to hardware
> faults, but we've seen it just often enough to make me wonder if there
> is a software issue too.

Sadly, no I didn't.  This is one of our live database servers
and I was under a lot of pressure to get it back quickly.  If
it does it again, what can I do to provide the most useful
feedback?

Thanks,

Andy

--
Andy Osborne      ****************      "Vertical B2B Communities"
Senior Internet Engineer
Sift Group                    100 Victoria Street, Bristol BS1 6HZ
tel:+44 117 915 9600  fax:+44 117 915 9630   http://www.sift.co.uk

Re: Version 7.2.3 unrecoverable crash on missing pg_clog

From: Tom Lane

Andy Osborne <andy@sift.co.uk> writes:
> Tom Lane wrote:
>>> FATAL 2:  open of /u0/pgdata/pg_clog/0726 failed: No such file or directory
>> What range of file names do you actually see in pg_clog?

> Currently 0000 to 00D6. I don't know what it was last night.

Not any greater, for sure.  (FYI, each segment covers one million
transactions.)
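
To spell out the arithmetic (a sketch, with the shell purely for
illustration): the segment files are named with the segment number in
hex, so the next transaction id from your recovery log falls well
inside the 0000..00D6 range, while 0726 would correspond to a
transaction id of roughly 1.9 billion that this cluster has never
assigned:

  printf '%04X\n' $(( 221940405 / 1048576 ))  # -> 00D3, the segment holding the next xid
  echo $(( 0x0726 * 1048576 ))                # -> 1918894080, first xid segment 0726 would cover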

> the next backup was running when the database crashed.  Any
> attempt to access the table crashed it again.  I don't know if
> it helps, but a select * from news where <conditional on a field
> with an index> was ok, but if the where clause was on an unindexed
> field and resulted in a table scan, it crashed.

This is consistent with one page of the table being corrupted.

> While I wouldn't rule out data corruption, the kernel message
> ring has no errors for the md driver, SCSI host adapter or the
> disks, which I would expect to see if we had bad blocks appearing on a
> disk or somesuch.

Some of the cases that I've seen look like completely unrelated data
(not even Postgres stuff, just bits of text files) was written into
a page of a Postgres table.  This could possibly be a kernel bug,
along the lines of getting confused about which buffer belongs to
which file.  But with no way to reproduce it, it's hard to pin blame.
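
(That kind of damage tends to be obvious from a raw dump of the
affected page, along these lines; the file path and block number are
placeholders:)

  # dump one 8K page of a table file for inspection (path and block number are placeholders)
  dd if=/u0/pgdata/base/<dboid>/<relfilenode> bs=8192 skip=<blockno> count=1 2>/dev/null \
    | od -c | less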

>> You didn't happen to make a physical copy of the news table before
>> dropping it, did you?  It'd be interesting to examine the remains.

> Sadly, no I didn't.  This is one of our live database servers
> and I was under a lot of pressure to get it back quickly.  If
> it does it again, what can I do to provide the most useful
> feedback?

If the database isn't unreasonably large, perhaps you could take a
tarball dump of the whole $PGDATA directory tree while the postmaster
is stopped?  That would document the situation for examination at leisure.
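
Concretely, something along these lines would do it; the archive
location below is a placeholder, and stopping/starting via pg_ctl or
your init script is equally fine:

  pg_ctl stop -D /u0/pgdata -m fast
  tar czf /somewhere/safe/pgdata-snapshot.tar.gz -C /u0 pgdata
  pg_ctl start -D /u0/pgdata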

            regards, tom lane

Re: Version 7.2.3 unrecoverable crash on missing pg_clog

From: Andy Osborne

Tom Lane wrote:
> Andy Osborne <andy@sift.co.uk> writes:
>
>> Tom Lane wrote:
>
> But with no way to reproduce it, it's hard to pin blame.

Sad but true :-(

>>>You didn't happen to make a physical copy of the news table before
>>>dropping it, did you?  It'd be interesting to examine the remains.
>>
>
>>Sadly, no I didn't.  This is one of our live database servers
>>and I was under a lot of pressure to get it back quickly.  If
>>it does it again, what can I do to provide the most useful
>>feedback?
>
>
> If the database isn't unreasonably large, perhaps you could take a
> tarball dump of the whole $PGDATA directory tree while the postmaster
> is stopped?  That would document the situation for examination at leisure.

It's about 8GB in total, so that seems like a plan.

Thanks for your help,

Andy

--
Andy Osborne      ****************      "Vertical B2B Communities"
Senior Internet Engineer
Sift Group                    100 Victoria Street, Bristol BS1 6HZ
tel:+44 117 915 9600  fax:+44 117 915 9630   http://www.sift.co.uk