Postgres 7.2.3 hangs during vacuum. - Mailing list pgsql-admin

From Robert M. Meyer
Subject Postgres 7.2.3 hangs during vacuum.
Msg-id 1040661274.19418.39.camel@skymaster
List pgsql-admin
Having a bizarre one today.  When we run a VACUUM FULL on our system,
the backend postgres process gets into a strange state.  It starts
consuming 100% of one CPU (we have two) and is immune to any signal
except 9 (SIGKILL).  Checking the logs against a successful vacuum, it
seems to stop between a table and an index.  I can tell because it
writes the time of the last table that it did but doesn't write out the
starting information for the next table.

Here are the log entries around one that worked:
Dec 15 04:23:05 dolidb-n1 logger: ^ICPU 0.98s/0.79u sec elapsed 34.33 sec.
Dec 15 04:23:41 dolidb-n1 logger: DEBUG:  Index client_obj_guid_ord_obj_key: Pages 10638; Tuples 687669: Deleted 24902.
Dec 15 04:23:41 dolidb-n1 logger: ^ICPU 0.87s/0.76u sec elapsed 36.18 sec.
Dec 15 04:24:12 dolidb-n1 logger: DEBUG:  Index pos_obj_guid_ord_obj_key: Pages 8308; Tuples 687669: Deleted 24902.
Dec 15 04:24:12 dolidb-n1 logger: ^ICPU 0.71s/0.75u sec elapsed 31.20 sec.
****Dec 15 04:24:56 dolidb-n1 logger: DEBUG:  Index ord_jeopardy_list_guid_ord_obj_: Pages 10832; Tuples 687669: Deleted 24902.
Dec 15 04:24:56 dolidb-n1 logger: ^ICPU 0.75s/0.69u sec elapsed 43.33 sec.

The breakage occurs at the line that is marked with '****'.  In a hung
run, this line never shows up in the log, and the next thing I see is
the recycled transaction log file messages.  That's the last message I
get from postgres.  If I check the process table, one postgres backend
is chewing 100% cpu and nothing else is happening.  Killing it with any
signal other than KILL fails.  Of course, sending it a signal 9 sends
the database into recovery mode, and it refuses connections until
recovery is done.
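
For anyone wanting to do the same log comparison mechanically, here is a
minimal Python sketch.  It assumes the syslog lines are unwrapped (one
entry per line) and look like the excerpt above.  The function names and
the "hung" sample are hypothetical: the hung sample is just the good
excerpt cut off before the marked line, not an actual log from the
failing run.

```python
import re

# Matches the per-index summary line from the vacuum log excerpt, e.g.
# "Dec 15 04:23:41 host logger: DEBUG:  Index some_index: Pages 10638; ..."
INDEX_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+logger:\s+DEBUG:\s+"
    r"Index\s+(?P<index>\S+):\s+Pages\s+(?P<pages>\d+);"
)


def index_sequence(lines):
    """Return (timestamp, index_name) for each index summary, in log order."""
    found = []
    for line in lines:
        match = INDEX_LINE.match(line)
        if match:
            found.append((match.group("stamp"), match.group("index")))
    return found


def first_missing(good_run, hung_run):
    """Return the first index reported in good_run but absent from hung_run."""
    seen = {name for _, name in hung_run}
    for _, name in good_run:
        if name not in seen:
            return name
    return None


# Entries from the successful run quoted above (the '****' marker removed).
good_log = [
    "Dec 15 04:23:41 dolidb-n1 logger: DEBUG:  Index "
    "client_obj_guid_ord_obj_key: Pages 10638; Tuples 687669: Deleted 24902.",
    "Dec 15 04:23:41 dolidb-n1 logger: \tCPU 0.87s/0.76u sec elapsed 36.18 sec.",
    "Dec 15 04:24:12 dolidb-n1 logger: DEBUG:  Index "
    "pos_obj_guid_ord_obj_key: Pages 8308; Tuples 687669: Deleted 24902.",
    "Dec 15 04:24:56 dolidb-n1 logger: DEBUG:  Index "
    "ord_jeopardy_list_guid_ord_obj_: Pages 10832; Tuples 687669: Deleted 24902.",
]

# Hypothetical hung run: same entries, but the last index never gets logged.
hung_log = good_log[:3]

print(first_missing(index_sequence(good_log), index_sequence(hung_log)))
```

Run against real logs from a good night and a hung night, the name it
reports is the first index the hung backend never finished, which is the
object to look at more closely.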

Pertinent information:
Compaq CL380 with CR3500 RAID and two CPUs with 4 Gigabytes of memory
RedHat 7.1 with kernel 2.4.19
Postgres 7.2.3

Anybody have any ideas?  Right now, my solution is going to be to back
up the databases, wipe /usr/local/pgsql/data, initdb, and restore the
DBs.

Note that pg_dump of the databases and restore to another machine seems
to be going without a hitch so I'm figuring that a wipe and restore will
make things better (talk about shotgun approaches).  A vacuum on the
restored database on the other machine works fine.

Cheers!

Bob
--
Robert M. Meyer
Sr. Network Administrator
INSTALLS inc
14 Lafayette Sq, Ste 700
Buffalo, NY 14203-1904
(716)332-1451

