Help: 8.0.3 Vacuum of an empty table never completes ... - Mailing list pgsql-hackers

From James Robinson
Subject Help: 8.0.3 Vacuum of an empty table never completes ...
Date
Msg-id A408C5A2-6E05-439B-AAD5-71F93ADFDD54@socialserve.com
Whole thread Raw
Responses Re: Help: 8.0.3 Vacuum of an empty table never completes ...  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
G'day folks.

We have a production database running 8.0.3 which gets fully  
pg_dump'd and vacuum analyze'd hourly by cron. Something strange  
happened to us on the 5AM Friday Nov. 25'th cron run -- the:
/usr/local/pgsql/bin/vacuumdb -U postgres --all --analyze --verbose  >& $DATE/vacuum.log

step in our cron procedure never completed. Strange, since no known  
event of note happened on Friday since we were all out of the office  
past Wed. for the american Thanksgiving holiday. Anyway, running the  
vacuum line by hand shows it getting stuck -- processes the majority  
of our tables, then just stops, and the backend postmaster just stops  
accumulating CPU time.

Comparing the logs further with when it did complete, it seems that  
one table in particular (at least) seems afflicted:

social=# vacuum verbose analyze agency.swlog_client;

hangs up forever -- have to control-c the client. Likewise for w/o  
'analyze'.

pg-dump'ing the entire database works (phew!) and upon restoring on a  
backup box, said table can be vacuum'd:

social=# vacuum verbose analyze agency.swlog_client;
INFO:  vacuuming "agency.swlog_client"
INFO:  index "swlog_client_pkey" now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "swlog_client": found 0 removable, 0 nonremovable row versions  
in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "agency.swlog_client"
INFO:  "swlog_client": scanned 0 of 0 pages, containing 0 live rows  
and 0 dead rows; 0 rows in sample, 0 estimated total rows
VACUUM

That's right -- completely empty table -- which is what we actually  
expect.

How should we proceed such that we can learn from this as well as we  
can proceed and get our entire database vacuuming again successfully?

Running on Linux 2.6.8-24.18-smp (SuSE 9.2). No juicy filesystem- 
related messages in dmesg nor /var/log/messages. 11% disk used on the  
postgres-related partition.


The table in question is defined as:

social=# \d agency.swlog_client;
Table "agency.swlog_client"
Column |  Type  | Modifiers
--------+--------+-----------
swlog  | bigint | not null
client | bigint | not null
Indexes:    "swlog_client_pkey" PRIMARY KEY, btree (swlog, client)
Foreign-key constraints:    "$2" FOREIGN KEY (client) REFERENCES agency.client(id)    "$1" FOREIGN KEY (swlog)
REFERENCESagency.swlog(id)
 


And the two fk'd tables:

social=# select count(*) from agency.client;
count
-------     0
(1 row)

social=# select count(*) from agency.swlog;
count
-------    69

We doubt that there could be any strange oddball extremely  
longrunning transaction in any of those related tables gumming up  
this table.

Finally, the only possibly potentially interesting event database- 
wise happened on Wed. Nov. 23'rd -- we SIGHUP'd the postmaster to  
have it learn a higher value for work_mem (10240, up from default of  
1024). But the hourly crons went great for the subsequent two days.  
maintenance_work_mem is still at the default of 16384.


Many thanks in advance!
James

----
James Robinson
Socialserve.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Getting different number of results when using hashjoin on/off
Next
From: Tom Lane
Date:
Subject: Re: Help: 8.0.3 Vacuum of an empty table never completes ...