Hi all,
I have a problem on a heavily loaded database: vacuums have not worked
for about a week. All I get is:
mybase=# vacuum verbose analyze public.mytable;
INFO: vacuuming "public.mytable"
(I stopped it after several hours)
Watching with top and iotop, I see the process use some CPU and disk I/O
for several minutes, then it seems to fall asleep.
The process isn't locked according to pg_stat_activity.
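A check of this kind could look roughly like the sketch below. It is not
necessarily the exact query used here; it assumes the 8.3 column names of
pg_stat_activity (procpid, waiting, current_query) and of pg_locks.

```sql
-- Backends currently running a VACUUM, and whether they wait on a lock.
-- The session issuing this query is excluded, since its own text
-- contains the word "vacuum".
SELECT procpid, waiting, xact_start, query_start, current_query
FROM pg_stat_activity
WHERE current_query ILIKE '%vacuum%'
  AND procpid <> pg_backend_pid()
ORDER BY query_start;

-- Lock requests that have not been granted yet, if any.
SELECT pid, locktype, mode, relation::regclass AS relation
FROM pg_locks
WHERE NOT granted;
```

If "waiting" is false and the second query returns no rows, the vacuum is
not blocked on a heavyweight lock.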
My setup:
- PostgreSQL 8.3.7 with contrib modules ltree and pgcrypto
- OS: Debian etch, kernel 2.6.24
- HW: 8-core Xeon / 32GB RAM / 3 RAID10 volumes (index, data, pg_xlog)
- database size: about 240GB
- millions of queries/day
- 1000 locks held continuously
- about 200 simultaneous connections
- load: 30% iowait, 60% user, 10% sys
Autovacuum is disabled to prevent it from loading the server during peak
hours.
Regular vacuums run each night as a cron job.
For about a week the nightly vacuums haven't worked. I tried manual ones
to no avail: same symptoms as above, on small tables (350 rows) as well
as on big ones (almost 1 billion rows).
Since the cron vacuums no longer work, I now see autovacuums (to
prevent wraparound) running all the time, but their processes use
neither CPU time nor disk I/O.
Autovacuum seems to work well on the pg_catalog schema.
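How far each database and table is from the wraparound threshold can be
seen with something like the following sketch. It assumes the standard
catalogs pg_database and pg_class; the LIMIT of 20 is an arbitrary choice.

```sql
-- Threshold at which autovacuum is forced even when it is disabled.
SHOW autovacuum_freeze_max_age;

-- Transaction ID age per database, oldest first.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;

-- Tables in the current database with the oldest frozen XID.
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC
LIMIT 20;
```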
The problem seems to have started with some queries that ran for more
than 15 hours. I tried to kill them (signal 15), to no avail.
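Such long-running queries could be listed and cancelled from SQL along
these lines. This is only a sketch: the one-hour cutoff and the pid 12345
are placeholders, not values from this server. Note that signal 15 is
SIGTERM, whereas pg_cancel_backend() sends SIGINT to cancel the current
query.

```sql
-- Transactions that have been open for more than one hour, oldest first.
SELECT procpid, usename, now() - xact_start AS xact_age, current_query
FROM pg_stat_activity
WHERE xact_start < now() - interval '1 hour'
ORDER BY xact_start;

-- Ask one backend to cancel its current query (placeholder pid).
SELECT pg_cancel_backend(12345);
```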
I can't restart the server as it's a big production server.
We're planning to upgrade the hardware soon, but I suspect we'll have
the same problems in the future as our platform is growing.
Does anyone have any information about this problem, and on how to prevent it?
Thanks in advance.
Regards,
--
JC
Ph'nglui mglw'nafh Cthulhu n'gah Bill R'lyeh Wgah'nagl fhtagn!