[PG 8.1.0 / AIX 5.3] Vacuum processes freezing - Mailing list pgsql-performance

From RESTOUX, Loïc
Subject [PG 8.1.0 / AIX 5.3] Vacuum processes freezing
Date
Msg-id F10D59BC922A9D47B7FE1B0A31AC7B5343D4DB@CORPMAIL31.corp.capgemini.com
Whole thread Raw
Responses Re: [PG 8.1.0 / AIX 5.3] Vacuum processes freezing  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [PG 8.1.0 / AIX 5.3] Vacuum processes freezing  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-performance

Hello everybody,

We're using PostgreSQL 8.1.0 on AIX 5.3 through NFS (a Netapp Filer hosts the database files), and we're encoutering
somesissues with vaccums. PostgreSQL binaries are built with xlc 6 (C for AIX Compiler 6.0.0.6) on AIX 5.2 (yes, I
know,building on 5.2 and running on 5.3 is not the best way to avoid bugs...). 


We have strong performance constraints with this database, so we planned vacuums with a crontab :
- Frequent vacuum analyze on some heavily-updated tables (few rows, but a lot of insert/update/delete). The frequency
variesbetween 5 and 15 minutes. 
- A nightly (not FULL) vacuum on the entire database.
We don't use autovacuum or FULL vacuum, because the high havailability needed for the database. We prefer to keep it
undercontrol. 


Since some weeks, the amount of data hosted by the database grows, and, some nights,  the database vacuum seems to
"freeze"during his execution. In verbose mode, the logs show that the vacuum clean up a table (not always the same
table),and... no more news. The system shows a vacuum process, which seems to be sleeping (no CPU used, no memory
consumption...).In addition, the logs of our application show that database transactions seems to be slower.  

For some internal reasons, the only way for us to workaround this problem is to shutdown of the application and the
database.After a full restart, things are ok.  


Some questions :

1) During the nightly database vacuum, some vacuums run concurrently (vacuums on heavily-updated tables). Can this
concurrencycause some deadlocks ? We're planning to patch our shell scripts to avoid this concurrency. 


2) I believed that the poor performances during the vacuum freeze were due to the obsolete data statistics. But after a
fullrestart of the dabatase, performances are good. Does PostgreSQL rebuild his statistics during startup ?  


3) Can we explain the freeze with a bad database configuration ? For instance, postgreSQL running out of connections,
orwhatever, causing the vacuum process to wait for free ressources ? 


4) This morning, just before the database vacuum freeze, the logs show this error :
<2007-06-13 03:20:35 DFT%>ERROR:  could not open relation 16391/16394/107937: A system call received an interrupt.
<2007-06-13 03:20:35 DFT%>CONTEXT:  writing block 2 of relation 16391/16394/107937
<2007-06-13 03:20:40 DFT%>LOG:  could not fsync segment 0 of relation 16392/16394/107925: A system call received an
interrupt.
<2007-06-13 03:20:40 DFT%>ERROR:  storage sync failed on magnetic disk: A system call received an interrupt.

This is the first time we're encountering this error. Can it be a cause of the vacuum freeze ?


Regards,


--
  Loic Restoux
  Capgemini Telecom & Media / ITDR
  tel : 02 99 27 82 30
  e-mail : loic.restoux@capgemini.com

This message contains information that may be privileged or confidential and is the property of the Capgemini Group. It
isintended only for the person to whom it is addressed. If you are not the intended recipient,  you are not authorized
toread, print, retain, copy, disseminate,  distribute, or use this message or any part thereof. If you receive this
messagein error, please notify the sender immediately and delete all  copies of this message. 


pgsql-performance by date:

Previous
From: Markus Schiltknecht
Date:
Subject: Re: dbt2 NOTPM numbers
Next
From: "Mark Wong"
Date:
Subject: Re: dbt2 NOTPM numbers