We have a medium scale installation of Postgres 8.3 that is freezing
about once a week. I'm looking for any hints on how to diagnose the
situation, as nothing is logged.
The system is matched pair of Sunfire servers, running Debian Etch with
a 2.6.18-6-amd64 kernel, PostgreSQL 8.3.4, and DRBD 8.0.13.
During a failed state, pg_stat_activity will show hundreds of statements
pending. query_start will show the statements arriving at a normal rate
(a few per second), but clearly they never complete. The bulk of these
statement are a simple select that starts each web session, a statement
that generally completes in tenths of milliseconds. Restarting postgres
restores normal operation, at the loss of all chance of figuring out
what was wrong.
postgresql.conf has customized:
log_destination = 'syslog'
log_min_error_statement = error
log_min_duration_statement = 5000
Where can I go from here? I'm turning on log_checkpoints now, though
all I see is "postgres[14412]: [4-1] LOG: checkpoint starting: time"
with no actual time printed.