From pgsql-general: Huge number of disk writes after migration to 8.1 - Mailing list pgsql-bugs

From Marcin
Subject From pgsql-general: Huge number of disk writes after migration to 8.1
Date
Msg-id 43CD3F5D.2030002@op.pl
Whole thread Raw
Responses Re: From pgsql-general: Huge number of disk writes after migration to 8.1
List pgsql-bugs
Short introduction:
After migration from 8.0.3 to 8.1.2 I noticed huge increase in number of
bytes written to disk. I think it's caused by stats collector process,
but I may be wrong.

Some details:

1. Hardware: 2*Opteron 275, Tyan motherboard, 8GBs DDR ECC RAM, 2*SCSI
73GB 15k RPM drives.

2. Software: Debian Sarge amd64, kernel 2.6.12.5, PostgreSQL 8.1.2 built
from sources with:
./configure --host=x86_64-linux --build=x86_64-linux --enable-recode
--enable-nls --enable-integer-datetimes --disable-debug --disable-rpath
--without-tcl --without-perl --without-python --without-pam
--with-openssl --with-gnu-ld --without-krb5 --without-tk --without-java
--with-maxbackends=1024 --with-pgport=5432 --enable-thread-safety
gcc (GCC) 3.4.4 20050314 (prerelease) (Debian 3.4.3-13)
Note: Exactly the same environment were used to build previous 8.0.3
version.

3. 14-20GBs of data, ~120 DBs, ~60 used simultaneously, 1000-2000 tables
in each DB, 200-250 connections. Queries: 120-160 selects, 18-25
INSERT/UPDATE/DELETE per seconds during work hours.


4. postgresql.conf:
shared_buffers=30000
max_prepared_transactions=0 #(line added after migration 8.1)
work_mem=256000
maintenance_work_mem=512000
max_fsm_pages=1800000
max_fsm_relations=80000

fsync_on

checkpoints_segments=12
effective_cache_size=800000

log_destination='syslog'
client_min_messages=warning
log_min_messages=notice
log_min_duration_statement=200

stats_start_collector = on
stats_command_string = on


5. The writes rate stays at 20MB/s during work hours, the reads rate is
unnoticeable (60KB-100KB/s). With 8.0.3 the writes rate stayed at 1-1,5MB/s.
6. After disabling stats_command_string, the writes rate stepped down to
~10MB/s
7. No autovacuum, VACUUM FULL is performed every night, and VACUUM
ANALYZE is performed every 30 minutes on few critical tables.
8. About 2000 (with stats_command_string enabled) queries lasts longer
than 300ms, after disabling the stats_command_string it dropped to 900
(it used to be ~500-600 with 8.0.3). The duration of the longest query
in last two day was 18s.
9. ps aux shows:
208:39 postgres: stats collector process
 12:33 /usr/lib/postgresql/bin/postmaster
  0:28 postgres: writer process
  4:23 postgres: stats buffer process
after three days run. Two days with stats_command_string enabled
resulted in ~180 minutes of CPU time utilized by stats collector,
while during near one day with disabled command_string stats collector
utilized only 58 minutes.

With 8.0.3 it used to be like that:
 31:07 /usr/lib/postgresql/bin/postmaster
175:10 postgres: writer process
 30:55 postgres: stats buffer process
 58:43 postgres: stats collector process
(PostgreSQL was running for almost 12 days).

Tom Lane wrote:
> Yeah, something wrong there :-(.  What did you say your platform was
> exactly?  Would you strace the collector process, and send maybe ten K
> or so of trace output to pgsql-bugs?  It's probably not appropriate for
> pgsql-general.

The gzipped strace output of collector is attached. Unfortunately, the
server wasn't very busy (just 10-20% utilization).
The stats_command_string was disabled during stracing.

Thanks in advance
--
Marcin

Attachment

pgsql-bugs by date:

Previous
From: tomas@tuxteam.de (Tomas Zerolo)
Date:
Subject: Re: BUG #2175: 22021: invalid byte sequence for encoding \"UNICODE\":
Next
From: Tom Lane
Date:
Subject: Re: From pgsql-general: Huge number of disk writes after migration to 8.1