My company is developing a PostgreSQL application. We're using 7.3.4
but will soon upgrade to 7.4.x. Our OS is RedHat 9. Our production
machines have 512 MB RAM and IDE disks. So far we've been using
default configuration settings, but I have started to examine
performance and to modify these settings.
Our typical transaction involves 5-10 SELECT, INSERT or UPDATEs,
(usually 1/2 SELECT and the remainder a mixture of INSERT and UPDATE).
There are a few aggregation queries which need to scan an entire
table. We observed highly uneven performance for the small
transactions. A transaction usually runs in under 100 msec, but we
would see spikes as high as 40,000 msec. These spikes occurred
regularly, every 4-5 minutes, and I speculated that checkpointing
might be the issue.
I created a test case, based on a single table:
create table test(
id int not null,
count int not null,
filler varchar(200),
primary key(id))
I loaded a database with 1,000,000 rows, with the filler column always
filled with 200 characters.
I then ran a test in which a random row was selected, and the count
column incremented. Each transaction contained ten such updates. In
this test, I set
shared_buffers = 2000
checkpoint_segments = 40
checkpoint_timeout = 600
wal_debug = 1
I set checkpoint_segments high because I wanted to see whether the
spikes correlated with checkpoints.
Most transactions completed in under 60 msec. Approximately every 10th
transaction, the time went up to 500-600 msec, (which is puzzling, but
not my major concern). I did see a spike every 10 minutes, in which
transaction time goes up to 5000-8000 msec. The spikes were correlated
with checkpoint activity, occurring slightly before a log entry that
looks like this:
2004-05-09 16:34:19 LOG: INSERT @ 2/C2A0F628: prev 2/C2A0F5EC;
xprev 0/0; xid 0: XLOG - checkpoint: redo 2/C2984D4C; undo 0/0;
sui 36; xid 1369741; oid 6321782; online
Questions:
1. Can someone provide an overview of checkpoint processing, to help
me understand the performance issues?
2. Is the spike due to the checkpoint process keeping the disk busy?
Or is there some locking involved that blocks my application until the
checkpoint completes?
3. The spikes are quite problematic for us. What can I do to minimize
the impact of checkpointing on my application? I understand how
checkpoint_segments and checkpoint_timeout determine when a checkpoint
occurs; what can I do to lessen the impact of a checkpoint?
4. I understand that a "background writer" is being contemplated for
7.5. Will that replace or augment the checkpoint process? Any
comments on how that work will apply to my problem would be
appreciated. I wouldn't mind seeing the average performance,
(without the spikes) go up -- let's say -- 10%, in exchange for
more uniform performance. These spikes are a real problem.
Jack Orenstein
----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.