I am beginning to look at Postgres 8, and am particularly
interested in cost-based vacuum/analyze. I'm hoping someone
can shed some light on the behavior I am seeing.
Suppose there are three threads:
writer_thread
every 1/15 second do
BEGIN TRANSACTION
COPY table1 FROM stdin
...
COPY tableN FROM stdin
perform several UPDATEs, DELETEs and INSERTs
COMMIT
reader_thread
every 1/15 second do
BEGIN TRANSACTION
SELECT FROM table1 ...
...
SELECT FROM tableN ...
COMMIT
analyze_thread
every 5 minutes do
ANALYZE table1
...
ANALYZE tableN
Now, Postgres 8.0.3 out-of-the-box (all default configs) on a
particular piece of hardware runs the Postgres connection for
writer_thread at about 15% CPU (meaningless, I know, but for
comparison) and runs the Postgres connection for reader_thread
at about 30% CPU. Latency for reader_thread seeing updates
from writer_thread is well under 1/15s. Impact of
analyze_thread is negligible.
If I make the single configuration change of setting
vacuum_cost_delay=1000, each iteration in analyze_thread takes
much longer, of course. But what I also see is that the CPU
usage of the connections for writer_thread and reader_thread
spike up to well over 80% each (this is a dualie) and latency
drops to 8-10s, during the ANALYZEs.
I don't understand why this would be. I don't think there
are any lock issues, and I don't see any obvious I/O issues.
Am I missing something? Is there any way to get some
insight into what those connections are doing?
Thanks,
--Ian