Re: cost-based vacuum - Mailing list pgsql-performance

From Simon Riggs
Subject Re: cost-based vacuum
Date
Msg-id 1121081504.3970.53.camel@localhost.localdomain
Whole thread Raw
In response to cost-based vacuum  (Ian Westmacott <ianw@intellivid.com>)
Responses Re: cost-based vacuum
Re: cost-based vacuum
List pgsql-performance
On Fri, 2005-07-08 at 12:25 -0400, Ian Westmacott wrote:
> I am beginning to look at Postgres 8, and am particularly
> interested in cost-based vacuum/analyze.  I'm hoping someone
> can shed some light on the behavior I am seeing.
>
> Suppose there are three threads:
>
> writer_thread
>   every 1/15 second do
>     BEGIN TRANSACTION
>       COPY table1 FROM stdin
>       ...
>       COPY tableN FROM stdin
>       perform several UPDATEs, DELETEs and INSERTs
>     COMMIT
>
> reader_thread
>   every 1/15 second do
>     BEGIN TRANSACTION
>       SELECT FROM table1 ...
>       ...
>       SELECT FROM tableN ...
>     COMMIT
>
> analyze_thread
>   every 5 minutes do
>     ANALYZE table1
>     ...
>     ANALYZE tableN
>
>
> Now, Postgres 8.0.3 out-of-the-box (all default configs) on a
> particular piece of hardware runs the Postgres connection for
> writer_thread at about 15% CPU (meaningless, I know, but for
> comparison) and runs the Postgres connection for reader_thread
> at about 30% CPU.  Latency for reader_thread seeing updates
> from writer_thread is well under 1/15s.  Impact of
> analyze_thread is negligible.
>
> If I make the single configuration change of setting
> vacuum_cost_delay=1000, each iteration in analyze_thread takes
> much longer, of course.  But what I also see is that the CPU
> usage of the connections for writer_thread and reader_thread
> spike up to well over 80% each (this is a dualie) and latency
> drops to 8-10s, during the ANALYZEs.
>
> I don't understand why this would be.  I don't think there
> are any lock issues, and I don't see any obvious I/O issues.
> Am I missing something?  Is there any way to get some
> insight into what those connections are doing?

The ANALYZE commands hold read locks on the tables you wish to write to.
If you slow them down, you merely slow down your write transactions
also, and then the read transactions that wait behind them. Every time
the ANALYZE sleeps it wakes up the other transactions, which then
realise they can't move because of locks and then wake up the ANALYZEs
for another shot. The end result is that you introduce more context-
switching, without any chance of doing more useful work while the
ANALYZEs sleep.

Don't use the vacuum_cost_delay in this situation. You might try setting
it to 0 for the analyze_thread only.

Sounds like you could speed things up by splitting everything into two
sets of tables, with writer_thread1 and writer_thread2 etc. That way
your 2 CPUs would be able to independently be able to get through more
work without locking each other out.

Best Regards, Simon Riggs


pgsql-performance by date:

Previous
From: Christopher Browne
Date:
Subject: Re: Mount database on RAM disk?
Next
From: Alejandro Lemus
Date:
Subject: Question