Re: Cost-Based Vacuum Delay tuning - Mailing list pgsql-performance

From Erik Jones
Subject Re: Cost-Based Vacuum Delay tuning
Date
Msg-id CBEABDE4-0420-4723-A884-0CC351EBBF02@myemma.com
Whole thread Raw
In response to Cost-Based Vacuum Delay tuning  (Guillaume Cottenceau <gc@mnc.ch>)
Responses Re: Cost-Based Vacuum Delay tuning
List pgsql-performance
On Dec 7, 2007, at 4:50 AM, Guillaume Cottenceau wrote:

> Hi,
>
> I'm currently trying to tune the Cost-Based Vacuum Delay in a
> 8.2.5 server. The aim is to reduce as much as possible the
> performance impact of vacuums on application queries, with the
> background idea of running autovacuum as much as possible[1].
>
> My test involves vacuuming a large table, and measuring the
> completion time, as the vacuuming proceeds, of a rather long
> running application query (involving a table different from the
> one being vacuumed) which cannot fit entirely in buffers (and the
> completion time of the vacuum, because it needs not be too slow,
> of course).
>
> I ran my tests with a few combinations of
> vacuum_cost_delay/vacuum_cost_limit, while keeping the other
> parameters set to the default from the 8.2.5 tarball:
>
> vacuum_cost_page_hit = 1
> vacuum_cost_page_miss = 10
> vacuum_cost_page_dirty = 20
>
> The completion time of the query is about 16 seconds in
> isolation. With a vacuuming proceeding, here are the results:
>
>   vacuum_cost_delay/vacuum_cost_limit  (deactivated)  20/200
> 40/200   100/1000   150/1000   200/1000   300/1000
>
>         VACUUM ANALYZE time                54 s        112 s    188
> s    109 s       152 s      190 s      274 s
>         SELECT time                        50 s         28 s     26
> s     24 s        22 s       20 s       19 s

While you do mention that the table you're running your select on is
too big to fit in the shared_buffers, the drop in time between the
first run and the rest most likely still reflects the fact that when
running those tests successively a good portion of the table will
already be in shared_buffers as well as being in the filesystem
cache, i.e. very little of the runs after the first will have to hit
the disk much.

> I have noticed that others (Alvaro, Joshua) suggest to set
> vacuum_cost_delay as low as 10 or 20 ms, however in my situation
> I'd like to lower the performance impact in application queries
> and will probably choose 150/1000 where "only" a +40% is seen -
> I'm curious if anyone else has followed the same path, or is
> there any outstanding flaw I've missed here? I'm talking
> outstanding, as of course any local decision may be different in
> the hope of favouring a different database/application behaviour.
>
>
> Other than that, it's the results obtained with the design
> principle of Cost-Base Vacuum Delay, which I find a little
> surprising. Of course, I think it has been thought through a lot,
> and my observations are probably naive, but I'm going to throw my
> ideas anyway, who knows.
>
> I'd think that it would be possible to lower yet again the impact
> of vacuuming on other queries, while keeping a vacuuming time
> with little overhead, if dynamically changing the delays related
> to database activity, rather than using fixed costs and delays.
> For example, before and after each vacuum sleep delay is
> completed, pg could:
>
> - check the amount of currently running queries
>   (pg_stat_activity), and continue sleeping if it is above a
>   configured threshold; by following this path, databases with
>   peak activities could use a threshold of 1 and have zero
>   ressource comsumption for vacuuming during peaks, still having
>   nearly no time completion overhead for vacuuming out of peaks
>   (since the check is performed also before the sleep delay,
>   which would be deactivated if no queries are running); if we
>   can afford a luxury implementation, we could always have a
>   maximum sleep time configuration, which would allow vacuuming
>   to proceed a little bit even when there's no timeframe with low
>   enough database activity
>
> - alternatively, pg could make use of some longer term statistics
>   (load average, IO statistics) to dynamically pause the
>   vacuuming - this I guess is related to the host OS and probably
>   more difficult to have working correctly with multiple disks
>   and/or processes running - however, if you want high
>   performance from PostgreSQL, you probably won't host other IO
>   applications on the same disk(s)

These ideas have been discussed much.  Look in the archives to the
beginning of this year.  I think the general consensus was that it
would be good have multiple autovacuum workers that could be tuned
for different times or workloads.  I know Alvarro was going to work
on something along those lines but I'm not sure what's made it into
8.3 or what's still definitely planned for the future.

> While I'm at it, a different Cost-Based Vacuum Delay issue:
> VACUUM FULL also follows the Cost-Based Vacuum Delay tunings.
> While it makes total sense when you want to perform a query on
> another table, it becomes a problem when your query is waiting
> for the exclusive lock on the vacuumed table. Potentially, you
> will have the vacuuming proceeding "slowly" because of the
> Cost-Based Vacuum Delay, and a blocked application because the
> application queries are just waiting.
>
> I'm wondering if it would not be possible to dynamically ignore
> (or lower, if it makes more sense?) the Cost-Based Vacuum Delay
> during vacuum full, if a configurable amount of queries are
> waiting for the lock?
>
> (please save yourself from answering "you should never run VACUUM
> FULL if you're vacuuming enough" - as long as VACUUM FULL is
> available in PostgreSQL, there's no reason to not make it as
> practically usable as possible, albeit with low dev priority)

Ok, I won't say what you said not to say.  But, I will say that I
don't agree with you're conjecture that VACUUM FULL should be made
more lightweight, it's like using dynamite to knock a whole in a wall
for a window.

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Trouble with LEFT JOIN using VIEWS.
Next
From: Guillaume Cottenceau
Date:
Subject: Re: Cost-Based Vacuum Delay tuning