Re: autovacuum next steps, take 2 - Mailing list pgsql-hackers

From Matthew T. O'Connor
Subject Re: autovacuum next steps, take 2
Date
Msg-id 45DCCA75.7050908@zeut.net
In response to Re: autovacuum next steps, take 2  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: autovacuum next steps, take 2  ("Jim C. Nasby" <jim@nasby.net>)
List pgsql-hackers
Alvaro Herrera wrote:
> Ok, scratch that :-)  Another round of braindumping below.

I still think this is a solution in search of a problem.  The main problem 
we have right now is that hot tables can be starved of vacuum.  Most 
of this proposal doesn't touch that.  I would like to see that problem 
solved first; then we can talk about adding multiple workers per 
database or per tablespace, etc.

> (This idea can be complemented by having another GUC var,
> autovacuum_hot_workers, which allows the DBA to have more than one
> worker on hot tables (just for the case where there are too many hot
> tables).  This may be overkill.)

I think this is more along the lines of what we need first.

> Ron Mayer expressed the thought that we're complicating needlessly the
> UI for vacuum_delay, naptime, etc.  He proposes that instead of having
> cost_delay etc, we have a mbytes_per_second parameter of some sort.
> This strikes me a good idea, but I think we could make that after this
> proposal is implemented.  So this "take 2" could be implemented, and
> then we could switch the cost_delay stuff to using a MB/s kind of
> measurement somehow (he says waving his hands wildly).

Agreed, this is probably a good idea in the long run, and I also agree 
that it is lower on the priority list and should come after this proposal.
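
For illustration only, here is a rough sketch of what a MB/s-style throttle might compute.  The function name and its parameters are hypothetical; nothing like this exists in the current cost_delay code, and the sketch assumes the worker tracks how many bytes it has processed and how long it has been running.

```python
def throttle_sleep_seconds(bytes_done, elapsed_seconds, mbytes_per_second):
    """Return how long to sleep so the average rate stays at or below the target.

    Hypothetical helper: bytes_done and elapsed_seconds are assumed to be
    tracked by the vacuum worker since it started its current run.
    """
    target_bytes_per_second = mbytes_per_second * 1024 * 1024
    # Time the work *should* have taken at the target rate.
    required_elapsed = bytes_done / target_bytes_per_second
    # Sleep off the difference; never return a negative sleep.
    return max(0.0, required_elapsed - elapsed_seconds)
```

A worker that has processed 8 MB in one second against a 4 MB/s limit would sleep for one more second; a worker already under the limit would not sleep at all.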

> Greg Stark and Matthew O'Connor say that we're misdirected in having
> more than one worker per tablespace.  I say we're not :-)  If we
> consider Ron Mayer's idea of measuring MB/s, but we do it per
> tablespace, then we would inflict the correct amount of vacuum pain to
> each tablespace, sleeping as appropriate.  I think this would require
> workers of different databases to communicate what tablespaces they are
> using, so that all of them can utilize the correct amount of bandwidth.

I agree that in the long run it might be better to have multiple workers 
that are tablespace-aware and throttled in MB/s, but we don't have any of 
that infrastructure right now.  I think the piece of low-hanging fruit that 
your launcher concept can solve is the hot table starvation.

My proposal:  If we require admins to identify hot tables, then: 
1) Launcher fires off worker1 into database X.
2) worker1 deals with "hot" tables first, then regular tables.
3) Launcher continues to launch workers into DB X every autovacuum naptime. 
4) worker2 (or 3 or 4, etc.) checks whether it is alone in DB X; if so, it 
acts as worker1 did above.  If worker1 is still working in DB X, then worker2 
looks for hot tables that are being starved because worker1 got busy. 
If worker2 finds no hot tables that need work, then worker2 exits.

This seems like a very simple solution (given your launcher work) that can 
solve the starvation problem.
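
The per-worker decision in steps 2 and 4 can be sketched roughly as follows.  This is only a sketch under stated assumptions, not actual autovacuum code: the Table record, its hot / needs_vacuum / being_vacuumed flags, and the plan_worker function are all hypothetical names, and it assumes a worker can tell whether another worker is active in the same database and which tables are already being processed.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    hot: bool                     # assumption: flagged as hot by the admin
    needs_vacuum: bool            # assumption: some threshold check already done
    being_vacuumed: bool = False  # assumption: visible to other workers


def plan_worker(tables, other_worker_active):
    """Return the ordered table names this worker should vacuum.

    An empty list means the worker has nothing to do and should exit.
    """
    pending = [t for t in tables if t.needs_vacuum and not t.being_vacuumed]
    hot = [t.name for t in pending if t.hot]
    if other_worker_active:
        # Step 4: another worker is busy in this database, so only
        # pick up hot tables that are being starved.
        return hot
    # Steps 2 and 4 (alone in the database): hot tables first, then the rest.
    regular = [t.name for t in pending if not t.hot]
    return hot + regular
```

With one hot table and one regular table pending, a lone worker would take the hot table and then the regular one, while a second worker arriving in the same database would take only the hot table, or exit if no hot table needs work.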


Thoughts?

