Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: vacuum, performance, and MVCC
Date	June 23, 2006 14:09:24
Msg-id	16709.1151082529@sss.pgh.pa.us
In response to	Re: vacuum, performance, and MVCC (Csaba Nagy <nagy@ecircle-ag.com>)
Responses	Re: vacuum, performance, and MVCC Re: vacuum, performance, and MVCC Re: vacuum, performance, and MVCC
List	pgsql-hackers

Csaba Nagy <nagy@ecircle-ag.com> writes:
>> Surprisingly, it's mostly WAL traffic; the heap/index pages themselves are
>> often not yet synced to disk by the time of vacuum, so no additional traffic
>> there. If you make 5 updates per page and then vacuum it, you effectively
>> make 1 extra WAL write, meaning a 20% increase in WAL traffic.

> Does this also hold for read traffic? I thought vacuum does a
> full table scan... for big tables a full table scan always hurts
> the performance of the box. If the full table scan could be
> avoided, then I wouldn't mind running vacuum in a loop...

If you're doing heavy updates of a big table then vacuum is likely to end up
visiting most of the table anyway, no?  There is talk of keeping a map
of dirty pages, but I think it'd be a win for infrequently-updated
tables, not ones that need constant vacuuming.

I think a lot of our problems in this area could be solved with fairly
straightforward tuning efforts on the existing autovacuum
infrastructure.  In particular, someone should be looking into
recommendable default vacuum-cost-delay settings so that a background
vacuum doesn't affect performance too much.  Another problem with the
current autovac infrastructure is that it doesn't respond very well to
the case where there are individual tables that need constant attention
as well as many that don't.  If you have N databases then you can visit
a particular table at most once every N*autovacuum_naptime seconds, and
*every* table in the entire cluster gets reconsidered at that same rate.
I'm not sure if we need the ability to have multiple autovac daemons
running at the same time, but we definitely could use something with a
more flexible table-visiting pattern.  Perhaps it would be enough to
look through the per-table stats for each database before selecting the
database to autovacuum in each cycle, instead of going by "least
recently autovacuumed".
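
As a rough illustration of the two points above, the sketch below first
computes the N*autovacuum_naptime revisit bound and then picks the next
database from per-table stats rather than by "least recently
autovacuumed".  Everything in it is hypothetical: the TableStats record,
the dead-tuple-fraction score, and the function names are assumptions
made for this example, not existing autovacuum code, and the numbers in
the usage lines are made up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TableStats:
    """Hypothetical per-table counters, standing in for the stats collector."""
    name: str
    live_tuples: int
    dead_tuples: int


def worst_case_revisit_seconds(n_databases: int, naptime_seconds: int) -> int:
    """One database is processed per naptime, so a given table is
    reached at most once every N * naptime seconds."""
    return n_databases * naptime_seconds


def urgency(tables: List[TableStats]) -> float:
    """Assumed score: the largest dead-tuple fraction among a database's tables."""
    return max(
        (t.dead_tuples / max(t.live_tuples + t.dead_tuples, 1) for t in tables),
        default=0.0,
    )


def pick_database(stats_by_db: Dict[str, List[TableStats]]) -> str:
    """Choose the database whose neediest table scores highest."""
    return max(stats_by_db, key=lambda db: urgency(stats_by_db[db]))


# Illustrative values only.
interval = worst_case_revisit_seconds(n_databases=10, naptime_seconds=60)
stats = {
    "archive": [TableStats("history", live_tuples=1_000_000, dead_tuples=500)],
    "queue": [TableStats("jobs", live_tuples=2_000, dead_tuples=6_000)],
}
chosen = pick_database(stats)
```

With these made-up numbers, a table in any one of the ten databases is
revisited no more often than every 600 seconds, while the stats-based
choice goes straight to the database holding the heavily-updated table.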

Bottom line: there's still lots of low-hanging fruit.  Why are people
feeling that we need to abandon or massively complicate our basic
architecture to make progress?
        regards, tom lane
