Re: VACUUMs and WAL - Mailing list pgsql-hackers

From Gregory Stark
Subject Re: VACUUMs and WAL
Date
Msg-id 877i7tc4gn.fsf@oxford.xeocode.com
Whole thread Raw
In response to Re: VACUUMs and WAL  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
Simon Riggs <simon@2ndQuadrant.com> writes:

> On Tue, 2008-10-28 at 10:59 +0000, Gregory Stark wrote:
>> To do that he proposed we do:
>> 
>> 1. scan heap doing two things: a) remove any marked tuples if they were marked
>>    by a previous vacuum which committed and b) prune and mark any tuples we
>>    find are deletable for a future vacuum to remove.
>> 
>> 2. scan indexes and remove the tuples we marked in 1b.
>
> It's fairly hard to remove the second heap pass completely. 
>
> I think what I am suggesting is two heap passes, but writing WAL and
> dirtying blocks on only one of the passes.

How small a patch would it be? I guess you just need to disable all pruning in
the first pass and do it in the second patch? 

I would still rather see Pavan's optimizationo if we can do it cleanly, but if
it's not going to happen and this is trivial then sure, we may as well.

>
> The biggest I/O cost comes from the writes, not the reads, ISTM.

It's counter-intuitive but actually it's usually the other way around. Writes
can be buffered, re-ordered, and scheduled during otherwise idle time. Reads
however are always blocking.

However in this situation I think you may be right. Vacuum is doing a
sequential scan through the table so if the OS has to interrupt that scan to
go do some writes it'll end up having to go back and forth. That would be a
*lot* slower than just doing a sequential scan.



--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!


pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Proposal of PITR performance improvement for 8.4.
Next
From: Hannu Krosing
Date:
Subject: Re: VACUUMs and WAL