Re: Vacuum questions... - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Vacuum questions...
Msg-id 1127944213.4860.5.camel@fuji.krosing.net
In response to Re: Vacuum questions...  ("Jim C. Nasby" <jnasby@pervasive.com>)
Responses Re: Vacuum questions...
List pgsql-hackers
On Tue, 2005-09-27 at 17:57 -0500, Jim C. Nasby wrote:
> On Tue, Sep 27, 2005 at 02:47:46PM -0400, Jan Wieck wrote:
> > On 9/24/2005 8:17 PM, Jim C. Nasby wrote:
> > 
> > >Would it be difficult to vacuum as part of a dump? The reasoning behind
> > >this is that you have to read the table to do the dump anyway, 
> > 
> > I think aside from what's been said so far, it would be rather difficult 
> > anyway. pg_dump relies on MVCC and requires to run in one transaction to 
> > see a consistent snapshot while vacuum jiggles around with transactions 
> > in some rather non-standard way.
> 
> Is this true even if they were in different connections?
> 
> My (vague) understanding of the vacuum process is that it first vacuums
> indexes, and then vacuums the heap. 

Actually, (lazy) vacuum does this:

1) scan heap, collect ctids of rows to remove
2) clean indexes
3) clean heap
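
The three steps above can be sketched as a toy model. This is only an illustration of the ordering, not PostgreSQL's actual code: the names HeapTuple and lazy_vacuum, the list-of-pages heap, and the dict-based index are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class HeapTuple:
    """Hypothetical stand-in for a heap row; 'dead' means no transaction can see it."""
    key: int
    dead: bool = False


def lazy_vacuum(heap, index):
    """Toy model of the three lazy-vacuum steps.

    heap  : list of pages, each page a list of HeapTuple or None (free slot)
    index : dict mapping key -> ctid, where a ctid is (page_no, slot_no)
    Returns the set of ctids that were removed.
    """
    # 1) scan heap, collect ctids of rows to remove
    dead_ctids = set()
    for page_no, page in enumerate(heap):
        for slot_no, tup in enumerate(page):
            if tup is not None and tup.dead:
                dead_ctids.add((page_no, slot_no))

    # 2) clean indexes: drop every entry that points at a dead ctid
    for key in [k for k, ctid in index.items() if ctid in dead_ctids]:
        del index[key]

    # 3) clean heap: second pass over the pages, freeing the dead slots
    for page_no, slot_no in dead_ctids:
        heap[page_no][slot_no] = None

    return dead_ctids
```

The point of the ordering is that the index entries are gone before the heap slots they point to are freed, so an index entry never refers to a freed slot. It also shows that the heap is visited twice, once before and once after the index cleanup.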

> Since we don't dump indexes, there's
> nothing for backup to do while those are vacuumed, so my idea is:
> 
> pg_dump:
> foreach (table)
>     spawn vacuum
>     wait for vacuum to hit heap
>     start copy
>     wait for analyze to finish
> next;

The first heap scan of vacuum would probably go faster than the dump, as
it does not have to write anything out. The second heap pass (step 3 in
the list above) could be either faster or slower than the dump, as it has
to lock each page and rearrange the tuples there.

So it would be very hard to keep the dump synchronized with either of
vacuum's heap passes.

-- 
Hannu Krosing <hannu@skype.net>
