Re: Proposal: Another attempt at vacuum improvements - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: Proposal: Another attempt at vacuum improvements
Msg-id BANLkTiketuezHUzeVT9hqguaUZS4qnoeEQ@mail.gmail.com
In response to Re: Proposal: Another attempt at vacuum improvements  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Proposal: Another attempt at vacuum improvements  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, May 25, 2011 at 11:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, May 25, 2011 at 1:43 PM, Pavan Deolasee
> Now, there
>> is no way you can store it after the line pointer array without moving
>> the live tuple somewhere else.
>
> So far I agree.  But don't we always defragment immediately after
> pruning dead tuples to line pointers?  The removal of even one tuple
> will give us more than enough space to store the LSN.
>

Yes, we do. But defragmenting means shuffling tuples around. So we agree
that to find space for the LSN, we might need to move the tuples
around.

>
> Agreed.  The only thing I'm trying to do further is to avoid the need
> for a reshuffle when the special LSN storage is reclaimed.

Ah, OK. That was never clear from your initial emails, or maybe I
misread them. So what you are saying is that by storing the LSN after
the line pointer array, we might be able to reclaim the LSN storage
without shuffling. That makes sense. Having said that, it doesn't excite
me too much, because I think we should do the dead line pointer reclaim
operation during page pruning, when we are already holding the cleanup
lock and will most likely do a reshuffle anyway.

Also, a downside of storing the LSN after the line pointer array is that
you may waste space because of alignment issues. I also thought that the
LSN might get in the way of extending the line pointer array, but that's
probably not a big deal: if there is free space in the page (and there
should be if we are adding a new tuple), it should be available
immediately after the LSN.
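
To make the alignment concern concrete, here is a rough sketch of how much
padding could be needed in front of an LSN placed right after the line
pointer array. The sizes are assumptions for illustration only (a 24-byte
page header, 4-byte line pointers, and an 8-byte LSN that must start on an
8-byte boundary); the function names are hypothetical and not taken from
the PostgreSQL source.

```python
# Illustrative sizes only; these are assumptions, not values read from
# the PostgreSQL headers.
HEADER_SIZE = 24        # assumed page header size in bytes
LINE_POINTER_SIZE = 4   # assumed size of one line pointer in bytes
LSN_ALIGN = 8           # assumed alignment requirement for the LSN


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def lsn_placement(n_line_pointers):
    """Return (lsn_offset, padding_bytes) for an LSN stored right after
    a line pointer array with n_line_pointers entries."""
    end_of_array = HEADER_SIZE + n_line_pointers * LINE_POINTER_SIZE
    lsn_offset = align_up(end_of_array, LSN_ALIGN)
    return lsn_offset, lsn_offset - end_of_array


if __name__ == "__main__":
    for n in (4, 5):
        offset, padding = lsn_placement(n)
        print(f"{n} line pointers: LSN at byte {offset}, {padding} bytes of padding")
```

Under these assumptions, an even number of line pointers wastes nothing
and an odd number wastes 4 bytes, so the cost is small but not zero.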

There are some other issues that we should think about too, like
recording free space and managing the visibility map. The free space is
recorded in the second pass today, but I don't see any reason why that
can't be moved to the first pass. It's not clear, though, whether we
should also record free space after a retail page vacuum or leave it as
it is. For the visibility map, we should not mark a page all-visible as
long as there are LP_DEAD line pointers on the page. Now that's not good,
because all tuples in the page may be visible, so we may lose some
advantage, at least for a while. But if we mark the page all-visible,
the vacuum scan would not find the dead line pointers in it, and that
would leave dangling index pointers after an index vacuum.
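
The visibility map rule described above can be sketched as a simple
check. This is only a model of the idea, not code from the PostgreSQL
source: the enum mirrors the four line pointer states by name, and
can_mark_all_visible is a hypothetical helper.

```python
from enum import Enum


class LinePointerState(Enum):
    """Simplified model of the possible line pointer states."""
    UNUSED = 0
    NORMAL = 1
    REDIRECT = 2
    DEAD = 3


def can_mark_all_visible(line_pointers, all_tuples_visible):
    """Hypothetical check: a page may be marked all-visible only if every
    tuple on it is visible to all transactions AND no dead line pointer
    remains, so that a later vacuum scan cannot skip a page whose index
    entries still need to be removed."""
    has_dead = any(lp is LinePointerState.DEAD for lp in line_pointers)
    return all_tuples_visible and not has_dead


if __name__ == "__main__":
    page = [LinePointerState.NORMAL, LinePointerState.DEAD]
    # All remaining tuples are visible, but the dead line pointer blocks
    # the all-visible bit until it has been reclaimed.
    print(can_mark_all_visible(page, all_tuples_visible=True))
```

The second case in the usage example is the one that costs us something:
the page is effectively all-visible, but it has to wait for the dead line
pointer to be reclaimed.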

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

