Re: Proposal: Another attempt at vacuum improvements - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: Proposal: Another attempt at vacuum improvements
Date
Msg-id BANLkTin627onPsqETSARf9MPKYK7pfzSZQ@mail.gmail.com
In response to Re: Proposal: Another attempt at vacuum improvements  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Proposal: Another attempt at vacuum improvements  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, May 25, 2011 at 7:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, May 25, 2011 at 7:07 AM, Pavan Deolasee
> <pavan.deolasee@gmail.com> wrote:
>>> But instead of allocating permanent space in the page header, which would
>>> both reduce (admittedly only by 8 bytes) the amount of space available
>>> for tuples, and more significantly have the effect of breaking on-disk
>>> compatibility, I'm wondering if we could get by with making space for
>>> that extra LSN only when it's actually present. In other words, when
>>> it's present, we set a bit PD_HAS_DEAD_LINE_PTR_LSN or somesuch,
>>> increment pd_upper, and use the extra space to store the LSN.  There
>>> is an alignment problem to worry about there but that shouldn't be a
>>> huge issue.
>>
>> That might work but would require us to move tuples around when the first
>> dead line pointer gets generated in the page.
>
> I'm confused.  A major point of the approach I was proposing was to
> avoid having to move tuples around.
>

Well, I am not sure how you can always guarantee that space is
available for the LSN without moving tuples, irrespective of where
you store it.  But it's probably not important, as discussed below.

>> You may argue that we should
>> be holding a cleanup-lock when that happens and the dead line pointer
>> creation is always followed by a call to PageRepairFragmentation(), so it
>> should be easier to make space for the LSN.
>
> I'm not sure if this is the same thing you're saying, but certainly
> the only time we need to make space for this value is when we've just
> removed tuples from the page and defragmented, and at that point there
> should certainly be 8 bytes free somewhere.
>

Agree.

>> Instead of storing the LSN after the page header, would it be easier to set
>> pd_special and store the LSN at the end of the page ?
>
> I was proposing storing it after the line pointer array, not after the
> page header.  If we store it at the end of the page, I suspect we're
> going to basically end up allocating permanent space for it, because
> otherwise we'll have to shift all the tuple data forward and backward
> by 8 bytes when we allocate or deallocate space for this.  Now, maybe
> that's OK: I'm not sure.  But it's something to think about carefully.
>  If we are going to allocate permanent space, the special space seems
> better than the page header, because we should be able to make that
> work without breaking on-disk compatibility, and because AFAIUI we only need
> the space for heap pages, not index pages.
>

I think if we are reclaiming LP_DEAD line pointers only while
defragmenting the page, we can always reclaim the space for the LSN,
irrespective of where we store it. So maybe we should decide based
on what matters for on-disk compatibility and whatever requires the
least invasive changes. I don't know what that is yet.
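
To make the idea concrete, here is a rough sketch in C of reserving
space for the LSN only when it is needed. This is a toy model, not the
real bufpage.h code: ToyPage, the toy_page_* functions, the header size
and the flag value are all made up for illustration, and
PD_HAS_DEAD_LINE_PTR_LSN is only the name suggested upthread. I am
assuming the LSN sits right after the line pointer array, so the sketch
advances pd_lower, and it uses memcpy to sidestep the alignment problem
mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE                 8192
#define TOY_HEADER_SIZE               24      /* assumed header size */
#define TOY_LINE_POINTER_SIZE         4       /* assumed line pointer size */
#define TOY_LSN_SIZE                  8
#define TOY_PD_HAS_DEAD_LINE_PTR_LSN  0x0008  /* hypothetical flag bit */

/* Toy page: offsets index into data[], as on a real heap page. */
typedef struct ToyPage
{
    uint16_t pd_flags;
    uint16_t pd_lower;  /* end of line pointer array (plus optional LSN) */
    uint16_t pd_upper;  /* start of tuple data */
    uint8_t  data[TOY_PAGE_SIZE];
} ToyPage;

/* Build a freshly defragmented page: all free space is one hole. */
void
toy_page_init(ToyPage *page, uint16_t nline_pointers, uint16_t tuple_bytes)
{
    memset(page, 0, sizeof(*page));
    page->pd_lower = (uint16_t) (TOY_HEADER_SIZE +
                                 nline_pointers * TOY_LINE_POINTER_SIZE);
    page->pd_upper = (uint16_t) (TOY_PAGE_SIZE - tuple_bytes);
    assert(page->pd_lower <= page->pd_upper);
}

/*
 * Store the LSN after the line pointer array, claiming 8 bytes from the
 * hole if the slot does not exist yet.  Returns false if the hole is too
 * small; no tuple is ever moved.
 */
bool
toy_page_set_dead_lsn(ToyPage *page, uint64_t lsn)
{
    if (!(page->pd_flags & TOY_PD_HAS_DEAD_LINE_PTR_LSN))
    {
        if (page->pd_upper - page->pd_lower < TOY_LSN_SIZE)
            return false;
        page->pd_lower = (uint16_t) (page->pd_lower + TOY_LSN_SIZE);
        page->pd_flags |= TOY_PD_HAS_DEAD_LINE_PTR_LSN;
    }
    /* memcpy avoids assuming the slot is 8-byte aligned */
    memcpy(&page->data[page->pd_lower - TOY_LSN_SIZE], &lsn, sizeof(lsn));
    return true;
}

/* Read the LSN back; returns false if the page carries none. */
bool
toy_page_get_dead_lsn(const ToyPage *page, uint64_t *lsn)
{
    if (!(page->pd_flags & TOY_PD_HAS_DEAD_LINE_PTR_LSN))
        return false;
    memcpy(lsn, &page->data[page->pd_lower - TOY_LSN_SIZE], sizeof(*lsn));
    return true;
}

/* Give the 8 bytes back once the dead line pointers are reclaimed. */
void
toy_page_clear_dead_lsn(ToyPage *page)
{
    if (page->pd_flags & TOY_PD_HAS_DEAD_LINE_PTR_LSN)
    {
        page->pd_lower = (uint16_t) (page->pd_lower - TOY_LSN_SIZE);
        page->pd_flags &= (uint16_t) ~TOY_PD_HAS_DEAD_LINE_PTR_LSN;
    }
}
```

The sketch does not handle the line pointer array growing while the LSN
is present; a real patch would have to deal with that case somehow.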


>
>> If so, how do we handle the case where after restart the page may get LSN
>> less than the index vacuum LSN if the index vacuum happened before the
>> crash/stop ?
>
> Well, on a crash, the unlogged relations get truncated, and their
> indexes also, so no problem.  On a clean shutdown, I guess we need to
> arrange to save the counter across restarts.

Oh, OK. I was not aware that unlogged tables get truncated. I think we
can just restore the counter from the value stored for the last
successful index vacuum (perhaps after incrementing it). That should be
possible.
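
Roughly what I have in mind, as a sketch only: ToyFakeLsnState and the
toy_* functions below are hypothetical names, not existing PostgreSQL
code, and I am assuming the last index vacuum value is the one thing we
manage to save across a clean shutdown.

```c
#include <assert.h>
#include <stdint.h>

/* Toy counter for unlogged relations, which have no real WAL LSNs. */
typedef struct ToyFakeLsnState
{
    uint64_t next_lsn;               /* next value to hand out */
    uint64_t last_index_vacuum_lsn;  /* assumed saved at clean shutdown */
} ToyFakeLsnState;

/* Hand out a strictly increasing sequence of values. */
uint64_t
toy_fake_lsn_next(ToyFakeLsnState *state)
{
    assert(state->next_lsn < UINT64_MAX);
    return state->next_lsn++;
}

/* Stamp a successful index vacuum with the current counter value. */
void
toy_record_index_vacuum(ToyFakeLsnState *state)
{
    state->last_index_vacuum_lsn = toy_fake_lsn_next(state);
}

/*
 * After a clean restart, resume from the saved index vacuum value plus
 * one, so every value handed out from now on is greater than it.
 */
void
toy_restore_after_clean_restart(ToyFakeLsnState *state,
                                uint64_t saved_index_vacuum_lsn)
{
    assert(saved_index_vacuum_lsn < UINT64_MAX);
    state->last_index_vacuum_lsn = saved_index_vacuum_lsn;
    state->next_lsn = saved_index_vacuum_lsn + 1;
}
```

With this, a page stamped after the restart can never get a value less
than the saved index vacuum value.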

>
> Take a look at the existing logic around GetXLogRecPtrForTemp().
> That's slightly different, because there we don't even need to be
> consistent across backends.  We just need an increasing sequence of
> values.  For unlogged relations things are a bit more complex - but it
> seems manageable.

Ok. Will look at it.

Thanks,
Pavan


--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

