Re: Page at a time index scan - Mailing list pgsql-patches

From Simon Riggs
Subject Re: Page at a time index scan
Date
Msg-id 1146647678.449.42.camel@localhost.localdomain
In response to Re: Page at a time index scan  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Page at a time index scan
List pgsql-patches
On Tue, 2006-05-02 at 15:35 -0400, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
> > On Tue, 2 May 2006, Tom Lane wrote:
> >> Backwards scan may break this whole concept; are you sure you've thought
> >> it through?
>
> > I think so. The patch doesn't change the walk-left code. Do you have
> > something specific in mind?
>
> I'm worried about synchronization, particularly what happens if the page
> gets deleted from under you while you don't have it pinned.

Perhaps I should update my comments on "we don't need a pin at all"...

On a forward scan we need to hold the pin while we are reading a whole
page, but can release it afterwards. We don't need to keep the pin
while servicing btgetnext() requests from our private page buffer.
(Which is what I meant to say.)
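
To make the forward-scan case concrete, here is a toy model in Python. It is
only a sketch, not PostgreSQL code: Page, ForwardScan, pin_count and
get_next are hypothetical names, and the "pin" is just a counter. It shows
the pin being held only while a page is copied into the private buffer,
with later requests served from that copy.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Page:
    items: List[int]
    right: Optional[int] = None  # block number of the right sibling, if any
    pin_count: int = 0           # stand-in for a buffer pin (assumption)


class ForwardScan:
    """Toy forward scan: pin, copy the whole page, unpin, serve from the copy."""

    def __init__(self, pages: Dict[int, Page], start_block: Optional[int]):
        self.pages = pages
        self.next_block = start_block
        self.buffer: List[int] = []  # private page buffer
        self.pos = 0

    def _read_page(self) -> bool:
        if self.next_block is None:
            return False
        page = self.pages[self.next_block]
        page.pin_count += 1              # pin while reading the whole page
        try:
            self.buffer = list(page.items)
            self.pos = 0
            self.next_block = page.right
        finally:
            page.pin_count -= 1          # release the pin straight afterwards
        return True

    def get_next(self) -> Optional[int]:
        # Serve from the private buffer; touch a page only when it runs dry.
        while self.pos >= len(self.buffer):
            if not self._read_page():
                return None
        item = self.buffer[self.pos]
        self.pos += 1
        return item
```

Calling get_next() repeatedly walks the pages left to right, and no page
stays pinned between calls.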

AFAICS we will need to return to the page for a backward scan, so we
could just keep the pin the whole way. It's not possible to cache the
left page pointer, because a page split to our immediate left can update
it even after we have read the page contents. (A forward scan need never
fear page splits in the same way, because existing items can't move past
the existing page boundary.)
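
The stale left pointer can be illustrated with a toy model. This is a
sketch only, not PostgreSQL code: Page, split and stale_link_demo are
hypothetical names, and pages are plain objects in a dict keyed by block
number. A split moves the upper half of a page's keys to a new right
sibling, so a split of our left neighbour puts a new page immediately to
our left.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Page:
    keys: List[int]
    left: Optional[int] = None
    right: Optional[int] = None


def split(pages: Dict[int, Page], block: int, new_block: int) -> None:
    """Move the upper half of the keys to a new right sibling and relink."""
    old = pages[block]
    mid = len(old.keys) // 2
    new = Page(keys=old.keys[mid:], left=block, right=old.right)
    old.keys = old.keys[:mid]
    if old.right is not None:
        pages[old.right].left = new_block  # the old right sibling's left link changes
    old.right = new_block
    pages[new_block] = new


def stale_link_demo() -> Tuple[int, int, List[int]]:
    pages = {0: Page([1, 2, 3, 4], right=1), 1: Page([5, 6], left=0)}
    cached_left = pages[1].left   # backward scan reads page 1, remembers its left link
    split(pages, 0, 2)            # the left neighbour splits afterwards
    fresh_left = pages[1].left    # the link as it is on the page now
    missed = pages[fresh_left].keys  # keys a scan following the cached link would skip
    return cached_left, fresh_left, missed
```

In this model the cached link still points at block 0, while the page
itself now points at the new block, so following the cached value would
skip the keys that moved.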

We need never return to a page that *could* be deleted. While scanning
in either direction, if the complete page contains nothing but dead
items we can simply move straight on to the next page, having updated the
page status to half-dead. (The great thing about this patch is that we
should be able to report that somehow, so an asynchronous task handler
can come and clean that page (only), now that we don't have a restriction
on individual page vacuuming. We can think about how later.)
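
A minimal sketch of the all-dead case follows, under loud assumptions: Item,
Page, half_dead and live_keys_or_skip are hypothetical names, and the
"report" is modelled as a plain queue of block numbers, since the reporting
mechanism is left open above.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass
class Item:
    key: int
    dead: bool = False


@dataclass
class Page:
    block: int
    items: List[Item]
    half_dead: bool = False  # toy flag, not the real page status bits


def live_keys_or_skip(page: Page, cleanup_queue: Deque[int]) -> List[int]:
    """Return the live keys, or flag and report a page holding only dead items."""
    if page.items and all(item.dead for item in page.items):
        page.half_dead = True             # mark it so nobody needs to come back
        cleanup_queue.append(page.block)  # hypothetical report to a cleanup task
        return []                         # nothing to return; move to the next page
    return [item.key for item in page.items if not item.dead]
```

The scan never needs to revisit a page it has flagged this way; the queue
is the only thing that refers to it afterwards.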

If only some of the index tuples are deleted, we should only return to
the page to update the deleted index tuples *if*:
- the page is still in the buffer pool. If it's been evicted, it's because
space is tight, so we shouldn't call it back just to dirty the page.
- we have a minimum threshold of deleted tuples. Otherwise we might
re-dirty the page for just a single hint bit, so we end up writing the
page out hundreds of times. (Guess: that should be 2 or 3)
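
The two conditions above amount to a small predicate. This is a sketch with
hypothetical names (should_mark_dead_items, MIN_DEAD_TUPLES); the threshold
of 2 is just the guess from the list, not a measured value.

```python
MIN_DEAD_TUPLES = 2  # assumption: the "2 or 3" guess, taking the lower value


def should_mark_dead_items(page_in_buffer_pool: bool,
                           dead_count: int,
                           min_dead: int = MIN_DEAD_TUPLES) -> bool:
    """Decide whether returning to the page to set hint bits is worth it."""
    if not page_in_buffer_pool:
        # Evicted pages are not read back just to be dirtied.
        return False
    # Avoid re-dirtying the page for a single hint bit.
    return dead_count >= min_dead
```

For example, should_mark_dead_items(True, 1) declines to dirty the page,
while should_mark_dead_items(True, 3) allows it.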

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com

