Re: Page at a time index scan - Mailing list pgsql-patches
From | Heikki Linnakangas |
---|---|
Subject | Re: Page at a time index scan |
Date | |
Msg-id | Pine.OSF.4.61.0605021750180.463955@kosh.hut.fi Whole thread Raw |
In response to | Re: Page at a time index scan (Simon Riggs <simon@2ndquadrant.com>) |
Responses |
Re: Page at a time index scan
(Tom Lane <tgl@sss.pgh.pa.us>)
|
List | pgsql-patches |
On Tue, 2 May 2006, Simon Riggs wrote: > On Mon, 2006-05-01 at 22:00 +0300, Heikki Linnakangas wrote: >> >> 1. An index scan now needs to allocate enough memory to hold potentially a >> whole page worth of items. And if you use markpos/restrpos, twice that >> much. I don't know if that's an issue, but I thought I'd bring that up. > > AFAICS the code: > > - allocates memory for the markpos whether or not its ever needed? Most > index scans never call markpos and not all merge joins either, so that > seems wasteful. We could allocate when btmarkpos() is called for the > first time, if ever. Right. I'll do that. > - allocates 1024 offsets in every case. If this were just a unique index > retrieval, that would be too much. When scan->is_multiscan == true, go > straight for 1024, otherwise start with just 32 offsets and double that > when/if required. I wonder if that gets a bit too complicated for saving a small amount of memory? I'll do it if people think it's worth it. Also, could we calculate a better estimate of the maximum number of offsets an index page can hold? There's no way a page can really hold 1024 items. Page headers and special space, line pointers, and the actual keys need some space. >> However, btbulkdelete doesn't know if the vacuum is a full or >> lazy one. The patch just assumes it's a lazy vacuum, but the API really >> needs to be changed to pass that information earlier than at vacuum_cleanup. > > Looks like it needs work. Do you have suggestions while you're there? Now that I look at it: Why do we have a separate vacuum_cleanup function at all? Calls to index_vacuum_cleanup go hand in hand with calls to index_bulk_delete. I thought that index_vacuum_cleanup would only be called after the last cycle of a multi-cycle vacuum, but that doesn't seem to be the case. >> 3. Before the patch, a scan would keep the current page pinned to keep >> vacuum from deleting the current item. The patch doesn't change that >> behaviour, but it now seems to me that even a pin is no longer needed. > > Agreed. The pin has two functions: > - keep the page from being moved out of the bufmgr - no need anymore > - stop a vacuum from removing the page - no need anymore. We'll not stop > on a removable row anymore, so no need. At the moment, backward scan returns to the page to walk left from there. It could be changed to take a copy of the left sibling pointer like forward scans do, eliminating the need to return to the original page in most cases. Also, if there's dead tuples on the page, the scan will need to return to the page to kill them. But you can always re-pin the page when needed. > Some of the code doesn't use standard spacing e.g. "if(" should be "if > (", but other than that it looks very neat and well implemented. Thanks, I'll fix the spacings. - Heikki
pgsql-patches by date: