Thread: Buffering GiST leaf pages too

Buffering GiST leaf pages too

From
Heikki Linnakangas
Date:
While looking at Alexander's GiST fastbuild patch, which adds buffers to 
internal nodes to avoid random I/O during index build, it occurred to me 
that inserting the tuples to the leaf pages one at a time is quite 
inefficient too, even if the leaf pages are in cache. There's still the 
overhead of locking and WAL-logging each insertion separately. I think 
we could get a nice further speedup if we attach a small buffer (one 
block or so) to every leaf page we're currently writing tuples to, and 
update the leaf page in bulk. Conveniently, the code to insert multiple 
tuples to a page already exists in GiST code (because inserting a tuple 
sometimes splits the page into more than two parts, so you need to 
insert multiple downlinks to the parent), so this requires no changes to 
the low-level routines and WAL-logging.

Let's finish off the main fastbuild patch first, but I wanted to get the 
idea out there.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Buffering GiST leaf pages too

From
Jim Nasby
Date:
On Aug 23, 2011, at 2:03 AM, Heikki Linnakangas wrote:
> While looking at Alexander's GiST fastbuild patch, which adds buffers to internal nodes to avoid random I/O during
indexbuild, it occurred to me that inserting the tuples to the leaf pages one at a time is quite inefficient too, even
ifthe leaf pages are in cache. There's still the overhead of locking and WAL-logging each insertion separately. I think
wecould get a nice further speedup if we attach a small buffer (one block or so) to every leaf page we're currently
writingtuples to, and update the leaf page in bulk. Conveniently, the code to insert multiple tuples to a page already
existsin GiST code (because inserting a tuple sometimes splits the page into more than two parts, so you need to insert
multipledownlinks to the parent), so this requires no changes to the low-level routines and WAL-logging. 
>
> Let's finish off the main fastbuild patch first, but I wanted to get the idea out there.

I've often wondered about the per-tuple overhead of all kinds of operations, not just GiST index builds. For example,
ifyou're doing a seqscan, ISTM it would be a lot more efficient to memcpy an entire page into backend-local memory and
operateoff of that lock-free. Similarly for an index scan, you'd want to copy a full leaf page if you think you'll be
hittingit more than once or twice. 
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net




Re: Buffering GiST leaf pages too

From
Heikki Linnakangas
Date:
On 26.08.2011 00:45, Jim Nasby wrote:
> I've often wondered about the per-tuple overhead of all kinds of operations, not just GiST index builds. For example,
ifyou're doing a seqscan, ISTM it would be a lot more efficient to memcpy an entire page into backend-local memory and
operateoff of that lock-free.
 

What we currently do is even better than that. We take the lock once, 
and hold it while we do all the visibility checks. Then the lock is 
released, but the page is kept pinned so that it doesn't get evicted 
from the buffer cache. No memcpy() required.

> Similarly for an index scan, you'd want to copy a full leaf page if you think you'll be hitting it more than once or
twice.

We more or less do that too already. When an index scan steps on a leaf 
page, it scans the page for all matches, and copies them to 
backend-local memory. The page lock is then released.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com