Re: heap vacuum & cleanup locks - Mailing list pgsql-hackers

From: Jim Nasby
Subject: Re: heap vacuum & cleanup locks
Msg-id: EDB81868-4996-4D91-8CDF-1BAFA4FA42DC@nasby.net
In response to: Re: heap vacuum & cleanup locks (Robert Haas <robertmhaas@gmail.com>)
On Jun 6, 2011, at 1:00 AM, Robert Haas wrote:
> On Mon, Jun 6, 2011 at 12:19 AM, Itagaki Takahiro
> <itagaki.takahiro@gmail.com> wrote:
>> On Sun, Jun 5, 2011 at 12:03, Robert Haas <robertmhaas@gmail.com> wrote:
>>> If other buffer pins do exist, then we can't
>>> defragment the page, but that doesn't mean no useful work can be done:
>>> we can still mark used line pointers dead, or dead line pointers
>>> unused.  We cannot defragment, but that can be done either by the next
>>> VACUUM or by a HOT cleanup.
>>
>> This is just an idea -- Is it possible to have copy-on-write techniques?
>> VACUUM allocates a duplicated page for the pinned page, and copy valid
>> tuples into the new page. Following buffer readers after the VACUUM will
>> see the cloned page instead of the old pinned one.
>
> Heikki suggested the same thing, and it's not a bad idea, but I think
> it would be more work to implement than what I proposed.  The caller
> would need to be aware that, if it tries to re-acquire a content lock
> on the same page, the offset of the tuple within the page might
> change.  I'm not sure how much work would be required to cope with
> that possibility.

I've had a related idea that I haven't looked into... if you're scanning a relation (ie: index scan, seq scan) I've wondered if it would be more efficient to deal with the entire page at once, possibly by making a copy of it. This would reduce the number of times you pin the page (often quite dramatically). I realize that means copying the entire page, but I suspect that copy would occur entirely in the L1 cache, which would be fast.

So perhaps instead of copy on write we should try for copy on read on all appropriate plan nodes.

On a related note, I've also wondered if it would be useful to allow nodes to deal with more than one tuple at a time; the idea being that it's better to execute a smaller chunk of code over a bigger chunk of data instead of dribbling tuples through an entire execution tree one at a time. Perhaps that will only be useful if nodes are executing in parallel...
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net
