Re: Maintaining cluster order on insert - Mailing list pgsql-patches

From Heikki Linnakangas
Subject Re: Maintaining cluster order on insert
Date
Msg-id 464A33AB.1050401@enterprisedb.com
Whole thread Raw
In response to Re: Maintaining cluster order on insert  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Maintaining cluster order on insert
List pgsql-patches
Ah, thanks! I had forgotten about it as well.

Bruce Momjian wrote:
> [ Sorry I found this one only found recently.]
>
> Your patch has been added to the PostgreSQL unapplied patches list at:
>
>     http://momjian.postgresql.org/cgi-bin/pgpatches
>
> It will be applied as soon as one of the PostgreSQL committers reviews
> and approves it.
>
> ---------------------------------------------------------------------------
>
>
> Heikki Linnakangas wrote:
>> While thinking about index-organized-tables and similar ideas, it
>> occurred to me that there's some low-hanging-fruit: maintaining cluster
>> order on inserts by trying to place new heap tuples close to other
>> similar tuples. That involves asking the index am where on the heap the
>> new tuple should go, and trying to insert it there before using the FSM.
>> Using the new fillfactor parameter makes it more likely that there's
>> room on the page. We don't worry about the order within the page.
>>
>> The API I'm thinking of introduces a new optional index am function,
>> amsuggestblock (suggestions for a better name are welcome). It gets the
>> same parameters as aminsert, and returns the heap block number that
>> would be optimal place to put the new tuple. It's be called from
>> ExecInsert before inserting the heap tuple, and the suggestion is passed
>> on to heap_insert and RelationGetBufferForTuple.
>>
>> I wrote a little patch to implement this for btree, attached.
>>
>> This could be optimized by changing the existing aminsert API, because
>> as it is, an insert will have to descend the btree twice. Once in
>> amsuggestblock and then in aminsert. amsuggestblock could keep the right
>> index page pinned so aminsert could locate it quicker. But I wanted to
>> keep this simple for now. Another improvement might be to allow
>> amsuggestblock to return a list of suggestions, but that makes it more
>> expensive to insert if there isn't room in the suggested pages, since
>> heap_insert will have to try them all before giving up.
>>
>> Comments regarding the general idea or the patch? There should probably
>> be a index option to turn the feature on and off. You'll want to turn it
>> off when you first load a table, and turn it on after CLUSTER to keep it
>> clustered.
>>
>> Since there's been discussion on keeping the TODO list more up-to-date,
>> I hereby officially claim the "Automatically maintain clustering on a
>> table" TODO item :). Feel free to bombard me with requests for status
>> reports. And just to be clear, I'm not trying to sneak this into 8.2
>> anymore, this is 8.3 stuff.
>>
>> I won't be implementing a background daemon described on the TODO item,
>> since that would essentially be an online version of CLUSTER. Which sure
>> would be nice, but that's a different story.
>>
>> - Heikki
>>
>
>


--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

pgsql-patches by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Maintaining cluster order on insert
Next
From: "Jim C. Nasby"
Date:
Subject: Re: Automatic adjustment of bgwriter_lru_maxpages