Re: maintain_cluster_order_v5.patch - Mailing list pgsql-performance

From phb07@apra.asso.fr
Subject Re: maintain_cluster_order_v5.patch
Date
Msg-id 20091021175518.58A7C4B020E@smtp2-g21.free.fr
In response to maintain_cluster_order_v5.patch  ("phb07@apra.asso.fr" <phb07@apra.asso.fr>)
Responses Re: maintain_cluster_order_v5.patch  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-performance
Hi Jeff,

>> Hi all,
>>
>> The current discussion about "Indexes on low cardinality columns" led
>> me to discover this
>> "grouped index tuples" patch (http://community.enterprisedb.com/git/)
>> and its associated
>> "maintain cluster order" patch
>> (http://community.enterprisedb.com/git/maintain_cluster_order_v5.patch)
>>
>> This last patch seems to cover the TODO item named "Automatically
>> maintain clustering on a table".
>
>The TODO item isn't clear about whether the order should be strictly
>maintained, or whether it should just make an effort to keep the table
>mostly clustered. The patch mentioned above makes an effort, but does
>not guarantee cluster order.
>
You are right, there are two different visions: a strictly maintained order or a possibly maintained order.
The latter is already a good enhancement, as it greatly lengthens the time interval between two CLUSTER operations, in
particular if the FILLFACTOR is properly set. In terms of performance, having 99% of rows in the "right" page is not
really worse than having totally optimized storage.
The only benefit of a strictly maintained order is that there is no need for CLUSTER at all, which could be very
interesting for very large databases with a 24/7 availability constraint.
For our needs, the "possibly maintained order" is enough.

>> As this patch is not so new (2007), I would like to know why it has
>> not yet been integrated into a standard version of PG (not fully
>> finalized? not totally sure? not in line with the way the core
>> team would like to address this item?) and whether there is a good
>> chance of seeing it committed in the near future.
>
>Search the archives on -hackers for discussion. I don't think either of
>these features were rejected, but some of the work and benchmarking have
>not been completed.
OK, I will have a look.
>
>If you can help (either benchmark work or C coding), try reviving the
>features by testing them and merging them with the current tree.
OK, those are the rules of the game in such a community.
I am not a good C writer, but I will see what I can do.

>I recommend reading the discussion first, to see if there are any major
>problems.

>
>Personally, I'd like to see the GIT feature finished as well. When I
>have time, I was planning to take a look into it.
>
>Regards,
>    Jeff Davis


