Re: CLUSTER and clustered indices - Mailing list pgsql-hackers

From Jonah H. Harris
Subject Re: CLUSTER and clustered indices
Msg-id 36e682920511172045t272575d2od0eef16cc9cf7b12@mail.gmail.com
In response to Re: CLUSTER and clustered indices  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-hackers
I agree, keeping it clustered would be very nice.

On 11/17/05, Alvaro Herrera <alvherre@commandprompt.com> wrote:
Simon Riggs wrote:
> On Thu, 2005-11-17 at 10:58 -0500, Tom Lane wrote:
> > Simon Riggs <simon@2ndquadrant.com> writes:
>
> The use case exists and the technique is low overhead, but the main
> question is: Does anybody think this behaviour would be beneficial for
> them? (I'm actually in two minds myself, but once the idea has arisen,
> it seems sensible to discuss this for everybody's sake).

I have no use for it but I see it would be beneficial in some cases.

> The trade-off is a table that keeps growing in size, even though you
> VACUUM it, with the benefit that the clustering is maintained.
>
> So how would you maintain it? Looks like you'd still have to use regular
> CLUSTER commands, but at least it would stay good in between.

Yeah, this is a problem.  The growth is unbounded.  Even if there's a
completely empty page somewhere, it can't be used because all tuples
will go to the last page.  The problem with using CLUSTER for
maintenance is that it takes an exclusive lock on the table, which is a
thing we've been running away from.  You are right in that it's much
cheaper than CLUSTERing a table that isn't ordered, because there's much
more locality.  But I don't think it's a big enough win.
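
To make the growth argument concrete, here is a toy simulation. It is only a sketch under assumptions of my own: the page capacity, the `Heap` class, and the `run` helper are hypothetical, and nothing here reflects PostgreSQL's actual heap or free-space code. It compares "always insert on the last page" with "reuse any page that has room", using a monotonically increasing key and a sliding window of live rows.

```python
PAGE_CAPACITY = 4  # tuples per page; an illustrative value, not a PostgreSQL constant


class Heap:
    """Toy heap: a list of pages, each page a list of keys."""

    def __init__(self, reuse_free_space):
        self.pages = [[]]
        self.reuse_free_space = reuse_free_space

    def insert(self, key):
        if self.reuse_free_space:
            # Ordinary behaviour: take the first page that has room.
            for page in self.pages:
                if len(page) < PAGE_CAPACITY:
                    page.append(key)
                    return
        elif len(self.pages[-1]) < PAGE_CAPACITY:
            # Proposed behaviour: only the last page is ever considered.
            self.pages[-1].append(key)
            return
        # No usable room: extend the table by one page.
        self.pages.append([key])

    def delete_and_vacuum(self, key):
        # Removing the key frees its slot, but the page itself stays allocated.
        for page in self.pages:
            if key in page:
                page.remove(key)
                return

    def is_clustered(self):
        keys = [k for page in self.pages for k in page]
        return keys == sorted(keys)


def run(reuse_free_space, rounds=100, live_rows=8):
    """Insert increasing keys, deleting the oldest once live_rows is exceeded."""
    heap = Heap(reuse_free_space)
    live = []
    for key in range(rounds):
        heap.insert(key)
        live.append(key)
        if len(live) > live_rows:
            heap.delete_and_vacuum(live.pop(0))
    return len(heap.pages), heap.is_clustered()


if __name__ == "__main__":
    print("append-only:", run(reuse_free_space=False))
    print("reuse space:", run(reuse_free_space=True))
```

With only 8 live rows at any time, the append-only heap ends up with 25 pages and stays in key order, while the space-reusing heap stays at 3 pages but loses the ordering. That is the trade-off in miniature.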

Because of the drawbacks (unbounded growth being the most prominent one)
this would have to be an optional thing.  This means we would need an
additional system catalog column to record whether it's active or not.
And a user command to activate it.  So it's starting to be a more
invasive thing.  Not that these things matter a whole lot, but anyway.

Personally I'd prefer to see index-ordered heaps, where the heap is
itself an index, so the ordering is automatically kept.
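
As a rough illustration of what "the heap is itself an index" means, here is a minimal sketch in which rows live directly in key order. A sorted Python list stands in for the B-tree a real implementation would presumably use. The `IndexOrderedTable` class and its methods are hypothetical names for illustration only.

```python
import bisect


class IndexOrderedTable:
    """Toy table whose storage order is the key order (sorted list as a stand-in for a B-tree)."""

    def __init__(self):
        self._keys = []
        self._rows = []

    def insert(self, key, row):
        # The row goes where its key belongs, so no separate CLUSTER pass is needed.
        pos = bisect.bisect_right(self._keys, key)
        self._keys.insert(pos, key)
        self._rows.insert(pos, row)

    def delete(self, key):
        # Remove the first row with this key; the freed position closes up in place.
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            del self._keys[pos]
            del self._rows[pos]
            return True
        return False

    def range_scan(self, low, high):
        # Rows with low <= key <= high are physically adjacent.
        start = bisect.bisect_left(self._keys, low)
        end = bisect.bisect_right(self._keys, high)
        return list(zip(self._keys[start:end], self._rows[start:end]))


if __name__ == "__main__":
    table = IndexOrderedTable()
    for key, row in [(5, "e"), (1, "a"), (3, "c"), (2, "b")]:
        table.insert(key, row)
    table.delete(2)
    print(table.range_scan(1, 3))
```

Inserts arrive in arbitrary order here, yet a range scan always reads a contiguous slice, and deleted space is reclaimed where it sits rather than at the end of the table.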

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

