Thread: CLUSTER and clustered indices

CLUSTER and clustered indices

From

Simon Riggs

Date:

17 November 2005, 11:25:33

When a table has been CLUSTERed on a particular index AND that index
values is monotonically increasing, then it would be a bad move to use
blocks from the FSM since this would tend to destroy the natural
clustering sequence.

The index values will be monotonically increasing if a datatype is
defined as SERIAL or if the default value is defined as the nextval of a
sequence.

Does anybody think it would be a good idea to not use the FSM if
- we have a CLUSTER defined on an index
- for the indexed column we have default value set of nextval()

Best Regards, Simon Riggs

Re: CLUSTER and clustered indices

From

Tom Lane

Date:

17 November 2005, 11:58:30

Simon Riggs <simon@2ndquadrant.com> writes:
> When a table has been CLUSTERed on a particular index AND that index
> values is monotonically increasing, then it would be a bad move to use
> blocks from the FSM since this would tend to destroy the natural
> clustering sequence.

By the time there are any blocks in FSM to take, the original clean
index page sequence is doubtless history.  The pure-increasing-key
scenario you are thinking of never will have any FSM entries, so it's
moot.
        regards, tom lane

Re: CLUSTER and clustered indices

From

Alvaro Herrera

Date:

17 November 2005, 12:02:12

Simon Riggs wrote:
> When a table has been CLUSTERed on a particular index AND that index
> values is monotonically increasing, then it would be a bad move to use
> blocks from the FSM since this would tend to destroy the natural
> clustering sequence.
> 
> The index values will be monotonically increasing if a datatype is
> defined as SERIAL or if the default value is defined as the nextval of a
> sequence.
> 
> Does anybody think it would be a good idea to not use the FSM if
> - we have a CLUSTER defined on an index
> - for the indexed column we have default value set of nextval()

That's a nice idea, but what's the cost?  You will have to check every
insert: does the table has indexes?  Is any of them clustered?  Is the
clustered index attached to a sequence?  It seems quite an expensive
check to be making.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: CLUSTER and clustered indices

From

Simon Riggs

Date:

17 November 2005, 17:48:48

On Thu, 2005-11-17 at 10:58 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > When a table has been CLUSTERed on a particular index AND that index
> > values is monotonically increasing, then it would be a bad move to use
> > blocks from the FSM since this would tend to destroy the natural
> > clustering sequence.
> 
> By the time there are any blocks in FSM to take, the original clean
> index page sequence is doubtless history.  The pure-increasing-key
> scenario you are thinking of never will have any FSM entries, so it's
> moot.

Well, just because its a pure-increasing-key doesn't imply rows don't
get deleted. Look at the NewOrder table in DBT-2/TPC-C. Order numbers
increase, but mostly they get delivered in the end. 

In the case I describe, tuples are added always to the rightmost edge of
the index, so it seems worthwhile to always add tuples to the top of the
heap only so that the order is consistent.

On Thu, 2005-11-17 at 12:45 -0300, Alvaro Herrera wrote:

> That's a nice idea, but what's the cost?  You will have to check every
> insert: does the table has indexes?  Is any of them clustered?  Is the
> clustered index attached to a sequence?  It seems quite an expensive
> check to be making.

I'd make the check when the relcache is loaded and store it there as a
boolean. heap_insert already has a boolean on it for use_fsm, so it
would transfer very simply at insert time with almost no overhead.

The use case exists and the technique is low overhead, but the main
question is: Does anybody think this behaviour would be beneficial for
them? (I'm actually in two minds myself, but once the idea has arisen,
it seems sensible to discuss this for everybody's sake).

The trade-off is a table that keeps growing in size, even though you
VACUUM it, with the benefit that the clustering is maintained. 

So how would you maintain it? Looks like you'd still have to use regular
CLUSTER commands, but at least it would stay good in between.

Best Regards, Simon Riggs

Re: CLUSTER and clustered indices

From

Alvaro Herrera

Date:

17 November 2005, 21:52:32

Simon Riggs wrote:
> On Thu, 2005-11-17 at 10:58 -0500, Tom Lane wrote:
> > Simon Riggs <simon@2ndquadrant.com> writes:
>
> The use case exists and the technique is low overhead, but the main
> question is: Does anybody think this behaviour would be beneficial for
> them? (I'm actually in two minds myself, but once the idea has arisen,
> it seems sensible to discuss this for everybody's sake).

I have no use for it but I see it would be beneficial in some cases.

> The trade-off is a table that keeps growing in size, even though you
> VACUUM it, with the benefit that the clustering is maintained. 
> 
> So how would you maintain it? Looks like you'd still have to use regular
> CLUSTER commands, but at least it would stay good in between.

Yeah, this is a problem.  The growth is unbounded.  Even if there's a
completely empty page somewhere, it can't be used because all tuples
will go to the last page.  The problem with using CLUSTER for
maintenance is that it takes an exclusive lock on the table, which is a
thing we've been running away from.  You are right in that it's much
cheaper than CLUSTERing a table that isn't ordered, because there's much
more locality.  But I don't think it's a big enough win.

Because of the drawbacks (unbounded growth being the most prominent one)
this would have to be an optional thing.  This means we would need an
additional system catalog column to keep whether it's active or not.
And a user command to activate it.  So it's starting to be a more
invasive thing.  Not that these things matter a whole lot, but anyway.

Personally I'd prefer to see index-ordered heaps, where the heap is
itself an index, so the ordering it automatically kept.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: CLUSTER and clustered indices

From

"Jonah H. Harris"

Date:

18 November 2005, 01:02:19

I agree, keeping it clustered would be very nice.

On 11/17/05, Alvaro Herrera <alvherre@commandprompt.com> wrote:

Simon Riggs wrote:
> On Thu, 2005-11-17 at 10:58 -0500, Tom Lane wrote:
> > Simon Riggs < simon@2ndquadrant.com> writes:
>
> The use case exists and the technique is low overhead, but the main
> question is: Does anybody think this behaviour would be beneficial for
> them? (I'm actually in two minds myself, but once the idea has arisen,
> it seems sensible to discuss this for everybody's sake).

I have no use for it but I see it would be beneficial in some cases.

> The trade-off is a table that keeps growing in size, even though you
> VACUUM it, with the benefit that the clustering is maintained.
>
> So how would you maintain it? Looks like you'd still have to use regular
> CLUSTER commands, but at least it would stay good in between.

Yeah, this is a problem.  The growth is unbounded.  Even if there's a
completely empty page somewhere, it can't be used because all tuples
will go to the last page.  The problem with using CLUSTER for
maintenance is that it takes an exclusive lock on the table, which is a
thing we've been running away from.  You are right in that it's much
cheaper than CLUSTERing a table that isn't ordered, because there's much
more locality.  But I don't think it's a big enough win.

Because of the drawbacks (unbounded growth being the most prominent one)
this would have to be an optional thing.  This means we would need an
additional system catalog column to keep whether it's active or not.
And a user command to activate it.  So it's starting to be a more
invasive thing.  Not that these things matter a whole lot, but anyway.

Personally I'd prefer to see index-ordered heaps, where the heap is
itself an index, so the ordering it automatically kept.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

Re: CLUSTER and clustered indices

From

Simon Riggs

Date:

18 November 2005, 04:30:19

On Thu, 2005-11-17 at 21:57 -0300, Alvaro Herrera wrote:

> Personally I'd prefer to see index-ordered heaps, where the heap is
> itself an index, so the ordering it automatically kept.

Agreed. (I think thats case-closed on the previous proposal.)

As an aside, Index Organized Tables (IOTs) isn't just an Oracle term. 
They first used the term, but the concept had already been implemented
in both Tandem (value-ordered) and Teradata (hash-ordered) before this,
as well as numerous OLAP systems. The concept doesn't look to be
patented.

If anybody is looking for a justification for IOTs, the reduction in
table volume for large tables is very high. IOTs are the equivalent of
removing all of the leaf blocks of the clustered index.

Best Regards, Simon Riggs

Re: CLUSTER and clustered indices

From

"Kevin Grittner"

Date:

18 November 2005, 15:35:57

That sounds very much like a CLUSTERED INDEX under Sybase ASE
(or the derivative Microsoft SQL Server).  In those products, when you
create a clustered index, the data pages are sorted according to the
index sequence, and are used as the leaf pages in the index.  A
clustered index does not have another leaf level.

>>> Simon Riggs <simon@2ndquadrant.com>  >>>

As an aside, Index Organized Tables (IOTs) isn't just an Oracle term. 
They first used the term, but the concept had already been implemented
in both Tandem (value-ordered) and Teradata (hash-ordered) before this,
as well as numerous OLAP systems. The concept doesn't look to be
patented.

If anybody is looking for a justification for IOTs, the reduction in
table volume for large tables is very high. IOTs are the equivalent of
removing all of the leaf blocks of the clustered index.

Re: CLUSTER and clustered indices

From

"Jim C. Nasby"

Date:

21 November 2005, 19:17:54

+1, and I know Sybase had this in 11.0.3, which IIRC is over 10 years
old now.

BTW,
http://archives.postgresql.org/pgsql-performance/2004-08/msg00492.php is
one discussion about this from the past. I seem to recall that there was
an objection to true Index Organized Tables because it would be too
dificult to make that work with MVCC. If that's the case then what I
laid out in that email might get some of the benefit without the
difficulty. But hopefully it's easy to just store heap values in the
leaf nodes of an index.

FWIW, I know that Sybase required that an IOT be clustered on a unique
index. I think Oracle has the same requirement as well.

On Fri, Nov 18, 2005 at 08:30:14AM +0000, Simon Riggs wrote:
> On Thu, 2005-11-17 at 21:57 -0300, Alvaro Herrera wrote:
> 
> > Personally I'd prefer to see index-ordered heaps, where the heap is
> > itself an index, so the ordering it automatically kept.
> 
> Agreed. (I think thats case-closed on the previous proposal.)
> 
> As an aside, Index Organized Tables (IOTs) isn't just an Oracle term. 
> They first used the term, but the concept had already been implemented
> in both Tandem (value-ordered) and Teradata (hash-ordered) before this,
> as well as numerous OLAP systems. The concept doesn't look to be
> patented.
> 
> If anybody is looking for a justification for IOTs, the reduction in
> table volume for large tables is very high. IOTs are the equivalent of
> removing all of the leaf blocks of the clustered index.
> 
> Best Regards, Simon Riggs
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
> 

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461