Re: Our CLUSTER implementation is pessimal - Mailing list pgsql-hackers

From	Martijn van Oosterhout
Subject	Re: Our CLUSTER implementation is pessimal
Date	September 1, 2008 04:22:18
Msg-id	20080901072147.GB16993@svana.org
In response to	Our CLUSTER implementation is pessimal (Gregory Stark <stark@enterprisedb.com>)
Responses	Re: Our CLUSTER implementation is pessimal
List	pgsql-hackers

On Mon, Sep 01, 2008 at 12:25:26AM +0100, Gregory Stark wrote:
> The problem is that it does a full index scan and looks up each tuple in the
> order of the index. That means it a) is doing a lot of random i/o and b) has
> to access the same pages over and over again.

<snip>

> a) We need some way to decide *when* to do a sort and when to do an index
> scan. The planner has all this machinery but we don't really have all the
> pieces handy to use it in a utility statement. This is especially important
> for the case where we're doing a cluster operation on a table that's already
> clustered. In that case an index scan could conceivably actually win (though I
> kind of doubt it). I don't really have a solution for this.

The case I had recently was a table that was hugely bloated: 300MB of
data and only 110 live rows. A CLUSTER was instant; a seqscan/sort would
probably be much slower. A VACUUM FULL would probably be worse :)

Isn't there some compromise? Say, scanning the index to collect a few
thousand records and then sorting them by heap location, the way a
bitmap index scan does. That should be much more efficient than what we
have now.
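
To make the idea concrete, here is a rough sketch in Python rather than
backend C. It is only an illustration under assumptions: the name
batched_index_scan, the fetch_page callback, and the representation of
index entries as (key, (block, offset)) pairs are all made up for this
example and are not PostgreSQL APIs. It pulls a batch of entries from
the index, reads the heap pages for that batch in physical order (so
each page is read once per batch), and then hands the rows back in
index order.

```python
from itertools import islice


def batched_index_scan(index_entries, fetch_page, batch_size=1000):
    """Yield (key, row) pairs in index order, reading the heap in batches.

    index_entries: iterable of (key, (block, offset)), already in index order.
    fetch_page:    hypothetical callable mapping a block number to the list
                   of rows stored on that heap page.
    """
    it = iter(index_entries)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return

        # Visit the heap in physical (block, offset) order so that every
        # distinct page in this batch is read exactly once.
        rows = {}
        current_block, page = None, None
        for block, offset in sorted({tid for _, tid in batch}):
            if block != current_block:
                page = fetch_page(block)
                current_block = block
            rows[(block, offset)] = page[offset]

        # Emit the batch in the original index order.
        for key, tid in batch:
            yield key, rows[tid]


# Tiny demonstration: two heap pages whose rows are interleaved in key order.
heap = {0: ["c-row", "a-row"], 1: ["d-row", "b-row"]}
index = [("a", (0, 1)), ("b", (1, 1)), ("c", (0, 0)), ("d", (1, 0))]

reads = []


def fetch_page(block):
    reads.append(block)  # record every page read
    return heap[block]


result = list(batched_index_scan(index, fetch_page, batch_size=4))
```

With the whole toy index in one batch, each page is read once; with
batch_size=1 the same scan degenerates into the page-per-tuple access
pattern described above. How large a batch would need to be in practice
is a separate question this sketch does not answer.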

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
