Re: Re: Loading optimization - Mailing list pgsql-general

From Ian Harding
Subject Re: Re: Loading optimization
Date
Msg-id 3A5D50CD.31310F0F@pakrat.com
In response to Loading optimization  (Gary Wesley <gary@db.stanford.edu>)
List pgsql-general
Tom Lane wrote:

> Martijn van Oosterhout <kleptog@cupid.suninternet.com> writes:
> > But does postgres actually use the fact that the data is clustered?
>
> The planner has no idea that the table is clustered, and will estimate
> indexscan costs on the assumption that the data is randomly ordered in
> the table.  So you're likely to get a seqscan plan for queries where
> indexscan would actually be faster.  This is something we need to fix,
> but the main problem is accounting for the fact that the clustered order
> will degrade over time as data is added/updated.  See past discussions
> in pghackers.
>
> The CLUSTER implementation is so shoddy at the moment that I'm hesitant
> to encourage people to use it anyway :-(.  We've got to rewrite it so
> that it doesn't drop other indexes, lose constraints, break foreign
> key and inheritance relationships, etc etc.
>
>                         regards, tom lane
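
The cost argument in the quoted text can be made concrete with a toy model. The sketch below is only an illustration under assumed numbers (100 rows per heap page, 10,000 rows, a range predicate selecting 5% of them); it is not how the PostgreSQL planner computes costs. It counts how many distinct heap pages an index range scan would have to visit when the physical row order matches the index order versus when it is random, which is the assumption the planner is described as making.

```python
import random

ROWS_PER_PAGE = 100  # assumed number of rows that fit on one heap page


def pages_touched(heap_order, lo, hi):
    """Count distinct heap pages holding rows whose key is in [lo, hi)."""
    pages = set()
    for position, key in enumerate(heap_order):
        if lo <= key < hi:
            pages.add(position // ROWS_PER_PAGE)
    return len(pages)


keys = list(range(10_000))

clustered = keys[:]                  # physical order matches index order
shuffled = keys[:]
random.Random(0).shuffle(shuffled)   # physical order unrelated to key order

lo, hi = 2_000, 2_500                # range predicate selecting 5% of the rows
clustered_pages = pages_touched(clustered, lo, hi)
shuffled_pages = pages_touched(shuffled, lo, hi)

print("clustered heap:", clustered_pages, "pages")
print("random heap:   ", shuffled_pages, "pages")
```

In the clustered case the 500 matching rows sit on 5 adjacent pages; in the shuffled case they are spread over most of the table's 100 pages, so a cost estimate that assumes random order overstates the work an index scan does on a freshly clustered table.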

Are the problems with CLUSTER limited to creating the clustering, or do
they also affect maintaining it?  If I cluster a table on an index before
I create any relationships, constraints, or other indexes (or load any
data, for that matter), am I going to be OK?  BTW, Microsoft recommends
creating the clustered index first, because creating one later causes all
other existing indexes to be dropped and recreated.  That makes sense:
rebuilding all your indexes might take some time, and they have to be
recreated since the data has moved, right?
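
To spell out the "data has moved" reasoning, here is a minimal sketch of why a secondary index goes stale when the table is physically reordered. It assumes a simplified model in which an index entry stores the (page, slot) location of its row; the helper names (build_index, fetch, cluster) and the sample rows are hypothetical, and neither PostgreSQL nor SQL Server is implemented this way in detail.

```python
def build_index(heap, column):
    """Map each value of `column` to the (page, slot) location of its row."""
    return {
        row[column]: (page_no, slot)
        for page_no, page in enumerate(heap)
        for slot, row in enumerate(page)
    }


def fetch(heap, location):
    """Follow a (page, slot) pointer into the heap."""
    page_no, slot = location
    return heap[page_no][slot]


def cluster(heap, column, rows_per_page):
    """Rewrite the heap with its rows physically sorted by `column`."""
    rows = sorted((row for page in heap for row in page), key=lambda r: r[column])
    return [rows[i:i + rows_per_page] for i in range(0, len(rows), rows_per_page)]


# Hypothetical two-page table, initially in arrival order.
heap = [
    [{"id": 3, "name": "c"}, {"id": 1, "name": "a"}],
    [{"id": 2, "name": "b"}, {"id": 4, "name": "d"}],
]

name_index = build_index(heap, "name")          # secondary index on name
before = fetch(heap, name_index["a"])           # finds the row with id 1

new_heap = cluster(heap, "id", rows_per_page=2)
stale = fetch(new_heap, name_index["a"])        # old pointer, wrong row now

rebuilt_index = build_index(new_heap, "name")   # rebuild after the move
after = fetch(new_heap, rebuilt_index["a"])     # correct row again

print(before["name"], stale["name"], after["name"])
```

After the reordering, the old pointer for "a" lands on a different row, so every index that stores physical locations has to be rebuilt against the new layout. That is the cost being avoided by clustering before the other indexes exist.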

Ian

