Re: Re: Loading optimization - Mailing list pgsql-general

From Bruce Momjian
Subject Re: Re: Loading optimization
Date
Msg-id 200101091551.KAA07785@candle.pha.pa.us
Whole thread Raw
In response to Re: Re: Loading optimization  (Martijn van Oosterhout <kleptog@cupid.suninternet.com>)
Responses Re: Re: Loading optimization
List pgsql-general
> Ian Harding wrote:
> >
> > Gary Wesley wrote:
> >
> > > Is there any advantage to having data sorted before populating it into a
> > > table?
> > > (In 6.5)
> > >
> > > Gary Wesley
> >
> > Yes.  You have effectively loaded it with a clustered index.  If you
> > cluster an index on the sort column after loading it, the sort will remain
> > in effect and will speed queries/joins that use that column.
>
> But does postgres actually use the fact that the data is clustered? I
> keep thinking that here I could cluster all our data such that a
> sequential
> search is almost always a bad idea but I have no idea how to relate that
> fact to postgres...

Well, clustering certainly speeds up index access to multiple heap
values because duplicate values are all on the same heap page.  One
thing that is missing is that there is no preference for index scans for
clustered indexes.

Because the clustering is not permanent, but becomes unclustered as data
is added/modified, there is no easy way to know if the clustering is
still valid.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

pgsql-general by date:

Previous
From: Heiko Irrgang
Date:
Subject: trouble with db-restore
Next
From: Bruce Momjian
Date:
Subject: Re: ECPG could not connect to the database.