Re: Re: Loading optimization - Mailing list pgsql-general

From kleptog@cupid.suninternet.com (Martijn van Oosterhout)
Subject Re: Re: Loading optimization
Date
Msg-id 20010110152024.B12844@cupid.suninternet.com
Whole thread Raw
In response to Re: Re: Loading optimization  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Re: Loading optimization  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-general
On Tue, Jan 09, 2001 at 10:51:35AM -0500, Bruce Momjian wrote:
> Well, clustering certainly speeds up index access to multiple heap
> values because duplicate values are all on the same heap page.  One
> thing that is missing is that there is no preference for index scans for
> clustered indexes.

Maybe that would be the simple way, just a flag. Alternatively, have VACUUM
ANALYZE estimate the "cohesiveness" of the data...

> Because the clustering is not permanent, but becomes unclustered as data
> is added/modified, there is no easy way to know if the clustering is
> still valid.

Well, in our case the table has over 1,000,000 rows and refer to items that
would appear on a bill. Since a bill is never changed after the fact, the
clustering is always in effect.

This table is a WORM table, once data is added, it is never updated. The
question is, is this typical of very large tables? If that is the case then
generally clustering would tend to stay rather than degrade.

Also, in our case, clustering by a single index is not really sufficient.
Within a single bill we would like to cluster the items by service. I was
thinking or writing a program that would do a pg_dump, order by columns as
requested and then dump it back in. I would've done it except that
Postgresql won't use the fact that it's sorted.

I for one am hoping for progress in this area. Unnessesary sequential scans
are painful when someone is waiting on the phone...

Martijn

pgsql-general by date:

Previous
From: Tatsuo Ishii
Date:
Subject: Re: Re: starting PGSQL automatically on Redhat 6.2
Next
From: Tom Lane
Date:
Subject: Re: Re: COPY error: pqReadData() -- backend closed the channel unexpectedly