On Sun, 7 Jun 1998, Bruce Momjian wrote:
> >
> > Hi,
> >
> > I was trying to change the cluster command to do its writes in batches
> > of 100 tuples, hoping to improve performance. However, the code
> > I've written crashes. This certainly has to do with some internal state
> > of pgsql that isn't preserved in a HeapTuple.
> >
> > Could somebody who knows this code take a brief look at mine and perhaps
> > tell me how to do it properly?
>
> I did not look at the code, but I can pretty much tell you that bunching
> the writes will not help performance. We already do that pretty well
> with the cache.
>
> The problem with the cluster is the normal problem of using an index to
> seek into a data table, where the data is not clustered on the index.
> Every entry in the index requires a different page, and each has to be
> read in from disk.

My thinking was that the reading from the table is very scattered, but
that the writing to the new table could be done 'sequentially'. Therefore
I thought it would be interesting to see whether clustering the writes helps.

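
To put rough numbers on the scattered reads, here is a toy model in Python.
It has nothing to do with the actual backend code: the tuples-per-page figure
and the table size are made up, and it assumes only the most recently read
page stays in memory, which is far more pessimistic than the real cache.

```python
import random

TUPLES_PER_PAGE = 50   # assumed value, for illustration only
NUM_TUPLES = 10_000    # assumed table size

def page_reads(heap_positions):
    """Count page reads for a scan that visits heap positions in this order,
    assuming only the most recently read page is kept in memory."""
    reads = 0
    current_page = None
    for pos in heap_positions:
        page = pos // TUPLES_PER_PAGE
        if page != current_page:
            reads += 1
            current_page = page
    return reads

rng = random.Random(0)
clustered = list(range(NUM_TUPLES))   # index order matches heap order
scattered = clustered[:]
rng.shuffle(scattered)                # index order unrelated to heap order

clustered_reads = page_reads(clustered)
scattered_reads = page_reads(scattered)
print(clustered_reads, scattered_reads)
```

In this model the clustered scan reads each page once, while the scattered
scan reads a page for nearly every tuple. The writes to the new table are
sequential in both cases, so they are not where the time goes.
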
> Often the fastest way is to discard the index, and just read the table,
> sorting it in pieces, and merging them. That is what psort does,
> which is our sort code. That is why I recommend the SELECT INTO
> solution if you have enough disk space.

A 'select into ... order by ...' you mean?

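
If I read the 'sort in pieces and merge' idea right, it is roughly the
following. This is only a toy sketch in Python, not psort itself: the function
name and run size are made up, and the sorted runs are kept in memory where a
real sort would write them to temporary files.

```python
import heapq

def sort_in_pieces(rows, run_size, key=lambda r: r):
    """Sort rows by sorting fixed-size runs and then merging the runs.

    Hypothetical helper for illustration; run_size stands in for the
    amount of data that fits in sort memory at once.
    """
    runs = []
    for start in range(0, len(rows), run_size):
        # Each run is small enough to sort on its own.
        runs.append(sorted(rows[start:start + run_size], key=key))
    # A k-way merge reads every run strictly front to back.
    return list(heapq.merge(*runs, key=key))

rows = [(7, "g"), (2, "b"), (9, "i"), (1, "a"), (5, "e"), (3, "c"), (8, "h")]
ordered = sort_in_pieces(rows, run_size=3, key=lambda r: r[0])
print(ordered)
```

The point is that both the table scan and the merge touch their input
sequentially, so no index lookups are needed at all.
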
Maarten
_____________________________________________________________________________
| TU Delft, The Netherlands, Faculty of Information Technology and Systems |
| Department of Electrical Engineering |
| Computer Architecture and Digital Technique section |
| M.Boekhold@et.tudelft.nl |
-----------------------------------------------------------------------------