Re: Cluster table and order information - Mailing list pgsql-general

From Andy Colson
Subject Re: Cluster table and order information
Date
Msg-id 4D483343.4010806@squeakycode.net
Whole thread Raw
In response to Cluster table and order information  (Dario Beraldi <dario.beraldi@ed.ac.uk>)
List pgsql-general
On 2/1/2011 10:17 AM, Dario Beraldi wrote:
> Quoting Andy Colson <andy@squeakycode.net>:
>
>> On 2/1/2011 9:08 AM, Dario Beraldi wrote:
>>> Hello,
>>>
>>> From the documentation of CLUSTER table
>>> (http://www.postgresql.org/docs/8.4/static/sql-cluster.html) I
>>> understand that clustering can be achieved by re-creating the table like
>>> this:
>>>
>>> CREATE TABLE newtable AS
>>> SELECT * FROM table ORDER BY columnlist;
>>>
>>> My question is: If I upload with COPY a datafile which is already
>>> correctly sorted, can I inform postgres of such order, so that no
>>> clustering is necessary after the import? In other words, how can I tell
>>> postgres that my file is order by this and that column?
>>>
>>> Many thanks!
>>>
>>> Dario
>>>
>>
>> The planner has no knowledge of cluster. Meaning PG will query a
>> clustered and unclustered table exactly the same way. A table is not
>> marked or anything as clustered. And in fact, during usage of a table
>> it'll become unclustered.
>>
>> Clustering is only useful when you are going to read multiple records
>> in the same order as an index. It turns "more random seeks" into "more
>> sequential reads".
>>
>> If your COPY loads data in indexed order, then just dont run the cluster.
>>
>> -Andy
>>
> Thanks very much Andy, this clarifies my doubts.
>
> I was misled by the docs saying "When a table is clustered, PostgreSQL
> remembers which index it was clustered by" which made me think that the
> order information is stored somewhere.
>
> All the best
> Dario
>

The next sentience clears it up:

The form "CLUSTER table_name" reclusters the table using the same index
as before.


-Andy

pgsql-general by date:

Previous
From: hlcborg
Date:
Subject: Problem with encode () and hmac() in pgcrypto
Next
From: Tom Lane
Date:
Subject: Re: Weird performance issue with custom function with a for loop.