Re: Performance of CLUSTER - Mailing list pgsql-performance

From Mark Thornton
Subject Re: Performance of CLUSTER
Date
Msg-id 4FD5FA6F.3080903@optrak.com
Whole thread Raw
In response to Performance of CLUSTER  (Mark Thornton <mthornton@optrak.com>)
Responses Re: Performance of CLUSTER  (Shaun Thomas <sthomas@optionshouse.com>)
List pgsql-performance
On 11/06/12 14:52, Shaun Thomas wrote:
> On 06/11/2012 08:46 AM, Mark Thornton wrote:
>
>> 500m --- though isn't clear if cluster uses maintenance memory or the
>> regular work memory. I could readily use a higher value for
>> maintenance_work_mem.
>
> For an operation like that, having a full GB wouldn't hurt. Though if
> you haven't already, you might think about pointing
I didn't think the process was using even the 500m it ought to have had
available, whereas creating an index did appear to use that much. Note
though that I didn't stay up all night watching it!

> your pgsql_tmp to /dev/shm for a while, even for just this operation.
>
> Then again, is your CPU at 100% during the entire operation?
No the CPU utilization is quite low. Most of the time is waiting for IO.

> If it's not fetching anything from disk or writing out much, reducing
> IO won't help. :) One deficiency we've had with CLUSTER is that it's a
> synchronous operation. It does each thing one after the other. So
> it'll organize the table contents, then it'll reindex each index
> (including the primary key) one after the other. If you have a lot of
> those, that can take a while, especially if any composite or complex
> indexes exist.
In this case there are only two indexes, the gist one and a primary key
(on a bigint value).

>
> You might actually be better off running parallel REINDEX commands on
> the table (I wrote a script for this because we have a 250M row table
> that each index takes 1-2.5 hours to build). You might also consider
> pg_reorg, which seems to handle some parts of a table rebuild a little
> better.
>
> That should give you an escalation pattern, though. :)

Thanks for your help,
Mark Thornton



pgsql-performance by date:

Previous
From: "Fitch, Britt"
Date:
Subject: postgres clustering interactions with pg_dump
Next
From: Shaun Thomas
Date:
Subject: Re: Performance of CLUSTER