Re: Sort performance on large tables - Mailing list pgsql-performance

From Christopher Kings-Lynne
Subject Re: Sort performance on large tables
Date
Msg-id 437155B3.3050507@familyhealth.com.au
In response to Re: Sort performance on large tables  (Charlie Savage <cfis@interserv.com>)
List pgsql-performance
I'd set up a trigger to maintain summary tables perhaps...

Chris
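
Chris's summary-table idea could look roughly like the sketch below. The table, function, and trigger names are invented, and it uses `INSERT ... ON CONFLICT`, which only appeared in PostgreSQL 9.5 (long after this thread); on the 8.x releases being discussed you would need a plpgsql upsert loop instead.

```sql
-- Hypothetical sketch: keep a per-tlid minimum in a small summary
-- table so the big GROUP BY never has to scan completechain.
CREATE TABLE completechain_summary (
    tlid        integer PRIMARY KEY,
    min_ogc_fid integer NOT NULL
);

CREATE OR REPLACE FUNCTION maintain_summary() RETURNS trigger AS $$
BEGIN
    INSERT INTO completechain_summary (tlid, min_ogc_fid)
    VALUES (NEW.tlid, NEW.ogc_fid)
    ON CONFLICT (tlid)
    DO UPDATE SET min_ogc_fid = LEAST(completechain_summary.min_ogc_fid,
                                      EXCLUDED.min_ogc_fid);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Fires on every insert; note that DELETEs would require recomputing
-- the minimum for the affected tlid, which this sketch does not handle.
CREATE TRIGGER completechain_summary_trg
    AFTER INSERT ON completechain
    FOR EACH ROW EXECUTE PROCEDURE maintain_summary();
```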


Charlie Savage wrote:
> Thanks everyone for the feedback.
>
> I tried increasing work_mem:
>
> set work_mem to 300000;
>
> select tlid, min(ogc_fid)
> from completechain
> group by tlid;
>
> The results are:
>
> "GroupAggregate  (cost=9041602.80..10003036.88 rows=48071704 width=8)
> (actual time=4371749.523..5106162.256 rows=47599910 loops=1)"
> "  ->  Sort  (cost=9041602.80..9161782.06 rows=48071704 width=8) (actual
> time=4371690.894..4758660.433 rows=48199165 loops=1)"
> "        Sort Key: tlid"
> "        ->  Seq Scan on completechain  (cost=0.00..2228584.04
> rows=48071704 width=8) (actual time=49.518..805234.970 rows=48199165
> loops=1)"
> "Total runtime: 5279988.127 ms"
>
> Thus the time decreased from 8486 seconds to 5279 seconds - which is a
> nice improvement.  However, that still leaves postgresql about 9 times
> slower.
>
> I tried increasing work_mem up to 500000, but at that point the machine
> started using its swap partition and performance degraded back to the
> original values.
>
> Charlie
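
For reference, work_mem in the releases discussed here is measured in kilobytes, so the values in the thread translate to roughly 16 MB (16384), 293 MB (300000), and 488 MB (500000) — and because every concurrent sort may claim that much, large values can push the machine into swap, as Charlie observed. A session-local setting keeps the large value confined to the one query that needs it:

```sql
-- work_mem here is in kilobytes: 300000 KB is roughly 293 MB.
-- SET only affects the current session, so other backends keep the
-- smaller server default and total memory use stays bounded.
SET work_mem = 300000;

SELECT tlid, min(ogc_fid)
FROM completechain
GROUP BY tlid;

RESET work_mem;  -- back to the server default for later queries
```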
>
>
> Richard Huxton wrote:
>  > Charlie Savage wrote:
>  >> Hi everyone,
>  >>
>  >> I have a question about the performance of sort.
>  >
>  >> Note it takes over 10 times longer to do the sort than the full
>  >> sequential scan.
>  >>
>  >> Should I expect results like this?  I realize that the computer is
>  >> quite low-end and is very IO bound for this query, but I'm still
>  >> surprised that the sort operation takes so long.
>  >
>  > The sort will be spilling to disk, which will grind your I/O to a halt.
>  >
>  >> work_mem =  16384                        # in Kb
>  >
>  > Try upping this. You should be able to issue "set work_mem = 100000"
>  > before running your query IIRC. That should let PG do its sorting in
>  > larger chunks.
>  >
>  > Also, if your most common access pattern is ordered via tlid look into
>  > clustering the table on that.
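
Richard's clustering suggestion might be sketched as follows. The index name is invented, and the `CLUSTER ... USING` form is modern syntax — the PostgreSQL 8.1 release contemporaneous with this thread spelled it `CLUSTER indexname ON tablename`.

```sql
-- Physically reorder the table by tlid so ordered access becomes
-- mostly sequential I/O.  CLUSTER takes an exclusive lock and
-- rewrites the table, and the ordering is not maintained for rows
-- inserted afterwards.
CREATE INDEX completechain_tlid_idx ON completechain (tlid);
CLUSTER completechain USING completechain_tlid_idx;
ANALYZE completechain;  -- refresh planner statistics after the rewrite
```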

