Re: Major differences between oracle and postgres performance - Mailing list pgsql-performance

From Richard Huxton
Subject Re: Major differences between oracle and postgres performance
Date
Msg-id 40D2F47E.6060306@archonet.com
Whole thread Raw
In response to Re: Major differences between oracle and postgres performance - what can I do ?  (Gary Cowell <gary_cowell@yahoo.co.uk>)
List pgsql-performance
Gary Cowell wrote:
> --- lnd@hnit.is wrote: > You can roughly estimate time
> spent for just scaning
>
>>the table using
>>something like this:
>>
>>    select sum(version) from ... where version is not
>>null
>>
>>    and just
>>
>>    select sum(version) from ...
>>
>>The results would be interesting to compare.
>
>
> To answer (I hope) everyones questions at once:
>
> 1) Oracle and postmaster were not running at the same
> time
> 2) The queries were run once, to cache as much as
> possible then run again to get the timing
>
> 3) Distinct vs. no distinct (i.e. sort performance).
>
> select length(version) from vers where version is not
> null;
>
> Time: 9748.174 ms
>
> select distinct(version) from vers where version is
> not null;
>
> Time: 67988.972 ms
>
> So about an extra 60 seconds with the distinct on.

Which is basically the sorting time...

> Here is the explain analyze output from psql:
>
> # explain analyze select distinct version from vers
> where version is not null;
>
>      QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------
>  Unique  (cost=117865.77..120574.48 rows=142
> width=132) (actual time=63623.428..68269.111 rows=536
> loops=1)
>    ->  Sort  (cost=117865.77..119220.13 rows=541741
> width=132) (actual time=63623.417..66127.641
> rows=541741 loops=1)
>          Sort Key: "version"
>          ->  Seq Scan on vers  (cost=0.00..21367.41
> rows=541741 width=132) (actual time=0.218..7214.903
> rows=541741 loops=1)
>                Filter: ("version" IS NOT NULL)
>  Total runtime: 68324.215 ms
> (6 rows)
>
> Time: 68326.062 ms

Yep - the seq-scan takes 7214.903 ms, there's a huge setup time for the
sort (63623.417) and it's not finished until 66127.641ms have elapsed.

>
> And the non-default .conf parameters:
>
> tcpip_socket = true
> max_connections = 100
> password_encryption = true
> shared_buffers = 2000
> sort_mem = 16384
> vacuum_mem = 8192
> effective_cache_size = 4000
> syslog = 2

Well, I'd probably up vacuum_mem, and check how much RAM is being used
for disk cache - I'd guess it's more than 32MB (4000 * 8kb).

You might want to up the shared_buffers, but that's going to depend on
the load.

Try increasing sort_mem temporarily, and see if that makes a difference:
   SET sort_mem = 64000;
   EXPLAIN ANALYSE ...
The only thing I can think is that you're getting disk activity to get a
sort that slow. I'd be expecting a hash-sort if PG thought it could fit
the distinct values in memory.

--
   Richard Huxton
   Archonet Ltd

pgsql-performance by date:

Previous
From: "Domenico Sgarbossa"
Date:
Subject: Re: [BULK] Problems with vacuum!
Next
From: Richard Huxton
Date:
Subject: Re: memory allocation