Re: Major differences between oracle and postgres performance - what can I do ? - Mailing list pgsql-performance

From Stephen Frost
Subject Re: Major differences between oracle and postgres performance - what can I do ?
Date
Msg-id 20040618175317.GN11196@ns.snowman.net
Whole thread Raw
In response to Re: Major differences between oracle and postgres performance - what can I do ?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Major differences between oracle and postgres performance - what can I do ?  (Stephen Frost <sfrost@snowman.net>)
List pgsql-performance
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> [... thinks for awhile ...]  It seems possible that they may use sort
> >> code that knows it is performing a DISTINCT operation and discards
> >> duplicates on sight.  Given that there are only 534 distinct values,
> >> the sort would easily stay in memory if that were happening.
>
> > Could this optimization be added to PostgreSQL?  It sounds like a very
> > reasonable thing to do.
>
> That's what I was wondering about too.  But first I'd like to get
> some kind of reading on how effective it would be.  If someone can
> demonstrate that Oracle can do sort-and-drop-dups a lot faster than
> it can do a straight sort of the same amount of input data, that
> would be a strong indication that it's worth doing.  At this point
> we don't know if that's the source of their win or not.

Alright, I did a couple tests, these are different systems with
different hardware, but in the end I think the difference is clear:

tsf=# explain analyze select distinct access_type_id from p_gen_dom_dedicated_swc_access ;
                                                                       QUERY PLAN
                                                               

--------------------------------------------------------------------------------------------------------------------------------------------------------
                              
 Unique  (cost=321591.00..333205.56 rows=16 width=10) (actual time=32891.141..37420.429 rows=16 loops=1)
   ->  Sort  (cost=321591.00..327398.28 rows=2322912 width=10) (actual time=32891.137..35234.810 rows=2322912 loops=1)
         Sort Key: access_type_id
         ->  Seq Scan on p_gen_dom_dedicated_swc_access  (cost=0.00..55492.12 rows=2322912 width=10) (actual
time=0.013..3743.470rows=2322912 loops=1) 
 Total runtime: 37587.519 ms
(5 rows)

tsf=# explain analyze select access_type_id from p_gen_dom_dedicated_swc_access order by access_type_id;
                                                                    QUERY PLAN
                          

--------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=321591.00..327398.28 rows=2322912 width=10) (actual time=32926.696..35278.847 rows=2322912 loops=1)
   Sort Key: access_type_id
   ->  Seq Scan on p_gen_dom_dedicated_swc_access  (cost=0.00..55492.12 rows=2322912 width=10) (actual
time=0.014..3753.443rows=2322912 loops=1) 
 Total runtime: 36737.628 ms
(4 rows)

So, about the same from postgres in each case.  From Oracle:

(select access_type_id from p_gen_dom_dedicated_swc_access order by access_type_id)
sauron:/home/sfrost> time sqlplus mci_vendor/mci @test.sql > /dev/null

real    3m55.12s
user    2m25.87s
sys     0m10.59s

(select distinct access_type_id from p_gen_dom_dedicated_swc_access)
sauron:/home/sfrost> time sqlplus mci_vendor/mci @test.sql > /dev/null

real    0m5.08s
user    0m3.86s
sys     0m0.95s

All the queries were run multiple times, though there wasn't all that
much difference in the times.  Both systems are pretty speedy, but I
tend to feel the Postgres box is faster in CPU/disk access time, which
is probably why the Oracle system took 4 minutes to do what the Postgres
systems does in 40 seconds.  My only other concern is the Oracle system
having to do the write I/O while the postgres one doesn't...  I don't
see an obvious way to get around that though, and I'm not sure if it'd
really make *that* big of a difference.

    Stephen

Attachment

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Major differences between oracle and postgres performance - what can I do ?
Next
From: "Scott Marlowe"
Date:
Subject: Re: [BULK] Problems with vacuum!