Re: Major differences between oracle and postgres performance - what can I do ? - Mailing list pgsql-performance
From | Stephen Frost |
---|---|
Subject | Re: Major differences between oracle and postgres performance - what can I do ? |
Date | |
Msg-id | 20040618175317.GN11196@ns.snowman.net Whole thread Raw |
In response to | Re: Major differences between oracle and postgres performance - what can I do ? (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Major differences between oracle and postgres performance - what can I do ?
|
List | pgsql-performance |
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > * Tom Lane (tgl@sss.pgh.pa.us) wrote: > >> [... thinks for awhile ...] It seems possible that they may use sort > >> code that knows it is performing a DISTINCT operation and discards > >> duplicates on sight. Given that there are only 534 distinct values, > >> the sort would easily stay in memory if that were happening. > > > Could this optimization be added to PostgreSQL? It sounds like a very > > reasonable thing to do. > > That's what I was wondering about too. But first I'd like to get > some kind of reading on how effective it would be. If someone can > demonstrate that Oracle can do sort-and-drop-dups a lot faster than > it can do a straight sort of the same amount of input data, that > would be a strong indication that it's worth doing. At this point > we don't know if that's the source of their win or not. Alright, I did a couple tests, these are different systems with different hardware, but in the end I think the difference is clear: tsf=# explain analyze select distinct access_type_id from p_gen_dom_dedicated_swc_access ; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Unique (cost=321591.00..333205.56 rows=16 width=10) (actual time=32891.141..37420.429 rows=16 loops=1) -> Sort (cost=321591.00..327398.28 rows=2322912 width=10) (actual time=32891.137..35234.810 rows=2322912 loops=1) Sort Key: access_type_id -> Seq Scan on p_gen_dom_dedicated_swc_access (cost=0.00..55492.12 rows=2322912 width=10) (actual time=0.013..3743.470rows=2322912 loops=1) Total runtime: 37587.519 ms (5 rows) tsf=# explain analyze select access_type_id from p_gen_dom_dedicated_swc_access order by access_type_id; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=321591.00..327398.28 rows=2322912 width=10) (actual time=32926.696..35278.847 rows=2322912 loops=1) Sort Key: access_type_id -> Seq Scan on p_gen_dom_dedicated_swc_access (cost=0.00..55492.12 rows=2322912 width=10) (actual time=0.014..3753.443rows=2322912 loops=1) Total runtime: 36737.628 ms (4 rows) So, about the same from postgres in each case. From Oracle: (select access_type_id from p_gen_dom_dedicated_swc_access order by access_type_id) sauron:/home/sfrost> time sqlplus mci_vendor/mci @test.sql > /dev/null real 3m55.12s user 2m25.87s sys 0m10.59s (select distinct access_type_id from p_gen_dom_dedicated_swc_access) sauron:/home/sfrost> time sqlplus mci_vendor/mci @test.sql > /dev/null real 0m5.08s user 0m3.86s sys 0m0.95s All the queries were run multiple times, though there wasn't all that much difference in the times. Both systems are pretty speedy, but I tend to feel the Postgres box is faster in CPU/disk access time, which is probably why the Oracle system took 4 minutes to do what the Postgres systems does in 40 seconds. My only other concern is the Oracle system having to do the write I/O while the postgres one doesn't... I don't see an obvious way to get around that though, and I'm not sure if it'd really make *that* big of a difference. Stephen
Attachment
pgsql-performance by date: