Thread: How PostgreSQL Compares for Query and Load Operations
Dear list,

With the advent of version 7.1.2 I thought it would be interesting to compare how PostgreSQL does a certain class of queries (star queries) and data loads with some of the other leading databases (which were, in my humble opinion, MySQL 3.23.39, DB2 7.2, and Oracle 9.0 [ok, possibly SQL Server 2000 too, but I don't run Win-anything]).

The results were overall very encouraging: PostgreSQL can clearly hold its own when compared to the "big guys". The full details (together with a wee rant) are available at:

http://homepages.slingshot.co.nz/~markir

(If anyone asks I can submit the entire results... but I figured, let's cut to the chase here....)

There were two areas where PostgreSQL was slower, and I thought it would be informative to discuss these briefly:

1 Star query scanning a significant portion of a fact table

SELECT
       d0.f1,
       count(f.val)
FROM   dim0 d0,
       fact1 f
WHERE  d0.d0key = f.d0key
AND    d0.f1 BETWEEN '2007-07-01' AND '2018-07-01'
GROUP BY d0.f1

This query requires summarizing a significant proportion of the 3,000,000-row (700 MB) fact1 table.

Postgres 7.1.2 executed this query like:

Aggregate  (cost=2732703.88..2738731.49 rows=120552 width=20)
  ->  Group  (cost=2732703.88..2735717.69 rows=1205521 width=20)
        ->  Sort  (cost=2732703.88..2732703.88 rows=1205521 width=20)
              ->  Hash Join  (cost=1967.52..2550188.93 rows=1205521 width=20)
                    ->  Seq Scan on fact1 f  (cost=0.00..1256604.00 rows=3000000 width=8)
                    ->  Hash  (cost=1957.47..1957.47 rows=4018 width=12)
                          ->  Index Scan using dim0_q1 on dim0 d0  (cost=0.00..1957.47 rows=4018 width=12)

for an elapsed time of 3m50s.

Whereas Oracle 9.0 used:

SELECT STATEMENT Optimizer=CHOOSE (Cost=5810 Card=4020 Bytes=60300)
  SORT (GROUP BY) (Cost=5810 Card=4020 Bytes=60300)
    HASH JOIN (Cost=5810 Card=2989644 Bytes=44844660)
      TABLE ACCESS (FULL) OF 'DIM0' (Cost=4 Card=4020 Bytes=40200)
      TABLE ACCESS (FULL) OF 'FACT1' (Cost=5806 Card=2990089 Bytes=14950445)

for an elapsed time of 50s.

It would seem that Oracle's execution plan is better.

2 Bulk loading data

Bulk load times for a 3,000,000-row (700 MB) fact table were:

PostgreSQL  9m30s  (copy)
DB2         2m15s  (load)
Oracle      5m     (sqlldr)
MySQL       2m20s  (load)

(Note that DB2 is faster than MySQL here!)

While I left "fsync = on" for this test... I still think the difference was worth noting.

Any comments on these sorts of results would be welcome.

regards

Mark
Can anyone confirm that Postgres 7.1 does not support outer joins? Are functions similar to classical stored procedures?

thanks

--
**************************************************
Eddie IANNUCCELLI - tel: 05 61 28 54 44
INRA, Laboratoire de Génétique Cellulaire
Chemin de Borde Rouge - Auzeville - BP27
31326 Castanet Tolosan
**************************************************
Sorry, I do not remember SQL92. Is it the JOIN ... OUTER JOIN syntax, or the natural *= or #= ?

thanks

Neil Conway wrote:
>> Can anyone confirm that Postgres 7.1 does not support outer joins?
>> Are functions similar to classical stored procedures?
>
> Postgres 7.1 DOES support outer joins -- I believe it uses SQL92 syntax.
>
> Cheers,
>
> Neil

--
**************************************************
Eddie IANNUCCELLI - tel: 05 61 28 54 44
INRA, Laboratoire de Génétique Cellulaire
Chemin de Borde Rouge - Auzeville - BP27
31326 Castanet Tolosan
**************************************************
eddie iannuccelli <eddie.iannuccelli@toulouse.inra.fr> writes:
> Can anyone confirm that Postgres 7.1 does not support outer joins?

What? It definitely *does* support outer joins.

> Are functions similar to classical stored procedures?

Depends on how loose your idea of "similar" is ... a function can't readily return a recordset at the moment. (This may be fixed in time for 7.2, though.)

regards, tom lane
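For illustration, here is a minimal SQL92-style outer join written against the tables from Mark's benchmark (a sketch only; the join condition is borrowed from his star query):

    SELECT d0.f1, f.val
    FROM dim0 d0 LEFT OUTER JOIN fact1 f ON d0.d0key = f.d0key;

Every dim0 row appears at least once; f.val comes back NULL where no fact1 row matches. This is the syntax 7.1 accepts, rather than the vendor-specific *= form asked about above.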
Mark kirkwood <markir@slingshot.co.nz> writes:
> Postgres 7.1.2 executed this query like:
> Aggregate  (cost=2732703.88..2738731.49 rows=120552 width=20)
>   ->  Group  (cost=2732703.88..2735717.69 rows=1205521 width=20)
>         ->  Sort  (cost=2732703.88..2732703.88 rows=1205521 width=20)
>               ->  Hash Join  (cost=1967.52..2550188.93 rows=1205521 width=20)
>                     ->  Seq Scan on fact1 f  (cost=0.00..1256604.00 rows=3000000 width=8)
>                     ->  Hash  (cost=1957.47..1957.47 rows=4018 width=12)
>                           ->  Index Scan using dim0_q1 on dim0 d0  (cost=0.00..1957.47 rows=4018 width=12)
> for an elapsed time of 3m50s.
> Whereas Oracle 9.0 used:
> SELECT STATEMENT Optimizer=CHOOSE (Cost=5810 Card=4020 Bytes=60300)
>   SORT (GROUP BY) (Cost=5810 Card=4020 Bytes=60300)
>     HASH JOIN (Cost=5810 Card=2989644 Bytes=44844660)
>       TABLE ACCESS (FULL) OF 'DIM0' (Cost=4 Card=4020 Bytes=40200)
>       TABLE ACCESS (FULL) OF 'FACT1' (Cost=5806 Card=2990089 Bytes=14950445)
> for an elapsed time of 50s.
> It would seem that Oracle's execution plan is better.

Hmm, since I don't know the details of Oracle's plan displays, it's hard to comment on that --- but it looks to me like the plans are essentially the same, with the small difference that Postgres chooses to use the index on dim0 to filter out the dim0 rows that don't meet

    d0.f1 BETWEEN '2007-07-01' AND '2018-07-01'

before they get loaded into the hash table, whereas Oracle is just scanning dim0 by brute force (and presumably evaluating the BETWEEN clause directly at each row). Given that that's the much smaller table, it's unlikely that that makes much difference.

I am assuming that the different order in which the hash join's inputs are listed is just an artifact of the listing format --- surely Oracle is also choosing to load the dim0 rows into an in-memory hash table and then scan fact1 to probe the hash table, rather than trying to load all of fact1 into memory.

My guess is that the Oracle plan is misleading where it says "SORT", and that they are actually doing no such thing as a sort. Given that they only expect 4000 distinct rows out of the grouping/aggregation steps, it'd make more sense to do the grouping and aggregation in one pass using another in-memory hash table (indexed by the GROUP BY key, with the contents being the count() accumulator).

Currently, Postgres only knows how to form groups via a true sort and then a "GROUP" pass (this works like the traditional "sort | uniq" Unix filtering method). But if you actually want to aggregate the groups, and there aren't too many of them, then you can form each aggregate in parallel in one unsorted pass over the input, keeping the aggregates' intermediate results as entries in a hash table. Hash-based aggregation is on our TODO list, and is fairly high priority in my eyes (though I doubt it'll be done for 7.2).

If Oracle really is doing a sort, it's hard to see where the speed difference came from --- unless you have set the tuning parameters such that Oracle does the sort all in memory whereas Postgres doesn't. Sorts that have to go to disk are lots slower.

Can anyone who actually knows how to read Oracle plans confirm or deny these speculations?

> Bulk load times for a 3,000,000-row (700 MB) fact table were:
> PostgreSQL  9m30s  (copy)
> DB2         2m15s  (load)
> Oracle      5m     (sqlldr)
> MySQL       2m20s  (load)

Hmm, I couldn't make out from your webpage exactly how you did the loading, or which steps are included in your timings. I see that you used COPY, which is good ... but did you create the indexes before or after COPY? What about the constraints? I also see a CLUSTER script --- was this used, and if so, where is its time counted?

regards, tom lane
Hey Mark, very interesting results! Thanks for taking the time to collect this info - it is really helpful!

Quick note: I don't know what your licensing arrangement is with Oracle, but from what I understand you may be in violation of those terms by publishing this data (maybe not - Ned from Great Bridge can answer this question better). If that is the case, I think publishing something like "xxxxxx 9.0" may be a simple resolution.

Great work! I'd be interested in seeing how some additional tuning would affect your pg results.

-Ryan Mahoney

At 10:22 PM 7/13/01 +1200, Mark kirkwood wrote:
> ....(original post quoted in full; snipped)
Tom,

Good point.... I notice I have set Oracle's pga_aggregate_target to 30M, which includes the sort space for a user process (oops), whereas every other db has about 2M (there are a few typos on the page... I forgot to update it from an earlier study). I will have to re-run the Oracle results with 2M (or re-run the rest with 30M...). I will update you.....

> If Oracle really is doing a sort, it's hard to see where the speed
> difference came from --- unless you have set the tuning parameters such
> that Oracle does the sort all in memory whereas Postgres doesn't. Sorts
> that have to go to disk are lots slower.
> Can anyone who actually knows how to read Oracle plans confirm or deny
> these speculations?

I will have a play with a clearer example for the star optimization business (the theory being, I believe, that for a star query with n small dimension tables and 1 big fact table, it is best to cartesian-product the dimensions, determine a set of keys, and access the fact table using these). My "trivial" example with 1 dimension does not illustrate this that well... (I have another with 2 dimension tables which should be better)... again, I will update you.

> Hmm, I couldn't make out from your webpage exactly how you did the
> loading, or which steps are included in your timings. I see that you
> used COPY, which is good ... but did you create the indexes before or
> after COPY? What about the constraints? I also see a CLUSTER script
> --- was this used, and if so, where is its time counted?

There are a few "optional" scripts in the tar - which I should have indicated :-( ... I do not do the cluster, primary or foreign keys at all (there were too many variations and options for constraints across the different databases)... so I just create the table, load via COPY, and then create the various indexes. The load timings are for the fact0 table with no indexes created.
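For concreteness, a minimal sketch of the sequence Mark describes (the column layout matches the fact0 definition he posts later in the thread; the data file path and the indexed columns are assumptions, not his actual scripts):

    CREATE TABLE fact0 (d0key int, d1key int, d2key int, val int, filler text);

    -- bulk load with no indexes in place; this is the step being timed
    COPY fact0 FROM '/tmp/fact0.dat';

    -- indexes are built only after the load completes
    CREATE INDEX fact0_pk ON fact0 (d0key, d1key, d2key);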
> If Oracle really is doing a sort, it's hard to see where the speed
> difference came from --- unless you have set the tuning parameters such
> that Oracle does the sort all in memory whereas Postgres doesn't. Sorts
> that have to go to disk are lots slower.

I redid the tests, ensuring everybody used a 10M sort area... nothing was significantly altered!! (Although Postgres moved in towards the big boys on the first 3 queries, and the elapsed times for queries 4 & 5 converged.)

> Hmm, I couldn't make out from your webpage exactly how you did the
> loading, or which steps are included in your timings. I see that you
> used COPY, which is good ... but did you create the indexes before or
> after COPY? What about the constraints? I also see a CLUSTER script
> --- was this used, and if so, where is its time counted?

My apologies for the state of the scripts (to all of you who downloaded them for a play) - I had forgotten to complete the README and had also left heaps of test files lying about in the query directory. I have cleaned these up now!

The story is... the comparison was supposed to be simple... so no special features (like clustered indexes/tables, bitmap indexes, materialized views, automatic summary tables...), just a comparison of how well each db did its "bread and butter" operations.
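For anyone reproducing this, the knob on the Postgres side is sort_mem, measured in kilobytes; a minimal sketch of setting it for one session (the value simply mirrors the 10M figure above):

    SET sort_mem = 10240;   -- 10 MB of in-memory sort space before spilling to disk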
On Saturday 14 July 2001 02:49, Ryan Mahoney wrote:
> Quick note: I don't know what your licensing arrangement is with Oracle,
> but from what I understand you may be in violation of those terms by
> publishing this data (maybe not - Ned from Great Bridge can answer this
> question better). If that is the case, I think publishing something like
> "xxxxxx 9.0" may be a simple resolution.

I have made some mods to the web site to "obfuscate" the relevant results... as the developer license forbids publishing benchmark results...

regards

Mark
On Saturday 14 July 2001 02:49, Tom Lane wrote:
> > It would seem that Oracle's execution plan is better.
>
> Hmm, since I don't know the details of Oracle's plan displays, it's hard
> to comment on that --- but it looks to me like the plans are essentially
> the same, with the small difference that Postgres chooses to use the
> index on dim0 to filter ....(snipped)

After a little thinking, I am inclined to agree with you, Tom... I wondered if the difference might be due to pure sequential scan performance differences. I tried this query:

SELECT sum(val) FROM fact0

for Postgres, DB2 and Oracle. The results were:

Postgres  2m25s
DB2       40s
Oracle    50s

This seems to be the likely culprit. I suspect that the "many blocks/pages read at once" optimizations (prefetch for DB2 and multi-block read for Oracle) mean that table sequential scans are faster for these guys than for Postgres.

Thus, on the bright side, their access plans are not necessarily any better than Postgres's!

regards

Mark
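For reference, the optimizations Mark alludes to are ordinary tunables in those products; a hedged sketch (the parameter name is Oracle's own, the value is illustrative only):

    # Oracle init.ora: fetch up to 16 contiguous blocks per I/O on full table scans
    db_file_multiblock_read_count = 16

On the DB2 side, prefetching is a property of the tablespace (its PREFETCHSIZE setting) rather than a single server-wide parameter.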
Mark kirkwood <markir@slingshot.co.nz> writes:
> I tried this query:
>
> SELECT sum(val) FROM fact0
>
> for Postgres, DB2 and Oracle. The results were:
>
> Postgres  2m25s
> DB2       40s
> Oracle    50s
>
> This seems to be the likely culprit. I suspect that the "many
> blocks/pages read at once" optimizations (prefetch for DB2 and
> multi-block read for Oracle) mean that table sequential scans are
> faster for these guys than for Postgres.

Hm. The theory about simple sequential reads is that we expect the kernel to optimize the disk access, since it'll recognize that we are doing sequential access to the table file and do read-aheads. Or that's the theory, anyway.

I am not convinced that inefficient I/O is the story here. We could be paying the price of our very generalized implementation of aggregates. It would be interesting to know how much CPU time was chewed up by each DB during the SELECT sum(). It'd also be interesting to know just what datatype is being summed.

regards, tom lane
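One low-tech way to get at the CPU figure on Linux (a sketch; process names vary by platform and setup, and Mark describes doing essentially this with ps -ef in his reply below):

    # accumulated CPU time of the Postgres server processes
    ps -o pid,time,args -C postmaster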
> On Saturday 14 July 2001 02:49, Tom Lane wrote:
> > ....(snipped)
>
> After a little thinking, I am inclined to agree with you, Tom... I wondered
> if the difference might be due to pure sequential scan performance
> differences. I tried this query:
>
> SELECT sum(val) FROM fact0
>
> for Postgres, DB2 and Oracle. The results were:
>
> Postgres  2m25s
> DB2       40s
> Oracle    50s
> ....(snipped)

Can you remind me about the OS you are using?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> > This seems to be the likely culprit. I suspect that the "many
> > blocks/pages read at once" optimizations (prefetch for DB2 and
> > multi-block read for Oracle) mean that table sequential scans are
> > faster for these guys than for Postgres.
>
> Hm. The theory about simple sequential reads is that we expect the
> kernel to optimize the disk access, since it'll recognize that we are
> doing sequential access to the table file and do read-aheads. Or that's
> the theory, anyway.
>
> I am not convinced that inefficient I/O is the story here. We could be
> paying the price of our very generalized implementation of aggregates.
> It would be interesting to know how much CPU time was chewed up by each
> DB during the SELECT sum(). It'd also be interesting to know just what
> datatype is being summed.

If it is Linux, they turn off read-ahead on the first fseek() and it never gets turned on again, or so I am told. And because we have virtual file descriptors, that table remains open after random access and the read-ahead bit doesn't get set for the next sequential scan.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Hm. The theory about simple sequential reads is that we expect the
> >> kernel to optimize the disk access, since it'll recognize that we are
> >> doing sequential access to the table file and do read-aheads. Or that's
> >> the theory, anyway.
>
> > If it is Linux, they turn off read-ahead on the first fseek() and it
> > never gets turned on again, or so I am told. And because we have
> > virtual file descriptors, that table remains open after random access
> > and the read-ahead bit doesn't get set for the next sequential scan.
>
> Ugh. And even if we hacked the VFD code to close/reopen the file, the
> shared disk buffers might still have some entries for some blocks of
> the file, causing those blocks not to be requested during the seq scan,
> thus disabling read-ahead again.
>
> It sounds like we really ought to try to get this Linux behavior fixed
> to work more like BSD (ie, some reasonably small number of consecutive
> reads turns on read-ahead). Red Hat guys, are you listening?

I hit them with this yesterday, and sent an email this morning. The solution is to have the read-ahead throttle based on the number of cache hits from previous read-aheads: basically, increase read-ahead on sequential reads, and turn it off on read-cache lookup failures (meaning the kernel doesn't have the requested block in its cache, i.e. random access). This works in cases where the app does alternating reads from two parts of a file.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Hm. The theory about simple sequential reads is that we expect the
>> kernel to optimize the disk access, since it'll recognize that we are
>> doing sequential access to the table file and do read-aheads. Or that's
>> the theory, anyway.

> If it is Linux, they turn off read-ahead on the first fseek() and it
> never gets turned on again, or so I am told. And because we have
> virtual file descriptors, that table remains open after random access
> and the read-ahead bit doesn't get set for the next sequential scan.

Ugh. And even if we hacked the VFD code to close/reopen the file, the shared disk buffers might still have some entries for some blocks of the file, causing those blocks not to be requested during the seq scan, thus disabling read-ahead again.

It sounds like we really ought to try to get this Linux behavior fixed to work more like BSD (ie, some reasonably small number of consecutive reads turns on read-ahead). Red Hat guys, are you listening?

regards, tom lane
Tom Lane wrote:
> ....(snipped)
>
> It sounds like we really ought to try to get this Linux behavior fixed
> to work more like BSD (ie, some reasonably small number of consecutive
> reads turns on read-ahead). Red Hat guys, are you listening?

Hmmm... Bruce mentioned this yesterday while he was up here, and he reiterated his thoughts in a note this morning. The note has been forwarded to the appropriate people (i.e. kernel folks).

Cheers,

Patrick
> Can you remind me about the OS you are using?

Linux 2.4.3 + glibc 2.2.2 (Mandrake 8.0)

regards

Mark
> > I tried this query:
> >
> > SELECT sum(val) FROM fact0
> >
> > for Postgres, DB2 and Oracle.
>
> I am not convinced that inefficient I/O is the story here. We could be
> paying the price of our very generalized implementation of aggregates.
> It would be interesting to know how much CPU time was chewed up by each
> DB during the SELECT sum(). It'd also be interesting to know just what
> datatype is being summed.

I monitored the CPU consumed by the relevant db processes (counting the time noted against each process from ps -ef; hope that was what you had in mind):

DB        Elapsed  CPU
Postgres  2m25s    2m01s
DB2       50s      30s
Oracle    40s      18s

(I seem to have got my numbers for DB2 and the big O around the wrong way in the last post!)

I thought it was worth trying a different query as well:

SELECT count(*) FROM fact0

DB        Elapsed  CPU
Postgres  1m5s     32s
DB2       23s      15s
Oracle    37s      11s

Finally, the datatypes for the table:

Table "fact0"
 Attribute |  Type   | Modifier
-----------+---------+----------
 d0key     | integer |
 d1key     | integer |
 d2key     | integer |
 val       | integer |
 filler    | text    |
Index: fact0_pk

In terms of caching, the first query was run from a cold start, the second immediately afterwards. The Postgres db has 4000 (8K) pages of data buffers and the table itself is 57000 pages. (The others were configured analogously.)

regards

Mark
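To put those buffer figures in perspective (simple arithmetic on the numbers just quoted):

    57000 pages x 8 KB ~= 456 MB of fact0 data
     4000 pages x 8 KB  =  32 MB of Postgres data buffers

So Postgres itself can cache at most about 7% of the table; every scan has to pull the rest from the kernel's file cache or from disk.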
Mark, thanks for sharing these results with us. Interesting. PG is definitely slower, but not overwhelmingly slower.

I wonder how different the results would be if you had a huge amount of RAM and allocated PG enough buffers to keep the entire table in RAM. That would take OS I/O considerations out of the picture, right? RAM is very cheap these days.
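A sketch of what that would look like in 7.1's postgresql.conf (the value is illustrative, sized from the 57000-page figure above, and assumes the machine has the RAM, and kernel shared-memory limits such as SHMMAX, to match):

    # 8 KB pages: 60000 pages ~= 469 MB, enough to hold all of fact0
    shared_buffers = 60000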
Mark kirkwood <markir@slingshot.co.nz> writes:
>> SELECT sum(val) FROM fact0

> I monitored the CPU consumed by the relevant db processes (counting
> the time noted against each process from ps -ef; hope that was what
> you had in mind):
>
> DB        Elapsed  CPU
> Postgres  2m25s    2m01s
> DB2       50s      30s
> Oracle    40s      18s

Yes, just what I wanted to see. It looks like my suspicion about CPU, not I/O, being the main issue was right on target. And I know where the time is going: sum(int4) is a good deal slower in PG 7.1 than in prior releases, because it uses an accumulator of type "numeric" rather than "int4". We made this change to avoid overflow problems, but I wonder whether we went too far in the direction of safety rather than performance.

A possible compromise is to make the accumulator be type "int8" for int2 and int4 sums. You'd have to sum over at least 2^32 rows to have any risk of overflow, which seems acceptable to me --- comments, anyone?

Another consideration is: what about machines without any 64-bit int support? These machines would end up using a 32-bit int accumulator, which would mean they'd be back to the pre-7.1 behavior in which sum() could overflow. Is this okay?

regards, tom lane
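The 2^32 figure follows from worst-case arithmetic (spelled out here for clarity):

    int8 maximum:            2^63 - 1  ~=  9.2e18
    worst-case int4 addend:  2^31 - 1  ~=  2.1e9
    rows before overflow:    2^63 / 2^31  =  2^32  ~=  4.3 billion

In the meantime, a hedged workaround for speed-critical sums is to trade exactness for a cheaper accumulator, e.g. SELECT sum(val::float8) FROM fact0, which sidesteps the numeric arithmetic entirely.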
> A possible compromise is to make the accumulator be type "int8" for
> int2 and int4 sums. You'd have to sum over at least 2^32 rows to
> have any risk of overflow, which seems acceptable to me --- comments,
> anyone?

I've noticed the speed issues with sum() and think that using the int8 accumulator would be acceptable, as sum() is just about too slow to use in its current state. I spent a few hours yesterday optimizing some queries and actually found it faster to select all the rows and go through programmatically and sum the fields that I wanted, rather than to use sum() -- so I'm all for just about anything that could speed it up.

What's the story with count()? Are there overflow fears there too? It doesn't seem to suffer from the performance problem that sum() does, but I'm curious just the same...

> Another consideration is: what about machines without any 64-bit int
> support? These machines would end up using a 32-bit int accumulator,
> which would mean they'd be back to the pre-7.1 behavior in which sum()
> could overflow. Is this okay?

I wonder how many PG users this would affect... Any idea?

Thanks!

-Mitch
"Mitch Vincent" <mvincent@cablespeed.com> writes: > What's the story with count(), are there overflow fears there too? count() is just a plain int4 counter. Obviously it could overflow with more than 2^31 rows, but we haven't yet had many (any?) complaints about that, so I'm not in a big hurry to change it. OTOH, if we decide it's OK for sum(int4) to work better on machines with int8 support than on those without, maybe it'd make sense to change count() to use int8 too. >> Another consideration is what about machines without any 64-bit int >> support. These machines would end up using a 32-bit int accumulator, >> which would mean they'd be back to the pre-7.1 behavior in which sum() >> could overflow. Is this okay? > I wonder how many PG users this would affect..... Any idea? A fairly small minority, I'm sure; but as usual, there's no way to know just how many... regards, tom lane
> I thought it was worth trying a different query as well:
>
> SELECT count(*) FROM fact0
>
> DB        Elapsed  CPU
> Postgres  1m5s     32s
> DB2       23s      15s
> Oracle    37s      11s

I am surprised the others didn't optimize this more. Seems they don't keep an accurate count of the number of rows in a table either. I figured at least they would do an index scan for the count.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> > I thought it was worth trying a different query as well:
> >
> > SELECT count(*) FROM fact0
> >
> > DB        Elapsed  CPU
> > Postgres  1m5s     32s
> > DB2       23s      15s
> > Oracle    37s      11s

This may be an Oracle DBA myth, but I was told by my Oracle DBA that it should be "SELECT count(1) FROM fact0" and not count(*). For some reason it was thought that count(1) would run faster, but I can't confirm or deny this. Does this make a difference in the benchmark?

-sc

--
Sean Chittenden
> count() is just a plain int4 counter. Obviously it could overflow with
> more than 2^31 rows, but we haven't yet had many (any?) complaints about
> that, so I'm not in a big hurry to change it.

I haven't run into it, and haven't heard anything from anyone who has, but I was curious just the same... I figured that count() was using an integer, but I suppose anyone with the number of rows required to overflow it would be in the overflowing-OID situation too, and a dysfunctional count() would probably be the least of their worries...

> OTOH, if we decide it's OK for sum(int4) to work better on machines with
> int8 support than on those without, maybe it'd make sense to change
> count() to use int8 too.

Sure..

> > I wonder how many PG users this would affect... Any idea?
>
> A fairly small minority, I'm sure; but as usual, there's no way to know
> just how many...

I figured that it would be a tiny number of people.. IMHO we should do it, because as the great Spock once said, "The needs of the many outweigh the needs of the few..."

Thanks!

-Mitch
On Sat, 21 Jul 2001, Sean Chittenden wrote:

> This may be an Oracle DBA myth, but I was told by my Oracle DBA
> that it should be "SELECT count(1) FROM fact0" and not count(*). For
> some reason it was thought that count(1) would run faster, but I can't
> confirm or deny this. Does this make a difference in the benchmark?

Actually, your Oracle DBA was smoking crack. The real query is:

SELECT count(rowid) FROM fact0;

ROWID is a specially indexed field that Oracle uses to go directly to a record through datafile -> tablespace -> cluster -> row, and hence it circumvents both the indexes and the table itself.

--
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
|  Shaun M. Thomas                INN Database Programmer              |
|  Phone: (309) 743-0812          Fax  : (309) 743-0830                |
|  Email: sthomas@townnews.com    AIM  : trifthen                      |
|  Web  : hamster.lee.net                                              |
|                                                                      |
|    "Most of our lives are about proving something, either to         |
|     ourselves or to someone else."                                   |
|                                          -- Anonymous                |
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+