Thread: Weird function behavior from Sept 11 snapshot
Under both 6.5 and 7.0:
-----------------------

stocks=# create table test (key int4);
CREATE
stocks=# create function crap(int4) returns int4
as 'select sum(key) from test' language 'sql';
CREATE
stocks=# select version();
                               version
---------------------------------------------------------------------
 PostgreSQL 7.0.0 on i686-pc-linux-gnu, compiled by gcc egcs-2.91.66

Under the snapshot from yesterday:
----------------------------------

template1=# create table test (key int4);
CREATE
template1=# create function crap(int4) returns int4
as 'select sum(key) from test' language 'sql';
ERROR:  return type mismatch in function: declared to return
int4, returns numeric
template1=# select version();
                               version
------------------------------------------------------------------------
 PostgreSQL 7.1devel on i586-pc-linux-gnu, compiled by GCC egcs-2.91.66

Is this correct behavior? All of the regression tests pass on the
snapshot version, BTW.

Mike Mascari
Mike Mascari <mascarm@mascari.com> writes:
> Under the snapshot from yesterday:
> ---------------------------------
> template1=# create function crap(int4) returns int4
> as 'select sum(key) from test' language 'sql';
> ERROR:  return type mismatch in function: declared to return
> int4, returns numeric

I changed sum() on integer types to return numeric as a way of avoiding
overflow.  Also avg() on integers now returns numeric so that you can
get some fractional precision.

If you think this was a bad idea, there's still time to debate it ...
but we've had repeated complaints about both of those issues.

			regards, tom lane
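A minimal sketch of adapting the reported function to the new behavior:
declare it as returning numeric, so the declared type matches what
sum(int4) now produces. Table and function names are the ones from the
original report; a cast-based alternative appears in the next message.

    create function crap(int4) returns numeric
        as 'select sum(key) from test' language 'sql';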
> Is this correct behavior? All of the regression tests pass on the
> snapshot version, BTW.

This is the expected behavior, and is "correct". There was a change
recently to the aggregate functions to make them more robust. So
sum(int4) now calculates and returns a numeric result rather than an
int4.

The problem is that numeric is extremely slow compared to an int4
calculation, and I'd like us to consider doing the calculation in int4
(downside: silent overflow when dealing with non-trivial data), int8
(downside: no support on a few platforms), or float8 (downside: silent
truncation on non-trivial data).

Tom, do you recall measuring the performance difference on aggregate
functions between int4 and numeric for small-value cases? We probably
don't want to take order-of-magnitude performance hits to get this more
correct behavior, but I'm not sure what the performance actually is.

btw, Mike's function works when defined as

  create function c(int4) returns int4
    as 'select cast(sum(key) as int4) from test' language 'sql';

                      - Thomas
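To make the "silent overflow" downside of an int4 accumulator concrete,
a sketch against a pre-change (7.0-era) server; the table name and
values are purely illustrative:

  -- on 7.0.*, where sum(int4) accumulates in a plain int4:
  create table overflow_demo (key int4);
  insert into overflow_demo values (2000000000);
  insert into overflow_demo values (2000000000);
  -- the true sum (4000000000) does not fit in an int4, so the old
  -- accumulator would silently wrap rather than raise an error
  select sum(key) from overflow_demo;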
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> Tom, do you recall measuring the performance difference on aggregate
> functions between int4 and numeric for small-value cases? We probably
> don't want to take order-of-magnitude performance hits to get this more
> correct behavior, but I'm not sure what the performance actually is.

I have not tried to measure it --- I was sort of assuming that for
realistic-size problems, disk I/O would swamp any increase in CPU time
anyway.  Does anyone want to check the time for sum() or avg() on an
int4 column over a large table, under both 7.0.* and current?

> The problem is that numeric is extremely slow compared to an int4
> calculation, and I'd like us to consider doing the calculation in int4
> (downside: silent overflow when dealing with non-trivial data), int8
> (downside: no support on a few platforms), or float8 (downside: silent
> truncation on non-trivial data).

Actually, using a float8 accumulator would work pretty well; assuming
IEEE float8, you'd only start to get roundoff error when the running
sum exceeds 2^52 or so.  However the SQL92 spec is insistent that sum()
deliver an exact-numeric result when applied to exact-numeric data,
and with a float accumulator we'd be at the mercy of the quality of the
local implementation of floating point.

I could see offering variant aggregates, say "sumf" and "avgf", that
use float8 accumulation.  Right now the user can get the same result
by writing "sum(foo::float8)" but it might be wise to formalize the
idea ...

			regards, tom lane
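For reference, a sketch of what a user-defined "sumf" could look like
as a float8-accumulating aggregate. This uses the CREATE AGGREGATE
spelling of later releases (the 7.x-era grammar writes the options
somewhat differently); float8pl is the built-in float8 addition
function:

  create aggregate sumf (float8) (
      sfunc = float8pl,   -- accumulate with built-in float8 addition
      stype = float8
  );

  -- same effect as writing sum(key::float8) directly
  select sumf(key::float8) from test;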
> ... Does anyone want to check the time for sum() or avg() on an
> int4 column over a large table, under both 7.0.* and current?

For 262144 rows on the current tree, I get the following:

  sum(int4):                 12.0 seconds
  sum(float8):                5.2 seconds
  sum(cast(int4 as float8)):  5.7 seconds

These times include startup costs, etc., and are the minimum times from
several runs (there is pretty wide variability, presumably due to disk
caching, swapping, etc on my laptop). It is a safe bet that the original
int4 implementation was as fast or faster than the float8 result above
(int4 does not require palloc() calls).

> Actually, using a float8 accumulator would work pretty well; assuming
> IEEE float8, you'd only start to get roundoff error when the running
> sum exceeds 2^52 or so.  However the SQL92 spec is insistent that sum()
> deliver an exact-numeric result when applied to exact-numeric data,
> and with a float accumulator we'd be at the mercy of the quality of the
> local implementation of floating point.

A problem with float8 is that it is possible to reach a point in the
accumulation where subsequent input values are ignored in the sum. This
is different than just roundoff error, since it degrades ungracefully
from that point on.

> I could see offering variant aggregates, say "sumf" and "avgf", that
> use float8 accumulation.  Right now the user can get the same result
> by writing "sum(foo::float8)" but it might be wise to formalize the
> idea ...

How about using int8 for the accumulator (on machines which support it,
of course)? Falling back to float8 or numeric on other machines? Or
perhaps we could have an option (runtime??) to switch accumulator modes.

I like the idea of something like "sumf" to get alternative algorithms,
but it would be nice if basic sum() could be a bit more optimized than
it is currently.

                      - Thomas
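For anyone who wants to reproduce this sort of measurement, a sketch of
building a 2^18 = 262144-row table by repeated doubling (the table name
is illustrative, and generate_series() did not exist in this era); the
aggregates can then be timed from the shell with something like
"time psql -c 'select sum(key) from bigtest' mydb":

  create table bigtest (key int4);
  insert into bigtest values (1);
  -- run the next statement 18 times: each pass doubles the row count,
  -- ending at 2^18 = 262144 rows
  insert into bigtest select * from bigtest;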
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> How about using int8 for the accumulator (on machines which support it
> of course)? Falling back to float8 or numeric on other machines?

int8 would still pose some overflow risk (at least for int8 input),
and would likely be no faster than a float8 implementation, since both
would require palloc().

Your test suggests that the performance differential is *at most* 2X ---
probably much less in real-world situations where the disk pages aren't
already cached.  I can't get excited about introducing platform-dependent
behavior and overflow risk for that.  If it were 10X then I would, but
right now I think we are OK as is.  I think any speedup efforts here
would be better put into making NUMERIC ops go faster ...

			regards, tom lane
> int8 would still pose some overflow risk (at least for int8 input),
> and would likely be no faster than a float8 implementation, since
> both would require palloc().

Right. On 32-bit machines, int8 is likely to be substantially slower,
since the int8 math is done in a library rather than in a single
machine instruction.

> Your test suggests that the performance differential is *at most*
> 2X --- probably much less in real-world situations where the disk
> pages aren't already cached.

Hmm. sum(int4) on the same table is 1.8 seconds for 7.0.2 (vs 12.5 for
the snapshot). But I *am* compiling with asserts turned on for the
other tests (with maybe some other differences too), so maybe it is not
(yet) a fair comparison. Still a pretty big performance difference for
something folks expect to be a fast operation.

                      - Thomas
> Your test suggests that the performance differential is *at most*
> 2X --- probably much less in real-world situations where the disk
> pages aren't already cached. I can't get excited about introducing
> platform-dependent behavior and overflow risk for that. If it were
> 10X then I would, but right now I think we are OK as is. I think
> any speedup efforts here would be better put into making NUMERIC
> ops go faster ...

Another followup: on 7.0.2, with different optimizations etc,
sum(float8) takes 1.95 seconds, rather than the 5.2 on the current
tree. I'd better look at the compilation optimizations; is there
another explanation for the factor of 2.6 difference (!!)?

So I'd expect int4 to be closer to float8 in performance than my
previous mail suggested.

                      - Thomas
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> Another followup: on 7.0.2, with different optimizations etc,
> sum(float8) takes 1.95 seconds, rather than the 5.2 on the current
> tree. I'd better look at the compilation optimizations; is there
> another explanation for the factor of 2.6 difference (!!)?

If you are running with --enable-cassert then there is a whole bunch of
memory-stomp debugging overhead turned on in current sources, including
such time-consuming stuff as clearing every pfree'd block.  7.0.*'s
--enable-cassert is not nearly as expensive.

I plan to make that stuff not-default when we go beta, but right now
it seems like a good idea to have it on for testing...

			regards, tom lane
Hmm. I recompiled the current snapshot with the optimizations from my
Mandrake RPM (using the Mandrake defaults, except for disabling
"fast-math"), and get the following:

  7.0.2   current   test
  1.8     5.3       sum(i)
  1.95    1.77      sum(f)
  2.3     1.9       sum(cast(i as float8))

My previous tests on the current tree were with -O0, asserts enabled,
and few other options specified (mostly, the defaults for the Postgres
Linux build). The Linux defaults in the Postgres tarball are:

  -O2 -Wall -Wmissing-prototypes -Wmissing-declarations

whereas the defaults for Mandrake (with fast-math turned off since it
gives rounding trouble in date/time math) are:

  -O3 -fomit-frame-pointer -fno-exceptions -fno-rtti -pipe -s
  -mpentiumpro -mcpu=pentiumpro -march=pentiumpro
  -fexpensive-optimizations -malign-loops=2 -malign-jumps=2
  -malign-functions=2 -mpreferred-stack-boundary=2 -fno-fast-math

I'll do some more tests with the default compiler options. The good
news is that the new fmgr interface is apparently as fast or faster
than the old one :)

                      - Thomas