Thread: Weird function behavior from Sept 11 snapshot

Weird function behavior from Sept 11 snapshot

From
Mike Mascari
Date:
Under both 6.5 and 7.0:
----------------------
stocks=# create table test (key int4);
CREATE
stocks=# create function crap(int4) returns int4 as 
'select sum(key) from test' language 'sql';
CREATE
stocks=# select version();                             
version                               
---------------------------------------------------------------------PostgreSQL 7.0.0 on i686-pc-linux-gnu, compiled by
gcc
egcs-2.91.66


Under the snapshot from yesterday:
---------------------------------

template1=# create table test (key int4);
CREATE
template1=# create function crap(int4) returns int4 
as 'select sum(key) from test' language 'sql';
ERROR:  return type mismatch in function: declared to return
int4, returns numeric
template1=# select version();                              
version                                 
------------------------------------------------------------------------PostgreSQL 7.1devel on i586-pc-linux-gnu,
compiledby GCC
 
egcs-2.91.66


Is this correct behavior? All of the regression tests pass on the
snapshot version, BTW. 

Mike Mascari


Re: Weird function behavior from Sept 11 snapshot

From
Tom Lane
Date:
Mike Mascari <mascarm@mascari.com> writes:
> Under the snapshot from yesterday:
> ---------------------------------

> template1=# create function crap(int4) returns int4 
> as 'select sum(key) from test' language 'sql';
> ERROR:  return type mismatch in function: declared to return
> int4, returns numeric

I changed sum() on integer types to return numeric as a way of
avoiding overflow.  Also avg() on integers now returns numeric
so that you can get some fractional precision.  If you think this
was a bad idea, there's still time to debate it ... but we've had
repeated complaints about both of those issues.
        regards, tom lane


Re: Weird function behavior from Sept 11 snapshot

From
Thomas Lockhart
Date:
> Is this correct behavior? All of the regression tests pass on the
> snapshot version, BTW.

This is the expected behavior, and is "correct". There was a change
recently to the aggregate functions to make them more robust. So
sum(int4) now calculates and returns a numeric result rather than an
int4.

The problem is that numeric is extremely slow compared to an int4
calculation, and I'd like us to consider doing the calculation in int4
(downside: silent overflow when dealing with non-trivial data), int8
(downside: no support on a few platforms), or float8 (downside: silent
truncation on non-trivial data).

Tom, do you recall measuring the performance difference on aggregate
functions between int4 and numeric for small-value cases? We probably
don't want to take order-of-magnitude performance hits to get this more
correct behavior, but I'm not sure what the performance actually is.

btw, Mike's function works when defined as

create function c(int4) returns int4 as 'select cast(sum(key) as int4) from test' language 'sql';
                     - Thomas


Re: Weird function behavior from Sept 11 snapshot

From
Tom Lane
Date:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> Tom, do you recall measuring the performance difference on aggregate
> functions between int4 and numeric for small-value cases? We probably
> don't want to take order-of-magnitude performance hits to get this more
> correct behavior, but I'm not sure what the performance actually is.

I have not tried to measure it --- I was sort of assuming that for
realistic-size problems, disk I/O would swamp any increase in CPU time
anyway.  Does anyone want to check the time for sum() or avg() on an
int4 column over a large table, under both 7.0.* and current?

> The problem is that numeric is extremely slow compared to an int4
> calculation, and I'd like us to consider doing the calculation in int4
> (downside: silent overflow when dealing with non-trivial data), int8
> (downside: no support on a few platforms), or float8 (downside: silent
> truncation on non-trivial data).

Actually, using a float8 accumulator would work pretty well; assuming
IEEE float8, you'd only start to get roundoff error when the running
sum exceeds 2^52 or so.  However the SQL92 spec is insistent that sum()
deliver an exact-numeric result when applied to exact-numeric data,
and with a float accumulator we'd be at the mercy of the quality of the
local implementation of floating point.

I could see offering variant aggregates, say "sumf" and "avgf", that
use float8 accumulation.  Right now the user can get the same result
by writing "sum(foo::float8)" but it might be wise to formalize the
idea ...
        regards, tom lane


Re: Weird function behavior from Sept 11 snapshot

From
Thomas Lockhart
Date:
> ... Does anyone want to check the time for sum() or avg() on an
> int4 column over a large table, under both 7.0.* and current?

For 262144 rows on the current tree, I get the following:

sum(int4):               12.0 seconds
sum(float8):              5.2 seconds
sum(cast(int4 as float8): 5.7 seconds

This includes startup costs, etc, and are the minimum times from several
runs (there is pretty wide variability, presumably due to disk caching,
swapping, etc on my laptop). It is a safe bet that the original int4
implementation was as fast or faster than the float8 result above (int4
does not require palloc() calls).

> Actually, using a float8 accumulator would work pretty well; assuming
> IEEE float8, you'd only start to get roundoff error when the running
> sum exceeds 2^52 or so.  However the SQL92 spec is insistent that sum()
> deliver an exact-numeric result when applied to exact-numeric data,
> and with a float accumulator we'd be at the mercy of the quality of the
> local implementation of floating point.

A problem with float8 is that it is possible to reach a point in the
accumulation where subsequent input values are ignored in the sum. This
is different than just roundoff error, since it degrades ungracefully
from that point on.

> I could see offering variant aggregates, say "sumf" and "avgf", that
> use float8 accumulation.  Right now the user can get the same result
> by writing "sum(foo::float8)" but it might be wise to formalize the
> idea ...

How about using int8 for the accumulator (on machines which support it
of course)? Falling back to float8 or numeric on other machines? Or
perhaps we could have an option (runtime??) to switch accumulator modes.

I like the idea of something like "sumf" to get alternative algorithms,
but it would be nice if basic sum() could be a bit more optimized than
currently.
                      - Thomas


Re: Weird function behavior from Sept 11 snapshot

From
Tom Lane
Date:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> How about using int8 for the accumulator (on machines which support it
> of course)? Falling back to float8 or numeric on other machines?

int8 would still pose some overflow risk (at least for int8 input),
and would likely be no faster than a float8 implementation, since
both would require palloc().

Your test suggests that the performance differential is *at most*
2X --- probably much less in real-world situations where the disk
pages aren't already cached.  I can't get excited about introducing
platform-dependent behavior and overflow risk for that.  If it were
10X then I would, but right now I think we are OK as is.  I think
any speedup efforts here would be better put into making NUMERIC
ops go faster ...
        regards, tom lane


Re: Weird function behavior from Sept 11 snapshot

From
Thomas Lockhart
Date:
> int8 would still pose some overflow risk (at least for int8 input),
> and would likely be no faster than a float8 implementation, since
> both would require palloc().

Right. On 32-bit machines, int8 is likely to be substantially slower,
since the int8 math is done in a library rather than in a single machine
instruction.

> Your test suggests that the performance differential is *at most*
> 2X --- probably much less in real-world situations where the disk
> pages aren't already cached.

Hmm. sum(int4) on the same table is 1.8 seconds for 7.0.2 (vs 12.5 for
snapshot). But I *am* compiling with asserts turned on for the other
tests (with maybe some other differences too), so maybe it is not (yet)
a fair comparison. Still a pretty big performance difference for
something folks expect to be a fast operation.
                   - Thomas


Re: Weird function behavior from Sept 11 snapshot

From
Thomas Lockhart
Date:
> Your test suggests that the performance differential is *at most*
> 2X --- probably much less in real-world situations where the disk
> pages aren't already cached.  I can't get excited about introducing
> platform-dependent behavior and overflow risk for that.  If it were
> 10X then I would, but right now I think we are OK as is.  I think
> any speedup efforts here would be better put into making NUMERIC
> ops go faster ...

Another followup: on 7.0.2, with different optimizations etc,
sum(float8) takes 1.95 seconds, rather than the 5.2 on the current tree.
I'd better look at the compilation optimizations; is there another
explanation for the factor of 2.6 difference (!!)?

So I'd expect int4 to be closer to float8 in performance than my
previous mail suggested.
                        - Thomas


Re: Weird function behavior from Sept 11 snapshot

From
Tom Lane
Date:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> Another followup: on 7.0.2, with different optimizations etc,
> sum(float8) takes 1.95 seconds, rather than the 5.2 on the current tree.
> I'd better look at the compilation optimizations; is there another
> explanation for the factor of 2.6 difference (!!)?

If you are running with --enable-cassert then there is a whole bunch
of memory-stomp debugging overhead turned on in current sources,
including such time-consuming stuff as clearing every pfree'd block.
7.0.*'s --enable-cassert is not nearly as expensive.

I plan to make that stuff not-default when we go beta, but right now
it seems like a good idea to have it on for testing...
        regards, tom lane


Re: Weird function behavior from Sept 11 snapshot

From
Thomas Lockhart
Date:
Hmm. I recompiled the current snapshot with the optimizations from my
Mandrake RPM (using the Mandrake defaults, except for disabling
"fast-math"), and get the following:

7.0.2   current   test1.8     5.3     sum(i)1.95    1.77     sum(f)2.3     1.9     sum(cast(i as float8))

My previous tests on the current tree were with -O0, asserts enabled,
and few other options specified (mostly, the defaults for the Postgres
Linux build).

The Linux defaults in the Postgres tarball are:
 -O2 -Wall -Wmissing-prototypes -Wmissing-declarations

whereas the defaults for Mandrake (with fast-math turned off since it
gives rounding trouble in date/time math):
 -O3 -fomit-frame-pointer -fno-exceptions -fno-rtti -pipe -s -mpentiumpro -mcpu=pentiumpro -march=pentiumpro
-fexpensive-optimizations-malign-loops=2 -malign-jumps=2 -malign-functions=2 -mpreferred-stack-boundary=2
-fno-fast-math

I'll do some more tests with the default compiler options. The good news
is that the new fmgr interface is apparently as fast or faster than the
old one :)
                    - Thomas