Re: Improving avg performance for numeric - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Improving avg performance for numeric
Date
Msg-id 9796.1384648156@sss.pgh.pa.us
Whole thread Raw
In response to Re: Improving avg performance for numeric  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
Pavel Stehule <pavel.stehule@gmail.com> writes:
> [ numeric-optimize-v8.patch ]

I've committed this with some adjustments.  Aside from cosmetic
changes and documentation/comment improvements:

* I don't agree at all with the claim that aggtransspace is only useful
for INTERNAL transition data.  There are lots of cases with regular
SQL types where the planner doesn't have a good idea of how large the
transition value will be, and with the current code it tends to guess
on the small side (frequently 32 bytes :-().  It seems to me to be
useful to give users a knob to twiddle there.  Consider for example
an aggregate that uses an integer array as transition state; the author
of the aggregate might know that there are usually hundreds of entries
in the array, and wish to dial up aggtransspace to prevent the planner
from optimistically choosing hash aggregation.

As committed, I just took out the restriction in CREATE AGGREGATE
altogether; it will let you set SSPACE for any transition datatype.
The planner will ignore it for pass-by-value types, where the behavior
is known, but otherwise honor it.  It's possible that we should put in
a restriction to INTERNAL plus pass-by-reference types, or even INTERNAL
plus pass-by-reference variable-length types, but I can't get excited
about that.  The error message would be too confusing I think.

* There was a stray "else" added to clauses.c that disabled space
accounting for polymorphic aggregates.

* I simplified the summing code in do_numeric_accum; the copying and
reallocating it was doing was not only unnecessary but contained
unpleasant assumptions about what you can do with a NumericVar.  This
probably makes the committed patch a bit faster than what was submitted,
but I didn't try to benchmark the change.

* I made sure all the functions accessed their input state pointer with    state = PG_ARGISNULL(0) ? NULL :
(NumericAggState*) PG_GETARG_POINTER(0);
 
There were places that omitted the ARGISNULL test, and thus only happened
to work if the uninitialized Datum value they got was zero.  While that
might chance to be true in the current state of the code, it's not an
assumption you're supposed to make.

* numeric sum/avg failed to check for NaN inputs.

* I fixed incorrect tests in the code added to pg_dump.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: Re: additional json functionality
Next
From: Peter Geoghegan
Date:
Subject: Re: Storing pg_stat_statements query texts externally, pg_stat_statements in core