Hi,
I've been working on a custom aggregate, and I've ran into some fairly
annoying overhead due to casting direct parameters over and over. I'm
wondering if there's a way to eliminate this, somehow, without having to
do an explicit cast.
Imagine you have a simple aggregate:
CREATE AGGREGATE tdigest_percentile(double precision, int, double precision[])
(
...
);
with two direct parameters (actually, I'm not sure that's the correct
term, becuse this is not an ordered-set aggregate and [1] only talks
about direct parameters in that context). Anyway, I'm talking about the
extra parameters, after the 'double precision' value to aggregate.
The last parameter is an array of values in [0.0,1.0], representing
percentiles (similarly to what we do in percentile_cont). It's annoying
to write literal values, so let's use CTE to generate the array:
WITH
perc AS (SELECT array_agg(i/100.0) AS p
FROM generate_series(1,99) s(i))
SELECT
SELECT tdigest_percentile(random(), 100, (SELECT p FROM perc))
FROM generate_series(1,10000000);
which does work, but it's running for ~180 seconds. When used with an
explicit array literal, it runs in ~1.6 second.
SELECT tdigest_percentile(random(), 100, ARRAY[0.01, ..., 0.99]))
FROM generate_series(1,10000000);
After a while, I've realized that the issue is casting - the CTE
produces numeric[] array, and we do the cast to double precision[] on
every call to the state transition function (and we do ~10M of those).
The cast is fairly expensive - much more expensive than the aggregate
itself. The explicit literal ends up being the right type, so the whole
query is much faster.
And indeed, adding the explicit cast to the CTE query
WITH
perc AS (SELECT array_agg((i/100.0)::double precision) AS p
FROM generate_series(1,99) s(i))
SELECT
SELECT tdigest_percentile(random(), 100, (SELECT p FROM perc))
FROM generate_series(1,10000000);
does the trick - the query is ~1.6s again.
I wonder if there's a chance to detect and handle this without having to
do the cast over and over? I'm thinking that's not quite possible,
because the value is not actually guaranteed to be the same for all
calls (even though it's the case for the example I've given).
But maybe we could flag the parameter somehow, to make it more like the
direct parameter (which is only evaluated once). I don't really need
those extra parameters in the transition function at all, it's fine to
just get it to the final function (and there should be far fewer calls
to those).
regards
[1] https://www.postgresql.org/docs/current/sql-createaggregate.html
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services