Thread: overhead due to casting extra parameters with aggregates (over and over)
Hi,

I've been working on a custom aggregate, and I've run into some fairly
annoying overhead due to casting direct parameters over and over. I'm
wondering if there's a way to eliminate this, somehow, without having
to do an explicit cast.

Imagine you have a simple aggregate:

  CREATE AGGREGATE tdigest_percentile(double precision, int, double precision[])
  (
    ...
  );

with two direct parameters (actually, I'm not sure that's the correct
term, because this is not an ordered-set aggregate and [1] only talks
about direct parameters in that context). Anyway, I'm talking about the
extra parameters after the 'double precision' value to aggregate.

The last parameter is an array of values in [0.0,1.0], representing
percentiles (similarly to what we do in percentile_cont). It's annoying
to write literal values, so let's use a CTE to generate the array:

  WITH perc AS (SELECT array_agg(i/100.0) AS p
                  FROM generate_series(1,99) s(i))
  SELECT tdigest_percentile(random(), 100, (SELECT p FROM perc))
    FROM generate_series(1,10000000);

which does work, but it runs for ~180 seconds. When used with an
explicit array literal, it runs in ~1.6 seconds:

  SELECT tdigest_percentile(random(), 100, ARRAY[0.01, ..., 0.99])
    FROM generate_series(1,10000000);

After a while, I realized that the issue is casting - the CTE produces
a numeric[] array, and we do the cast to double precision[] on every
call to the state transition function (and we do ~10M of those). The
cast is fairly expensive - much more expensive than the aggregate
itself. The explicit literal ends up being the right type, so the whole
query is much faster.

And indeed, adding the explicit cast to the CTE query

  WITH perc AS (SELECT array_agg((i/100.0)::double precision) AS p
                  FROM generate_series(1,99) s(i))
  SELECT tdigest_percentile(random(), 100, (SELECT p FROM perc))
    FROM generate_series(1,10000000);

does the trick - the query takes ~1.6s again.

I wonder if there's a chance to detect and handle this without having
to do the cast over and over? I'm thinking that's not quite possible,
because the value is not actually guaranteed to be the same for all
calls (even though it is in the example I've given). But maybe we could
flag the parameter somehow, to make it behave more like a direct
parameter (which is evaluated only once). I don't really need those
extra parameters in the transition function at all - it'd be fine to
just pass them to the final function (and there should be far fewer
calls to that).

regards

[1] https://www.postgresql.org/docs/current/sql-createaggregate.html

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
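A similar slowdown should be reproducible without the tdigest extension,
assuming the cost really is the per-row numeric[] -> double precision[]
coercion; the sketch below forces (or avoids) the same coercion inside a
built-in aggregate's argument, using only generate_series and count:

  -- Slow variant: the subselect itself is evaluated just once, but the
  -- cast wrapped around it sits in the aggregate's argument list, so
  -- the element-by-element numeric -> float8 coercion runs for all 10M
  -- input rows.
  WITH perc AS (SELECT array_agg(i/100.0) AS p
                  FROM generate_series(1,99) s(i))
  SELECT count((SELECT p FROM perc)::double precision[])
    FROM generate_series(1,10000000);

  -- Fast variant: the coercion happens once, inside the CTE, so the
  -- per-row aggregate argument is already double precision[].
  WITH perc AS (SELECT array_agg((i/100.0)::double precision) AS p
                  FROM generate_series(1,99) s(i))
  SELECT count((SELECT p FROM perc))
    FROM generate_series(1,10000000);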
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> I've been working on a custom aggregate, and I've run into some fairly
> annoying overhead due to casting direct parameters over and over. I'm
> wondering if there's a way to eliminate this, somehow, without having
> to do an explicit cast.

> Imagine you have a simple aggregate:

>   CREATE AGGREGATE tdigest_percentile(double precision, int, double precision[])
>   (
>     ...
>   );

> with two direct parameters (actually, I'm not sure that's the correct
> term, because this is not an ordered-set aggregate and [1] only talks
> about direct parameters in that context). Anyway, I'm talking about the
> extra parameters after the 'double precision' value to aggregate.

But you're not telling the system that those are direct parameters,
at least not if you mean that they can only legitimately have one value
across the whole query. As-is, they're just more aggregated arguments,
so we have to evaluate them again at each row.

It's fairly messy that the SQL spec ties direct arguments to ordered-set
aggregates; you'd think there'd be some value in treating those features
as orthogonal. I'm not sure how we could wedge them into the syntax
otherwise, though :-(. You could perhaps convert your aggregate to
an ordered-set aggregate, but then you'd be paying for a sort that
you don't need, IIUC.

> After a while, I realized that the issue is casting - the CTE produces
> a numeric[] array, and we do the cast to double precision[] on every
> call to the state transition function (and we do ~10M of those).

The only reason that the CTE reference is cheap is that we understand
that it's stable, so we don't have to recompute it each time; otherwise
you'd be moaning about that more than the cast. As you say, the
short-term workaround is to do the casting inside the sub-select. I
think the long-term fix is to generically avoid re-computing stable
subexpressions. There was a patch for that a year or so ago, but the
author never finished it, AFAIR.

			regards, tom lane
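For a concrete picture of the direct-argument distinction, the built-in
ordered-set aggregate percentile_cont shows the syntax the SQL spec ties
them to (standard PostgreSQL; the subquery here is just a stand-in data
source):

  -- The percentile array before WITHIN GROUP is a direct argument:
  -- it is evaluated only once per aggregation, not once per input row.
  -- The aggregated input is the ORDER BY expression, which also feeds
  -- the sort that an ordered-set aggregate implies.
  SELECT percentile_cont(ARRAY[0.25, 0.5, 0.75])
         WITHIN GROUP (ORDER BY x)
    FROM (SELECT random() AS x FROM generate_series(1,100000)) t;

This is the shape an ordered-set version of tdigest_percentile would
take - but, as discussed below, the implied sort defeats the purpose of
a t-digest.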
On Mon, Sep 23, 2019 at 12:53:36PM -0400, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> I've been working on a custom aggregate, and I've run into some fairly
>> annoying overhead due to casting direct parameters over and over. I'm
>> wondering if there's a way to eliminate this, somehow, without having
>> to do an explicit cast.
>
>> Imagine you have a simple aggregate:
>
>>   CREATE AGGREGATE tdigest_percentile(double precision, int, double precision[])
>>   (
>>     ...
>>   );
>
>> with two direct parameters (actually, I'm not sure that's the correct
>> term, because this is not an ordered-set aggregate and [1] only talks
>> about direct parameters in that context). Anyway, I'm talking about the
>> extra parameters after the 'double precision' value to aggregate.
>
>But you're not telling the system that those are direct parameters,
>at least not if you mean that they can only legitimately have one value
>across the whole query. As-is, they're just more aggregated arguments,
>so we have to evaluate them again at each row.
>

Understood.

>It's fairly messy that the SQL spec ties direct arguments to ordered-set
>aggregates; you'd think there'd be some value in treating those features
>as orthogonal. I'm not sure how we could wedge them into the syntax
>otherwise, though :-(. You could perhaps convert your aggregate to
>an ordered-set aggregate, but then you'd be paying for a sort that
>you don't need, IIUC.
>

Yeah, having to do the sort (and keep all the data) is exactly what the
tdigest is meant to eliminate, so making it an ordered-set aggregate is
exactly the thing I don't want to do. Also, it disables parallel query,
which is another reason not to do that.

>> After a while, I realized that the issue is casting - the CTE produces
>> a numeric[] array, and we do the cast to double precision[] on every
>> call to the state transition function (and we do ~10M of those).
>
>The only reason that the CTE reference is cheap is that we understand
>that it's stable, so we don't have to recompute it each time; otherwise
>you'd be moaning about that more than the cast. As you say, the
>short-term workaround is to do the casting inside the sub-select. I
>think the long-term fix is to generically avoid re-computing stable
>subexpressions. There was a patch for that a year or so ago, but the
>author never finished it, AFAIR.
>

Hmmm, yeah. I'll dig through the archives, although it's not a very
high priority - it's more a thing that surprised/bugged me while
working on the custom aggregate.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
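For anyone hitting something similar: EXPLAIN VERBOSE prints each plan
node's output expressions, so it should make a per-row coercion visible
without any timing runs - one would expect a ::double precision[] cast
wrapped around the subplan reference in the Aggregate node's Output line
(the exact rendering varies by version):

  -- COSTS OFF just trims noise; VERBOSE is what shows the expression
  -- tree, including any coercion inside the aggregate's argument list.
  EXPLAIN (VERBOSE, COSTS OFF)
  WITH perc AS (SELECT array_agg(i/100.0) AS p
                  FROM generate_series(1,99) s(i))
  SELECT tdigest_percentile(random(), 100, (SELECT p FROM perc))
    FROM generate_series(1,10000000);

With the cast moved inside the CTE, that coercion should disappear from
the per-row expression.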