Re: Window-functions patch handling of aggregates - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Window-functions patch handling of aggregates |
Date | |
Msg-id | 14175.1230319049@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Window-functions patch handling of aggregates (Greg Stark <greg.stark@enterprisedb.com>) |
Responses |
Re: Window-functions patch handling of aggregates
Re: Window-functions patch handling of aggregates Re: Window-functions patch handling of aggregates |
List | pgsql-hackers |
Greg Stark <greg.stark@enterprisedb.com> writes: > Yeah, it seems like adding a flag like iswindowable to aggregate > functions is the safest option. I agree with Hitoshi-san: that's passing information in the wrong direction. The right direction is to make it visible to the called function which context it's being called in, so that it can do the right thing (or at worst, throw an error if it can't). The current state of play is that the documented expectation for aggregate functions is that they should do if (fcinfo->context && IsA(fcinfo->context, AggState)) ... optimized code for aggregate case ...else ... support regularcall, or throw error ... However, a bit of grepping for AggState shows that this expectation isn't met very well within the core distribution, let alone elsewhere. There are about 10 aggregates that have optimizations like this, only 8 of which are playing by the rules --- the violators are: tsearch2's tsa_rewrite_accum: just assumes it's been passed an AggState, dumps core (or worse) if not array_agg: Asserts it's been passed an AggState, dumps core if not As submitted, Hitoshi's patch made the WindowAgg code pass its WindowAggState to the aggregate functions, which at best would have the result of disabling the internal optimizations of the aggregates, and would result in a core dump in any aggregate that was taking a shortcut. The working copy I have right now does this: if (numaggs > 0) { /* * Set up a mostly-dummy AggState to be passed to plain aggregates * so theyknow they can optimize things. We don't supply any useful * info except for the ID of the aggregate-lifetimecontext. */ winstate->aggstate = makeNode(AggState); winstate->aggstate->aggcontext= winstate->wincontext; } and then passes the dummy AggState to plain aggregates, instead of WindowAggState. This makes count(*) and most of the other aggs work as desired, but it's not sufficient for array_agg because of that release-the-data-structure-in-the-final-function optimization. So the alternatives I see are: 1. Go back to Hitoshi's plan of passing WindowAggState to the aggregates. This will require changing every one of the ten aggregates in the core distro, as well as every third-party aggregate that has a similar optimization; and we just have to keep our fingers crossed that anyone who's taking a short-cut will fix their code before it fails in the field. 2. Use an intermediate dummy AggState as I have in my version, but document some convention for telling this from a "real" AggState when needed. (Not hard, we just pick some field that would never be zero in a real AggState and document testing that.) This is certainly on the ugly side, but it would very substantially cut the number of places that need changes. Only aggregates that are doing something irreversible in their final-functions would need to be touched. If we were working in a green field then #1 would clearly be the preferable choice, but worrying about compatibility with existing third-party aggregates is making me lean to #2. Comments? regards, tom lane
pgsql-hackers by date: