Re: Partial aggregates pushdown - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Partial aggregates pushdown |
Date | |
Msg-id | CA+Tgmoam-sXZqhqESi1CyNG0RpSQdGLEspNudoXsOJYvUL4Gqg@mail.gmail.com Whole thread Raw |
In response to | Re: Partial aggregates pushdown (Alexander Pyhalov <a.pyhalov@postgrespro.ru>) |
List | pgsql-hackers |
On Wed, Nov 30, 2022 at 3:12 AM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote: > 1) In previous version of the patch aggregates, which had partialaggfn, > were ok to push down. And it was a definite sign that aggregate can be > pushed down. Now we allow pushing down an aggregate, which prorettype is > not internal and aggfinalfn is not defined. Is it safe for all > user-defined (or builtin) aggregates, even if they are generally > shippable? I think that this is exactly the correct test. Here's how to think about it: to perform an aggregate, you merge all the values into the transition state, and then you apply the final function once at the end. So the process looks like this: TRANSITION_STATE_0 + VALUE_1 = TRANSITION_STATE_1 TRANSITION_STATE_1 + VALUE_2 = TRANSITION_STATE_2 ... TRANSITION_STATE_N => RESULT Here, + represents applying the transition function and => represents applying the final function. In the case of parallel query, we want every worker to be able to incorporate values into its own transition states and then merge all the transition states at the end. That's a problem, because the transition function expects a transition state and a value, not two transition states. So we invented the idea of a "combine" function to solve this problem. A combine function takes two transition states and produces a new transition state. That allows each worker to create an initially empty transition state, merge a bunch of values into it, and then pass the result back to the leader, which can combine all the transition states using the combine function, and then apply the final function at the end. The same kind of idea works here. If we want to push down an entire aggregate, there's no problem, provided the remote side supports it: just push down the whole operation and get the result. But if we want to push down part of the aggregate, then what we want to get back is a transition value that we can then combine with other values (using the transition function) or other transition states (using the combine function) locally. That's tricky, because there's no SQL syntax to ask the remote side to give us the transition value rather than the final value. I think we would need to add that to solve this problem in its full generality. However, in the special case where there's no final function, the problem goes away, because then a transition value and a result are identical. If we ask for a result, we can treat it as a transition value, and there's no problem. Internal values are a problem. Generally, you don't see internal as the return type for an aggregate, because then the aggregate couldn't be called by the user. An internal value can't be returned. However, it's pretty common to see an aggregate that has an internal value as a transition type, and something else as the result type. In such cases, even if we had some syntax telling the remote side to send the transition value rather than the final value, it would not be sufficient, because the internal value still couldn't be transmitted. This problem also arises for parallel query, where we want to move transition values between processes within a single database cluster. We solved that problem using aggserialfn and aggdeserialfn. aggserialfn converts an internal transition value (which can't be moved between processes) into a bytea, and aggdeserialfn does the reverse. Maybe we would adopt the same solution here: our syntax that tells the remote side to give us the transition value rather than the final value could also tell the remote side to serialize it to bytea if it's an internal type. However, if we did this, we'd have to be sure that our deserialization functions were pretty well hardened against unexpected or even malicious input, because who knows whether that remote server is really going to send us a bytea in the format that we're expecting to get? Anyway, for the present patch, I think that testing whether there's a final function is the right thing, and testing whether the return type is internal doesn't hurt. If we want to extend this to other cases in the future, then I think we need syntax to ask the remote side for the unfinalized aggregate, like SELECT UNFINALIZED MAX(a) FROM t1, or whatever. I'm not sure what the best concrete SQL syntax is - probably not that. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: