On Tue, 17 Jan 2023 at 13:16, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > I took a look at this, and I agree that the best solution is probably > to have make_pathkeys_for_groupagg() ignore Aggrefs that contain > volatile functions.
Thanks for giving that some additional thought. I've just pushed a fix which adjusts things that way.
This makes a lot of sense. I agree that we shouldn't do pre-sorting for volatile sort expressions, especially when there are multiple aggregates with the same volatile sort expression.
Not related to this specific issue, but I find sorting by volatile expression is confusing in different scenarios. Consider the two queries given by David
Query 1: select string_agg(random()::text, ',' order by random()) from generate_series(1,3);
Query 2: select random()::text from generate_series(1,3) order by random();
Considering the targetlist as Aggref->args or Query->targetList, in both queries we would add an additional TargetEntry (as resjunk column) for the ORDER BY item 'random()', because it's not present in the existing targetlist. Note that the existing TargetEntry for 'random()::text' is a CoerceViaIO expression which is an explicit cast, so we cannot strip it and match it to the ORDER BY item. Thus we would have two random() FuncExprs in the final targetlist, for both queries.
In query 1 we call random() twice for each tuple, one for the original TargetEntry 'random()::text', and one for the TargetEntry of the ORDER BY item 'random()', and do the sorting according to the second call results. Thus we would notice the final output is unsorted because it's from the first random() call.
However, in query 2 we have the ORDER BY item 'random()' in the scan/join node's targetlist. And then for the two random() FuncExprs in the final targetlist, set_plan_references would adjust both of them to refer to the outputs of the scan/join node. Thus random() is actually called only once for each tuple and we would find the final output is sorted.
It seems we fail to keep consistent about the behavior of sorting by volatile expression in the two scenarios.
BTW, I wonder if we should have checked CoercionForm before fix_upper_expr_mutator steps into CoerceViaIO->arg to adjust the expr there. It seems parser checks it and only strips implicit coercions when matching TargetEntry expr to ORDER BY item.