RE: Partial aggregates pushdown - Mailing list pgsql-hackers
From | Fujii.Yuki@df.MitsubishiElectric.co.jp" |
---|---|
Subject | RE: Partial aggregates pushdown |
Date | |
Msg-id | TYAPR01MB30884F8E4D94038CD661FFEF95F02@TYAPR01MB3088.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: Partial aggregates pushdown (Alexander Pyhalov <a.pyhalov@postgrespro.ru>) |
Responses |
Re: Partial aggregates pushdown
Re: Partial aggregates pushdown Re: Partial aggregates pushdown |
List | pgsql-hackers |
Hi Mr. Pyhalov. Sorry for the late reply. Thank you for your modification and detailed review. I attach a fixed patch, have been not yet rebased. Monday, 25 March 2024 16:01 Alexander Pyhalov <a.pyhalov@postgrespro.ru>:. > Comment in nodeAgg.c seems to be strange: > > 1079 /* > 1080 * If the agg's finalfn is provided and PARTIAL_AGGREGATE > keyword is > 1081 * not specified, apply the agg's finalfn. > 1082 * If PARTIAL_AGGREGATE keyword is specified and the > transValue type > 1083 * is internal, apply the agg's serialfn. In this case, if > the agg's > 1084 * serialfn must not be invalid. Otherwise return > transValue. > 1085 */ > > Likely, you mean: > > ... In this case the agg'ss serialfn must not be invalid... Fixed. > Lower, in the same file, please, correct error message: > > 1136 if(!OidIsValid(peragg->serialfn_oid)) > 1137 elog(ERROR, "serialfunc is note provided > for partial aggregate"); > > it should be "serialfunc is not provided for partial aggregate" Fixed. > Also something is wrong with the following test : > > SELECT /* aggregate <> partial aggregate */ > array_agg(c_int4array), array_agg(b), > avg(b::int2), avg(b::int4), avg(b::int8), avg(c_interval), > avg(b::float4), avg(b::float8), > corr(b::float8, (b * b)::float8), > covar_pop(b::float8, (b * b)::float8), > covar_samp(b::float8, (b * b)::float8), > regr_avgx((2 * b)::float8, b::float8), > ..... > > Its results have changed since last patch. Do they depend on daylight > saving time? You are right. In my environment, TimeZone is set to 'PST8PDT' with which timetz values depends on daylight saving time. Changed TimeZone to 'UTC' in this test. > You can see that filter is applied before append. The result is correct > only by chance, as sum in every partition is actually < 700. If you > lower this bound, let's say, to 200, you'll start getting wrong results > as data is filtered prior to aggregation. > > It seems, however, that in partial case you should just avoid pulling > conditions from having qual at all, all filters will be applied on upper > level. Something like Thank you for your modification. > Found one more problem. You can fire partial aggregate over partitioned > table, but convert_combining_aggrefs() will make non-partial copy, which > leads to > 'variable not found in subplan target list' error. Thanks for the correction as well. As you pointed out, the original patch certainly had the potential to cause problems. However, I could not actually reproduce the problem in cases such as the following. Settings: t(c1, c2) is a patitioned table whose partition key is c1. t1, t2 are patitions of t and are partitioned table. t11, t12: partitions of t1 and foreign table of postgres_fdw. t21, t22: partitions of t2 and foreign table of postgres_fdw. Query: select c2 / 2, sum(c1) from t group by c2 / 2 order by 1 If you have a reproducible example, I would like to add it to the regression test. Do you have a reproducible example? > Also denied partial agregates pushdown on server version mismatch. > Should check_partial_aggregate_support be 'true' by default? Could we discuss this point after we determine how to transfer state values? If we determine this point, we can easly determine whether check_partial_aggregate_support shold be 'true' by default. > I'm not sure what to do with current grammar - it precludes partial > distinct aggregates. I understand that it's currently impossible to have > partial aggregation for distinct agregates -but does it worth to have > such restriction at grammar level? If partial aggregation for distinct agregates becomes possible in the future, I see no problem with the policy of accepting new SQL keywords, such as "PARTIL_AGGREGATE DISTINCT". Sincerely yours, Yuuki Fujii -- Yuuki Fujii Information Technology R&D Center Mitsubishi Electric Corporation
Attachment
pgsql-hackers by date: