RE: Partial aggregates pushdown - Mailing list pgsql-hackers

From Fujii.Yuki@df.MitsubishiElectric.co.jp"
Subject RE: Partial aggregates pushdown
Date
Msg-id OS3PR01MB66604C11E390FECBAA1B6518959A9@OS3PR01MB6660.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Partial aggregates pushdown  (Alexander Pyhalov <a.pyhalov@postgrespro.ru>)
Responses RE: Partial aggregates pushdown
List pgsql-hackers
Hi Mr.Vondra, Mr.Pyhalov.

I'm interesied in Mr.Pyhalov's patch due to the following background.
--Background
I develop postgresql's extension such as fdw in my work. 
I'm interested in using postgresql for OLAP. 
I think the function of a previous patch "Push aggregation down to base relations and joins"[1] is desiable. I rebased
theprevious patch and register the rebased patch on the next commitfest[2].
 
And I think it would be more useful if the previous patch works on a foreign table of postgres_fdw.
I realized the function of partial aggregation pushdown is necessary  to make the previous patch work on a foreign
tableof postgres_fdw.
 
--

So I reviewed Mr.Pyhalov's patch and discussions on this thread.
I made a draft of approach to respond to Mr.Vondra's comments.
Would you check whether my draft is right or not?

--My draft
> 1) It's not clear to me how could this get extended to aggregates with 
> more complex aggregate states, to support e.g. avg() and similar 
> fairly common aggregates.
We add a special aggregate function every aggregate function (hereafter we call this src)  which supports partial
aggregation.
The followings are differences between the src and the special aggregate function.
difference1) result type
The result type is same with the src's transtype if the src's transtype is not internal.
Otherwise the result type is bytea.

difference2) final func
The final func does not exist if the src's transtype is not internal.
Otherwize the final func returns serialized value.

For example, let me call the special aggregate function of avg(float8) avg_p(float8).
The result value of avg_p is a float8 array which consists of count and summation.
avg_p does not have finalfunc.

We pushdown the special aggregate function instead of a src.
For example, we issue "select avg_p(c) from t" instead of "select avg(c) from t"
in the above example.

We add a new column partialaggfn to pg_aggregate to get the oid of  the special aggregate function from the the src's
oid.
This column is the oid of the special aggregate function which corresponds to the src.

If an aggregate function does not have any special aggregate function,  then we does not pushdown any partial
aggregationof the aggregate function.
 

> 2) I'm not sure relying on aggpartialpushdownsafe without any version 
> checks etc. is sufficient. I mean, how would we know the remote node 
> has the same idea of representing the aggregate state. I wonder how 
> this aligns with assumptions we do e.g. for functions etc.
We add compatible server versions infomation to pg_aggregate and  the set of options of postgres_fdw's foreign server.
We check compatibility of an aggregate function using this infomation.

An additional column of pg_aggregate is compatibleversonrange.
This column is a range of postgresql server versions which  has compatible aggregate function.
An additional options of postgres_fdw's foreign server are serverversion and bwcompatibleverson.
serverversion is remote postgresql server version.
bwcompatibleverson is the maximum version in which any aggregate function is compatible with local noed's one.
Our version check passes if and only if at least one of the following conditions is true.
condition1) the option value of serverversion is in compatibleversonrange.
condition2) the local postgresql server version is between bwcompatibleverson and the option value of serverversion.

We can get the local postgresql server version from PG_VERSION_NUM macro.
We use condition1 if the local postgresql server version is not more than the remote one.
and use condition2 if the local postgresql server version is greater than the remote one.
--

Sincerely yours,
Yuuki Fujii

[1] https://commitfest.postgresql.org/32/1247/
[2] https://commitfest.postgresql.org/39/3764/

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: making relfilenodes 56 bits
Next
From: Pavel Stehule
Date:
Subject: Re: Schema variables - new implementation for Postgres 15