RE: [CAUTION!! freemail] Re: Partial aggregates pushdown - Mailing list pgsql-hackers
From | Fujii.Yuki@df.MitsubishiElectric.co.jp" |
---|---|
Subject | RE: [CAUTION!! freemail] Re: Partial aggregates pushdown |
Date | |
Msg-id | OS3PR01MB6660AE8BB4D08125EB4D3279950D9@OS3PR01MB6660.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: Partial aggregates pushdown (Ted Yu <yuzhihong@gmail.com>) |
Responses |
Re: [CAUTION!! freemail] Re: Partial aggregates pushdown
|
List | pgsql-hackers |
Hi Mr.Yu. Thank you for comments. > + * Check that partial aggregate agg has compatibility > > If the `agg` refers to func parameter, the parameter name is aggform I fixed the above typo and made the above comment easy to understand New comment is "Check that partial aggregate function of aggform exsits in remote" > + int32 partialagg_minversion = PG_VERSION_NUM; > + if (aggform->partialagg_minversion == > PARTIALAGG_MINVERSION_DEFAULT) { > + partialagg_minversion = PG_VERSION_NUM; > > > I am curious why the same variable is assigned the same value twice. It seems > the if block is redundant. > > + if ((fpinfo->server_version >= partialagg_minversion)) { > + compatible = true; > > > The above can be simplified as: return fpinfo->server_version >= > partialagg_minversion; I fixed according to your comment. Sincerely yours, Yuuki Fujii -- Yuuki Fujii Information Technology R&D Center Mitsubishi Electric Corporation > -----Original Message----- > From: Ted Yu <yuzhihong@gmail.com> > Sent: Tuesday, November 22, 2022 2:00 PM > To: Fujii Yuki/藤井 雄規(MELCO/情報総研 DM最適G) > <Fujii.Yuki@df.MitsubishiElectric.co.jp> > Cc: Alexander Pyhalov <a.pyhalov@postgrespro.ru>; Tomas Vondra > <tomas.vondra@enterprisedb.com>; PostgreSQL-development > <pgsql-hackers@postgresql.org>; Andres Freund <andres@anarazel.de>; > Zhihong Yu <zyu@yugabyte.com>; Julien Rouhaud <rjuju123@gmail.com>; > Daniel Gustafsson <daniel@yesql.se>; Ilya Gladyshev > <i.gladyshev@postgrespro.ru> > Subject: [CAUTION!! freemail] Re: Partial aggregates pushdown > > > > On Mon, Nov 21, 2022 at 5:02 PM Fujii.Yuki@df.MitsubishiElectric.co.jp > <mailto:Fujii.Yuki@df.MitsubishiElectric.co.jp> > <Fujii.Yuki@df.mitsubishielectric.co.jp > <mailto:Fujii.Yuki@df.mitsubishielectric.co.jp> > wrote: > > > Hi Mr.Vondra, Mr.Pyhalov, Everyone. > > I discussed with Mr.Pyhalov about the above draft by directly sending > mail to > him(outside of pgsql-hackers). Mr.Pyhalov allowed me to update his > patch > along with the above draft. So I update Mr.Pyhalov's patch v10. > > I wrote my patch for discussion. > My patch passes regression tests which contains additional basic > postgres_fdw tests > for my patch's feature. But my patch doesn't contain sufficient > documents and tests. > If reviewers accept my approach, I will add documents and tests to my > patch. > > The following is a my patch's readme. > # I simplified the above draft. > > --readme of my patch > 1. interface > 1) pg_aggregate > There are the following additional columns. > a) partialaggfn > data type : regproc. > default value: zero(means invalid). > description : This field refers to the special aggregate > function(then we call > this partialaggfunc) > corresponding to aggregation function(then we call src) which has > aggfnoid. > partialaggfunc is used for partial aggregation pushdown by > postgres_fdw. > The followings are differences between the src and the special > aggregate function. > difference1) result type > The result type is same as the src's transtype if the src's > transtype > is not internal. > Otherwise the result type is bytea. > difference2) final func > The final func does not exist if the src's transtype is not > internal. > Otherwize the final func returns serialized value. > For example, there is a partialaggfunc avg_p_int4 which > corresponds to avg(int4) > whose aggtranstype is _int4. > The result value of avg_p_int4 is a float8 array which consists of > count and > summation. avg_p_int4 does not have finalfunc. > For another example, there is a partialaggfunc avg_p_int8 which > corresponds to > avg(int8) whose aggtranstype is internal. > The result value of avg_p_int8 is a bytea serialized array which > consists of count > and summation. avg_p_int8 has finalfunc int8_avg_serialize > which is serialize function > of avg(int8). This field is zero if there is no partialaggfunc. > > b) partialagg_minversion > data type : int4. > default value: zero(means current version). > description : This field is the minimum PostgreSQL server version > which has > partialaggfunc. This field is used for checking compatibility of > partialaggfunc. > > The above fields are valid in tuples for builtin avg, sum, min, max, > count. > There are additional records which correspond to partialaggfunc for > avg, sum, min, max, > count. > > 2) pg_proc > There are additional records which correspond to partialaggfunc for > avg, sum, min, max, > count. > > 3) postgres_fdw > postgres_fdw has an additional foreign server option server_version. > server_version is > integer value which means remote server version number. Default > value of server_version > is zero. server_version is used for checking compatibility of > partialaggfunc. > > 2. feature > postgres_fdw can pushdown partial aggregation of avg, sum, min, max, > count. > Partial aggregation pushdown is fine when the following two > conditions are both true. > condition1) partialaggfn is valid. > condition2) server_version is not less than partialagg_minversion > postgres_fdw executes pushdown the patialaggfunc instead of a src. > For example, we issue "select avg_p_int4(c) from t" instead of "select > avg(c) from t" > in the above example. > > postgres_fdw can pushdown every aggregate function which supports > partial aggregation > if you add a partialaggfunc corresponding to the aggregate function by > create aggregate > command. > > 3. difference between my patch and Mr.Pyhalov's v10 patch. > 1) In my patch postgres_fdw can pushdown partial aggregation of avg > 2) In my patch postgres_fdw can pushdown every aggregate function > which supports partial > aggregation if you add a partialaggfunc corresponding to the > aggregate function. > > 4. sample commands in psql > \c postgres > drop database tmp; > create database tmp; > \c tmp > create extension postgres_fdw; > create server server_01 foreign data wrapper postgres_fdw > options(host 'localhost', dbname 'tmp', server_version '160000', async_capable > 'true'); > create user mapping for postgres server server_01 options(user > 'postgres', password 'postgres'); > create server server_02 foreign data wrapper postgres_fdw > options(host 'localhost', dbname 'tmp', server_version '160000', async_capable > 'true'); > create user mapping for postgres server server_02 options(user > 'postgres', password 'postgres'); > > create table t(dt timestamp, id int4, name text, total int4, val float4, type > int4, span interval) partition by list (type); > > create table t1(dt timestamp, id int4, name text, total int4, val float4, > type int4, span interval); > create table t2(dt timestamp, id int4, name text, total int4, val float4, > type int4, span interval); > > truncate table t1; > truncate table t2; > insert into t1 select timestamp'2020-01-01' + cast(t || ' seconds' as > interval), t % 100, 'hoge' || t, 1, 1.1, 1, cast('1 seconds' as interval) from > generate_series(1, 100000, 1) t; > insert into t1 select timestamp'2020-01-01' + cast(t || ' seconds' as > interval), t % 100, 'hoge' || t, 2, 2.1, 1, cast('2 seconds' as interval) from > generate_series(1, 100000, 1) t; > insert into t2 select timestamp'2020-01-01' + cast(t || ' seconds' as > interval), t % 100, 'hoge' || t, 1, 1.1, 2, cast('1 seconds' as interval) from > generate_series(1, 100000, 1) t; > insert into t2 select timestamp'2020-01-01' + cast(t || ' seconds' as > interval), t % 100, 'hoge' || t, 2, 2.1, 2, cast('2 seconds' as interval) from > generate_series(1, 100000, 1) t; > > create foreign table f_t1 partition of t for values in (1) server server_01 > options(table_name 't1'); > create foreign table f_t2 partition of t for values in (2) server server_02 > options(table_name 't2'); > > set enable_partitionwise_aggregate = on; > explain (verbose, costs off) select avg(total::int4), avg(total::int8) from > t; > select avg(total::int4), avg(total::int8) from t; > > Sincerely yours, > Yuuki Fujii > -- > Yuuki Fujii > Information Technology R&D Center Mitsubishi Electric Corporation > > > > Hi, > For partial_agg_compatible : > > + * Check that partial aggregate agg has compatibility > > If the `agg` refers to func parameter, the parameter name is aggform > > + int32 partialagg_minversion = PG_VERSION_NUM; > + if (aggform->partialagg_minversion == > PARTIALAGG_MINVERSION_DEFAULT) { > + partialagg_minversion = PG_VERSION_NUM; > > > I am curious why the same variable is assigned the same value twice. It seems > the if block is redundant. > > + if ((fpinfo->server_version >= partialagg_minversion)) { > + compatible = true; > > > The above can be simplified as: return fpinfo->server_version >= > partialagg_minversion; > > Cheers
Attachment
pgsql-hackers by date: