RE: Partial aggregates pushdown - Mailing list pgsql-hackers

From Fujii.Yuki@df.MitsubishiElectric.co.jp"
Subject RE: Partial aggregates pushdown
Date
Msg-id TYRPR01MB13941CEA16574771B1BFD130595A02@TYRPR01MB13941.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Partial aggregates pushdown  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Partial aggregates pushdown
List pgsql-hackers
Hi Bruce, Jelte, hackers.

I apologize for my late response.

> From: Jelte Fennema-Nio <postgres@jeltef.nl>
> Sent: Thursday, August 8, 2024 8:49 PM
> SUMMARY OF THREAD
>
> The design of patch 0001 is agreed upon by everyone on the thread (so far). This adds the PARTIAL_AGGREGATE label for
> aggregates, which will cause the finalfunc not to run. It also starts using PARTIAL_AGGREGATE for pushdown of
> aggregates in postgres_fdw. In 0001 PARTIAL_AGGREGATE is only supported for aggregates with a non-internal/pseudo
> type as the stype.
>
> The design for patch 0002 is still under debate. This would expand on the functionality added by adding support for
> PARTIAL_AGGREGATE for aggregates with an internal stype. This is done by returning a byte array containing the bytes
> that the serialfunc of the aggregate returns.
>
> A competing proposal for 0002 is to instead change aggregates to not use an internal stype anymore, and create
dedicated
> types. The main downside here is that infunc and outfunc would need to be added for text serialization, in addition
tothe 
> binary serialization. An open question is: Can we change the requirements for CREATE TYPE, so that types can be
created
> without infunc and outfunc.

I rebased the patch 0001 and add the documentation to it.

Best regards, Yuki Fujii
--
Yuki Fujii
Information Technology R&D Center, Mitsubishi Electric Corporation

> -----Original Message-----
> From: Bruce Momjian <bruce@momjian.us>
> Sent: Friday, August 23, 2024 4:31 AM
> To: Tomas Vondra <tomas@vondra.me>
> Cc: Jelte Fennema-Nio <postgres@jeltef.nl>; Fujii Yuki/藤井 雄規(MELCO/情報総研 DM最適G)
> <Fujii.Yuki@df.MitsubishiElectric.co.jp>; PostgreSQL-development <pgsql-hackers@postgresql.org>; Robert Haas
> <robertmhaas@gmail.com>; Tom Lane <tgl@sss.pgh.pa.us>; Stephen Frost <sfrost@snowman.net>; Andres Freund
> <andres@anarazel.de>; Tomas Vondra <tomas.vondra@enterprisedb.com>; Julien Rouhaud <rjuju123@gmail.com>;
> Daniel Gustafsson <daniel@yesql.se>; vignesh C <vignesh21@gmail.com>; Alexander Pyhalov
> <a.pyhalov@postgrespro.ru>
> Subject: Re: Partial aggregates pushdown
>
> On Thu, Aug 22, 2024 at 08:31:11PM +0200, Tomas Vondra wrote:
> > > My question is related to #3 and #4.  For #3, if we are going to be
> > > building infrastructure to handle passing int128 for AVG, wouldn't
> > > it be wiser to create an int128 type and an int128 array type, and
> > > then use method #2 to handle those, rather than creating custom code
> > > just to read/write int128 values for FDWs aggregate passing alone.
> > >
> >
> > Yep, adding int128 as a data type would extend this to aggregates that
> > store state as int128 (or array of int128).
>
> Great, I am not too far off then.
>
> > > For #4, can we use or improve the RECORD data type to handle #4 ---
> > > that seems preferable to creating custom FDWs aggregate passing code.
> > >
> > > I know the open question was whether we should create custom FDWs
> > > aggregate passing functions or custom data types for FDWs aggregate
> > > passing, but I am asking if we can improve existing facilities, like
> > > int128 or record passing, to reduce the need for some of these.
> > >
> >
> > But which code would produce the record? AFAIK it can't happen in some
> > generic executor code, because that only sees "internal" for each
> > aggregate. The exact structure of the aggstate is private within the
> > code of each aggregate - the record would have to be built there, no?
> >
> > I imagine we'd add this for each aggregate as a new pair of functions
> > to build/parse the record, but that's kinda the serial/deserial way we
> > discussed earlier.
> >
> > Or are you suggesting we'd actually say:
> >
> >   CREATE AGGREGATE xyz(...) (
> >     STYPE = record,
> >     ...
> >   )
>
> So my idea from the email I just sent is to create a pg_proc.proargtypes-like column (data type oidvector) for
pg_aggregate
> which stores the oids of the values we want to return, so AVG(interval) would have an array of the oids for interval
andint8, 
> e.g.:
>
>     SELECT oid FROM pg_type WHERE typname = 'interval';
>      oid
>     ------
>      1186
>
>     SELECT oid FROM pg_type WHERE typname = 'int8';
>      oid
>     -----
>       20
>
>     SELECT '1186 20'::oidvector;
>      oidvector
>     -----------
>      1186 20
>
> It seems all four methods could use this, again assuming we create
> int128/int16 and whatever other types we need.
>
> --
>   Bruce Momjian  <bruce@momjian.us>        https://momjian.us
>   EDB                                      https://enterprisedb.com
>
>   Only you can decide what is important to you.

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: [PATCH] PGSERVICEFILE as part of a normal connection string
Next
From: David Rowley
Date:
Subject: Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning