Re: Partial aggregates pushdown - Mailing list pgsql-hackers
From | Alexander Pyhalov |
---|---|
Subject | Re: Partial aggregates pushdown |
Date | |
Msg-id | 8dc86e502fbd9d04da0dc6cdd4156f84@postgrespro.ru Whole thread Raw |
In response to | Re: Partial aggregates pushdown (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
RE: Partial aggregates pushdown
|
List | pgsql-hackers |
Tomas Vondra писал 2022-03-22 15:28: > On 3/22/22 01:49, Andres Freund wrote: >> On 2022-01-17 15:27:53 +0300, Alexander Pyhalov wrote: >>> Alexander Pyhalov писал 2022-01-17 15:26: >>>> Updated patch. >>> >>> Sorry, missed attachment. >> >> Needs another update: http://cfbot.cputube.org/patch_37_3369.log >> >> Marked as waiting on author. >> > > TBH I'm still not convinced this is the right approach. I've voiced > this > opinion before, but to reiterate the main arguments: > > 1) It's not clear to me how could this get extended to aggregates with > more complex aggregate states, to support e.g. avg() and similar fairly > common aggregates. Hi. Yes, I'm also not sure how to proceed with aggregates with complex state. Likely it needs separate function to export their state, but then we should somehow ensure that this function exists and our 'importer' can handle its result. Note that for now we have no mechanics in postgres_fdw to find out remote server version on planning stage. > 2) I'm not sure relying on aggpartialpushdownsafe without any version > checks etc. is sufficient. I mean, how would we know the remote node > has > the same idea of representing the aggregate state. I wonder how this > aligns with assumptions we do e.g. for functions etc. It seems to be not a problem for me, as for now we don't care about remote node internal aggregate state representation. We currently get just aggregate result from remote node. For aggregates with 'internal' stype we call converter locally, and it converts external result from aggregate return type to local node internal representation. > > Aside from that, there's a couple review comments: > > 1) should not remove the comment in foreign_expr_walker Fixed. > > 2) comment in deparseAggref is obsolete/inaccurate Fixed. > > 3) comment for partial_agg_ok should probably explain when we consider > aggregate OK to be pushed down Expanded comment. > > 4) I'm not sure why get_rcvd_attinmeta comment talks about "return type > bytea" and "real input type". Expanded comment. Tupdesc can be retrieved from node->ss.ss_ScanTupleSlot, and so we expect to see bytea (as should be produced by partial aggregation). But when we scan data, we get aggregate output type (which matches converter input type), so attinmeta should be fixed. If we deal with aggregate which doesn't have converter, partial_agg_ok() ensures that agg->aggfnoid return type matches agg->aggtranstype. > 5) Talking about "partial" aggregates is a bit confusing, because that > suggests this is related to actual "partial aggregates". But it's not. How should we call them? It's about pushing "Partial count()" or "Partial sum()" to the remote server, why it's not related to partial aggregates? Do you mean that it's not about parallel aggregate processing? > 6) Can add_foreign_grouping_paths do without the new 'partial' > parameter? Clearly, it can be deduced from extra->patype, no? Fixed this. > > 7) There's no docs for PARTIALCONVERTERFUNC / PARTIAL_PUSHDOWN_SAFE in > CREATE AGGREGATE sgml docs. Added documentation. I'd appreciate advice on how it should be extended. > > 8) I don't think "serialize" in the converter functions is the right > term, considering those functions are not "serializing" anything. If > anything, it's the remote node that is serializing the agg state and > the > local not is deserializing it. Or maybe I just misunderstand where are > the converter functions executed? Converter function transforms aggregate result to serialized internal representation, which is expected from partial aggregate. I mean, it converts aggregate result type to internal representation and then efficiently executes serialization code (i.e. converter(x) == serialize(to_internal(x))). -- Best regards, Alexander Pyhalov, Postgres Professional
Attachment
pgsql-hackers by date: