Hi Alexander,
On 10/15/21 15:15, Alexander Pyhalov wrote:
> Hi.
>
> One of the issues when we try to use sharding in PostgreSQL is absence
> of partial aggregates pushdown.
>
> I see several opportunities to alleviate this issue.
> If we look at Citus, it implements aggregate, calculating internal state
> of an arbitrary agregate function and exporting it as text. So we could
> calculate internal states independently on all data sources and then
> finalize it, which allows to compute arbitrary aggregate.
>
> But, as mentioned in [1] thread, for some functions (like
> count/max/min/sum) we can just push down them. It seems easy and covers
> a lot of cases.
> For now there are still issues - for example you can't handle functions
> as avg() as we should somehow get its internal state or sum() variants,
> which need aggserialfn/aggdeserialfn. Preliminary version is attached.
>
> Is someone else working on the issue? Does suggested approach make sense?
>
I think a couple people worked on this (or something similar/related) in
the past, but I don't recall any recent patches.
IMHO being able to push-down parts of an aggregation to other nodes is a
very desirable feature, that might result in huge improvements for some
analytical workloads.
As for the proposed approach, it's probably good enough for the first
version to restrict this to aggregates where the aggregate result is
sufficient, i.e. we don't need any new export/import procedures.
But it's very unlikely we'd want to restrict it the way the patch does
it, i.e. based on aggregate name. That's both fragile (people can create
new aggregates with such name) and against the PostgreSQL extensibility
(people may implement custom aggregates, but won't be able to benefit
from this just because of name).
So for v0 maybe, but I think there neeeds to be a way to relax this in
some way, for example we could add a new flag to pg_aggregate to mark
aggregates supporting this.
And then we should extend this for aggregates with more complex internal
states (e.g. avg), by supporting a function that "exports" the aggregate
state - similar to serial/deserial functions, but needs to be portable.
I think the trickiest thing here is rewriting the remote query to call
this export function, but maybe we could simply instruct the remote node
to use a different final function for the top-level node?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company