Home > mailing lists
Re: [HACKERS] WIP: Aggregation push-down - Mailing list pgsql-hackers

From	Antonin Houska
Subject	Re: [HACKERS] WIP: Aggregation push-down
Date	September 8, 2017 16:34:25
Msg-id	24235.1504877665@localhost Whole thread
In response to	Re: [HACKERS] WIP: Aggregation push-down (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses	Re: [HACKERS] WIP: Aggregation push-down
List	pgsql-hackers
Tree view
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

> On Thu, Aug 17, 2017 at 8:52 PM, Antonin Houska <ah@cybertec.at> wrote:
> > Antonin Houska <ah@cybertec.at> wrote:
> >
> >> Antonin Houska <ah@cybertec.at> wrote:
> >>
> >> > This is a new version of the patch I presented in [1].
> >>
> >> Rebased.
> >>
> >> cat .git/refs/heads/master
> >> b9a3ef55b253d885081c2d0e9dc45802cab71c7b
> >
> > This is another version of the patch.
> >
> > Besides other changes, it enables the aggregation push-down for postgres_fdw,
> > although only for aggregates whose transient aggregation state is equal to the
> > output type. For other aggregates (like avg()) the remote nodes will have to
> > return the transient state value in an appropriate form (maybe bytea type),
> > which does not depend on PG version.
>
> Having transient aggregation state type same as output type doesn't
> mean that we can feed the output of remote aggregation as is to the
> finalization stage locally or finalization is not needed at all. I am
> not sure why is that test being used to decide whether or not to push
> an aggregate (I assume they are partial aggregates) to the foreign
> server.

I agree. This seems to be the same problem as reported in [2].

> postgres_fdw doesn't have a feature to fetch partial aggregate
> results from the foreign server. Till we do that, I don't think this
> patch can push any partial aggregates down to the foreign server.

Even if the query contains only aggregates like sum(int4), i.e. aggfinalfn
does not exist and the transient type can be transfered from the remote server
in textual form? (Of course there's a question is if such behaviour is
consistent from user's perspective.)

> >
> > One thing I'm not sure about is whether the patch should remove
> > GetForeignUpperPaths function from FdwRoutine, which it essentially makes
> > obsolete. Or should it only be deprecated so far? I understand that
> > deprecation is important for "ordinary SQL users", but FdwRoutine is an
> > interface for extension developers, so the policy might be different.
>
> I doubt if that's correct. We can do that only when we know that all
> kinds aggregates/grouping can be pushed down the join tree, which
> looks impossible to me. Apart from that that FdwRoutine handles all
> kinds of upper relations like windowing, distinct, ordered which are
> not pushed down by this set of patches.

Good point.

> Here's review of the patchset
> The patches compile cleanly when applied together.
>
> Testcase "limit" fails in make check.

If I remember correctly, this test generates different plan because the
aggregate push-down gets involved. I tend to ignore this error until the patch
has cost estimates refined. Then let's see if the test will pass or if the
expected output should be adjusted.

> This is a pretty large patchset and the patches are divided by stages in the
> planner rather than functionality. IOW, no single patch provides a full
> functionality. Reviewing and probably committing such patches is not easy and
> may take quite long. Instead, if the patches are divided by functionality, i.e.
> each patch provides some functionality completely, it will be easy to review
> and commit such patches with each commit giving something useful. I had
> suggested breaking patches on that line at [1]. The patches also refactor
> existing code. It's better to move such refactoring in a patch/es by itself.

I definitely saw commits whose purpose is preparation for something else. But
you're right that some splitting of the actual funcionality wouldn't
harm. I'll at least separate partial aggregation of base relations and that of
joins. And maybe some more preparation steps.

> The patches need more comments, explaining various decisions

o.k.

> The patch maintains an extra rel target, two pathlists, partial pathlist and
> non-partial pathlist for grouping in RelOptInfo. These two extra
> pathlists require "grouped" argument to be passed to a number of functions.
> Instead probably it makes sense to create a RelOptInfo for aggregation/grouping
> and set pathlist and partial_pathlist in that RelOptInfo. That will also make a
> place for keeping grouped estimates.

If grouped paths require a separate RelOptInfo, why the existing partial paths
do not?

> For placeholders we have two function add_placeholders_to_base_rels() and
> add_placeholders_to_joinrel(). We have add_grouping_info_to_base_rels(), but do
> not have corresponding function for joinrels. How do we push aggregates
> involving two or more relations to the corresponding joinrels?

add_grouping_info_to_base_rels() calls prepare_rel_for_grouping() which
actually adds the "grouping info". For join, prepare_rel_for_grouping() is
called from build_join_rel().

While PlaceHolderVars need to be finalized before planner starts to create
joins, the GroupedPathInfo has to fit particular pair of joined relations.

Perhaps create_grouping_info_... would be better, to indicate that the thing
we're adding does not exist yet. I'll think about it.

> Some minor assorted comments ...

o.k., will fix.

> Some quick observations using two tables t1(a int, b int) and t2(a int, b int),
> populated as
> insert into t1 select i, i % 5 from generate_series(1, 100) i where i % 2 = 0;
> insert into t2 select i, i % 5 from generate_series(1, 100) i where i % 3 = 0;
>
> 1. The patch pushes aggregates down join in the following query
> explain verbose select avg(t2.a) from t1 inner join t2 on t1.b = t2.b group by
> t2.b;
> but does not push the aggregates down join in the following query
> explain verbose select avg(t2.a) from t1 left join t2 on t1.b = t2.b group by
> t2.b;

> In fact, it doesn't use the optimization for any OUTER join. I think the reason
> for this behaviour is that the patch uses equivalence classes to distribute the
> aggregates and grouping to base relations and OUTER equi-joins do not form
> equivalence classes. But I think it should be possible to push the aggregates
> down the OUTER join by adding one row for NULL values if the grouping is pushed
> to the inner side. I don't see much change for outer side. This also means that
> we have to choose some means other than equivalence class for propagating the
> aggregates.

The problem is that aggregate pushed down to the nullable side of an outer
join receives different input values than the original aggregate at the top
level of the query. NULL values generated by the OJ make the difference. I
have no idea how to handle this problem. If the aggregation is performed on
the nullable side of the OJ, it can't predict which of the input values don't
match any value of the other side. Suggestions are appreciated.

> 2. Following query throws error with these patches, but not without the
> patches.
> explain verbose select sum(t1.a + t2.a) from t1, t2, t2 t3 where t1.a
> = t2.a
> group by t2.a, t1.a;
> ERROR:  ORDER/GROUP BY expression not found in targetlist

I'll check this. Thanks.

> [1]
https://www.postgresql.org/message-id/CAFjFpRejPP4K%3Dg%2B0aaq_US0YrMaZzyM%2BNUCi%3DJgwaxhMUj2Zcg%40mail.gmail.com

[2] https://www.postgresql.org/message-id/CAM2%2B6%3DW2J-iaQBgj-sdMERELQLUm5dvOQEWQ2ho%2BQ7KZgnonkQ%40mail.gmail.com

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-hackers by date:
From: amul sul
Date: 08 September 2017, 15:40:25
Subject: Re: [HACKERS] [POC] hash partitioning
From: Tom Lane
Date: 08 September 2017, 16:40:04
Subject: Re: [HACKERS] pgbench tap tests & minor fixes.
Re: [HACKERS] WIP: Aggregation push-down - Mailing list pgsql-hackers

Previous

Next