Re: WIP: Aggregation push-down - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: WIP: Aggregation push-down
Date
Msg-id 86487.1589876240@antos
Whole thread Raw
In response to Re: WIP: Aggregation push-down  (Andy Fan <zhihui.fan1213@gmail.com>)
Responses Re: WIP: Aggregation push-down
List pgsql-hackers
Andy Fan <zhihui.fan1213@gmail.com> wrote:

> On Fri, Apr 24, 2020 at 8:10 PM Antonin Houska <ah@cybertec.at> wrote:
>
> Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> Let's not give up:)  I see your patch can push down the aggregation in 3 cases
> at least:
>
> 1).  The group by clause exists in the join eq clause.
> 2).  The group by clause doesn't exist in join eq clause, but we have
> pk on on side of join eq clause.
> 3).  The group by clause doesn't exist in join eq clause, and we don't have
> the pk as well.
>
> Tom well explained the correctness of the first 2 cases [1]

I admit that I don't quite understand his statement

    We can't simply aggregate during the scan of "a", because some values of
    "x" might not appear at all in the input of the naively-computed aggregate
    (if their rows have no join partner in "b"), or might appear multiple
    times (if their rows have multiple join partners).

Consider the following tables

    =======         ======
       A              B
    =======         ======
     x | y            z
    -------         ------
     3 | 4            1
     1 | 4            2
     4 | 0            3
     1 | 0            4


and this query

    SELECT sum(a.x), a.y FROM a, b WHERE a.y > b.z GROUP BY a.y;

Without aggregate push-down, the aggregation input is

     a.x | a.y | b.z
    ----------------
      3  |  4  |  1
      3  |  4  |  2
      3  |  4  |  3
      1  |  4  |  1
      1  |  4  |  2
      1  |  4  |  3


and the query result is

     sum(a.x) | a.y
    -----------------
         12   |  4


With the push-down, "A" aggregated looks like

     sum(a.x) | a.y
    -----------------
          4   |  4
          5   |  0

and the join with "B" produces this:

     sum(a.x) | a.y | b.z
    ---------------------
          4   |  4  | 1
          4   |  4  | 2
          4   |  4  | 3

After that, the final aggregation yields the same result that we got w/o the
aggregate push-down:

     sum(a.x) | a.y
    -----------------
         12   |  4

So if a row of "A" has no / multiple join partners in "B", the same is true
for the group produced by the pushed-down aggregation. Of course it's assumed
that the input rows belonging to particular group can be distributed among the
PartialAgg nodes in arbitrary way, for example:

    Agg
        <value 1>
        <value 1>
        <value 2>
        <value 2>

needs to be equivalent to

    FinalAgg
        PartialAgg
            <value 1>
            <value 2>
        PartialAgg
            <value 1>
            <value 2>

This should be o.k. because the partial aggregation performed currently by
parallel workers relies o the same "distributive property" of aggregation.

> So my little suggestion is can we split the patch into some smaller commit
> to handle each case?  like: commit 1 & 2 handles case 1 & 2,commit 3 handles
> join relation as well.   commit 4 handles the case 3.

To separate the patch this way effectively means to rework the feature almost
from scratch. I'm still not convinced that such a rework is necesary.

> Commit 5 can avoid the two-phase aggregation for case 2.

The two-phase aggregation can be avoided in more generic way as soon as the
"unique key" feature (I think you do participated in the related discussion)
is committed.

> Commit 6 introduces the aggmultifnf.

I used this function in the initial version of the patch [3], to join two
grouped relations. The current version does not try to join two grouped
relations and it's hard to imagine how to determine the "multiplication
factor". I suppose this is not really trivial. Since the current version of
the feature hasn't find it's way into PG core for several years, I'm not
willing to make it even trickier soon, if at all.

> and commit 7 to handle some outer join case Ashutosh raised at [2].

That was too high-level design, as he admits in [4].

> However not all the cases need to be handled at one time.

Sure.

>
>  > At the code level, I did some slight changes on init_grouping_targets which may
>  > make the code easier to read.  You are free to to use/not use it.
>
>  I'm going to accept your change of create_rel_agg_info(), but I hesitate about
>  the changes to init_grouping_targets().
>
>  First, is it worth to spend CPU cycles on construction of an extra list
>  grouping_columns? Is there a corner case in which we cannot simply pass
>  grouping_columns=target->exprs to check_functional_grouping()?
>
>  Second, it's obvious that you prefer the style
>
>      foreach ()
>      {
>          if ()
>             ...
>          else if ()
>             ...
>          else
>             ...
>      }
>
>  over this
>
>      foreach ()
>      {
>          if ()
>          {
>             ...
>             continue;
>          }
>
>          if ()
>          {
>             ...
>             continue;
>          }
>
>          ...
>      }
>
>  I often prefer the latter and I see that the existing planner code uses this
>  style quite often too. I think the reason is that it allows for more complex
>  tests, while the "else-if style" requires all tests to take place inside the
>  "if ()" expression. However, if several (not necessarily tightly related)
>  tests become "compressed" this way, it's less obvious how where to put
>  comments.  Indeed it seems that some comments got lost due to your changes.
>
> Your explanation looks reasonable,  thanks for that.  the changes also used
> some builtin function to avoid the one more level for-loop. like tlist_member.

Eventually I've reworked init_grouping_targets(), although not exactly as you
proposed. I hope it's easier to understand now.

> [1] https://www.postgresql.org/message-id/9726.1542577439%40sss.pgh.pa.us
> [2]
https://www.postgresql.org/message-id/flat/CAFjFpRdpeMTd8kYbM_x0769V-aEKst5Nkg3%2BcoG%3D8ki7s8Zqjw%40mail.gmail.com 

[3] https://www.postgresql.org/message-id/29111.1483984605@localhost
[4] https://www.postgresql.org/message-id/CAFjFpRfUKq3Asgtki1XctPkCN6YC4oA2vNWh66OgBo-H26ePWA%40mail.gmail.com

The next version is attached.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com


Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Expand the use of check_canonical_path() for more GUCs
Next
From: Amit Kapila
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions