Re: WIP: Aggregation push-down - Mailing list pgsql-hackers
From | Antonin Houska |
---|---|
Subject | Re: WIP: Aggregation push-down |
Date | |
Msg-id | 86487.1589876240@antos Whole thread Raw |
In response to | Re: WIP: Aggregation push-down (Andy Fan <zhihui.fan1213@gmail.com>) |
Responses |
Re: WIP: Aggregation push-down
|
List | pgsql-hackers |
Andy Fan <zhihui.fan1213@gmail.com> wrote: > On Fri, Apr 24, 2020 at 8:10 PM Antonin Houska <ah@cybertec.at> wrote: > > Andy Fan <zhihui.fan1213@gmail.com> wrote: > > Let's not give up:) I see your patch can push down the aggregation in 3 cases > at least: > > 1). The group by clause exists in the join eq clause. > 2). The group by clause doesn't exist in join eq clause, but we have > pk on on side of join eq clause. > 3). The group by clause doesn't exist in join eq clause, and we don't have > the pk as well. > > Tom well explained the correctness of the first 2 cases [1] I admit that I don't quite understand his statement We can't simply aggregate during the scan of "a", because some values of "x" might not appear at all in the input of the naively-computed aggregate (if their rows have no join partner in "b"), or might appear multiple times (if their rows have multiple join partners). Consider the following tables ======= ====== A B ======= ====== x | y z ------- ------ 3 | 4 1 1 | 4 2 4 | 0 3 1 | 0 4 and this query SELECT sum(a.x), a.y FROM a, b WHERE a.y > b.z GROUP BY a.y; Without aggregate push-down, the aggregation input is a.x | a.y | b.z ---------------- 3 | 4 | 1 3 | 4 | 2 3 | 4 | 3 1 | 4 | 1 1 | 4 | 2 1 | 4 | 3 and the query result is sum(a.x) | a.y ----------------- 12 | 4 With the push-down, "A" aggregated looks like sum(a.x) | a.y ----------------- 4 | 4 5 | 0 and the join with "B" produces this: sum(a.x) | a.y | b.z --------------------- 4 | 4 | 1 4 | 4 | 2 4 | 4 | 3 After that, the final aggregation yields the same result that we got w/o the aggregate push-down: sum(a.x) | a.y ----------------- 12 | 4 So if a row of "A" has no / multiple join partners in "B", the same is true for the group produced by the pushed-down aggregation. Of course it's assumed that the input rows belonging to particular group can be distributed among the PartialAgg nodes in arbitrary way, for example: Agg <value 1> <value 1> <value 2> <value 2> needs to be equivalent to FinalAgg PartialAgg <value 1> <value 2> PartialAgg <value 1> <value 2> This should be o.k. because the partial aggregation performed currently by parallel workers relies o the same "distributive property" of aggregation. > So my little suggestion is can we split the patch into some smaller commit > to handle each case? like: commit 1 & 2 handles case 1 & 2,commit 3 handles > join relation as well. commit 4 handles the case 3. To separate the patch this way effectively means to rework the feature almost from scratch. I'm still not convinced that such a rework is necesary. > Commit 5 can avoid the two-phase aggregation for case 2. The two-phase aggregation can be avoided in more generic way as soon as the "unique key" feature (I think you do participated in the related discussion) is committed. > Commit 6 introduces the aggmultifnf. I used this function in the initial version of the patch [3], to join two grouped relations. The current version does not try to join two grouped relations and it's hard to imagine how to determine the "multiplication factor". I suppose this is not really trivial. Since the current version of the feature hasn't find it's way into PG core for several years, I'm not willing to make it even trickier soon, if at all. > and commit 7 to handle some outer join case Ashutosh raised at [2]. That was too high-level design, as he admits in [4]. > However not all the cases need to be handled at one time. Sure. > > > At the code level, I did some slight changes on init_grouping_targets which may > > make the code easier to read. You are free to to use/not use it. > > I'm going to accept your change of create_rel_agg_info(), but I hesitate about > the changes to init_grouping_targets(). > > First, is it worth to spend CPU cycles on construction of an extra list > grouping_columns? Is there a corner case in which we cannot simply pass > grouping_columns=target->exprs to check_functional_grouping()? > > Second, it's obvious that you prefer the style > > foreach () > { > if () > ... > else if () > ... > else > ... > } > > over this > > foreach () > { > if () > { > ... > continue; > } > > if () > { > ... > continue; > } > > ... > } > > I often prefer the latter and I see that the existing planner code uses this > style quite often too. I think the reason is that it allows for more complex > tests, while the "else-if style" requires all tests to take place inside the > "if ()" expression. However, if several (not necessarily tightly related) > tests become "compressed" this way, it's less obvious how where to put > comments. Indeed it seems that some comments got lost due to your changes. > > Your explanation looks reasonable, thanks for that. the changes also used > some builtin function to avoid the one more level for-loop. like tlist_member. Eventually I've reworked init_grouping_targets(), although not exactly as you proposed. I hope it's easier to understand now. > [1] https://www.postgresql.org/message-id/9726.1542577439%40sss.pgh.pa.us > [2] https://www.postgresql.org/message-id/flat/CAFjFpRdpeMTd8kYbM_x0769V-aEKst5Nkg3%2BcoG%3D8ki7s8Zqjw%40mail.gmail.com [3] https://www.postgresql.org/message-id/29111.1483984605@localhost [4] https://www.postgresql.org/message-id/CAFjFpRfUKq3Asgtki1XctPkCN6YC4oA2vNWh66OgBo-H26ePWA%40mail.gmail.com The next version is attached. -- Antonin Houska Web: https://www.cybertec-postgresql.com
Attachment
pgsql-hackers by date: