Home > mailing lists

Re: WIP: Aggregation push-down - Mailing list pgsql-hackers

From	Andy Fan
Subject	Re: WIP: Aggregation push-down
Date	April 26, 2020 11:12:04
Msg-id	CAKU4AWpzp1ZaDdD+hX8q4zpXhgNg9S93zNwmAc4Q3h5Xqj0NSA@mail.gmail.com Whole thread Raw
In response to	Re: WIP: Aggregation push-down (Antonin Houska <ah@cybertec.at>)
Responses	Re: WIP: Aggregation push-down
List	pgsql-hackers

Tree view

On Fri, Apr 24, 2020 at 8:10 PM Antonin Houska <ah@cybertec.at> wrote:

Andy Fan <zhihui.fan1213@gmail.com> wrote:

> The more tests on your patch, the more powerful I feel it is!

Thanks for the appreciation. Given the poor progress it's increasingly hard
for me to find motivation to work on it. I'll try to be more professional :-)

Let's not give up:) I see your patch can push down the aggregation in 3 cases

at least:

1). The group by clause exists in the join eq clause.

2). The group by clause doesn't exist in join eq clause, but we have

pk on on side of join eq clause.

3). The group by clause doesn't exist in join eq clause, and we don't have

the pk as well.

Tom well explained the correctness of the first 2 cases [1] and probably the case

3) is correct as well, but it is a bit of hard to understand. If this is a case for others

as well, that probably make the review harder.

So my little suggestion is can we split the patch into some smaller commit

to handle each case? like: commit 1 & 2 handles case 1 & 2，commit 3 handles

join relation as well. commit 4 handles the case 3. Commit 5 can avoid the

two-phase aggregation for case 2. Commit 6 introduces the aggmultifn. and

commit 7 to handle some outer join case Ashutosh raised at [2]. However not

all the cases need to be handled at one time.

Actually when I read the code, I want such separation by my own to verify my

understanding is correct, but I don't have such time at least now. It maybe much

easier for you since you are the author. I will be pretty pleasure to help in any case

after I close my current working item (Hopefully in 2 months).

> At the code level, I did some slight changes on init_grouping_targets which may
> make the code easier to read. You are free to to use/not use it.

I'm going to accept your change of create_rel_agg_info(), but I hesitate about
the changes to init_grouping_targets().

First, is it worth to spend CPU cycles on construction of an extra list
grouping_columns? Is there a corner case in which we cannot simply pass
grouping_columns=target->exprs to check_functional_grouping()?

Second, it's obvious that you prefer the style

foreach ()
{
if ()
...
else if ()
...
else
...
}

over this

foreach ()
{
if ()
{
...
continue;
}

if ()
{
...
continue;
}

...
}

I often prefer the latter and I see that the existing planner code uses this
style quite often too. I think the reason is that it allows for more complex
tests, while the "else-if style" requires all tests to take place inside the
"if ()" expression. However, if several (not necessarily tightly related)
tests become "compressed" this way, it's less obvious how where to put
comments. Indeed it seems that some comments got lost due to your changes.

Your explanation looks reasonable, thanks for that. the changes also used

some builtin function to avoid the one more level for-loop. like tlist_member.

As for the high level design, based on my current knowledge, probably no need

to change.

[1] https://www.postgresql.org/message-id/9726.1542577439%40sss.pgh.pa.us

[2] https://www.postgresql.org/message-id/flat/CAFjFpRdpeMTd8kYbM_x0769V-aEKst5Nkg3%2BcoG%3D8ki7s8Zqjw%40mail.gmail.com

pgsql-hackers by date:

From: 曾文旌
Date: 26 April 2020, 11:09:02
Subject: Re: [Proposal] Global temporary tables

From: David Rowley
Date: 26 April 2020, 12:48:51
Subject: Re: Subplan result caching

Re: WIP: Aggregation push-down - Mailing list pgsql-hackers

Previous

Next