Re: Aggregate Push Down - Performing aggregation on foreign server - Mailing list pgsql-hackers
From | Jeevan Chalke |
---|---|
Subject | Re: Aggregate Push Down - Performing aggregation on foreign server |
Date | |
Msg-id | CAM2+6=Vx+a9ub8W0QX-Neziwa86De25qZwrgf_jjTkJFR0qqng@mail.gmail.com Whole thread Raw |
In response to | Re: Aggregate Push Down - Performing aggregation on foreign server (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>) |
Responses |
Re: Aggregate Push Down - Performing aggregation on foreign server
|
List | pgsql-hackers |
On Wed, Oct 12, 2016 at 3:38 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
Done.
Using scan rel's relids in the context and checking for any foreign var
reference. Added Relids member in foreign_glob_cxt and scanrel in
deparse_expr_cxt to accomplish this. Removed logic of copying relids
from scanrel to grouped_rel. In create_foreignscan_plan(), assigned
all_baserels to fs_relids for upper relations per one of the suggestion
mentioned in the said email discussion.
I have observed that in deparse.c we do add parenthesis around
expressions, unlike to PRETTY_PAREN handling in ruleutils.c.
Thus we need parenthesis here too. However changed this comments
per other comment in the file.
I have tried it. Attached separate patch for it.
However I have noticed that istoplevel is always false (at-least for the
testcase we have, I get it false always). And I also think that taking
this decision only on PlanState is enough. Does that in attached patch.
To fix this, I have passed deparse_namespace to the private argument and
accessed it in get_special_variable(). Changes looks very simple. Please
have a look and let me know your views.
This issue is not introduced by the changes done for the aggregate push
down, it only got exposed as we now ship expressions in the target list.
Thus I think it will be good if we make these changes separately as new
patch, if required.
I see mixed use of capitalization in the file and use of it is adhoc too.
So not convinced with your argument yet. I don't think there is any policy
for that; otherwise use of capitalization in this file would have been
consistent already. Can we defer this to the committer's opinion.
This is consistent with other nodetyps usage. see FuncExpr for example.
inner_cxt is a merger of all the args/expressions context which is then
checked with aggregate's input collation. So I believe aggfilter collation
is also verified as expected.
Am I missing anything here?
>> I don't think add_foreign_grouping_path() is the right place to change
>> a structure managed by the core and esp when we are half-way adding
>> paths. An FDW should not meddle with an upper RelOptInfo. Since
>> create_foreignscan_plan() picks those up from RelOptInfo and both of
>> those are part of core, we need a place in core to set the
>> RelOptInfo::relids for an upper relation OR we have stop using
>> fs_relids for upper relation foreign scans.
>
>
> Yes, agree that relids must be set by the core rather than a fdw.
> However I don't want to touch the core for this patch and also not
> exactly sure where we can do that. I think, for this patch, we can
> copy relids into grouped_rel in create_grouping_paths() at place
> where we assign fdwroutine, userid etc. from the input relation.
I don't think this is acceptable since it changes the search key for
an upper without going through fetch_upper_rel(). Rest of the code,
which tries to find this upper rel with relids = NULL, will not be
able to find it anymore. I have started a discussion on this topic
[1]. One possible solution discussed there is to use all_baserels for
an upper rel. But that solution looks temporary. The discussion has
not yet concluded.
In case we decide not to have any relids in the upper relation, we
will have to obtain those from the scanrel in the context, which is
not being used right now.
Also there are places in code where we check reloptkind and assume it
to be either JOINREL or BASEREL e.g. deparseLockingClause. Those need
to be examined again since now we handle an upper relation. As
discussed before, we might user bms_num_members() on relids to decide
whether the underlying scan is join or base relation.
Done.
Using scan rel's relids in the context and checking for any foreign var
reference. Added Relids member in foreign_glob_cxt and scanrel in
deparse_expr_cxt to accomplish this. Removed logic of copying relids
from scanrel to grouped_rel. In create_foreignscan_plan(), assigned
all_baserels to fs_relids for upper relations per one of the suggestion
mentioned in the said email discussion.
Ok.
In this function
+ /* Must force parens for other expressions */
Please explain why we need parenthesis.
I have observed that in deparse.c we do add parenthesis around
expressions, unlike to PRETTY_PAREN handling in ruleutils.c.
Thus we need parenthesis here too. However changed this comments
per other comment in the file.
Since this patch adds showtype argument which is present in
get_const_expr(), it's clear that we haven't really kept those two
functions in sync, even though the comment says so. It's kind of ugly
to see those hardcoded values. But you have a point. Let's see what
committer says.
Sure, no issues.
>
> ExecInitForeignScan() while determining scan tuple type from passed
> fdw_scan_tlist, has target list containing Vars with varno set to
> INDEX_VAR. Due to which while deparsing we go through
> resolve_special_varno() and get_special_variable() function which
> forces parenthesis around the expression.
> I can't think of any easy and quick solution here. So keeping this
> as is. Input will be welcome or this can be done separately later.
>
I guess, you can decide whether or not to add paranthesis by looking
at the corresponding namespace in the deparse_context. If the
planstate there is ForeignScanState, you may skip the paranthesis if
istoplevel is true. Can you please check if this works?
I have tried it. Attached separate patch for it.
However I have noticed that istoplevel is always false (at-least for the
testcase we have, I get it false always). And I also think that taking
this decision only on PlanState is enough. Does that in attached patch.
To fix this, I have passed deparse_namespace to the private argument and
accessed it in get_special_variable(). Changes looks very simple. Please
have a look and let me know your views.
This issue is not introduced by the changes done for the aggregate push
down, it only got exposed as we now ship expressions in the target list.
Thus I think it will be good if we make these changes separately as new
patch, if required.
> To me it is better to have expression in the GROUP BY clause as it is
> more readable and easy to understand. Also it is in sync with deparsing
> logic in ruleutils.c.
Not exactly. get_rule_sortgroupclause() prints only the target list
positions instead of actual expression in GROUP BY or ORDER BY clause
if requested so. We could opt to do the same in all cases. But I think
it's better to add the whole expression in the remote query for
readability.
Right.
>> case
>> you do that, adjust capitalization accordingly.
>
>
> Moved testcases after Join testcases.
> However I have not made any capitalization changes. I see many tests
> like those elsewhere in the file too. Let me know if that's the policy
> to have appropriate capitalization in test-case. I don't want violate
> the policy if any and will prefer to do the changes as per the policy.
This file has used different styles for different sets. But we usually
adopt the style from surrounding testcases. The testcases surrounding
aggregate testcase block are capitalized. It will look odd to have
just the aggregate tests using different style.
I see mixed use of capitalization in the file and use of it is adhoc too.
So not convinced with your argument yet. I don't think there is any policy
for that; otherwise use of capitalization in this file would have been
consistent already. Can we defer this to the committer's opinion.
Please use "foreign server" instead of "remote".
+ /*
+ * If aggregate's input collation is not derived from a foreign
+ * Var, it can't be sent to remote.
+ */
This is consistent with other nodetyps usage. see FuncExpr for example.
I guess, aggregate's input collation should consider aggfilter collation, since
that doesn't affect aggregates collation. But we should make sure that
aggfilter is safe to pushdown from collation's perspective.
+ /* Check aggregate filter */
+ if (!foreign_expr_walker((Node *) agg->aggfilter,
+ glob_cxt, &inner_cxt))
+ return false;
inner_cxt is a merger of all the args/expressions context which is then
checked with aggregate's input collation. So I believe aggfilter collation
is also verified as expected.
Am I missing anything here?
Fixed all other comments including test-cases changes as suggested.
[1] http://www.mail-archive.com/pgsql-hackers@postgresql.org/ msg295435.html --
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Thanks
--
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Attachment
pgsql-hackers by date: