Thread: postgres_fdw: oddity in costing aggregate pushdown paths

postgres_fdw: oddity in costing aggregate pushdown paths

From
Etsuro Fujita
Date:
Hi,

While working on [1], I noticed that since we don't set the selectivity
and cost of the local_conds (i.e., fpinfo->local_conds_sel and
fpinfo->local_conds_cost) properly in add_foreign_grouping_paths and
foreign_grouping_ok, estimate_path_cost_size produces considerably
underestimated results when use_remote_estimate.  I think this would
cause problems in later planning steps.  So, why not add something like
this to add_foreign_grouping_paths, as done in other functions such as
postgresGetForeignJopinPaths?

--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -5496,6 +5496,19 @@ add_foreign_grouping_paths(PlannerInfo *root,
RelOptInfo\
 *input_rel,
    if (!foreign_grouping_ok(root, grouped_rel, extra->havingQual))
        return;

+   /*
+    * Compute the selectivity and cost of the local_conds, so we don't have
+    * to do it over again for each path.  The best we can do for these
+    * conditions is to estimate selectivity on the basis of local
statistics.
+    */
+   fpinfo->local_conds_sel = clauselist_selectivity(root,
+                                                    fpinfo->local_conds,
+                                                    0,
+                                                    JOIN_INNER,
+                                                    NULL);
+
+   cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
+
    /* Estimate the cost of push down */
    estimate_path_cost_size(root, grouped_rel, NIL, NIL, &rows,
                            &width, &startup_cost, &total_cost);

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/5BBC4140.50403%40lab.ntt.co.jp


Re: postgres_fdw: oddity in costing aggregate pushdown paths

From
Etsuro Fujita
Date:
(2018/11/27 21:55), Etsuro Fujita wrote:
> While working on [1], I noticed that since we don't set the selectivity
> and cost of the local_conds (i.e., fpinfo->local_conds_sel and
> fpinfo->local_conds_cost) properly in add_foreign_grouping_paths and
> foreign_grouping_ok, estimate_path_cost_size produces considerably
> underestimated results when use_remote_estimate.  I think this would
> cause problems in later planning steps.  So, why not add something like
> this to add_foreign_grouping_paths, as done in other functions such as
> postgresGetForeignJopinPaths?
> 
> --- a/contrib/postgres_fdw/postgres_fdw.c
> +++ b/contrib/postgres_fdw/postgres_fdw.c
> @@ -5496,6 +5496,19 @@ add_foreign_grouping_paths(PlannerInfo *root,
> RelOptInfo\
>   *input_rel,
>      if (!foreign_grouping_ok(root, grouped_rel, extra->havingQual))
>          return;
> 
> +   /*
> +    * Compute the selectivity and cost of the local_conds, so we don't have
> +    * to do it over again for each path.  The best we can do for these
> +    * conditions is to estimate selectivity on the basis of local
> statistics.
> +    */
> +   fpinfo->local_conds_sel = clauselist_selectivity(root,
> +                                                    fpinfo->local_conds,
> +                                                    0,
> +                                                    JOIN_INNER,
> +                                                    NULL);
> +
> +   cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
> +
>      /* Estimate the cost of push down */
>      estimate_path_cost_size(root, grouped_rel, NIL, NIL,&rows,
>                              &width,&startup_cost,&total_cost);

Here is a patch for that.

BTW another thing I noticed is this comment on costing aggregate
pushdown paths using local statistics in estimate_path_cost_size:

             * Also, core does not care about costing HAVING expressions and
             * adding that to the costs.  So similarly, here too we are not
             * considering remote and local conditions for costing.

I think this was true when aggregate pushdown went in, but isn't anymore
because of commit 7b6c07547190f056b0464098bb5a2247129d7aa2.  So we
should update estimate_path_cost_size so that it accounts for the
selectivity and cost of the HAVING expressions as well?  I think this
comment needs to be updated at least, though.

Best regards,
Etsuro Fujita

Attachment

Re: postgres_fdw: oddity in costing aggregate pushdown paths

From
Etsuro Fujita
Date:
(2018/11/28 13:38), Etsuro Fujita wrote:
> BTW another thing I noticed is this comment on costing aggregate
> pushdown paths using local statistics in estimate_path_cost_size:
> 
>               * Also, core does not care about costing HAVING expressions and
>               * adding that to the costs.  So similarly, here too we are not
>               * considering remote and local conditions for costing.
> 
> I think this was true when aggregate pushdown went in, but isn't anymore
> because of commit 7b6c07547190f056b0464098bb5a2247129d7aa2.  So we
> should update estimate_path_cost_size so that it accounts for the
> selectivity and cost of the HAVING expressions as well?

There seems to be no objections, I updated the patch as such.  Attached
is an updated version of the patch.

Best regards,
Etsuro Fujita

Attachment

Re: postgres_fdw: oddity in costing aggregate pushdown paths

From
Etsuro Fujita
Date:
(2018/11/30 18:51), Etsuro Fujita wrote:
> (2018/11/28 13:38), Etsuro Fujita wrote:
>> BTW another thing I noticed is this comment on costing aggregate
>> pushdown paths using local statistics in estimate_path_cost_size:
>>
>>                * Also, core does not care about costing HAVING expressions and
>>                * adding that to the costs.  So similarly, here too we are not
>>                * considering remote and local conditions for costing.
>>
>> I think this was true when aggregate pushdown went in, but isn't anymore
>> because of commit 7b6c07547190f056b0464098bb5a2247129d7aa2.  So we
>> should update estimate_path_cost_size so that it accounts for the
>> selectivity and cost of the HAVING expressions as well?
> 
> There seems to be no objections, I updated the patch as such.  Attached
> is an updated version of the patch.

I revised some comments a bit and added the commit message.  Attached is
an updated patch.  If there are no objections, I'll apply this to HEAD only.

Best regards,
Etsuro Fujita

Attachment

Re: postgres_fdw: oddity in costing aggregate pushdown paths

From
Etsuro Fujita
Date:
(2018/12/03 20:20), Etsuro Fujita wrote:
> (2018/11/30 18:51), Etsuro Fujita wrote:
>> (2018/11/28 13:38), Etsuro Fujita wrote:
>>> BTW another thing I noticed is this comment on costing aggregate
>>> pushdown paths using local statistics in estimate_path_cost_size:
>>>
>>>                 * Also, core does not care about costing HAVING expressions and
>>>                 * adding that to the costs.  So similarly, here too we are not
>>>                 * considering remote and local conditions for costing.
>>>
>>> I think this was true when aggregate pushdown went in, but isn't anymore
>>> because of commit 7b6c07547190f056b0464098bb5a2247129d7aa2.  So we
>>> should update estimate_path_cost_size so that it accounts for the
>>> selectivity and cost of the HAVING expressions as well?
>>
>> There seems to be no objections, I updated the patch as such.  Attached
>> is an updated version of the patch.
> 
> I revised some comments a bit and added the commit message.  Attached is
> an updated patch.  If there are no objections, I'll apply this to HEAD only.

Done after fixing typos in the commit message.

Best regards,
Etsuro Fujita



Re: postgres_fdw: oddity in costing aggregate pushdown paths

From
Etsuro Fujita
Date:
(2018/12/04 17:24), Etsuro Fujita wrote:
> (2018/12/03 20:20), Etsuro Fujita wrote:
>> (2018/11/30 18:51), Etsuro Fujita wrote:
>>> (2018/11/28 13:38), Etsuro Fujita wrote:
>>>> BTW another thing I noticed is this comment on costing aggregate
>>>> pushdown paths using local statistics in estimate_path_cost_size:
>>>>
>>>>                  * Also, core does not care about costing HAVING expressions and
>>>>                  * adding that to the costs.  So similarly, here too we are not
>>>>                  * considering remote and local conditions for costing.
>>>>
>>>> I think this was true when aggregate pushdown went in, but isn't anymore
>>>> because of commit 7b6c07547190f056b0464098bb5a2247129d7aa2.  So we
>>>> should update estimate_path_cost_size so that it accounts for the
>>>> selectivity and cost of the HAVING expressions as well?
>>>
>>> There seems to be no objections, I updated the patch as such.  Attached
>>> is an updated version of the patch.
>>
>> I revised some comments a bit and added the commit message.  Attached is
>> an updated patch.  If there are no objections, I'll apply this to HEAD only.
> 
> Done after fixing typos in the commit message.

I noticed that I forgot to modify the cost for evaluating the PathTarget
for each output row accordingly to this change :(.  Attached is a patch
for that.

Best regards,
Etsuro Fujita

Attachment

Re: postgres_fdw: oddity in costing aggregate pushdown paths

From
Etsuro Fujita
Date:
(2018/12/28 17:28), Etsuro Fujita wrote:
> I noticed that I forgot to modify the cost for evaluating the PathTarget
> for each output row accordingly to this change :(.  Attached is a patch
> for that.

On reflection, I noticed these on estimate_path_cost_size, other than that:

1) In the case of a foreign-grouping path, we failed to adjust the
PathTarget eval cost when using the remote estimates, which I think
would be needed because the PathTarget expressions cannot always be
pre-computed as entries of the fdw_scan_tlist for the foreign-grouping path.

2) We also failed to factor in the eval cost for the foreign-scan and
foreign-join cases, with/without the remote estimates; in the scan/join
cases, the PathTarget might contain PHVs, so the cost of evaluating PHVs
should be charged.  Currently, PHVs are evaluated locally, so the cost
of PHV expressions should also be factored in when using the remote
estimates.

Here is a new version of the patch.

Best regards,
Etsuro Fujita

Attachment

Re: postgres_fdw: oddity in costing aggregate pushdown paths

From
Etsuro Fujita
Date:
(2019/01/04 20:33), Etsuro Fujita wrote:
> Here is a new version of the patch.

Attached is an updated version of the patch.

Changes:
* Fixed a stupid bug in the case when use_remote_estimate
* Fixed a typo in a comment I added
* Modified comments I added a little bit further
* Added the commit message

If there are no objections, I'll push that.

Best regards,
Etsuro Fujita

Attachment

Re: postgres_fdw: oddity in costing aggregate pushdown paths

From
Etsuro Fujita
Date:
(2019/01/23 20:35), Etsuro Fujita wrote:
> Attached is an updated version of the patch.
> 
> Changes:
> * Fixed a stupid bug in the case when use_remote_estimate
> * Fixed a typo in a comment I added
> * Modified comments I added a little bit further
> * Added the commit message
> 
> If there are no objections, I'll push that.

Pushed after fixing another typo in a comment.

Best regards,
Etsuro Fujita