Thread: postgres_fdw: another oddity in costing aggregate pushdown paths
As mentioned in the near thread, I think there is another oversight in the cost estimation for aggregate pushdown paths in postgres_fdw, IIUC. When costing an aggregate pushdown path using local statistics, we re-use the estimated costs of implementing the underlying scan/join relation, cached in the relation's PgFdwRelationInfo (ie, rel_startup_cost and rel_total_cost). Since these costs wouldn't yet contain the costs of evaluating the final scan/join target, as tlist replacement by apply_scanjoin_target_to_paths() is performed afterwards. So I think we need to adjust these costs so that the tlist eval costs are included, but ISTM that estimate_path_cost_size() forgot to do so. Attached is a patch for fixing this issue. Best regards, Etsuro Fujita
Attachment
Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: > As mentioned in the near thread, I think there is another oversight in > the cost estimation for aggregate pushdown paths in postgres_fdw, IIUC. > When costing an aggregate pushdown path using local statistics, we > re-use the estimated costs of implementing the underlying scan/join > relation, cached in the relation's PgFdwRelationInfo (ie, > rel_startup_cost and rel_total_cost). Since these costs wouldn't yet > contain the costs of evaluating the final scan/join target, as tlist > replacement by apply_scanjoin_target_to_paths() is performed afterwards. > So I think we need to adjust these costs so that the tlist eval costs > are included, but ISTM that estimate_path_cost_size() forgot to do so. > Attached is a patch for fixing this issue. I think the following comment in apply_scanjoin_target_to_paths() should mention that FDWs rely on the new value of reltarget. /* * Update the reltarget. This may not be strictly necessary in all cases, * but it is at least necessary when create_append_path() gets called * below directly or indirectly, since that function uses the reltarget as * the pathtarget for the resulting path. It seems like a good idea to do * it unconditionally. */ rel->reltarget = llast_node(PathTarget, scanjoin_targets); -- Antonin Houska https://www.cybertec-postgresql.com
(2019/02/22 23:10), Antonin Houska wrote: > Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> wrote: >> As mentioned in the near thread, I think there is another oversight in >> the cost estimation for aggregate pushdown paths in postgres_fdw, IIUC. >> When costing an aggregate pushdown path using local statistics, we >> re-use the estimated costs of implementing the underlying scan/join >> relation, cached in the relation's PgFdwRelationInfo (ie, >> rel_startup_cost and rel_total_cost). Since these costs wouldn't yet >> contain the costs of evaluating the final scan/join target, as tlist >> replacement by apply_scanjoin_target_to_paths() is performed afterwards. >> So I think we need to adjust these costs so that the tlist eval costs >> are included, but ISTM that estimate_path_cost_size() forgot to do so. >> Attached is a patch for fixing this issue. > > I think the following comment in apply_scanjoin_target_to_paths() should > mention that FDWs rely on the new value of reltarget. > > /* > * Update the reltarget. This may not be strictly necessary in all cases, > * but it is at least necessary when create_append_path() gets called > * below directly or indirectly, since that function uses the reltarget as > * the pathtarget for the resulting path. It seems like a good idea to do > * it unconditionally. > */ > rel->reltarget = llast_node(PathTarget, scanjoin_targets); Agreed. How about mentioning that like the attached? In addition, I added another assertion to estimate_path_cost_size() in that patch. Thanks for the review! Best regards, Etsuro Fujita
Attachment
(2019/02/25 19:59), Etsuro Fujita wrote: > (2019/02/22 23:10), Antonin Houska wrote: >> Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp> wrote: >>> As mentioned in the near thread, I think there is another oversight in >>> the cost estimation for aggregate pushdown paths in postgres_fdw, IIUC. >>> When costing an aggregate pushdown path using local statistics, we >>> re-use the estimated costs of implementing the underlying scan/join >>> relation, cached in the relation's PgFdwRelationInfo (ie, >>> rel_startup_cost and rel_total_cost). Since these costs wouldn't yet >>> contain the costs of evaluating the final scan/join target, as tlist >>> replacement by apply_scanjoin_target_to_paths() is performed afterwards. >>> So I think we need to adjust these costs so that the tlist eval costs >>> are included, but ISTM that estimate_path_cost_size() forgot to do so. >>> Attached is a patch for fixing this issue. >> >> I think the following comment in apply_scanjoin_target_to_paths() should >> mention that FDWs rely on the new value of reltarget. >> >> /* >> * Update the reltarget. This may not be strictly necessary in all cases, >> * but it is at least necessary when create_append_path() gets called >> * below directly or indirectly, since that function uses the reltarget as >> * the pathtarget for the resulting path. It seems like a good idea to do >> * it unconditionally. >> */ >> rel->reltarget = llast_node(PathTarget, scanjoin_targets); > > Agreed. How about mentioning that like the attached? In addition, I > added another assertion to estimate_path_cost_size() in that patch. This doesn't get applied cleanly after commit 1d33858406. Here is a rebased version of the patch. I also modified the comments a little bit. If there are no objections from Antonin or anyone else, I'll commit the patch. Best regards, Etsuro Fujita
Attachment
On Wed, May 8, 2019 at 12:45 PM Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: > This doesn't get applied cleanly after commit 1d33858406. Here is a > rebased version of the patch. I also modified the comments a little > bit. If there are no objections from Antonin or anyone else, I'll > commit the patch. Pushed. Thanks for reviewing, Antonin! Best regards, Etsuro Fujita