Thread: postgres_fdw: estimate_path_cost_size fails to re-use cached costs
Hi, I noticed yet another thing while updating the patch for pushing down ORDER BY LIMIT. Let me explain. When costing foreign paths on the basis of local statistics, we calculate/cache the costs of an unsorted foreign path, and re-use them to estimate the costs of presorted foreign paths, as shown below. BUT: we fail to re-use them for some typical queries, such as "select * from ft1 order by a", due to fpinfo->rel_startup_cost=0, leading to doing the same cost calculation repeatedly. /* * We will come here again and again with different set of pathkeys * that caller wants to cost. We don't need to calculate the cost of * bare scan each time. Instead, use the costs if we have cached them * already. */ if (fpinfo->rel_startup_cost > 0 && fpinfo->rel_total_cost > 0) { startup_cost = fpinfo->rel_startup_cost; run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost; } I think we should use "fpinfo->rel_startup_cost >= 0" here, not "fpinfo->rel_startup_cost > 0". Also, it would be possible that the total cost calculated is zero in corner cases (eg, seq_page_cost=0 and cpu_tuple_cost=0 for the example), so I think we should change the total cost part as well. Attached is a patch for that. Best regards, Etsuro Fujita
Attachment
(2019/01/25 20:33), Etsuro Fujita wrote: > I noticed yet another thing while updating the patch for pushing down > ORDER BY LIMIT. Let me explain. When costing foreign paths on the > basis of local statistics, we calculate/cache the costs of an unsorted > foreign path, and re-use them to estimate the costs of presorted foreign > paths, as shown below. BUT: we fail to re-use them for some typical > queries, such as "select * from ft1 order by a", due to > fpinfo->rel_startup_cost=0, leading to doing the same cost calculation > repeatedly. > > /* > * We will come here again and again with different set of pathkeys > * that caller wants to cost. We don't need to calculate the cost of > * bare scan each time. Instead, use the costs if we have cached > them > * already. > */ > if (fpinfo->rel_startup_cost> 0&& fpinfo->rel_total_cost> 0) > { > startup_cost = fpinfo->rel_startup_cost; > run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost; > } > > I think we should use "fpinfo->rel_startup_cost>= 0" here, not > "fpinfo->rel_startup_cost> 0". Also, it would be possible that the > total cost calculated is zero in corner cases (eg, seq_page_cost=0 and > cpu_tuple_cost=0 for the example), so I think we should change the total > cost part as well. Attached is a patch for that. I added the commit message. Updated patch attached. If no objections, I'll apply that to HEAD only as there are no reports of actual trouble from this, as far as I know. Best regards, Etsuro Fujita
Attachment
(2019/01/28 19:37), Etsuro Fujita wrote: > (2019/01/25 20:33), Etsuro Fujita wrote: >> I noticed yet another thing while updating the patch for pushing down >> ORDER BY LIMIT. Let me explain. When costing foreign paths on the >> basis of local statistics, we calculate/cache the costs of an unsorted >> foreign path, and re-use them to estimate the costs of presorted foreign >> paths, as shown below. BUT: we fail to re-use them for some typical >> queries, such as "select * from ft1 order by a", due to >> fpinfo->rel_startup_cost=0, leading to doing the same cost calculation >> repeatedly. >> >> /* >> * We will come here again and again with different set of pathkeys >> * that caller wants to cost. We don't need to calculate the cost of >> * bare scan each time. Instead, use the costs if we have cached >> them >> * already. >> */ >> if (fpinfo->rel_startup_cost> 0&& fpinfo->rel_total_cost> 0) >> { >> startup_cost = fpinfo->rel_startup_cost; >> run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost; >> } >> >> I think we should use "fpinfo->rel_startup_cost>= 0" here, not >> "fpinfo->rel_startup_cost> 0". Also, it would be possible that the >> total cost calculated is zero in corner cases (eg, seq_page_cost=0 and >> cpu_tuple_cost=0 for the example), so I think we should change the total >> cost part as well. Attached is a patch for that. > > I added the commit message. Updated patch attached. If no objections, > I'll apply that to HEAD only as there are no reports of actual trouble > from this, as far as I know. Pushed. Best regards, Etsuro Fujita