Re: Problems with plan estimates in postgres_fdw - Mailing list pgsql-hackers
From | Antonin Houska |
---|---|
Subject | Re: Problems with plan estimates in postgres_fdw |
Date | |
Msg-id | 10994.1549559088@localhost Whole thread Raw |
In response to | Re: Problems with plan estimates in postgres_fdw (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>) |
Responses |
Re: Problems with plan estimates in postgres_fdw
|
List | pgsql-hackers |
Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: > (2018/12/28 15:50), Etsuro Fujita wrote: > > Attached is a new version of the > > patch. > > Here is an updated version of the patch set. Changes are: I've spent some time reviewing the patches. First, I wonder if the information on LIMIT needs to be passed to the FDW. Cannot the FDW routines use root->tuple_fraction? I think (although have not verified experimentally) that the problem reported in [1] is not limited to the LIMIT clause. I think the same problem can happen if client uses cursor and sets cursor_tuple_fraction very low. Since the remote node is not aware of this setting when running EXPLAIN, the low fraction can also make postgres_fdw think that retrieval of the whole set is cheaper than it will appear to be during the actual execution. Some comments on coding: 0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patch ----------------------------------------------------------------- * I'm not sure if UPPERREL_ORDERD should to deal with LIMIT at all. Note that grouping_planner() does not call create_limit_path() until it's done with create_ordered_paths(), and create_ordered_paths() is where the FDW is requested to add its paths to UPPERREL_ORDERED relation. * add_foreign_ordered_paths() Exprssion root->upper_targets[UPPERREL_FINAL] is used to access the target. I think create_ordered_paths() should actually set the target so that postgresGetForeignUpperPaths() can simply find it in output_rel->reltarget. This is how postgres_fdw already retrieves the target for grouped paths. Small patch is attached that makes this possible. * regression tests: I think test(s) should be added for queries that have ORDER BY clause but do not have GROUP BY (and also no LIMIT / OFFSET) clause. I haven't noticed such tests. 0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-v3.patch --------------------------------------------------------------- * add_foreign_final_paths() Similar problem I reported for add_foreign_ordered_paths() above. * adjust_limit_rows_costs() Doesn't this function address more generic problem than the use of postgres_fdw? If so, then it should be submitted as a separate patch. BTW, the function is only used in pathnode.c, so it should be declared static. Both patches: ------------ One thing that makes me confused when I try to understand details is that the new functions add_foreign_ordered_paths() and add_foreign_final_paths() both pass "input_rel" for the "foreignrel" argument of estimate_path_cost_size(): estimate_path_cost_size(root, input_rel, ...) whereas the existing add_foreign_grouping_paths() passes "grouped_rel": estimate_path_cost_size(root, grouped_rel, ...) Can you please make this consistent? If you do, you'll probably need to reconsider your changes to estimate_path_cost_size(). And maybe related problem: why should FDW care about the effect of apply_scanjoin_target_to_paths() like you claim to do here? /* * If this is UPPERREL_ORDERED and/or UPPERREL_FINAL steps on the * final scan/join relation, the costs obtained from the cache * wouldn't yet contain the eval cost for the final scan/join target, * which would have been updated by apply_scanjoin_target_to_paths(); * add the eval cost now. */ if (fpextra && !IS_UPPER_REL(foreignrel)) { /* The costs should have been obtained from the cache. */ Assert(fpinfo->rel_startup_cost >= 0 && fpinfo->rel_total_cost >= 0); startup_cost += foreignrel->reltarget->cost.startup; run_cost += foreignrel->reltarget->cost.per_tuple * rows; } I think it should not, whether "foreignrel" means "input_rel" or "output_rel" from the perspective of postgresGetForeignUpperPaths(). [1] https://www.postgresql.org/message-id/87pnz1aby9.fsf@news-spur.riddles.org.uk -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26, A-2700 Wiener Neustadt Web: https://www.cybertec-postgresql.com diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index b2239728cf..17e983f11e 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -2035,6 +2035,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update, * of the corresponding upperrels might not be needed for this query. */ root->upper_targets[UPPERREL_FINAL] = final_target; + root->upper_targets[UPPERREL_ORDERED] = final_target; root->upper_targets[UPPERREL_WINDOW] = sort_input_target; root->upper_targets[UPPERREL_GROUP_AGG] = grouping_target; @@ -2118,6 +2119,12 @@ grouping_planner(PlannerInfo *root, bool inheritance_update, final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL); /* + * Set reltarget so that it's consistent with the paths. Also it's more + * convenient for FDW to find the target here. + */ + final_rel->reltarget = final_target; + + /* * If the input rel is marked consider_parallel and there's nothing that's * not parallel-safe in the LIMIT clause, then the final_rel can be marked * consider_parallel as well. Note that if the query has rowMarks or is @@ -4865,6 +4872,12 @@ create_ordered_paths(PlannerInfo *root, ordered_rel = fetch_upper_rel(root, UPPERREL_ORDERED, NULL); /* + * Set reltarget so that it's consistent with the paths. Also it's more + * convenient for FDW to find the target here. + */ + ordered_rel->reltarget = target; + + /* * If the input relation is not parallel-safe, then the ordered relation * can't be parallel-safe, either. Otherwise, it's parallel-safe if the * target list is parallel-safe.
pgsql-hackers by date: