Re: MergeAppend could consider sorting cheapest child path - Mailing list pgsql-hackers
From | Alexander Pyhalov |
---|---|
Subject | Re: MergeAppend could consider sorting cheapest child path |
Date | |
Msg-id | 1b95a0d224dbb6884c64b44b8995da32@postgrespro.ru Whole thread Raw |
In response to | Re: MergeAppend could consider sorting cheapest child path (Andrei Lepikhov <lepihov@gmail.com>) |
List | pgsql-hackers |
Andrei Lepikhov писал(а) 2025-04-29 16:52: > On 4/25/25 17:13, Andrei Lepikhov wrote: >> On 4/25/25 11:16, Alexander Pyhalov wrote: >>> Usually, sorted cheapest_total_path will be cheaper than sorted >>> fractional/startup path at least by startup cost (as after sorting it >>> includes total_cost of input path). But we ignore this case when >>> selecting cheapest_startup and cheapest_fractional subpaths. As >>> result selected cheapest_startup and cheapest_fractional can be not >>> cheapest for startup or selecting a fraction of rows. >> I don't know what you mean by that. The cheapest_total_path is >> considered when we chose optimal cheapest_total path. The same works >> for the fractional path - get_cheapest_fractional_path gives us the >> most optimal fractional path and probes cheapest_total_path too. >> As above, not sure about min-startup case for now. I can imagine >> MergeAppend over sophisticated subquery: non-sorted includes highly >> parameterised JOINs and the alternative (with pathkeys) includes >> HashJoin, drastically increasing startup cost. It is only a theory, of >> course. So, lets discover how min-startup works. > After a second thought I have caught your idea. I agree that for a > fractional path it have no sense to choose any other path except a > cheapest total one. > There are the modified patch in the attachment. > > Also, to be more objective, I propose to use examples in argumentation > - something like in attached test2.sql script. Hi. I've looked through new patch and found minor inconsistencies in get_cheapest_path_for_pathkeys_ext() and get_cheapest_fractional_path_for_pathkeys_ext(). In get_cheapest_fractional_path_for_pathkeys_ext() we check that base_path is not NULL path = get_cheapest_fractional_path_for_pathkeys(rel->pathlist, pathkeys, required_outer, fraction); base_path = rel->cheapest_total_path; /* Stop here if the path doesn't satisfy necessary conditions */ if (!base_path || !bms_is_subset(PATH_REQ_OUTER(base_path), required_outer)) return path; But it seems, base_path can't be NULL (as add_paths_to_append_rel() is called after set_rel_pathlist() for childrels). However, path can. Can we do these two functions get_cheapest_path_for_pathkeys_ext() and get_cheapest_fractional_path_for_pathkeys_ext() more similar? Also we check base_path for required_outer and require_parallel_safe, but if cheapest path for pathkeys is NULL, these checks are not performed. Luckily, they seen to be no-op anyway due to cheapest_total->param_info == NULL and function arguments being NULL (required_outer) and false (require_parallel_safe). Should we do something about this? Don't know, perhaps, remove these misleading arguments? Now, if we return cheapest_total_path from get_cheapest_fractional_path_for_pathkeys_ext() if cheapest paths for pathkeys don't exist, do the following lines /* * If we found no path with matching pathkeys, use the * cheapest total path instead. * * XXX We might consider partially sorted paths too (with an * incremental sort on top). But we'd have to build all the * incremental paths, do the costing etc. */ if (!cheapest_fractional) cheapest_fractional = cheapest_total; become no-op? And we do return non-null path from get_cheapest_fractional_path_for_pathkeys_ext(), as it seems we return either cheapest_total_path or cheapest fractional path from get_cheapest_fractional_path_for_pathkeys_ext(). -- Best regards, Alexander Pyhalov, Postgres Professional
pgsql-hackers by date: