Re: nested loop semijoin estimates - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: nested loop semijoin estimates |
Date | |
Msg-id | 556E0625.30802@2ndquadrant.com Whole thread Raw |
In response to | Re: nested loop semijoin estimates (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: nested loop semijoin estimates
|
List | pgsql-hackers |
On 06/02/15 17:50, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> On 06/02/15 16:37, Tom Lane wrote: >>> It's possible that the change was due to random variation in ANALYZE >>> statistics, in which case it was just luck. > >> I don't think so. I simply loaded the data, ran ANALYZE, and then simply >> started either master or patched master. There should be no difference >> in statistics, I believe. Also, the plans contain pretty much the same >> row counts, but the costs differ. > >> For example look at the 'cs_ui' CTE, right at the beginning of the >> analyze logs. The row counts are exactly the same, but the costs are >> different. And it's not using semijoins or not nested loops ... > > The cost estimates in that CTE all look exactly the same to me, and the > actual runtime's not all that different either. The key runtime > difference in the first query seems to be in the first couple of joins > to the cs_ui CTE: D'oh! I was looking at the timings, not costs. EINOTENOUGHCAFFEINE ... > Those are the exact same plans, and same cost estimates, but for > some reason the master run is 2X slower. The only explanation I can > think of is disk caching effects, or maybe hint-bit setting during > the first run. It's certainly not the planner that deserves either > credit or blame for that. Yes, seems like that. > The part of this plan that's actually different is a different join > order sequence for a lot of follow-on joins that are expected to get > single rows from their other table. I think that's basically > irrelevant really, but again, this patch should not have changed > anything there, since those were plain joins not semijoins. Yes, it's interesting. I've repeated the tests on 1GB TPC-DS dataset, and this time I got slightly different set of plan changes. I don't see the change in join order, for example. Attached is the whole explain output (all 63 queries), explain.master.1 and explain.patched.1. The are two kinds of plan changes: 1) minor cost changes (these queries contain semijoins, so it's expected), but the plan is exactly the same 2) queries with nested loop replaced by hash join, i.e. plan on master looks like this: -> Nested Loop -> Nested Loop -> Nested Loop -> Seq Scan -> Index Scan -> Index Scan -> Materialize -> Seq Scan while with the patches: -> Hash Join -> Hash Join -> Nested Loop -> Seq Scan -> Index Scan -> Hash -> Seq Scan -> Hash -> Seq Scan This is slightly surprising, because those queries do not contain any semijoins. Detailed explain analyze for the three queries with different plans is attached. At the end, I repeated the explain on master, and got exactly the same results as at the beginning (same as explain.master.1) - so there's no variability in statistics at all. Another interesting thing is that these plan changes only happen after the second patch, i.e. with nestloop-semijoin-costing-fix everything is exactly as on master, but with consider-startup-costs-for-semijoins the plans change. I suspected the problem is somewhere in compare_path_costs_fuzzily() but that doesn't seem to be the case - I've however noticed that the number of calls to that function changes with the patch. For example when running the query in tpcds-3.log, compare_fuzzily os called ~1050x, but with the patch attached it's called only ~730x. I've checked backtraces for a few of the paths that are not considered with the patch, and all of them look like this (detailed backtrace attached): #3 0x0000000000656cd5 in compare_path_costs_fuzzily at pathnode.c:154 #4 0x0000000000657146 in add_path at pathnode.c:438 #5 0x0000000000631bfb in try_nestloop_path at joinpath.c:347 #6 0x0000000000632931 in match_unsorted_outer at joinpath.c:900 #7 add_paths_to_joinrel at joinpath.c:225 That particular part of try_nestloop_path looks like this: if (add_path_precheck(joinrel, workspace.startup_cost, workspace.total_cost, pathkeys, required_outer)) { add_path(joinrel, (Path *) create_nestloop_path(...); } I wouldn't be surprised if the problem was in add_path_precheck. Actually, I'm quite sure that's where the problem is - particularly in how required_outer was replaced with (!consider_startup): consider_startup = required_outer ? consider_param_startup : consider_startup; If both _startup flags are false, the original required_outer value is lost, and that changes the if() condition which was changed from if (startup_cost >= old_path->startup_cost || required_outer) to if (startup_cost >= old_path->startup_cost || !consider_startup) So, if required_outer=false and both p->_startup=false, we get consider_startup=false irrespectedly of the required_outer value, so (!consider_startupe) != required_outer so that the conditions are not equivalent. And indeed, by reverting the if condition to the previous form, we get the same plans as on master. I don't know whether this is correct behavior or not, but in all three cases I've observed on TPC-DS, the new plans performed better Q1 Q2 Q3 -------------------------------------------------------------------- master 518.330 ms 466.539 ms 560.585 ms patched 405.310 ms 340.506 ms 122.072 ms This time I've been careful not to produce bogus numbers (also, the explain analyze timing I mentioned in previous post were significantly influenced by timing overhead). These results are quite stable. A few more comments on the patch: 1) Shouldn't the CONSIDER_PATH_STARTUP_COST(p) be defined rather like this? I mean, if it's a parametric query, use any of the flags? And same in add_path_precheck()? #define CONSIDER_PATH_STARTUP_COST(p) \ ((p)->param_info == NULL ? (p)->parent->consider_startup : ((p)->parent->consider_param_startup || (p)->parent->consider_startup)) 2) Is it really possible to get different values for the startup flags on path1 and path2 in compare_path_costs_fuzzily()? If not (if I get it right, path1->parent == path2->parent anyway), then maybe passing that as a function parameter (just like before) would be better. The macro probably won't be as nice, but I find it easier to understand. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
pgsql-hackers by date: