Thread: [sqlsmith] Failed to generate plan on lateral subqueries
Hi, I've added new grammar rules to sqlsmith and improved some older ones. This was rewarded with a return of "failed to generate plan" errors. The failing queries all contain a lateral subquery. The shortest of the failing queries are below. They were run against the regression db of master as of db07236. regards, Andreas smith=# select msg, query from error where (firstline(msg) ~~ 'ERROR: failed to build any%' or firstline(msg) ~~ 'ERROR: could not devise a query plan%') and t > now() - interval '1 day' order by length(query) asc limit 3; ERROR: failed to build any 8-way joins select ref_96.foreign_table_schema as c0, sample_87.is_supported as c1 from information_schema.sql_packages as sample_87 tablesample system (0.2) right join information_schema._pg_foreign_tablesas ref_96 on (sample_87.feature_id = ref_96.foreign_table_catalog ), lateral (select sample_87.is_verified_by as c0, ref_97.indexed_col as c1, coalesce(sample_87.feature_id, ref_96.foreign_server_name)as c2, 4 as c3 from public.comment_test as ref_97 where ref_97.id ~>~ ref_97.indexed_col fetch first 73 rows only) as subq_33 where ref_96.foreign_table_name ~~ subq_33.c1 ERROR: could not devise a query plan for the given query select subq_43.c0 as c0 from (select ref_181.installed as c0 from pg_catalog.pg_available_extension_versions as ref_181, lateral (select ref_181.name as c0, ref_181.installed as c1 from pg_catalog.pg_conversion as ref_182 where ref_182.conname ~~* ref_181.version fetch first 98rows only) as subq_42 where (subq_42.c0 is not NULL) or (subq_42.c1 is NULL)) as subq_43 right join pg_catalog.pg_languageas sample_177 tablesample system (2.8) on (subq_43.c0 = sample_177.lanispl ) where sample_177.lanowner < sample_177.lanvalidator ERROR: failed to build any 5-way joins select ref_239.id2 as c0, 40 as c1, ref_239.id2 as c2, ref_238.aa as c3 from public.tt5 as sample_289 tablesample system (8.1) inner join information_schema.element_types as ref_237 on (sample_289.x = ref_237.character_maximum_length ) left join public.b as ref_238 on (ref_237.character_maximum_length= ref_238.aa ) left join public.num_exp_mul as ref_239 on (ref_237.numeric_precision_radix= ref_239.id1 ), lateral (select sample_290.b as c0, sample_289.y as c1, ref_239.id2 as c2 from public.rtest_t8 as sample_290 tablesample bernoulli (4.6) where (sample_290.b > ref_238.bb) and (sample_289.y > ref_239.expected) fetch first 91 rows only) as subq_64 where (subq_64.c1 > sample_289.y) and (sample_289.y = ref_239.expected) fetch first 133 rows only
On 2015/12/07 2:52, Andreas Seltenreich wrote: > Hi, > > I've added new grammar rules to sqlsmith and improved some older ones. > This was rewarded with a return of "failed to generate plan" errors. > The failing queries all contain a lateral subquery. The shortest of the > failing queries are below. They were run against the regression db of > master as of db07236. > > smith=# select msg, query from error where > (firstline(msg) ~~ 'ERROR: failed to build any%' > or firstline(msg) ~~ 'ERROR: could not devise a query plan%') > and t > now() - interval '1 day' order by length(query) asc limit 3; > > ERROR: failed to build any 8-way joins > select > ref_96.foreign_table_schema as c0, > sample_87.is_supported as c1 > from > information_schema.sql_packages as sample_87 tablesample system (0.2) > right join information_schema._pg_foreign_tables as ref_96 > on (sample_87.feature_id = ref_96.foreign_table_catalog ), > lateral (select > sample_87.is_verified_by as c0, > ref_97.indexed_col as c1, > coalesce(sample_87.feature_id, ref_96.foreign_server_name) as c2, > 4 as c3 > from > public.comment_test as ref_97 > where ref_97.id ~>~ ref_97.indexed_col > fetch first 73 rows only) as subq_33 > where ref_96.foreign_table_name ~~ subq_33.c1 > > ERROR: could not devise a query plan for the given query > select > subq_43.c0 as c0 > from > (select > ref_181.installed as c0 > from > pg_catalog.pg_available_extension_versions as ref_181, > lateral (select > ref_181.name as c0, > ref_181.installed as c1 > from > pg_catalog.pg_conversion as ref_182 > where ref_182.conname ~~* ref_181.version > fetch first 98 rows only) as subq_42 > where (subq_42.c0 is not NULL) > or (subq_42.c1 is NULL)) as subq_43 > right join pg_catalog.pg_language as sample_177 tablesample system (2.8) > on (subq_43.c0 = sample_177.lanispl ) > where sample_177.lanowner < sample_177.lanvalidator > > ERROR: failed to build any 5-way joins > select > ref_239.id2 as c0, > 40 as c1, > ref_239.id2 as c2, > ref_238.aa as c3 > from > public.tt5 as sample_289 tablesample system (8.1) > inner join information_schema.element_types as ref_237 > on (sample_289.x = ref_237.character_maximum_length ) > left join public.b as ref_238 > on (ref_237.character_maximum_length = ref_238.aa ) > left join public.num_exp_mul as ref_239 > on (ref_237.numeric_precision_radix = ref_239.id1 ), > lateral (select > sample_290.b as c0, > sample_289.y as c1, > ref_239.id2 as c2 > from > public.rtest_t8 as sample_290 tablesample bernoulli (4.6) > where (sample_290.b > ref_238.bb) > and (sample_289.y > ref_239.expected) > fetch first 91 rows only) as subq_64 > where (subq_64.c1 > sample_289.y) > and (sample_289.y = ref_239.expected) > fetch first 133 rows only FWIW, I'm seeing the following behaviors: * Removing the limit (fetch first...) in lateral sub-queries makes the errors go away for all above queries. * For the last query producing "failed to build any 5-way joins" error, setting join_collapse_limit to 1, 2 and 4 makes the error go away irrespective of whether the limit is kept or not. Thanks, Amit
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > On 2015/12/07 2:52, Andreas Seltenreich wrote: >> I've added new grammar rules to sqlsmith and improved some older ones. >> This was rewarded with a return of "failed to generate plan" errors. >> The failing queries all contain a lateral subquery. > * Removing the limit (fetch first...) in lateral sub-queries makes the > errors go away for all above queries. Yeah. I've been able to reduce Andreas' first example to select * from text_tbl tt1 left join int8_tbl i8 on i8.q1 = 42, lateral (select i8.q2, tt2.f1 from text_tbl tt2 limit 1 ) as ss where tt1.f1 = ss.f1; ERROR: failed to build any 3-way joins The key features are (1) a subquery with a LATERAL reference to the inner side of a left join, and (2) a degenerate join condition for the left join, ie it only references the inner side. (2) causes the planner to see the left join as a clauseless join, so it prefers to postpone it as long as possible. In this case it will try to join tt1 to ss first, because the WHERE condition is an inner-join clause linking those two, and inner joins are allowed to associate into the lefthand sides of left joins. But now it's stuck: because of the lateral reference to i8, the only way to generate a join between tt1+ss and i8 is for i8 to be on the outside of a nestloop, and that doesn't work because nestloop can only handle LEFT outer joins not RIGHT outer joins. So that's a dead end; but because it thought earlier that tt1 could be joined to ss, it did not generate the tt1+i8 join at all, so it fails to find any way to build the final join. If you remove the LIMIT then the sub-select can be flattened, causing the problem to go away because there's no longer a lateral ordering constraint (there's not actually any need to evaluate i8.q2 while scanning tt2). I think the way to fix this is that join_is_legal() should be taught to notice whether the proposed join would have unresolved lateral references to other relations that it will need to be on the outside of any join to. If join_is_legal were to reject tt1+ss then the has_legal_joinclause tests at the bottom of have_join_order_restriction would fail, so have_join_order_restriction would correctly report that there's a constraint forcing tt1 and i8 to be joined directly despite the lack of a join clause. Andreas' second example is a similar situation, with the addition of a PlaceHolderVar in the sub-select (since it has a non-strict output variable that has to bubble up through an outer join). In this case we fail earlier, because although join_is_legal again lets us try to make a join that can't be useful, the check_hazardous_phv() test that I recently added to joinpath.c recognizes that there's no safe way to make that join, so it rejects all the possible join paths and we end up with a joinrel with no paths, leading to the different error message. I think that fixing join_is_legal() may be enough to take care of this case too, but I need to think about whether there could still be any cases in which check_hazardous_phv() would reject all paths for a joinrel. It might be that that logic is in the wrong place and needs to be folded into join_is_legal(), or that join_is_legal() *also* has to account for this (which would be annoying). I've not traced through the third example in detail, but it looks like it's just a variant of these problems. regards, tom lane
Andreas Seltenreich <seltenreich@gmx.de> writes: > I've added new grammar rules to sqlsmith and improved some older ones. > This was rewarded with a return of "failed to generate plan" errors. I believe I've dealt with these cases now. Thanks for the report! regards, tom lane
Tom Lane writes: > Andreas Seltenreich <seltenreich@gmx.de> writes: >> I've added new grammar rules to sqlsmith and improved some older ones. >> This was rewarded with a return of "failed to generate plan" errors. > > I believe I've dealt with these cases now. Thanks for the report! I no longer see "failed to build any n-way joins" after pulling, but there are still instances of "could not devise a query plan". Samples below. regards, Andreas select ref_1.aa as c0, subq_1.c1 as c1, coalesce(ref_1.class, ref_1.class) as c2, subq_1.c0 as c3 from (select subq_0.c1 as c0, coalesce(sample_0.a, sample_1.i) as c1from public.rtest_t9 as sample_0 tablesample bernoulli(5.6) inner join public.iportaltest as sample_1 tablesample bernoulli (9.8) on (sample_0.a = sample_1.i ), lateral (select sample_1.d as c0, ref_0.a as c1, sample_1.p as c2, ref_0.a as c3, ref_0.a as c4, sample_0.bas c5, sample_1.i as c6 from public.rtest_view2 as ref_0 where sample_0.b = sample_0.b fetchfirst 93 rows only) as subq_0where sample_0.b ~<=~ sample_0.b) as subq_1 right join public.e_star as ref_1 on (subq_1.c0= ref_1.aa ) where ref_1.cc < ref_1.cc fetch first 59 rows only; select sample_69.tmpllibrary as c0, coalesce(sample_69.tmplname, sample_69.tmplname) as c1, subq_33.c0 as c2 from (select coalesce(ref_53.provider, sample_68.typdefault) as c0from pg_catalog.pg_type as sample_68 tablesample bernoulli(6.9) inner join pg_catalog.pg_shseclabel as ref_53 on (sample_68.typowner = ref_53.objoid ), lateral (select sample_68.typcategory as c0, ref_54.speaker as c1, ref_54.speaker as c2 from public.test_range_exclas ref_54 where (ref_53.label >= ref_53.provider) and (ref_53.label !~* ref_53.provider) fetch first 143 rows only) as subq_32where ref_53.label ~>~ ref_53.label) as subq_33 right join pg_catalog.pg_pltemplateas sample_69 tablesample bernoulli (9.8) on (subq_33.c0 = sample_69.tmplhandler ) where sample_69.tmplvalidator ~ subq_33.c0 fetch first 131 rows only;
I wrote: > Tom Lane writes: >> Andreas Seltenreich <seltenreich@gmx.de> writes: >>> I've added new grammar rules to sqlsmith and improved some older ones. >>> This was rewarded with a return of "failed to generate plan" errors. >> >> I believe I've dealt with these cases now. Thanks for the report! > > I no longer see "failed to build any n-way joins" after pulling, but > there are still instances of "could not devise a query plan". Samples below. sorry, I spoke too soon: nine of the former have been logged through the night. I'm attaching a larger set of sample queries this time in case that there are still multiple causes for the observed errors. regards, Andreas
Attachment
Andreas Seltenreich <seltenreich@gmx.de> writes: >> I no longer see "failed to build any n-way joins" after pulling, but >> there are still instances of "could not devise a query plan". Samples below. > sorry, I spoke too soon: nine of the former have been logged through the > night. I'm attaching a larger set of sample queries this time in case > that there are still multiple causes for the observed errors. Hm. At least in the first of these cases, the problem is that the code I committed yesterday doesn't account for indirect lateral dependencies. That is, if S1 depends on S2 which depends on the inner side of an outer join, it now knows not to join S2 directly to the outer side of the outer join, but it doesn't realize that the same must apply to S1. Maybe we should redefine lateral_relids as the transitive closure of a rel's lateral dependencies? Not sure. regards, tom lane
On Tue, Dec 08, 2015 at 12:13:41PM -0500, Tom Lane wrote: > Andreas Seltenreich <seltenreich@gmx.de> writes: > >> I no longer see "failed to build any n-way joins" after pulling, > >> but there are still instances of "could not devise a query plan". > >> Samples below. > > > sorry, I spoke too soon: nine of the former have been logged > > through the night. I'm attaching a larger set of sample queries > > this time in case that there are still multiple causes for the > > observed errors. > > Hm. At least in the first of these cases, the problem is that the > code I committed yesterday doesn't account for indirect lateral > dependencies. That is, if S1 depends on S2 which depends on the > inner side of an outer join, it now knows not to join S2 directly to > the outer side of the outer join, but it doesn't realize that the > same must apply to S1. > > Maybe we should redefine lateral_relids as the transitive closure of > a rel's lateral dependencies? Not sure. That seems like it would fix the problem once and for good. Is there any reason to believe that the lateral dependencies could go in a loop, i.e. is there a reason to put in checks for same in the transitive closure construction? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > On Tue, Dec 08, 2015 at 12:13:41PM -0500, Tom Lane wrote: >> Hm. At least in the first of these cases, the problem is that the >> code I committed yesterday doesn't account for indirect lateral >> dependencies. That is, if S1 depends on S2 which depends on the >> inner side of an outer join, it now knows not to join S2 directly to >> the outer side of the outer join, but it doesn't realize that the >> same must apply to S1. >> >> Maybe we should redefine lateral_relids as the transitive closure of >> a rel's lateral dependencies? Not sure. > That seems like it would fix the problem once and for good. Is there > any reason to believe that the lateral dependencies could go in a > loop, i.e. is there a reason to put in checks for same in the > transitive closure construction? It should not be possible to have a loop, since rels can only have lateral references to rels that appeared syntactically before them. But an Assert about it doesn't seem a bad thing. After working on this for awhile, I've found that indeed converting lateral_relids to a transitive closure makes things better. But there was another bug (or two other bugs, depending on how you count) exposed by Andreas' examples. I was right after all to suspect that the "hazardous PHV" condition needs to be accounted for by join_is_legal; as things stood, join_is_legal might allow a join for which every possible path would be rejected by check_hazardous_phv. More, if we were insisting that a join PHV be calculated before we could pass it to some other relation, we didn't actually do anything to ensure that that join would get built. Given an appropriate set of conditions, the clauseless-join heuristics could decide that that join wasn't interesting and never build it, again possibly leading to planner failure. So join_is_legal's cousins have_join_order_restriction and has_join_restriction also need to know about this issue. (This is a bit of a mess; I'm beginning to wonder if we shouldn't bite the bullet and teach the executor how to handle non-Var NestLoopParams, so that the planner doesn't have to work around that. But I'd rather not back-patch such a change.) I also noticed that lateral_referencers should be a transitive closure as well. I think that oversight doesn't lead to any observable bug, but it would lead to constructing index paths that cannot be used, so fixing it should save some planner cycles. So that leads to the attached patch, which I think is good as-is for the back branches. In this state of the code, the LateralJoinInfo list is darn near useless: the only thing we use it for is detecting whether two relations have a direct (not transitive closure) lateral reference. I'm strongly tempted to replace that logic by keeping a copy of the pre-closure lateral_relids sets, and then we could drop LateralJoinInfo altogether, which would make create_lateral_join_info quite a bit shorter and faster. That could go into HEAD/9.5, but I'm afraid to back-patch it further than 9.5 for fear that third-party code somewhere might be relying on the LateralJoinInfo data structure. Comments? regards, tom lane diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c index 0f04033..53d8fdd 100644 *** a/src/backend/optimizer/path/joinpath.c --- b/src/backend/optimizer/path/joinpath.c *************** allow_star_schema_join(PlannerInfo *root *** 255,308 **** } /* - * There's a pitfall for creating parameterized nestloops: suppose the inner - * rel (call it A) has a parameter that is a PlaceHolderVar, and that PHV's - * minimum eval_at set includes the outer rel (B) and some third rel (C). - * We might think we could create a B/A nestloop join that's parameterized by - * C. But we would end up with a plan in which the PHV's expression has to be - * evaluated as a nestloop parameter at the B/A join; and the executor is only - * set up to handle simple Vars as NestLoopParams. Rather than add complexity - * and overhead to the executor for such corner cases, it seems better to - * forbid the join. (Note that existence of such a PHV probably means there - * is a join order constraint that will cause us to consider joining B and C - * directly; so we can still make use of A's parameterized path with B+C.) - * So we check whether any PHVs used in the query could pose such a hazard. - * We don't have any simple way of checking whether a risky PHV would actually - * be used in the inner plan, and the case is so unusual that it doesn't seem - * worth working very hard on it. - * - * This case can occur whether or not the join's remaining parameterization - * overlaps param_source_rels, so we have to check for it separately from - * allow_star_schema_join, even though it looks much like a star-schema case. - */ - static inline bool - check_hazardous_phv(PlannerInfo *root, - Path *outer_path, - Path *inner_path) - { - Relids innerparams = PATH_REQ_OUTER(inner_path); - Relids outerrelids = outer_path->parent->relids; - ListCell *lc; - - foreach(lc, root->placeholder_list) - { - PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc); - - if (!bms_is_subset(phinfo->ph_eval_at, innerparams)) - continue; /* ignore, could not be a nestloop param */ - if (!bms_overlap(phinfo->ph_eval_at, outerrelids)) - continue; /* ignore, not relevant to this join */ - if (bms_is_subset(phinfo->ph_eval_at, outerrelids)) - continue; /* safe, it can be eval'd within outerrel */ - /* Otherwise, it's potentially unsafe, so reject the join */ - return false; - } - - /* OK to perform the join */ - return true; - } - - /* * try_nestloop_path * Consider a nestloop join path; if it appears useful, push it into * the joinrel's pathlist via add_path(). --- 255,260 ---- *************** try_nestloop_path(PlannerInfo *root, *** 322,336 **** /* * Check to see if proposed path is still parameterized, and reject if the * parameterization wouldn't be sensible --- unless allow_star_schema_join ! * says to allow it anyway. Also, we must reject if check_hazardous_phv ! * doesn't like the look of it. */ required_outer = calc_nestloop_required_outer(outer_path, inner_path); if (required_outer && ((!bms_overlap(required_outer, extra->param_source_rels) && !allow_star_schema_join(root, outer_path, inner_path)) || ! !check_hazardous_phv(root, outer_path, inner_path))) { /* Waste no memory when we reject a path here */ bms_free(required_outer); --- 274,291 ---- /* * Check to see if proposed path is still parameterized, and reject if the * parameterization wouldn't be sensible --- unless allow_star_schema_join ! * says to allow it anyway. Also, we must reject if have_dangerous_phv ! * doesn't like the look of it, which could only happen if the nestloop is ! * still parameterized. */ required_outer = calc_nestloop_required_outer(outer_path, inner_path); if (required_outer && ((!bms_overlap(required_outer, extra->param_source_rels) && !allow_star_schema_join(root, outer_path, inner_path)) || ! have_dangerous_phv(root, ! outer_path->parent->relids, ! PATH_REQ_OUTER(inner_path)))) { /* Waste no memory when we reject a path here */ bms_free(required_outer); *************** try_nestloop_path(PlannerInfo *root, *** 338,353 **** } /* - * Independently of that, add parameterization needed for any - * PlaceHolderVars that need to be computed at the join. We can handle - * that just by adding joinrel->lateral_relids; that might include some - * rels that are already in required_outer, but no harm done. (Note that - * lateral_relids is exactly NULL if empty, so this will not break the - * property that required_outer is too.) - */ - required_outer = bms_add_members(required_outer, joinrel->lateral_relids); - - /* * Do a precheck to quickly eliminate obviously-inferior paths. We * calculate a cheap lower bound on the path's cost and then use * add_path_precheck() to see if the path is clearly going to be dominated --- 293,298 ---- *************** try_mergejoin_path(PlannerInfo *root, *** 419,430 **** } /* - * Independently of that, add parameterization needed for any - * PlaceHolderVars that need to be computed at the join. - */ - required_outer = bms_add_members(required_outer, joinrel->lateral_relids); - - /* * If the given paths are already well enough ordered, we can skip doing * an explicit sort. */ --- 364,369 ---- *************** try_hashjoin_path(PlannerInfo *root, *** 501,512 **** } /* - * Independently of that, add parameterization needed for any - * PlaceHolderVars that need to be computed at the join. - */ - required_outer = bms_add_members(required_outer, joinrel->lateral_relids); - - /* * See comments in try_nestloop_path(). Also note that hashjoin paths * never have any output pathkeys, per comments in create_hashjoin_path. */ --- 440,445 ---- diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c index 9f0212f..9b24c35 100644 *** a/src/backend/optimizer/path/joinrels.c --- b/src/backend/optimizer/path/joinrels.c *************** join_is_legal(PlannerInfo *root, RelOptI *** 532,573 **** * in just one direction, then the join has to be done with a nestloop * with the lateral referencer on the inside. If the join matches an SJ * that cannot be implemented by such a nestloop, the join is impossible. */ ! lateral_fwd = lateral_rev = false; ! foreach(l, root->lateral_info_list) { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); ! ! if (bms_is_subset(ljinfo->lateral_rhs, rel2->relids) && ! bms_overlap(ljinfo->lateral_lhs, rel1->relids)) { ! /* has to be implemented as nestloop with rel1 on left */ ! if (lateral_rev) ! return false; /* have lateral refs in both directions */ ! lateral_fwd = true; ! if (!bms_is_subset(ljinfo->lateral_lhs, rel1->relids)) ! return false; /* rel1 can't compute the required parameter */ ! if (match_sjinfo && ! (reversed || ! unique_ified || ! match_sjinfo->jointype == JOIN_FULL)) ! return false; /* not implementable as nestloop */ } ! if (bms_is_subset(ljinfo->lateral_rhs, rel1->relids) && ! bms_overlap(ljinfo->lateral_lhs, rel2->relids)) { ! /* has to be implemented as nestloop with rel2 on left */ ! if (lateral_fwd) ! return false; /* have lateral refs in both directions */ ! lateral_rev = true; ! if (!bms_is_subset(ljinfo->lateral_lhs, rel2->relids)) ! return false; /* rel2 can't compute the required parameter */ ! if (match_sjinfo && ! (!reversed || ! unique_ified || ! match_sjinfo->jointype == JOIN_FULL)) ! return false; /* not implementable as nestloop */ } } /* --- 532,589 ---- * in just one direction, then the join has to be done with a nestloop * with the lateral referencer on the inside. If the join matches an SJ * that cannot be implemented by such a nestloop, the join is impossible. + * Another case that might keep us from building a valid plan is the + * implementation restriction described by have_dangerous_phv(). */ ! lateral_fwd = bms_overlap(rel1->relids, rel2->lateral_relids); ! lateral_rev = bms_overlap(rel2->relids, rel1->lateral_relids); ! if (lateral_fwd && lateral_rev) ! return false; /* have lateral refs in both directions */ ! if (lateral_fwd) { ! /* has to be implemented as nestloop with rel1 on left */ ! if (match_sjinfo && ! (reversed || ! unique_ified || ! match_sjinfo->jointype == JOIN_FULL)) ! return false; /* not implementable as nestloop */ ! /* check we won't have a dangerous PHV */ ! if (have_dangerous_phv(root, rel1->relids, rel2->lateral_relids)) ! return false; /* might be unable to handle required PHV */ ! /* check there is a direct reference from rel2 to rel1 */ ! foreach(l, root->lateral_info_list) { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); ! ! if (bms_is_subset(ljinfo->lateral_rhs, rel2->relids) && ! bms_is_subset(ljinfo->lateral_lhs, rel1->relids)) ! break; } ! if (l == NULL) ! return false; /* only indirect refs, so reject */ ! } ! else if (lateral_rev) ! { ! /* has to be implemented as nestloop with rel2 on left */ ! if (match_sjinfo && ! (!reversed || ! unique_ified || ! match_sjinfo->jointype == JOIN_FULL)) ! return false; /* not implementable as nestloop */ ! /* check we won't have a dangerous PHV */ ! if (have_dangerous_phv(root, rel2->relids, rel1->lateral_relids)) ! return false; /* might be unable to handle required PHV */ ! /* check there is a direct reference from rel1 to rel2 */ ! foreach(l, root->lateral_info_list) { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); ! ! if (bms_is_subset(ljinfo->lateral_rhs, rel1->relids) && ! bms_is_subset(ljinfo->lateral_lhs, rel2->relids)) ! break; } + if (l == NULL) + return false; /* only indirect refs, so reject */ } /* *************** join_is_legal(PlannerInfo *root, RelOptI *** 582,588 **** * It seems best not to merge this check into the main loop above, because * it is concerned with SJs that are not otherwise relevant to this join. */ ! join_lateral_rels = min_join_parameterization(root, joinrelids); if (join_lateral_rels) { foreach(l, root->join_info_list) --- 598,605 ---- * It seems best not to merge this check into the main loop above, because * it is concerned with SJs that are not otherwise relevant to this join. */ ! join_lateral_rels = min_join_parameterization(root, joinrelids, ! rel1, rel2); if (join_lateral_rels) { foreach(l, root->join_info_list) *************** make_join_rel(PlannerInfo *root, RelOptI *** 852,859 **** * could be merged with that function, but it seems clearer to separate the * two concerns. We need this test because there are degenerate cases where * a clauseless join must be performed to satisfy join-order restrictions. ! * Also, if one rel has a lateral reference to the other, we should consider ! * joining them even if the join would be clauseless. * * Note: this is only a problem if one side of a degenerate outer join * contains multiple rels, or a clauseless join is required within an --- 869,877 ---- * could be merged with that function, but it seems clearer to separate the * two concerns. We need this test because there are degenerate cases where * a clauseless join must be performed to satisfy join-order restrictions. ! * Also, if one rel has a lateral reference to the other, or both are needed ! * to compute some PHV, we should consider joining them even if the join would ! * be clauseless. * * Note: this is only a problem if one side of a degenerate outer join * contains multiple rels, or a clauseless join is required within an *************** have_join_order_restriction(PlannerInfo *** 870,877 **** ListCell *l; /* ! * If either side has a lateral reference to the other, attempt the join ! * regardless of outer-join considerations. */ foreach(l, root->lateral_info_list) { --- 888,895 ---- ListCell *l; /* ! * If either side has a direct lateral reference to the other, attempt the ! * join regardless of outer-join considerations. */ foreach(l, root->lateral_info_list) { *************** have_join_order_restriction(PlannerInfo *** 886,891 **** --- 904,925 ---- } /* + * Likewise, if both rels are needed to compute some PlaceHolderVar, + * attempt the join regardless of outer-join considerations. (This is not + * very desirable, because a PHV with a large eval_at set will cause a lot + * of probably-useless joins to be considered, but failing to do this can + * cause us to fail to construct a plan at all.) + */ + foreach(l, root->placeholder_list) + { + PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l); + + if (bms_is_subset(rel1->relids, phinfo->ph_eval_at) && + bms_is_subset(rel2->relids, phinfo->ph_eval_at)) + return true; + } + + /* * It's possible that the rels correspond to the left and right sides of a * degenerate outer join, that is, one with no joinclause mentioning the * non-nullable side; in which case we should force the join to occur. *************** have_join_order_restriction(PlannerInfo *** 960,966 **** * has_join_restriction * Detect whether the specified relation has join-order restrictions, * due to being inside an outer join or an IN (sub-SELECT), ! * or participating in any LATERAL references. * * Essentially, this tests whether have_join_order_restriction() could * succeed with this rel and some other one. It's OK if we sometimes --- 994,1000 ---- * has_join_restriction * Detect whether the specified relation has join-order restrictions, * due to being inside an outer join or an IN (sub-SELECT), ! * or participating in any LATERAL references or multi-rel PHVs. * * Essentially, this tests whether have_join_order_restriction() could * succeed with this rel and some other one. It's OK if we sometimes *************** has_join_restriction(PlannerInfo *root, *** 972,983 **** { ListCell *l; ! foreach(l, root->lateral_info_list) { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); ! if (bms_is_subset(ljinfo->lateral_rhs, rel->relids) || ! bms_overlap(ljinfo->lateral_lhs, rel->relids)) return true; } --- 1006,1020 ---- { ListCell *l; ! if (rel->lateral_relids != NULL || rel->lateral_referencers != NULL) ! return true; ! ! foreach(l, root->placeholder_list) { ! PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l); ! if (bms_is_subset(rel->relids, phinfo->ph_eval_at) && ! !bms_equal(rel->relids, phinfo->ph_eval_at)) return true; } *************** has_legal_joinclause(PlannerInfo *root, *** 1059,1064 **** --- 1096,1152 ---- /* + * There's a pitfall for creating parameterized nestloops: suppose the inner + * rel (call it A) has a parameter that is a PlaceHolderVar, and that PHV's + * minimum eval_at set includes the outer rel (B) and some third rel (C). + * We might think we could create a B/A nestloop join that's parameterized by + * C. But we would end up with a plan in which the PHV's expression has to be + * evaluated as a nestloop parameter at the B/A join; and the executor is only + * set up to handle simple Vars as NestLoopParams. Rather than add complexity + * and overhead to the executor for such corner cases, it seems better to + * forbid the join. (Note that we can still make use of A's parameterized + * path with pre-joined B+C as the outer rel. have_join_order_restriction() + * ensures that we will consider making such a join even if there are not + * other reasons to do so.) + * + * So we check whether any PHVs used in the query could pose such a hazard. + * We don't have any simple way of checking whether a risky PHV would actually + * be used in the inner plan, and the case is so unusual that it doesn't seem + * worth working very hard on it. + * + * This needs to be checked in two places. If the inner rel's minimum + * parameterization would trigger the restriction, then join_is_legal() should + * reject the join altogether, because there will be no workable paths for it. + * But joinpath.c has to check again for every proposed nestloop path, because + * the inner path might have more than the minimum parameterization, causing + * some PHV to be dangerous for it that otherwise wouldn't be. + */ + bool + have_dangerous_phv(PlannerInfo *root, + Relids outer_relids, Relids inner_params) + { + ListCell *lc; + + foreach(lc, root->placeholder_list) + { + PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc); + + if (!bms_is_subset(phinfo->ph_eval_at, inner_params)) + continue; /* ignore, could not be a nestloop param */ + if (!bms_overlap(phinfo->ph_eval_at, outer_relids)) + continue; /* ignore, not relevant to this join */ + if (bms_is_subset(phinfo->ph_eval_at, outer_relids)) + continue; /* safe, it can be eval'd within outerrel */ + /* Otherwise, it's potentially unsafe, so reject the join */ + return true; + } + + /* OK to perform the join */ + return false; + } + + + /* * is_dummy_rel --- has relation been proven empty? */ static bool diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c index 2f4e818..e5688ef 100644 *** a/src/backend/optimizer/plan/initsplan.c --- b/src/backend/optimizer/plan/initsplan.c *************** create_lateral_join_info(PlannerInfo *ro *** 449,457 **** Assert(false); } ! /* We now know all the relids needed for lateral refs in this rel */ ! if (bms_is_empty(lateral_relids)) ! continue; /* ensure lateral_relids is NULL if empty */ brel->lateral_relids = lateral_relids; } --- 449,455 ---- Assert(false); } ! /* We now have all the direct lateral refs from this rel */ brel->lateral_relids = lateral_relids; } *************** create_lateral_join_info(PlannerInfo *ro *** 459,464 **** --- 457,470 ---- * Now check for lateral references within PlaceHolderVars, and make * LateralJoinInfos describing each such reference. Unlike references in * unflattened LATERAL RTEs, the referencing location could be a join. + * + * For a PHV that is due to be evaluated at a join, we mark each of the + * join's member baserels as having the PHV's lateral references too. Even + * though the baserels could be scanned without considering those lateral + * refs, we will never be able to form the join except as a path + * parameterized by the lateral refs, so there is no point in considering + * unparameterized paths for the baserels; and we mustn't try to join any + * of those baserels to the lateral refs too soon, either. */ foreach(lc, root->placeholder_list) { *************** create_lateral_join_info(PlannerInfo *ro *** 471,476 **** --- 477,483 ---- PVC_RECURSE_AGGREGATES, PVC_INCLUDE_PLACEHOLDERS); ListCell *lc2; + int ev_at; foreach(lc2, vars) { *************** create_lateral_join_info(PlannerInfo *ro *** 500,505 **** --- 507,521 ---- } list_free(vars); + + ev_at = -1; + while ((ev_at = bms_next_member(eval_at, ev_at)) >= 0) + { + RelOptInfo *brel = find_base_rel(root, ev_at); + + brel->lateral_relids = bms_add_members(brel->lateral_relids, + phinfo->ph_lateral); + } } } *************** create_lateral_join_info(PlannerInfo *ro *** 508,519 **** return; /* ! * Now that we've identified all lateral references, make a second pass in ! * which we mark each baserel with the set of relids of rels that ! * reference it laterally (essentially, the inverse mapping of ! * lateral_relids). We'll need this for join_clause_is_movable_to(). * ! * Also, propagate lateral_relids and lateral_referencers from appendrel * parent rels to their child rels. We intentionally give each child rel * the same minimum parameterization, even though it's quite possible that * some don't reference all the lateral rels. This is because any append --- 524,613 ---- return; /* ! * At this point the lateral_relids sets represent only direct lateral ! * references. Replace them by their transitive closure, so that they ! * describe both direct and indirect lateral references. If relation X ! * references Y laterally, and Y references Z laterally, then we will have ! * to scan X on the inside of a nestloop with Z, so for all intents and ! * purposes X is laterally dependent on Z too. * ! * This code is essentially Warshall's algorithm for transitive closure. ! * The outer loop considers each baserel, and propagates its lateral ! * dependencies to those baserels that have a lateral dependency on it. ! */ ! for (rti = 1; rti < root->simple_rel_array_size; rti++) ! { ! RelOptInfo *brel = root->simple_rel_array[rti]; ! Relids outer_lateral_relids; ! Index rti2; ! ! if (brel == NULL || brel->reloptkind != RELOPT_BASEREL) ! continue; ! ! /* need not consider baserel further if it has no lateral refs */ ! outer_lateral_relids = brel->lateral_relids; ! if (outer_lateral_relids == NULL) ! continue; ! ! /* else scan all baserels */ ! for (rti2 = 1; rti2 < root->simple_rel_array_size; rti2++) ! { ! RelOptInfo *brel2 = root->simple_rel_array[rti2]; ! ! if (brel2 == NULL || brel2->reloptkind != RELOPT_BASEREL) ! continue; ! ! /* if brel2 has lateral ref to brel, propagate brel's refs */ ! if (bms_is_member(rti, brel2->lateral_relids)) ! brel2->lateral_relids = bms_add_members(brel2->lateral_relids, ! outer_lateral_relids); ! } ! } ! ! /* ! * Now that we've identified all lateral references, mark each baserel ! * with the set of relids of rels that reference it laterally (possibly ! * indirectly) --- that is, the inverse mapping of lateral_relids. ! */ ! for (rti = 1; rti < root->simple_rel_array_size; rti++) ! { ! RelOptInfo *brel = root->simple_rel_array[rti]; ! Relids lateral_relids; ! Index rti2; ! ! if (brel == NULL || brel->reloptkind != RELOPT_BASEREL) ! continue; ! ! /* Nothing to do at rels with no lateral refs */ ! lateral_relids = brel->lateral_relids; ! if (lateral_relids == NULL) ! continue; ! ! /* ! * We should not have broken the invariant that lateral_relids is ! * exactly NULL if empty. ! */ ! Assert(!bms_is_empty(lateral_relids)); ! ! /* Also, no rel should have a lateral dependency on itself */ ! Assert(!bms_is_member(rti, lateral_relids)); ! ! /* Mark this rel's referencees */ ! for (rti2 = 1; rti2 < root->simple_rel_array_size; rti2++) ! { ! if (bms_is_member(rti2, lateral_relids)) ! { ! RelOptInfo *brel2 = root->simple_rel_array[rti2]; ! ! Assert(brel2 != NULL && brel2->reloptkind == RELOPT_BASEREL); ! brel2->lateral_referencers = ! bms_add_member(brel2->lateral_referencers, rti); ! } ! } ! } ! ! /* ! * Last, propagate lateral_relids and lateral_referencers from appendrel * parent rels to their child rels. We intentionally give each child rel * the same minimum parameterization, even though it's quite possible that * some don't reference all the lateral rels. This is because any append *************** create_lateral_join_info(PlannerInfo *ro *** 525,549 **** for (rti = 1; rti < root->simple_rel_array_size; rti++) { RelOptInfo *brel = root->simple_rel_array[rti]; - Relids lateral_referencers; ! if (brel == NULL) ! continue; ! if (brel->reloptkind != RELOPT_BASEREL) continue; - /* Compute lateral_referencers using the finished lateral_info_list */ - lateral_referencers = NULL; - foreach(lc, root->lateral_info_list) - { - LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(lc); - - if (bms_is_member(brel->relid, ljinfo->lateral_lhs)) - lateral_referencers = bms_add_members(lateral_referencers, - ljinfo->lateral_rhs); - } - brel->lateral_referencers = lateral_referencers; - /* * If it's an appendrel parent, copy its lateral_relids and * lateral_referencers to each child rel. --- 619,628 ---- for (rti = 1; rti < root->simple_rel_array_size; rti++) { RelOptInfo *brel = root->simple_rel_array[rti]; ! if (brel == NULL || brel->reloptkind != RELOPT_BASEREL) continue; /* * If it's an appendrel parent, copy its lateral_relids and * lateral_referencers to each child rel. diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c index b197f14..a2be2ed 100644 *** a/src/backend/optimizer/util/relnode.c --- b/src/backend/optimizer/util/relnode.c *************** build_join_rel(PlannerInfo *root, *** 373,379 **** joinrel->cheapest_total_path = NULL; joinrel->cheapest_unique_path = NULL; joinrel->cheapest_parameterized_paths = NIL; ! joinrel->lateral_relids = min_join_parameterization(root, joinrel->relids); joinrel->relid = 0; /* indicates not a baserel */ joinrel->rtekind = RTE_JOIN; joinrel->min_attr = 0; --- 373,380 ---- joinrel->cheapest_total_path = NULL; joinrel->cheapest_unique_path = NULL; joinrel->cheapest_parameterized_paths = NIL; ! joinrel->lateral_relids = min_join_parameterization(root, joinrel->relids, ! outer_rel, inner_rel); joinrel->relid = 0; /* indicates not a baserel */ joinrel->rtekind = RTE_JOIN; joinrel->min_attr = 0; *************** build_join_rel(PlannerInfo *root, *** 508,536 **** * because join_is_legal() needs the value to check a prospective join. */ Relids ! min_join_parameterization(PlannerInfo *root, Relids joinrelids) { Relids result; - ListCell *lc; - - /* Easy if there are no lateral references */ - if (root->lateral_info_list == NIL) - return NULL; /* ! * Scan lateral_info_list to find all the lateral references occurring in ! * or below this join. */ ! result = NULL; ! foreach(lc, root->lateral_info_list) ! { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(lc); ! ! if (bms_is_subset(ljinfo->lateral_rhs, joinrelids)) ! result = bms_add_members(result, ljinfo->lateral_lhs); ! } ! ! /* Remove any rels that are already included in the join */ result = bms_del_members(result, joinrelids); /* Maintain invariant that result is exactly NULL if empty */ --- 509,535 ---- * because join_is_legal() needs the value to check a prospective join. */ Relids ! min_join_parameterization(PlannerInfo *root, ! Relids joinrelids, ! RelOptInfo *outer_rel, ! RelOptInfo *inner_rel) { Relids result; /* ! * Basically we just need the union of the inputs' lateral_relids, less ! * whatever is already in the join. ! * ! * It's not immediately obvious that this is a valid way to compute the ! * result, because it might seem that we're ignoring possible lateral refs ! * of PlaceHolderVars that are due to be computed at the join but not in ! * either input. However, because create_lateral_join_info() already ! * charged all such PHV refs to each member baserel of the join, they'll ! * be accounted for already in the inputs' lateral_relids. Likewise, we ! * do not need to worry about doing transitive closure here, because that ! * was already accounted for in the original baserel lateral_relids. */ ! result = bms_union(outer_rel->lateral_relids, inner_rel->lateral_relids); result = bms_del_members(result, joinrelids); /* Maintain invariant that result is exactly NULL if empty */ diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h index 3232123..74f4daf 100644 *** a/src/include/nodes/relation.h --- b/src/include/nodes/relation.h *************** typedef struct PlannerInfo *** 358,363 **** --- 358,364 ---- * cheapest_parameterized_paths - best paths for their parameterizations; * always includes cheapest_total_path, even if that's unparameterized * lateral_relids - required outer rels for LATERAL, as a Relids set + * (includes both direct and indirect lateral references) * * If the relation is a base relation it will have these fields set: * diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h index f41847f..8fb9eda 100644 *** a/src/include/optimizer/pathnode.h --- b/src/include/optimizer/pathnode.h *************** extern RelOptInfo *build_join_rel(Planne *** 148,154 **** RelOptInfo *inner_rel, SpecialJoinInfo *sjinfo, List **restrictlist_ptr); ! extern Relids min_join_parameterization(PlannerInfo *root, Relids joinrelids); extern RelOptInfo *build_empty_join_rel(PlannerInfo *root); extern AppendRelInfo *find_childrel_appendrelinfo(PlannerInfo *root, RelOptInfo *rel); --- 148,157 ---- RelOptInfo *inner_rel, SpecialJoinInfo *sjinfo, List **restrictlist_ptr); ! extern Relids min_join_parameterization(PlannerInfo *root, ! Relids joinrelids, ! RelOptInfo *outer_rel, ! RelOptInfo *inner_rel); extern RelOptInfo *build_empty_join_rel(PlannerInfo *root); extern AppendRelInfo *find_childrel_appendrelinfo(PlannerInfo *root, RelOptInfo *rel); diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h index 87123a5..7757741 100644 *** a/src/include/optimizer/paths.h --- b/src/include/optimizer/paths.h *************** extern RelOptInfo *make_join_rel(Planner *** 98,103 **** --- 98,105 ---- RelOptInfo *rel1, RelOptInfo *rel2); extern bool have_join_order_restriction(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2); + extern bool have_dangerous_phv(PlannerInfo *root, + Relids outer_relids, Relids inner_params); /* * equivclass.c diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index dba5d2d..3926d0d 100644 *** a/src/test/regress/expected/join.out --- b/src/test/regress/expected/join.out *************** where t1.f1 = ss.f1; *** 3620,3625 **** --- 3620,3726 ---- doh! | 4567890123456789 | 123 | 4567890123456789 | doh! (1 row) + explain (verbose, costs off) + select * from + text_tbl t1 + left join int8_tbl i8 + on i8.q2 = 123, + lateral (select i8.q1, t2.f1 from text_tbl t2 limit 1) as ss1, + lateral (select ss1.* from text_tbl t3 limit 1) as ss2 + where t1.f1 = ss2.f1; + QUERY PLAN + ------------------------------------------------------------------- + Nested Loop + Output: t1.f1, i8.q1, i8.q2, (i8.q1), t2.f1, ((i8.q1)), (t2.f1) + Join Filter: (t1.f1 = (t2.f1)) + -> Nested Loop + Output: t1.f1, i8.q1, i8.q2, (i8.q1), t2.f1 + -> Nested Loop Left Join + Output: t1.f1, i8.q1, i8.q2 + -> Seq Scan on public.text_tbl t1 + Output: t1.f1 + -> Materialize + Output: i8.q1, i8.q2 + -> Seq Scan on public.int8_tbl i8 + Output: i8.q1, i8.q2 + Filter: (i8.q2 = 123) + -> Limit + Output: (i8.q1), t2.f1 + -> Seq Scan on public.text_tbl t2 + Output: i8.q1, t2.f1 + -> Limit + Output: ((i8.q1)), (t2.f1) + -> Seq Scan on public.text_tbl t3 + Output: (i8.q1), t2.f1 + (22 rows) + + select * from + text_tbl t1 + left join int8_tbl i8 + on i8.q2 = 123, + lateral (select i8.q1, t2.f1 from text_tbl t2 limit 1) as ss1, + lateral (select ss1.* from text_tbl t3 limit 1) as ss2 + where t1.f1 = ss2.f1; + f1 | q1 | q2 | q1 | f1 | q1 | f1 + ------+------------------+-----+------------------+------+------------------+------ + doh! | 4567890123456789 | 123 | 4567890123456789 | doh! | 4567890123456789 | doh! + (1 row) + + -- + -- check a case in which a PlaceHolderVar forces join order + -- + explain (verbose, costs off) + select ss2.* from + int4_tbl i41 + left join int8_tbl i8 + join (select i42.f1 as c1, i43.f1 as c2, 42 as c3 + from int4_tbl i42, int4_tbl i43) ss1 + on i8.q1 = ss1.c2 + on i41.f1 = ss1.c1, + lateral (select i41.*, i8.*, ss1.* from text_tbl limit 1) ss2 + where ss1.c2 = 0; + QUERY PLAN + ------------------------------------------------------------------------ + Nested Loop + Output: (i41.f1), (i8.q1), (i8.q2), (i42.f1), (i43.f1), ((42)) + -> Hash Join + Output: i41.f1, i42.f1, i8.q1, i8.q2, i43.f1, 42 + Hash Cond: (i41.f1 = i42.f1) + -> Nested Loop + Output: i8.q1, i8.q2, i43.f1, i41.f1 + -> Nested Loop + Output: i8.q1, i8.q2, i43.f1 + -> Seq Scan on public.int8_tbl i8 + Output: i8.q1, i8.q2 + Filter: (i8.q1 = 0) + -> Seq Scan on public.int4_tbl i43 + Output: i43.f1 + Filter: (i43.f1 = 0) + -> Seq Scan on public.int4_tbl i41 + Output: i41.f1 + -> Hash + Output: i42.f1 + -> Seq Scan on public.int4_tbl i42 + Output: i42.f1 + -> Limit + Output: (i41.f1), (i8.q1), (i8.q2), (i42.f1), (i43.f1), ((42)) + -> Seq Scan on public.text_tbl + Output: i41.f1, i8.q1, i8.q2, i42.f1, i43.f1, (42) + (25 rows) + + select ss2.* from + int4_tbl i41 + left join int8_tbl i8 + join (select i42.f1 as c1, i43.f1 as c2, 42 as c3 + from int4_tbl i42, int4_tbl i43) ss1 + on i8.q1 = ss1.c2 + on i41.f1 = ss1.c1, + lateral (select i41.*, i8.*, ss1.* from text_tbl limit 1) ss2 + where ss1.c2 = 0; + f1 | q1 | q2 | c1 | c2 | c3 + ----+----+----+----+----+---- + (0 rows) + -- -- test ability to push constants through outer join clauses -- *************** select * from *** 4741,4754 **** Output: a.q1, a.q2 -> Nested Loop Output: b.q1, c.q1, LEAST(a.q1, b.q1, c.q1) - Join Filter: (a.q2 = b.q1) -> Seq Scan on public.int8_tbl b Output: b.q1, b.q2 ! -> Materialize ! Output: c.q1 ! -> Seq Scan on public.int8_tbl c ! Output: c.q1 ! (13 rows) select * from int8_tbl a left join lateral --- 4842,4853 ---- Output: a.q1, a.q2 -> Nested Loop Output: b.q1, c.q1, LEAST(a.q1, b.q1, c.q1) -> Seq Scan on public.int8_tbl b Output: b.q1, b.q2 ! Filter: (a.q2 = b.q1) ! -> Seq Scan on public.int8_tbl c ! Output: c.q1, c.q2 ! (11 rows) select * from int8_tbl a left join lateral diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql index fdd4e78..85005d7 100644 *** a/src/test/regress/sql/join.sql --- b/src/test/regress/sql/join.sql *************** select * from *** 1134,1139 **** --- 1134,1181 ---- lateral (select i8.q1, t2.f1 from text_tbl t2 limit 1) as ss where t1.f1 = ss.f1; + explain (verbose, costs off) + select * from + text_tbl t1 + left join int8_tbl i8 + on i8.q2 = 123, + lateral (select i8.q1, t2.f1 from text_tbl t2 limit 1) as ss1, + lateral (select ss1.* from text_tbl t3 limit 1) as ss2 + where t1.f1 = ss2.f1; + + select * from + text_tbl t1 + left join int8_tbl i8 + on i8.q2 = 123, + lateral (select i8.q1, t2.f1 from text_tbl t2 limit 1) as ss1, + lateral (select ss1.* from text_tbl t3 limit 1) as ss2 + where t1.f1 = ss2.f1; + + -- + -- check a case in which a PlaceHolderVar forces join order + -- + + explain (verbose, costs off) + select ss2.* from + int4_tbl i41 + left join int8_tbl i8 + join (select i42.f1 as c1, i43.f1 as c2, 42 as c3 + from int4_tbl i42, int4_tbl i43) ss1 + on i8.q1 = ss1.c2 + on i41.f1 = ss1.c1, + lateral (select i41.*, i8.*, ss1.* from text_tbl limit 1) ss2 + where ss1.c2 = 0; + + select ss2.* from + int4_tbl i41 + left join int8_tbl i8 + join (select i42.f1 as c1, i43.f1 as c2, 42 as c3 + from int4_tbl i42, int4_tbl i43) ss1 + on i8.q1 = ss1.c2 + on i41.f1 = ss1.c1, + lateral (select i41.*, i8.*, ss1.* from text_tbl limit 1) ss2 + where ss1.c2 = 0; + -- -- test ability to push constants through outer join clauses --
On Wed, Dec 9, 2015 at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > David Fetter <david@fetter.org> writes: >> On Tue, Dec 08, 2015 at 12:13:41PM -0500, Tom Lane wrote: >>> Hm. At least in the first of these cases, the problem is that the >>> code I committed yesterday doesn't account for indirect lateral >>> dependencies. That is, if S1 depends on S2 which depends on the >>> inner side of an outer join, it now knows not to join S2 directly to >>> the outer side of the outer join, but it doesn't realize that the >>> same must apply to S1. >>> >>> Maybe we should redefine lateral_relids as the transitive closure of >>> a rel's lateral dependencies? Not sure. > >> That seems like it would fix the problem once and for good. Is there >> any reason to believe that the lateral dependencies could go in a >> loop, i.e. is there a reason to put in checks for same in the >> transitive closure construction? > > It should not be possible to have a loop, since rels can only have lateral > references to rels that appeared syntactically before them. But an Assert > about it doesn't seem a bad thing. > > After working on this for awhile, I've found that indeed converting > lateral_relids to a transitive closure makes things better. But there was > another bug (or two other bugs, depending on how you count) exposed by > Andreas' examples. I was right after all to suspect that the "hazardous > PHV" condition needs to be accounted for by join_is_legal; as things > stood, join_is_legal might allow a join for which every possible path > would be rejected by check_hazardous_phv. More, if we were insisting that > a join PHV be calculated before we could pass it to some other relation, > we didn't actually do anything to ensure that that join would get built. > Given an appropriate set of conditions, the clauseless-join heuristics > could decide that that join wasn't interesting and never build it, again > possibly leading to planner failure. So join_is_legal's cousins > have_join_order_restriction and has_join_restriction also need to know > about this issue. > > (This is a bit of a mess; I'm beginning to wonder if we shouldn't bite > the bullet and teach the executor how to handle non-Var NestLoopParams, > so that the planner doesn't have to work around that. But I'd rather > not back-patch such a change.) > > I also noticed that lateral_referencers should be a transitive closure > as well. I think that oversight doesn't lead to any observable bug, > but it would lead to constructing index paths that cannot be used, so > fixing it should save some planner cycles. > > So that leads to the attached patch, which I think is good as-is for > the back branches. In this state of the code, the LateralJoinInfo > list is darn near useless: the only thing we use it for is detecting > whether two relations have a direct (not transitive closure) lateral > reference. I'm strongly tempted to replace that logic by keeping a > copy of the pre-closure lateral_relids sets, and then we could drop > LateralJoinInfo altogether, which would make create_lateral_join_info > quite a bit shorter and faster. That could go into HEAD/9.5, but > I'm afraid to back-patch it further than 9.5 for fear that third-party > code somewhere might be relying on the LateralJoinInfo data structure. Aside from the functional issues, could your changes result in performance regressions? (if so, I'd argue not to backpatch unless there were cases that returned bad data as opposed to spurious errors). merlin
Merlin Moncure <mmoncure@gmail.com> writes: > Aside from the functional issues, could your changes result in > performance regressions? (if so, I'd argue not to backpatch unless > there were cases that returned bad data as opposed to spurious > errors). I can't say that I think planner failures on valid queries is something that's optional to fix. However, I believe that this will typically not change the selected plan in cases where the planner didn't fail before. I did notice one change in an existing regression test, where the planner pushed a qual clause further down in the plan than it did before; but that seems like a better plan anyway. (The reason that happened is that the changes to enlarge the minimum parameterization of some base rels result in choosing to push qualifiers further down, since a qual clause will be evaluated at the lowest plan level that the selected parameterization allows.) It's a little bit harder to gauge the impact on planner speed. The transitive closure calculation could be expensive in a query with many lateral references, but that doesn't seem likely to be common; and anyway we'll buy back some of that cost due to simpler tests later. I'm optimistic that we'll come out ahead in HEAD/9.5 after the removal of LateralJoinInfo setup. It might be roughly a wash in the back branches. regards, tom lane
Tom Lane writes: > [2. transitive-lateral-fixes-1.patch] I was about to write that sqlsmith likes the patch, but after more than 10^8 ok queries the attached ones were generated. regards, Andreas
Attachment
On Sun, Dec 6, 2015 at 9:52 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote: > I've added new grammar rules to sqlsmith and improved some older ones. Could you possibly teach sqlsmith about INSERT ... ON CONFLICT DO UPDATE/IGNORE? I think that that could be very helpful, especially if it could be done in advance of any stable release of 9.5. Thanks -- Peter Geoghegan
Tom Lane writes: > Merlin Moncure <mmoncure@gmail.com> writes: >> Aside from the functional issues, could your changes result in >> performance regressions? [...] > It's a little bit harder to gauge the impact on planner speed. The > transitive closure calculation could be expensive in a query with many > lateral references, but that doesn't seem likely to be common; and anyway > we'll buy back some of that cost due to simpler tests later. I'm > optimistic that we'll come out ahead in HEAD/9.5 after the removal > of LateralJoinInfo setup. It might be roughly a wash in the back > branches. On the empirical side: I see a speedup of 0.4% in testing speed with the patch applied. It could very well be me venting the room one additional time during the second session, resulting in the CPUs spending more time in their opportunistic frequency range or something. regards, Andreas smith=# select extract('day' from t), avg(generated/extract(epoch from updated - t)) as tps from instancenatural join stat where generated > 1000000 and hostname not in('dwagon','slugbug') -- these do stuff beside sqlsmith and t > now() - interval '3 days' group by 1 order by 1;date_part| tps -----------+------------------ 8 | 55.494181110456 9 | 55.6902316869404
On Thu, Dec 10, 2015 at 4:55 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote: > Tom Lane writes: > >> Merlin Moncure <mmoncure@gmail.com> writes: >>> Aside from the functional issues, could your changes result in >>> performance regressions? > [...] >> It's a little bit harder to gauge the impact on planner speed. The >> transitive closure calculation could be expensive in a query with many >> lateral references, but that doesn't seem likely to be common; and anyway >> we'll buy back some of that cost due to simpler tests later. I'm >> optimistic that we'll come out ahead in HEAD/9.5 after the removal >> of LateralJoinInfo setup. It might be roughly a wash in the back >> branches. > > On the empirical side: I see a speedup of 0.4% in testing speed with the > patch applied. It could very well be me venting the room one additional > time during the second session, resulting in the CPUs spending more time > in their opportunistic frequency range or something. Really...wow. That satisfies me. You ought to be commended for sqlsmith -- great stuff. merlin
Andreas Seltenreich <seltenreich@gmx.de> writes: > Tom Lane writes: >> [2. transitive-lateral-fixes-1.patch] > I was about to write that sqlsmith likes the patch, but after more than > 10^8 ok queries the attached ones were generated. Ah-hah --- the new check I added in join_is_legal understood about chains of LATERAL references, but it forgot that we could also have chains of outer-join ordering constraints. When we're looking to see if joining on the basis of a LATERAL reference would break some later outer join, we have to look at outer joins to the outer joins' inner relations, too. Fixed in the attached. I also stuck all of join_is_legal's lateral-related checks inside an "if (root->hasLateralRTEs)" block, which will save some time in typical queries with no LATERAL. That makes that section of the patch a bit bigger than before, but it's mostly just reindentation. Many thanks for the work you've done with this tool! Who knows how long it would've taken us to find these problems otherwise ... regards, tom lane diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c index 0f04033..53d8fdd 100644 *** a/src/backend/optimizer/path/joinpath.c --- b/src/backend/optimizer/path/joinpath.c *************** allow_star_schema_join(PlannerInfo *root *** 255,308 **** } /* - * There's a pitfall for creating parameterized nestloops: suppose the inner - * rel (call it A) has a parameter that is a PlaceHolderVar, and that PHV's - * minimum eval_at set includes the outer rel (B) and some third rel (C). - * We might think we could create a B/A nestloop join that's parameterized by - * C. But we would end up with a plan in which the PHV's expression has to be - * evaluated as a nestloop parameter at the B/A join; and the executor is only - * set up to handle simple Vars as NestLoopParams. Rather than add complexity - * and overhead to the executor for such corner cases, it seems better to - * forbid the join. (Note that existence of such a PHV probably means there - * is a join order constraint that will cause us to consider joining B and C - * directly; so we can still make use of A's parameterized path with B+C.) - * So we check whether any PHVs used in the query could pose such a hazard. - * We don't have any simple way of checking whether a risky PHV would actually - * be used in the inner plan, and the case is so unusual that it doesn't seem - * worth working very hard on it. - * - * This case can occur whether or not the join's remaining parameterization - * overlaps param_source_rels, so we have to check for it separately from - * allow_star_schema_join, even though it looks much like a star-schema case. - */ - static inline bool - check_hazardous_phv(PlannerInfo *root, - Path *outer_path, - Path *inner_path) - { - Relids innerparams = PATH_REQ_OUTER(inner_path); - Relids outerrelids = outer_path->parent->relids; - ListCell *lc; - - foreach(lc, root->placeholder_list) - { - PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc); - - if (!bms_is_subset(phinfo->ph_eval_at, innerparams)) - continue; /* ignore, could not be a nestloop param */ - if (!bms_overlap(phinfo->ph_eval_at, outerrelids)) - continue; /* ignore, not relevant to this join */ - if (bms_is_subset(phinfo->ph_eval_at, outerrelids)) - continue; /* safe, it can be eval'd within outerrel */ - /* Otherwise, it's potentially unsafe, so reject the join */ - return false; - } - - /* OK to perform the join */ - return true; - } - - /* * try_nestloop_path * Consider a nestloop join path; if it appears useful, push it into * the joinrel's pathlist via add_path(). --- 255,260 ---- *************** try_nestloop_path(PlannerInfo *root, *** 322,336 **** /* * Check to see if proposed path is still parameterized, and reject if the * parameterization wouldn't be sensible --- unless allow_star_schema_join ! * says to allow it anyway. Also, we must reject if check_hazardous_phv ! * doesn't like the look of it. */ required_outer = calc_nestloop_required_outer(outer_path, inner_path); if (required_outer && ((!bms_overlap(required_outer, extra->param_source_rels) && !allow_star_schema_join(root, outer_path, inner_path)) || ! !check_hazardous_phv(root, outer_path, inner_path))) { /* Waste no memory when we reject a path here */ bms_free(required_outer); --- 274,291 ---- /* * Check to see if proposed path is still parameterized, and reject if the * parameterization wouldn't be sensible --- unless allow_star_schema_join ! * says to allow it anyway. Also, we must reject if have_dangerous_phv ! * doesn't like the look of it, which could only happen if the nestloop is ! * still parameterized. */ required_outer = calc_nestloop_required_outer(outer_path, inner_path); if (required_outer && ((!bms_overlap(required_outer, extra->param_source_rels) && !allow_star_schema_join(root, outer_path, inner_path)) || ! have_dangerous_phv(root, ! outer_path->parent->relids, ! PATH_REQ_OUTER(inner_path)))) { /* Waste no memory when we reject a path here */ bms_free(required_outer); *************** try_nestloop_path(PlannerInfo *root, *** 338,353 **** } /* - * Independently of that, add parameterization needed for any - * PlaceHolderVars that need to be computed at the join. We can handle - * that just by adding joinrel->lateral_relids; that might include some - * rels that are already in required_outer, but no harm done. (Note that - * lateral_relids is exactly NULL if empty, so this will not break the - * property that required_outer is too.) - */ - required_outer = bms_add_members(required_outer, joinrel->lateral_relids); - - /* * Do a precheck to quickly eliminate obviously-inferior paths. We * calculate a cheap lower bound on the path's cost and then use * add_path_precheck() to see if the path is clearly going to be dominated --- 293,298 ---- *************** try_mergejoin_path(PlannerInfo *root, *** 419,430 **** } /* - * Independently of that, add parameterization needed for any - * PlaceHolderVars that need to be computed at the join. - */ - required_outer = bms_add_members(required_outer, joinrel->lateral_relids); - - /* * If the given paths are already well enough ordered, we can skip doing * an explicit sort. */ --- 364,369 ---- *************** try_hashjoin_path(PlannerInfo *root, *** 501,512 **** } /* - * Independently of that, add parameterization needed for any - * PlaceHolderVars that need to be computed at the join. - */ - required_outer = bms_add_members(required_outer, joinrel->lateral_relids); - - /* * See comments in try_nestloop_path(). Also note that hashjoin paths * never have any output pathkeys, per comments in create_hashjoin_path. */ --- 440,445 ---- diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c index 9f0212f..e57b8e2 100644 *** a/src/backend/optimizer/path/joinrels.c --- b/src/backend/optimizer/path/joinrels.c *************** join_is_legal(PlannerInfo *root, RelOptI *** 332,340 **** bool reversed; bool unique_ified; bool must_be_leftjoin; - bool lateral_fwd; - bool lateral_rev; - Relids join_lateral_rels; ListCell *l; /* --- 332,337 ---- *************** join_is_legal(PlannerInfo *root, RelOptI *** 527,601 **** /* * We also have to check for constraints imposed by LATERAL references. - * The proposed rels could each contain lateral references to the other, - * in which case the join is impossible. If there are lateral references - * in just one direction, then the join has to be done with a nestloop - * with the lateral referencer on the inside. If the join matches an SJ - * that cannot be implemented by such a nestloop, the join is impossible. */ ! lateral_fwd = lateral_rev = false; ! foreach(l, root->lateral_info_list) { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); ! if (bms_is_subset(ljinfo->lateral_rhs, rel2->relids) && ! bms_overlap(ljinfo->lateral_lhs, rel1->relids)) { /* has to be implemented as nestloop with rel1 on left */ - if (lateral_rev) - return false; /* have lateral refs in both directions */ - lateral_fwd = true; - if (!bms_is_subset(ljinfo->lateral_lhs, rel1->relids)) - return false; /* rel1 can't compute the required parameter */ if (match_sjinfo && (reversed || unique_ified || match_sjinfo->jointype == JOIN_FULL)) return false; /* not implementable as nestloop */ } ! if (bms_is_subset(ljinfo->lateral_rhs, rel1->relids) && ! bms_overlap(ljinfo->lateral_lhs, rel2->relids)) { /* has to be implemented as nestloop with rel2 on left */ - if (lateral_fwd) - return false; /* have lateral refs in both directions */ - lateral_rev = true; - if (!bms_is_subset(ljinfo->lateral_lhs, rel2->relids)) - return false; /* rel2 can't compute the required parameter */ if (match_sjinfo && (!reversed || unique_ified || match_sjinfo->jointype == JOIN_FULL)) return false; /* not implementable as nestloop */ } - } ! /* ! * LATERAL references could also cause problems later on if we accept this ! * join: if the join's minimum parameterization includes any rels that ! * would have to be on the inside of an outer join with this join rel, ! * then it's never going to be possible to build the complete query using ! * this join. We should reject this join not only because it'll save ! * work, but because if we don't, the clauseless-join heuristics might ! * think that legality of this join means that some other join rel need ! * not be formed, and that could lead to failure to find any plan at all. ! * It seems best not to merge this check into the main loop above, because ! * it is concerned with SJs that are not otherwise relevant to this join. ! */ ! join_lateral_rels = min_join_parameterization(root, joinrelids); ! if (join_lateral_rels) ! { ! foreach(l, root->join_info_list) { ! SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(l); ! if (bms_overlap(sjinfo->min_righthand, join_lateral_rels) && ! bms_overlap(sjinfo->min_lefthand, joinrelids)) ! return false; /* will not be able to join to min_righthand */ ! if (sjinfo->jointype == JOIN_FULL && ! bms_overlap(sjinfo->min_lefthand, join_lateral_rels) && ! bms_overlap(sjinfo->min_righthand, joinrelids)) ! return false; /* will not be able to join to min_lefthand */ } } --- 524,644 ---- /* * We also have to check for constraints imposed by LATERAL references. */ ! if (root->hasLateralRTEs) { ! bool lateral_fwd; ! bool lateral_rev; ! Relids join_lateral_rels; ! /* ! * The proposed rels could each contain lateral references to the ! * other, in which case the join is impossible. If there are lateral ! * references in just one direction, then the join has to be done with ! * a nestloop with the lateral referencer on the inside. If the join ! * matches an SJ that cannot be implemented by such a nestloop, the ! * join is impossible. Another case that might keep us from building ! * a valid plan is the implementation restriction described by ! * have_dangerous_phv(). ! */ ! lateral_fwd = bms_overlap(rel1->relids, rel2->lateral_relids); ! lateral_rev = bms_overlap(rel2->relids, rel1->lateral_relids); ! if (lateral_fwd && lateral_rev) ! return false; /* have lateral refs in both directions */ ! if (lateral_fwd) { /* has to be implemented as nestloop with rel1 on left */ if (match_sjinfo && (reversed || unique_ified || match_sjinfo->jointype == JOIN_FULL)) return false; /* not implementable as nestloop */ + /* check there is a direct reference from rel2 to rel1 */ + foreach(l, root->lateral_info_list) + { + LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); + + if (bms_is_subset(ljinfo->lateral_rhs, rel2->relids) && + bms_is_subset(ljinfo->lateral_lhs, rel1->relids)) + break; + } + if (l == NULL) + return false; /* only indirect refs, so reject */ + /* check we won't have a dangerous PHV */ + if (have_dangerous_phv(root, rel1->relids, rel2->lateral_relids)) + return false; /* might be unable to handle required PHV */ } ! else if (lateral_rev) { /* has to be implemented as nestloop with rel2 on left */ if (match_sjinfo && (!reversed || unique_ified || match_sjinfo->jointype == JOIN_FULL)) return false; /* not implementable as nestloop */ + /* check there is a direct reference from rel1 to rel2 */ + foreach(l, root->lateral_info_list) + { + LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); + + if (bms_is_subset(ljinfo->lateral_rhs, rel1->relids) && + bms_is_subset(ljinfo->lateral_lhs, rel2->relids)) + break; + } + if (l == NULL) + return false; /* only indirect refs, so reject */ + /* check we won't have a dangerous PHV */ + if (have_dangerous_phv(root, rel2->relids, rel1->lateral_relids)) + return false; /* might be unable to handle required PHV */ } ! /* ! * LATERAL references could also cause problems later on if we accept ! * this join: if the join's minimum parameterization includes any rels ! * that would have to be on the inside of an outer join with this join ! * rel, then it's never going to be possible to build the complete ! * query using this join. We should reject this join not only because ! * it'll save work, but because if we don't, the clauseless-join ! * heuristics might think that legality of this join means that some ! * other join rel need not be formed, and that could lead to failure ! * to find any plan at all. We have to consider not only rels that ! * are directly on the inner side of an OJ with the joinrel, but also ! * ones that are indirectly so, so search to find all such rels. ! */ ! join_lateral_rels = min_join_parameterization(root, joinrelids, ! rel1, rel2); ! if (join_lateral_rels) { ! Relids join_plus_rhs = bms_copy(joinrelids); ! bool more; ! do ! { ! more = false; ! foreach(l, root->join_info_list) ! { ! SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(l); ! ! if (bms_overlap(sjinfo->min_lefthand, join_plus_rhs) && ! !bms_is_subset(sjinfo->min_righthand, join_plus_rhs)) ! { ! join_plus_rhs = bms_add_members(join_plus_rhs, ! sjinfo->min_righthand); ! more = true; ! } ! /* full joins constrain both sides symmetrically */ ! if (sjinfo->jointype == JOIN_FULL && ! bms_overlap(sjinfo->min_righthand, join_plus_rhs) && ! !bms_is_subset(sjinfo->min_lefthand, join_plus_rhs)) ! { ! join_plus_rhs = bms_add_members(join_plus_rhs, ! sjinfo->min_lefthand); ! more = true; ! } ! } ! } while (more); ! if (bms_overlap(join_plus_rhs, join_lateral_rels)) ! return false; /* will not be able to join to some RHS rel */ } } *************** make_join_rel(PlannerInfo *root, RelOptI *** 852,859 **** * could be merged with that function, but it seems clearer to separate the * two concerns. We need this test because there are degenerate cases where * a clauseless join must be performed to satisfy join-order restrictions. ! * Also, if one rel has a lateral reference to the other, we should consider ! * joining them even if the join would be clauseless. * * Note: this is only a problem if one side of a degenerate outer join * contains multiple rels, or a clauseless join is required within an --- 895,903 ---- * could be merged with that function, but it seems clearer to separate the * two concerns. We need this test because there are degenerate cases where * a clauseless join must be performed to satisfy join-order restrictions. ! * Also, if one rel has a lateral reference to the other, or both are needed ! * to compute some PHV, we should consider joining them even if the join would ! * be clauseless. * * Note: this is only a problem if one side of a degenerate outer join * contains multiple rels, or a clauseless join is required within an *************** have_join_order_restriction(PlannerInfo *** 870,877 **** ListCell *l; /* ! * If either side has a lateral reference to the other, attempt the join ! * regardless of outer-join considerations. */ foreach(l, root->lateral_info_list) { --- 914,921 ---- ListCell *l; /* ! * If either side has a direct lateral reference to the other, attempt the ! * join regardless of outer-join considerations. */ foreach(l, root->lateral_info_list) { *************** have_join_order_restriction(PlannerInfo *** 886,891 **** --- 930,951 ---- } /* + * Likewise, if both rels are needed to compute some PlaceHolderVar, + * attempt the join regardless of outer-join considerations. (This is not + * very desirable, because a PHV with a large eval_at set will cause a lot + * of probably-useless joins to be considered, but failing to do this can + * cause us to fail to construct a plan at all.) + */ + foreach(l, root->placeholder_list) + { + PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l); + + if (bms_is_subset(rel1->relids, phinfo->ph_eval_at) && + bms_is_subset(rel2->relids, phinfo->ph_eval_at)) + return true; + } + + /* * It's possible that the rels correspond to the left and right sides of a * degenerate outer join, that is, one with no joinclause mentioning the * non-nullable side; in which case we should force the join to occur. *************** have_join_order_restriction(PlannerInfo *** 960,966 **** * has_join_restriction * Detect whether the specified relation has join-order restrictions, * due to being inside an outer join or an IN (sub-SELECT), ! * or participating in any LATERAL references. * * Essentially, this tests whether have_join_order_restriction() could * succeed with this rel and some other one. It's OK if we sometimes --- 1020,1026 ---- * has_join_restriction * Detect whether the specified relation has join-order restrictions, * due to being inside an outer join or an IN (sub-SELECT), ! * or participating in any LATERAL references or multi-rel PHVs. * * Essentially, this tests whether have_join_order_restriction() could * succeed with this rel and some other one. It's OK if we sometimes *************** has_join_restriction(PlannerInfo *root, *** 972,983 **** { ListCell *l; ! foreach(l, root->lateral_info_list) { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); ! if (bms_is_subset(ljinfo->lateral_rhs, rel->relids) || ! bms_overlap(ljinfo->lateral_lhs, rel->relids)) return true; } --- 1032,1046 ---- { ListCell *l; ! if (rel->lateral_relids != NULL || rel->lateral_referencers != NULL) ! return true; ! ! foreach(l, root->placeholder_list) { ! PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l); ! if (bms_is_subset(rel->relids, phinfo->ph_eval_at) && ! !bms_equal(rel->relids, phinfo->ph_eval_at)) return true; } *************** has_legal_joinclause(PlannerInfo *root, *** 1059,1064 **** --- 1122,1178 ---- /* + * There's a pitfall for creating parameterized nestloops: suppose the inner + * rel (call it A) has a parameter that is a PlaceHolderVar, and that PHV's + * minimum eval_at set includes the outer rel (B) and some third rel (C). + * We might think we could create a B/A nestloop join that's parameterized by + * C. But we would end up with a plan in which the PHV's expression has to be + * evaluated as a nestloop parameter at the B/A join; and the executor is only + * set up to handle simple Vars as NestLoopParams. Rather than add complexity + * and overhead to the executor for such corner cases, it seems better to + * forbid the join. (Note that we can still make use of A's parameterized + * path with pre-joined B+C as the outer rel. have_join_order_restriction() + * ensures that we will consider making such a join even if there are not + * other reasons to do so.) + * + * So we check whether any PHVs used in the query could pose such a hazard. + * We don't have any simple way of checking whether a risky PHV would actually + * be used in the inner plan, and the case is so unusual that it doesn't seem + * worth working very hard on it. + * + * This needs to be checked in two places. If the inner rel's minimum + * parameterization would trigger the restriction, then join_is_legal() should + * reject the join altogether, because there will be no workable paths for it. + * But joinpath.c has to check again for every proposed nestloop path, because + * the inner path might have more than the minimum parameterization, causing + * some PHV to be dangerous for it that otherwise wouldn't be. + */ + bool + have_dangerous_phv(PlannerInfo *root, + Relids outer_relids, Relids inner_params) + { + ListCell *lc; + + foreach(lc, root->placeholder_list) + { + PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc); + + if (!bms_is_subset(phinfo->ph_eval_at, inner_params)) + continue; /* ignore, could not be a nestloop param */ + if (!bms_overlap(phinfo->ph_eval_at, outer_relids)) + continue; /* ignore, not relevant to this join */ + if (bms_is_subset(phinfo->ph_eval_at, outer_relids)) + continue; /* safe, it can be eval'd within outerrel */ + /* Otherwise, it's potentially unsafe, so reject the join */ + return true; + } + + /* OK to perform the join */ + return false; + } + + + /* * is_dummy_rel --- has relation been proven empty? */ static bool diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c index 2f4e818..e5688ef 100644 *** a/src/backend/optimizer/plan/initsplan.c --- b/src/backend/optimizer/plan/initsplan.c *************** create_lateral_join_info(PlannerInfo *ro *** 449,457 **** Assert(false); } ! /* We now know all the relids needed for lateral refs in this rel */ ! if (bms_is_empty(lateral_relids)) ! continue; /* ensure lateral_relids is NULL if empty */ brel->lateral_relids = lateral_relids; } --- 449,455 ---- Assert(false); } ! /* We now have all the direct lateral refs from this rel */ brel->lateral_relids = lateral_relids; } *************** create_lateral_join_info(PlannerInfo *ro *** 459,464 **** --- 457,470 ---- * Now check for lateral references within PlaceHolderVars, and make * LateralJoinInfos describing each such reference. Unlike references in * unflattened LATERAL RTEs, the referencing location could be a join. + * + * For a PHV that is due to be evaluated at a join, we mark each of the + * join's member baserels as having the PHV's lateral references too. Even + * though the baserels could be scanned without considering those lateral + * refs, we will never be able to form the join except as a path + * parameterized by the lateral refs, so there is no point in considering + * unparameterized paths for the baserels; and we mustn't try to join any + * of those baserels to the lateral refs too soon, either. */ foreach(lc, root->placeholder_list) { *************** create_lateral_join_info(PlannerInfo *ro *** 471,476 **** --- 477,483 ---- PVC_RECURSE_AGGREGATES, PVC_INCLUDE_PLACEHOLDERS); ListCell *lc2; + int ev_at; foreach(lc2, vars) { *************** create_lateral_join_info(PlannerInfo *ro *** 500,505 **** --- 507,521 ---- } list_free(vars); + + ev_at = -1; + while ((ev_at = bms_next_member(eval_at, ev_at)) >= 0) + { + RelOptInfo *brel = find_base_rel(root, ev_at); + + brel->lateral_relids = bms_add_members(brel->lateral_relids, + phinfo->ph_lateral); + } } } *************** create_lateral_join_info(PlannerInfo *ro *** 508,519 **** return; /* ! * Now that we've identified all lateral references, make a second pass in ! * which we mark each baserel with the set of relids of rels that ! * reference it laterally (essentially, the inverse mapping of ! * lateral_relids). We'll need this for join_clause_is_movable_to(). * ! * Also, propagate lateral_relids and lateral_referencers from appendrel * parent rels to their child rels. We intentionally give each child rel * the same minimum parameterization, even though it's quite possible that * some don't reference all the lateral rels. This is because any append --- 524,613 ---- return; /* ! * At this point the lateral_relids sets represent only direct lateral ! * references. Replace them by their transitive closure, so that they ! * describe both direct and indirect lateral references. If relation X ! * references Y laterally, and Y references Z laterally, then we will have ! * to scan X on the inside of a nestloop with Z, so for all intents and ! * purposes X is laterally dependent on Z too. * ! * This code is essentially Warshall's algorithm for transitive closure. ! * The outer loop considers each baserel, and propagates its lateral ! * dependencies to those baserels that have a lateral dependency on it. ! */ ! for (rti = 1; rti < root->simple_rel_array_size; rti++) ! { ! RelOptInfo *brel = root->simple_rel_array[rti]; ! Relids outer_lateral_relids; ! Index rti2; ! ! if (brel == NULL || brel->reloptkind != RELOPT_BASEREL) ! continue; ! ! /* need not consider baserel further if it has no lateral refs */ ! outer_lateral_relids = brel->lateral_relids; ! if (outer_lateral_relids == NULL) ! continue; ! ! /* else scan all baserels */ ! for (rti2 = 1; rti2 < root->simple_rel_array_size; rti2++) ! { ! RelOptInfo *brel2 = root->simple_rel_array[rti2]; ! ! if (brel2 == NULL || brel2->reloptkind != RELOPT_BASEREL) ! continue; ! ! /* if brel2 has lateral ref to brel, propagate brel's refs */ ! if (bms_is_member(rti, brel2->lateral_relids)) ! brel2->lateral_relids = bms_add_members(brel2->lateral_relids, ! outer_lateral_relids); ! } ! } ! ! /* ! * Now that we've identified all lateral references, mark each baserel ! * with the set of relids of rels that reference it laterally (possibly ! * indirectly) --- that is, the inverse mapping of lateral_relids. ! */ ! for (rti = 1; rti < root->simple_rel_array_size; rti++) ! { ! RelOptInfo *brel = root->simple_rel_array[rti]; ! Relids lateral_relids; ! Index rti2; ! ! if (brel == NULL || brel->reloptkind != RELOPT_BASEREL) ! continue; ! ! /* Nothing to do at rels with no lateral refs */ ! lateral_relids = brel->lateral_relids; ! if (lateral_relids == NULL) ! continue; ! ! /* ! * We should not have broken the invariant that lateral_relids is ! * exactly NULL if empty. ! */ ! Assert(!bms_is_empty(lateral_relids)); ! ! /* Also, no rel should have a lateral dependency on itself */ ! Assert(!bms_is_member(rti, lateral_relids)); ! ! /* Mark this rel's referencees */ ! for (rti2 = 1; rti2 < root->simple_rel_array_size; rti2++) ! { ! if (bms_is_member(rti2, lateral_relids)) ! { ! RelOptInfo *brel2 = root->simple_rel_array[rti2]; ! ! Assert(brel2 != NULL && brel2->reloptkind == RELOPT_BASEREL); ! brel2->lateral_referencers = ! bms_add_member(brel2->lateral_referencers, rti); ! } ! } ! } ! ! /* ! * Last, propagate lateral_relids and lateral_referencers from appendrel * parent rels to their child rels. We intentionally give each child rel * the same minimum parameterization, even though it's quite possible that * some don't reference all the lateral rels. This is because any append *************** create_lateral_join_info(PlannerInfo *ro *** 525,549 **** for (rti = 1; rti < root->simple_rel_array_size; rti++) { RelOptInfo *brel = root->simple_rel_array[rti]; - Relids lateral_referencers; ! if (brel == NULL) ! continue; ! if (brel->reloptkind != RELOPT_BASEREL) continue; - /* Compute lateral_referencers using the finished lateral_info_list */ - lateral_referencers = NULL; - foreach(lc, root->lateral_info_list) - { - LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(lc); - - if (bms_is_member(brel->relid, ljinfo->lateral_lhs)) - lateral_referencers = bms_add_members(lateral_referencers, - ljinfo->lateral_rhs); - } - brel->lateral_referencers = lateral_referencers; - /* * If it's an appendrel parent, copy its lateral_relids and * lateral_referencers to each child rel. --- 619,628 ---- for (rti = 1; rti < root->simple_rel_array_size; rti++) { RelOptInfo *brel = root->simple_rel_array[rti]; ! if (brel == NULL || brel->reloptkind != RELOPT_BASEREL) continue; /* * If it's an appendrel parent, copy its lateral_relids and * lateral_referencers to each child rel. diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c index b197f14..a2be2ed 100644 *** a/src/backend/optimizer/util/relnode.c --- b/src/backend/optimizer/util/relnode.c *************** build_join_rel(PlannerInfo *root, *** 373,379 **** joinrel->cheapest_total_path = NULL; joinrel->cheapest_unique_path = NULL; joinrel->cheapest_parameterized_paths = NIL; ! joinrel->lateral_relids = min_join_parameterization(root, joinrel->relids); joinrel->relid = 0; /* indicates not a baserel */ joinrel->rtekind = RTE_JOIN; joinrel->min_attr = 0; --- 373,380 ---- joinrel->cheapest_total_path = NULL; joinrel->cheapest_unique_path = NULL; joinrel->cheapest_parameterized_paths = NIL; ! joinrel->lateral_relids = min_join_parameterization(root, joinrel->relids, ! outer_rel, inner_rel); joinrel->relid = 0; /* indicates not a baserel */ joinrel->rtekind = RTE_JOIN; joinrel->min_attr = 0; *************** build_join_rel(PlannerInfo *root, *** 508,536 **** * because join_is_legal() needs the value to check a prospective join. */ Relids ! min_join_parameterization(PlannerInfo *root, Relids joinrelids) { Relids result; - ListCell *lc; - - /* Easy if there are no lateral references */ - if (root->lateral_info_list == NIL) - return NULL; /* ! * Scan lateral_info_list to find all the lateral references occurring in ! * or below this join. */ ! result = NULL; ! foreach(lc, root->lateral_info_list) ! { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(lc); ! ! if (bms_is_subset(ljinfo->lateral_rhs, joinrelids)) ! result = bms_add_members(result, ljinfo->lateral_lhs); ! } ! ! /* Remove any rels that are already included in the join */ result = bms_del_members(result, joinrelids); /* Maintain invariant that result is exactly NULL if empty */ --- 509,535 ---- * because join_is_legal() needs the value to check a prospective join. */ Relids ! min_join_parameterization(PlannerInfo *root, ! Relids joinrelids, ! RelOptInfo *outer_rel, ! RelOptInfo *inner_rel) { Relids result; /* ! * Basically we just need the union of the inputs' lateral_relids, less ! * whatever is already in the join. ! * ! * It's not immediately obvious that this is a valid way to compute the ! * result, because it might seem that we're ignoring possible lateral refs ! * of PlaceHolderVars that are due to be computed at the join but not in ! * either input. However, because create_lateral_join_info() already ! * charged all such PHV refs to each member baserel of the join, they'll ! * be accounted for already in the inputs' lateral_relids. Likewise, we ! * do not need to worry about doing transitive closure here, because that ! * was already accounted for in the original baserel lateral_relids. */ ! result = bms_union(outer_rel->lateral_relids, inner_rel->lateral_relids); result = bms_del_members(result, joinrelids); /* Maintain invariant that result is exactly NULL if empty */ diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h index 3232123..74f4daf 100644 *** a/src/include/nodes/relation.h --- b/src/include/nodes/relation.h *************** typedef struct PlannerInfo *** 358,363 **** --- 358,364 ---- * cheapest_parameterized_paths - best paths for their parameterizations; * always includes cheapest_total_path, even if that's unparameterized * lateral_relids - required outer rels for LATERAL, as a Relids set + * (includes both direct and indirect lateral references) * * If the relation is a base relation it will have these fields set: * diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h index f41847f..8fb9eda 100644 *** a/src/include/optimizer/pathnode.h --- b/src/include/optimizer/pathnode.h *************** extern RelOptInfo *build_join_rel(Planne *** 148,154 **** RelOptInfo *inner_rel, SpecialJoinInfo *sjinfo, List **restrictlist_ptr); ! extern Relids min_join_parameterization(PlannerInfo *root, Relids joinrelids); extern RelOptInfo *build_empty_join_rel(PlannerInfo *root); extern AppendRelInfo *find_childrel_appendrelinfo(PlannerInfo *root, RelOptInfo *rel); --- 148,157 ---- RelOptInfo *inner_rel, SpecialJoinInfo *sjinfo, List **restrictlist_ptr); ! extern Relids min_join_parameterization(PlannerInfo *root, ! Relids joinrelids, ! RelOptInfo *outer_rel, ! RelOptInfo *inner_rel); extern RelOptInfo *build_empty_join_rel(PlannerInfo *root); extern AppendRelInfo *find_childrel_appendrelinfo(PlannerInfo *root, RelOptInfo *rel); diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h index 87123a5..7757741 100644 *** a/src/include/optimizer/paths.h --- b/src/include/optimizer/paths.h *************** extern RelOptInfo *make_join_rel(Planner *** 98,103 **** --- 98,105 ---- RelOptInfo *rel1, RelOptInfo *rel2); extern bool have_join_order_restriction(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2); + extern bool have_dangerous_phv(PlannerInfo *root, + Relids outer_relids, Relids inner_params); /* * equivclass.c diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index dba5d2d..64f046e 100644 *** a/src/test/regress/expected/join.out --- b/src/test/regress/expected/join.out *************** where t1.f1 = ss.f1; *** 3620,3625 **** --- 3620,3778 ---- doh! | 4567890123456789 | 123 | 4567890123456789 | doh! (1 row) + explain (verbose, costs off) + select * from + text_tbl t1 + left join int8_tbl i8 + on i8.q2 = 123, + lateral (select i8.q1, t2.f1 from text_tbl t2 limit 1) as ss1, + lateral (select ss1.* from text_tbl t3 limit 1) as ss2 + where t1.f1 = ss2.f1; + QUERY PLAN + ------------------------------------------------------------------- + Nested Loop + Output: t1.f1, i8.q1, i8.q2, (i8.q1), t2.f1, ((i8.q1)), (t2.f1) + Join Filter: (t1.f1 = (t2.f1)) + -> Nested Loop + Output: t1.f1, i8.q1, i8.q2, (i8.q1), t2.f1 + -> Nested Loop Left Join + Output: t1.f1, i8.q1, i8.q2 + -> Seq Scan on public.text_tbl t1 + Output: t1.f1 + -> Materialize + Output: i8.q1, i8.q2 + -> Seq Scan on public.int8_tbl i8 + Output: i8.q1, i8.q2 + Filter: (i8.q2 = 123) + -> Limit + Output: (i8.q1), t2.f1 + -> Seq Scan on public.text_tbl t2 + Output: i8.q1, t2.f1 + -> Limit + Output: ((i8.q1)), (t2.f1) + -> Seq Scan on public.text_tbl t3 + Output: (i8.q1), t2.f1 + (22 rows) + + select * from + text_tbl t1 + left join int8_tbl i8 + on i8.q2 = 123, + lateral (select i8.q1, t2.f1 from text_tbl t2 limit 1) as ss1, + lateral (select ss1.* from text_tbl t3 limit 1) as ss2 + where t1.f1 = ss2.f1; + f1 | q1 | q2 | q1 | f1 | q1 | f1 + ------+------------------+-----+------------------+------+------------------+------ + doh! | 4567890123456789 | 123 | 4567890123456789 | doh! | 4567890123456789 | doh! + (1 row) + + explain (verbose, costs off) + select 1 from + text_tbl as tt1 + inner join text_tbl as tt2 on (tt1.f1 = 'foo') + left join text_tbl as tt3 on (tt3.f1 = 'foo') + left join text_tbl as tt4 on (tt3.f1 = tt4.f1), + lateral (select tt4.f1 as c0 from text_tbl as tt5 limit 1) as ss1 + where tt1.f1 = ss1.c0; + QUERY PLAN + ---------------------------------------------------------- + Nested Loop + Output: 1 + -> Nested Loop Left Join + Output: tt1.f1, tt4.f1 + -> Nested Loop + Output: tt1.f1 + -> Seq Scan on public.text_tbl tt1 + Output: tt1.f1 + Filter: (tt1.f1 = 'foo'::text) + -> Seq Scan on public.text_tbl tt2 + Output: tt2.f1 + -> Materialize + Output: tt4.f1 + -> Nested Loop Left Join + Output: tt4.f1 + Join Filter: (tt3.f1 = tt4.f1) + -> Seq Scan on public.text_tbl tt3 + Output: tt3.f1 + Filter: (tt3.f1 = 'foo'::text) + -> Seq Scan on public.text_tbl tt4 + Output: tt4.f1 + Filter: (tt4.f1 = 'foo'::text) + -> Subquery Scan on ss1 + Output: ss1.c0 + Filter: (ss1.c0 = 'foo'::text) + -> Limit + Output: (tt4.f1) + -> Seq Scan on public.text_tbl tt5 + Output: tt4.f1 + (29 rows) + + select 1 from + text_tbl as tt1 + inner join text_tbl as tt2 on (tt1.f1 = 'foo') + left join text_tbl as tt3 on (tt3.f1 = 'foo') + left join text_tbl as tt4 on (tt3.f1 = tt4.f1), + lateral (select tt4.f1 as c0 from text_tbl as tt5 limit 1) as ss1 + where tt1.f1 = ss1.c0; + ?column? + ---------- + (0 rows) + + -- + -- check a case in which a PlaceHolderVar forces join order + -- + explain (verbose, costs off) + select ss2.* from + int4_tbl i41 + left join int8_tbl i8 + join (select i42.f1 as c1, i43.f1 as c2, 42 as c3 + from int4_tbl i42, int4_tbl i43) ss1 + on i8.q1 = ss1.c2 + on i41.f1 = ss1.c1, + lateral (select i41.*, i8.*, ss1.* from text_tbl limit 1) ss2 + where ss1.c2 = 0; + QUERY PLAN + ------------------------------------------------------------------------ + Nested Loop + Output: (i41.f1), (i8.q1), (i8.q2), (i42.f1), (i43.f1), ((42)) + -> Hash Join + Output: i41.f1, i42.f1, i8.q1, i8.q2, i43.f1, 42 + Hash Cond: (i41.f1 = i42.f1) + -> Nested Loop + Output: i8.q1, i8.q2, i43.f1, i41.f1 + -> Nested Loop + Output: i8.q1, i8.q2, i43.f1 + -> Seq Scan on public.int8_tbl i8 + Output: i8.q1, i8.q2 + Filter: (i8.q1 = 0) + -> Seq Scan on public.int4_tbl i43 + Output: i43.f1 + Filter: (i43.f1 = 0) + -> Seq Scan on public.int4_tbl i41 + Output: i41.f1 + -> Hash + Output: i42.f1 + -> Seq Scan on public.int4_tbl i42 + Output: i42.f1 + -> Limit + Output: (i41.f1), (i8.q1), (i8.q2), (i42.f1), (i43.f1), ((42)) + -> Seq Scan on public.text_tbl + Output: i41.f1, i8.q1, i8.q2, i42.f1, i43.f1, (42) + (25 rows) + + select ss2.* from + int4_tbl i41 + left join int8_tbl i8 + join (select i42.f1 as c1, i43.f1 as c2, 42 as c3 + from int4_tbl i42, int4_tbl i43) ss1 + on i8.q1 = ss1.c2 + on i41.f1 = ss1.c1, + lateral (select i41.*, i8.*, ss1.* from text_tbl limit 1) ss2 + where ss1.c2 = 0; + f1 | q1 | q2 | c1 | c2 | c3 + ----+----+----+----+----+---- + (0 rows) + -- -- test ability to push constants through outer join clauses -- *************** select * from *** 4741,4754 **** Output: a.q1, a.q2 -> Nested Loop Output: b.q1, c.q1, LEAST(a.q1, b.q1, c.q1) - Join Filter: (a.q2 = b.q1) -> Seq Scan on public.int8_tbl b Output: b.q1, b.q2 ! -> Materialize ! Output: c.q1 ! -> Seq Scan on public.int8_tbl c ! Output: c.q1 ! (13 rows) select * from int8_tbl a left join lateral --- 4894,4905 ---- Output: a.q1, a.q2 -> Nested Loop Output: b.q1, c.q1, LEAST(a.q1, b.q1, c.q1) -> Seq Scan on public.int8_tbl b Output: b.q1, b.q2 ! Filter: (a.q2 = b.q1) ! -> Seq Scan on public.int8_tbl c ! Output: c.q1, c.q2 ! (11 rows) select * from int8_tbl a left join lateral diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql index fdd4e78..0358d00 100644 *** a/src/test/regress/sql/join.sql --- b/src/test/regress/sql/join.sql *************** select * from *** 1134,1139 **** --- 1134,1198 ---- lateral (select i8.q1, t2.f1 from text_tbl t2 limit 1) as ss where t1.f1 = ss.f1; + explain (verbose, costs off) + select * from + text_tbl t1 + left join int8_tbl i8 + on i8.q2 = 123, + lateral (select i8.q1, t2.f1 from text_tbl t2 limit 1) as ss1, + lateral (select ss1.* from text_tbl t3 limit 1) as ss2 + where t1.f1 = ss2.f1; + + select * from + text_tbl t1 + left join int8_tbl i8 + on i8.q2 = 123, + lateral (select i8.q1, t2.f1 from text_tbl t2 limit 1) as ss1, + lateral (select ss1.* from text_tbl t3 limit 1) as ss2 + where t1.f1 = ss2.f1; + + explain (verbose, costs off) + select 1 from + text_tbl as tt1 + inner join text_tbl as tt2 on (tt1.f1 = 'foo') + left join text_tbl as tt3 on (tt3.f1 = 'foo') + left join text_tbl as tt4 on (tt3.f1 = tt4.f1), + lateral (select tt4.f1 as c0 from text_tbl as tt5 limit 1) as ss1 + where tt1.f1 = ss1.c0; + + select 1 from + text_tbl as tt1 + inner join text_tbl as tt2 on (tt1.f1 = 'foo') + left join text_tbl as tt3 on (tt3.f1 = 'foo') + left join text_tbl as tt4 on (tt3.f1 = tt4.f1), + lateral (select tt4.f1 as c0 from text_tbl as tt5 limit 1) as ss1 + where tt1.f1 = ss1.c0; + + -- + -- check a case in which a PlaceHolderVar forces join order + -- + + explain (verbose, costs off) + select ss2.* from + int4_tbl i41 + left join int8_tbl i8 + join (select i42.f1 as c1, i43.f1 as c2, 42 as c3 + from int4_tbl i42, int4_tbl i43) ss1 + on i8.q1 = ss1.c2 + on i41.f1 = ss1.c1, + lateral (select i41.*, i8.*, ss1.* from text_tbl limit 1) ss2 + where ss1.c2 = 0; + + select ss2.* from + int4_tbl i41 + left join int8_tbl i8 + join (select i42.f1 as c1, i43.f1 as c2, 42 as c3 + from int4_tbl i42, int4_tbl i43) ss1 + on i8.q1 = ss1.c2 + on i41.f1 = ss1.c1, + lateral (select i41.*, i8.*, ss1.* from text_tbl limit 1) ss2 + where ss1.c2 = 0; + -- -- test ability to push constants through outer join clauses --
I wrote: > Ah-hah --- the new check I added in join_is_legal understood about > chains of LATERAL references, but it forgot that we could also have chains > of outer-join ordering constraints. When we're looking to see if joining > on the basis of a LATERAL reference would break some later outer join, we > have to look at outer joins to the outer joins' inner relations, too. > Fixed in the attached. I also stuck all of join_is_legal's > lateral-related checks inside an "if (root->hasLateralRTEs)" block, > which will save some time in typical queries with no LATERAL. That > makes that section of the patch a bit bigger than before, but it's > mostly just reindentation. As threatened, here's a patch on top of that that gets rid of LateralJoinInfo. I'm pretty happy with this, although I have an itchy feeling that we could dispense with the lateral_vars lists too. regards, tom lane diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 26264cb..ba04b72 100644 *** a/src/backend/nodes/copyfuncs.c --- b/src/backend/nodes/copyfuncs.c *************** _copySpecialJoinInfo(const SpecialJoinIn *** 2067,2086 **** } /* - * _copyLateralJoinInfo - */ - static LateralJoinInfo * - _copyLateralJoinInfo(const LateralJoinInfo *from) - { - LateralJoinInfo *newnode = makeNode(LateralJoinInfo); - - COPY_BITMAPSET_FIELD(lateral_lhs); - COPY_BITMAPSET_FIELD(lateral_rhs); - - return newnode; - } - - /* * _copyAppendRelInfo */ static AppendRelInfo * --- 2067,2072 ---- *************** copyObject(const void *from) *** 4519,4527 **** case T_SpecialJoinInfo: retval = _copySpecialJoinInfo(from); break; - case T_LateralJoinInfo: - retval = _copyLateralJoinInfo(from); - break; case T_AppendRelInfo: retval = _copyAppendRelInfo(from); break; --- 4505,4510 ---- diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c index aa6e102..356fcaf 100644 *** a/src/backend/nodes/equalfuncs.c --- b/src/backend/nodes/equalfuncs.c *************** _equalSpecialJoinInfo(const SpecialJoinI *** 846,860 **** } static bool - _equalLateralJoinInfo(const LateralJoinInfo *a, const LateralJoinInfo *b) - { - COMPARE_BITMAPSET_FIELD(lateral_lhs); - COMPARE_BITMAPSET_FIELD(lateral_rhs); - - return true; - } - - static bool _equalAppendRelInfo(const AppendRelInfo *a, const AppendRelInfo *b) { COMPARE_SCALAR_FIELD(parent_relid); --- 846,851 ---- *************** equal(const void *a, const void *b) *** 2860,2868 **** case T_SpecialJoinInfo: retval = _equalSpecialJoinInfo(a, b); break; - case T_LateralJoinInfo: - retval = _equalLateralJoinInfo(a, b); - break; case T_AppendRelInfo: retval = _equalAppendRelInfo(a, b); break; --- 2851,2856 ---- diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index f07c793..63fae82 100644 *** a/src/backend/nodes/outfuncs.c --- b/src/backend/nodes/outfuncs.c *************** _outPlannerInfo(StringInfo str, const Pl *** 1847,1853 **** WRITE_NODE_FIELD(right_join_clauses); WRITE_NODE_FIELD(full_join_clauses); WRITE_NODE_FIELD(join_info_list); - WRITE_NODE_FIELD(lateral_info_list); WRITE_NODE_FIELD(append_rel_list); WRITE_NODE_FIELD(rowMarks); WRITE_NODE_FIELD(placeholder_list); --- 1847,1852 ---- *************** _outRelOptInfo(StringInfo str, const Rel *** 1892,1897 **** --- 1891,1897 ---- WRITE_NODE_FIELD(cheapest_total_path); WRITE_NODE_FIELD(cheapest_unique_path); WRITE_NODE_FIELD(cheapest_parameterized_paths); + WRITE_BITMAPSET_FIELD(direct_lateral_relids); WRITE_BITMAPSET_FIELD(lateral_relids); WRITE_UINT_FIELD(relid); WRITE_OID_FIELD(reltablespace); *************** _outSpecialJoinInfo(StringInfo str, cons *** 2057,2071 **** } static void - _outLateralJoinInfo(StringInfo str, const LateralJoinInfo *node) - { - WRITE_NODE_TYPE("LATERALJOININFO"); - - WRITE_BITMAPSET_FIELD(lateral_lhs); - WRITE_BITMAPSET_FIELD(lateral_rhs); - } - - static void _outAppendRelInfo(StringInfo str, const AppendRelInfo *node) { WRITE_NODE_TYPE("APPENDRELINFO"); --- 2057,2062 ---- *************** _outNode(StringInfo str, const void *obj *** 3355,3363 **** case T_SpecialJoinInfo: _outSpecialJoinInfo(str, obj); break; - case T_LateralJoinInfo: - _outLateralJoinInfo(str, obj); - break; case T_AppendRelInfo: _outAppendRelInfo(str, obj); break; --- 3346,3351 ---- diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c index e57b8e2..2ab650c 100644 *** a/src/backend/optimizer/path/joinrels.c --- b/src/backend/optimizer/path/joinrels.c *************** join_search_one_level(PlannerInfo *root, *** 231,237 **** */ if (joinrels[level] == NIL && root->join_info_list == NIL && ! root->lateral_info_list == NIL) elog(ERROR, "failed to build any %d-way joins", level); } } --- 231,237 ---- */ if (joinrels[level] == NIL && root->join_info_list == NIL && ! !root->hasLateralRTEs) elog(ERROR, "failed to build any %d-way joins", level); } } *************** join_is_legal(PlannerInfo *root, RelOptI *** 554,568 **** match_sjinfo->jointype == JOIN_FULL)) return false; /* not implementable as nestloop */ /* check there is a direct reference from rel2 to rel1 */ ! foreach(l, root->lateral_info_list) ! { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); ! ! if (bms_is_subset(ljinfo->lateral_rhs, rel2->relids) && ! bms_is_subset(ljinfo->lateral_lhs, rel1->relids)) ! break; ! } ! if (l == NULL) return false; /* only indirect refs, so reject */ /* check we won't have a dangerous PHV */ if (have_dangerous_phv(root, rel1->relids, rel2->lateral_relids)) --- 554,560 ---- match_sjinfo->jointype == JOIN_FULL)) return false; /* not implementable as nestloop */ /* check there is a direct reference from rel2 to rel1 */ ! if (!bms_overlap(rel1->relids, rel2->direct_lateral_relids)) return false; /* only indirect refs, so reject */ /* check we won't have a dangerous PHV */ if (have_dangerous_phv(root, rel1->relids, rel2->lateral_relids)) *************** join_is_legal(PlannerInfo *root, RelOptI *** 577,591 **** match_sjinfo->jointype == JOIN_FULL)) return false; /* not implementable as nestloop */ /* check there is a direct reference from rel1 to rel2 */ ! foreach(l, root->lateral_info_list) ! { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); ! ! if (bms_is_subset(ljinfo->lateral_rhs, rel1->relids) && ! bms_is_subset(ljinfo->lateral_lhs, rel2->relids)) ! break; ! } ! if (l == NULL) return false; /* only indirect refs, so reject */ /* check we won't have a dangerous PHV */ if (have_dangerous_phv(root, rel2->relids, rel1->lateral_relids)) --- 569,575 ---- match_sjinfo->jointype == JOIN_FULL)) return false; /* not implementable as nestloop */ /* check there is a direct reference from rel1 to rel2 */ ! if (!bms_overlap(rel2->relids, rel1->direct_lateral_relids)) return false; /* only indirect refs, so reject */ /* check we won't have a dangerous PHV */ if (have_dangerous_phv(root, rel2->relids, rel1->lateral_relids)) *************** have_join_order_restriction(PlannerInfo *** 917,933 **** * If either side has a direct lateral reference to the other, attempt the * join regardless of outer-join considerations. */ ! foreach(l, root->lateral_info_list) ! { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); ! ! if (bms_is_subset(ljinfo->lateral_rhs, rel2->relids) && ! bms_overlap(ljinfo->lateral_lhs, rel1->relids)) ! return true; ! if (bms_is_subset(ljinfo->lateral_rhs, rel1->relids) && ! bms_overlap(ljinfo->lateral_lhs, rel2->relids)) ! return true; ! } /* * Likewise, if both rels are needed to compute some PlaceHolderVar, --- 901,909 ---- * If either side has a direct lateral reference to the other, attempt the * join regardless of outer-join considerations. */ ! if (bms_overlap(rel1->relids, rel2->direct_lateral_relids) || ! bms_overlap(rel2->relids, rel1->direct_lateral_relids)) ! return true; /* * Likewise, if both rels are needed to compute some PlaceHolderVar, diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c index 7912b15..d188d97 100644 *** a/src/backend/optimizer/plan/analyzejoins.c --- b/src/backend/optimizer/plan/analyzejoins.c *************** remove_rel_from_query(PlannerInfo *root, *** 439,447 **** sjinfo->syn_righthand = bms_del_member(sjinfo->syn_righthand, relid); } - /* There shouldn't be any LATERAL info to translate, as yet */ - Assert(root->lateral_info_list == NIL); - /* * Likewise remove references from PlaceHolderVar data structures, * removing any no-longer-needed placeholders entirely. --- 439,444 ---- diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c index e5688ef..a3f79b6 100644 *** a/src/backend/optimizer/plan/initsplan.c --- b/src/backend/optimizer/plan/initsplan.c *************** typedef struct PostponedQual *** 47,53 **** static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel, Index rtindex); - static void add_lateral_info(PlannerInfo *root, Relids lhs, Relids rhs); static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join, Relids *qualscope, Relids *inner_join_rels, --- 47,52 ---- *************** extract_lateral_references(PlannerInfo * *** 382,392 **** /* * create_lateral_join_info ! * For each unflattened LATERAL subquery, create LateralJoinInfo(s) and add ! * them to root->lateral_info_list, and fill in the per-rel lateral_relids ! * and lateral_referencers sets. Also generate LateralJoinInfo(s) to ! * represent any lateral references within PlaceHolderVars (this part deals ! * with the effects of flattened LATERAL subqueries). * * This has to run after deconstruct_jointree, because we need to know the * final ph_eval_at values for PlaceHolderVars. --- 381,388 ---- /* * create_lateral_join_info ! * Fill in the per-base-relation direct_lateral_relids, lateral_relids ! * and lateral_referencers sets. * * This has to run after deconstruct_jointree, because we need to know the * final ph_eval_at values for PlaceHolderVars. *************** extract_lateral_references(PlannerInfo * *** 394,399 **** --- 390,396 ---- void create_lateral_join_info(PlannerInfo *root) { + bool found_laterals = false; Index rti; ListCell *lc; *************** create_lateral_join_info(PlannerInfo *ro *** 430,437 **** { Var *var = (Var *) node; ! add_lateral_info(root, bms_make_singleton(var->varno), ! brel->relids); lateral_relids = bms_add_member(lateral_relids, var->varno); } --- 427,433 ---- { Var *var = (Var *) node; ! found_laterals = true; lateral_relids = bms_add_member(lateral_relids, var->varno); } *************** create_lateral_join_info(PlannerInfo *ro *** 441,447 **** PlaceHolderInfo *phinfo = find_placeholder_info(root, phv, false); ! add_lateral_info(root, phinfo->ph_eval_at, brel->relids); lateral_relids = bms_add_members(lateral_relids, phinfo->ph_eval_at); } --- 437,443 ---- PlaceHolderInfo *phinfo = find_placeholder_info(root, phv, false); ! found_laterals = true; lateral_relids = bms_add_members(lateral_relids, phinfo->ph_eval_at); } *************** create_lateral_join_info(PlannerInfo *ro *** 449,517 **** Assert(false); } ! /* We now have all the direct lateral refs from this rel */ ! brel->lateral_relids = lateral_relids; } /* ! * Now check for lateral references within PlaceHolderVars, and make ! * LateralJoinInfos describing each such reference. Unlike references in ! * unflattened LATERAL RTEs, the referencing location could be a join. * ! * For a PHV that is due to be evaluated at a join, we mark each of the ! * join's member baserels as having the PHV's lateral references too. Even ! * though the baserels could be scanned without considering those lateral ! * refs, we will never be able to form the join except as a path ! * parameterized by the lateral refs, so there is no point in considering ! * unparameterized paths for the baserels; and we mustn't try to join any ! * of those baserels to the lateral refs too soon, either. */ foreach(lc, root->placeholder_list) { PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc); Relids eval_at = phinfo->ph_eval_at; ! if (phinfo->ph_lateral != NULL) ! { ! List *vars = pull_var_clause((Node *) phinfo->ph_var->phexpr, ! PVC_RECURSE_AGGREGATES, ! PVC_INCLUDE_PLACEHOLDERS); ! ListCell *lc2; ! int ev_at; ! ! foreach(lc2, vars) ! { ! Node *node = (Node *) lfirst(lc2); ! ! if (IsA(node, Var)) ! { ! Var *var = (Var *) node; ! ! if (!bms_is_member(var->varno, eval_at)) ! add_lateral_info(root, ! bms_make_singleton(var->varno), ! eval_at); ! } ! else if (IsA(node, PlaceHolderVar)) ! { ! PlaceHolderVar *other_phv = (PlaceHolderVar *) node; ! PlaceHolderInfo *other_phi; ! other_phi = find_placeholder_info(root, other_phv, ! false); ! if (!bms_is_subset(other_phi->ph_eval_at, eval_at)) ! add_lateral_info(root, other_phi->ph_eval_at, eval_at); ! } ! else ! Assert(false); ! } ! list_free(vars); ! ev_at = -1; ! while ((ev_at = bms_next_member(eval_at, ev_at)) >= 0) { ! RelOptInfo *brel = find_base_rel(root, ev_at); brel->lateral_relids = bms_add_members(brel->lateral_relids, phinfo->ph_lateral); --- 445,498 ---- Assert(false); } ! /* We now have all the simple lateral refs from this rel */ ! brel->direct_lateral_relids = lateral_relids; ! brel->lateral_relids = bms_copy(lateral_relids); } /* ! * Now check for lateral references within PlaceHolderVars, and mark their ! * eval_at rels as having lateral references to the source rels. * ! * For a PHV that is due to be evaluated at a baserel, mark its source(s) ! * as direct lateral dependencies of the baserel (adding onto the ones ! * recorded above). If it's due to be evaluated at a join, mark its ! * source(s) as indirect lateral dependencies of each baserel in the join, ! * ie put them into lateral_relids but not direct_lateral_relids. This is ! * appropriate because we can't put any such baserel on the outside of a ! * join to one of the PHV's lateral dependencies, but on the other hand we ! * also can't yet join it directly to the dependency. */ foreach(lc, root->placeholder_list) { PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc); Relids eval_at = phinfo->ph_eval_at; + int varno; ! if (phinfo->ph_lateral == NULL) ! continue; /* PHV is uninteresting if no lateral refs */ ! found_laterals = true; ! if (bms_get_singleton_member(eval_at, &varno)) ! { ! /* Evaluation site is a baserel */ ! RelOptInfo *brel = find_base_rel(root, varno); ! brel->direct_lateral_relids = ! bms_add_members(brel->direct_lateral_relids, ! phinfo->ph_lateral); ! brel->lateral_relids = ! bms_add_members(brel->lateral_relids, ! phinfo->ph_lateral); ! } ! else ! { ! /* Evaluation site is a join */ ! varno = -1; ! while ((varno = bms_next_member(eval_at, varno)) >= 0) { ! RelOptInfo *brel = find_base_rel(root, varno); brel->lateral_relids = bms_add_members(brel->lateral_relids, phinfo->ph_lateral); *************** create_lateral_join_info(PlannerInfo *ro *** 519,535 **** } } ! /* If we found no lateral references, we're done. */ ! if (root->lateral_info_list == NIL) return; /* ! * At this point the lateral_relids sets represent only direct lateral ! * references. Replace them by their transitive closure, so that they ! * describe both direct and indirect lateral references. If relation X ! * references Y laterally, and Y references Z laterally, then we will have ! * to scan X on the inside of a nestloop with Z, so for all intents and ! * purposes X is laterally dependent on Z too. * * This code is essentially Warshall's algorithm for transitive closure. * The outer loop considers each baserel, and propagates its lateral --- 500,521 ---- } } ! /* ! * If we found no actual lateral references, we're done; but reset the ! * hasLateralRTEs flag to avoid useless work later. ! */ ! if (!found_laterals) ! { ! root->hasLateralRTEs = false; return; + } /* ! * Calculate the transitive closure of the lateral_relids sets, so that ! * they describe both direct and indirect lateral references. If relation ! * X references Y laterally, and Y references Z laterally, then we will ! * have to scan X on the inside of a nestloop with Z, so for all intents ! * and purposes X is laterally dependent on Z too. * * This code is essentially Warshall's algorithm for transitive closure. * The outer loop considers each baserel, and propagates its lateral *************** create_lateral_join_info(PlannerInfo *ro *** 623,632 **** if (brel == NULL || brel->reloptkind != RELOPT_BASEREL) continue; - /* - * If it's an appendrel parent, copy its lateral_relids and - * lateral_referencers to each child rel. - */ if (root->simple_rte_array[rti]->inh) { foreach(lc, root->append_rel_list) --- 609,614 ---- *************** create_lateral_join_info(PlannerInfo *ro *** 638,643 **** --- 620,627 ---- continue; childrel = root->simple_rel_array[appinfo->child_relid]; Assert(childrel->reloptkind == RELOPT_OTHER_MEMBER_REL); + Assert(childrel->direct_lateral_relids == NULL); + childrel->direct_lateral_relids = brel->direct_lateral_relids; Assert(childrel->lateral_relids == NULL); childrel->lateral_relids = brel->lateral_relids; Assert(childrel->lateral_referencers == NULL); *************** create_lateral_join_info(PlannerInfo *ro *** 647,692 **** } } - /* - * add_lateral_info - * Add a LateralJoinInfo to root->lateral_info_list, if needed - * - * We suppress redundant list entries. The passed Relids are copied if saved. - */ - static void - add_lateral_info(PlannerInfo *root, Relids lhs, Relids rhs) - { - LateralJoinInfo *ljinfo; - ListCell *lc; - - /* Sanity-check the input */ - Assert(!bms_is_empty(lhs)); - Assert(!bms_is_empty(rhs)); - Assert(!bms_overlap(lhs, rhs)); - - /* - * The input is redundant if it has the same RHS and an LHS that is a - * subset of an existing entry's. If an existing entry has the same RHS - * and an LHS that is a subset of the new one, it's redundant, but we - * don't trouble to get rid of it. The only case that is really worth - * worrying about is identical entries, and we handle that well enough - * with this simple logic. - */ - foreach(lc, root->lateral_info_list) - { - ljinfo = (LateralJoinInfo *) lfirst(lc); - if (bms_equal(rhs, ljinfo->lateral_rhs) && - bms_is_subset(lhs, ljinfo->lateral_lhs)) - return; - } - - /* Not there, so make a new entry */ - ljinfo = makeNode(LateralJoinInfo); - ljinfo->lateral_lhs = bms_copy(lhs); - ljinfo->lateral_rhs = bms_copy(rhs); - root->lateral_info_list = lappend(root->lateral_info_list, ljinfo); - } - /***************************************************************************** * --- 631,636 ---- diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c index a761cfd..02d38f1 100644 *** a/src/backend/optimizer/plan/planagg.c --- b/src/backend/optimizer/plan/planagg.c *************** build_minmax_path(PlannerInfo *root, Min *** 433,441 **** subroot->plan_params = NIL; subroot->outer_params = NULL; subroot->init_plans = NIL; ! /* There shouldn't be any OJ or LATERAL info to translate, as yet */ Assert(subroot->join_info_list == NIL); - Assert(subroot->lateral_info_list == NIL); /* and we haven't created PlaceHolderInfos, either */ Assert(subroot->placeholder_list == NIL); --- 433,440 ---- subroot->plan_params = NIL; subroot->outer_params = NULL; subroot->init_plans = NIL; ! /* There shouldn't be any OJ info to translate, as yet */ Assert(subroot->join_info_list == NIL); /* and we haven't created PlaceHolderInfos, either */ Assert(subroot->placeholder_list == NIL); diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c index d73e7c0..91873d3 100644 *** a/src/backend/optimizer/plan/planmain.c --- b/src/backend/optimizer/plan/planmain.c *************** query_planner(PlannerInfo *root, List *t *** 114,120 **** root->right_join_clauses = NIL; root->full_join_clauses = NIL; root->join_info_list = NIL; - root->lateral_info_list = NIL; root->placeholder_list = NIL; root->initial_rels = NIL; --- 114,119 ---- *************** query_planner(PlannerInfo *root, List *t *** 201,207 **** add_placeholders_to_base_rels(root); /* ! * Create the LateralJoinInfo list now that we have finalized * PlaceHolderVar eval levels and made any necessary additions to the * lateral_vars lists for lateral references within PlaceHolderVars. */ --- 200,206 ---- add_placeholders_to_base_rels(root); /* ! * Construct the lateral reference sets now that we have finalized * PlaceHolderVar eval levels and made any necessary additions to the * lateral_vars lists for lateral references within PlaceHolderVars. */ diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index a9cccee..797df31 100644 *** a/src/backend/optimizer/plan/planner.c --- b/src/backend/optimizer/plan/planner.c *************** inheritance_planner(PlannerInfo *root) *** 1132,1140 **** } } ! /* There shouldn't be any OJ or LATERAL info to translate, as yet */ Assert(subroot.join_info_list == NIL); - Assert(subroot.lateral_info_list == NIL); /* and we haven't created PlaceHolderInfos, either */ Assert(subroot.placeholder_list == NIL); /* hack to mark target relation as an inheritance partition */ --- 1132,1139 ---- } } ! /* There shouldn't be any OJ info to translate, as yet */ Assert(subroot.join_info_list == NIL); /* and we haven't created PlaceHolderInfos, either */ Assert(subroot.placeholder_list == NIL); /* hack to mark target relation as an inheritance partition */ diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index 401ba5b..5d3de14 100644 *** a/src/backend/optimizer/prep/prepjointree.c --- b/src/backend/optimizer/prep/prepjointree.c *************** pull_up_simple_subquery(PlannerInfo *roo *** 1150,1163 **** subroot->append_rel_list); /* ! * We don't have to do the equivalent bookkeeping for outer-join or ! * LATERAL info, because that hasn't been set up yet. placeholder_list ! * likewise. */ Assert(root->join_info_list == NIL); Assert(subroot->join_info_list == NIL); - Assert(root->lateral_info_list == NIL); - Assert(subroot->lateral_info_list == NIL); Assert(root->placeholder_list == NIL); Assert(subroot->placeholder_list == NIL); --- 1150,1160 ---- subroot->append_rel_list); /* ! * We don't have to do the equivalent bookkeeping for outer-join info, ! * because that hasn't been set up yet. placeholder_list likewise. */ Assert(root->join_info_list == NIL); Assert(subroot->join_info_list == NIL); Assert(root->placeholder_list == NIL); Assert(subroot->placeholder_list == NIL); *************** pull_up_simple_values(PlannerInfo *root, *** 1642,1648 **** Assert(root->append_rel_list == NIL); Assert(list_length(parse->rtable) == 1); Assert(root->join_info_list == NIL); - Assert(root->lateral_info_list == NIL); Assert(root->placeholder_list == NIL); /* --- 1639,1644 ---- *************** substitute_multiple_relids_walker(Node * *** 2839,2845 **** } /* Shouldn't need to handle planner auxiliary nodes here */ Assert(!IsA(node, SpecialJoinInfo)); - Assert(!IsA(node, LateralJoinInfo)); Assert(!IsA(node, AppendRelInfo)); Assert(!IsA(node, PlaceHolderInfo)); Assert(!IsA(node, MinMaxAggInfo)); --- 2835,2840 ---- diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index 8884fb1..2e55131 100644 *** a/src/backend/optimizer/prep/prepunion.c --- b/src/backend/optimizer/prep/prepunion.c *************** adjust_appendrel_attrs_mutator(Node *nod *** 1786,1792 **** } /* Shouldn't need to handle planner auxiliary nodes here */ Assert(!IsA(node, SpecialJoinInfo)); - Assert(!IsA(node, LateralJoinInfo)); Assert(!IsA(node, AppendRelInfo)); Assert(!IsA(node, PlaceHolderInfo)); Assert(!IsA(node, MinMaxAggInfo)); --- 1786,1791 ---- diff --git a/src/backend/optimizer/util/placeholder.c b/src/backend/optimizer/util/placeholder.c index 2870315..009ba69 100644 *** a/src/backend/optimizer/util/placeholder.c --- b/src/backend/optimizer/util/placeholder.c *************** add_placeholders_to_base_rels(PlannerInf *** 436,442 **** /* * add_placeholders_to_joinrel ! * Add any required PlaceHolderVars to a join rel's targetlist. * * A join rel should emit a PlaceHolderVar if (a) the PHV is needed above * this join level and (b) the PHV can be computed at or below this level. --- 436,444 ---- /* * add_placeholders_to_joinrel ! * Add any required PlaceHolderVars to a join rel's targetlist; ! * and if they contain lateral references, add those references to the ! * joinrel's direct_lateral_relids. * * A join rel should emit a PlaceHolderVar if (a) the PHV is needed above * this join level and (b) the PHV can be computed at or below this level. *************** add_placeholders_to_joinrel(PlannerInfo *** 463,468 **** --- 465,474 ---- joinrel->reltargetlist = lappend(joinrel->reltargetlist, phinfo->ph_var); joinrel->width += phinfo->ph_width; + /* Adjust joinrel's direct_lateral_relids as needed */ + joinrel->direct_lateral_relids = + bms_add_members(joinrel->direct_lateral_relids, + phinfo->ph_lateral); } } } diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c index a2be2ed..f2bdfcc 100644 *** a/src/backend/optimizer/util/relnode.c --- b/src/backend/optimizer/util/relnode.c *************** build_simple_rel(PlannerInfo *root, int *** 111,116 **** --- 111,117 ---- rel->cheapest_total_path = NULL; rel->cheapest_unique_path = NULL; rel->cheapest_parameterized_paths = NIL; + rel->direct_lateral_relids = NULL; rel->lateral_relids = NULL; rel->relid = relid; rel->rtekind = rte->rtekind; *************** build_join_rel(PlannerInfo *root, *** 373,378 **** --- 374,383 ---- joinrel->cheapest_total_path = NULL; joinrel->cheapest_unique_path = NULL; joinrel->cheapest_parameterized_paths = NIL; + /* init direct_lateral_relids from children; we'll finish it up below */ + joinrel->direct_lateral_relids = + bms_union(outer_rel->direct_lateral_relids, + inner_rel->direct_lateral_relids); joinrel->lateral_relids = min_join_parameterization(root, joinrel->relids, outer_rel, inner_rel); joinrel->relid = 0; /* indicates not a baserel */ *************** build_join_rel(PlannerInfo *root, *** 423,428 **** --- 428,445 ---- add_placeholders_to_joinrel(root, joinrel); /* + * add_placeholders_to_joinrel also took care of adding the ph_lateral + * sets of any PlaceHolderVars computed here to direct_lateral_relids, so + * now we can finish computing that. This is much like the computation of + * the transitively-closed lateral_relids in min_join_parameterization, + * except that here we *do* have to consider the added PHVs. + */ + joinrel->direct_lateral_relids = + bms_del_members(joinrel->direct_lateral_relids, joinrel->relids); + if (bms_is_empty(joinrel->direct_lateral_relids)) + joinrel->direct_lateral_relids = NULL; + + /* * Construct restrict and join clause lists for the new joinrel. (The * caller might or might not need the restrictlist, but I need it anyway * for set_joinrel_size_estimates().) diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c index 773e7b2..32038ce 100644 *** a/src/backend/optimizer/util/var.c --- b/src/backend/optimizer/util/var.c *************** flatten_join_alias_vars_mutator(Node *no *** 782,788 **** Assert(!IsA(node, SubPlan)); /* Shouldn't need to handle these planner auxiliary nodes here */ Assert(!IsA(node, SpecialJoinInfo)); - Assert(!IsA(node, LateralJoinInfo)); Assert(!IsA(node, PlaceHolderInfo)); Assert(!IsA(node, MinMaxAggInfo)); --- 782,787 ---- diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c index 1da90ff..6f22168 100644 *** a/src/backend/rewrite/rewriteManip.c --- b/src/backend/rewrite/rewriteManip.c *************** OffsetVarNodes_walker(Node *node, Offset *** 402,408 **** /* Shouldn't need to handle other planner auxiliary nodes here */ Assert(!IsA(node, PlanRowMark)); Assert(!IsA(node, SpecialJoinInfo)); - Assert(!IsA(node, LateralJoinInfo)); Assert(!IsA(node, PlaceHolderInfo)); Assert(!IsA(node, MinMaxAggInfo)); --- 402,407 ---- *************** ChangeVarNodes_walker(Node *node, Change *** 586,592 **** } /* Shouldn't need to handle other planner auxiliary nodes here */ Assert(!IsA(node, SpecialJoinInfo)); - Assert(!IsA(node, LateralJoinInfo)); Assert(!IsA(node, PlaceHolderInfo)); Assert(!IsA(node, MinMaxAggInfo)); --- 585,590 ---- *************** rangeTableEntry_used_walker(Node *node, *** 868,874 **** Assert(!IsA(node, PlaceHolderVar)); Assert(!IsA(node, PlanRowMark)); Assert(!IsA(node, SpecialJoinInfo)); - Assert(!IsA(node, LateralJoinInfo)); Assert(!IsA(node, AppendRelInfo)); Assert(!IsA(node, PlaceHolderInfo)); Assert(!IsA(node, MinMaxAggInfo)); --- 866,871 ---- diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index 94bdb7c..603edd3 100644 *** a/src/include/nodes/nodes.h --- b/src/include/nodes/nodes.h *************** typedef enum NodeTag *** 247,253 **** T_RestrictInfo, T_PlaceHolderVar, T_SpecialJoinInfo, - T_LateralJoinInfo, T_AppendRelInfo, T_PlaceHolderInfo, T_MinMaxAggInfo, --- 247,252 ---- diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h index 74f4daf..5393005 100644 *** a/src/include/nodes/relation.h --- b/src/include/nodes/relation.h *************** typedef struct PlannerInfo *** 226,233 **** List *join_info_list; /* list of SpecialJoinInfos */ - List *lateral_info_list; /* list of LateralJoinInfos */ - List *append_rel_list; /* list of AppendRelInfos */ List *rowMarks; /* list of PlanRowMarks */ --- 226,231 ---- *************** typedef struct PlannerInfo *** 357,362 **** --- 355,361 ---- * (no duplicates) output from relation; NULL if not yet requested * cheapest_parameterized_paths - best paths for their parameterizations; * always includes cheapest_total_path, even if that's unparameterized + * direct_lateral_relids - rels this rel has direct LATERAL references to * lateral_relids - required outer rels for LATERAL, as a Relids set * (includes both direct and indirect lateral references) * *************** typedef struct PlannerInfo *** 374,379 **** --- 373,379 ---- * lateral_vars - lateral cross-references of rel, if any (list of * Vars and PlaceHolderVars) * lateral_referencers - relids of rels that reference this one laterally + * (includes both direct and indirect lateral references) * indexlist - list of IndexOptInfo nodes for relation's indexes * (always NIL if it's not a table) * pages - number of disk pages in relation (zero if not a table) *************** typedef struct RelOptInfo *** 465,470 **** --- 465,471 ---- /* parameterization information needed for both base rels and join rels */ /* (see also lateral_vars and lateral_referencers) */ + Relids direct_lateral_relids; /* rels directly laterally referenced */ Relids lateral_relids; /* minimum parameterization of rel */ /* information about a base rel (not set for join rels!) */ *************** typedef struct SpecialJoinInfo *** 1461,1503 **** } SpecialJoinInfo; /* - * "Lateral join" info. - * - * Lateral references constrain the join order in a way that's somewhat like - * outer joins, though different in detail. We construct a LateralJoinInfo - * for each lateral cross-reference, placing them in the PlannerInfo node's - * lateral_info_list. - * - * For unflattened LATERAL RTEs, we generate LateralJoinInfo(s) in which - * lateral_rhs is the relid of the LATERAL baserel, and lateral_lhs is a set - * of relids of baserels it references, all of which must be present on the - * LHS to compute a parameter needed by the RHS. Typically, lateral_lhs is - * a singleton, but it can include multiple rels if the RHS references a - * PlaceHolderVar with a multi-rel ph_eval_at level. We disallow joining to - * only part of the LHS in such cases, since that would result in a join tree - * with no convenient place to compute the PHV. - * - * When an appendrel contains lateral references (eg "LATERAL (SELECT x.col1 - * UNION ALL SELECT y.col2)"), the LateralJoinInfos reference the parent - * baserel not the member otherrels, since it is the parent relid that is - * considered for joining purposes. - * - * If any LATERAL RTEs were flattened into the parent query, it is possible - * that the query now contains PlaceHolderVars containing lateral references, - * representing expressions that need to be evaluated at particular spots in - * the jointree but contain lateral references to Vars from elsewhere. These - * give rise to LateralJoinInfos in which lateral_rhs is the evaluation point - * of a PlaceHolderVar and lateral_lhs is the set of lateral rels it needs. - */ - - typedef struct LateralJoinInfo - { - NodeTag type; - Relids lateral_lhs; /* rels needed to compute a lateral value */ - Relids lateral_rhs; /* rel where lateral value is needed */ - } LateralJoinInfo; - - /* * Append-relation info. * * When we expand an inheritable table or a UNION-ALL subselect into an --- 1462,1467 ----
I wrote: > As threatened, here's a patch on top of that that gets rid of > LateralJoinInfo. I'm pretty happy with this, although I have an > itchy feeling that we could dispense with the lateral_vars lists too. I experimented with that and figured out what was bothering me: there is no need for add_placeholders_to_base_rels to fool around with adding items to lateral_vars anymore, because the code in create_lateral_join_info that adds placeholders' ph_lateral bits to the evaluation locations' lateral_relids sets now does everything that we needed to have happen. We don't care exactly what's inside the PHV expression, only which rels it comes from, and the ph_lateral bitmap is sufficient for that. We still need the lateral_vars field in RelOptInfo, but it now exists purely to carry the results of extract_lateral_references forward to create_lateral_join_info. (We wouldn't need even that so far as Vars are concerned, because we could just add their source rel to the referencing rel's lateral_relids on-the-fly in extract_lateral_references. But we can't handle PHVs that way, because at that stage we don't know precisely where they'll be evaluated, so we don't know what to add to the referencer's lateral_relids.) Updated patch attached; compared to the previous one, this just removes a big chunk of code in add_placeholders_to_base_rels and updates a couple of related comments. regards, tom lane diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 26264cb..ba04b72 100644 *** a/src/backend/nodes/copyfuncs.c --- b/src/backend/nodes/copyfuncs.c *************** _copySpecialJoinInfo(const SpecialJoinIn *** 2067,2086 **** } /* - * _copyLateralJoinInfo - */ - static LateralJoinInfo * - _copyLateralJoinInfo(const LateralJoinInfo *from) - { - LateralJoinInfo *newnode = makeNode(LateralJoinInfo); - - COPY_BITMAPSET_FIELD(lateral_lhs); - COPY_BITMAPSET_FIELD(lateral_rhs); - - return newnode; - } - - /* * _copyAppendRelInfo */ static AppendRelInfo * --- 2067,2072 ---- *************** copyObject(const void *from) *** 4519,4527 **** case T_SpecialJoinInfo: retval = _copySpecialJoinInfo(from); break; - case T_LateralJoinInfo: - retval = _copyLateralJoinInfo(from); - break; case T_AppendRelInfo: retval = _copyAppendRelInfo(from); break; --- 4505,4510 ---- diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c index aa6e102..356fcaf 100644 *** a/src/backend/nodes/equalfuncs.c --- b/src/backend/nodes/equalfuncs.c *************** _equalSpecialJoinInfo(const SpecialJoinI *** 846,860 **** } static bool - _equalLateralJoinInfo(const LateralJoinInfo *a, const LateralJoinInfo *b) - { - COMPARE_BITMAPSET_FIELD(lateral_lhs); - COMPARE_BITMAPSET_FIELD(lateral_rhs); - - return true; - } - - static bool _equalAppendRelInfo(const AppendRelInfo *a, const AppendRelInfo *b) { COMPARE_SCALAR_FIELD(parent_relid); --- 846,851 ---- *************** equal(const void *a, const void *b) *** 2860,2868 **** case T_SpecialJoinInfo: retval = _equalSpecialJoinInfo(a, b); break; - case T_LateralJoinInfo: - retval = _equalLateralJoinInfo(a, b); - break; case T_AppendRelInfo: retval = _equalAppendRelInfo(a, b); break; --- 2851,2856 ---- diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index f07c793..63fae82 100644 *** a/src/backend/nodes/outfuncs.c --- b/src/backend/nodes/outfuncs.c *************** _outPlannerInfo(StringInfo str, const Pl *** 1847,1853 **** WRITE_NODE_FIELD(right_join_clauses); WRITE_NODE_FIELD(full_join_clauses); WRITE_NODE_FIELD(join_info_list); - WRITE_NODE_FIELD(lateral_info_list); WRITE_NODE_FIELD(append_rel_list); WRITE_NODE_FIELD(rowMarks); WRITE_NODE_FIELD(placeholder_list); --- 1847,1852 ---- *************** _outRelOptInfo(StringInfo str, const Rel *** 1892,1897 **** --- 1891,1897 ---- WRITE_NODE_FIELD(cheapest_total_path); WRITE_NODE_FIELD(cheapest_unique_path); WRITE_NODE_FIELD(cheapest_parameterized_paths); + WRITE_BITMAPSET_FIELD(direct_lateral_relids); WRITE_BITMAPSET_FIELD(lateral_relids); WRITE_UINT_FIELD(relid); WRITE_OID_FIELD(reltablespace); *************** _outSpecialJoinInfo(StringInfo str, cons *** 2057,2071 **** } static void - _outLateralJoinInfo(StringInfo str, const LateralJoinInfo *node) - { - WRITE_NODE_TYPE("LATERALJOININFO"); - - WRITE_BITMAPSET_FIELD(lateral_lhs); - WRITE_BITMAPSET_FIELD(lateral_rhs); - } - - static void _outAppendRelInfo(StringInfo str, const AppendRelInfo *node) { WRITE_NODE_TYPE("APPENDRELINFO"); --- 2057,2062 ---- *************** _outNode(StringInfo str, const void *obj *** 3355,3363 **** case T_SpecialJoinInfo: _outSpecialJoinInfo(str, obj); break; - case T_LateralJoinInfo: - _outLateralJoinInfo(str, obj); - break; case T_AppendRelInfo: _outAppendRelInfo(str, obj); break; --- 3346,3351 ---- diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c index e57b8e2..2ab650c 100644 *** a/src/backend/optimizer/path/joinrels.c --- b/src/backend/optimizer/path/joinrels.c *************** join_search_one_level(PlannerInfo *root, *** 231,237 **** */ if (joinrels[level] == NIL && root->join_info_list == NIL && ! root->lateral_info_list == NIL) elog(ERROR, "failed to build any %d-way joins", level); } } --- 231,237 ---- */ if (joinrels[level] == NIL && root->join_info_list == NIL && ! !root->hasLateralRTEs) elog(ERROR, "failed to build any %d-way joins", level); } } *************** join_is_legal(PlannerInfo *root, RelOptI *** 554,568 **** match_sjinfo->jointype == JOIN_FULL)) return false; /* not implementable as nestloop */ /* check there is a direct reference from rel2 to rel1 */ ! foreach(l, root->lateral_info_list) ! { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); ! ! if (bms_is_subset(ljinfo->lateral_rhs, rel2->relids) && ! bms_is_subset(ljinfo->lateral_lhs, rel1->relids)) ! break; ! } ! if (l == NULL) return false; /* only indirect refs, so reject */ /* check we won't have a dangerous PHV */ if (have_dangerous_phv(root, rel1->relids, rel2->lateral_relids)) --- 554,560 ---- match_sjinfo->jointype == JOIN_FULL)) return false; /* not implementable as nestloop */ /* check there is a direct reference from rel2 to rel1 */ ! if (!bms_overlap(rel1->relids, rel2->direct_lateral_relids)) return false; /* only indirect refs, so reject */ /* check we won't have a dangerous PHV */ if (have_dangerous_phv(root, rel1->relids, rel2->lateral_relids)) *************** join_is_legal(PlannerInfo *root, RelOptI *** 577,591 **** match_sjinfo->jointype == JOIN_FULL)) return false; /* not implementable as nestloop */ /* check there is a direct reference from rel1 to rel2 */ ! foreach(l, root->lateral_info_list) ! { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); ! ! if (bms_is_subset(ljinfo->lateral_rhs, rel1->relids) && ! bms_is_subset(ljinfo->lateral_lhs, rel2->relids)) ! break; ! } ! if (l == NULL) return false; /* only indirect refs, so reject */ /* check we won't have a dangerous PHV */ if (have_dangerous_phv(root, rel2->relids, rel1->lateral_relids)) --- 569,575 ---- match_sjinfo->jointype == JOIN_FULL)) return false; /* not implementable as nestloop */ /* check there is a direct reference from rel1 to rel2 */ ! if (!bms_overlap(rel2->relids, rel1->direct_lateral_relids)) return false; /* only indirect refs, so reject */ /* check we won't have a dangerous PHV */ if (have_dangerous_phv(root, rel2->relids, rel1->lateral_relids)) *************** have_join_order_restriction(PlannerInfo *** 917,933 **** * If either side has a direct lateral reference to the other, attempt the * join regardless of outer-join considerations. */ ! foreach(l, root->lateral_info_list) ! { ! LateralJoinInfo *ljinfo = (LateralJoinInfo *) lfirst(l); ! ! if (bms_is_subset(ljinfo->lateral_rhs, rel2->relids) && ! bms_overlap(ljinfo->lateral_lhs, rel1->relids)) ! return true; ! if (bms_is_subset(ljinfo->lateral_rhs, rel1->relids) && ! bms_overlap(ljinfo->lateral_lhs, rel2->relids)) ! return true; ! } /* * Likewise, if both rels are needed to compute some PlaceHolderVar, --- 901,909 ---- * If either side has a direct lateral reference to the other, attempt the * join regardless of outer-join considerations. */ ! if (bms_overlap(rel1->relids, rel2->direct_lateral_relids) || ! bms_overlap(rel2->relids, rel1->direct_lateral_relids)) ! return true; /* * Likewise, if both rels are needed to compute some PlaceHolderVar, diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c index 7912b15..d188d97 100644 *** a/src/backend/optimizer/plan/analyzejoins.c --- b/src/backend/optimizer/plan/analyzejoins.c *************** remove_rel_from_query(PlannerInfo *root, *** 439,447 **** sjinfo->syn_righthand = bms_del_member(sjinfo->syn_righthand, relid); } - /* There shouldn't be any LATERAL info to translate, as yet */ - Assert(root->lateral_info_list == NIL); - /* * Likewise remove references from PlaceHolderVar data structures, * removing any no-longer-needed placeholders entirely. --- 439,444 ---- diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c index e5688ef..a3f79b6 100644 *** a/src/backend/optimizer/plan/initsplan.c --- b/src/backend/optimizer/plan/initsplan.c *************** typedef struct PostponedQual *** 47,53 **** static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel, Index rtindex); - static void add_lateral_info(PlannerInfo *root, Relids lhs, Relids rhs); static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join, Relids *qualscope, Relids *inner_join_rels, --- 47,52 ---- *************** extract_lateral_references(PlannerInfo * *** 382,392 **** /* * create_lateral_join_info ! * For each unflattened LATERAL subquery, create LateralJoinInfo(s) and add ! * them to root->lateral_info_list, and fill in the per-rel lateral_relids ! * and lateral_referencers sets. Also generate LateralJoinInfo(s) to ! * represent any lateral references within PlaceHolderVars (this part deals ! * with the effects of flattened LATERAL subqueries). * * This has to run after deconstruct_jointree, because we need to know the * final ph_eval_at values for PlaceHolderVars. --- 381,388 ---- /* * create_lateral_join_info ! * Fill in the per-base-relation direct_lateral_relids, lateral_relids ! * and lateral_referencers sets. * * This has to run after deconstruct_jointree, because we need to know the * final ph_eval_at values for PlaceHolderVars. *************** extract_lateral_references(PlannerInfo * *** 394,399 **** --- 390,396 ---- void create_lateral_join_info(PlannerInfo *root) { + bool found_laterals = false; Index rti; ListCell *lc; *************** create_lateral_join_info(PlannerInfo *ro *** 430,437 **** { Var *var = (Var *) node; ! add_lateral_info(root, bms_make_singleton(var->varno), ! brel->relids); lateral_relids = bms_add_member(lateral_relids, var->varno); } --- 427,433 ---- { Var *var = (Var *) node; ! found_laterals = true; lateral_relids = bms_add_member(lateral_relids, var->varno); } *************** create_lateral_join_info(PlannerInfo *ro *** 441,447 **** PlaceHolderInfo *phinfo = find_placeholder_info(root, phv, false); ! add_lateral_info(root, phinfo->ph_eval_at, brel->relids); lateral_relids = bms_add_members(lateral_relids, phinfo->ph_eval_at); } --- 437,443 ---- PlaceHolderInfo *phinfo = find_placeholder_info(root, phv, false); ! found_laterals = true; lateral_relids = bms_add_members(lateral_relids, phinfo->ph_eval_at); } *************** create_lateral_join_info(PlannerInfo *ro *** 449,517 **** Assert(false); } ! /* We now have all the direct lateral refs from this rel */ ! brel->lateral_relids = lateral_relids; } /* ! * Now check for lateral references within PlaceHolderVars, and make ! * LateralJoinInfos describing each such reference. Unlike references in ! * unflattened LATERAL RTEs, the referencing location could be a join. * ! * For a PHV that is due to be evaluated at a join, we mark each of the ! * join's member baserels as having the PHV's lateral references too. Even ! * though the baserels could be scanned without considering those lateral ! * refs, we will never be able to form the join except as a path ! * parameterized by the lateral refs, so there is no point in considering ! * unparameterized paths for the baserels; and we mustn't try to join any ! * of those baserels to the lateral refs too soon, either. */ foreach(lc, root->placeholder_list) { PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc); Relids eval_at = phinfo->ph_eval_at; ! if (phinfo->ph_lateral != NULL) ! { ! List *vars = pull_var_clause((Node *) phinfo->ph_var->phexpr, ! PVC_RECURSE_AGGREGATES, ! PVC_INCLUDE_PLACEHOLDERS); ! ListCell *lc2; ! int ev_at; ! ! foreach(lc2, vars) ! { ! Node *node = (Node *) lfirst(lc2); ! ! if (IsA(node, Var)) ! { ! Var *var = (Var *) node; ! ! if (!bms_is_member(var->varno, eval_at)) ! add_lateral_info(root, ! bms_make_singleton(var->varno), ! eval_at); ! } ! else if (IsA(node, PlaceHolderVar)) ! { ! PlaceHolderVar *other_phv = (PlaceHolderVar *) node; ! PlaceHolderInfo *other_phi; ! other_phi = find_placeholder_info(root, other_phv, ! false); ! if (!bms_is_subset(other_phi->ph_eval_at, eval_at)) ! add_lateral_info(root, other_phi->ph_eval_at, eval_at); ! } ! else ! Assert(false); ! } ! list_free(vars); ! ev_at = -1; ! while ((ev_at = bms_next_member(eval_at, ev_at)) >= 0) { ! RelOptInfo *brel = find_base_rel(root, ev_at); brel->lateral_relids = bms_add_members(brel->lateral_relids, phinfo->ph_lateral); --- 445,498 ---- Assert(false); } ! /* We now have all the simple lateral refs from this rel */ ! brel->direct_lateral_relids = lateral_relids; ! brel->lateral_relids = bms_copy(lateral_relids); } /* ! * Now check for lateral references within PlaceHolderVars, and mark their ! * eval_at rels as having lateral references to the source rels. * ! * For a PHV that is due to be evaluated at a baserel, mark its source(s) ! * as direct lateral dependencies of the baserel (adding onto the ones ! * recorded above). If it's due to be evaluated at a join, mark its ! * source(s) as indirect lateral dependencies of each baserel in the join, ! * ie put them into lateral_relids but not direct_lateral_relids. This is ! * appropriate because we can't put any such baserel on the outside of a ! * join to one of the PHV's lateral dependencies, but on the other hand we ! * also can't yet join it directly to the dependency. */ foreach(lc, root->placeholder_list) { PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc); Relids eval_at = phinfo->ph_eval_at; + int varno; ! if (phinfo->ph_lateral == NULL) ! continue; /* PHV is uninteresting if no lateral refs */ ! found_laterals = true; ! if (bms_get_singleton_member(eval_at, &varno)) ! { ! /* Evaluation site is a baserel */ ! RelOptInfo *brel = find_base_rel(root, varno); ! brel->direct_lateral_relids = ! bms_add_members(brel->direct_lateral_relids, ! phinfo->ph_lateral); ! brel->lateral_relids = ! bms_add_members(brel->lateral_relids, ! phinfo->ph_lateral); ! } ! else ! { ! /* Evaluation site is a join */ ! varno = -1; ! while ((varno = bms_next_member(eval_at, varno)) >= 0) { ! RelOptInfo *brel = find_base_rel(root, varno); brel->lateral_relids = bms_add_members(brel->lateral_relids, phinfo->ph_lateral); *************** create_lateral_join_info(PlannerInfo *ro *** 519,535 **** } } ! /* If we found no lateral references, we're done. */ ! if (root->lateral_info_list == NIL) return; /* ! * At this point the lateral_relids sets represent only direct lateral ! * references. Replace them by their transitive closure, so that they ! * describe both direct and indirect lateral references. If relation X ! * references Y laterally, and Y references Z laterally, then we will have ! * to scan X on the inside of a nestloop with Z, so for all intents and ! * purposes X is laterally dependent on Z too. * * This code is essentially Warshall's algorithm for transitive closure. * The outer loop considers each baserel, and propagates its lateral --- 500,521 ---- } } ! /* ! * If we found no actual lateral references, we're done; but reset the ! * hasLateralRTEs flag to avoid useless work later. ! */ ! if (!found_laterals) ! { ! root->hasLateralRTEs = false; return; + } /* ! * Calculate the transitive closure of the lateral_relids sets, so that ! * they describe both direct and indirect lateral references. If relation ! * X references Y laterally, and Y references Z laterally, then we will ! * have to scan X on the inside of a nestloop with Z, so for all intents ! * and purposes X is laterally dependent on Z too. * * This code is essentially Warshall's algorithm for transitive closure. * The outer loop considers each baserel, and propagates its lateral *************** create_lateral_join_info(PlannerInfo *ro *** 623,632 **** if (brel == NULL || brel->reloptkind != RELOPT_BASEREL) continue; - /* - * If it's an appendrel parent, copy its lateral_relids and - * lateral_referencers to each child rel. - */ if (root->simple_rte_array[rti]->inh) { foreach(lc, root->append_rel_list) --- 609,614 ---- *************** create_lateral_join_info(PlannerInfo *ro *** 638,643 **** --- 620,627 ---- continue; childrel = root->simple_rel_array[appinfo->child_relid]; Assert(childrel->reloptkind == RELOPT_OTHER_MEMBER_REL); + Assert(childrel->direct_lateral_relids == NULL); + childrel->direct_lateral_relids = brel->direct_lateral_relids; Assert(childrel->lateral_relids == NULL); childrel->lateral_relids = brel->lateral_relids; Assert(childrel->lateral_referencers == NULL); *************** create_lateral_join_info(PlannerInfo *ro *** 647,692 **** } } - /* - * add_lateral_info - * Add a LateralJoinInfo to root->lateral_info_list, if needed - * - * We suppress redundant list entries. The passed Relids are copied if saved. - */ - static void - add_lateral_info(PlannerInfo *root, Relids lhs, Relids rhs) - { - LateralJoinInfo *ljinfo; - ListCell *lc; - - /* Sanity-check the input */ - Assert(!bms_is_empty(lhs)); - Assert(!bms_is_empty(rhs)); - Assert(!bms_overlap(lhs, rhs)); - - /* - * The input is redundant if it has the same RHS and an LHS that is a - * subset of an existing entry's. If an existing entry has the same RHS - * and an LHS that is a subset of the new one, it's redundant, but we - * don't trouble to get rid of it. The only case that is really worth - * worrying about is identical entries, and we handle that well enough - * with this simple logic. - */ - foreach(lc, root->lateral_info_list) - { - ljinfo = (LateralJoinInfo *) lfirst(lc); - if (bms_equal(rhs, ljinfo->lateral_rhs) && - bms_is_subset(lhs, ljinfo->lateral_lhs)) - return; - } - - /* Not there, so make a new entry */ - ljinfo = makeNode(LateralJoinInfo); - ljinfo->lateral_lhs = bms_copy(lhs); - ljinfo->lateral_rhs = bms_copy(rhs); - root->lateral_info_list = lappend(root->lateral_info_list, ljinfo); - } - /***************************************************************************** * --- 631,636 ---- diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c index a761cfd..02d38f1 100644 *** a/src/backend/optimizer/plan/planagg.c --- b/src/backend/optimizer/plan/planagg.c *************** build_minmax_path(PlannerInfo *root, Min *** 433,441 **** subroot->plan_params = NIL; subroot->outer_params = NULL; subroot->init_plans = NIL; ! /* There shouldn't be any OJ or LATERAL info to translate, as yet */ Assert(subroot->join_info_list == NIL); - Assert(subroot->lateral_info_list == NIL); /* and we haven't created PlaceHolderInfos, either */ Assert(subroot->placeholder_list == NIL); --- 433,440 ---- subroot->plan_params = NIL; subroot->outer_params = NULL; subroot->init_plans = NIL; ! /* There shouldn't be any OJ info to translate, as yet */ Assert(subroot->join_info_list == NIL); /* and we haven't created PlaceHolderInfos, either */ Assert(subroot->placeholder_list == NIL); diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c index d73e7c0..894f968 100644 *** a/src/backend/optimizer/plan/planmain.c --- b/src/backend/optimizer/plan/planmain.c *************** query_planner(PlannerInfo *root, List *t *** 114,120 **** root->right_join_clauses = NIL; root->full_join_clauses = NIL; root->join_info_list = NIL; - root->lateral_info_list = NIL; root->placeholder_list = NIL; root->initial_rels = NIL; --- 114,119 ---- *************** query_planner(PlannerInfo *root, List *t *** 201,209 **** add_placeholders_to_base_rels(root); /* ! * Create the LateralJoinInfo list now that we have finalized ! * PlaceHolderVar eval levels and made any necessary additions to the ! * lateral_vars lists for lateral references within PlaceHolderVars. */ create_lateral_join_info(root); --- 200,207 ---- add_placeholders_to_base_rels(root); /* ! * Construct the lateral reference sets now that we have finalized ! * PlaceHolderVar eval levels. */ create_lateral_join_info(root); diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index a9cccee..797df31 100644 *** a/src/backend/optimizer/plan/planner.c --- b/src/backend/optimizer/plan/planner.c *************** inheritance_planner(PlannerInfo *root) *** 1132,1140 **** } } ! /* There shouldn't be any OJ or LATERAL info to translate, as yet */ Assert(subroot.join_info_list == NIL); - Assert(subroot.lateral_info_list == NIL); /* and we haven't created PlaceHolderInfos, either */ Assert(subroot.placeholder_list == NIL); /* hack to mark target relation as an inheritance partition */ --- 1132,1139 ---- } } ! /* There shouldn't be any OJ info to translate, as yet */ Assert(subroot.join_info_list == NIL); /* and we haven't created PlaceHolderInfos, either */ Assert(subroot.placeholder_list == NIL); /* hack to mark target relation as an inheritance partition */ diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index 401ba5b..5d3de14 100644 *** a/src/backend/optimizer/prep/prepjointree.c --- b/src/backend/optimizer/prep/prepjointree.c *************** pull_up_simple_subquery(PlannerInfo *roo *** 1150,1163 **** subroot->append_rel_list); /* ! * We don't have to do the equivalent bookkeeping for outer-join or ! * LATERAL info, because that hasn't been set up yet. placeholder_list ! * likewise. */ Assert(root->join_info_list == NIL); Assert(subroot->join_info_list == NIL); - Assert(root->lateral_info_list == NIL); - Assert(subroot->lateral_info_list == NIL); Assert(root->placeholder_list == NIL); Assert(subroot->placeholder_list == NIL); --- 1150,1160 ---- subroot->append_rel_list); /* ! * We don't have to do the equivalent bookkeeping for outer-join info, ! * because that hasn't been set up yet. placeholder_list likewise. */ Assert(root->join_info_list == NIL); Assert(subroot->join_info_list == NIL); Assert(root->placeholder_list == NIL); Assert(subroot->placeholder_list == NIL); *************** pull_up_simple_values(PlannerInfo *root, *** 1642,1648 **** Assert(root->append_rel_list == NIL); Assert(list_length(parse->rtable) == 1); Assert(root->join_info_list == NIL); - Assert(root->lateral_info_list == NIL); Assert(root->placeholder_list == NIL); /* --- 1639,1644 ---- *************** substitute_multiple_relids_walker(Node * *** 2839,2845 **** } /* Shouldn't need to handle planner auxiliary nodes here */ Assert(!IsA(node, SpecialJoinInfo)); - Assert(!IsA(node, LateralJoinInfo)); Assert(!IsA(node, AppendRelInfo)); Assert(!IsA(node, PlaceHolderInfo)); Assert(!IsA(node, MinMaxAggInfo)); --- 2835,2840 ---- diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index 8884fb1..2e55131 100644 *** a/src/backend/optimizer/prep/prepunion.c --- b/src/backend/optimizer/prep/prepunion.c *************** adjust_appendrel_attrs_mutator(Node *nod *** 1786,1792 **** } /* Shouldn't need to handle planner auxiliary nodes here */ Assert(!IsA(node, SpecialJoinInfo)); - Assert(!IsA(node, LateralJoinInfo)); Assert(!IsA(node, AppendRelInfo)); Assert(!IsA(node, PlaceHolderInfo)); Assert(!IsA(node, MinMaxAggInfo)); --- 1786,1791 ---- diff --git a/src/backend/optimizer/util/placeholder.c b/src/backend/optimizer/util/placeholder.c index 2870315..7fa93fb 100644 *** a/src/backend/optimizer/util/placeholder.c --- b/src/backend/optimizer/util/placeholder.c *************** fix_placeholder_input_needed_levels(Plan *** 363,374 **** /* * add_placeholders_to_base_rels ! * Add any required PlaceHolderVars to base rels' targetlists, and ! * update lateral_vars lists for lateral references contained in them. * * If any placeholder can be computed at a base rel and is needed above it, ! * add it to that rel's targetlist, and add any lateral references it requires ! * to the rel's lateral_vars list. This might look like it could be merged * with fix_placeholder_input_needed_levels, but it must be separate because * join removal happens in between, and can change the ph_eval_at sets. There * is essentially the same logic in add_placeholders_to_joinrel, but we can't --- 363,372 ---- /* * add_placeholders_to_base_rels ! * Add any required PlaceHolderVars to base rels' targetlists. * * If any placeholder can be computed at a base rel and is needed above it, ! * add it to that rel's targetlist. This might look like it could be merged * with fix_placeholder_input_needed_levels, but it must be separate because * join removal happens in between, and can change the ph_eval_at sets. There * is essentially the same logic in add_placeholders_to_joinrel, but we can't *************** add_placeholders_to_base_rels(PlannerInf *** 385,442 **** Relids eval_at = phinfo->ph_eval_at; int varno; ! if (bms_get_singleton_member(eval_at, &varno)) { RelOptInfo *rel = find_base_rel(root, varno); ! /* add it to reltargetlist if needed above the rel scan level */ ! if (bms_nonempty_difference(phinfo->ph_needed, eval_at)) ! rel->reltargetlist = lappend(rel->reltargetlist, ! copyObject(phinfo->ph_var)); ! /* if there are lateral refs in it, add them to lateral_vars */ ! if (phinfo->ph_lateral != NULL) ! { ! List *vars = pull_var_clause((Node *) phinfo->ph_var->phexpr, ! PVC_RECURSE_AGGREGATES, ! PVC_INCLUDE_PLACEHOLDERS); ! ListCell *lc2; ! ! foreach(lc2, vars) ! { ! Node *node = (Node *) lfirst(lc2); ! ! if (IsA(node, Var)) ! { ! Var *var = (Var *) node; ! ! if (var->varno != varno) ! rel->lateral_vars = lappend(rel->lateral_vars, ! var); ! } ! else if (IsA(node, PlaceHolderVar)) ! { ! PlaceHolderVar *other_phv = (PlaceHolderVar *) node; ! PlaceHolderInfo *other_phi; ! ! other_phi = find_placeholder_info(root, other_phv, ! false); ! if (!bms_is_subset(other_phi->ph_eval_at, eval_at)) ! rel->lateral_vars = lappend(rel->lateral_vars, ! other_phv); ! } ! else ! Assert(false); ! } ! ! list_free(vars); ! } } } } /* * add_placeholders_to_joinrel ! * Add any required PlaceHolderVars to a join rel's targetlist. * * A join rel should emit a PlaceHolderVar if (a) the PHV is needed above * this join level and (b) the PHV can be computed at or below this level. --- 383,404 ---- Relids eval_at = phinfo->ph_eval_at; int varno; ! if (bms_get_singleton_member(eval_at, &varno) && ! bms_nonempty_difference(phinfo->ph_needed, eval_at)) { RelOptInfo *rel = find_base_rel(root, varno); ! rel->reltargetlist = lappend(rel->reltargetlist, ! copyObject(phinfo->ph_var)); } } } /* * add_placeholders_to_joinrel ! * Add any required PlaceHolderVars to a join rel's targetlist; ! * and if they contain lateral references, add those references to the ! * joinrel's direct_lateral_relids. * * A join rel should emit a PlaceHolderVar if (a) the PHV is needed above * this join level and (b) the PHV can be computed at or below this level. *************** add_placeholders_to_joinrel(PlannerInfo *** 463,468 **** --- 425,434 ---- joinrel->reltargetlist = lappend(joinrel->reltargetlist, phinfo->ph_var); joinrel->width += phinfo->ph_width; + /* Adjust joinrel's direct_lateral_relids as needed */ + joinrel->direct_lateral_relids = + bms_add_members(joinrel->direct_lateral_relids, + phinfo->ph_lateral); } } } diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c index a2be2ed..f2bdfcc 100644 *** a/src/backend/optimizer/util/relnode.c --- b/src/backend/optimizer/util/relnode.c *************** build_simple_rel(PlannerInfo *root, int *** 111,116 **** --- 111,117 ---- rel->cheapest_total_path = NULL; rel->cheapest_unique_path = NULL; rel->cheapest_parameterized_paths = NIL; + rel->direct_lateral_relids = NULL; rel->lateral_relids = NULL; rel->relid = relid; rel->rtekind = rte->rtekind; *************** build_join_rel(PlannerInfo *root, *** 373,378 **** --- 374,383 ---- joinrel->cheapest_total_path = NULL; joinrel->cheapest_unique_path = NULL; joinrel->cheapest_parameterized_paths = NIL; + /* init direct_lateral_relids from children; we'll finish it up below */ + joinrel->direct_lateral_relids = + bms_union(outer_rel->direct_lateral_relids, + inner_rel->direct_lateral_relids); joinrel->lateral_relids = min_join_parameterization(root, joinrel->relids, outer_rel, inner_rel); joinrel->relid = 0; /* indicates not a baserel */ *************** build_join_rel(PlannerInfo *root, *** 423,428 **** --- 428,445 ---- add_placeholders_to_joinrel(root, joinrel); /* + * add_placeholders_to_joinrel also took care of adding the ph_lateral + * sets of any PlaceHolderVars computed here to direct_lateral_relids, so + * now we can finish computing that. This is much like the computation of + * the transitively-closed lateral_relids in min_join_parameterization, + * except that here we *do* have to consider the added PHVs. + */ + joinrel->direct_lateral_relids = + bms_del_members(joinrel->direct_lateral_relids, joinrel->relids); + if (bms_is_empty(joinrel->direct_lateral_relids)) + joinrel->direct_lateral_relids = NULL; + + /* * Construct restrict and join clause lists for the new joinrel. (The * caller might or might not need the restrictlist, but I need it anyway * for set_joinrel_size_estimates().) diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c index 773e7b2..32038ce 100644 *** a/src/backend/optimizer/util/var.c --- b/src/backend/optimizer/util/var.c *************** flatten_join_alias_vars_mutator(Node *no *** 782,788 **** Assert(!IsA(node, SubPlan)); /* Shouldn't need to handle these planner auxiliary nodes here */ Assert(!IsA(node, SpecialJoinInfo)); - Assert(!IsA(node, LateralJoinInfo)); Assert(!IsA(node, PlaceHolderInfo)); Assert(!IsA(node, MinMaxAggInfo)); --- 782,787 ---- diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c index 1da90ff..6f22168 100644 *** a/src/backend/rewrite/rewriteManip.c --- b/src/backend/rewrite/rewriteManip.c *************** OffsetVarNodes_walker(Node *node, Offset *** 402,408 **** /* Shouldn't need to handle other planner auxiliary nodes here */ Assert(!IsA(node, PlanRowMark)); Assert(!IsA(node, SpecialJoinInfo)); - Assert(!IsA(node, LateralJoinInfo)); Assert(!IsA(node, PlaceHolderInfo)); Assert(!IsA(node, MinMaxAggInfo)); --- 402,407 ---- *************** ChangeVarNodes_walker(Node *node, Change *** 586,592 **** } /* Shouldn't need to handle other planner auxiliary nodes here */ Assert(!IsA(node, SpecialJoinInfo)); - Assert(!IsA(node, LateralJoinInfo)); Assert(!IsA(node, PlaceHolderInfo)); Assert(!IsA(node, MinMaxAggInfo)); --- 585,590 ---- *************** rangeTableEntry_used_walker(Node *node, *** 868,874 **** Assert(!IsA(node, PlaceHolderVar)); Assert(!IsA(node, PlanRowMark)); Assert(!IsA(node, SpecialJoinInfo)); - Assert(!IsA(node, LateralJoinInfo)); Assert(!IsA(node, AppendRelInfo)); Assert(!IsA(node, PlaceHolderInfo)); Assert(!IsA(node, MinMaxAggInfo)); --- 866,871 ---- diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index 94bdb7c..603edd3 100644 *** a/src/include/nodes/nodes.h --- b/src/include/nodes/nodes.h *************** typedef enum NodeTag *** 247,253 **** T_RestrictInfo, T_PlaceHolderVar, T_SpecialJoinInfo, - T_LateralJoinInfo, T_AppendRelInfo, T_PlaceHolderInfo, T_MinMaxAggInfo, --- 247,252 ---- diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h index 74f4daf..5393005 100644 *** a/src/include/nodes/relation.h --- b/src/include/nodes/relation.h *************** typedef struct PlannerInfo *** 226,233 **** List *join_info_list; /* list of SpecialJoinInfos */ - List *lateral_info_list; /* list of LateralJoinInfos */ - List *append_rel_list; /* list of AppendRelInfos */ List *rowMarks; /* list of PlanRowMarks */ --- 226,231 ---- *************** typedef struct PlannerInfo *** 357,362 **** --- 355,361 ---- * (no duplicates) output from relation; NULL if not yet requested * cheapest_parameterized_paths - best paths for their parameterizations; * always includes cheapest_total_path, even if that's unparameterized + * direct_lateral_relids - rels this rel has direct LATERAL references to * lateral_relids - required outer rels for LATERAL, as a Relids set * (includes both direct and indirect lateral references) * *************** typedef struct PlannerInfo *** 374,379 **** --- 373,379 ---- * lateral_vars - lateral cross-references of rel, if any (list of * Vars and PlaceHolderVars) * lateral_referencers - relids of rels that reference this one laterally + * (includes both direct and indirect lateral references) * indexlist - list of IndexOptInfo nodes for relation's indexes * (always NIL if it's not a table) * pages - number of disk pages in relation (zero if not a table) *************** typedef struct RelOptInfo *** 465,470 **** --- 465,471 ---- /* parameterization information needed for both base rels and join rels */ /* (see also lateral_vars and lateral_referencers) */ + Relids direct_lateral_relids; /* rels directly laterally referenced */ Relids lateral_relids; /* minimum parameterization of rel */ /* information about a base rel (not set for join rels!) */ *************** typedef struct SpecialJoinInfo *** 1461,1503 **** } SpecialJoinInfo; /* - * "Lateral join" info. - * - * Lateral references constrain the join order in a way that's somewhat like - * outer joins, though different in detail. We construct a LateralJoinInfo - * for each lateral cross-reference, placing them in the PlannerInfo node's - * lateral_info_list. - * - * For unflattened LATERAL RTEs, we generate LateralJoinInfo(s) in which - * lateral_rhs is the relid of the LATERAL baserel, and lateral_lhs is a set - * of relids of baserels it references, all of which must be present on the - * LHS to compute a parameter needed by the RHS. Typically, lateral_lhs is - * a singleton, but it can include multiple rels if the RHS references a - * PlaceHolderVar with a multi-rel ph_eval_at level. We disallow joining to - * only part of the LHS in such cases, since that would result in a join tree - * with no convenient place to compute the PHV. - * - * When an appendrel contains lateral references (eg "LATERAL (SELECT x.col1 - * UNION ALL SELECT y.col2)"), the LateralJoinInfos reference the parent - * baserel not the member otherrels, since it is the parent relid that is - * considered for joining purposes. - * - * If any LATERAL RTEs were flattened into the parent query, it is possible - * that the query now contains PlaceHolderVars containing lateral references, - * representing expressions that need to be evaluated at particular spots in - * the jointree but contain lateral references to Vars from elsewhere. These - * give rise to LateralJoinInfos in which lateral_rhs is the evaluation point - * of a PlaceHolderVar and lateral_lhs is the set of lateral rels it needs. - */ - - typedef struct LateralJoinInfo - { - NodeTag type; - Relids lateral_lhs; /* rels needed to compute a lateral value */ - Relids lateral_rhs; /* rel where lateral value is needed */ - } LateralJoinInfo; - - /* * Append-relation info. * * When we expand an inheritable table or a UNION-ALL subselect into an --- 1462,1467 ----
Tom Lane writes: > [2. transitive-lateral-fixes-2.patch ] > [2. remove-lateraljoininfo-2.patch ] They seem to have fixed the issue for good now. No errors have been logged for 2e8 queries since applying the first patch. (The second one was applied later and didn't get as much exposure.) I guess that means I have to go back to extending the grammar again :-). regards Andreas
Andreas Seltenreich <seltenreich@gmx.de> writes: > Tom Lane writes: >> [2. transitive-lateral-fixes-2.patch ] >> [2. remove-lateraljoininfo-2.patch ] > They seem to have fixed the issue for good now. No errors have been > logged for 2e8 queries since applying the first patch. (The second one > was applied later and didn't get as much exposure.) Thanks. It's good that you tested both states of the code, since I intend to back-patch transitive-lateral-fixes into 9.4 and 9.3, but not the second patch. > I guess that means I have to go back to extending the grammar again :-). I await the results with interest. Did you note the suggestion about trying to stress the ON CONFLICT code with this? You'd need it to issue non-SELECT queries, which might create some reproducibility issues... regards, tom lane
Peter Geoghegan writes: > On Sun, Dec 6, 2015 at 9:52 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote: >> I've added new grammar rules to sqlsmith and improved some older ones. > > Could you possibly teach sqlsmith about INSERT ... ON CONFLICT DO > UPDATE/IGNORE? I think that that could be very helpful, especially if > it could be done in advance of any stable release of 9.5. In summary, it can't be added ad-hoc, but might still happen in advance of the release of 9.5. Adding upsert needs significiant effort because historically, non-boolean value expression productions yield a random type. This is not a problem for generating queries, but it is for inserts. Also, sqlsmith can at the moment only generate sensible value expressions from column references. Generating a proper upsert would require supporting type-constraining of productions as well as adding productions for pulling values out of thin air (e.g., generating atomic value subselects or calling argumentless functions). regards, Andreas
On Fri, Dec 11, 2015 at 7:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I guess that means I have to go back to extending the grammar again :-). > > I await the results with interest. Did you note the suggestion about > trying to stress the ON CONFLICT code with this? You'd need it to > issue non-SELECT queries, which might create some reproducibility > issues... About 80% of the bugs we've seen so far are the type that a tool like sqlsmith could plausibly catch: bugs that trigger defensive "can't happen" elog(ERROR, ... ) calls within the planner and rewriter. While I've been vigilant, I certainly wouldn't be surprised if more were found, given the total flexibility of the ON CONFLICT syntax. -- Peter Geoghegan
<p dir="ltr">There may be other errors that would be surprising for Tom or Robert. What I did with the string argument fuzzerwas printed counts for each sqlstate for the errors and watched for errors that only occurred occasionally or didn'tmake sense to me.<p dir="ltr">Also, do you have any timeouts? Do you have any stats on how long these queries are takingto plan? What's the longest query to plan you've found?<p dir="ltr">Do you have coverage data for the corpus? Maybewe could suggest syntaxes specifically aimed at getting coverage for sections of chose that don't have any yet.<br /><divclass="gmail_quote">On 11 Dec 2015 19:25, "Peter Geoghegan" <<a href="mailto:pg@heroku.com">pg@heroku.com</a>>wrote:<br type="attribution" /><blockquote class="gmail_quote" style="margin:00 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Fri, Dec 11, 2015 at 7:58 AM, Tom Lane <<a href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>>wrote:<br /> >> I guess that means I have to go back to extendingthe grammar again :-).<br /> ><br /> > I await the results with interest. Did you note the suggestion about<br/> > trying to stress the ON CONFLICT code with this? You'd need it to<br /> > issue non-SELECT queries, whichmight create some reproducibility<br /> > issues...<br /><br /> About 80% of the bugs we've seen so far are the typethat a tool like<br /> sqlsmith could plausibly catch: bugs that trigger defensive "can't<br /> happen" elog(ERROR, ...) calls within the planner and rewriter. While<br /> I've been vigilant, I certainly wouldn't be surprised if more were<br/> found, given the total flexibility of the ON CONFLICT syntax.<br /><br /> --<br /> Peter Geoghegan<br /><br /><br/> --<br /> Sent via pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br/> To make changes to your subscription:<br/><a href="http://www.postgresql.org/mailpref/pgsql-hackers" rel="noreferrer" target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/></blockquote></div>
Greg Stark writes: > There may be other errors that would be surprising for Tom or Robert. What > I did with the string argument fuzzer was printed counts for each sqlstate > for the errors and watched for errors that only occurred occasionally or > didn't make sense to me. > > Also, do you have any timeouts? I currently set statement_timeout to 1s to avoid wasting time letting postgres crunch numbers. Less than 0.5% of the queries run into this timeout. > Do you have any stats on how long these queries are taking to plan? > What's the longest query to plan you've found? No, I'm currently not logging timing spects. The schema I let the instances log into is part of the repo[1]. > Do you have coverage data for the corpus? I do have some older numbers for line coverage from before the recent grammar extension: | revision | overall | parser | |----------+---------+--------| | a4c1989 | 26.0 | 20.4 | regards, Andreas Footnotes: [1] https://github.com/anse1/sqlsmith/blob/master/log.sql
On Sat, Dec 12, 2015 at 8:30 PM, Andreas Seltenreich <seltenreich@gmx.de> wrote: > I currently set statement_timeout to 1s to avoid wasting time letting > postgres crunch numbers. Less than 0.5% of the queries run into this > timeout. I wonder if any of these timeouts would be interesting to look at. Some may just be very large queries that will take a few seconds to plan but others may be queries that are uncovering N^2 algorithms or even conceivably loops that are not terminating. When you hit the timeout is this implemented in your fuzzer or using statement_timeout? If the former, can you add a statement_timeout of just short of the timeout in the fuzzer and find cases where the planner might not be calling CHECK_FOR_INTERRUPTS frequently enough? > > Do you have coverage data for the corpus? > > I do have some older numbers for line coverage from before the recent grammar extension: If you have a corpus of queries in a simple format it would be pretty convenient to add them in a regression test and then run make coverage to get html reports. Did you publish the source already? I haven't been following all along, sorry if these are all answered questions. -- greg
On Sun, Dec 13, 2015 at 12:04:34AM +0000, Greg Stark wrote: > On Sat, Dec 12, 2015 at 8:30 PM, Andreas Seltenreich <seltenreich@gmx.de> wrote: > Did you publish the source already? I haven't been following all > along, sorry if these are all answered questions. Either he has, or the following is a truly astonishing coincidence: https://github.com/anse1/sqlsmith Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Greg Stark writes: > On Sat, Dec 12, 2015 at 8:30 PM, Andreas Seltenreich <seltenreich@gmx.de> wrote: > When you hit the timeout is this implemented in your fuzzer or using > statement_timeout? If the former, can you add a statement_timeout of > just short of the timeout in the fuzzer and find cases where the > planner might not be calling CHECK_FOR_INTERRUPTS frequently enough? It's the latter. I don't think I can add a client-side timeout into sqlsmith elegantly. IMHO it's better to write another test tool that just re-runs the queries that were logged with a timeout by sqlsmith and investigates their timeout-behavior more closely. >> I do have some older numbers for line coverage from before the recent grammar extension: > > If you have a corpus of queries in a simple format it would be pretty > convenient to add them in a regression test and then run make coverage > to get html reports. Hmm, I thought I found a workflow that would yield sqlsmith's coverage without integrating it into the regession tests. This is what I did: make install initdb /tmp/gcov pg_ctl -D /tmp/gcov start make installcheck pg_ctl -D /tmp/gcov stop make coverage-clean pg_ctl -D /tmp/gcov start sqlsmith --target='dbname=regression' --max-queries=10000 pg_ctl -D /tmp/gcovstop make coverage-html It seems to yield a pure sqlsmith-only coverage report, as a "make coverage-html" before the "make coverage-clean" yields a report with much higher score. Maybe there are drawbacks to the workflow you are suggesting? I just re-did it with the current sqlsmith code, and it's up by 25% compared to the latest tested revision: | revision | overall | parser | |----------+---------+--------| | a4c1989 | 26.0 | 20.4 | | ee099e6 | 33.8 | 25.8 | I also put the report here, in case someone wants to look at certain details, or make suggestions into what directions to best extend the grammar to increase coverage. http://ansel.ydns.eu/~andreas/coverage/ http://ansel.ydns.eu/~andreas/gcov.tar.xz > Did you publish the source already? I haven't been following all > along, sorry if these are all answered questions. It's not had a proper release yet, but the code is available via github in all its rapid-prototypesque glory: https://github.com/anse1/sqlsmith regards, Andreas
On Sun, Dec 13, 2015 at 10:14 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote: >> Did you publish the source already? I haven't been following all >> along, sorry if these are all answered questions. > > It's not had a proper release yet, but the code is available via github > in all its rapid-prototypesque glory: > > https://github.com/anse1/sqlsmith I am in awe regarding this stuff, which has caught many bugs already, it is a bit sad that it is released under the GPL license preventing a potential integration into PostgreSQL core to strengthen the test infrastructure, and it is even sadder to see a direct dependency with libpqxx :( -- Michael
On Mon, Dec 14, 2015 at 03:06:18PM +0900, Michael Paquier wrote: > On Sun, Dec 13, 2015 at 10:14 AM, Andreas Seltenreich > <seltenreich@gmx.de> wrote: > >> Did you publish the source already? I haven't been following all > >> along, sorry if these are all answered questions. > > > > It's not had a proper release yet, but the code is available via > > github in all its rapid-prototypesque glory: > > > > https://github.com/anse1/sqlsmith > > I am in awe regarding this stuff, which has caught many bugs > already, it is a bit sad that it is released under the GPL license > preventing a potential integration into PostgreSQL core to > strengthen the test infrastructure, I suspect that a polite request to the Andreas that he change to a PostgreSQL-compatible license like one of (TPL, BSD2, MIT) might handle this part. > and it is even sadder to see a direct dependency with > libpqxx :( I suspect this part is a SMOP, but I'm not quite sure what S might constitute in this case. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter writes: > On Mon, Dec 14, 2015 at 03:06:18PM +0900, Michael Paquier wrote: >> On Sun, Dec 13, 2015 at 10:14 AM, Andreas Seltenreich wrote: >> > https://github.com/anse1/sqlsmith >> >> I am in awe regarding this stuff, which has caught many bugs >> already, it is a bit sad that it is released under the GPL license >> preventing a potential integration into PostgreSQL core to >> strengthen the test infrastructure, > > I suspect that a polite request to the Andreas that he change to a > PostgreSQL-compatible license like one of (TPL, BSD2, MIT) might > handle this part. It probably would, but I never thought core integration would be a desirable thing. Like Csmith, SQLsmith is intended to be product-agnostic. That's not yet the case, but it's still on the roadmap. Further, the license shouldn't interfere with institutionalizing sqlsmith somewhere to automatically send mails on failed assertions or first sight of an error message. Or providing a web interface around the logging database of an instance where one can drill down on logged errors/queries. >> and it is even sadder to see a direct dependency with libpqxx :( > > I suspect this part is a SMOP, but I'm not quite sure what S might > constitute in this case. sqlsmith uses three connections in total: 1. Connection for sending the generated queries to the DUT2. Connection for retrieving the schema from the DUT3. Loggingconnection 1 is trivial to change. 1+2 are intendend to be pluggable for supporting other products. Switching 1 to libpq is even on the todo list, as libpqxx doesn't allow access to the SQLSTATE (There is a four year old bug report about this by Andres). 2. Is more effort to change to libpq, as actual data is passed through that connection. 3. Was intended to stay libpqxx-only even when testing other products. Btw, I'm glad C++11 was no longer considered a blocker this thime :-) regards, Andreas