Re: Add semi-join pushdown to postgres_fdw - Mailing list pgsql-hackers
From | Alexander Korotkov |
---|---|
Subject | Re: Add semi-join pushdown to postgres_fdw |
Date | |
Msg-id | CAPpHfdt3m8DdyMfLK_mMcikno3ZA0DWsca+2JD=kVyF1DBNGWA@mail.gmail.com Whole thread Raw |
In response to | Re: Add semi-join pushdown to postgres_fdw (Alexander Pyhalov <a.pyhalov@postgrespro.ru>) |
List | pgsql-hackers |
On Mon, Mar 24, 2025 at 6:56 PM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote: > Alexander Korotkov писал(а) 2025-03-24 11:49: > > On Mon, Mar 24, 2025 at 9:07 AM Alexander Pyhalov > > <a.pyhalov@postgrespro.ru> wrote: > >> Alexander Korotkov писал(а) 2025-03-24 04:21: > >> > Hi, Alexander! > >> > > >> > On Tue, Mar 18, 2025 at 6:04 PM Alexander Pyhalov > >> > <a.pyhalov@postgrespro.ru> wrote: > >> >> This shouldn't. When semi-join is found below left/right join, it's > >> >> deparsed as subquery. > >> >> Interesting enough, this mechanics (deparsing as subquery) is used > >> >> 1) for semi-joins under left/right join, > >> >> 2) for full outer joins when inner or outer part has some > >> >> remote_conds. > >> >> > >> >> The issue here is that after subquery is deparsed, we don't consider > >> >> if > >> >> its target attributes are available to the upper level > >> >> join . As for semi-join itself, all conditions are still deparsed on > >> >> left/right join boundary, they are just not propagated further. > >> >> This shouldn't be a problem, as they are evaluated in subquery. As for > >> >> left/right join without semi-join beneath it - its behavior is not > >> >> affected > >> >> (as hidden_subquery_rels is empty). > >> > > >> > Thank you for the explanation. But I have another question. Aren't > >> > the checks you've proposed too strict? hidden_subquery_rels are > >> > propagated all the way to the join tree. So, pulling conditions would > >> > be disables all the way to the join tree too. Is it enough to just > >> > disable pulling conditions directly from semi-joins, then their > >> > further pulls will be disabled automatically? See the attached patch. > >> > It also contains other (mostly cosmetic improvements). > >> > > >> > ------ > >> > Regards, > >> > Alexander Korotkov > >> > Supabase > >> > >> Hi. No, they are not too strict. Look at the following example > >> > >> EXPLAIN (verbose, costs off) > >> SELECT x1.c1 FROM > >> (SELECT * FROM ft2 WHERE EXISTS (SELECT 1 FROM ft4 > >> WHERE > >> ft4.c1 = ft2.c1 AND ft2.c2 < 10)) x1 > >> RIGHT JOIN > >> (SELECT * FROM ft2 WHERE EXISTS (SELECT 1 FROM ft4 > >> WHERE > >> ft4.c1 = ft2.c1 AND ft2.c2 < 10)) x2 > >> ON (x1.c1 = x2.c1) > >> LEFT JOIN > >> (SELECT * FROM ft2 WHERE c2 < 11) x3 > >> ON (x1.c1 = x3.c1) > >> ORDER BY x1.c1 LIMIT 10; > >> > >> With patch which you suggest, we'll deparse left part of left join as > >> subquery, but will try to pop c2 < 10 condition from > >> (8) LEFT JOIN ((6) SEMI JOIN (7)) subquery. When we look at left join > >> of > >> this subquery and ft2, we still deparse left part as > >> subquery, so can't pop up conditions from it. > > > > I've checked, this query seems to result in the exactly same remote > > SQLs with your and mine patches. Could you elaborate more on the > > difference? Do you think foreign_join_ok() can give different results > > on this query? > > Hi. > With your patch this example gives the same > ERROR: unexpected expression in subquery output > > This happens, because we don't keep knowledge that we have deparsed all > semi-joins below this left join. As long as left/right join has > semi-join in its left or right part, this part will be deparsed as > subquery (look at the following lines in foreign_join_ok()): > > else if (jointype == JOIN_LEFT || jointype == JOIN_RIGHT || > jointype == JOIN_FULL) > { > /* > * Conditions, generated from semi-joins, should be > evaluated before > * LEFT/RIGHT/FULL join. > */ > if (!bms_is_empty(fpinfo_o->hidden_subquery_rels)) > { > fpinfo->make_outerrel_subquery = true; > fpinfo->lower_subquery_rels = > bms_add_members(fpinfo->lower_subquery_rels, outerrel->relids); > } > > if (!bms_is_empty(fpinfo_i->hidden_subquery_rels)) > { > fpinfo->make_innerrel_subquery = true; > fpinfo->lower_subquery_rels = > bms_add_members(fpinfo->lower_subquery_rels, innerrel->relids); > } > } > > > So, we still can't refer to its remote_conds from upper level queries > (as not all Vars are available from subquery after subquery is created > in one part of left or right join). It's not necessary to have semi-join > for this as immediate left/right join inner or outer for inner/outer to > be deparsed as subquery. But it shouldn't be an issue - we've already > used remote_conds when created this subquery. > What I'm trying to say - logic of 'making subquery' and extracting > conditions should match (or we need more sophisticated way of forming > subquery targetlist, so that extracted conditions could be used above > subqueries). Thank you for the explanation. Pushed. However, it would be nice in future to rework this in a way that semi-joins on lower levels of join tree don't have negative impact on optimization of upper levels of join tree. ------ Regards, Alexander Korotkov Supabase
pgsql-hackers by date: