Re: Removing unneeded self joins - Mailing list pgsql-hackers
From | Ronan Dunklau |
---|---|
Subject | Re: Removing unneeded self joins |
Date | |
Msg-id | 2375492.jE0xQCEvom@aivenronan Whole thread Raw |
In response to | Re: Removing unneeded self joins (Andrey Lepikhov <a.lepikhov@postgrespro.ru>) |
Responses |
Re: Removing unneeded self joins
|
List | pgsql-hackers |
Le jeudi 30 juin 2022, 16:11:51 CEST Andrey Lepikhov a écrit : > On 19/5/2022 16:47, Ronan Dunklau wrote: > > I'll take a look at that one. > > New version of the patch, rebased on current master: > 1. pgindent over the patch have passed. > 2. number of changed files is reduced. > 3. Some documentation and comments is added. Hello Andrey, Thanks for the updates. The general approach seems sensible to me, so I'm going to focus on some details. In a very recent thread [1], Tom Lane is proposing to add infrastructure to make Var aware of their nullability by outerjoins. I wonder if that would help with avoiding the need for adding is not null clauses when the column is known notnull ? If we have a precedent for adding a BitmapSet to the Var itself, maybe the whole discussion regarding keeping track of nullabilitycan be extended to the original column nullability ? Also, I saw it was mentioned earlier in the thread but how difficult would it be to process the transformed quals throughthe EquivalenceClass machinery and the qual simplification ? For example, if the target audience of this patch is ORM, or inlined views, it wouldn't surprise me to see queries of thiskind in the wild, which could be avoided altogether: postgres=# explain (costs off) select * from sj s1 join sj s2 on s1.a = s2.a where s1.b = 2 and s2.b =3; QUERY PLAN ----------------------------------------------------- Seq Scan on sj s2 Filter: ((a IS NOT NULL) AND (b = 3) AND (b = 2)) (2 lignes) + for (counter = 0; counter < list_length(*sources);) + { + ListCell *cell = list_nth_cell(*sources, counter); + RestrictInfo *rinfo = castNode(RestrictInfo, lfirst(cell)); + int counter1; + .... + ec->ec_members = list_delete_cell(ec->ec_members, cell); Why don't you use foreach() and foreach_delete_current macros for iterating and removing items in the lists, both in update_ec_membersand update_ec_sources ? + if ((bms_is_member(k, info->syn_lefthand) ^ + bms_is_member(r, info->syn_lefthand)) || + (bms_is_member(k, info->syn_righthand) ^ + bms_is_member(r, info->syn_righthand))) I think this is more compact and easier to follow than the previous version, but I'm not sure how common it is in postgressource code to use that kind of construction ? Some review about the comments: I see you keep using the terms "the keeping relation" and "the removing relation" in reference to the relation that is keptand the one that is removed. Aside from the grammar (the kept relation or the removed relation), maybe it would make it clearer to call them somethingelse. In other parts of the code, you used "the remaining relation / the removed relation" which makes sense. /* * Remove the target relid from the planner's data structures, having - * determined that there is no need to include it in the query. + * determined that there is no need to include it in the query. Or replace + * with another relid. + * To reusability, this routine can work in two modes: delete relid from a plan + * or replace it. It is used in replace mode in a self-join removing process. This could be rephrased: ", optionally replacing it with another relid. The latter is used by the self-join removing process." [1] https://www.postgresql.org/message-id/flat/830269.1656693747%40sss.pgh.pa.us -- Ronan Dunklau
pgsql-hackers by date: