Re: Removing unneeded self joins - Mailing list pgsql-hackers
From | Alexander Korotkov |
---|---|
Subject | Re: Removing unneeded self joins |
Date | |
Msg-id | CAPpHfdsj1WdqoCg+g4oq1XQ_=KOj0U4pXqhauBc7e+0msHu66A@mail.gmail.com Whole thread Raw |
In response to | Re: Removing unneeded self joins (Dean Rasheed <dean.a.rasheed@gmail.com>) |
Responses |
Re: Removing unneeded self joins
Re: Removing unneeded self joins |
List | pgsql-hackers |
On Fri, Jul 19, 2024 at 11:30 AM Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > On Wed, 17 Jul 2024 at 01:45, Alexander Korotkov <aekorotkov@gmail.com> wrote: > > > > Jian He gave a try to ChangeVarNodes() [1]. That gives some > > improvement, but the vast majority of complexity is still here. I > > think the reason for complexity of SJE is that it's the first time we > > remove relation, which is actually *used* and therefore might has > > references in awful a lot of places. In previous cases we removed > > relations, which were actually unused. > > I had a quick look at this, and I have a couple of comments on the > rewriter changes. > > The new function replace_relid() looks to be the same as adjust_relid_set(). They are similar, not the same. replace_relid() has handling for negative newId, while adjust_relid_set() hasn't. One thing I'd like to borrow from adjust_relid_set() to replace_relid() is the usage of IS_SPECIAL_VARNO() macro. It would be probably nice to move this logic into bms_replace_member() residing at bitmapset.c. What do you think? > The changes to ChangeVarNodes() look a little messy. There's a lot of > code duplicated between ChangeVarNodesExtended() and ChangeVarNodes(), > which could be avoided by having one call the other. Also, it would be > better for ChangeVarNodesExtended() to have a "flags" parameter > instead of an extra boolean parameter, to make it more extensible in > the future. However,... I certainly didn't mean to have duplicate functions ChangeVarNodesExtended() and ChangeVarNodes(). I mean ChangeVarNodes() should just call ChangeVarNodesExtended(). > I question whether ChangeVarNodesExtended() and the changes to > ChangeVarNodes() are really the right way to go about this. > ChangeVarNodes() in particular gains a lot more logic to handle > RestrictInfo nodes that doesn't really feel like it belongs there -- > e.g., building NullTest nodes is really specific to SJE, and doesn't > seem like it's something ChangeVarNodes() should be doing. > > A better solution might be to add a new walker function to > analyzejoins.c that does just what SJE needs, which is different from > ChangeVarNodes() in a number of ways. For Var nodes, it might > ultimately be necessary to do more than just change the varno, to > solve the RETURNING/EPQ problems. For RestrictInfo nodes, there's a > lot of SJE-specific logic. The SJE code wants to ignore RangeTblRef > nodes, but it could delegate to ChangeVarNodes() for various other > node types to avoid code duplication. At the top level, the stuff that > ChangeVarNodes() does to fields of the Query struct would be different > for SJE, I think. We initially didn't use ChangeVarNodes() in SJE at all. See the last patch version without it [1]. We're trying to address Tom Lane's proposal to re-use more of existing tree-manipulation infrastructure [2]. I agree with you that the case with ChangeVarNodes() looks questionable. Do you have other ideas how we can re-use some more of existing tree-manipulation infrastructure in SJE? Links 1. https://www.postgresql.org/message-id/55f680bc-756d-4dd3-ab27-3c6e663b0e4c%40postgrespro.ru 2. https://www.postgresql.org/message-id/3622801.1715010885%40sss.pgh.pa.us ------ Regards, Alexander Korotkov Supabase
Attachment
pgsql-hackers by date: