Re: Some problems regarding the self-join elimination code - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Some problems regarding the self-join elimination code
Date
Msg-id CAMbWs48dTVhxMeTfKXSvAUCaLKtsgaznDwRCRzxMnJtUob_oeQ@mail.gmail.com
Whole thread Raw
In response to Re: Some problems regarding the self-join elimination code  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: Some problems regarding the self-join elimination code
List pgsql-hackers
On Tue, Apr 8, 2025 at 11:12 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> On Tue, 8 Apr 2025 at 12:31, Andrei Lepikhov <lepihov@gmail.com> wrote:
> > On 4/4/25 09:37, Richard Guo wrote:
> > > With more exposure to the changes in ChangeVarNodes(), I have some
> > > more concerns.  As I understand it, ChangeVarNodes() is designed to
> > > update the varno fields of Var nodes in the given tree that belong to
> > > the specific relation; neat and clear.  However, now, within this
> > > function, we also create new NullTest quals for SJE, which doesn't
> > > seem like something this function should be handling.  It makes this
> > > function look a bit like a kludge.

> > To be precise, inside the function we replace old clause with the
> > NullTest, because relid replacement action made it degenerate. It seems
> > essential to me and I think it may be ok to add a comment at the top of
> > ChangeVarNodes, describing this minor optimisation.

> I recall raising the same concerns when I originally reviewed the
> patch. I don't think this code belongs in the rewriter, because it's
> very specific to how SJE wants to handle these kinds of nodes.

Correct.  And I don't think it's this function's responsibility to
create new NullTest quals for SJE.

> Note also that RestrictInfo is defined in nodes/pathnodes.h, which
> describes its contents as internal data structures for the planner,
> suggesting that the rewriter has no business processing those kinds of
> nodes.

> I don't think simply adding a comment addresses those concerns.

Agreed.  We may need more than just a comment change.

> > > Additionally, I have a minor suggestion regarding the new boolean
> > > parameter, "change_RangeTblRef".  AFAIU, by convention, we typically
> > > use flags to control the behavior of walker functions, like in
> > > pull_var_clause.  While it's fine to use a boolean parameter here
> > > given that we currently only have one flag, for the sake of future
> > > extensibility, I think using flags might be a better choice.

> > That was exactly why I wasn't happy with replacing the change_relid
> > routine with ChangeVarNodes.
> > But variant with a flag seems non-trivial to implement. Do you have any
> > proposal about the code?

> Perhaps the way to do it is to make ChangeVarNodesExtended() take a
> callback function to be invoked for each node visited. The callback
> (which would then be in the planner, with the other SJE code) would do
> any special handling it needed (for RangeTblRef and RestrictInfo
> nodes), and call ChangeVarNodes_walker() for any other types of node,
> to get the default behaviour.

Yeah, this might be a better approach.  Perhaps we can borrow some
ideas from replace_rte_variables.

Thanks
Richard



pgsql-hackers by date:

Previous
From: James Hunter
Date:
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Next
From: Peter Smith
Date:
Subject: Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.