Re: Removing unneeded self joins - Mailing list pgsql-hackers
From | jian he |
---|---|
Subject | Re: Removing unneeded self joins |
Date | |
Msg-id | CACJufxFOuWQecXxWdp1RQnhoznqaDYo7jBRLOHmr7HG6hbugyg@mail.gmail.com Whole thread Raw |
In response to | Re: Removing unneeded self joins (Alexander Korotkov <aekorotkov@gmail.com>) |
Responses |
Re: Removing unneeded self joins
|
List | pgsql-hackers |
On Wed, Jul 3, 2024 at 11:39 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: > > On Mon, Jun 17, 2024 at 3:00 AM jian he <jian.universality@gmail.com> wrote: > > On Mon, May 6, 2024 at 11:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > > > Robert Haas <robertmhaas@gmail.com> writes: > > > > I want to go on record right now as disagreeing with the plan proposed > > > > in the commit message for the revert commit, namely, committing this > > > > again early in the v18 cycle. I don't think Tom would have proposed > > > > reverting this feature unless he believed that it had more serious > > > > problems than could be easily fixed in a short period of time. I think > > > > that concern is well-founded, given the number of fixes that were > > > > committed. It seems likely that the patch needs significant rework and > > > > stabilization before it gets committed again, and I think it shouldn't > > > > be committed again without explicit agreement from Tom or one of the > > > > other committers who have significant experience with the query > > > > planner. > > > > > > FWIW I accept some of the blame here, for not having paid any > > > attention to the SJE work earlier. I had other things on my mind > > > for most of last year, and not enough bandwidth to help. > > > > > > The main thing I'd like to understand before we try this again is > > > why SJE needed so much new query-tree-manipulation infrastructure. > > > I would have expected it to be very similar to the left-join > > > elimination we do already, and therefore to mostly just share the > > > existing infrastructure. (I also harbor suspicions that some of > > > the new code existed just because someone didn't research what > > > was already there --- for instance, the now-removed replace_varno > > > sure looks like ChangeVarNodes should have been used instead.) > > > > > > > i have looked around the code. > > about replace_varno and ChangeVarNodes: > > > > ChangeVarNodes > > have > > ```` > > if (IsA(node, RangeTblRef)) > > { > > RangeTblRef *rtr = (RangeTblRef *) node; > > > > if (context->sublevels_up == 0 && > > rtr->rtindex == context->rt_index) > > rtr->rtindex = context->new_index; > > /* the subquery itself is visited separately */ > > return false; > > } > > ```` > > if ChangeVarNodes executed the above code in remove_useless_self_joins and > > remove_self_joins_recurse. the joinlist(RangeTblRef) will change from (1,2) > > to (2,2). then later, remove_rel_from_joinlist cannot remove the 1, > > *nremoved will be zero. > > then the below code error branch will be executed. > > ```` > > joinlist = remove_rel_from_joinlist(joinlist, relid, &nremoved); > > if (nremoved != 1) > > elog(ERROR, "failed to find relation %d in joinlist", relid); > > ``` > > Did you manage to overcome this problem in your patch? If not, why do > regression tests pass while this seems to affect pretty much every > self-join removal? If so, how did you do that? > in remove_self_join_rel, i have ```ChangeVarNodes((Node *) root->parse, toRemove->relid, toKeep->relid, 0);``` which will change the joinlist(RangeTblRef) from (1,2) to (2,2). Immediately after this call, I wrote a function (restore_rangetblref) to restore the joinlist as original (1,2). then remove_rel_from_joinlist won't error out. see remove_self_join_rel, restore_rangetblref. Andrei Lepikhov: + /* Replace varno in all the query structures */ + replace_varno((Node *) root->parse, toRemove->relid, toKeep->relid); So Andrei Lepikhov's change didn't touch joinlist, Query->resultRelation, Query->mergeTargetRelation. Then in v3-0002 I tried to make SJE work with UPDATE, i thought it worked well, because ChangeVarNodes also takes care of Query->resultRelation, Query->mergeTargetRelation. then later your EPQ demenonsate shows that's not enough. so, in summary, in v3-0001, by changing all replace_varno to ChangeVarNodes paves ways to make SJE apply to UPDATE/DELETE/MERGE. It's just that we need to reverse some effects of ChangeVarNodes. (restore_rangetblref) > > > Another thing that made me pretty sad was 8c441c082 (Forbid SJE with > > > result relation). While I don't claim that that destroyed the entire > > > use case for SJE, it certainly knocked its usefulness down by many > > > notches, maybe even to the point where it's not worth putting in the > > > effort needed to get it to re-committability. So I think we need to > > > look harder at finding a way around that. Is the concern that > > > RETURNING should return either old or new values depending on which > > > RTE is mentioned? If so, maybe the feature Dean has proposed to > > > allow RETURNING to access old values [1] is a prerequisite to moving > > > forward. Alternatively, perhaps it'd be good enough to forbid SJE > > > only when the non-target relation is actually mentioned in RETURNING. > > > > > It appears you didn't try to address the EPQ problem, which seems to > me even more serious than the RETURNING problem. > > See the following example. > > Session 1 > # create table test (id int primary key, val int); > # insert into test values (1,1); > # begin; > # update test set val = val + 1 where id = 1; > > Session 2 > # update test set val = t.val + 1 from test t where test.id = t.id; > (wait) > > Session 1 > # commit; > current mechanism, in this example context, SJE can translate ```update test set val = t.val + 1 from test t where test.id = t.id;``` as good as to ```update test set val = val + 1```. if we replace it that way, then this example would result val = 3. but without SJE, ```update test set val = t.val + 1 from test t where test.id = t.id;``` will result val = 2. you mentioned the EPQ problem, previously i don't know what that means. now i see, I feel like it is quite challenging to resolve it.
pgsql-hackers by date: