Re: Removing unneeded self joins - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Removing unneeded self joins
Date
Msg-id CAPpHfdus-O3Ozpn+tFL8r5KPiedaBYSrLK_3BsvJSC7sVBi+_g@mail.gmail.com
Whole thread Raw
In response to Re: Removing unneeded self joins  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: Removing unneeded self joins
List pgsql-hackers
On Fri, Jul 12, 2024 at 1:30 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Fri, Jul 12, 2024 at 6:05 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> > On 7/11/24 14:43, jian he wrote:
> > > On Tue, Jul 9, 2024 at 2:06 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
> > >>
> > >> On 7/2/24 07:25, jian he wrote:
> > >>> to make sure it's correct, I have added a lot of tests,
> > >>> Some of this may be contrived, maybe some of the tests are redundant.
> > >> Thanks for your job!
> > >> I passed through the patches and have some notes:
> > >> 1. Patch 0001 has not been applied anymore since the previous week's
> > >> changes in the core. Also, there is one place with trailing whitespace.
> > >
> > > thanks.
> > > because the previous thread mentioned the EPQ problem.
> > > in remove_useless_self_joins, i make it can only process CMD_SELECT query.
> > I would like to oppose here: IMO, it is just a mishap which we made
> > because of a long history of patch transformations. There we lost the
> > case where RowMark exists for only one of candidate relations.
> > Also, after review I think we don't need so many new tests. Specifically
> > for DML we already have one:
> >
> > EXPLAIN (COSTS OFF)
> > UPDATE sj sq SET b = 1 FROM sj as sz WHERE sq.a = sz.a;
> >
> > And we should just add something to elaborate it a bit.
> > See the patch in attachment containing my proposal to improve v4-0001
> > main SJE patch. I think it resolved the issue with EPQ assertion as well
> > as problems with returning value.
>
> I tried this.  I applied 0001 from [1] and 0002 from [2].  Then I
> tried the concurrent test case [3].  It still fails with assert for
> me.  But assert and related stuff is the least problem.  The big
> problem, as described in [3], is semantical change in query.  When EPQ
> is applied, we fetch the latest tuple of the target relation
> regardless snapshot.  But for the self-joined relation we should still
> use the snapshot-satisfying tuple.  I don't see even attempt to
> address this in your patch.  And as I pointed before, this appears
> quite complex.

Oh, sorry, I used wrong binaries during the check.  My test case works
correctly, because SJE doesn't apply to the target relation.

# explain update test set val = t.val + 1 from test t where test.id = t.id;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Update on test  (cost=60.85..105.04 rows=0 width=0)
   ->  Hash Join  (cost=60.85..105.04 rows=2260 width=16)
         Hash Cond: (test.id = t.id)
         ->  Seq Scan on test  (cost=0.00..32.60 rows=2260 width=10)
         ->  Hash  (cost=32.60..32.60 rows=2260 width=14)
               ->  Seq Scan on test t  (cost=0.00..32.60 rows=2260 width=14)
(6 rows)

Previously, patch rejected applying SJE for result relation, which as
I see now is wrong.  Andrei's patch rejects SJE for target relation on
the base of row marks, which seems correct to me as the first glance.
So, this doesn't change anything regarding my conclusions regarding
applying SJE for target relation.  But the Andrei's patch yet looks
good indeed.

------
Regards,
Alexander Korotkov
Supabase



pgsql-hackers by date:

Previous
From: Dmitry Dolgov
Date:
Subject: Re: Pluggable cumulative statistics
Next
From: Said Assemlal
Date:
Subject: Re: CREATE OR REPLACE MATERIALIZED VIEW