Re: Removing unneeded self joins - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Removing unneeded self joins
Date
Msg-id CAPpHfdvTHuhB5+Eohb-z7zWM6u9Y=W6vb8MMOqhfKeRkhVZGxg@mail.gmail.com
Whole thread Raw
In response to Re: Removing unneeded self joins  (Andrei Lepikhov <lepihov@gmail.com>)
Responses Re: Removing unneeded self joins
List pgsql-hackers
Hi, Andrei!

On Fri, Jul 12, 2024 at 6:05 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
>
> On 7/11/24 14:43, jian he wrote:
> > On Tue, Jul 9, 2024 at 2:06 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
> >>
> >> On 7/2/24 07:25, jian he wrote:
> >>> to make sure it's correct, I have added a lot of tests,
> >>> Some of this may be contrived, maybe some of the tests are redundant.
> >> Thanks for your job!
> >> I passed through the patches and have some notes:
> >> 1. Patch 0001 has not been applied anymore since the previous week's
> >> changes in the core. Also, there is one place with trailing whitespace.
> >
> > thanks.
> > because the previous thread mentioned the EPQ problem.
> > in remove_useless_self_joins, i make it can only process CMD_SELECT query.
> I would like to oppose here: IMO, it is just a mishap which we made
> because of a long history of patch transformations. There we lost the
> case where RowMark exists for only one of candidate relations.
> Also, after review I think we don't need so many new tests. Specifically
> for DML we already have one:
>
> EXPLAIN (COSTS OFF)
> UPDATE sj sq SET b = 1 FROM sj as sz WHERE sq.a = sz.a;
>
> And we should just add something to elaborate it a bit.
> See the patch in attachment containing my proposal to improve v4-0001
> main SJE patch. I think it resolved the issue with EPQ assertion as well
> as problems with returning value.

I tried this.  I applied 0001 from [1] and 0002 from [2].  Then I
tried the concurrent test case [3].  It still fails with assert for
me.  But assert and related stuff is the least problem.  The big
problem, as described in [3], is semantical change in query.  When EPQ
is applied, we fetch the latest tuple of the target relation
regardless snapshot.  But for the self-joined relation we should still
use the snapshot-satisfying tuple.  I don't see even attempt to
address this in your patch.  And as I pointed before, this appears
quite complex.

Links.
1. https://www.postgresql.org/message-id/96250a42-20e3-40f0-9d45-f53ae852f8ed%40gmail.com
2. https://www.postgresql.org/message-id/5b49501c-9cb3-4c5d-9d56-49704ff08143%40gmail.com
3. https://www.postgresql.org/message-id/CAPpHfduM6X82ExT0r9UzFLJ12wOYPvRw5vT2Htq0gAPBgHhKeQ%40mail.gmail.com

------
Regards,
Alexander Korotkov
Supabase



pgsql-hackers by date:

Previous
From: Bertrand Drouvot
Date:
Subject: Re: Flush pgstats file during checkpoints
Next
From: Alexander Pyhalov
Date:
Subject: Partition-wise join with whole row vars