Re: Removing unneeded self joins - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Removing unneeded self joins
Date
Msg-id CA+TgmoYSRwjeTDeLY+OfMkKZOg0M6bBOh-L9UfZExu0xZVY_bA@mail.gmail.com
Whole thread Raw
In response to Re: Removing unneeded self joins  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: Removing unneeded self joins
List pgsql-hackers
On Fri, May 3, 2024 at 4:57 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> I agree to revert it for v17, but I'm not exactly sure the issue is
> design (nevertheless design review is very welcome as any other type
> of review).  The experience of the bugs arising with the SJE doesn't
> show me a particular weak spot in the feature.  It looks more like
> this patch has to revise awfully a lot planner data structures to
> replace one relid with another.  And I don't see the design, which
> could avoid that.  Somewhere in the thread I have proposed a concept
> of "alias relids".  However, I suspect that could leave us with more
> lurking bugs instead of the bug-free code.

I agree that reverting it for v17 makes sense. In terms of moving
forward, whether a design review is exactly the right idea or not, I'm
not sure. However, I think that the need to replace relids in a lot of
places is something that a design review might potentially flag as a
problem. Maybe there is some other approach that can avoid the need
for this.

On the other hand, maybe there's not. But in that case, the question
becomes how the patch author(s), and committer, are going to make sure
that most of the issues get flushed out before the initial commit.
What we can't do is say - we know that we need to replace relids in a
bunch of places, so we'll change the ones we know about, and then rely
on testing to find any that we missed. There has to be some kind of
systematic plan that everyone can agree should find all of the
affected places, and then if a few slip through, that's fine, that's
how life goes.

I haven't followed the self-join elimination work very closely, and I
do quite like the idea of the feature. However, looking over all the
follow-up commits, it's pretty hard to escape the conclusion that
there were a lot of cases that weren't adequately considered in the
initial work (lateral, result relations, PHVs, etc.). And that is a
big problem -- it really creates a lot of issues for the project when
a major feature commit misses whole areas that it needs to have
considered, as plenty of previous history will show. When anybody
starts to realize that they've not just had a few goofs but have
missed some whole affected area entirely, it's time to start thinking
about a revert.

One of my most embarrassing gaffes in this area personally was
a448e49bcbe40fb72e1ed85af910dd216d45bad8. I don't know how I managed
to commit the original patch without realizing it was going to cause
an increase in the WAL size, but I can tell you that when I realized
it, my heart sank through the floor. I'd love to return to that work
if we can all ever agree on a way of addressing that problem, but in
the meantime, that patch is very dead. And ... if somebody had taken
the time to give me a really good design review of that patch, they
might well have noticed, and saved me the embarrassment of committing
something that had no shot of remaining in the tree. Unfortunately,
one of the downsides of being a committer is that you tend to get less
of that sort of review, because people assume you know what you're
doing. Which is fabulous, when you actually do know what you're doing,
and really sucks, when you don't. One of the things I'd like to see
discussed at 2024.pgconf.dev is how we can improve this aspect of how
we work together.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: Add SPLIT PARTITION/MERGE PARTITIONS commands
Next
From: Peter Eisentraut
Date:
Subject: Re: Tarball builds in the new world order