Re: Pulling up sublink may break join-removal logic - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Pulling up sublink may break join-removal logic
Msg-id CAMbWs4_K+H2SYzNPsMFXPqaRyP0Hyy+ZbnL0H0mwrBNZv1Zeyg@mail.gmail.com
In response to Re: Pulling up sublink may break join-removal logic  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Pulling up sublink may break join-removal logic
List pgsql-hackers

On Wed, Apr 29, 2020 at 8:23 AM David Rowley <dgrowleyml@gmail.com> wrote:
> On Tue, 28 Apr 2020 at 19:04, Richard Guo <guofenglinux@gmail.com> wrote:
> > I happened to notice $subject and not sure if it's an issue or not. When
> > we're trying to remove a LEFT JOIN, one of the requirements is the inner
> > side needs to be a single baserel. If there is a join qual that is a
> > sublink and can be converted to a semi join with the inner side rel, the
> > inner side would no longer be a single baserel and as a result the LEFT
> > JOIN can no longer be removed.
>
> I think, in theory at least, that can be fixed by [1], where we no
> longer rely on looking to see if the RelOptInfo has a unique index to
> determine if the relation can duplicate outer side rows during the
> join. Of course, they'll only exist on base relations, so hence the
> check you're talking about. Instead, the patch's idea is to propagate
> uniqueness down the join tree in the form of UniqueKeys.

Do you mean we're tracking the uniqueness of each RelOptInfo, baserel or
joinrel, with UniqueKeys? I like the idea!
 

> A quick glance shows there are a few implementation details of join
> removals of why the removal still won't work with [1].  For example,
> the singleton rel check causes it to abort both on the pre-check and
> the final join removal check.  There's also the removal itself that
> assumes we're just removing a single relation. I'd guess that would
> need to loop over the min_righthand relids with a bms_next_member loop
> and remove each base rel one by one.  I'd need to look in more detail
> to know if there are any other limiting factors there.

Yeah, we'll have to teach remove_useless_joins to work with multiple
relids.
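
For illustration only, here is a standalone sketch of the loop shape
described above: walk every member of the min_righthand set and handle
the base rels one at a time. It is not planner code. RelidSet,
relidset_next_member() and remove_rels_one_by_one() are hypothetical
stand-ins I made up so the example compiles on its own; the only thing
borrowed from PostgreSQL is the bms_next_member() convention of starting
at -1 and stopping when the result is negative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: one 64-bit word, members 0..63. */
typedef uint64_t RelidSet;

/*
 * Follows the bms_next_member() convention: return the smallest member
 * greater than prevbit, or -2 if there is none.  Callers start with -1.
 */
static int
relidset_next_member(RelidSet set, int prevbit)
{
    for (int bit = prevbit + 1; bit < 64; bit++)
    {
        if (set & (UINT64_C(1) << bit))
            return bit;
    }
    return -2;
}

/*
 * Hypothetical driver: visit each relid in min_righthand in ascending
 * order.  A real patch would remove the base rel here; this sketch only
 * records the relid.  Returns the number of relids recorded.
 */
static int
remove_rels_one_by_one(RelidSet min_righthand, int *removed, int max_removed)
{
    int         nremoved = 0;
    int         relid = -1;

    assert(removed != NULL);

    while ((relid = relidset_next_member(min_righthand, relid)) >= 0)
    {
        if (nremoved >= max_removed)
            break;              /* caller's array is full */
        removed[nremoved++] = relid;
    }
    return nremoved;
}
```

With a set containing relids 2 and 5, the loop visits 2 and then 5 and
stops. Whether the real removal code can simply be called once per relid
is exactly the open question, so treat the loop body as a placeholder.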

Thanks
Richard
