Re: Improving worst-case merge join performance with often-null foreign key - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Improving worst-case merge join performance with often-null foreign key
Date
Msg-id CAMbWs49W7wq-9aFh6fX30mVabCt9c0XEEHd_LfBehFF54_QjWw@mail.gmail.com
In response to Re: Improving worst-case merge join performance with often-null foreign key  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Improving worst-case merge join performance with often-null foreign key
Re: Improving worst-case merge join performance with often-null foreign key
List pgsql-hackers

On Sat, Apr 22, 2023 at 11:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Steinar Kaldager <steinar.kaldager@oda.com> writes:
>> First-time potential contributor here. We recently had an incident due
>> to a sudden 1000x slowdown of a Postgres query (from ~10ms to ~10s)
>> due to a join with a foreign key that was often null. We found that it
>> was caused by a merge join with an index scan on one join path --
>> whenever the non-null data happened to be such that the merge join
>> couldn't be terminated early, the index would proceed to scan all of
>> the null rows and filter each one out individually. Since this was an
>> inner join, this was pointless; the nulls would never have matched the
>> join clause anyway.
>
> Hmm.  I don't entirely understand why the existing stop-at-nulls logic
> in nodeMergejoin.c didn't fix this for you.  Maybe somebody has broken
> that?  See the commentary for MJEvalOuterValues/MJEvalInnerValues.

I think it's simply that the MergeJoin never sees a NULL foo_id value
from test_bar: all such tuples are removed by the filter
'test_bar.active', so it never gets a chance to stop at nulls.

# select count(*) from test_bar where foo_id is null and active;
 count
-------
     0
(1 row)

Instead, the index scan on test_bar will have to scan all the tuples
with NULL foo_id because none of them satisfies the qual clause.
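
To make the interaction concrete, here is a toy Python model of the behaviour described above. It is only a sketch under stated assumptions, not the executor's actual code: the function name merge_join, the (foo_id, active) row layout, and the row counts are all made up for illustration, and NULL is modelled as None sorting last. The point it tries to show is that the join can only stop at a NULL key it actually receives, while the filter is applied below the join, inside the scan.

```python
_END = object()  # sentinel: the inner scan is exhausted


def merge_join(outer_keys, inner_rows):
    """Toy inner merge join on sorted inputs (hypothetical model).

    outer_keys: sorted, unique, non-NULL join keys.
    inner_rows: (foo_id, active) pairs in index order, NULLs (None) last.
    Returns (matched keys, number of inner index entries visited).
    """
    visited = 0

    def inner_scan():
        nonlocal visited
        for foo_id, active in inner_rows:
            visited += 1      # every index entry is fetched ...
            if active:        # ... but the filter hides inactive ones
                yield foo_id

    matches = []
    inner = inner_scan()
    key = next(inner, _END)
    for outer_key in outer_keys:
        # Advance the inner side until its key catches up with the outer key.
        while key is not _END and key is not None and key < outer_key:
            key = next(inner, _END)
        if key is _END or key is None:
            break             # end of input, or stop-at-nulls
        # Emit matches; this fetches one inner tuple past the last match.
        while key is not _END and key is not None and key == outer_key:
            matches.append(outer_key)
            key = next(inner, _END)
    return matches, visited


live = [(1, True), (2, True), (3, True)]
all_filtered = live + [(None, False)] * 1000
one_visible = live + [(None, True)] + [(None, False)] * 999

print(merge_join([1, 2], all_filtered))     # outer side runs out first
print(merge_join([1, 2, 3], all_filtered))  # join never sees a NULL key
print(merge_join([1, 2, 3], one_visible))   # join sees a NULL and stops
```

In this model the first call visits 3 index entries, the second 1003, and the third 4. The second case corresponds to the count(*) = 0 result above: no NULL foo_id ever survives the filter, so the stop-at-nulls check is never reached.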

Thanks
Richard
