Improving worst-case merge join performance with often-null foreign key - Mailing list pgsql-hackers

From Steinar Kaldager
Subject Improving worst-case merge join performance with often-null foreign key
Date
Msg-id CANcDffdhmnt+6mB0G+=oF_1tGbr+2RkRokjjORxG67BYjiKTkw@mail.gmail.com
Whole thread Raw
Responses Re: Improving worst-case merge join performance with often-null foreign key  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

First-time potential contributor here. We recently had an incident due
to a sudden 1000x slowdown of a Postgres query (from ~10ms to ~10s)
due to a join with a foreign key that was often null. We found that it
was caused by a merge join with an index scan on one join path --
whenever the non-null data happened to be such that the merge join
couldn't be terminated early, the index would proceed to scan all of
the null rows and filter each one out individually. Since this was an
inner join, this was pointless; the nulls would never have matched the
join clause anyway.

Test script to reproduce + example explain output:
https://paste.depesz.com/s/VUj

Once you're aware of it for a given index, this is a solvable issue.
We solved it by adding a partial index. Including an IS NOT NULL
clause in the query also seems to solve it.

However, I've gotten the notion that it should be possible to solve
this on the Postgres side, with no user action required. When doing an
inner-join merge join with a normal equality operator, we should
already "know" that we don't have to look at rows where the join keys
are null, since any such comparison involving NULL will not return
true. If we can just push that knowledge down to the index scan node,
then we should be able to avoid this entire problem, leaving one less
performance trap to trip people up.

Proof-of-concept patch (please ignore style):
https://github.com/steinarvk-oda/postgres/pull/1/files

I have a few questions:

(1) Does this approach seem reasonable and worthwhile?
(2) Can I determine programmatically at query planning time whether
the binary operator in an OpExpr has the property that all comparisons
involving nulls will be non-true? Or, failing that, can I at least
somehow hardcode and identify the built-in equality operators (which
have this property)? (Integer equality and UUID equality probably
covers a lot when it comes to joins.)
(3) Is there a systematic way to check whether adding an "IS NOT NULL"
condition would be redundant? E.g. if there is such a check in the
query already, adding another is just messy.
(4) Should this sort of thing be made conditional on a high null_frac?
Or is that unnecessary complexity, and the simplicity of just always
doing it would be better?

Thanks!
Steinar Kaldager



pgsql-hackers by date:

Previous
From: Ranier Vilela
Date:
Subject: Re: Improve list manipulation in several places
Next
From: Jeff Davis
Date:
Subject: Re: Order changes in PG16 since ICU introduction