Reduce planning time for large NOT IN lists containing NULL - Mailing list pgsql-hackers

From Ilia Evdokimov
Subject Reduce planning time for large NOT IN lists containing NULL
Date
Msg-id eaa2598c-5356-4e1e-9ec3-5fd6eb1cd704@tantorlabs.com
Whole thread Raw
Responses Re: Reduce planning time for large NOT IN lists containing NULL
List pgsql-hackers
Hi hackers,

In this thread [0] an interesting idea came up about avoiding 
unnecessary work during selectivity estimation for x <> ALL (NULL, ...) 
or x NOT IN (NULL, ...)

Semantically, if the array contains at least one NULL, the selectivity 
of x NOT IN (...) is always 0.0, regardless of the other elements in the 
list.

Currently, the planner still iterates over all array elements and 
invokes the operator's selectivity estimator for each of them. For large 
IN / ALL lists, this increases planning time.

For constant arrays I propose adding a simple pre-check before entering 
the per-element loop: detect whether the array contains at least one 
NULL element (e.g., via memchr() for the deconstructed array case). If 
so, and we are in the ALL / NOT IN case, we can immediately return 
selectivity = 0.0 and skip all further computation. This would avoid 
extra per-element estimation work while preserving semantics.

In cases where array elements are not known to be constants in advance, 
such a pre-check is less straightforward, because each element must 
first be inspected to determine whether it is a Const and whether it is 
NULL. That already requires iterating through the list, so introducing a 
separate early pass would not actually reduce the amount of work. 
Therefore, it like makes sense to keep the current behavior in that 
situation.

Thoughts?

[0]: 
https://www.postgresql.org/message-id/CAHza6qdAH99C0xa27YDTixiNVFa99j90QYquUPAxL0JwahmggA%40mail.gmail.com

-- 
Best regards.
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/

Attachment

pgsql-hackers by date:

Previous
From: Nazir Bilal Yavuz
Date:
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
Next
From: Aleksander Alekseev
Date:
Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions