[Patch] optimizer - simplify $VAR1 IS NULL AND $VAR1 IS NOT NULL - Mailing list pgsql-hackers

From Pierre Ducroquet
Subject [Patch] optimizer - simplify $VAR1 IS NULL AND $VAR1 IS NOT NULL
Date
Msg-id 5177016.MnsixQzrC5@peanuts2
Whole thread Raw
Responses Re: [Patch] optimizer - simplify $VAR1 IS NULL AND $VAR1 IS NOT NULL  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
List pgsql-hackers
Hello

In several queries relying on views, I noticed that the optimizer miss a quite
simple to implement optimization. My views contain several branches, with
different paths that are simplified by the caller of the view. This
simplification is based on columns to be null or not.

Today, even with a single table, the following (silly) query is not optimized
away:
    SELECT * FROM test WHERE a IS NULL AND a IS NOT NULL;

In more complex cases, it of course isn't any better:
    SELECT * FROM (
        SELECT a, NULL::integer AS b FROM foo
      UNION ALL
        SELECT a, b FROM bar WHERE b IS NOT NULL
    ) WHERE a = 1 AND b IS NULL;

The attached patch handles both situations. When flattening and simplifying
the AND clauses, a list of the NullChecks is built, and subsequent NullChecks
are compared to the list. If opposite NullChecks on the same variable are
found, the whole AND is optimized away.
This lead to nice boosts, since instead of having 'never executed' branches,
the optimizer can go even further. Right now, the algorithmic complexity of
this optimization is not great: it is in O(n²), with n being the number of
NullCheck in a given AND clause. But compared to the possible benefits, and
the very low risk of n being high enough to have a real planification-time
impact, I feel this optimization would be worth it.


Regards

 Pierre

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: tableam vs. TOAST
Next
From: Andrew Gierth
Date:
Subject: Re: [Patch] optimizer - simplify $VAR1 IS NULL AND $VAR1 IS NOT NULL