Thread: Re: Parameter NOT NULL to CREATE DOMAIN not the same as CHECK (VALUE IS NOT NULL)

On Fri, Jan  3, 2025 at 01:39:44PM +0000, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
> 
> Page: https://www.postgresql.org/docs/17/sql-createdomain.html
> Description:
> 
> The manual claims:
> 
> The syntax NOT NULL in this command is a PostgreSQL extension. (A
> standard-conforming way to write the same would be CHECK (VALUE IS NOT
> NULL). […])
> 
> But both variants differ when composite types are involved:
> 
> CREATE TYPE complex AS (real float8, imag float8);
> 
> CREATE DOMAIN d1 AS complex NOT NULL;
> CREATE DOMAIN d2 AS complex CHECK (VALUE IS NOT NULL);
> 
> SELECT '(,)'::d1; -- allowed
> SELECT '(,)'::d2; -- not allowed

(Theead moved to hackers since there are correctness questions here.)

Wow, I dug into this and found even more problems.  First, your examples
in master return what you reported:

    CREATE TYPE complex AS (real float8, imag float8);

    CREATE DOMAIN d1 AS complex NOT NULL;
    CREATE DOMAIN d2 AS complex CHECK (VALUE IS NOT NULL);

-->    SELECT '(,)'::d1; -- allowed
     d1
    -----
     (,)

    SELECT '(,)'::d2; -- not allowed
    ERROR:  value for domain d2 violates check constraint "d2_check"

I added some tests without DOMAINs and found further problems,
specifically the ability to put a value that tests IS NULL as true in a
NOT NULL composite-type column, and not honoring WHERE col IS NULL
tests:

    CREATE TABLE complex_test (col complex NOT NULL);

    -- accepts NULL in a NOT NULL column
    INSERT INTO complex_test VALUES ('(,)');

    -- proof it is NULL
    SELECT col, col IS NULL FROM complex_test;
     col | ?column?
    -----+----------
     (,) | t
    
    -- NOT NULL column returns NULL value
    SELECT col, col IS NULL FROM complex_test WHERE col IS NOT NULL;
     col | ?column?
    -----+----------
     (,) | t
    
    EXPLAIN SELECT col, col IS NULL FROM complex_test WHERE col IS NOT NULL;
                               QUERY PLAN
    -----------------------------------------------------------------
     Seq Scan on complex_test  (cost=0.00..23.60 rows=1360 width=33)
    
    -- IS NULL does not return NULL value
    SELECT col, col IS NULL FROM complex_test WHERE col IS NULL;
     col | ?column?
    -----+----------
    
    -- optimization in PG 17 prevents any comparison to NULL
    EXPLAIN SELECT col, col IS NULL FROM complex_test WHERE col IS NULL;
                    QUERY PLAN
    -------------------------------------------
     Result  (cost=0.00..0.00 rows=0 width=33)
       One-Time Filter: false

With the optimizations in PG 17 for NULL checks:

    https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-OPTIMIZER
-->    Remove IS NOT NULL restrictions from queries on NOT NULL columns and
-->    eliminate scans on NOT NULL columns if IS NULL is specified.

I see different output in pre-PG 17, so I would say this got worse in PG
17+ because I think the IS NULL and IS NOT NULL are being removed during
optimization.  Notice the IS [NOT] NULL checks that appear in the
EXPLAIN output below, but not above:


    SELECT col, col IS NULL FROM complex_test WHERE col IS NOT NULL;
     col | ?column?
    -----+----------
    
    EXPLAIN SELECT col, col IS NULL FROM complex_test WHERE col IS NOT NULL;
                               QUERY PLAN
    -----------------------------------------------------------------
     Seq Scan on complex_test  (cost=0.00..23.60 rows=1353 width=33)
       Filter: (col IS NOT NULL)
    
    SELECT col, col IS NULL FROM complex_test WHERE col IS NULL;
     col | ?column?
    -----+----------
     (,) | t
    
    EXPLAIN SELECT col, col IS NULL FROM complex_test WHERE col IS NULL;
                              QUERY PLAN
    --------------------------------------------------------------
     Seq Scan on complex_test  (cost=0.00..23.60 rows=7 width=33)
       Filter: (col IS NULL)

I think this needs some serious research.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.





Bruce Momjian <bruce@momjian.us> writes:
> I think this needs some serious research.

We've discussed this topic before.  The spec's definition of IS [NOT]
NULL for composite values is bizarre to say the least.  I think
there's been an intentional choice to keep most NOT NULL checks
"simple", that is we look at the overall value's isnull bit and
don't probe any deeper than that.

If the optimizations added in v17 changed existing behavior,
I agree that's bad.  We should probably fix it so that those
are only applied when argisrow is false.

            regards, tom lane