Re: cataloguing NOT NULL constraints - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: cataloguing NOT NULL constraints
Date
Msg-id 20230831100239.gzjmiovzabx2diuo@alvherre.pgsql
Whole thread Raw
In response to Re: cataloguing NOT NULL constraints  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Responses Re: cataloguing NOT NULL constraints
Re: cataloguing NOT NULL constraints
List pgsql-hackers
On 2023-Mar-29, Peter Eisentraut wrote:

> On 27.03.23 15:55, Peter Eisentraut wrote:
> > The information schema should be updated.  I think the following views:
> > 
> > - CHECK_CONSTRAINTS
> > - CONSTRAINT_COLUMN_USAGE
> > - DOMAIN_CONSTRAINTS
> > - TABLE_CONSTRAINTS
> > 
> > It looks like these have no test coverage; maybe that could be addressed
> > at the same time.
> 
> Here are patches for this.  I haven't included the expected files for the
> tests; this should be checked again that output is correct or the changes
> introduced by this patch set are as expected.
> 
> The reason we didn't have tests for this before was probably in part because
> the information schema made up names for not-null constraints involving
> OIDs, so the test wouldn't have been stable.
> 
> Feel free to integrate this, or we can add it on afterwards.

I'm eyeing patch 0002 here.  I noticed that in view check_constraints it
defines the not-null constraint definition as substrings over the
pg_get_constraintdef() function[q1], so I wondered whether it might be
better to join to pg_attribute instead.  I see two options:

1. add a scalar subselect in the select list for each constraint [q2]
2. add a LEFT JOIN to pg_attribute to the main FROM list [q3]
   ON con.conrelid=att.attrelid AND con.conkey[1] = con.attrelid

With just the regression test tables in place, these forms are all
pretty much the same in execution time.  I then created 20k tables with
6 columns each and NOT NULL constraint on each column[4].  That's not a
huge amount but it's credible for a medium-size database with a bunch of
partitions (it's amazing what passes for a medium-size database these
days).  I was surprised to find out that q3 (~130ms) is three times
faster than q2 (~390ms), which is in turn more than twice faster than
your proposed q1 (~870ms).  So unless you have another reason to prefer
it, I think we should use q3 here.


In constraint_column_usage, you're adding a relkind to the catalog scan
that goes through pg_depend for CHECK constraints.  Here we can rely on
a simple conkey[1] check and a separate UNION ALL arm[q5]; this is also
faster when there are many tables.

The third view definition looks ok.  It's certainly very nice to be able
to delete XXX comments there.


[q1]
SELECT current_database()::information_schema.sql_identifier AS constraint_catalog,
    rs.nspname::information_schema.sql_identifier AS constraint_schema,
    con.conname::information_schema.sql_identifier AS constraint_name,
        CASE con.contype
            WHEN 'c'::"char" THEN "left"(SUBSTRING(pg_get_constraintdef(con.oid) FROM 8), '-1'::integer)
            WHEN 'n'::"char" THEN SUBSTRING(pg_get_constraintdef(con.oid) FROM 10) || ' IS NOT NULL'::text
                              
 
            ELSE NULL::text
        END::information_schema.character_data AS check_clause
   FROM pg_constraint con
     LEFT JOIN pg_namespace rs ON rs.oid = con.connamespace
     LEFT JOIN pg_class c ON c.oid = con.conrelid
     LEFT JOIN pg_type t ON t.oid = con.contypid
               
 
  WHERE pg_has_role(COALESCE(c.relowner, t.typowner), 'USAGE'::text) AND (con.contype = ANY (ARRAY['c'::"char",
'n'::"char"]));

[q2]
SELECT current_database()::information_schema.sql_identifier AS constraint_catalog,
    rs.nspname::information_schema.sql_identifier AS constraint_schema,
    con.conname::information_schema.sql_identifier AS constraint_name,
        CASE con.contype
            WHEN 'c'::"char" THEN "left"(SUBSTRING(pg_get_constraintdef(con.oid) FROM 8), '-1'::integer)
            WHEN 'n'::"char" THEN FORMAT('CHECK (%s IS NOT NULL)',
                                         (SELECT attname FROM pg_attribute WHERE attrelid = conrelid AND attnum =
conkey[1]))
            ELSE NULL::text
        END::information_schema.character_data AS check_clause
   FROM pg_constraint con
     LEFT JOIN pg_namespace rs ON rs.oid = con.connamespace
     LEFT JOIN pg_class c ON c.oid = con.conrelid
     LEFT JOIN pg_type t ON t.oid = con.contypid
  WHERE pg_has_role(COALESCE(c.relowner, t.typowner), 'USAGE'::text) AND (con.contype = ANY (ARRAY['c'::"char",
'n'::"char"]));

[q3]
SELECT current_database()::information_schema.sql_identifier AS constraint_catalog,
    rs.nspname::information_schema.sql_identifier AS constraint_schema,
    con.conname::information_schema.sql_identifier AS constraint_name,
        CASE con.contype
            WHEN 'c'::"char" THEN "left"(SUBSTRING(pg_get_constraintdef(con.oid) FROM 8), '-1'::integer)
            WHEN 'n'::"char" THEN FORMAT('CHECK (%s IS NOT NULL)', at.attname)
                              
 
            ELSE NULL::text
        END::information_schema.character_data AS check_clause
   FROM pg_constraint con
     LEFT JOIN pg_namespace rs ON rs.oid = con.connamespace
     LEFT JOIN pg_class c ON c.oid = con.conrelid
     LEFT JOIN pg_type t ON t.oid = con.contypid
     LEFT JOIN pg_attribute at ON (con.conrelid = at.attrelid AND con.conkey[1] = at.attnum)
  WHERE pg_has_role(COALESCE(c.relowner, t.typowner), 'USAGE'::text) AND (con.contype = ANY (ARRAY['c'::"char",
'n'::"char"]));

[4]
do $$ begin for i in 0 .. 20000 loop
 execute format('create table t_%s (a1 int not null, a2 int not null, a3 int not null,
    a4 int not null, a5 int not null, a6 int not null);',
   i);
 if i % 1000 = 0 then commit; end if;
end loop; end $$;

[q5]
SELECT CAST(current_database() AS sql_identifier) AS table_catalog,
            CAST(tblschema AS sql_identifier) AS table_schema,
            CAST(tblname AS sql_identifier) AS table_name,
            CAST(colname AS sql_identifier) AS column_name,
            CAST(current_database() AS sql_identifier) AS constraint_catalog,
            CAST(cstrschema AS sql_identifier) AS constraint_schema,
            CAST(cstrname AS sql_identifier) AS constraint_name

     FROM (
         /* check constraints */
         SELECT DISTINCT nr.nspname, r.relname, r.relowner, a.attname, nc.nspname, c.conname
           FROM pg_namespace nr, pg_class r, pg_attribute a, pg_depend d, pg_namespace nc, pg_constraint c
           WHERE nr.oid = r.relnamespace
             AND r.oid = a.attrelid
             AND d.refclassid = 'pg_catalog.pg_class'::regclass
             AND d.refobjid = r.oid
             AND d.refobjsubid = a.attnum
             AND d.classid = 'pg_catalog.pg_constraint'::regclass
             AND d.objid = c.oid
             AND c.connamespace = nc.oid
             AND c.contype = 'c'
             AND r.relkind IN ('r', 'p')
             AND NOT a.attisdropped

        UNION ALL

        /* not-null constraints */
        SELECT DISTINCT nr.nspname, r.relname, r.relowner, a.attname, nc.nspname, c.conname
          FROM pg_namespace nr, pg_class r, pg_attribute a, pg_namespace nc, pg_constraint c
          WHERE nr.oid = r.relnamespace
        AND r.oid = a.attrelid
        AND r.oid = c.conrelid
        AND a.attnum = c.conkey[1]
        AND c.connamespace = nc.oid
        AND c.contype = 'n'
        AND r.relkind in ('r', 'p')
        AND not a.attisdropped

         UNION ALL

         /* unique/primary key/foreign key constraints */
         SELECT nr.nspname, r.relname, r.relowner, a.attname, nc.nspname, c.conname
           FROM pg_namespace nr, pg_class r, pg_attribute a, pg_namespace nc,
                pg_constraint c
           WHERE nr.oid = r.relnamespace
             AND r.oid = a.attrelid
             AND nc.oid = c.connamespace
             AND r.oid = CASE c.contype WHEN 'f' THEN c.confrelid ELSE c.conrelid END
             AND a.attnum = ANY (CASE c.contype WHEN 'f' THEN c.confkey ELSE c.conkey END)
             AND NOT a.attisdropped
             AND c.contype IN ('p', 'u', 'f')
             AND r.relkind IN ('r', 'p')

       ) AS x (tblschema, tblname, tblowner, colname, cstrschema, cstrname)

     WHERE pg_has_role(x.tblowner, 'USAGE') ;

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



pgsql-hackers by date:

Previous
From: Alexander Lakhin
Date:
Subject: Re: cataloguing NOT NULL constraints
Next
From: Heikki Linnakangas
Date:
Subject: Re: Sync scan & regression tests