New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX" - Mailing list pgsql-hackers

From Vitaly Burovoy
Subject New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"
Date
Msg-id CAKOSWN=DOVc0Y_+b+OZ8rGPWyebb6P9-U-FzZP1zpMz_D9frjw@mail.gmail.com
Whole thread Raw
Responses Re: New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
Hello hackers,

I want to implement a new feature that allows to decrease time when
table is under ExclusiveLock on ALTERing new constraints NOT NULL or
CHECK.

In Postgres9.1 a new feature was implemented [1] for adding PK and
UNIQUE constraints using indexes created concurrently, but constraints
NOT NULL and CHECK still require full seqscan of a table. New CHECK
constraint allows "NOT VALID" option but VALIDATE CONSTRAINT still
does seqscan (with RowExclusiveLock, but for big and constantly
updatable table it is still awful).

It is possible to find wrong rows in a table without seqscan if there
is an index with a predicate allows to find such rows. There is no
sense what columns it has since it is enough to check whether
index_getnext for it returns NULL (table is OK) or any tuple (table
has wrong rows).

Index must be BTREE (since it is not supposed to hold any data), valid
and has a predicate depending on a constraint type:
* for NOT NULL constraint predicate must be "(col) IS NULL";
* for CHECK constraint predicate must be "(expr) IS DISTINCT FROM
TRUE" (to cover both "(expr) IS NULL" and "NOT(expr)" cases).

I propose the next syntax ("action" in "ALTER TABLE"):
ALTER [ COLUMN ] column_name SET NOT NULL [ VERIFY USING INDEX index_name ]

and (all rows except the last one was got from a documentation[2])
ADD CONSTRAINT constraint_name   CHECK ( expression ) [ NO INHERIT ]   [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY
DEFERRED| INITIALLY IMMEDIATE ]   [ VERIFY USING INDEX index_name ]
 

The new nonreserved keyword "VERIFY" is necessary to avoid people be
confused what "USING INDEX" really does, and to show the index is not
needed after DDL finishes (verifies table) correctly.

I wasn't succeed in my search for similar feature (and such syntax) in
an SQL standard or in any existing DBMS.

I've done some research to be familiar with the source code (and be
ready to get advice) and have a working version for CHECK constraint.

My patch has a WIP state since I don't know how the community will
meet proposal feature/syntax, and some work is necessary for NOT NULL,
documentation and psql.

So I'm ready for a discussion.


P.S.: for NOT NULL it'll allow to do something like:

ALTER TABLE tablename ADD COLUMN newcol data_type;
ALTER TABLE tablename ALTER COLUMN newcol SET DEFAULT default_expr;
ALTER TABLE tablename ADD CONSTRAINT tablename_nonnull_chk CHECK
(newcol IS NOT NULL) NOT VALID;  --optional

CREATE INDEX CONCURRENTLY tablename_chk   ON tablename (id DESC)  -- PK col(s), "DESC" is to begin from the
oldest rows   WHERE newcol IS NULL;

-- query for repeat
UPDATE tablename SET newcol=DEFAULT
WHERE id IN (   SELECT id FROM tablename WHERE newcol IS NULL   ORDER BY id DESC LIMIT 100000   FOR UPDATE SKIP LOCKED
);
-- repeat above command until 0 rows is affected.

ALTER TABLE tablename ALTER COLUMN newcol SET NOT NULL VERIFY USING
INDEX tablename_chk;

DROP INDEX CONCURRENTLY tablename_chk;  -- if the command above succeed
ALTER TABLE tablename DROP CONSTRAINT tablename_nonnull_chk;  --optional

[1] http://www.postgresql.org/docs/9.1/static/release-9-1.html#AEN110279
[2] http://www.postgresql.org/docs/9.5/static/sql-createtable.html

-- 
Best regards,
Vitaly Burovoy



pgsql-hackers by date:

Previous
From: Rahila Syed
Date:
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Next
From: Simon Riggs
Date:
Subject: Re: New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"