Thread: New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"

New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"

From

Vitaly Burovoy

Date:

08 January 2016, 15:49:31

Hello hackers,

I want to implement a new feature that allows to decrease time when
table is under ExclusiveLock on ALTERing new constraints NOT NULL or
CHECK.

In Postgres9.1 a new feature was implemented [1] for adding PK and
UNIQUE constraints using indexes created concurrently, but constraints
NOT NULL and CHECK still require full seqscan of a table. New CHECK
constraint allows "NOT VALID" option but VALIDATE CONSTRAINT still
does seqscan (with RowExclusiveLock, but for big and constantly
updatable table it is still awful).

It is possible to find wrong rows in a table without seqscan if there
is an index with a predicate allows to find such rows. There is no
sense what columns it has since it is enough to check whether
index_getnext for it returns NULL (table is OK) or any tuple (table
has wrong rows).

Index must be BTREE (since it is not supposed to hold any data), valid
and has a predicate depending on a constraint type:
* for NOT NULL constraint predicate must be "(col) IS NULL";
* for CHECK constraint predicate must be "(expr) IS DISTINCT FROM
TRUE" (to cover both "(expr) IS NULL" and "NOT(expr)" cases).

I propose the next syntax ("action" in "ALTER TABLE"):
ALTER [ COLUMN ] column_name SET NOT NULL [ VERIFY USING INDEX index_name ]

and (all rows except the last one was got from a documentation[2])
ADD CONSTRAINT constraint_name   CHECK ( expression ) [ NO INHERIT ]   [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY
DEFERRED| INITIALLY IMMEDIATE ]   [ VERIFY USING INDEX index_name ]
 

The new nonreserved keyword "VERIFY" is necessary to avoid people be
confused what "USING INDEX" really does, and to show the index is not
needed after DDL finishes (verifies table) correctly.

I wasn't succeed in my search for similar feature (and such syntax) in
an SQL standard or in any existing DBMS.

I've done some research to be familiar with the source code (and be
ready to get advice) and have a working version for CHECK constraint.

My patch has a WIP state since I don't know how the community will
meet proposal feature/syntax, and some work is necessary for NOT NULL,
documentation and psql.

So I'm ready for a discussion.


P.S.: for NOT NULL it'll allow to do something like:

ALTER TABLE tablename ADD COLUMN newcol data_type;
ALTER TABLE tablename ALTER COLUMN newcol SET DEFAULT default_expr;
ALTER TABLE tablename ADD CONSTRAINT tablename_nonnull_chk CHECK
(newcol IS NOT NULL) NOT VALID;  --optional

CREATE INDEX CONCURRENTLY tablename_chk   ON tablename (id DESC)  -- PK col(s), "DESC" is to begin from the
oldest rows   WHERE newcol IS NULL;

-- query for repeat
UPDATE tablename SET newcol=DEFAULT
WHERE id IN (   SELECT id FROM tablename WHERE newcol IS NULL   ORDER BY id DESC LIMIT 100000   FOR UPDATE SKIP LOCKED
);
-- repeat above command until 0 rows is affected.

ALTER TABLE tablename ALTER COLUMN newcol SET NOT NULL VERIFY USING
INDEX tablename_chk;

DROP INDEX CONCURRENTLY tablename_chk;  -- if the command above succeed
ALTER TABLE tablename DROP CONSTRAINT tablename_nonnull_chk;  --optional

[1] http://www.postgresql.org/docs/9.1/static/release-9-1.html#AEN110279
[2] http://www.postgresql.org/docs/9.5/static/sql-createtable.html

-- 
Best regards,
Vitaly Burovoy

Re: New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"

From

Simon Riggs

Date:

08 January 2016, 16:00:39

On 8 January 2016 at 12:49, Vitaly Burovoy <vitaly.burovoy@gmail.com> wrote:

In Postgres9.1 a new feature was implemented [1] for adding PK and
UNIQUE constraints using indexes created concurrently, but constraints
NOT NULL and CHECK still require full seqscan of a table. New CHECK
constraint allows "NOT VALID" option but VALIDATE CONSTRAINT still
does seqscan (with RowExclusiveLock, but for big and constantly
updatable table it is still awful).

It is possible to find wrong rows in a table without seqscan if there
is an index with a predicate allows to find such rows. There is no
sense what columns it has since it is enough to check whether
index_getnext for it returns NULL (table is OK) or any tuple (table
has wrong rows).

You avoid a full seqscan by creating an index which also does a full seq scan.

How does this help? The lock and scan times are the same.

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"

From

Vitaly Burovoy

Date:

08 January 2016, 16:13:32

On 1/8/16, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 8 January 2016 at 12:49, Vitaly Burovoy <vitaly.burovoy@gmail.com>
> wrote:
>
>
>> In Postgres9.1 a new feature was implemented [1] for adding PK and
>> UNIQUE constraints using indexes created concurrently, but constraints
>> NOT NULL and CHECK still require full seqscan of a table. New CHECK
>> constraint allows "NOT VALID" option but VALIDATE CONSTRAINT still
>> does seqscan (with RowExclusiveLock, but for big and constantly
>> updatable table it is still awful).
>>
>> It is possible to find wrong rows in a table without seqscan if there
>> is an index with a predicate allows to find such rows. There is no
>> sense what columns it has since it is enough to check whether
>> index_getnext for it returns NULL (table is OK) or any tuple (table
>> has wrong rows).
>>
>
> You avoid a full seqscan by creating an index which also does a full seq
> scan.
>
> How does this help? The lock and scan times are the same.
>
> --
> Simon Riggs                http://www.2ndQuadrant.com/
> <http://www.2ndquadrant.com/>
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

I avoid not a full seqscan, but a time when table is under
ExclusiveLock: index can be build concurrently without locking table.

-- 
Best regards,
Vitaly Burovoy

Re: New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"

From

Simon Riggs

Date:

08 January 2016, 16:28:20

On 8 January 2016 at 13:13, Vitaly Burovoy <vitaly.burovoy@gmail.com> wrote:

On 1/8/16, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 8 January 2016 at 12:49, Vitaly Burovoy <vitaly.burovoy@gmail.com>
> wrote:
>
>
>> In Postgres9.1 a new feature was implemented [1] for adding PK and
>> UNIQUE constraints using indexes created concurrently, but constraints
>> NOT NULL and CHECK still require full seqscan of a table. New CHECK
>> constraint allows "NOT VALID" option but VALIDATE CONSTRAINT still
>> does seqscan (with RowExclusiveLock, but for big and constantly
>> updatable table it is still awful).
>>
>> It is possible to find wrong rows in a table without seqscan if there
>> is an index with a predicate allows to find such rows. There is no
>> sense what columns it has since it is enough to check whether
>> index_getnext for it returns NULL (table is OK) or any tuple (table
>> has wrong rows).
>>
>
> You avoid a full seqscan by creating an index which also does a full seq
> scan.
>
> How does this help? The lock and scan times are the same.

I avoid not a full seqscan, but a time when table is under
ExclusiveLock: index can be build concurrently without locking table.

That is exactly what ADD ...NOT VALID and VALIDATE already does, as of 9.4.

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"

From

Vitaly Burovoy

Date:

08 January 2016, 17:15:00

On 1/8/16, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 8 January 2016 at 13:13, Vitaly Burovoy <vitaly.burovoy@gmail.com>
> wrote:
>
>> On 1/8/16, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > On 8 January 2016 at 12:49, Vitaly Burovoy <vitaly.burovoy@gmail.com>
>> > wrote:
>> >
>> >
>> >> In Postgres9.1 a new feature was implemented [1] for adding PK and
>> >> UNIQUE constraints using indexes created concurrently, but constraints
>> >> NOT NULL and CHECK still require full seqscan of a table. New CHECK
>> >> constraint allows "NOT VALID" option but VALIDATE CONSTRAINT still
>> >> does seqscan (with RowExclusiveLock, but for big and constantly
>> >> updatable table it is still awful).
>> >>
>> >> It is possible to find wrong rows in a table without seqscan if there
>> >> is an index with a predicate allows to find such rows. There is no
>> >> sense what columns it has since it is enough to check whether
>> >> index_getnext for it returns NULL (table is OK) or any tuple (table
>> >> has wrong rows).
>> >>
>> >
>> > You avoid a full seqscan by creating an index which also does a full
>> > seq
>> > scan.
>> >
>> > How does this help? The lock and scan times are the same.
>>
>> I avoid not a full seqscan, but a time when table is under
>> ExclusiveLock: index can be build concurrently without locking table.
>
>
> That is exactly what ADD ...NOT VALID  and VALIDATE already does, as of
> 9.4.
>
> --
> Simon Riggs                http://www.2ndQuadrant.com/
> <http://www.2ndquadrant.com/>
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Hmm... It really does. I was confused by a line in ATExecValidateConstraint
conrel = heap_open(ConstraintRelationId, RowExclusiveLock);

but validateCheckConstraint doesn't do anything for applying the lock to a row.

What about SET NOT NULL constraints? There is no VALIDATE CONSTRAINT for it.
-- 
Best regards,
Vitaly Burovoy

Re: New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"

From

Simon Riggs

Date:

08 January 2016, 17:32:43

On 8 January 2016 at 14:14, Vitaly Burovoy <vitaly.burovoy@gmail.com> wrote:

What about SET NOT NULL constraints? There is no VALIDATE CONSTRAINT for it.

Sounds like a useful addition.

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"

From

Alvaro Herrera

Date:

08 January 2016, 17:39:41

Simon Riggs wrote:
> On 8 January 2016 at 14:14, Vitaly Burovoy <vitaly.burovoy@gmail.com> wrote:
> 
> > What about SET NOT NULL constraints? There is no VALIDATE CONSTRAINT for
> > it.
> 
> Sounds like a useful addition.

Yes.  In order to make it a reality you need to make the NOT NULL
constraints appear in pg_constraint.  Years ago I wrote a patch to do
that, which was very close to done.  It would be really cool if Vitaly
or someone else could look into that patch, update it and get it ready
for commit.

If someone is interested, I can send the patch along.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"

From

Vitaly Burovoy

Date:

09 January 2016, 05:41:34

On 1/8/16, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Simon Riggs wrote:
>> On 8 January 2016 at 14:14, Vitaly Burovoy <vitaly.burovoy@gmail.com>
>> wrote:
>>
>> > What about SET NOT NULL constraints? There is no VALIDATE CONSTRAINT
>> > for
>> > it.
>>
>> Sounds like a useful addition.
>
> Yes.  In order to make it a reality you need to make the NOT NULL
> constraints appear in pg_constraint.  Years ago I wrote a patch to do
> that, which was very close to done.  It would be really cool if Vitaly
> or someone else could look into that patch, update it and get it ready
> for commit.
>
> If someone is interested, I can send the patch along.
>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

I guess you are talking about the other thread[1].
I'm not sure I have enough experience in Postgres hacking to start
working on it right now, but I'll have a look.
IMO the best way is to raise that topic by a letter with summary what
troubles are left there (in addition to a rebasing one).


[1] http://www.postgresql.org/message-id/flat/20110707213401.GA27098@alvh.no-ip.org

--
Best regards,
Vitaly Burovoy

Re: New feature "... ALTER CONSTRAINT ... VERIFY USING INDEX"

From

Alvaro Herrera

Date:

09 January 2016, 06:00:32

Vitaly Burovoy wrote:

> I guess you are talking about the other thread[1].
> I'm not sure I have enough experience in Postgres hacking to start
> working on it right now, but I'll have a look.
> IMO the best way is to raise that topic by a letter with summary what
> troubles are left there (in addition to a rebasing one).
>
>
> [1] http://www.postgresql.org/message-id/flat/20110707213401.GA27098@alvh.no-ip.org

Here's a newer thread:
https://www.postgresql.org/message-id/1343682669-sup-2532%40alvh.no-ip.org
which includes some points needing additional work.

In my local tree I have a branch that maybe is not the same as the last
patch I posted in that thread, because the last commit has a newer date.
Here I attach what that branch has.  It applies cleanly on top of
03bda4535ee119d3.  I don't remember if it was in compilable state at the
time I abandoned it.

It needs some work to rebase, but from a quick experimental "git merge"
I just ran, it's not too bad.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

catalog-notnull-2.patch