Thread: Why we allow CHECK constraint contradiction?

Why we allow CHECK constraint contradiction?

From

"Imai, Yoshikazu"

Date:

10 October 2018, 05:25:27

Hi, all.

I have a wonder about the behaviour of creating table which has a constraint
contradiction.

I created below table.

bugtest=# create table ct (a int, CHECK(a is not null and a >= 0 and a < 100 and a >= 200 and a < 300));
bugtest=# \d+ ct
                                    Table "public.ct"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
--------+---------+-----------+----------+---------+---------+--------------+-------------
 a      | integer |           |          |         | plain   |              | 
Check constraints:
    "ct_a_check" CHECK (a IS NOT NULL AND a >= 0 AND a < 100 AND a >= 200 AND a < 300)


Are there any rows which can satisfy the ct's CHECK constraint? If not, why we
allow creating table when check constraint itself is contradicted?


I originally noticed this while creating partitioned range table as below.

bugtest=# create table rt (a int) partition by range (a);
bugtest=# create table rt_sub1 partition of rt for values from (0) to (100) partition by range (a);
bugtest=# create table rt_sub2 partition of rt for values from (100) to (200) partition by range (a);
bugtest=# create table rt150 partition of rt_sub1 for values from (150) to (151);
bugtest=# \d+ rt_sub1
                                  Table "public.rt_sub1"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
--------+---------+-----------+----------+---------+---------+--------------+-------------
 a      | integer |           |          |         | plain   |              | 
Partition of: rt FOR VALUES FROM (0) TO (100)
Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 100))
Partition key: RANGE (a)
Partitions: rt150 FOR VALUES FROM (150) TO (151)

bugtest=# \d+ rt150
                                   Table "public.rt150"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
--------+---------+-----------+----------+---------+---------+--------------+-------------
 a      | integer |           |          |         | plain   |              | 
Partition of: rt_sub1 FOR VALUES FROM (150) TO (151)
Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 100) AND (a IS NOT NULL) AND (a >= 150) AND (a < 151))


Any rows are not routed to rt150 through rt nor we can't insert any rows to 
rt150 directly because of its constraints. If we add check whether constraint
is contradicted, it prevent us from accidentally creating useless table like 
above rt150 which would not contain any rows.

I thought there might be a discussion or documentation about this, but I 
couldn't find it. If there is, please also tell me that.

Thanks,

--
Yoshikazu Imai

Re: Why we allow CHECK constraint contradiction?

From

"David G. Johnston"

Date:

10 October 2018, 05:44:14

On Tuesday, October 9, 2018, Imai, Yoshikazu <imai.yoshikazu@jp.fujitsu.com> wrote:

Are there any rows which can satisfy the ct's CHECK constraint? If not, why we
allow creating table when check constraint itself is contradicted?

I'd bet on it being a combination of complexity and insufficient expected benefit. Time is better spent elsewhere. Mathmatically proving a contradiction in software is harder than reasoning about it mentally.

David J.

Re: Why we allow CHECK constraint contradiction?

From

Corey Huinker

Date:

10 October 2018, 05:57:42

On Wed, Oct 10, 2018 at 1:44 AM David G. Johnston <david.g.johnston@gmail.com> wrote:

On Tuesday, October 9, 2018, Imai, Yoshikazu <imai.yoshikazu@jp.fujitsu.com> wrote:
Are there any rows which can satisfy the ct's CHECK constraint? If not, why we
allow creating table when check constraint itself is contradicted?

I'd bet on it being a combination of complexity and insufficient expected benefit. Time is better spent elsewhere. Mathmatically proving a contradiction in software is harder than reasoning about it mentally.

I've actually used that as a feature, in postgresql and other databases, where assertions were unavailable, or procedural code was unavailable or against policy.

Consider the following:

CREATE TABLE wanted_values ( x integer );
INSERT INTO wanted_values VALUES (1), (2), (3);

CREATE TABLE found_values ( x integer );
INSERT INTO found_values VALUES (1), (3);

CREATE TABLE missing_values (
x integer,
CONSTRAINT contradiction CHECK (false)
);

INSERT INTO missing_values
SELECT x FROM wanted_values
EXCEPT
SELECT x FROM found_values;

gives the error

ERROR: new row for relation "missing_values" violates check constraint "contradiction"
DETAIL: Failing row contains (2).

Which can be handy when you need to fail a transaction because of bad data and don't have branching logic available.

Re: Why we allow CHECK constraint contradiction?

From

Amit Langote

Date:

10 October 2018, 06:00:58

On 2018/10/10 14:25, Imai, Yoshikazu wrote:
> Hi, all.
> 
> I have a wonder about the behaviour of creating table which has a constraint
> contradiction.
> 
> I created below table.
> 
> bugtest=# create table ct (a int, CHECK(a is not null and a >= 0 and a < 100 and a >= 200 and a < 300));
> bugtest=# \d+ ct
>                                     Table "public.ct"
>  Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
> --------+---------+-----------+----------+---------+---------+--------------+-------------
>  a      | integer |           |          |         | plain   |              | 
> Check constraints:
>     "ct_a_check" CHECK (a IS NOT NULL AND a >= 0 AND a < 100 AND a >= 200 AND a < 300)
> 
> 
> Are there any rows which can satisfy the ct's CHECK constraint? If not, why we
> allow creating table when check constraint itself is contradicted?
> 
> 
> I originally noticed this while creating partitioned range table as below.
> 
> bugtest=# create table rt (a int) partition by range (a);
> bugtest=# create table rt_sub1 partition of rt for values from (0) to (100) partition by range (a);
> bugtest=# create table rt_sub2 partition of rt for values from (100) to (200) partition by range (a);
> bugtest=# create table rt150 partition of rt_sub1 for values from (150) to (151);
> bugtest=# \d+ rt_sub1
>                                   Table "public.rt_sub1"
>  Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
> --------+---------+-----------+----------+---------+---------+--------------+-------------
>  a      | integer |           |          |         | plain   |              | 
> Partition of: rt FOR VALUES FROM (0) TO (100)
> Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 100))
> Partition key: RANGE (a)
> Partitions: rt150 FOR VALUES FROM (150) TO (151)
> 
> bugtest=# \d+ rt150
>                                    Table "public.rt150"
>  Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
> --------+---------+-----------+----------+---------+---------+--------------+-------------
>  a      | integer |           |          |         | plain   |              | 
> Partition of: rt_sub1 FOR VALUES FROM (150) TO (151)
> Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 100) AND (a IS NOT NULL) AND (a >= 150) AND (a < 151))
> 
> 
> Any rows are not routed to rt150 through rt nor we can't insert any rows to 
> rt150 directly because of its constraints. If we add check whether constraint
> is contradicted, it prevent us from accidentally creating useless table like 
> above rt150 which would not contain any rows.
> 
> I thought there might be a discussion or documentation about this, but I 
> couldn't find it. If there is, please also tell me that.

I had wondered about it when developing the partitioning feature about a
couple of years ago and this is the response I'd gotten:

https://www.postgresql.org/message-id/CA+TgmoaQABrsLQK4ms_4NiyavyJGS-b6ZFkZBBNC+-P5DjJNFA@mail.gmail.com

To summarize, the answer I got was that it's pointless to create defenses
against it inside the database.  It's on the users to create the
constraints (or specify bounds) that are non-contradicting.  Interesting
quotes from the above email:

"If we allow partitioning on expressions, then it quickly becomes
altogether impossible to deduce anything useful - unless you can solve the
halting problem."

"... This patch is supposed to be implementing partitioning, not
artificial intelligence."

:)

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CA+TgmoaQABrsLQK4ms_4NiyavyJGS-b6ZFkZBBNC+-P5DjJNFA@mail.gmail.com

RE: Why we allow CHECK constraint contradiction?

From

"Imai, Yoshikazu"

Date:

10 October 2018, 06:24:03

Thanks for replying!

On Tue, Oct 9, 2018 at 5:58 PM, Corey Huinker wrote:
> On Wed, Oct 10, 2018 at 1:44 AM David G. Johnston
> <david.g.johnston@gmail.com <mailto:david.g.johnston@gmail.com> >
> wrote:
> 
> 
>     On Tuesday, October 9, 2018, Imai, Yoshikazu
> <imai.yoshikazu@jp.fujitsu.com <mailto:imai.yoshikazu@jp.fujitsu.com>
> > wrote:
> 
>         Are there any rows which can satisfy the ct's CHECK
> constraint? If not, why we
>         allow creating table when check constraint itself is
> contradicted?
> 
> 
> 
>     I'd bet on it being a combination of complexity and insufficient
> expected benefit.  Time is better spent elsewhere.  Mathmatically
> proving a contradiction in software is harder than reasoning about it
> mentally.
> 
> 
> I've actually used that as a feature, in postgresql and other databases,
> where assertions were unavailable, or procedural code was unavailable
> or against policy.
> 
> Consider the following:
> 
> 
>     CREATE TABLE wanted_values ( x integer );
> 
>     INSERT INTO wanted_values VALUES (1), (2), (3);
> 
> 
> 
> 
>     CREATE TABLE found_values ( x integer );
> 
>     INSERT INTO found_values VALUES (1), (3);
> 
> 
> 
> 
>     CREATE TABLE missing_values (
> 
>         x integer,
> 
>         CONSTRAINT contradiction CHECK (false)
> 
>     );
> 
> 
> 
> 
>     INSERT INTO missing_values
> 
>     SELECT x FROM wanted_values
> 
>     EXCEPT
> 
>     SELECT x FROM found_values;
> 
> 
> 
> 
> gives the error
> 
> 
>     ERROR:  new row for relation "missing_values" violates check
> constraint "contradiction"
> 
>     DETAIL:  Failing row contains (2).
> 
> 
> Which can be handy when you need to fail a transaction because of bad
> data and don't have branching logic available.

That's an interesting using! So, there are useful case of constraint
contradiction table not only for time shortage/difficulties of
implementing mathematically proving a contradiction.

--
Yoshikazu Imai

RE: Why we allow CHECK constraint contradiction?

From

"Imai, Yoshikazu"

Date:

10 October 2018, 07:28:19

On Tue, Oct 9, 2018 at 6:01 PM, Amit Langote wrote:
> On 2018/10/10 14:25, Imai, Yoshikazu wrote:
> > Hi, all.
> >
> > I have a wonder about the behaviour of creating table which has a
> > constraint contradiction.
> >
> > I created below table.
> >
> > bugtest=# create table ct (a int, CHECK(a is not null and a >= 0 and
> a
> > < 100 and a >= 200 and a < 300)); bugtest=# \d+ ct
> >                                     Table "public.ct"
> >  Column |  Type   | Collation | Nullable | Default | Storage | Stats
> target | Description
> >
> --------+---------+-----------+----------+---------+---------+------
> --------+-------------
> >  a      | integer |           |          |         | plain   |
> |
> > Check constraints:
> >     "ct_a_check" CHECK (a IS NOT NULL AND a >= 0 AND a < 100 AND a >=
> > 200 AND a < 300)
> >
> >
> > Are there any rows which can satisfy the ct's CHECK constraint? If
> > not, why we allow creating table when check constraint itself is
> contradicted?
> >
> >
> > I originally noticed this while creating partitioned range table as
> below.
> >
> > bugtest=# create table rt (a int) partition by range (a); bugtest=#
> > create table rt_sub1 partition of rt for values from (0) to (100)
> > partition by range (a); bugtest=# create table rt_sub2 partition of
> rt
> > for values from (100) to (200) partition by range (a); bugtest=#
> > create table rt150 partition of rt_sub1 for values from (150) to (151);
> bugtest=# \d+ rt_sub1
> >                                   Table "public.rt_sub1"
> >  Column |  Type   | Collation | Nullable | Default | Storage | Stats
> target | Description
> >
> --------+---------+-----------+----------+---------+---------+------
> --------+-------------
> >  a      | integer |           |          |         | plain   |
> |
> > Partition of: rt FOR VALUES FROM (0) TO (100) Partition constraint:
> > ((a IS NOT NULL) AND (a >= 0) AND (a < 100)) Partition key: RANGE (a)
> > Partitions: rt150 FOR VALUES FROM (150) TO (151)
> >
> > bugtest=# \d+ rt150
> >                                    Table "public.rt150"
> >  Column |  Type   | Collation | Nullable | Default | Storage | Stats
> target | Description
> >
> --------+---------+-----------+----------+---------+---------+------
> --------+-------------
> >  a      | integer |           |          |         | plain   |
> |
> > Partition of: rt_sub1 FOR VALUES FROM (150) TO (151) Partition
> > constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 100) AND (a IS NOT
> > NULL) AND (a >= 150) AND (a < 151))
> >
> >
> > Any rows are not routed to rt150 through rt nor we can't insert any
> > rows to
> > rt150 directly because of its constraints. If we add check whether
> > constraint is contradicted, it prevent us from accidentally creating
> > useless table like above rt150 which would not contain any rows.
> >
> > I thought there might be a discussion or documentation about this, but
> > I couldn't find it. If there is, please also tell me that.
> 
> I had wondered about it when developing the partitioning feature about
> a couple of years ago and this is the response I'd gotten:
> 
> https://www.postgresql.org/message-id/CA+TgmoaQABrsLQK4ms_4NiyavyJGS
> -b6ZFkZBBNC+-P5DjJNFA@mail.gmail.com

Thanks for tell me one of a discussion about this.

> To summarize, the answer I got was that it's pointless to create defenses
> against it inside the database.  It's on the users to create the
> constraints (or specify bounds) that are non-contradicting.

I just thought it's kind to tell users whether users mistakenly specify
bounds. 


> Interesting quotes from the above email:
>
> "If we allow partitioning on expressions, then it quickly becomes
> altogether impossible to deduce anything useful - unless you can solve
> the halting problem."
> 
> "... This patch is supposed to be implementing partitioning, not
> artificial intelligence."

It takes little more time to completely understand this interesting quotes,
but I guess I see that point.


Thanks again!

--
Yoshikazu Imai

Re: Why we allow CHECK constraint contradiction?

From

Amit Langote

Date:

10 October 2018, 08:26:08

On 2018/10/10 16:28, Imai, Yoshikazu wrote:
> On Tue, Oct 9, 2018 at 6:01 PM, Amit Langote wrote:
>> I had wondered about it when developing the partitioning feature about
>> a couple of years ago and this is the response I'd gotten:
>>
>> https://www.postgresql.org/message-id/CA+TgmoaQABrsLQK4ms_4NiyavyJGS
>> -b6ZFkZBBNC+-P5DjJNFA@mail.gmail.com
> 
> Thanks for tell me one of a discussion about this.
> 
>> To summarize, the answer I got was that it's pointless to create defenses
>> against it inside the database.  It's on the users to create the
>> constraints (or specify bounds) that are non-contradicting.
> 
> I just thought it's kind to tell users whether users mistakenly specify
> bounds. 
> 
> 
>> Interesting quotes from the above email:
>>
>> "If we allow partitioning on expressions, then it quickly becomes
>> altogether impossible to deduce anything useful - unless you can solve
>> the halting problem."
>>
>> "... This patch is supposed to be implementing partitioning, not
>> artificial intelligence."
> 
> It takes little more time to completely understand this interesting quotes,
> but I guess I see that point.

The task of developing a contradiction proof module that takes an
arbitrary expression and returns whether it's self-contradictory seems
daunting to me.  You may know of predtest.c in the optimizer directory as
one example of such a module, but if you look closely it's scope is fairly
limited; it works only if the input expressions contain variables of
certain types and operators that btree operator classes can handle, and
gives up on producing a proof otherwise.

On the other hand, the syntax of CHECK constraints allows a fairly wide
range of expressions to be specified, with expressions/values of arbitrary
types and operators with arbitrary semantics (not limited to
btree/ordering semantics, for example).  It won't be a good enough
solution if it can provide the error for only a certain subset of
user-specified expressions, IMHO.

Thanks,
Amit

Re: Why we allow CHECK constraint contradiction?

From

Robert Haas

Date:

12 October 2018, 16:53:19

On Wed, Oct 10, 2018 at 4:26 AM Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On the other hand, the syntax of CHECK constraints allows a fairly wide
> range of expressions to be specified, with expressions/values of arbitrary
> types and operators with arbitrary semantics (not limited to
> btree/ordering semantics, for example).  It won't be a good enough
> solution if it can provide the error for only a certain subset of
> user-specified expressions, IMHO.

It's impossible to solve that problem in general -- solving it for
only a certain subset of user expressions is the best we can ever do.
If you found a fully general solution, you would have solved the
halting problem, which has been proved to be impossible.

Now, I think there is a reasonable argument that it would still be
nice to give an ERROR diagnostic in the cases we can detect, but I do
not agree with that argument, for all of the reasons stated here: the
development resources are better spent elsewhere, somebody might be
creating such a contradictory constraint deliberately for whatever
purpose, it might annoy or confuse users to get the error in only some
cases.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Why we allow CHECK constraint contradiction?

From

Tom Lane

Date:

12 October 2018, 16:57:38

Robert Haas <robertmhaas@gmail.com> writes:
> Now, I think there is a reasonable argument that it would still be
> nice to give an ERROR diagnostic in the cases we can detect, but I do
> not agree with that argument, for all of the reasons stated here: the
> development resources are better spent elsewhere, somebody might be
> creating such a contradictory constraint deliberately for whatever
> purpose, it might annoy or confuse users to get the error in only some
> cases.

It's also arguable that throwing an ERROR would be contrary to spec,
in that it would prevent creation of constraints that the SQL standard
does not forbid.

You could dodge that problem by making the message be just a WARNING
or less.  Still, though, the other arguments-against remain valid.

            regards, tom lane