Thread: Why we allow CHECK constraint contradiction?
Hi, all. I have a wonder about the behaviour of creating table which has a constraint contradiction. I created below table. bugtest=# create table ct (a int, CHECK(a is not null and a >= 0 and a < 100 and a >= 200 and a < 300)); bugtest=# \d+ ct Table "public.ct" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- a | integer | | | | plain | | Check constraints: "ct_a_check" CHECK (a IS NOT NULL AND a >= 0 AND a < 100 AND a >= 200 AND a < 300) Are there any rows which can satisfy the ct's CHECK constraint? If not, why we allow creating table when check constraint itself is contradicted? I originally noticed this while creating partitioned range table as below. bugtest=# create table rt (a int) partition by range (a); bugtest=# create table rt_sub1 partition of rt for values from (0) to (100) partition by range (a); bugtest=# create table rt_sub2 partition of rt for values from (100) to (200) partition by range (a); bugtest=# create table rt150 partition of rt_sub1 for values from (150) to (151); bugtest=# \d+ rt_sub1 Table "public.rt_sub1" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- a | integer | | | | plain | | Partition of: rt FOR VALUES FROM (0) TO (100) Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 100)) Partition key: RANGE (a) Partitions: rt150 FOR VALUES FROM (150) TO (151) bugtest=# \d+ rt150 Table "public.rt150" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- a | integer | | | | plain | | Partition of: rt_sub1 FOR VALUES FROM (150) TO (151) Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 100) AND (a IS NOT NULL) AND (a >= 150) AND (a < 151)) Any rows are not routed to rt150 through rt nor we can't insert any rows to rt150 directly because of its constraints. If we add check whether constraint is contradicted, it prevent us from accidentally creating useless table like above rt150 which would not contain any rows. I thought there might be a discussion or documentation about this, but I couldn't find it. If there is, please also tell me that. Thanks, -- Yoshikazu Imai
On Tuesday, October 9, 2018, Imai, Yoshikazu <imai.yoshikazu@jp.fujitsu.com> wrote:
Are there any rows which can satisfy the ct's CHECK constraint? If not, why we
allow creating table when check constraint itself is contradicted?
I'd bet on it being a combination of complexity and insufficient expected benefit. Time is better spent elsewhere. Mathmatically proving a contradiction in software is harder than reasoning about it mentally.
David J.
On Wed, Oct 10, 2018 at 1:44 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Tuesday, October 9, 2018, Imai, Yoshikazu <imai.yoshikazu@jp.fujitsu.com> wrote:Are there any rows which can satisfy the ct's CHECK constraint? If not, why we
allow creating table when check constraint itself is contradicted?I'd bet on it being a combination of complexity and insufficient expected benefit. Time is better spent elsewhere. Mathmatically proving a contradiction in software is harder than reasoning about it mentally.
I've actually used that as a feature, in postgresql and other databases, where assertions were unavailable, or procedural code was unavailable or against policy.
Consider the following:
CREATE TABLE wanted_values ( x integer );
INSERT INTO wanted_values VALUES (1), (2), (3);
CREATE TABLE found_values ( x integer );
INSERT INTO found_values VALUES (1), (3);
CREATE TABLE missing_values (
x integer,
CONSTRAINT contradiction CHECK (false)
);
INSERT INTO missing_values
SELECT x FROM wanted_values
EXCEPT
SELECT x FROM found_values;
gives the error
ERROR: new row for relation "missing_values" violates check constraint "contradiction"
DETAIL: Failing row contains (2).
Which can be handy when you need to fail a transaction because of bad data and don't have branching logic available.
On 2018/10/10 14:25, Imai, Yoshikazu wrote: > Hi, all. > > I have a wonder about the behaviour of creating table which has a constraint > contradiction. > > I created below table. > > bugtest=# create table ct (a int, CHECK(a is not null and a >= 0 and a < 100 and a >= 200 and a < 300)); > bugtest=# \d+ ct > Table "public.ct" > Column | Type | Collation | Nullable | Default | Storage | Stats target | Description > --------+---------+-----------+----------+---------+---------+--------------+------------- > a | integer | | | | plain | | > Check constraints: > "ct_a_check" CHECK (a IS NOT NULL AND a >= 0 AND a < 100 AND a >= 200 AND a < 300) > > > Are there any rows which can satisfy the ct's CHECK constraint? If not, why we > allow creating table when check constraint itself is contradicted? > > > I originally noticed this while creating partitioned range table as below. > > bugtest=# create table rt (a int) partition by range (a); > bugtest=# create table rt_sub1 partition of rt for values from (0) to (100) partition by range (a); > bugtest=# create table rt_sub2 partition of rt for values from (100) to (200) partition by range (a); > bugtest=# create table rt150 partition of rt_sub1 for values from (150) to (151); > bugtest=# \d+ rt_sub1 > Table "public.rt_sub1" > Column | Type | Collation | Nullable | Default | Storage | Stats target | Description > --------+---------+-----------+----------+---------+---------+--------------+------------- > a | integer | | | | plain | | > Partition of: rt FOR VALUES FROM (0) TO (100) > Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 100)) > Partition key: RANGE (a) > Partitions: rt150 FOR VALUES FROM (150) TO (151) > > bugtest=# \d+ rt150 > Table "public.rt150" > Column | Type | Collation | Nullable | Default | Storage | Stats target | Description > --------+---------+-----------+----------+---------+---------+--------------+------------- > a | integer | | | | plain | | > Partition of: rt_sub1 FOR VALUES FROM (150) TO (151) > Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 100) AND (a IS NOT NULL) AND (a >= 150) AND (a < 151)) > > > Any rows are not routed to rt150 through rt nor we can't insert any rows to > rt150 directly because of its constraints. If we add check whether constraint > is contradicted, it prevent us from accidentally creating useless table like > above rt150 which would not contain any rows. > > I thought there might be a discussion or documentation about this, but I > couldn't find it. If there is, please also tell me that. I had wondered about it when developing the partitioning feature about a couple of years ago and this is the response I'd gotten: https://www.postgresql.org/message-id/CA+TgmoaQABrsLQK4ms_4NiyavyJGS-b6ZFkZBBNC+-P5DjJNFA@mail.gmail.com To summarize, the answer I got was that it's pointless to create defenses against it inside the database. It's on the users to create the constraints (or specify bounds) that are non-contradicting. Interesting quotes from the above email: "If we allow partitioning on expressions, then it quickly becomes altogether impossible to deduce anything useful - unless you can solve the halting problem." "... This patch is supposed to be implementing partitioning, not artificial intelligence." :) Thanks, Amit [1] https://www.postgresql.org/message-id/CA+TgmoaQABrsLQK4ms_4NiyavyJGS-b6ZFkZBBNC+-P5DjJNFA@mail.gmail.com
Thanks for replying! On Tue, Oct 9, 2018 at 5:58 PM, Corey Huinker wrote: > On Wed, Oct 10, 2018 at 1:44 AM David G. Johnston > <david.g.johnston@gmail.com <mailto:david.g.johnston@gmail.com> > > wrote: > > > On Tuesday, October 9, 2018, Imai, Yoshikazu > <imai.yoshikazu@jp.fujitsu.com <mailto:imai.yoshikazu@jp.fujitsu.com> > > wrote: > > Are there any rows which can satisfy the ct's CHECK > constraint? If not, why we > allow creating table when check constraint itself is > contradicted? > > > > I'd bet on it being a combination of complexity and insufficient > expected benefit. Time is better spent elsewhere. Mathmatically > proving a contradiction in software is harder than reasoning about it > mentally. > > > I've actually used that as a feature, in postgresql and other databases, > where assertions were unavailable, or procedural code was unavailable > or against policy. > > Consider the following: > > > CREATE TABLE wanted_values ( x integer ); > > INSERT INTO wanted_values VALUES (1), (2), (3); > > > > > CREATE TABLE found_values ( x integer ); > > INSERT INTO found_values VALUES (1), (3); > > > > > CREATE TABLE missing_values ( > > x integer, > > CONSTRAINT contradiction CHECK (false) > > ); > > > > > INSERT INTO missing_values > > SELECT x FROM wanted_values > > EXCEPT > > SELECT x FROM found_values; > > > > > gives the error > > > ERROR: new row for relation "missing_values" violates check > constraint "contradiction" > > DETAIL: Failing row contains (2). > > > Which can be handy when you need to fail a transaction because of bad > data and don't have branching logic available. That's an interesting using! So, there are useful case of constraint contradiction table not only for time shortage/difficulties of implementing mathematically proving a contradiction. -- Yoshikazu Imai
On Tue, Oct 9, 2018 at 6:01 PM, Amit Langote wrote: > On 2018/10/10 14:25, Imai, Yoshikazu wrote: > > Hi, all. > > > > I have a wonder about the behaviour of creating table which has a > > constraint contradiction. > > > > I created below table. > > > > bugtest=# create table ct (a int, CHECK(a is not null and a >= 0 and > a > > < 100 and a >= 200 and a < 300)); bugtest=# \d+ ct > > Table "public.ct" > > Column | Type | Collation | Nullable | Default | Storage | Stats > target | Description > > > --------+---------+-----------+----------+---------+---------+------ > --------+------------- > > a | integer | | | | plain | > | > > Check constraints: > > "ct_a_check" CHECK (a IS NOT NULL AND a >= 0 AND a < 100 AND a >= > > 200 AND a < 300) > > > > > > Are there any rows which can satisfy the ct's CHECK constraint? If > > not, why we allow creating table when check constraint itself is > contradicted? > > > > > > I originally noticed this while creating partitioned range table as > below. > > > > bugtest=# create table rt (a int) partition by range (a); bugtest=# > > create table rt_sub1 partition of rt for values from (0) to (100) > > partition by range (a); bugtest=# create table rt_sub2 partition of > rt > > for values from (100) to (200) partition by range (a); bugtest=# > > create table rt150 partition of rt_sub1 for values from (150) to (151); > bugtest=# \d+ rt_sub1 > > Table "public.rt_sub1" > > Column | Type | Collation | Nullable | Default | Storage | Stats > target | Description > > > --------+---------+-----------+----------+---------+---------+------ > --------+------------- > > a | integer | | | | plain | > | > > Partition of: rt FOR VALUES FROM (0) TO (100) Partition constraint: > > ((a IS NOT NULL) AND (a >= 0) AND (a < 100)) Partition key: RANGE (a) > > Partitions: rt150 FOR VALUES FROM (150) TO (151) > > > > bugtest=# \d+ rt150 > > Table "public.rt150" > > Column | Type | Collation | Nullable | Default | Storage | Stats > target | Description > > > --------+---------+-----------+----------+---------+---------+------ > --------+------------- > > a | integer | | | | plain | > | > > Partition of: rt_sub1 FOR VALUES FROM (150) TO (151) Partition > > constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 100) AND (a IS NOT > > NULL) AND (a >= 150) AND (a < 151)) > > > > > > Any rows are not routed to rt150 through rt nor we can't insert any > > rows to > > rt150 directly because of its constraints. If we add check whether > > constraint is contradicted, it prevent us from accidentally creating > > useless table like above rt150 which would not contain any rows. > > > > I thought there might be a discussion or documentation about this, but > > I couldn't find it. If there is, please also tell me that. > > I had wondered about it when developing the partitioning feature about > a couple of years ago and this is the response I'd gotten: > > https://www.postgresql.org/message-id/CA+TgmoaQABrsLQK4ms_4NiyavyJGS > -b6ZFkZBBNC+-P5DjJNFA@mail.gmail.com Thanks for tell me one of a discussion about this. > To summarize, the answer I got was that it's pointless to create defenses > against it inside the database. It's on the users to create the > constraints (or specify bounds) that are non-contradicting. I just thought it's kind to tell users whether users mistakenly specify bounds. > Interesting quotes from the above email: > > "If we allow partitioning on expressions, then it quickly becomes > altogether impossible to deduce anything useful - unless you can solve > the halting problem." > > "... This patch is supposed to be implementing partitioning, not > artificial intelligence." It takes little more time to completely understand this interesting quotes, but I guess I see that point. Thanks again! -- Yoshikazu Imai
On 2018/10/10 16:28, Imai, Yoshikazu wrote: > On Tue, Oct 9, 2018 at 6:01 PM, Amit Langote wrote: >> I had wondered about it when developing the partitioning feature about >> a couple of years ago and this is the response I'd gotten: >> >> https://www.postgresql.org/message-id/CA+TgmoaQABrsLQK4ms_4NiyavyJGS >> -b6ZFkZBBNC+-P5DjJNFA@mail.gmail.com > > Thanks for tell me one of a discussion about this. > >> To summarize, the answer I got was that it's pointless to create defenses >> against it inside the database. It's on the users to create the >> constraints (or specify bounds) that are non-contradicting. > > I just thought it's kind to tell users whether users mistakenly specify > bounds. > > >> Interesting quotes from the above email: >> >> "If we allow partitioning on expressions, then it quickly becomes >> altogether impossible to deduce anything useful - unless you can solve >> the halting problem." >> >> "... This patch is supposed to be implementing partitioning, not >> artificial intelligence." > > It takes little more time to completely understand this interesting quotes, > but I guess I see that point. The task of developing a contradiction proof module that takes an arbitrary expression and returns whether it's self-contradictory seems daunting to me. You may know of predtest.c in the optimizer directory as one example of such a module, but if you look closely it's scope is fairly limited; it works only if the input expressions contain variables of certain types and operators that btree operator classes can handle, and gives up on producing a proof otherwise. On the other hand, the syntax of CHECK constraints allows a fairly wide range of expressions to be specified, with expressions/values of arbitrary types and operators with arbitrary semantics (not limited to btree/ordering semantics, for example). It won't be a good enough solution if it can provide the error for only a certain subset of user-specified expressions, IMHO. Thanks, Amit
On Wed, Oct 10, 2018 at 4:26 AM Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > On the other hand, the syntax of CHECK constraints allows a fairly wide > range of expressions to be specified, with expressions/values of arbitrary > types and operators with arbitrary semantics (not limited to > btree/ordering semantics, for example). It won't be a good enough > solution if it can provide the error for only a certain subset of > user-specified expressions, IMHO. It's impossible to solve that problem in general -- solving it for only a certain subset of user expressions is the best we can ever do. If you found a fully general solution, you would have solved the halting problem, which has been proved to be impossible. Now, I think there is a reasonable argument that it would still be nice to give an ERROR diagnostic in the cases we can detect, but I do not agree with that argument, for all of the reasons stated here: the development resources are better spent elsewhere, somebody might be creating such a contradictory constraint deliberately for whatever purpose, it might annoy or confuse users to get the error in only some cases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Now, I think there is a reasonable argument that it would still be > nice to give an ERROR diagnostic in the cases we can detect, but I do > not agree with that argument, for all of the reasons stated here: the > development resources are better spent elsewhere, somebody might be > creating such a contradictory constraint deliberately for whatever > purpose, it might annoy or confuse users to get the error in only some > cases. It's also arguable that throwing an ERROR would be contrary to spec, in that it would prevent creation of constraints that the SQL standard does not forbid. You could dodge that problem by making the message be just a WARNING or less. Still, though, the other arguments-against remain valid. regards, tom lane