Thread: null values in non-nullable column
In 8.1 I have a situation where nullability of user defined datatypes does not seem to be enforced. Using the following steps I end up with a table that has a column that should not be nullable, but has nulls in it. Here's a pared down outline of the steps: -- create a datatype that should enforce not null values gp_test=# create domain boolean_not_null as boolean not null; -- a table with a field using the datatype gp_test=# create table a (id integer, test boolean_not_null); gp_test=# insert into a values (1, true); gp_test=# select * from a;id | test ----+------ 1 | t -- as expected no nulls are allowed gp_test=# insert into a values (100, null); ERROR: domain boolean_not_null does not allow null values -- a second table (a parent) gp_test=# create table b (id integer); gp_test=# insert into b values (1); gp_test=# insert into b values (2); gp_test=# select * from b;id ---- 1 2 -- now create a table based on a left join -- this creates a table with a not-nullable column (datatype -- inherited from the original table) which contains nulls; -- even though insertion of new nulls is not allowed gp_test=# create table m as select id, test from b left join a using (id); gp_test=# \d m; Table "public.m"Column | Type | Modifiers --------+------------------+-----------id | integer | test | boolean_not_null | gp_test=# select * from m;id | test ----+-------- 1 | t 2 | <NULL> gp_test=# insert into m values (100, null); ERROR: domain boolean_not_null does not allow null values gp_test=# insert into m (id) values (100); ERROR: domain boolean_not_null does not allow null values gp_test=# update m set test = test; -- note no error here! gp_test=# update m set test = (test and true); ERROR: domain boolean_not_null does not allow null values I would have expected failure at the table creation step, but it proceeds (and inserts the nulls). Interestingly, I do see a failure after I try to restore the table from a dump (using pg_dump/pg_restore). George
"George Pavlov" <gpavlov@mynewplace.com> writes: > -- this creates a table with a not-nullable column (datatype > -- inherited from the original table) which contains nulls; Hm. Arguably we should discard domain-ness in any SELECT result, but I'm sure some people would complain about that ... regards, tom lane
On Dec 5, 2006, at 11:52 , Tom Lane wrote: > "George Pavlov" <gpavlov@mynewplace.com> writes: >> -- this creates a table with a not-nullable column (datatype >> -- inherited from the original table) which contains nulls; > > Hm. Arguably we should discard domain-ness in any SELECT result, > but I'm sure some people would complain about that ... Tom, Could you expand on that a bit? Here's what I've interpreted: The column types of the select are assumed to be (int, boolean_not_null), and so the values aren't checked again before the insert during CREATE TABLE AS. "discarding domain-ness" would mean considering the results as their base type, and rechecking the domain would be checked when inserting into the table. Just trying to fit my head around this. Michael Glaesemann grzm seespotcode net
Michael Glaesemann <grzm@seespotcode.net> writes: > Could you expand on that a bit? Here's what I've interpreted: > The column types of the select are assumed to be (int, > boolean_not_null), and so the values aren't checked again before the > insert during CREATE TABLE AS. "discarding domain-ness" would mean > considering the results as their base type, and rechecking the domain > would be checked when inserting into the table. To be clear: the problem is not with CREATE TABLE AS. The problem is with the definition of what a SELECT ... LEFT JOIN ... is supposed to return. C.T.A. is supposed to create a table matching the defined column types of the SELECT result; either those types allow null-ness, or they don't. Actually I think this is a bug in the SQL spec :-( The description of <joined table> says that output columns are "possibly nullable" if they're on the nullable side of the outer join, but it's not apparent that that idea is meant to negate a domain constraint. And yet, if it does not, then an outer join with a NOT NULL domain column on the nullable side is just invalid. regards, tom lane
On Dec 5, 2006, at 12:49 , Tom Lane wrote: > Actually I think this is a bug in the SQL spec :-( The description of > <joined table> says that output columns are "possibly nullable" if > they're on the nullable side of the outer join, but it's not apparent > that that idea is meant to negate a domain constraint. And yet, if it > does not, then an outer join with a NOT NULL domain column on the > nullable side is just invalid. Very interesting. Thanks for the explanation. I should dig into the spec more. Michael Glaesemann grzm seespotcode net
Hi, George, "George Pavlov" <gpavlov@mynewplace.com> wrote: > In 8.1 I have a situation where nullability of user defined datatypes > does not seem to be enforced. Using the following steps I end up with a > table that has a column that should not be nullable, but has nulls in > it. Ouch. That hurts! Now all those MySQL freaks can argue "Well, PostgreSQL does allow constraint violation, too.", despite the fact that most of them don't even basically understand the special case we have here. So, please, keep silent about it :-) How difficult is it to fix this (e. G. by disallowing NULL-generating JOINs on NOT NULL domains?) Regards, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org
Yes, the thread did not seem to go very far. The SQL standard does seem inconsistent in this area, but that is not an argument for allowing data constraint violation. Until the standard is modified I think it would be good for the reputation of the DBMS we all love to come up with a fix... Even though, as Tom Lane explained, CREATE TABLE AS is not the problem here, it seems to me that might be the cleanest, least obtrusive place to add validation. If C.T.A failed at the table creation step because of the JOIN-produces NULLs that would be an early and decent warning. Fixing it from the JOIN side (e.g. disallowing NULL-generating JOINs on NOT NULL domains) seems too strict -- JOINs are mostly used for result sets that are not materialized and you hardly have the potential for a problem until they are materialized as a table. Similarly, removing the domain-ness of the JOIN column strikes me as too drastic and as having the potential of breaking existing functionality. I am sure I am missing something, just my two cents... George
George Pavlov wrote: > Even though, as Tom Lane explained, CREATE TABLE AS is not the problem > here, it seems to me that might be the cleanest, least obtrusive place > to add validation. If C.T.A failed at the table creation step because of > the JOIN-produces NULLs that would be an early and decent warning. > Fixing it from the JOIN side (e.g. disallowing NULL-generating JOINs on > NOT NULL domains) seems too strict -- JOINs are mostly used for result > sets that are not materialized and you hardly have the potential for a > problem until they are materialized as a table. Similarly, removing the > domain-ness of the JOIN column strikes me as too drastic and as having > the potential of breaking existing functionality. I am sure I am missing > something, just my two cents... I think it has to go in the join... If a result-set has nulls in a particular column, that column can't be NOT NULL (by definition). Therefore, either the column has its not-null constraint removed (through type-casting away the domain) or the query fails on that NOT NULL constraint. Any query could result in this sort of problem, not just an explicit JOIN with NULLs. Imagine a domain "even_numbers_only" and a "SELECT my_even_numbers+1 FROM foo". Hmm - it strikes me that any result-set should perhaps have the domain removed and substituted with its parent type, except perhaps in the simplest "pass column through" case. -- Richard Huxton Archonet Ltd