Thread: DISTINCT with NULLs and INT fields
What does The SQL Standard say about this one? create table foo (a varchar, b int); insert into foo (a, b) select null, null from bar; -- no problem insert into foo (a, b) select distinct null, null from bar; -- ERROR: column "b" is of type integer but expression is of type text -- HINT: You will need to rewrite or cast the expression. It seems that applying DISTINCT makes the NULL be of type TEXT... I just tried the exact same thing in Ms. SQL Server 2005 and it works with no errors. Looks like SQL Server makes the NULLs be INTs, but is probably better at doing the implicit conversion from INT. Any thoughts on what the "correct" behavior should be here? George
"George Pavlov" <gpavlov@mynewplace.com> writes: > What does The SQL Standard say about this one? > insert into foo (a, b) select distinct null, null from bar; > -- ERROR: column "b" is of type integer but expression is of type text According to the SQL spec that query is illegal on its face --- the spec does not allow a bare NULL keyword except in contexts where the datatype is immediately evident. You'd have had to writeSELECT DISTINCT CAST(NULL AS varchar), CAST(NULL AS int) to be fully spec-conformant. Postgres handles NULL the same way as untyped literals ('foo'), so we tend to prefer resolving to type text when things are ambiguous. In this case the DISTINCT operation forces a type choice (if you don't know what type the data is, you can hardly decide whether two values are equal or not) and so by the time the INSERT sees it, it looks like a SELECT of two text columns. > Looks like SQL Server makes the NULLs be INTs, but is probably > better at doing the implicit conversion from INT. "Better" is in the eye of the beholder. It surprises me not at all that Microsoft would be lax about implicit coercions, but that doesn't make it a good idea to coerce anything to anything else without complaint. You might as well not have a type system at all, if you're going to destroy its ability to detect mistakes that way. regards, tom lane
> "Better" is in the eye of the beholder. sorry for the value-laden term. "laxer" is more appropriate, of course! the funny thing is that had they cast the NULLs to TEXT it would have failed there too (they do not do implicit TEXT to INT). > It surprises me not at all that > Microsoft would be lax about implicit coercions, but that doesn't make > it a good idea to coerce anything to anything else without complaint. > You might as well not have a type system at all, if you're going to > destroy its ability to detect mistakes that way. indeed! but, wait, doesn't our favorite dbms do some implicit casting too? continuing with my table foo (a varchar, b int): test=# delete from foo; DELETE 2 test=# insert into foo values (4,4); INSERT 0 1 test=# insert into foo values ('4','4'); INSERT 0 1 test=# insert into foo values ('oh','no'); ERROR: invalid input syntax for integer: "no" test=# select * from foo;a | b ---+---4 | 44 | 4 (2 rows)
"George Pavlov" <gpavlov@mynewplace.com> writes: > indeed! but, wait, doesn't our favorite dbms do some implicit casting > too? continuing with my table foo (a varchar, b int): > test=# delete from foo; > DELETE 2 > test=# insert into foo values (4,4); > INSERT 0 1 > test=# insert into foo values ('4','4'); > INSERT 0 1 > test=# insert into foo values ('oh','no'); > ERROR: invalid input syntax for integer: "no" Sure, but in this example the required type of the value is clear from immediate context (ie, the INSERT). This is one of the cases where the SQL spec requires a bare NULL to be accepted. (BTW, 'no' and '4' in this example are *not* values of type text; they are untyped literals which we eventually resolve as varchar or int.) The other case you mentioned is one where we are going out on a limb a bit: INSERT INTO foo SELECT NULL, NULL; In this case we allow the SELECT to not force a type choice, so that the INSERT sees the raw untyped values, but I think that this query is probably illegal per spec --- I believe the spec requires a SELECT to deliver well-defined data types. Too lazy to go look up chapter and verse at the moment. regards, tom lane
> Sure, but in this example the required type of the value is clear from > immediate context (ie, the INSERT). This is one of the cases where > the SQL spec requires a bare NULL to be accepted. (BTW, 'no' and '4' > in this example are *not* values of type text; they are > untyped literals which we eventually resolve as varchar or int.) hmmm... with the risk of boring everyone to tears: test=# insert into foo values (4::int,4::int); INSERT 0 1 test=# insert into foo values (4::text,4::text); ERROR: column "b" is of type integer but expression is of type text HINT: You will need to rewrite or cast the expression. test=# insert into foo values (cast(4 as int),cast(4 as int)); INSERT 0 1 test=# insert into foo values (cast(4 as text),cast(4 as text)); ERROR: column "b" is of type integer but expression is of type text HINT: You will need to rewrite or cast the expression. (i actually think it is important to understand how this stuff works and thanks for explaining!)
George Pavlov wrote: > test=# insert into foo values (4::int,4::int); > INSERT 0 1 > test=# insert into foo values (4::text,4::text); > ERROR: column "b" is of type integer but expression is of type text > HINT: You will need to rewrite or cast the expression. > test=# insert into foo values (cast(4 as int),cast(4 as int)); > INSERT 0 1 > test=# insert into foo values (cast(4 as text),cast(4 as text)); > ERROR: column "b" is of type integer but expression is of type text > HINT: You will need to rewrite or cast the expression. The last sentence (and the second one as well -- they are exactly equivalent) exemplifies that there isn't an implicit typecase from text to integer. If you use single quotes instead of an explicit cast, the literal is assumed to be of type "unknown", which _can_ be casted automatically to integer. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
> > test=# insert into foo values (4::int,4::int); > > INSERT 0 1 > > test=# insert into foo values (4::text,4::text); > > ERROR: column "b" is of type integer but expression is of type text > > HINT: You will need to rewrite or cast the expression. > > test=# insert into foo values (cast(4 as int),cast(4 as int)); > > INSERT 0 1 > > test=# insert into foo values (cast(4 as text),cast(4 as text)); > > ERROR: column "b" is of type integer but expression is of type text > > HINT: You will need to rewrite or cast the expression. > > The last sentence (and the second one as well -- they are exactly > equivalent) exemplifies that there isn't an implicit typecase > from text > to integer. If you use single quotes instead of an explicit cast, the > literal is assumed to be of type "unknown", which _can_ be casted > automatically to integer. yes, #1 is the same as #3, and #2 is the same as #4. i was trying to provide a counter example to tom's statement that pgsql does not do type coercions. the interesting thing here is that 4::int gets into a text field whereas 4::text does not get into an integer field. seems to me like there is an implicit int-to-text cast (without a symmetrical text-to-int one) unless 4::int is somehow *still* considered an 'untyped literal' (4 and '4' seem untyped to me, but 4::int does not)??
"George Pavlov" <gpavlov@mynewplace.com> writes: > the interesting thing here is that 4::int gets into a text > field whereas 4::text does not get into an integer field. seems to me > like there is an implicit int-to-text cast (without a symmetrical > text-to-int one) Yeah, there is. You can easily see the list of implicit casts for yourself: select castsource::regtype, casttarget::regtype from pg_castwhere castcontext = 'i' and castsource != casttarget; There's 90 of them in current CVS tip, and most of 'em are unsurprising within-type-category casts, such as implicit promotion of int to bigint. However we have about a dozen implicit casts to text from the numeric and datetime type categories. Personally I would dearly love to get rid of these, because they are accidents waiting to happen (and they do regularly bite people, see the mail list archives for evidence). But it seems people expect to be able to do things likenumber || ' string' without explicitly casting the number to text. regards, tom lane