Thread: BUG #5028: CASE returns ELSE value always when type is "char"

BUG #5028: CASE returns ELSE value always when type is "char"

From
"Joseph Shraibman"
Date:
The following bug has been logged online:

Bug reference:      5028
Logged by:          Joseph Shraibman
Email address:      jks@selectacast.net
PostgreSQL version: 8.3.7
Operating system:   Linux
Description:        CASE returns ELSE value always when type is "char"
Details:

[local]:playpen=> SELECT version();
                                                  version

----------------------------------------------------------------------------
-------------------------------
 PostgreSQL 8.3.7 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.2 20080704 (Red Hat 4.1.2-44)
(1 row)

Time: 1.658 ms
[local]:playpen=>
[local]:playpen=> select c.relkind, CASE c.relkind WHEN 'r' THEN 'table'
WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's'
THEN 'special' WHEN 'c' THEN 'composite type' WHEN 't' THEN 'toast' ELSE
c.relkind
playpen->  END from (select 'r'::"char" AS relkind) c;
 relkind | relkind
---------+---------
 r       | t
(1 row)

Time: 1.407 ms
[local]:playpen=>
[local]:playpen=> select c.relkind, CASE c.relkind WHEN 'r' THEN 'table'
WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's'
THEN 'special' WHEN 'c' THEN 'composite type' WHEN 't' THEN 'toast'
playpen->  END from (select 'r'::"char" AS relkind) c;
 relkind | case
---------+-------
 r       | table
(1 row)

Time: 0.426 ms
[local]:playpen=>
[local]:playpen=>
[local]:playpen=> select c.relkind, CASE c.relkind WHEN 'r' THEN 'table'
WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's'
THEN 'special' WHEN 'c' THEN 'composite type' WHEN 't' THEN 'toast' ELSE
c.relkind
playpen->  END from (select 'r'::char AS relkind) c;
 relkind | relkind
---------+---------
 r       | table
(1 row)

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Sam Mason
Date:
On Tue, Sep 01, 2009 at 04:36:25PM +0000, Joseph Shraibman wrote:
> Description:        CASE returns ELSE value always when type is "char"

I think it's just silently truncating the literal to a single character.

> [local]:playpen=> select c.relkind, CASE c.relkind WHEN 'r' THEN 'table'
> WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's'
> THEN 'special' WHEN 'c' THEN 'composite type' WHEN 't' THEN 'toast' ELSE
> c.relkind
> playpen->  END from (select 'r'::"char" AS relkind) c;
>  relkind | relkind
> ---------+---------
>  r       | t

Here, 'r' maps to the "char" literal 'table' which PG interprets as the
value 't'--i.e. PG silently chops of the 'able'.  The bug would seem to
be in your code, but PG could maybe throw an error to tell you this is
what is happening?

A possible fix would be to have your ELSE clause as:

  c.relkind::text

As that way the other branches would be interpreted as text and they
wouldn't be getting chopped off along the way.

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Joseph Shraibman
Date:
It appears the problem is with the "char" being in the ELSE expression,
not being in the CASE expression

[local]:playpen=> select c.relkind, CASE c.relkind WHEN 'r' THEN 'table'
WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN
's' THEN 'special' WHEN 'c' THEN 'composite type' WHEN 't' THEN 'toast'
ELSE c.relkind::char
playpen->  END from (select 'r'::"char" AS relkind) c;
  relkind | relkind
---------+---------
  r       | table
(1 row)

Time: 0.674 ms
[local]:playpen=> select c.relkind, CASE c.relkind WHEN 'r' THEN 'table'
WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN
's' THEN 'special' WHEN 'c' THEN 'composite type' WHEN 't' THEN 'toast'
ELSE c.relkind::"char"
playpen->  END from (select 'r'::char AS relkind) c;
  relkind | relkind
---------+---------
  r       | t
(1 row)



Joseph Shraibman wrote:
> The following bug has been logged online:
>
> Bug reference:      5028
> Logged by:          Joseph Shraibman
> Email address:      jks@selectacast.net
> PostgreSQL version: 8.3.7
> Operating system:   Linux
> Description:        CASE returns ELSE value always when type is "char"
> Details:
>
> [local]:playpen=> SELECT version();
>                                                   version
>
> ----------------------------------------------------------------------------
> -------------------------------
>  PostgreSQL 8.3.7 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
> 4.1.2 20080704 (Red Hat 4.1.2-44)
> (1 row)
>
> Time: 1.658 ms
> [local]:playpen=>
> [local]:playpen=> select c.relkind, CASE c.relkind WHEN 'r' THEN 'table'
> WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's'
> THEN 'special' WHEN 'c' THEN 'composite type' WHEN 't' THEN 'toast' ELSE
> c.relkind
> playpen->  END from (select 'r'::"char" AS relkind) c;
>  relkind | relkind
> ---------+---------
>  r       | t
> (1 row)
>
> Time: 1.407 ms
> [local]:playpen=>
> [local]:playpen=> select c.relkind, CASE c.relkind WHEN 'r' THEN 'table'
> WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's'
> THEN 'special' WHEN 'c' THEN 'composite type' WHEN 't' THEN 'toast'
> playpen->  END from (select 'r'::"char" AS relkind) c;
>  relkind | case
> ---------+-------
>  r       | table
> (1 row)
>
> Time: 0.426 ms
> [local]:playpen=>
> [local]:playpen=>
> [local]:playpen=> select c.relkind, CASE c.relkind WHEN 'r' THEN 'table'
> WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's'
> THEN 'special' WHEN 'c' THEN 'composite type' WHEN 't' THEN 'toast' ELSE
> c.relkind
> playpen->  END from (select 'r'::char AS relkind) c;
>  relkind | relkind
> ---------+---------
>  r       | table
> (1 row)
>

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Joseph Shraibman
Date:
Sam Mason wrote:
> On Tue, Sep 01, 2009 at 04:36:25PM +0000, Joseph Shraibman wrote:
>> Description:        CASE returns ELSE value always when type is "char"
>
> I think it's just silently truncating the literal to a single character.
>
>> [local]:playpen=> select c.relkind, CASE c.relkind WHEN 'r' THEN 'table'
>> WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's'
>> THEN 'special' WHEN 'c' THEN 'composite type' WHEN 't' THEN 'toast' ELSE
>> c.relkind
>> playpen->  END from (select 'r'::"char" AS relkind) c;
>>  relkind | relkind
>> ---------+---------
>>  r       | t

Ah, OK it seems I didn't pay close enough attention.

>
> Here, 'r' maps to the "char" literal 'table' which PG interprets as the
> value 't'--i.e. PG silently chops of the 'able'.

So the type of what is in the ELSE clause determines the type of the output?

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Tom Lane
Date:
Joseph Shraibman <jks@selectacast.net> writes:
> So the type of what is in the ELSE clause determines the type of the output?

If all the other branches are unknown literals, yes.

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Joseph Shraibman <jks@selectacast.net> writes:
>> So the type of what is in the ELSE clause determines the type of
>> the output?
>
> If all the other branches are unknown literals, yes.

What's the best place to look to get a handle on what the benefits are
of treating character string literals as being of unknown type?
(Other than the obvious backward compatibility issues.)

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Joseph Shraibman <jks@selectacast.net> writes:
>>> So the type of what is in the ELSE clause determines the type of
>>> the output?
>>
>> If all the other branches are unknown literals, yes.

> What's the best place to look to get a handle on what the benefits are
> of treating character string literals as being of unknown type?
> (Other than the obvious backward compatibility issues.)

I think the odds of changing that are not measurably different from
zero.

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Joseph Shraibman <jks@selectacast.net> writes:
>>>> So the type of what is in the ELSE clause determines the type of
>>>> the output?
>>>
>>> If all the other branches are unknown literals, yes.
>
>> What's the best place to look to get a handle on what the benefits
>> are of treating character string literals as being of unknown type?
>> (Other than the obvious backward compatibility issues.)
>
> I think the odds of changing that are not measurably different from
> zero.

I figured that; I'm just trying to understand what seems to me like an
odd wart on the type system.  I figure I must be missing something
important, so I'd kinda like to find out what that is.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Jeff Davis
Date:
On Tue, 2009-09-01 at 13:49 -0500, Kevin Grittner wrote:
> I figured that; I'm just trying to understand what seems to me like an
> odd wart on the type system.  I figure I must be missing something
> important, so I'd kinda like to find out what that is.

If I understand your question, you're comparing:

  (a) leaving a literal as "unknown" until you've finished inferring
      types (current behavior)
  (b) casting every unknown to text immediately, and then trying to
      infer the types

In general, option (b) eliminates information that might be useful for
making good inferences about the correct operators to use, and also
finding cases of ambiguity.

For instance, consider the current behavior:

  1. select now()::text < 'January 01, 2009'; -- true
  2. select now()       < 'January 01, 2009'; -- false
  3. select now()       < 'January 01, 2009'::text;
     ERROR:  operator does not exist: timestamp with time zone < text

Example #2 shows that we can infer the the RHS is of type timestamptz
based on the type of the LHS. That's desirable behavior in any
type-inferencing system -- without it you might as well just explicitly
cast all literals. Example #3 is ambiguous: we have no way to know
whether to choose "< (timestamptz, timestamptz)" or "< (text, text)",
and an ERROR is desirable behavior to avoid confusing results.

But you can't have both of those desirable behaviors unless you are
somehow aware that "'January 01, 2009'" is something more malleable than
"now()" in example #2. Calling the RHS "unknown" in example #2 gives us
that information.

Regards,
    Jeff Davis

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Sam Mason
Date:
On Tue, Sep 01, 2009 at 05:49:25PM +0100, Sam Mason wrote:
> PG could maybe throw an error to tell you this is
> what is happening?

Would something like the included patch be accepted?

--
  Sam  http://samason.me.uk/

Attachment

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Jeff Davis <pgsql@j-davis.com> wrote:
> On Tue, 2009-09-01 at 13:49 -0500, Kevin Grittner wrote:
>> I figured that; I'm just trying to understand what seems to me like
>> an odd wart on the type system.  I figure I must be missing
>> something important, so I'd kinda like to find out what that is.
>
> If I understand your question, you're comparing:
>
>   (a) leaving a literal as "unknown" until you've finished
>       inferring types (current behavior)
>   (b) casting every unknown to text immediately, and then trying to
>       infer the types

No, that's not it.  I'm wondering why it isn't treated as text.
Period.  Full stop.  Nothing to infer.  Anywhere that we have implicit
casts defined from text to something else could, of course, still
operate; but it would be text.  No guessing.

> In general, option (b) eliminates information that might be useful
> for making good inferences about the correct operators to use, and
> also finding cases of ambiguity.

It often seems to have the opposite effect.  See the original post.

> For instance, consider the current behavior:
>
>   1. select now()::text < 'January 01, 2009'; -- true
>   2. select now()       < 'January 01, 2009'; -- false
>   3. select now()       < 'January 01, 2009'::text;
>      ERROR:  operator does not exist: timestamp with time zone <
>              text

In my view, it is wrong that any of those work.  I would expect to
have to code one of these:

select now() < date '2009-01-01';  -- implicit casts should cover
select now() < timestamp with time zone '2009-01-01 00:00:00.0';

I understand that there is probably a huge base of existing code which
counts on being able to be sloppy with types and have PostgreSQL
automagically infer types other than what is actually specified; but
I'd rather not expose such sloppy behavior to those running ad hoc
queries at my site.

> Example #2 shows that we can infer the the RHS is of type
> timestamptz based on the type of the LHS. That's desirable behavior
> in any type-inferencing system -- without it you might as well just
> explicitly cast all literals. Example #3 is ambiguous: we have no
> way to know whether to choose "< (timestamptz, timestamptz)" or "<
> (text, text)", and an ERROR is desirable behavior to avoid confusing
> results.

Here I think you have answered my question.  It is seen as a feature,
since it allows people to avoid the extra keystrokes of coding
type-specific literal values, and allows them the entertainment of
seeing how the values get interpreted.  :-)

> But you can't have both of those desirable behaviors

Whether they are desirable is the point of disagreement.  At least I
now understand the reasoning.

Thanks,

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Jeff Davis <pgsql@j-davis.com> wrote:
>> If I understand your question, you're comparing:
>>
>> (a) leaving a literal as "unknown" until you've finished
>> inferring types (current behavior)
>> (b) casting every unknown to text immediately, and then trying to
>> infer the types

> No, that's not it.  I'm wondering why it isn't treated as text.
> Period.  Full stop.  Nothing to infer.

Because then we would have to provide implicit casts from text to
everything else, which would be horribly dangerous.

> In my view, it is wrong that any of those work.  I would expect to
> have to code one of these:

> select now() < date '2009-01-01';  -- implicit casts should cover
> select now() < timestamp with time zone '2009-01-01 00:00:00.0';

[ shrug... ]  The current design is a compromise between usability
and strictness of semantics.  This proposal appears to be all strictness
and no usability.  No thanks ... even without any backward-compatibility
considerations, I wouldn't find this to be an improvement.

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

>> No, that's not it.  I'm wondering why it isn't treated as text.
>> Period.  Full stop.  Nothing to infer.
>
> Because then we would have to provide implicit casts from text to
> everything else, which would be horribly dangerous.

I would like that even less.  I like errors on type conflicts.

>> In my view, it is wrong that any of those work.  I would expect to
>> have to code one of these:
>
>> select now() < date '2009-01-01';  -- implicit casts should cover
>> select now() < timestamp with time zone '2009-01-01 00:00:00.0';
>
> The current design is a compromise between usability and strictness
> of semantics.  This proposal appears to be all strictness and no
> usability.

I was not proposing anything; I was trying to understand the reasons
for the current behavior so that I could think about what might make
sense to address some of the places where current behavior causes a
result which is different from a non-error result should be obtained
under the standard.  I couldn't begin to anticipate what might be
acceptable in these situations without understanding the reason things
are as they are.

I do understand that there will be "convenience" extensions to the
standard -- all products do that.  I wasn't sure whether that was the
reason for the behavior or whether there was something else in play.

Thanks for clarifying,

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Tom Lane
Date:
Sam Mason <sam@samason.me.uk> writes:
> On Tue, Sep 01, 2009 at 05:49:25PM +0100, Sam Mason wrote:
>> PG could maybe throw an error to tell you this is
>> what is happening?

> Would something like the included patch be accepted?

ISTM this was debated once before and rejected.  However, there's a
proposal over here
http://archives.postgresql.org/pgsql-hackers/2009-08/msg02073.php
to tighten up what the "char" input functions will take, and maybe
throwing error for excess input would fit in with that.

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Jeff Davis
Date:
On Wed, 2009-09-02 at 08:57 -0500, Kevin Grittner wrote:
> >   (a) leaving a literal as "unknown" until you've finished
> >       inferring types (current behavior)
> >   (b) casting every unknown to text immediately, and then trying to
> >       infer the types
>
> No, that's not it.  I'm wondering why it isn't treated as text.
> Period.  Full stop.  Nothing to infer.  Anywhere that we have implicit
> casts defined from text to something else could, of course, still
> operate; but it would be text.  No guessing.

If you have very many implicit casts, I think you lose the
predictability and safety you're looking for, and/or end up with a lot
of errors that eliminate the convenience of implicit casting.

> It often seems to have the opposite effect.  See the original post.

The original problem has more to do with the fact that interpreting an
unknown value as a char seems to just discard a lot of information. I
assume that's part of the standard, but it seems like a bad idea any
time you silently discard data (which is why we prevented varchar(n)
from silently truncating a while ago).

> Here I think you have answered my question.  It is seen as a feature,
> since it allows people to avoid the extra keystrokes of coding
> type-specific literal values, and allows them the entertainment of
> seeing how the values get interpreted.  :-)
>
> > But you can't have both of those desirable behaviors
>
> Whether they are desirable is the point of disagreement.  At least I
> now understand the reasoning.

They are desirable for a system that infers types from context. I agree
that there's more safety by explicitly declaring the type of all
literals; but I disagree that using implicit casts to make up for a lack
of an "unknown" type will improve matters (either for convenience or
safety).

Regards,
    Jeff Davis

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Greg Stark
Date:
On Wed, Sep 2, 2009 at 3:44 PM, Jeff Davis<pgsql@j-davis.com> wrote:
> On Wed, 2009-09-02 at 08:57 -0500, Kevin Grittner wrote:
>> > =A0 (a) leaving a literal as "unknown" until you've finished
>> > =A0 =A0 =A0 inferring types (current behavior)
>> > =A0 (b) casting every unknown to text immediately, and then trying to
>> > =A0 =A0 =A0 infer the types
>>
>> No, that's not it. =A0I'm wondering why it isn't treated as text.
>> Period. =A0Full stop. =A0Nothing to infer. =A0Anywhere that we have impl=
icit
>> casts defined from text to something else could, of course, still
>> operate; but it would be text. =A0No guessing.
>
> If you have very many implicit casts, I think you lose the
> predictability and safety you're looking for, and/or end up with a lot
> of errors that eliminate the convenience of implicit casting.

Perhaps we should stop thinking of "unknown" as, er, "unknown" and
think of it as "text literal". A text literal has implicit casts to
every data type but a normal text string has to be explicitly cast.

Hm, that's not quite right because things like array(1)||'5' don't
treat the '5' as a text literal. The "implicit cast" is preferred to
treating it as text.


--=20
greg
http://mit.edu/~gsstark/resume.pdf

Re: BUG #5028: CASE returns ELSE value always when typeis"char"

From
"Kevin Grittner"
Date:
Jeff Davis <pgsql@j-davis.com> wrote:

> I disagree that using implicit casts to make up for a lack of an
> "unknown" type will improve matters

I certainly never meant to imply that additional implicit casts should
be added.  I apologize for not being more clear about that.

Thanks again for helping fill in the blanks.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Sam Mason
Date:
On Wed, Sep 02, 2009 at 03:50:05PM +0100, Greg Stark wrote:
> Perhaps we should stop thinking of "unknown" as, er, "unknown" and
> think of it as "text literal". A text literal has implicit casts to
> every data type but a normal text string has to be explicitly cast.

How does that help things?  You seem to be keeping the semantics and
only changing the name, when it's the semantics that you seem to be
complaining about.

I'm pretty sure it's correct to treat it as "unknown", other
type-systems do this and it all works well.  The thing that makes things
ambiguous is the ad-hoc polymorphism that exists in function calls
and operators.  With PG's type system you know almost nothing about
any types involved in an arbitrary expression, operators are better
than functions (given the types of the arguments you know the return
type, with default parameters even this knowledge doesn't exist with
functions) but still leave things far too open to have any rigor without
spelling out types in full everywhere.  For example:

  CREATE FUNCTION add(int,int) RETURNS int LANGUAGE sql
     AS $$ SELECT $1 + $2; $$;

  CREATE FUNCTION add(int,int,int DEFAULT NULL) RETURNS text LANGUAGE sql
    AS $$ SELECT ($1 + $2)::text; $$;

What type should it attribute to the result of:

  SELECT add(1,2);

In fact it doesn't seem to want to play ball at all.  Even given the
apparently unambiguous:

  SELECT 1+add(1,2);
or
  SELECT 'hi'||add(1,2);

It doesn't get anywhere.  No need for "text 'hi'" in the second one
because || isn't defined for values of integer type.

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Robert Haas
Date:
On Wed, Sep 2, 2009 at 11:55 AM, Sam Mason<sam@samason.me.uk> wrote:
> In fact it doesn't seem to want to play ball at all. =A0Even given the
> apparently unambiguous:
>
> =A0SELECT 1+add(1,2);
> or
> =A0SELECT 'hi'||add(1,2);
>
> It doesn't get anywhere. =A0No need for "text 'hi'" in the second one
> because || isn't defined for values of integer type.

Right.  This is exactly the sort of thing that languages with real
type inference have no problem handling.  Of course, ML for example
doesn't allow overloading precisely because (AIUI) it makes type
inference difficult.  It would be awesome if we could make this work
though.

...Robert

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Sam Mason
Date:
On Wed, Sep 02, 2009 at 12:36:00PM -0400, Robert Haas wrote:
> On Wed, Sep 2, 2009 at 11:55 AM, Sam Mason<sam@samason.me.uk> wrote:
> > In fact it doesn't seem to want to play ball at all.  Even given the
> > apparently unambiguous:
> >
> >   SELECT 1+add(1,2);
> > or
> >   SELECT 'hi'||add(1,2);
> >
> > It doesn't get anywhere.  No need for "text 'hi'" in the second one
> > because || isn't defined for values of integer type.
>
> Right.  This is exactly the sort of thing that languages with real
> type inference have no problem handling.  Of course, ML for example
> doesn't allow overloading precisely because (AIUI) it makes type
> inference difficult.  It would be awesome if we could make this work
> though.

Difficult, but not intractable.  Haskell has done this sort of thing for
quite a while; although it handles ad-hoc polymorphism differently than
PG does.  You basically end up saying how much polymorphism you want
to allow for each function, for example the return type of an operator
(in PG) is determined exactly by the type of its arguments.  In Haskell
you would have the power to say, if you so wanted, that the type of an
operator's RHS is determined exactly by the type of its LHS and return
type, or even, in the most general case, that it's parametrized over all
three types.  Obviously the more types you leave free the more typing
you have to do specifying all the types as the type-inference has less
leverage to work with.


I've been trying to think about how to apply a more modern type system
to PG for a while and hence my comments about how things like NULL
rows should be handled are based on this and may come across as rather
dogmatic sometimes, it's about the only way I can get things to hold
together without introducing much more complexity than seems strictly
necessary.

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Tom Lane
Date:
Sam Mason <sam@samason.me.uk> writes:
> ... For example:

>   CREATE FUNCTION add(int,int) RETURNS int LANGUAGE sql
>      AS $$ SELECT $1 + $2; $$;

>   CREATE FUNCTION add(int,int,int DEFAULT NULL) RETURNS text LANGUAGE sql
>     AS $$ SELECT ($1 + $2)::text; $$;

> What type should it attribute to the result of:

>   SELECT add(1,2);

> In fact it doesn't seem to want to play ball at all.  Even given the
> apparently unambiguous:

>   SELECT 1+add(1,2);
> or
>   SELECT 'hi'||add(1,2);

> It doesn't get anywhere.

Well, no, because our type resolution is bottom-up; it does not consider
context when trying to resolve the overloaded "add()" function
reference.  "Unknown" is the only part of the system that allows for any
delay at all in identifying the type of a construct, and even that is
limited to a literal and its first-level surrounding context.

It's interesting that you want to go in 100% the opposite direction from
Kevin, who seems to want to eliminate type inference altogether.  Maybe
our current compromise isn't too bad, if it makes everybody unhappy in
opposite directions ;-)

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> It's interesting that you want to go in 100% the opposite direction
> from Kevin, who seems to want to eliminate type inference
> altogether.  Maybe our current compromise isn't too bad, if it makes
> everybody unhappy in opposite directions ;-)

Well, it's probably worth noting that, while I would prefer strong
typing all around, I recognize that unless that's easier than I think
to make as a configuration option, and strict is not the default, it's
not going to happen.  Too many people prefer things the other way.
What I'm most concerned about are the corner cases where strict typing
would give one non-error result and the inferred typing results in an
error or a different result from the strict typing.  I'm willing to
argue that those are bugs, at least when the strongly typed behavior
is mandated by the SQL standard.

I pray that I never again have to deal with a database as generous
with typing as Pick or Advanced Revelation.  Seriously, working with
those gave me that feeling you normally only get during a nightmare.
:-(  When a "date of birth" column can contain a list of phone numbers
-- well, you want just want do something that makes you forget you
saw that....

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Sam Mason
Date:
On Wed, Sep 02, 2009 at 01:19:07PM -0400, Tom Lane wrote:
> Sam Mason <sam@samason.me.uk> writes:
> > ... For example:
>
> >   CREATE FUNCTION add(int,int) RETURNS int LANGUAGE sql
> >      AS $$ SELECT $1 + $2; $$;
>
> >   CREATE FUNCTION add(int,int,int DEFAULT NULL) RETURNS text LANGUAGE sql
> >     AS $$ SELECT ($1 + $2)::text; $$;
>
> > What type should it attribute to the result of:
>
> >   SELECT add(1,2);
>
> > In fact it doesn't seem to want to play ball at all.  Even given the
> > apparently unambiguous:
>
> >   SELECT 1+add(1,2);
> > or
> >   SELECT 'hi'||add(1,2);
>
> > It doesn't get anywhere.
>
> Well, no, because our type resolution is bottom-up; it does not consider
> context when trying to resolve the overloaded "add()" function
> reference.  "Unknown" is the only part of the system that allows for any
> delay at all in identifying the type of a construct, and even that is
> limited to a literal and its first-level surrounding context.

OK, I got distracted and my example was bad.

> It's interesting that you want to go in 100% the opposite direction from
> Kevin, who seems to want to eliminate type inference altogether.  Maybe
> our current compromise isn't too bad, if it makes everybody unhappy in
> opposite directions ;-)

The current compromise has worked for a while so there's no immediate
reason to change it.  I wasn't interpreting Kevin's request directly
as I don't think he really means what he's saying and really wants
an unambiguous way of writing literals in queries.  I think it's
easier to get this without specifying types everywhere and introducing
constraints elsewhere to maintain what compatibility we've got with the
SQL standard.

If we did follow Kevin's request directly, should we also be specifying
the type of NULL?

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Sam Mason <sam@samason.me.uk> wrote:

> If we did follow Kevin's request directly, should we also be
> specifying the type of NULL?

I don't *think* the SQL standard requires that, and barring that I
don't see any compelling reason to type NULL.  One problem I do see
with the current scheme, however, is that NULL *does* get typed to
text when it makes no sense.  In my view, a CASE expression which has
only NULL for its return values, or an abbreviated form of CASE, such
as COALESCE or NULLIF, should be evaluated exactly the same as if they
were replaced by NULL itself.  For example, COALESCE(NULL, NULL)
currently yields NULL::text.  In my view that's wrong.  I view it as a
bug, but that seems to be a hard sell here.

Likewise, I think that in the query which started this thread, the
cast to "char" is not sensible.  I'm not sure how that could be
resolved, but it doesn't seem correct to me.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> What I'm most concerned about are the corner cases where strict typing
> would give one non-error result and the inferred typing results in an
> error or a different result from the strict typing.  I'm willing to
> argue that those are bugs, at least when the strongly typed behavior
> is mandated by the SQL standard.

Are there any such cases?  Your interpretation of strict typing seems to
be that everything is type-labeled to start with, which means that type
inference doesn't actually have anything to do.

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> What I'm most concerned about are the corner cases where strict
>> typing would give one non-error result and the inferred typing
>> results in an error or a different result from the strict typing.
>> I'm willing to argue that those are bugs, at least when the
>> strongly typed behavior is mandated by the SQL standard.
>
> Are there any such cases?  Your interpretation of strict typing
> seems to be that everything is type-labeled to start with, which
> means that type inference doesn't actually have anything to do.

A simple, self-contained example derived from the OP:

test=# create table t (c "char");
CREATE TABLE
test=# insert into t values ('a');
INSERT 0 1
test=# select case when c = 'a' then 'Hey' else c end from t;
 c
---
 H
(1 row)

test=# select case when c = 'a' then 'Hey'::text else c end from t;
  c
-----
 Hey
(1 row)

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Sam Mason
Date:
On Wed, Sep 02, 2009 at 12:54:03PM -0500, Kevin Grittner wrote:
> Sam Mason <sam@samason.me.uk> wrote:
> > If we did follow Kevin's request directly, should we also be
> > specifying the type of NULL?
>
> I don't *think* the SQL standard requires that, and barring that I
> don't see any compelling reason to type NULL.

The SQL standard certainly doesn't require it.  It's just that you were
requiring the types of literals that happened to be enclosed in quotes
to have their type ascribed, so why not the NULL literal?

> One problem I do see
> with the current scheme, however, is that NULL *does* get typed to
> text when it makes no sense.  In my view, a CASE expression which has
> only NULL for its return values, or an abbreviated form of CASE, such
> as COALESCE or NULLIF, should be evaluated exactly the same as if they
> were replaced by NULL itself.  For example, COALESCE(NULL, NULL)
> currently yields NULL::text.  In my view that's wrong.  I view it as a
> bug, but that seems to be a hard sell here.

Yes, that's because PG does a bottom-up solve of the type constraints.
I think it should really result in an unknown type as well.

> Likewise, I think that in the query which started this thread, the
> cast to "char" is not sensible.  I'm not sure how that could be
> resolved, but it doesn't seem correct to me.

All branches unify with the "char" type (i.e. they're all "char" or
unknown) and hence the result of the expression is determined to be of
type "char".

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
I wrote:

> A simple, self-contained example derived from the OP:
>
> test=# create table t (c "char");
> CREATE TABLE
> test=# insert into t values ('a');
> INSERT 0 1
> test=# select case when c = 'a' then 'Hey' else c end from t;
>  c
> ---
>  H
> (1 row)
>
> test=# select case when c = 'a' then 'Hey'::text else c end from t;
>   c
> -----
>  Hey
> (1 row)

And I'm not even sure how I'd explain the rules to someone.  (I guess
that's because I don't understand them, really, but the other way
sounds better....)

test=# drop table t;
DROP TABLE
test=# create table t (c varchar(2));
CREATE TABLE
test=# insert into t values ('a');
INSERT 0 1
test=# select case when c = 'a' then 'Hey' else c end from t;
  c
-----
 Hey
(1 row)

test=# select case when c = 'a' then 'Hey'::text else c end from t;
  c
-----
 Hey
(1 row)

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Sam Mason <sam@samason.me.uk> wrote:

> you were requiring the types of literals that happened to be
> enclosed in quotes to have their type ascribed, so why not the NULL
> literal?

Well, unless things have changed in recent versions of the standard
and I've missed the change, a series of characters enclosed in
apostrophes is what the standard calls a "character string literal"
and defines it to be be related to character based types such as
varchar.  As far as I'm aware, considering it to be undefined is a
PostgreSQL extension.  If you can point to something in the standard
to show where I'm mistaken, I'll look it over.  I'll go looking for
something to back my memories on the topic, too, since my memory seems
to be less reliable than it once was.....

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

> Well, unless things have changed in recent versions of the standard
> and I've missed the change, a series of characters enclosed in
> apostrophes is what the standard calls a "character string literal"
> and defines it to be be related to character based types such as
> varchar.

That still seems to be the case in the draft of the 2003 standard I
have:

<general literal> ::=
    <character string literal>
  | <national character string literal>
  | <Unicode character string literal>
  | <binary string literal>
  | <datetime literal>
  | <interval literal>
  | <boolean literal>
<character string literal> ::=
    [ <introducer><character set specification> ]
    <quote> [ <character representation>... ] <quote>
    [ { <separator> <quote> [ <character representation>... ] <quote>
}... ]

The ball's in your court to show something in the standard to say that
a character string literal is ever *not* to be taken as a character
string.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Sam Mason
Date:
On Wed, Sep 02, 2009 at 01:27:35PM -0500, Kevin Grittner wrote:
> Sam Mason <sam@samason.me.uk> wrote:
> > you were requiring the types of literals that happened to be
> > enclosed in quotes to have their type ascribed, so why not the NULL
> > literal?
>
> Well, unless things have changed in recent versions of the standard
> and I've missed the change, a series of characters enclosed in
> apostrophes is what the standard calls a "character string literal"
> and defines it to be be related to character based types such as
> varchar.  As far as I'm aware, considering it to be undefined is a
> PostgreSQL extension.  If you can point to something in the standard
> to show where I'm mistaken, I'll look it over.  I'll go looking for
> something to back my memories on the topic, too, since my memory seems
> to be less reliable than it once was.....

Sorry, I was referring to your explicit naming of types as in the
following:

  http://archives.postgresql.org/message-id/4A9E3378020000250002A87E@gw.wicourts.gov

reading it back again I'm not sure if that's what you meant now.  Email
is hard work isn't it!

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> And I'm not even sure how I'd explain the rules to someone.

text is preferred to "char" which is preferred to unknown.

This particular example would be less confusing if 'Hey'::"char"
threw an error, but that behavior is the result of an ancient
(bad?) decision in the input function of one legacy datatype.
It's not, IMNSHO, evidence of an overall failure of the type system
as a whole.

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Sam Mason <sam@samason.me.uk> wrote:
>> If we did follow Kevin's request directly, should we also be
>> specifying the type of NULL?

> I don't *think* the SQL standard requires that, and barring that I
> don't see any compelling reason to type NULL.

Actually, AFAICS the SQL spec *does* require you to cast NULL literals
explicitly in nearly all contexts.  There are exceptions for assigning
NULL directly to a field (in INSERT or UPDATE), and maybe some other
places, but not many.

The PG type system treats a bare NULL literal as "unknown", so most
of the same issues come up for either NULL or literal constants.
If you were to try to get rid of the "unknown" concept, I think the
only place you'd find yourself violating the standard is the aforesaid
narrow exceptions.  Usability is a different question though ...

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> And I'm not even sure how I'd explain the rules to someone.
>
> text is preferred to "char" which is preferred to unknown.
>
> This particular example would be less confusing if 'Hey'::"char"
> threw an error, but that behavior is the result of an ancient
> (bad?) decision in the input function of one legacy datatype.
> It's not, IMNSHO, evidence of an overall failure of the type system
> as a whole.

So the behavior of the "char" type is anomalous in this regard?  Other
character-based types behave like varchar (which has the behavior I
would expect here)?  That is encouraging.  Why isn't the behavior of
"char" in this regard considered a bug to be fixed?

I'm not sure I'm exactly understanding why the varchar(2) worked,
though.  Perhaps it would be more clear if I grasped why *that* one
does what I would think is the right thing.  At this point my first
guess would be that it discards the length for a varchar, and just
treats it as text (or some other "neutral" character-based type).  If
so, perhaps "char" should do the same?

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Sam Mason
Date:
On Wed, Sep 02, 2009 at 01:55:28PM -0500, Kevin Grittner wrote:
> So the behavior of the "char" type is anomalous in this regard?  Other
> character-based types behave like varchar (which has the behavior I
> would expect here)?  That is encouraging.  Why isn't the behavior of
> "char" in this regard considered a bug to be fixed?

I think there are lots of implicit casts going on that muddy the water
with respect to what's going on at a type level.  Things get promoted to
TEXT easily in PG.

I posted a patch in this thread:

  http://archives.postgresql.org/message-id/20090902091654.GL5407@samason.me.uk

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Sam Mason
Date:
On Wed, Sep 02, 2009 at 01:37:20PM -0500, Kevin Grittner wrote:
> That still seems to be the case in the draft of the 2003 standard I
> have:
>
> <general literal> ::=
>     <character string literal>
>   | <national character string literal>
>   | <Unicode character string literal>
>   | <binary string literal>
>   | <datetime literal>
>   | <interval literal>
>   | <boolean literal>
> <character string literal> ::=
>     [ <introducer><character set specification> ]
>     <quote> [ <character representation>... ] <quote>
>     [ { <separator> <quote> [ <character representation>... ] <quote>
> }... ]
>
> The ball's in your court to show something in the standard to say that
> a character string literal is ever *not* to be taken as a character
> string.

Huh, you're right.  I'd always thought '2001-01-01' was a valid date
literal, seems the standard has required it to be prefixed by DATE at
least back to SQL92.

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Sam Mason
Date:
On Wed, Sep 02, 2009 at 12:54:03PM -0500, Kevin Grittner wrote:
> Sam Mason <sam@samason.me.uk> wrote:
> > If we did follow Kevin's request directly, should we also be
> > specifying the type of NULL?
>
> I don't *think* the SQL standard requires that, and barring that I
> don't see any compelling reason to type NULL.

I've just realized that either I'm missing your point entirely (it's
happened before :) or this ignores the point entirely.  PG wants to
assign types to every expression, whether this expression will evaluate
to a NULL value at run-time or not is immaterial in this regard.  I
think SQL wants to do the same, but I don't have as much conviction as
Tom here.  Once we're ascribing types to expressions then whether it
happens to contain the literal "1", "'txt'" or "NULL" we're committed to
giving it some type---the only question is which one.  We thus need to
type expressions consisting of just NULL constants.

A fun puzzle to base any inductive solution on is what type to ascribe
to the following:

  CREATE VIEW v (c) AS
    SELECT NULL;

PG allows it, but the resulting view seems somewhat unusable.

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Sam Mason <sam@samason.me.uk> wrote:

> I'd always thought '2001-01-01' was a valid date literal, seems the
> standard has required it to be prefixed by DATE at least back to
> SQL92.

Yep.  I don't know if it would be remotely feasible, but the
implementation which seems like it would be "standard-safe" but still
give reasonable concessions to those wanting to skip the extra
keystrokes of declaring the type of literals which are not character
based would be to go with the suggestion of having a character string
literal type, and change the semantics such that if there is a valid
interpretation of the statement with the character string literal
taken as text, it should be used; if not, resolve by current "unknown"
rules. Probably not feasible, but it seems likely it would make
everyone reasonably happy if it could be done.

That leaves the issue of NULL being forced to type text in the absence
of any type info in CASE, COALESCE, and NULLIF.  If there were a way
to say that these could return unknown type, that would be solved.
That doesn't seem as though it would be likely to be massively
difficult, although I could be wrong about that.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Sam Mason <sam@samason.me.uk> wrote:

>   CREATE VIEW v (c) AS
>     SELECT NULL;
>
> PG allows it, but the resulting view seems somewhat unusable.

I'm not sure whether the only place the standard doesn't require a
cast is on assignment, but this is one place that the standard clearly
does require a cast, and I'm all for that.  Requiring a cast anywhere
else the standard requires it would not offend me; although not
requiring it anywhere it doesn't generate nonstandard results, and
where the semantics are relatively sane, wouldn't offend me, either.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Sam Mason
Date:
On Wed, Sep 02, 2009 at 02:41:32PM -0500, Kevin Grittner wrote:
> Sam Mason <sam@samason.me.uk> wrote:
>
> > I'd always thought '2001-01-01' was a valid date literal, seems the
> > standard has required it to be prefixed by DATE at least back to
> > SQL92.
>
> Yep.  I don't know if it would be remotely feasible, but the
> implementation which seems like it would be "standard-safe" but still
> give reasonable concessions to those wanting to skip the extra
> keystrokes of declaring the type of literals which are not character
> based would be to go with the suggestion of having a character string
> literal type, and change the semantics such that if there is a valid
> interpretation of the statement with the character string literal
> taken as text, it should be used; if not, resolve by current "unknown"
> rules. Probably not feasible, but it seems likely it would make
> everyone reasonably happy if it could be done.

Sounds as though that'll introduce more ambiguity into the system than
there is already.

> That leaves the issue of NULL being forced to type text in the absence
> of any type info in CASE, COALESCE, and NULLIF.  If there were a way
> to say that these could return unknown type, that would be solved.
> That doesn't seem as though it would be likely to be massively
> difficult, although I could be wrong about that.

Would be nice and I'd love it to work like this, but it's quite a big
change I think.  Currently, once PG has decided on a type it sticks with
it against all further evidence.  Another example:

  SELECT NULL AS c UNION SELECT '1' UNION SELECT 2;

Once you're doing the above you're into the world of full parametric
polymorphism and you're having to do much more complicated things at the
type level.  When my free time becomes infinite I'll have a chance!

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Yep.  I don't know if it would be remotely feasible, but the
> implementation which seems like it would be "standard-safe" but still
> give reasonable concessions to those wanting to skip the extra
> keystrokes of declaring the type of literals which are not character
> based would be to go with the suggestion of having a character string
> literal type, and change the semantics such that if there is a valid
> interpretation of the statement with the character string literal
> taken as text, it should be used; if not, resolve by current "unknown"
> rules.

There is already a weak preference for resolving unknown as text in
the presence of multiple alternatives.  So I'm not sure that you're
suggesting anything different from what happens now.  In particular,
weren't you the same person complaining a moment ago about
COALESCE(NULL,NULL) defaulting to text?  Why is that bad if the
above is good?

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Sam Mason
Date:
On Wed, Sep 02, 2009 at 02:59:54PM -0500, Kevin Grittner wrote:
> Sam Mason <sam@samason.me.uk> wrote:
> >   CREATE VIEW v (c) AS
> >     SELECT NULL;
> >
> > PG allows it, but the resulting view seems somewhat unusable.
>
> I'm not sure whether the only place the standard doesn't require a
> cast is on assignment, but this is one place that the standard clearly
> does require a cast, and I'm all for that.

I'm probably missing something obvious again, but where does it say
that?  Bear in mind that my simple NULL could be an arbitrarily complex
expression, I've just chosen a simple NULL for pedagogic reasons.

I can only see a requirement that the chosen type must be compatible.
That seems to leave it open to arbitrarily choosing any type in this
case.

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Yep.  I don't know if it would be remotely feasible, but the
>> implementation which seems like it would be "standard-safe" but
>> still give reasonable concessions to those wanting to skip the
>> extra keystrokes of declaring the type of literals which are not
>> character based would be to go with the suggestion of having a
>> character string literal type, and change the semantics such that
>> if there is a valid interpretation of the statement with the
>> character string literal taken as text, it should be used; if not,
>> resolve by current "unknown" rules.
>
> There is already a weak preference for resolving unknown as text in
> the presence of multiple alternatives.  So I'm not sure that you're
> suggesting anything different from what happens now.  In particular,
> weren't you the same person complaining a moment ago about
> COALESCE(NULL,NULL) defaulting to text?  Why is that bad if the
> above is good?

Because COALESCE(NULL, NULL) has given no indication that it is
character based, while 'x' is defined by the standard to be a
character string literal.  The two uses of "unknown" in typing seem to
be solving different problems.  Perhaps using the same flag for both
is part of the problem.  The above was suggesting we differentiate.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Sam Mason <sam@samason.me.uk> wrote:
> On Wed, Sep 02, 2009 at 02:59:54PM -0500, Kevin Grittner wrote:
>> Sam Mason <sam@samason.me.uk> wrote:
>> >   CREATE VIEW v (c) AS
>> >     SELECT NULL;
>> >
>> > PG allows it, but the resulting view seems somewhat unusable.
>>
>> I'm not sure whether the only place the standard doesn't require a
>> cast is on assignment, but this is one place that the standard
>> clearly does require a cast, and I'm all for that.
>
> I'm probably missing something obvious again, but where does it say
> that?  Bear in mind that my simple NULL could be an arbitrarily
> complex expression, I've just chosen a simple NULL for pedagogic
> reasons.
>
> I can only see a requirement that the chosen type must be
> compatible.  That seems to leave it open to arbitrarily choosing any
> type in this case.

I'll look.  On reflection, I based that statement on the fact that our
SQL parser requires it, and it was largely created by scanning in the
SQL syntax rules, using OCR on the image, and sed to format them for
ANTLR.  That doesn't guarantee that we didn't add that particular
requirement for the sake of our own sanity some time later.  I doubt I
can get to looking through the standard for it today, though.

By the way, the case of a bare NULL literal is potentially very
different from a complex expression which might resolve to NULL.  The
latter would generally have some result type which could be determined
even in tha absence of an actual value.  For example, a null-capable
column clearly has a type which can be determined.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

>> go with the suggestion of having a character string literal type,
>> and change the semantics such that if there is a valid
>> interpretation of the statement with the character string literal
>> taken as text, it should be used; if not, resolve by current
>> "unknown" rules.
>
> There is already a weak preference for resolving unknown as text in
> the presence of multiple alternatives.  So I'm not sure that you're
> suggesting anything different from what happens now.

It is certainly different for the example I gave up-thread involving
"char".  Perhaps that is a very unique and isolated situation.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Tom Lane
Date:
Sam Mason <sam@samason.me.uk> writes:
> On Wed, Sep 02, 2009 at 02:59:54PM -0500, Kevin Grittner wrote:
>> I'm not sure whether the only place the standard doesn't require a
>> cast is on assignment, but this is one place that the standard clearly
>> does require a cast, and I'm all for that.

> I'm probably missing something obvious again, but where does it say
> that?

They express it as a syntactic constraint not a semantic one.  NULL
without decoration is a <contextually typed value specification>
which is allowed in only a limited number of places.  One place
it's allowed is inside a <cast specification>, which is reachable
from <value expression>.  The latter is allowed in lots of places.

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Because COALESCE(NULL, NULL) has given no indication that it is
> character based, while 'x' is defined by the standard to be a
> character string literal.  The two uses of "unknown" in typing seem to
> be solving different problems.  Perhaps using the same flag for both
> is part of the problem.  The above was suggesting we differentiate.

I've always thought that having the type system treat NULL and 'foo'
alike was a feature, not a bug.  In a formal sense the type information
available is the same, ie, none.  The argument that SQL says 'foo'
must be character, so we should too, is greatly weakened by the fact
that SQL has such an impoverished set of built-in types.  If we want
to treat user-defined types as anything approaching first-class types,
we have to be pretty suspicious of that restriction.

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> In a formal sense the type information available is the same, ie,
> none.

Well, in the sense that an international standard is formal, there is
a formal difference, in that one has no type information and the other
is a character string.  However:

> The argument that SQL says 'foo' must be character, so we should
> too, is greatly weakened by the fact that SQL has such an
> impoverished set of built-in types.  If we want to treat
> user-defined types as anything approaching first-class types, we
> have to be pretty suspicious of that restriction.

Another big clue for me in terms of the community perspective on this.
Thanks.

I think that the current approach leaves a small number of corner
cases where we break SQL compliance.  I think it's worthwhile trying
to fix that.  Whether that's best done by identifying the individual
corners and fixing them independently as aberrations, or implementing
some changes which provide the PostgreSQL extensions in a way that
doesn't tend to break standard usage (and of course has little or no
impact on current PostgreSQL users), is beyond my ken.

I'm also not suggesting that this is the most urgent issue around.
If anyone can suggest an appropriate wording for a TODO on the topic,
I'll happily shut up and move on....  ;-)

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Greg Stark
Date:
So one of the elephants in the room in this (rather dead-end)
discussion is that one of the things "unknown" is good for is the fact
that most clients don't bind their parameter types to specific types.
Doing so is extremely cumbersome in just about every interface because
it forces you to think about SQL types and look up constants for every
parameter type. It's even worse if you have user-defined types on the
server and have to start figuring out how to look these up
dynamically.

We use unknown to normally dtrt when a client passes a text literal
representation without forcing them to tell us what type to interpret
it as. Most client interfaces can just leave every parameter set to
type unknown and let Postgres figure out what to do with everything.

However it occurs to me that that doesn't work very well for
substring(). If your client interface doesn't implicitly bind the
second argument to integer it'll be interpreted as text by default and
you get what is usually going to not be what you want;

postgres=# select substring('foobar456',4);
 substring
-----------
 bar456
(1 row)

postgres=# select substring('foobar456','4');
 substring
-----------
 4
(1 row)

This for example makes it awkward to use from Perl:

$ perl -e 'use Data::Dumper; use DBI; use DBD::Pg; $d =
DBI->connect("dbi:Pg:database=postgres;host=localhost;port=1234");
print Dumper $d->selectall_arrayref(q{select
substring('\''foobar'\'',?)},{},'4');'
$VAR1 = [
          [
            undef
          ]
        ];
$ perl -e 'use Data::Dumper; use DBI; use DBD::Pg; $d =
DBI->connect("dbi:Pg:database=postgres;host=localhost;port=1234");
print Dumper $d->selectall_arrayref(q{select
substring('\''foobar'\'',?)},{},4);'
$VAR1 = [
          [
            undef
          ]
        ];

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> I think that the current approach leaves a small number of corner
> cases where we break SQL compliance.  I think it's worthwhile trying
> to fix that.  Whether that's best done by identifying the individual
> corners and fixing them independently as aberrations, or implementing
> some changes which provide the PostgreSQL extensions in a way that
> doesn't tend to break standard usage (and of course has little or no
> impact on current PostgreSQL users), is beyond my ken.

It's certainly worth looking into.  I would not want to prejudge how
to fix such issues; I think the best way to proceed would be to create
a list of the problems first.

One other point worth making is that we don't always consider SQL
compliance to be a hard requirement that trumps every other
consideration.  An example is case-folding of identifiers; it's
been pretty well agreed that between readability and backwards-
compatibility considerations, we simply aren't going to switch over
to doing it exactly like the spec.  So any proposed tweaks in this area
would be considered as tradeoffs between better spec compliance and
other goals.

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> One other point worth making is that we don't always consider SQL
> compliance to be a hard requirement that trumps every other
> consideration.

Point noted.

> An example is case-folding of identifiers; it's been pretty well
> agreed that between readability and backwards-compatibility
> considerations, we simply aren't going to switch over to doing it
> exactly like the spec.

I've left that one alone both because I saw the discussions of it and
because our framework automatically fixes identifier capitalization
based on our metadata and then wraps the all identifiers in quotes.
As long as PostgreSQL honors the rules about quoted identifiers, we're
golden.  :-)  (Although, one of these days I may try to do something
about how psql, etc. treat identifiers which are all lower case.  The
refusal to consider the quoted form a match during tab-completion, for
example, is a regular annoyance.)

> So any proposed tweaks in this area would be considered as tradeoffs
> between better spec compliance and other goals.

Fair enough.  I consider myself warned.  ;-)

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is"char"

From
Robert Haas
Date:
On Wed, Sep 2, 2009 at 3:34 PM, Sam Mason<sam@samason.me.uk> wrote:
> On Wed, Sep 02, 2009 at 12:54:03PM -0500, Kevin Grittner wrote:
>> Sam Mason <sam@samason.me.uk> wrote:
>> > If we did follow Kevin's request directly, should we also be
>> > specifying the type of NULL?
>>
>> I don't *think* the SQL standard requires that, and barring that I
>> don't see any compelling reason to type NULL.
>
> I've just realized that either I'm missing your point entirely (it's
> happened before :) or this ignores the point entirely. =A0PG wants to
> assign types to every expression, whether this expression will evaluate
> to a NULL value at run-time or not is immaterial in this regard. =A0I
> think SQL wants to do the same, but I don't have as much conviction as
> Tom here. =A0Once we're ascribing types to expressions then whether it
> happens to contain the literal "1", "'txt'" or "NULL" we're committed to
> giving it some type---the only question is which one. =A0We thus need to
> type expressions consisting of just NULL constants.
>
> A fun puzzle to base any inductive solution on is what type to ascribe
> to the following:
>
> =A0CREATE VIEW v (c) AS
> =A0 =A0SELECT NULL;

'a of course.

...Robert

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Tom Lane
Date:
Sam Mason <sam@samason.me.uk> writes:
> Would something like the included patch be accepted?
> [ patch to make charin() throw error for excess input ]

I remembered the reason why this idea got rejected previously.
The winning argument was that "char" should behave as nearly
as possible like the standard type char(1).  And guess what,
that truncates silently:

regression=# select 'foo'::char(1);
 bpchar
--------
 f
(1 row)

However, I think we forgot that that is only true for an explicit
coercion to char.  If you do this:

regression=# create table fool (f1 char(1));
CREATE TABLE
regression=# insert into fool values('123');
ERROR:  value too long for type character(1)

(I'm too lazy to go look up chapter and verse, but all of this is per
SQL standard.  Consistency ain't their strong point.)

I certainly don't want to have "char" emulate the misbegotten decision
to have explicit and implicit coercions behave differently.  So it
looks to me like the argument to make "char" work like char(1) doesn't
actually help us much to decide if an error should be thrown here or
not.  On the whole, throwing an error seems better from a usability
perspective.

Comments?

            regards, tom lane

PS: I would suggest that we borrow the wording and SQLSTATE of the
existing message, if we do make it throw an error.

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Robert Haas
Date:
On Fri, Sep 4, 2009 at 11:15 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> I certainly don't want to have "char" emulate the misbegotten decision
> to have explicit and implicit coercions behave differently. =A0So it
> looks to me like the argument to make "char" work like char(1) doesn't
> actually help us much to decide if an error should be thrown here or
> not. =A0On the whole, throwing an error seems better from a usability
> perspective.

I like errors.

...Robert

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I certainly don't want to have "char" emulate the misbegotten
> decision to have explicit and implicit coercions behave differently.
> So it looks to me like the argument to make "char" work like char(1)
> doesn't actually help us much to decide if an error should be thrown
> here or not.  On the whole, throwing an error seems better from a
> usability perspective.

I feel that the behavior of "char" in at least this case should match
char(1) (or just plain char):

test=# select case when true then 'xxx' else 'a'::"char" end from t;
 case
------
 x
(1 row)

test=# select case when true then 'xxx' else 'a'::char(1) end from t;
 case
------
 xxx
(1 row)

test=# select case when true then 'xxx' else 'a'::char end from t;
 case
------
 xxx
(1 row)

Much as the reason for the behavior of "char" may seem clear when
inside the code looking out, it is astonishing for someone writing
application code.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Robert Haas
Date:
On Fri, Sep 4, 2009 at 11:59 AM, Kevin
Grittner<Kevin.Grittner@wicourts.gov> wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> I certainly don't want to have "char" emulate the misbegotten
>> decision to have explicit and implicit coercions behave differently.
>> So it looks to me like the argument to make "char" work like char(1)
>> doesn't actually help us much to decide if an error should be thrown
>> here or not. =A0On the whole, throwing an error seems better from a
>> usability perspective.
>
> I feel that the behavior of "char" in at least this case should match
> char(1) (or just plain char):
>
> test=3D# select case when true then 'xxx' else 'a'::"char" end from t;
> =A0case
> ------
> =A0x
> (1 row)
>
> test=3D# select case when true then 'xxx' else 'a'::char(1) end from t;
> =A0case
> ------
> =A0xxx
> (1 row)
>
> test=3D# select case when true then 'xxx' else 'a'::char end from t;
> =A0case
> ------
> =A0xxx
> (1 row)
>
> Much as the reason for the behavior of "char" may seem clear when
> inside the code looking out, it is astonishing for someone writing
> application code.
>
> -Kevin

Yeah,  I agree.  That's really confusing.

...Robert

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Sam Mason
Date:
On Fri, Sep 04, 2009 at 10:59:48AM -0500, Kevin Grittner wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I certainly don't want to have "char" emulate the misbegotten
> > decision to have explicit and implicit coercions behave differently.
> > So it looks to me like the argument to make "char" work like char(1)
> > doesn't actually help us much to decide if an error should be thrown
> > here or not.  On the whole, throwing an error seems better from a
> > usability perspective.

I'm all for the error being thrown; no particular feelings about whether
it only happens during "explicit" casts or everywhere.  There's always
the substring() function if the user wants it.

> I feel that the behavior of "char" in at least this case should match
> char(1) (or just plain char):

Hum, I'm not sure if that's useful behavior.  As far as I can tell, you
seem to be wanting in-memory representations of "string like types"
to all use the same representation and only use the actual types when
saving to/from disk.  This would give behavior that is consistent with
what you're for asking below.

> test=# select case when true then 'xxx' else 'a'::"char" end from t;
>  case
> ------
>  x
> (1 row)

With the patch I gave, or something like it, this would throw an error
because 'xxx' is being used to initialize a value of "char" type.

> test=# select case when true then 'xxx' else 'a'::char(1) end from t;
>  case
> ------
>  xxx
> (1 row)

This gives back 'xxx' because the types character and bpchar have an
implicit cast defined between them.  The result is thus of type bpchar
which places no restrictions on the length (that would apply here
anyway), with 'a' being cast from character to bpchar implicitly.

> test=# select case when true then 'xxx' else 'a'::char end from t;
>  case
> ------
>  xxx
> (1 row)

This does the same as above.

> Much as the reason for the behavior of "char" may seem clear when
> inside the code looking out, it is astonishing for someone writing
> application code.

I think things would be clearer if an error was thrown in the
constructor of "char" types when the length wasn't correct.  Implicit
casts are muddying the waters as well, which doesn't aid understanding
of what's going on here.

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
"Kevin Grittner"
Date:
Sam Mason <sam@samason.me.uk> wrote:

> you seem to be wanting in-memory representations of "string like
> types" to all use the same representation and only use the actual
> types when saving to/from disk.

Doing so when actually assigning to *something* of the more specific
type would probably be better.  In my view, that's not happening here;
although I see that the code currently doesn't recognize the
difference.

>> test=# select case when true then 'xxx' else 'a'::"char" end from
t;
>>  case
>> ------
>>  x
>> (1 row)
>
> With the patch I gave, or something like it, this would throw an
> error because 'xxx' is being used to initialize a value of "char"
> type.

As I read the semantics of the CASE predicate, it returns one of the
given values.  'x' is not one of the given values, regardless of type.
I don't think an error is the right thing, I think returning the
specified value is the right thing.  I don't think it's a good thing
that the type system decides that the result type for this case
predicate is "char" and that 'xxx' needs to be coerced to that type.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Jeff Davis
Date:
On Fri, 2009-09-04 at 11:15 -0400, Tom Lane wrote:
> On the whole, throwing an error seems better from a usability
> perspective.
>
> Comments?

Agreed.

Regards,
    Jeff Davis

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Sam Mason
Date:
On Fri, Sep 04, 2009 at 12:26:19PM -0500, Kevin Grittner wrote:
> Sam Mason <sam@samason.me.uk> wrote:
> > Kevin Grittner wrote:
> >> test=# select case when true then 'xxx' else 'a'::"char" end from t;
> >
> > 'xxx' is being used to initialize a value of "char" type.
>
> As I read the semantics of the CASE predicate, it returns one of the
> given values.  'x' is not one of the given values, regardless of type.

You seem to be confused about the difference between literals and
values.  Maybe a different example:

  SELECT '00001'::int;

I get '1' back from that, and not '00001'.  This is because '00001' is
the literal that is parsed into a value of type integer and then the
query is run and this same value is asked to convert itself back into a
literal to be written out to the screen.

Back to your example; you're asking PG to interpret the literal 'xxx'
as a "char" and it does that (but doesn't give any error back when it
chucks data away).  This behavior may be confusing because for text
types the literal exactly the same as the value itself, but this is only
a very specific behavior of text types.  For example, '{"1"}', '{1}'
and even '{"+001"}' are all literal representations of identical integer
arrays.

> I don't think an error is the right thing, I think returning the
> specified value is the right thing.  I don't think it's a good thing
> that the type system decides that the result type for this case
> predicate is "char" and that 'xxx' needs to be coerced to that type.

I fail to see how an error isn't the right thing; if we try with some
other types let see if you think any of these should succeed.

  SELECT CASE WHEN TRUE THEN text 'xxx' ELSE 0 END;
  SELECT CASE WHEN TRUE THEN text 'xxx' ELSE TRUE END;
  SELECT CASE WHEN TRUE THEN text 'xxx' ELSE '{1}'::INT[] END;
  SELECT CASE WHEN TRUE THEN text 'xxx' ELSE array [1] END;

"char" is no different other than, by default, it happens to look a lot
like any value of text type.  It's a different type (that happens to
have some implicit casts to confuse things) and hence I can't see why
invalid literals should not be thrown out.

--
  Sam  http://samason.me.uk/

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
"Kevin Grittner"
Date:
Sam Mason <sam@samason.me.uk> wrote:

> I fail to see how an error isn't the right thing; if we try with
> some other types let see if you think any of these should succeed.
>
>   SELECT CASE WHEN TRUE THEN text 'xxx' ELSE 0 END;
>   SELECT CASE WHEN TRUE THEN text 'xxx' ELSE TRUE END;
>   SELECT CASE WHEN TRUE THEN text 'xxx' ELSE '{1}'::INT[] END;
>   SELECT CASE WHEN TRUE THEN text 'xxx' ELSE array [1] END;
>
> "char" is no different other than, by default, it happens to look a
> lot like any value of text type.

So much so that it has the same name as a text type (wrapped in
quotes) and behaves a lot like one:

test=# SELECT CASE WHEN TRUE THEN text 'xxx' ELSE 'a'::"char" END;
 case
------
 xxx
(1 row)

test=# select upper('a'::"char");
 upper
-------
 A
(1 row)

test=# select char_length('a'::"char");
 char_length
-------------
           1
(1 row)

test=# select substring('a'::"char" from 1 for 1);
 substring
-----------
 a
(1 row)

Making it behave so much like character-based types and giving it a
name which implies that it is character based and storing a character
in it, but then not treating it like other character types in the
CASE context is bound to cause surprises for people.

> It's a different type (that happens to have some implicit casts to
> confuse things) and hence I can't see why invalid literals should
> not be thrown out.

Only, I guess, because of the name.  If it weren't called "char" I
guess I wouldn't be concerned about people expecting it to behave
something like char.  If "char" behaved more like char, the 'xxx'
literal wouldn't be taken as input to the type in the above CASE
statement.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Only, I guess, because of the name.  If it weren't called "char" I
> guess I wouldn't be concerned about people expecting it to behave
> something like char.  If "char" behaved more like char, the 'xxx'
> literal wouldn't be taken as input to the type in the above CASE
> statement.

I'm not certain what you're trying to say, but the above is complete
nonsense ...

            regards, tom lane

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I'm not certain what you're trying to say, but the above is complete
> nonsense ...

Ah, so it is.  I thought someone up-thread said that in this case it
wound up as bpchar; but I see that's not so:

test=# select pg_typeof((select case when true then 'xxx' else
'a'::char(1) end));
 pg_typeof
-----------
 character
(1 row)

All that's done is to strip off the length.

I guess that since the "char" type is documented as being for internal
use, these issues would only affect those who choose to write queries
against catalog tables or use an internal type in their tables, so I
guess it's not worth going to extremes to make it behave like char.

Given all that, I'll conceed the point, and give a +1 for the error
message.

-Kevin

Re: BUG #5028: CASE returns ELSE value always when type is "char"

From
Sam Mason
Date:
On Fri, Sep 04, 2009 at 02:01:07PM -0500, Kevin Grittner wrote:
> > "char" is no different other than, by default, it happens to look a
> > lot like any value of text type.
>
> So much so that it has the same name as a text type (wrapped in
> quotes) and behaves a lot like one:

You're getting bitten by implicit casts.  Try creating a view of any
of your examples and seeing what the resulting types are, or in 8.4 an
EXPLAIN VERBOSE seems to show how it's typed the expression.

--
  Sam  http://samason.me.uk/