Thread: boolean <=> text explicit casts

boolean <=> text explicit casts

From
Neil Conway
Date:
I noticed that SQL:2003 specifies explicit casts between "boolean" and
the character string types. Attached is a patch that implements them,
and adds some simple regression tests.

A few points worth noting:

(1) The SQL spec requires that text::boolean trim leading and trailing
whitespace from the input

(2) The spec also requires that boolean::varchar(n) should raise an
error if "n" is not large enough to accomodate the textual
representation of the boolean value. We currently truncate:

=> select true::boolean::varchar(3);
 varchar
---------
 TRU

Not sure offhand if there's an easy way to satisfy the spec's
requirement...

(3) The spec suggests that true/false should be upper-cased when
converted to text, so that's what I've implemented, but one could argue
that converting to lower-case would be more consistent with PG's general
approach to case folding.

-Neil


Attachment

Re: boolean <=> text explicit casts

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> (2) The spec also requires that boolean::varchar(n) should raise an
> error if "n" is not large enough to accomodate the textual
> representation of the boolean value.

Really?  That's in direct contradiction to the "normal" spec-required
behavior of casting to varchar(n).  I'd suggest ignoring it on the
grounds that the SQL committee have forgotten what they wrote
themselves.

> (3) The spec suggests that true/false should be upper-cased when
> converted to text, so that's what I've implemented, but one could argue
> that converting to lower-case would be more consistent with PG's general
> approach to case folding.

hm, +1 for lower case myself, but not dead set on it.

More generally, I'm really hoping to get rid of bespoke text<->whatever
cast functions in favor of using datatypes' I/O functions.  To what
extent can we make the boolean I/O functions serve for this?  It seems
relatively painless on the input side --- just allow whitespace --- but
I suppose we can't change boolout's historical result of "t"/"f" without
causing problems.

Also, invoking btrim() seems an exceedingly expensive way of ignoring a
bit of whitespace.  I suppose inefficiency in a seldom-used cast
function does not matter, but please don't do it that way in boolin.

            regards, tom lane

Re: boolean <=> text explicit casts

From
Neil Conway
Date:
On Mon, 2007-28-05 at 15:38 -0400, Tom Lane wrote:
> More generally, I'm really hoping to get rid of bespoke text<->whatever
> cast functions in favor of using datatypes' I/O functions.

I don't object, but I'm curious: is there a benefit to this other than
brevity of implementation? ISTM the spec has the idea that the input to
a type's constructor is often distinct from the type's text => type
casting behavior.

-Neil



Re: boolean <=> text explicit casts

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> On Mon, 2007-28-05 at 15:38 -0400, Tom Lane wrote:
>> More generally, I'm really hoping to get rid of bespoke text<->whatever
>> cast functions in favor of using datatypes' I/O functions.

> I don't object, but I'm curious: is there a benefit to this other than
> brevity of implementation? ISTM the spec has the idea that the input to
> a type's constructor is often distinct from the type's text => type
> casting behavior.

Well, (a) it would fill in a whole lot of text-conversion cases that are
currently missing, and (b) it would encourage datatype implementors to
keep the I/O and text-conversion cases behaving alike unless there were
a REALLY good reason not to.  IMHO most of the cases that the SQL spec
calls out as behaving differently are pure brain-damage.

            regards, tom lane

Re: boolean <=> text explicit casts

From
Neil Conway
Date:
On Mon, 2007-28-05 at 15:38 -0400, Tom Lane wrote:
> More generally, I'm really hoping to get rid of bespoke text<->whatever
> cast functions in favor of using datatypes' I/O functions.  To what
> extent can we make the boolean I/O functions serve for this?  It seems
> relatively painless on the input side --- just allow whitespace --- but
> I suppose we can't change boolout's historical result of "t"/"f" without
> causing problems.

Attached is a revised version of this patch that modifies boolin() to
ignore leading and trailing whitespace. This makes text => boolean
trivial, but boolean => text is still distinct from boolout().

Barring any objections, I'll apply this later today or tomorrow.

-Neil


Attachment

Re: boolean <=> text explicit casts

From
Peter Eisentraut
Date:
Neil Conway wrote:
> Attached is a revised version of this patch that modifies boolin() to
> ignore leading and trailing whitespace. This makes text => boolean
> trivial, but boolean => text is still distinct from boolout().

I'm not sure what your rationale was for creating lower-case words
instead of upper case, except for it looks nicer.  Is there a technical
reason?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: boolean <=> text explicit casts

From
Neil Conway
Date:
On Wed, 2007-30-05 at 21:23 +0200, Peter Eisentraut wrote:
> I'm not sure what your rationale was for creating lower-case words
> instead of upper case, except for it looks nicer.  Is there a technical
> reason?

There's no real technical reason: the standard says upper-case, but PG's
general philosophy of case folding would suggest folding to lower-case.
If we were compliant with the spec's case folding requirements then
emitting uppercase would be the clear choice, but since we aren't, I
don't have strong feelings either way.

-Neil



Re: boolean <=> text explicit casts

From
Jim Nasby
Date:
On May 30, 2007, at 3:40 PM, Neil Conway wrote:
> On Wed, 2007-30-05 at 21:23 +0200, Peter Eisentraut wrote:
>> I'm not sure what your rationale was for creating lower-case words
>> instead of upper case, except for it looks nicer.  Is there a
>> technical
>> reason?
>
> There's no real technical reason: the standard says upper-case, but
> PG's
> general philosophy of case folding would suggest folding to lower-
> case.
> If we were compliant with the spec's case folding requirements then
> emitting uppercase would be the clear choice, but since we aren't, I
> don't have strong feelings either way.

Sorry for the late reply...

I'm worried that this would make us incompatible with cross-database
code. Granted, should probably be using a boolean representation, but
I'm not sure if that's universally true. And if we find out later
that lower case is a problem, it won't be possible to change it
without messing with the rest of our users. I think it'd be best to
go with the spec.
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)