Re: Allowing postgresql to accept 0xff syntax for data types that it makes sense for? - Mailing list pgsql-general

From Arjen Nienhuis
Subject Re: Allowing postgresql to accept 0xff syntax for data types that it makes sense for?
Date
Msg-id CAG6W84K9CtFj1+3YSfpKTuhkZzUJqbZgDK7fBtuC1XnfG87b3Q@mail.gmail.com
Whole thread Raw
In response to Re: Allowing postgresql to accept 0xff syntax for data types that it makes sense for?  (Bill Moran <wmoran@potentialtech.com>)
List pgsql-general
On Fri, May 22, 2015 at 6:57 PM, Bill Moran <wmoran@potentialtech.com> wrote:
> On Fri, 22 May 2015 12:44:40 -0400
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> Bill Moran <wmoran@potentialtech.com> writes:
>> > Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> >> Other questions you'd have to think about: what is the data type of
>> >> 0xffffffff; what do you do with 0xffffffffffffffffffffffff (too big
>> >> even for int8).  And it'd likely behoove you to check how Microsoft
>> >> answers those questions, if you want to point to SQL Server as what's
>> >> going to keep you out of standards-compatibility problems.  (IOW,
>> >> if 0x ever did get standardized, the text might well match what
>> >> SQL Server does.)
>>
>> > MSSQL seems to use it specifically for the equivalent of BYTEA types,
>> > and it seems to me that should be how it works in PostgreSQL.
>>
>> Oh really?  Wow, I'd just assumed you wanted this as a way to write
>> integers.  That's certainly the use-case I would have personally.
>> I'm not even sure I like the idea of being able to write byteas without
>> quotes --- they seem like strings to me, not numbers.
>
> Arrgh ... it's good that you're bringing this up, but you're making me
> realize that there's more to figure out than I originally thought ...
> My focus had been on it being used for BYTEA columns, but there _are_
> plenty of places in the code that do things like:
>
> WHERE int_col & 0x04 = 0x04
>
> Which means that Sybase will implicitly cast that to an int, which
> probably means that MSSQL will as well.
>
> Once I take that into consideration, I start thinking that int_col
> should actualy be a bit string. which means that:
>
> WHERE bit_varying_col & 0x04 = 0x04
>
> should probably work without explicit casts as well.
>
>> > If an implicit cast from a 4-byte BYTEA to int works now, then it
>> > should work ... otherwise an explicit cast would be needed, with the
>> > same behavior if you tried to specify a number that overflows an int
>> > in any other way.
>>
>> There's no cast at all from bytea to int.  For one thing, it's quite
>> unclear what endianness should be assumed for such a cast.  (To get
>> unsurprising behavior from what you're describing, I think we'd have
>> to use a big-endian interpretation of the bytea; but that would be
>> a pain for a lot of other scenarios, or even for this case if you'd
>> written a bytea of length other than 4 or 8 bytes.)
>
> As an implicit cast, obviously anything too large to fit in the
> target data type should be an error. But the subject of endianness
> becomes damn important.
>
> It's more complex than I original thought, but it still seems like it
> can be done without creating idiocy in the way things are cast. I'll
> think about it some more and try to come up with some more specific
> use scenarios to see what behavior seems the most POLA to me.
>

SQL server does this:

0x10 is VARBINARY:

   0x10 + 0x00 = 0x1000

There is an implicit cast from varbinary to int:

   0x10 + 0 = 16
   0xfffffff9  + 0 = -7

And there is silent truncation:

    0xff00000000 + 0 = 0


pgsql-general by date:

Previous
From: Melvin Davidson
Date:
Subject: Re: Queries for unused/useless indexes
Next
From: Francisco Olarte
Date:
Subject: Re: FW: Constraint exclusion in partitions