Re: [GENERAL] SELECT x'00000000F'::int leading zeros causes"integer out of range" - Mailing list pgsql-general

From Gavan Schneider
Subject Re: [GENERAL] SELECT x'00000000F'::int leading zeros causes"integer out of range"
Date
Msg-id 4390-1487985597-102760@sneakemail.com
Whole thread Raw
In response to Re: [GENERAL] SELECT x'00000000F'::int leading zeros causes "integerout of range"  (Gavin Flower <GavinFlower@archidevsys.co.nz>)
List pgsql-general
On 2/25/17 at 6:56 AM, Gavin Flower wrote:

>On 25/02/17 08:39, John McKown wrote:
>>On Fri, Feb 24, 2017 at 1:25 PM, David G. Johnston
>>On Friday, February 24, 2017, Tom Lane wrote:
>>
>>Justin Pryzby writes:
>>>Is this expected behavior ?
>>>ts=# SELECT x'00000000F'::int;
>>>ERROR:  22003: integer out of range
>>>LOCATION:  bittoint4, varbit.c:1575
>>
>>Yes.  The provided operation is "convert a bitstring of up to
>>32 bits to an integer".  It's not "guess whether it's okay to
>>throw away some bits to make an integer".
>>
>>IME The error message itself is to blame here - we are checking
>>for a malformed (too many characters) integer varbit
>>representation but then reporting that the we somehow got a valid
>>integer but that it is "out of range".
>>
>>​A better reply would be good. Another possibility is for the parser
>>to remove unneeded leading zeros.​
>>
>[...]
>
>I think the latter would be a good idea!
>
This is interesting in that the views expressed range from
something close to "every bit is sacred" through to something
resembling "drop what's needed to make it work".

My take is PostgreSQL is already pragmatic:

     pendari=# select ((x'FFFFFFFFFFFFFFFF')::bigint)::int;
      int4
     ------
        -1
     (1 row)

Clearly we've quietly dropped a lot of bits moving across this line.

The problem posed in the OP example happens when the bit pattern
is under specifying a long value (or over specifying a short
value), and, in an ideal world, the correct behaviour should be
close to what all well behaved CPUs are already doing:

Opclass             Operand         Action (MSB=most significant bit)
==================|===============|==================================

logical/bitwise     Small->Large    Zero fill most significant, but
                     Large->Small    check which "standard" applies

arthmetic/signed    Small->Large    Propagate sign bit to left
                     Large->Small    Truncate sign bits, error
                                     if sign bits are not all equal,
                                     and not equal to MSB of result

arithmetic/unsig    Small->Large    Zero fill most significant part
                     Large->Small    Truncate from MSB, error if
                                     any truncated bit is not zero

To my mind Tom's reply resembles the bitwise case but I think
the OP's example should ideally have been interpreted in an
arithmetic manner (i.e., treating the extra bits as representing
the sign and nothing more) since the desired result was to be a
signed integer.

But! This gets problematic for something like:  x'FFF67'::bigint

My analogy would have this interpreted as
x'FFFFFFFFFFFFFF67'::bigint whereas the current behaviour is
equivalent to x'00000000000FFF67'::bigint, and I doubt anyone
has the appetite to change this. (Of course we have always known
using bit masks across architectures with different word sizes
was never an easy or safe activity. :)

So, getting back to the OP problem… what's a good parser to do?

I suggest:
1. the error message might be better (i.e., help get the focus
onto the real problem); and/or,
2. consider dropping excess leading zeros when building an
integer value. (I don't think this breaks anything.)

Other than that there really isn't a realisable consistent
behaviour beyond the current strict bitwise interpretation.
Specifically any behaviour which tries to promote or truncate
some "sign" bits in an arithmetically consistent manner is going
to break existing behaviour.

Regards
Gavan Schneider



pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: [GENERAL] could not translate host name
Next
From: Patrick B
Date:
Subject: Re: [GENERAL] bloat indexes - opinion