Thread: BUG #5237: strange int->bit and bit->int conversions

BUG #5237: strange int->bit and bit->int conversions

From
"Roman Kononov"
Date:
The following bug has been logged online:

Bug reference:      5237
Logged by:          Roman Kononov
Email address:      kononov@ftml.net
PostgreSQL version: 8.4.1
Operating system:   GNU/Linux x86_64
Description:        strange int->bit and bit->int conversions
Details:

test=# select (11::int4<<23 | 11::int4)::bit(32);
 00000101100000000000000000001011

test=# select (11::int4<<23 | 11::int4)::bit(33);
 000001011100000000000000000001011

test=# select (11::int4<<23 | 11::int4)::bit(39);
 000001010000101100000000000000000001011

test=# select (11::int4<<23 | 11::int4)::bit(40);
 0000000000000101100000000000000000001011

The ::bit(33) and ::bit(39) conversions seem wrong.

test-std=# select 1::int4::bit(32)::int4;
    1

test-std=# select 1::int4::bit(33)::int4;
ERROR:  integer out of range

Why is it out of range?

Re: BUG #5237: strange int->bit and bit->int conversions

From
Roman Kononov
Date:
The bitfromint8() and bitfromint4() are hosed. They produce wrong
results when the BIT size is more than 64 and 32 respectively, and the
BIT size is not multiple of 8, and the most significant byte of the
integer value is not 0x00 or 0xff.

For example:

test=# select (11::int4<<23 | 11::int4)::bit(32);
  00000101100000000000000000001011

test=# select (11::int4<<23 | 11::int4)::bit(33);
  000001011100000000000000000001011

test=# select (11::int4<<23 | 11::int4)::bit(39);
  000001010000101100000000000000000001011

test=# select (11::int4<<23 | 11::int4)::bit(40);
  0000000000000101100000000000000000001011

The ::bit(33) and ::bit(39) conversions are wrong.

The patch re-implements the conversion functions.
diff -urp -x '*.o' -x '*.txt' -x '*.so' postgresql-8.4.1-orig/src/backend/utils/adt/varbit.c
postgresql-8.4.1/src/backend/utils/adt/varbit.c
--- postgresql-8.4.1-orig/src/backend/utils/adt/varbit.c    2009-12-12 09:19:13.000000000 -0600
+++ postgresql-8.4.1/src/backend/utils/adt/varbit.c    2009-12-12 10:29:59.000000000 -0600
@@ -1321,8 +1321,8 @@ bitfromint4(PG_FUNCTION_ARGS)
     VarBit       *result;
     bits8       *r;
     int            rlen;
-    int            destbitsleft,
-                srcbitsleft;
+    int const    srcbits=sizeof(a)*BITS_PER_BYTE;
+    int i;

     if (typmod <= 0)
         typmod = 1;                /* default bit length */
@@ -1333,32 +1333,21 @@ bitfromint4(PG_FUNCTION_ARGS)
     VARBITLEN(result) = typmod;

     r = VARBITS(result);
-    destbitsleft = typmod;
-    srcbitsleft = 32;
-    /* drop any input bits that don't fit */
-    srcbitsleft = Min(srcbitsleft, destbitsleft);
-    /* sign-fill any excess bytes in output */
-    while (destbitsleft >= srcbitsleft + 8)
-    {
-        *r++ = (bits8) ((a < 0) ? BITMASK : 0);
-        destbitsleft -= 8;
-    }
-    /* store first fractional byte */
-    if (destbitsleft > srcbitsleft)
-    {
-        *r++ = (bits8) ((a >> (srcbitsleft - 8)) & BITMASK);
-        destbitsleft -= 8;
-    }
-    /* Now srcbitsleft and destbitsleft are the same, need not track both */
-    /* store whole bytes */
-    while (destbitsleft >= 8)
-    {
-        *r++ = (bits8) ((a >> (destbitsleft - 8)) & BITMASK);
-        destbitsleft -= 8;
+    rlen=(typmod+BITS_PER_BYTE-1)/BITS_PER_BYTE;
+
+    if (srcbits>=typmod) {
+        a<<=srcbits-typmod;
+        for (i=0; i!=rlen; ++i, a<<=BITS_PER_BYTE) r[i]=a>>(srcbits-BITS_PER_BYTE);
+    } else {
+        int sh=typmod%BITS_PER_BYTE;
+        int32 h=a>>sh;
+        int32 l=a<<(srcbits-sh);
+        size_t const zsize=rlen-sizeof(a)-(sh!=0);
+        bits8 sx=(a>=0)-1;
+        memset(r,sx,zsize);
+        for (i=0; i!=sizeof(a); ++i, h<<=8) r[zsize+i]=h>>(srcbits-BITS_PER_BYTE);
+        if (sh!=0) r[zsize+sizeof(a)]=l>>(srcbits-BITS_PER_BYTE);
     }
-    /* store last fractional byte */
-    if (destbitsleft > 0)
-        *r = (bits8) ((a << (8 - destbitsleft)) & BITMASK);

     PG_RETURN_VARBIT_P(result);
 }
@@ -1396,8 +1385,8 @@ bitfromint8(PG_FUNCTION_ARGS)
     VarBit       *result;
     bits8       *r;
     int            rlen;
-    int            destbitsleft,
-                srcbitsleft;
+    int const    srcbits=sizeof(a)*BITS_PER_BYTE;
+    int            i;

     if (typmod <= 0)
         typmod = 1;                /* default bit length */
@@ -1408,36 +1397,21 @@ bitfromint8(PG_FUNCTION_ARGS)
     VARBITLEN(result) = typmod;

     r = VARBITS(result);
-    destbitsleft = typmod;
-#ifndef INT64_IS_BUSTED
-    srcbitsleft = 64;
-#else
-    srcbitsleft = 32;            /* don't try to shift more than 32 */
-#endif
-    /* drop any input bits that don't fit */
-    srcbitsleft = Min(srcbitsleft, destbitsleft);
-    /* sign-fill any excess bytes in output */
-    while (destbitsleft >= srcbitsleft + 8)
-    {
-        *r++ = (bits8) ((a < 0) ? BITMASK : 0);
-        destbitsleft -= 8;
-    }
-    /* store first fractional byte */
-    if (destbitsleft > srcbitsleft)
-    {
-        *r++ = (bits8) ((a >> (srcbitsleft - 8)) & BITMASK);
-        destbitsleft -= 8;
-    }
-    /* Now srcbitsleft and destbitsleft are the same, need not track both */
-    /* store whole bytes */
-    while (destbitsleft >= 8)
-    {
-        *r++ = (bits8) ((a >> (destbitsleft - 8)) & BITMASK);
-        destbitsleft -= 8;
+    rlen=(typmod+BITS_PER_BYTE-1)/BITS_PER_BYTE;
+
+    if (srcbits>=typmod) {
+        a<<=srcbits-typmod;
+        for (i=0; i!=rlen; ++i, a<<=BITS_PER_BYTE) r[i]=a>>(srcbits-BITS_PER_BYTE);
+    } else {
+        int sh=typmod%BITS_PER_BYTE;
+        int64 h=a>>sh;
+        int64 l=a<<(srcbits-sh);
+        size_t const zsize=rlen-sizeof(a)-(sh!=0);
+        bits8 sx=(a>=0)-1;
+        memset(r,sx,zsize);
+        for (i=0; i!=sizeof(a); ++i, h<<=8) r[zsize+i]=h>>(srcbits-BITS_PER_BYTE);
+        if (sh!=0) r[zsize+sizeof(a)]=l>>(srcbits-BITS_PER_BYTE);
     }
-    /* store last fractional byte */
-    if (destbitsleft > 0)
-        *r = (bits8) ((a << (8 - destbitsleft)) & BITMASK);

     PG_RETURN_VARBIT_P(result);
 }

Re: BUG #5237: strange int->bit and bit->int conversions

From
Tom Lane
Date:
Roman Kononov <kononov@ftml.net> writes:
> The bitfromint8() and bitfromint4() are hosed. They produce wrong
> results when the BIT size is more than 64 and 32 respectively, and the
> BIT size is not multiple of 8, and the most significant byte of the
> integer value is not 0x00 or 0xff.

Hm, you're right, it's definitely busted ...

> The patch re-implements the conversion functions.

... but I don't much care for this patch.  It's unreadable, uncommented,
and doesn't even try to follow Postgres coding conventions.

It looks to me like the actual bug is that the "first fractional byte"
case ought to shift by destbitsleft - 8 not srcbitsleft - 8.  A
secondary problem (which your patch fails to fix) is that if the
compiler chooses to implement signed >> as zero-fill rather than
sign-bit-fill (which it is allowed to do per C99) then we'll get
wrong, or at least not the desired, results.  So I arrive at
the attached patch.

Unfortunately we've just missed the window for 8.4.2 etc, but I'll
get this committed for the next updates.

            regards, tom lane

Index: varbit.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/varbit.c,v
retrieving revision 1.58
diff -c -r1.58 varbit.c
*** varbit.c    1 Jan 2009 17:23:50 -0000    1.58
--- varbit.c    12 Dec 2009 18:53:43 -0000
***************
*** 1346,1352 ****
      /* store first fractional byte */
      if (destbitsleft > srcbitsleft)
      {
!         *r++ = (bits8) ((a >> (srcbitsleft - 8)) & BITMASK);
          destbitsleft -= 8;
      }
      /* Now srcbitsleft and destbitsleft are the same, need not track both */
--- 1346,1357 ----
      /* store first fractional byte */
      if (destbitsleft > srcbitsleft)
      {
!         int        val = (int) (a >> (destbitsleft - 8));
!
!         /* Force sign-fill in case the compiler implements >> as zero-fill */
!         if (a < 0)
!             val |= (-1) << (srcbitsleft + 8 - destbitsleft);
!         *r++ = (bits8) (val & BITMASK);
          destbitsleft -= 8;
      }
      /* Now srcbitsleft and destbitsleft are the same, need not track both */
***************
*** 1425,1431 ****
      /* store first fractional byte */
      if (destbitsleft > srcbitsleft)
      {
!         *r++ = (bits8) ((a >> (srcbitsleft - 8)) & BITMASK);
          destbitsleft -= 8;
      }
      /* Now srcbitsleft and destbitsleft are the same, need not track both */
--- 1430,1441 ----
      /* store first fractional byte */
      if (destbitsleft > srcbitsleft)
      {
!         int        val = (int) (a >> (destbitsleft - 8));
!
!         /* Force sign-fill in case the compiler implements >> as zero-fill */
!         if (a < 0)
!             val |= (-1) << (srcbitsleft + 8 - destbitsleft);
!         *r++ = (bits8) (val & BITMASK);
          destbitsleft -= 8;
      }
      /* Now srcbitsleft and destbitsleft are the same, need not track both */

Re: BUG #5237: strange int->bit and bit->int conversions

From
Robert Haas
Date:
Woops, forgot to copy the list.

On Sat, Dec 12, 2009 at 2:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Dec 12, 2009 at 2:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Roman Kononov <kononov@ftml.net> writes:
>>> The bitfromint8() and bitfromint4() are hosed. They produce wrong
>>> results when the BIT size is more than 64 and 32 respectively, and the
>>> BIT size is not multiple of 8, and the most significant byte of the
>>> integer value is not 0x00 or 0xff.
>>
>> Hm, you're right, it's definitely busted ...
>>
>>> The patch re-implements the conversion functions.
>>
>> ... but I don't much care for this patch. =A0It's unreadable, uncommente=
d,
>> and doesn't even try to follow Postgres coding conventions.
>>
>> It looks to me like the actual bug is that the "first fractional byte"
>> case ought to shift by destbitsleft - 8 not srcbitsleft - 8. =A0A
>> secondary problem (which your patch fails to fix) is that if the
>> compiler chooses to implement signed >> as zero-fill rather than
>> sign-bit-fill (which it is allowed to do per C99) then we'll get
>> wrong, or at least not the desired, results. =A0So I arrive at
>> the attached patch.
>
> I'm not sure this fixes it, although I haven't tested. =A0When we take
> the /* store first fractional byte */ branch, destbitsleft is between
> 1 and 7 bits greater than srcbitsleft. =A0We then subtract 8 from
> destbitsleft, and the comment on the next line now asserts that the
> two are equal. =A0That doesn't seem right.
>
> Also, I thought about the sign extension problem, but aren't we
> chopping those bits off anyway on the next line?
>
> ...Robert
>

Re: BUG #5237: strange int->bit and bit->int conversions

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
>> I'm not sure this fixes it, although I haven't tested.  When we take
>> the /* store first fractional byte */ branch, destbitsleft is between
>> 1 and 7 bits greater than srcbitsleft.  We then subtract 8 from
>> destbitsleft, and the comment on the next line now asserts that the
>> two are equal.  That doesn't seem right.

The comment's a bit bogus.  What it should say, probably, is that the
*correct* value of srcbitsleft is now equal to destbitsleft so we aren't
bothering to track it anymore.  If we were being fully anal we'd have
reduced srcbitsleft by some number less than 8 inside the if-branch.

>> Also, I thought about the sign extension problem, but aren't we
>> chopping those bits off anyway on the next line?

Nope, we'd have already stored the wrong bits into the output byte.

            regards, tom lane