Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field - Mailing list pgsql-sql

From Allan Kamau
Subject Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field
Date
Msg-id ab1ea6540903121110l2a3021d4h6632b206e2419898@mail.gmail.com
Whole thread Raw
In response to Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field  (Allan Kamau <allank@sanbi.ac.za>)
List pgsql-sql
Seems I have a solution based on the code TJ had provided for counting
the bits sets, for those interested below are the two functions.



#include "postgres.h"
#include "utils/varbit.h"
#include "fmgr.h"
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

PG_FUNCTION_INFO_V1(last_bit_set);
Datum
last_bit_set(PG_FUNCTION_ARGS)
{
/* position of last set bit of a bitstring? */int n=0;VarBit     *a = PG_GETARG_VARBIT_P(0);int i;unsigned char *ap =
VARBITS(a);unsignedchar aval;int b=0;int byte_cnt=0;int last_bit_set_position=0;for(i=0;i<VARBITBYTES(a);++i){
aval=*ap;++ap;   if(aval&128)    {        ++n;        b=(8*byte_cnt)+1;    }    if(aval&64)    {        ++n;
b=(8*byte_cnt)+2;   }    if(aval&32)    {        ++n;        b=(8*byte_cnt)+3;    }    if(aval&16)    {        ++n;
  b=(8*byte_cnt)+4;    }    if(aval&8)    {        ++n;        b=(8*byte_cnt)+5;    }    if(aval&4)    {        ++n;
   b=(8*byte_cnt)+6;    }    if(aval&2)    {        ++n;        b=(8*byte_cnt)+7;    }    if(aval&1)    {        ++n;
    b=(8*byte_cnt)+8;    }    ++byte_cnt;}last_bit_set_position=b;PG_RETURN_INT32(last_bit_set_position);
 
}


#include "postgres.h"
#include "utils/varbit.h"
#include "fmgr.h"
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

PG_FUNCTION_INFO_V1(first_bit_set);
Datum
first_bit_set(PG_FUNCTION_ARGS)
{
/* position of first set bit of a bitstring? */int n=0;VarBit     *a = PG_GETARG_VARBIT_P(0);int i;unsigned char *ap =
VARBITS(a);unsignedchar aval;int b=0;int byte_cnt=0;int first_bit_set_position=0;for(i=0;b==0&&i<VARBITBYTES(a);++i){
aval=*ap;++ap;    if(aval&128)    {        ++n;        b=1;        break;    }    if(aval&64)    {        ++n;
b=2;       break;    }    if(aval&32)    {        ++n;        b=3;        break;    }    if(aval&16)    {        ++n;
    b=4;        break;    }    if(aval&8)    {        ++n;        b=5;        break;    }    if(aval&4)    {
++n;       b=6;        break;    }    if(aval&2)    {        ++n;        b=7;        break;    }    if(aval&1)    {
  ++n;        b=8;        break;    }    ++byte_cnt;}if(b>0)    first_bit_set_position=(8*byte_cnt)+b;else
first_bit_set_position=0;PG_RETURN_INT32(first_bit_set_position);
}


Allan.

On Thu, Mar 12, 2009 at 2:53 PM, Allan Kamau <allank@sanbi.ac.za> wrote:
> Hi all,
> I am now looking for a function to return the position of the first position
> of the left most set bit. And optionally another to return the position of
> the right most set bit.
>
> I have been looking at
> "http://graphics.stanford.edu/~seander/bithacks.html#OperationCounting" but
> it seems it will take me a while to figure out bit manipulation.
>
> Allan.
>
> Allan Kamau wrote:
>>
>> All was well with the code below, apologies to all who read my previous
>> email. The error (an oversight) was on my part. In the "CREATE FUNCTION ..."
>> statement I had FLOAT as the return type instead of INTEGER.
>> Now the function runs smoothly. Preliminary results show it is orders of
>> magnitude faster than the LENGTH(REGEXP(CAST(myVarBit AS TEXT),'0','','g'))
>> solution.
>> Thanks again TJ and the rest of the team.
>>
>> Allan
>>
>> Allan Kamau wrote:
>>>
>>> Thank you TJ and everyone else for the advise and the c code. Today I did
>>> finally return to the 'number of bits set challenge' and managed to compile
>>> and link the nbits c function which went smoothly. However the function does
>>> crash my postgres server installation (8.3.3) with a segmentation fault each
>>> time I call it for example SELECT nbits_set(B'1101');
>>> My C skills are very sparse and am unable to debug the function, I have
>>> included the C code of this function. Is there something I may have left
>>> out?
>>>
>>>
>>>
>>> #include "postgres.h"
>>> #include "utils/varbit.h"
>>> #include "fmgr.h"
>>> #ifdef PG_MODULE_MAGIC
>>> PG_MODULE_MAGIC;
>>> #endif
>>>
>>> PG_FUNCTION_INFO_V1(nbits_set);
>>> Datum
>>> nbits_set(PG_FUNCTION_ARGS)
>>> {
>>> /* how many bits are set in a bitstring? */
>>>   VarBit     *a = PG_GETARG_VARBIT_P(0);
>>>   int n=0;
>>>   int i;
>>>   unsigned char *ap = VARBITS(a);
>>>   unsigned char aval;
>>>   for (i=0; i < VARBITBYTES(a); ++i) {
>>>       aval = *ap; ++ap;
>>>       if (aval == 0) continue;
>>>       if (aval & 1) ++n;
>>>       if (aval & 2) ++n;
>>>       if (aval & 4) ++n;
>>>       if (aval & 8) ++n;
>>>       if (aval & 16) ++n;
>>>       if (aval & 32) ++n;
>>>       if (aval & 64) ++n;
>>>       if (aval & 128) ++n;
>>>   }
>>>   PG_RETURN_INT32(n);
>>> }
>>>
>>> Allan
>>>
>>> Bruce Momjian wrote:
>>>>
>>>> Jean-David Beyer wrote:
>>>>
>>>>>
>>>>> TJ O'Donnell wrote:
>>>>>
>>>>>>
>>>>>> I use a c function, nbits_set that will do what you need.
>>>>>> I've posted the code in this email.
>>>>>>
>>>>>> TJ O'Donnell
>>>>>> http://www.gnova.com
>>>>>>
>>>>>> #include "postgres.h"
>>>>>> #include "utils/varbit.h"
>>>>>>
>>>>>> Datum   nbits_set(PG_FUNCTION_ARGS);
>>>>>> PG_FUNCTION_INFO_V1(nbits_set);
>>>>>> Datum
>>>>>> nbits_set(PG_FUNCTION_ARGS)
>>>>>> {
>>>>>> /* how many bits are set in a bitstring? */
>>>>>>
>>>>>>         VarBit     *a = PG_GETARG_VARBIT_P(0);
>>>>>>         int n=0;
>>>>>>         int i;
>>>>>>         unsigned char *ap = VARBITS(a);
>>>>>>         unsigned char aval;
>>>>>>         for (i=0; i < VARBITBYTES(a); ++i) {
>>>>>>                 aval = *ap; ++ap;
>>>>>>                 if (aval == 0) continue;
>>>>>>                 if (aval & 1) ++n;
>>>>>>                 if (aval & 2) ++n;
>>>>>>                 if (aval & 4) ++n;
>>>>>>                 if (aval & 8) ++n;
>>>>>>                 if (aval & 16) ++n;
>>>>>>                 if (aval & 32) ++n;
>>>>>>                 if (aval & 64) ++n;
>>>>>>                 if (aval & 128) ++n;
>>>>>>         }
>>>>>>         PG_RETURN_INT32(n);
>>>>>> }
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Hi all,
>>>>>>> Am looking for a fast and efficient way to count the number of bits
>>>>>>> set (to 1) in a VARBIT field. I am currently using
>>>>>>> "LENGTH(REGEXP_REPLACE(CAST(a.somefield_bit_code AS TEXT),'0','','g'))".
>>>>>>>
>>>>>>> Allan.
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> When I had to do that, in days with smaller amounts of RAM, but very
>>>>> long
>>>>> bit-vectors, I used a faster function sort-of like this:
>>>>>
>>>>> static char table[256] = {
>>>>> 0,1,1,2,1,2,2,3,1,.....
>>>>> };
>>>>>
>>>>> Then like above, but instead of the loop,
>>>>>
>>>>> n+= table[aval];
>>>>>
>>>>>
>>>>> You get the idea.
>>>>>
>>>>
>>>> Uh, I was kind of confused by this, even when I saw a full
>>>> implementation:
>>>>
>>>>    http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable
>>>>
>>>> Actually, this looks even better:
>>>>
>>>>
>>>>  http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan
>>>>
>>>>
>>>
>>>
>>
>>
>
>


pgsql-sql by date:

Previous
From: Duffer Do
Date:
Subject: select count of all overlapping geometries and return 0 if none.
Next
From: Lennin Caro
Date:
Subject: Re: Permanent alias for postgresql table