Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field - Mailing list pgsql-sql
| From | Allan Kamau |
|---|---|
| Subject | Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field |
| Date | |
| Msg-id | ab1ea6540903121110l2a3021d4h6632b206e2419898@mail.gmail.com Whole thread Raw |
| In response to | Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field (Allan Kamau <allank@sanbi.ac.za>) |
| List | pgsql-sql |
Seems I have a solution based on the code TJ had provided for counting
the bits sets, for those interested below are the two functions.
#include "postgres.h"
#include "utils/varbit.h"
#include "fmgr.h"
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif
PG_FUNCTION_INFO_V1(last_bit_set);
Datum
last_bit_set(PG_FUNCTION_ARGS)
{
/* position of last set bit of a bitstring? */int n=0;VarBit *a = PG_GETARG_VARBIT_P(0);int i;unsigned char *ap =
VARBITS(a);unsignedchar aval;int b=0;int byte_cnt=0;int last_bit_set_position=0;for(i=0;i<VARBITBYTES(a);++i){
aval=*ap;++ap; if(aval&128) { ++n; b=(8*byte_cnt)+1; } if(aval&64) { ++n;
b=(8*byte_cnt)+2; } if(aval&32) { ++n; b=(8*byte_cnt)+3; } if(aval&16) { ++n;
b=(8*byte_cnt)+4; } if(aval&8) { ++n; b=(8*byte_cnt)+5; } if(aval&4) { ++n;
b=(8*byte_cnt)+6; } if(aval&2) { ++n; b=(8*byte_cnt)+7; } if(aval&1) { ++n;
b=(8*byte_cnt)+8; } ++byte_cnt;}last_bit_set_position=b;PG_RETURN_INT32(last_bit_set_position);
}
#include "postgres.h"
#include "utils/varbit.h"
#include "fmgr.h"
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif
PG_FUNCTION_INFO_V1(first_bit_set);
Datum
first_bit_set(PG_FUNCTION_ARGS)
{
/* position of first set bit of a bitstring? */int n=0;VarBit *a = PG_GETARG_VARBIT_P(0);int i;unsigned char *ap =
VARBITS(a);unsignedchar aval;int b=0;int byte_cnt=0;int first_bit_set_position=0;for(i=0;b==0&&i<VARBITBYTES(a);++i){
aval=*ap;++ap; if(aval&128) { ++n; b=1; break; } if(aval&64) { ++n;
b=2; break; } if(aval&32) { ++n; b=3; break; } if(aval&16) { ++n;
b=4; break; } if(aval&8) { ++n; b=5; break; } if(aval&4) {
++n; b=6; break; } if(aval&2) { ++n; b=7; break; } if(aval&1) {
++n; b=8; break; } ++byte_cnt;}if(b>0) first_bit_set_position=(8*byte_cnt)+b;else
first_bit_set_position=0;PG_RETURN_INT32(first_bit_set_position);
}
Allan.
On Thu, Mar 12, 2009 at 2:53 PM, Allan Kamau <allank@sanbi.ac.za> wrote:
> Hi all,
> I am now looking for a function to return the position of the first position
> of the left most set bit. And optionally another to return the position of
> the right most set bit.
>
> I have been looking at
> "http://graphics.stanford.edu/~seander/bithacks.html#OperationCounting" but
> it seems it will take me a while to figure out bit manipulation.
>
> Allan.
>
> Allan Kamau wrote:
>>
>> All was well with the code below, apologies to all who read my previous
>> email. The error (an oversight) was on my part. In the "CREATE FUNCTION ..."
>> statement I had FLOAT as the return type instead of INTEGER.
>> Now the function runs smoothly. Preliminary results show it is orders of
>> magnitude faster than the LENGTH(REGEXP(CAST(myVarBit AS TEXT),'0','','g'))
>> solution.
>> Thanks again TJ and the rest of the team.
>>
>> Allan
>>
>> Allan Kamau wrote:
>>>
>>> Thank you TJ and everyone else for the advise and the c code. Today I did
>>> finally return to the 'number of bits set challenge' and managed to compile
>>> and link the nbits c function which went smoothly. However the function does
>>> crash my postgres server installation (8.3.3) with a segmentation fault each
>>> time I call it for example SELECT nbits_set(B'1101');
>>> My C skills are very sparse and am unable to debug the function, I have
>>> included the C code of this function. Is there something I may have left
>>> out?
>>>
>>>
>>>
>>> #include "postgres.h"
>>> #include "utils/varbit.h"
>>> #include "fmgr.h"
>>> #ifdef PG_MODULE_MAGIC
>>> PG_MODULE_MAGIC;
>>> #endif
>>>
>>> PG_FUNCTION_INFO_V1(nbits_set);
>>> Datum
>>> nbits_set(PG_FUNCTION_ARGS)
>>> {
>>> /* how many bits are set in a bitstring? */
>>> VarBit *a = PG_GETARG_VARBIT_P(0);
>>> int n=0;
>>> int i;
>>> unsigned char *ap = VARBITS(a);
>>> unsigned char aval;
>>> for (i=0; i < VARBITBYTES(a); ++i) {
>>> aval = *ap; ++ap;
>>> if (aval == 0) continue;
>>> if (aval & 1) ++n;
>>> if (aval & 2) ++n;
>>> if (aval & 4) ++n;
>>> if (aval & 8) ++n;
>>> if (aval & 16) ++n;
>>> if (aval & 32) ++n;
>>> if (aval & 64) ++n;
>>> if (aval & 128) ++n;
>>> }
>>> PG_RETURN_INT32(n);
>>> }
>>>
>>> Allan
>>>
>>> Bruce Momjian wrote:
>>>>
>>>> Jean-David Beyer wrote:
>>>>
>>>>>
>>>>> TJ O'Donnell wrote:
>>>>>
>>>>>>
>>>>>> I use a c function, nbits_set that will do what you need.
>>>>>> I've posted the code in this email.
>>>>>>
>>>>>> TJ O'Donnell
>>>>>> http://www.gnova.com
>>>>>>
>>>>>> #include "postgres.h"
>>>>>> #include "utils/varbit.h"
>>>>>>
>>>>>> Datum nbits_set(PG_FUNCTION_ARGS);
>>>>>> PG_FUNCTION_INFO_V1(nbits_set);
>>>>>> Datum
>>>>>> nbits_set(PG_FUNCTION_ARGS)
>>>>>> {
>>>>>> /* how many bits are set in a bitstring? */
>>>>>>
>>>>>> VarBit *a = PG_GETARG_VARBIT_P(0);
>>>>>> int n=0;
>>>>>> int i;
>>>>>> unsigned char *ap = VARBITS(a);
>>>>>> unsigned char aval;
>>>>>> for (i=0; i < VARBITBYTES(a); ++i) {
>>>>>> aval = *ap; ++ap;
>>>>>> if (aval == 0) continue;
>>>>>> if (aval & 1) ++n;
>>>>>> if (aval & 2) ++n;
>>>>>> if (aval & 4) ++n;
>>>>>> if (aval & 8) ++n;
>>>>>> if (aval & 16) ++n;
>>>>>> if (aval & 32) ++n;
>>>>>> if (aval & 64) ++n;
>>>>>> if (aval & 128) ++n;
>>>>>> }
>>>>>> PG_RETURN_INT32(n);
>>>>>> }
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Hi all,
>>>>>>> Am looking for a fast and efficient way to count the number of bits
>>>>>>> set (to 1) in a VARBIT field. I am currently using
>>>>>>> "LENGTH(REGEXP_REPLACE(CAST(a.somefield_bit_code AS TEXT),'0','','g'))".
>>>>>>>
>>>>>>> Allan.
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> When I had to do that, in days with smaller amounts of RAM, but very
>>>>> long
>>>>> bit-vectors, I used a faster function sort-of like this:
>>>>>
>>>>> static char table[256] = {
>>>>> 0,1,1,2,1,2,2,3,1,.....
>>>>> };
>>>>>
>>>>> Then like above, but instead of the loop,
>>>>>
>>>>> n+= table[aval];
>>>>>
>>>>>
>>>>> You get the idea.
>>>>>
>>>>
>>>> Uh, I was kind of confused by this, even when I saw a full
>>>> implementation:
>>>>
>>>> http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable
>>>>
>>>> Actually, this looks even better:
>>>>
>>>>
>>>> http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan
>>>>
>>>>
>>>
>>>
>>
>>
>
>