Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field - Mailing list pgsql-sql

From Allan Kamau
Subject Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field
Date
Msg-id 48AF0846.9020108@sanbi.ac.za
Whole thread Raw
In response to Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field  (Allan Kamau <allank@sanbi.ac.za>)
Responses Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field
List pgsql-sql
All was well with the code below, apologies to all who read my previous 
email. The error (an oversight) was on my part. In the "CREATE FUNCTION 
..." statement I had FLOAT as the return type instead of INTEGER.
Now the function runs smoothly. Preliminary results show it is orders of 
magnitude faster than the LENGTH(REGEXP(CAST(myVarBit AS 
TEXT),'0','','g')) solution.
Thanks again TJ and the rest of the team.

Allan

Allan Kamau wrote:
> Thank you TJ and everyone else for the advise and the c code. Today I 
> did finally return to the 'number of bits set challenge' and managed 
> to compile and link the nbits c function which went smoothly. However 
> the function does crash my postgres server installation (8.3.3) with a 
> segmentation fault each time I call it for example SELECT 
> nbits_set(B'1101');
> My C skills are very sparse and am unable to debug the function, I 
> have included the C code of this function. Is there something I may 
> have left out?
>
>
>
> #include "postgres.h"
> #include "utils/varbit.h"
> #include "fmgr.h"
> #ifdef PG_MODULE_MAGIC
> PG_MODULE_MAGIC;
> #endif
>
> PG_FUNCTION_INFO_V1(nbits_set);
> Datum
> nbits_set(PG_FUNCTION_ARGS)
> {
> /* how many bits are set in a bitstring? */
>    VarBit     *a = PG_GETARG_VARBIT_P(0);
>    int n=0;
>    int i;
>    unsigned char *ap = VARBITS(a);
>    unsigned char aval;
>    for (i=0; i < VARBITBYTES(a); ++i) {
>        aval = *ap; ++ap;
>        if (aval == 0) continue;
>        if (aval & 1) ++n;
>        if (aval & 2) ++n;
>        if (aval & 4) ++n;
>        if (aval & 8) ++n;
>        if (aval & 16) ++n;
>        if (aval & 32) ++n;
>        if (aval & 64) ++n;
>        if (aval & 128) ++n;
>    }
>    PG_RETURN_INT32(n);
> }
>
> Allan
>
> Bruce Momjian wrote:
>> Jean-David Beyer wrote:
>>  
>>> TJ O'Donnell wrote:
>>>    
>>>> I use a c function, nbits_set that will do what you need.
>>>> I've posted the code in this email.
>>>>
>>>> TJ O'Donnell
>>>> http://www.gnova.com
>>>>
>>>> #include "postgres.h"
>>>> #include "utils/varbit.h"
>>>>
>>>> Datum   nbits_set(PG_FUNCTION_ARGS);
>>>> PG_FUNCTION_INFO_V1(nbits_set);
>>>> Datum
>>>> nbits_set(PG_FUNCTION_ARGS)
>>>> {
>>>> /* how many bits are set in a bitstring? */
>>>>
>>>>          VarBit     *a = PG_GETARG_VARBIT_P(0);
>>>>          int n=0;
>>>>          int i;
>>>>          unsigned char *ap = VARBITS(a);
>>>>          unsigned char aval;
>>>>          for (i=0; i < VARBITBYTES(a); ++i) {
>>>>                  aval = *ap; ++ap;
>>>>                  if (aval == 0) continue;
>>>>                  if (aval & 1) ++n;
>>>>                  if (aval & 2) ++n;
>>>>                  if (aval & 4) ++n;
>>>>                  if (aval & 8) ++n;
>>>>                  if (aval & 16) ++n;
>>>>                  if (aval & 32) ++n;
>>>>                  if (aval & 64) ++n;
>>>>                  if (aval & 128) ++n;
>>>>          }
>>>>          PG_RETURN_INT32(n);
>>>> }
>>>>
>>>>
>>>>
>>>>      
>>>>> Hi all,
>>>>> Am looking for a fast and efficient way to count the number of 
>>>>> bits set (to 1) in a VARBIT field. I am currently using 
>>>>> "LENGTH(REGEXP_REPLACE(CAST(a.somefield_bit_code AS 
>>>>> TEXT),'0','','g'))".
>>>>>
>>>>> Allan.
>>>>>         
>>>>       
>>> When I had to do that, in days with smaller amounts of RAM, but very 
>>> long
>>> bit-vectors, I used a faster function sort-of like this:
>>>
>>> static char table[256] = {
>>> 0,1,1,2,1,2,2,3,1,.....
>>> };
>>>
>>> Then like above, but instead of the loop,
>>>
>>> n+= table[aval];
>>>
>>>
>>> You get the idea.
>>>     
>>
>> Uh, I was kind of confused by this, even when I saw a full
>> implementation:
>>
>>     http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable 
>>
>>
>> Actually, this looks even better:
>>
>>     http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan 
>>
>>
>>   
>
>



pgsql-sql by date:

Previous
From: Lennin Caro
Date:
Subject: Re: Concat field result in select query
Next
From: Emi Lu
Date:
Subject: Why *no* ambig·uous complain in select part?