Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field - Mailing list pgsql-sql

From Allan Kamau
Subject Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field
Date
Msg-id 49B905D2.1050606@sanbi.ac.za
Whole thread Raw
In response to Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field  (Allan Kamau <allank@sanbi.ac.za>)
Responses Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field
List pgsql-sql
Hi all,
I am now looking for a function to return the position of the first 
position of the left most set bit. And optionally another to return the 
position of the right most set bit.

I have been looking at 
"http://graphics.stanford.edu/~seander/bithacks.html#OperationCounting" 
but it seems it will take me a while to figure out bit manipulation.

Allan.

Allan Kamau wrote:
> All was well with the code below, apologies to all who read my 
> previous email. The error (an oversight) was on my part. In the 
> "CREATE FUNCTION ..." statement I had FLOAT as the return type instead 
> of INTEGER.
> Now the function runs smoothly. Preliminary results show it is orders 
> of magnitude faster than the LENGTH(REGEXP(CAST(myVarBit AS 
> TEXT),'0','','g')) solution.
> Thanks again TJ and the rest of the team.
>
> Allan
>
> Allan Kamau wrote:
>> Thank you TJ and everyone else for the advise and the c code. Today I 
>> did finally return to the 'number of bits set challenge' and managed 
>> to compile and link the nbits c function which went smoothly. However 
>> the function does crash my postgres server installation (8.3.3) with 
>> a segmentation fault each time I call it for example SELECT 
>> nbits_set(B'1101');
>> My C skills are very sparse and am unable to debug the function, I 
>> have included the C code of this function. Is there something I may 
>> have left out?
>>
>>
>>
>> #include "postgres.h"
>> #include "utils/varbit.h"
>> #include "fmgr.h"
>> #ifdef PG_MODULE_MAGIC
>> PG_MODULE_MAGIC;
>> #endif
>>
>> PG_FUNCTION_INFO_V1(nbits_set);
>> Datum
>> nbits_set(PG_FUNCTION_ARGS)
>> {
>> /* how many bits are set in a bitstring? */
>>    VarBit     *a = PG_GETARG_VARBIT_P(0);
>>    int n=0;
>>    int i;
>>    unsigned char *ap = VARBITS(a);
>>    unsigned char aval;
>>    for (i=0; i < VARBITBYTES(a); ++i) {
>>        aval = *ap; ++ap;
>>        if (aval == 0) continue;
>>        if (aval & 1) ++n;
>>        if (aval & 2) ++n;
>>        if (aval & 4) ++n;
>>        if (aval & 8) ++n;
>>        if (aval & 16) ++n;
>>        if (aval & 32) ++n;
>>        if (aval & 64) ++n;
>>        if (aval & 128) ++n;
>>    }
>>    PG_RETURN_INT32(n);
>> }
>>
>> Allan
>>
>> Bruce Momjian wrote:
>>> Jean-David Beyer wrote:
>>>  
>>>> TJ O'Donnell wrote:
>>>>   
>>>>> I use a c function, nbits_set that will do what you need.
>>>>> I've posted the code in this email.
>>>>>
>>>>> TJ O'Donnell
>>>>> http://www.gnova.com
>>>>>
>>>>> #include "postgres.h"
>>>>> #include "utils/varbit.h"
>>>>>
>>>>> Datum   nbits_set(PG_FUNCTION_ARGS);
>>>>> PG_FUNCTION_INFO_V1(nbits_set);
>>>>> Datum
>>>>> nbits_set(PG_FUNCTION_ARGS)
>>>>> {
>>>>> /* how many bits are set in a bitstring? */
>>>>>
>>>>>          VarBit     *a = PG_GETARG_VARBIT_P(0);
>>>>>          int n=0;
>>>>>          int i;
>>>>>          unsigned char *ap = VARBITS(a);
>>>>>          unsigned char aval;
>>>>>          for (i=0; i < VARBITBYTES(a); ++i) {
>>>>>                  aval = *ap; ++ap;
>>>>>                  if (aval == 0) continue;
>>>>>                  if (aval & 1) ++n;
>>>>>                  if (aval & 2) ++n;
>>>>>                  if (aval & 4) ++n;
>>>>>                  if (aval & 8) ++n;
>>>>>                  if (aval & 16) ++n;
>>>>>                  if (aval & 32) ++n;
>>>>>                  if (aval & 64) ++n;
>>>>>                  if (aval & 128) ++n;
>>>>>          }
>>>>>          PG_RETURN_INT32(n);
>>>>> }
>>>>>
>>>>>
>>>>>
>>>>>     
>>>>>> Hi all,
>>>>>> Am looking for a fast and efficient way to count the number of 
>>>>>> bits set (to 1) in a VARBIT field. I am currently using 
>>>>>> "LENGTH(REGEXP_REPLACE(CAST(a.somefield_bit_code AS 
>>>>>> TEXT),'0','','g'))".
>>>>>>
>>>>>> Allan.
>>>>>>         
>>>>>       
>>>> When I had to do that, in days with smaller amounts of RAM, but 
>>>> very long
>>>> bit-vectors, I used a faster function sort-of like this:
>>>>
>>>> static char table[256] = {
>>>> 0,1,1,2,1,2,2,3,1,.....
>>>> };
>>>>
>>>> Then like above, but instead of the loop,
>>>>
>>>> n+= table[aval];
>>>>
>>>>
>>>> You get the idea.
>>>>     
>>>
>>> Uh, I was kind of confused by this, even when I saw a full
>>> implementation:
>>>
>>>     
>>> http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable
>>>
>>> Actually, this looks even better:
>>>
>>>     
>>> http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan 
>>>
>>>
>>>   
>>
>>
>
>



pgsql-sql by date:

Previous
From: "Paul Dam"
Date:
Subject: Re: Convert text from UTF8 to ASCII
Next
From: "Marco Lechner"
Date:
Subject: Permanent alias for postgresql table