Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field - Mailing list pgsql-sql
From | Allan Kamau |
---|---|
Subject | Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field |
Date | |
Msg-id | ab1ea6540903121110l2a3021d4h6632b206e2419898@mail.gmail.com Whole thread Raw |
In response to | Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field (Allan Kamau <allank@sanbi.ac.za>) |
List | pgsql-sql |
Seems I have a solution based on the code TJ had provided for counting the bits sets, for those interested below are the two functions. #include "postgres.h" #include "utils/varbit.h" #include "fmgr.h" #ifdef PG_MODULE_MAGIC PG_MODULE_MAGIC; #endif PG_FUNCTION_INFO_V1(last_bit_set); Datum last_bit_set(PG_FUNCTION_ARGS) { /* position of last set bit of a bitstring? */int n=0;VarBit *a = PG_GETARG_VARBIT_P(0);int i;unsigned char *ap = VARBITS(a);unsignedchar aval;int b=0;int byte_cnt=0;int last_bit_set_position=0;for(i=0;i<VARBITBYTES(a);++i){ aval=*ap;++ap; if(aval&128) { ++n; b=(8*byte_cnt)+1; } if(aval&64) { ++n; b=(8*byte_cnt)+2; } if(aval&32) { ++n; b=(8*byte_cnt)+3; } if(aval&16) { ++n; b=(8*byte_cnt)+4; } if(aval&8) { ++n; b=(8*byte_cnt)+5; } if(aval&4) { ++n; b=(8*byte_cnt)+6; } if(aval&2) { ++n; b=(8*byte_cnt)+7; } if(aval&1) { ++n; b=(8*byte_cnt)+8; } ++byte_cnt;}last_bit_set_position=b;PG_RETURN_INT32(last_bit_set_position); } #include "postgres.h" #include "utils/varbit.h" #include "fmgr.h" #ifdef PG_MODULE_MAGIC PG_MODULE_MAGIC; #endif PG_FUNCTION_INFO_V1(first_bit_set); Datum first_bit_set(PG_FUNCTION_ARGS) { /* position of first set bit of a bitstring? */int n=0;VarBit *a = PG_GETARG_VARBIT_P(0);int i;unsigned char *ap = VARBITS(a);unsignedchar aval;int b=0;int byte_cnt=0;int first_bit_set_position=0;for(i=0;b==0&&i<VARBITBYTES(a);++i){ aval=*ap;++ap; if(aval&128) { ++n; b=1; break; } if(aval&64) { ++n; b=2; break; } if(aval&32) { ++n; b=3; break; } if(aval&16) { ++n; b=4; break; } if(aval&8) { ++n; b=5; break; } if(aval&4) { ++n; b=6; break; } if(aval&2) { ++n; b=7; break; } if(aval&1) { ++n; b=8; break; } ++byte_cnt;}if(b>0) first_bit_set_position=(8*byte_cnt)+b;else first_bit_set_position=0;PG_RETURN_INT32(first_bit_set_position); } Allan. On Thu, Mar 12, 2009 at 2:53 PM, Allan Kamau <allank@sanbi.ac.za> wrote: > Hi all, > I am now looking for a function to return the position of the first position > of the left most set bit. And optionally another to return the position of > the right most set bit. > > I have been looking at > "http://graphics.stanford.edu/~seander/bithacks.html#OperationCounting" but > it seems it will take me a while to figure out bit manipulation. > > Allan. > > Allan Kamau wrote: >> >> All was well with the code below, apologies to all who read my previous >> email. The error (an oversight) was on my part. In the "CREATE FUNCTION ..." >> statement I had FLOAT as the return type instead of INTEGER. >> Now the function runs smoothly. Preliminary results show it is orders of >> magnitude faster than the LENGTH(REGEXP(CAST(myVarBit AS TEXT),'0','','g')) >> solution. >> Thanks again TJ and the rest of the team. >> >> Allan >> >> Allan Kamau wrote: >>> >>> Thank you TJ and everyone else for the advise and the c code. Today I did >>> finally return to the 'number of bits set challenge' and managed to compile >>> and link the nbits c function which went smoothly. However the function does >>> crash my postgres server installation (8.3.3) with a segmentation fault each >>> time I call it for example SELECT nbits_set(B'1101'); >>> My C skills are very sparse and am unable to debug the function, I have >>> included the C code of this function. Is there something I may have left >>> out? >>> >>> >>> >>> #include "postgres.h" >>> #include "utils/varbit.h" >>> #include "fmgr.h" >>> #ifdef PG_MODULE_MAGIC >>> PG_MODULE_MAGIC; >>> #endif >>> >>> PG_FUNCTION_INFO_V1(nbits_set); >>> Datum >>> nbits_set(PG_FUNCTION_ARGS) >>> { >>> /* how many bits are set in a bitstring? */ >>> VarBit *a = PG_GETARG_VARBIT_P(0); >>> int n=0; >>> int i; >>> unsigned char *ap = VARBITS(a); >>> unsigned char aval; >>> for (i=0; i < VARBITBYTES(a); ++i) { >>> aval = *ap; ++ap; >>> if (aval == 0) continue; >>> if (aval & 1) ++n; >>> if (aval & 2) ++n; >>> if (aval & 4) ++n; >>> if (aval & 8) ++n; >>> if (aval & 16) ++n; >>> if (aval & 32) ++n; >>> if (aval & 64) ++n; >>> if (aval & 128) ++n; >>> } >>> PG_RETURN_INT32(n); >>> } >>> >>> Allan >>> >>> Bruce Momjian wrote: >>>> >>>> Jean-David Beyer wrote: >>>> >>>>> >>>>> TJ O'Donnell wrote: >>>>> >>>>>> >>>>>> I use a c function, nbits_set that will do what you need. >>>>>> I've posted the code in this email. >>>>>> >>>>>> TJ O'Donnell >>>>>> http://www.gnova.com >>>>>> >>>>>> #include "postgres.h" >>>>>> #include "utils/varbit.h" >>>>>> >>>>>> Datum nbits_set(PG_FUNCTION_ARGS); >>>>>> PG_FUNCTION_INFO_V1(nbits_set); >>>>>> Datum >>>>>> nbits_set(PG_FUNCTION_ARGS) >>>>>> { >>>>>> /* how many bits are set in a bitstring? */ >>>>>> >>>>>> VarBit *a = PG_GETARG_VARBIT_P(0); >>>>>> int n=0; >>>>>> int i; >>>>>> unsigned char *ap = VARBITS(a); >>>>>> unsigned char aval; >>>>>> for (i=0; i < VARBITBYTES(a); ++i) { >>>>>> aval = *ap; ++ap; >>>>>> if (aval == 0) continue; >>>>>> if (aval & 1) ++n; >>>>>> if (aval & 2) ++n; >>>>>> if (aval & 4) ++n; >>>>>> if (aval & 8) ++n; >>>>>> if (aval & 16) ++n; >>>>>> if (aval & 32) ++n; >>>>>> if (aval & 64) ++n; >>>>>> if (aval & 128) ++n; >>>>>> } >>>>>> PG_RETURN_INT32(n); >>>>>> } >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> Hi all, >>>>>>> Am looking for a fast and efficient way to count the number of bits >>>>>>> set (to 1) in a VARBIT field. I am currently using >>>>>>> "LENGTH(REGEXP_REPLACE(CAST(a.somefield_bit_code AS TEXT),'0','','g'))". >>>>>>> >>>>>>> Allan. >>>>>>> >>>>>> >>>>>> >>>>> >>>>> When I had to do that, in days with smaller amounts of RAM, but very >>>>> long >>>>> bit-vectors, I used a faster function sort-of like this: >>>>> >>>>> static char table[256] = { >>>>> 0,1,1,2,1,2,2,3,1,..... >>>>> }; >>>>> >>>>> Then like above, but instead of the loop, >>>>> >>>>> n+= table[aval]; >>>>> >>>>> >>>>> You get the idea. >>>>> >>>> >>>> Uh, I was kind of confused by this, even when I saw a full >>>> implementation: >>>> >>>> http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable >>>> >>>> Actually, this looks even better: >>>> >>>> >>>> http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan >>>> >>>> >>> >>> >> >> > >