Re: BIT/BIT VARYING status - Mailing list pgsql-hackers
From | Adriaan Joubert |
---|---|
Subject | Re: BIT/BIT VARYING status |
Date | |
Msg-id | 39FD3235.1B25DEBC@albourne.com |
In response to | BIT/BIT VARYING status (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Re: BIT/BIT VARYING status (three replies) |
List | pgsql-hackers |
Tom Lane wrote:
>
> I have made a first cut at completing integration of Adriaan Joubert's
> BIT code into the backend. There are a couple little things left to
> do (for example, scalarltsel doesn't know what to do with BIT values)
> as well as some not-so-little things:
>
> 1. SQL92 mentions a bitwise position function, which we do not have.

Sorry, I have been very busy, so I only got down to implementing a position function last night. It's a bit messy (lots of masks and bit-twiddling), but I feel fairly happy now that it is doing the right thing. I tested it with my own loadable types, as the integration into postgres proper stumped me somewhat. The next oid up for a bit function is in use already. Anyway, the patches are attached, and I'm hoping that some friendly soul will integrate the new position function into postgres proper.

> 2. We don't handle <bit string> and <hex string> literals correctly;
> the scanner converts them into integers which seems quite at variance
> with the spec's semantics.

This is still a problem that needs to be fixed.

Also, the parser did not seem too happy about the 'position' syntax, but I may have it wrong, of course. I don't know how to attach the position function to a piece of syntax such as (position <substr> in <field>) either, so I'm hoping that somebody can pick this up.

Also, I have started putting together a file for regression testing. I noticed that the substring syntax does not seem to work:

    SELECT SUBSTRING(b FROM 2 FOR 4) FROM ZPBIT_TABLE;

gives:

    ERROR: Function 'substr(bit, int4, int4)' does not exist
    Unable to identify a function that satisfies the given argument types
    You may need to add explicit typecasts

and similar for a varying bit argument.

If somebody with better knowledge of postgres could do the integration, please, I will finish off a regression test.

Thanks!
Adriaan

*** src/backend/utils/adt/varbit.c.old Sun Oct 29 11:05:11 2000
--- src/backend/utils/adt/varbit.c Mon Oct 30 04:58:35 2000
***************
*** 1053,1060 ****
      /* Negative shift is a shift to the left */
      if (shft < 0)
          PG_RETURN_DATUM(DirectFunctionCall2(bitshiftleft,
!                                             VarBitPGetDatum(arg),
!                                             Int32GetDatum(-shft)));
  
      result = (VarBit *) palloc(VARSIZE(arg));
      VARATT_SIZEP(result) = VARSIZE(arg);
--- 1053,1060 ----
      /* Negative shift is a shift to the left */
      if (shft < 0)
          PG_RETURN_DATUM(DirectFunctionCall2(bitshiftleft,
!                                             VarBitPGetDatum(arg),
!                                             Int32GetDatum(-shft)));
  
      result = (VarBit *) palloc(VARSIZE(arg));
      VARATT_SIZEP(result)= VARSIZE(arg);
***************
*** 1145,1148 ****
--- 1145,1242 ----
      result >>= VARBITPAD(arg);
  
      PG_RETURN_INT32(result);
+ }
+ 
+ /* Determines the position of S1 in the bitstring S2 (1-based string).
+  * If S1 does not appear in S2 this function returns 0.
+  * If S1 is of length 0 this function returns 1.
+  */
+ Datum
+ bitposition(PG_FUNCTION_ARGS)
+ {
+     VarBit     *substr = PG_GETARG_VARBIT_P(0);
+     VarBit     *arg = PG_GETARG_VARBIT_P(1);
+     int         substr_length,
+                 arg_length,
+                 i,
+                 is;
+     bits8      *s,          /* pointer into substring */
+                *p;          /* pointer into arg */
+     bits8       cmp,        /* shifted substring byte to compare */
+                 mask1,      /* mask for substring byte shifted right */
+                 mask2,      /* mask for substring byte shifted left */
+                 end_mask,   /* pad mask for last substring byte */
+                 arg_mask;   /* pad mask for last argument byte */
+     bool        is_match;
+ 
+     /* Get the substring length */
+     substr_length = VARBITLEN(substr);
+     arg_length = VARBITLEN(arg);
+ 
+     /* Argument has 0 length or substring longer than argument, return 0 */
+     if (arg_length == 0 || substr_length > arg_length)
+         PG_RETURN_INT32(0);
+ 
+     /* 0-length means return 1 */
+     if (substr_length == 0)
+         PG_RETURN_INT32(1);
+ 
+     /* Initialise the padding masks */
+     end_mask = BITMASK << VARBITPAD(substr);
+     arg_mask = BITMASK << VARBITPAD(arg);
+     for (i = 0; i < VARBITBYTES(arg) - VARBITBYTES(substr) + 1; i++)
+     {
+         for (is = 0; is < BITS_PER_BYTE; is++) {
+             is_match = true;
+             p = VARBITS(arg) + i;
+             mask1 = BITMASK >> is;
+             mask2 = ~mask1;
+             for (s = VARBITS(substr);
+                  is_match && s < VARBITEND(substr); s++)
+             {
+                 cmp = *s >> is;
+                 if (s == VARBITEND(substr) - 1)
+                 {
+                     mask1 &= end_mask >> is;
+                     if (p == VARBITEND(arg) - 1) {
+                         /* Check that there is enough of arg left */
+                         if (mask1 & ~arg_mask) {
+                             is_match = false;
+                             break;
+                         }
+                         mask1 &= arg_mask;
+                     }
+                 }
+                 is_match = ((cmp ^ *p) & mask1) == 0;
+                 if (!is_match)
+                     break;
+                 // Move on to the next byte
+                 p++;
+                 if (p == VARBITEND(arg)) {
+                     mask2 = end_mask << (BITS_PER_BYTE - is);
+                     is_match = mask2 == 0;
+                     elog(NOTICE,"S. %d %d em=%2x sm=%2x r=%d",
+                          i,is,end_mask,mask2,is_match);
+                     break;
+                 }
+                 cmp = *s << (BITS_PER_BYTE - is);
+                 if (s == VARBITEND(substr) - 1)
+                 {
+                     mask2 &= end_mask << (BITS_PER_BYTE - is);
+                     if (p == VARBITEND(arg) - 1) {
+                         if (mask2 & ~arg_mask) {
+                             is_match = false;
+                             break;
+                         }
+                         mask2 &= arg_mask;
+                     }
+                 }
+                 is_match = ((cmp ^ *p) & mask2) == 0;
+             }
+             /* Have we found a match */
+             if (is_match)
+                 PG_RETURN_INT32(i*BITS_PER_BYTE + is + 1);
+         }
+     }
+     PG_RETURN_INT32(0);
  }
*** src/include/utils/varbit.h.old Sun Oct 29 11:04:58 2000
--- src/include/utils/varbit.h Sun Oct 29 11:05:58 2000
***************
*** 87,91 ****
--- 87,92 ----
  extern Datum bitoctetlength(PG_FUNCTION_ARGS);
  extern Datum bitfromint4(PG_FUNCTION_ARGS);
  extern Datum bittoint4(PG_FUNCTION_ARGS);
+ extern Datum bitposition(PG_FUNCTION_ARGS);
  
  #endif
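
For the regression test mentioned in the message, one way to cross-check the mask-based bitposition routine would be a deliberately slow search that compares one bit at a time. What follows is only a minimal sketch and not part of the patch: bit_at and naive_bitposition are hypothetical names, the code works on plain byte arrays rather than VarBit, and it assumes bits are stored most-significant-bit first with zero padding in the last byte. The early-return checks are kept in the same order as in bitposition in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return bit number idx (0-based) of a packed bit
 * string. Assumes bit 0 is the most significant bit of the first byte.
 */
static int
bit_at(const uint8_t *data, int idx)
{
    return (data[idx / 8] >> (7 - idx % 8)) & 1;
}

/*
 * Hypothetical reference search: 1-based position of the first occurrence
 * of sub (sublen bits) in arg (arglen bits), or 0 if there is none.
 * The two early returns mirror the order used in the patch.
 */
static int
naive_bitposition(const uint8_t *sub, int sublen,
                  const uint8_t *arg, int arglen)
{
    int start, k;

    assert(sublen >= 0 && arglen >= 0);

    /* Empty argument, or substring longer than argument: no match */
    if (arglen == 0 || sublen > arglen)
        return 0;

    /* Empty substring matches at position 1 */
    if (sublen == 0)
        return 1;

    /* Try every bit offset at which the substring still fits */
    for (start = 0; start + sublen <= arglen; start++)
    {
        for (k = 0; k < sublen; k++)
        {
            if (bit_at(sub, k) != bit_at(arg, start + k))
                break;
        }
        if (k == sublen)
            return start + 1;   /* convert to 1-based position */
    }
    return 0;
}
```

Under those assumptions, with the argument B'001011001' packed as {0x2C, 0x80} (9 bits) and the substring B'1011' packed as {0xB0} (4 bits), the sketch returns 3. The substring B'1001', packed as {0x90}, straddles the byte boundary and returns 6. Comparing these results with bitposition over a range of lengths and offsets should exercise the padding masks in the patch.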