Re: Review: Patch: Allow substring/replace() to get/set bit values - Mailing list pgsql-hackers

From Leonardo F
Subject Re: Review: Patch: Allow substring/replace() to get/set bit values
Date
Msg-id 838145.54566.qm@web29012.mail.ird.yahoo.com
Whole thread Raw
In response to Re: Review: Patch: Allow substring/replace() to get/set bit values  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
> In the
> documentation, the get_bit and set_bit methods are added to a list
> where we state "The following SQL-standard functions work on bit
> strings as well as character strings"; however they are not
> SQL-standard functions and are implemented on binary strings, not
> character strings.

Ok, I didn't change that, but I guess I can fix it.


> The tests don't include any examples where the bit
> string lines up with a byte boundary or is operating on the first or
> last bit of a byte, and the overlay function is not tested with
> values which change the length of the bit string. (All of these
> cases seem to work in the current version, but seem like the sort of
> thing which could get broken in revision.)


Ok, I'll add some corner-cases tests.

> There is a peculiar omission from the patch -- it adds supportfor the
> overlay function to bit strings but not binary strings. Sincethe
> standard drops the bit string as a standard type in the 2003standard,
> in favor of boolean type and binary string types, it seemsodd to omit
> binary string support (which is defined in the standard)but include
> bit string support (which isn't). What the patch adds seemsuseful,
> and I don't think the omission should be considered fatal, butit
> would be nice to see us also cover binary strings if we can.


Mmh, don't know how complicated that would be, but I can give it a try:
I guess it could be a simple SQL replacement (as for char strings and bit
strings?)

> Well, there was one thing which is beyond my ken
> in this regard: I'm not sure that the cases from bits8 to (unsigned
> char *) are needed or appropriate, although on my machine they don't
> seem to hurt.  Perhaps someone with a better grasp of such issues can
> comment.


That's some kind of copy&paste from varlena.c byteaGetBit: I don't know
if they can be avoided.

> I'm not sure the leading whitespace in pg_proc.h is as it should be.


Do you mean the tab in:
_null_(tab)"select ...

?

Yes, I think I should remove it.

> I'd prefer to see the comments for get_bit and set_bit mention that
> the location is specified left-to-right in a zero-based fashion
> consistent with the other get_bit and set_bit functions, but
> inconsistent with the standard substring, position, overlay
> functions.  Personally, I find it unfortunate that our extensions
> introduced this inconsistency, but let's keep it consistent within
> the similarly named functions.

Ok; I found them bizarre too, but I kept it consistent.

> It seems odd to me that you specify the bit to set as a 32 bit
> integer, and that that is what get_bit returns, but again, this is
> consistent with what we do for the bytea functions, so it's probably
> the right thing to match that.


Same as above: I tried to be consistent.

> Using a ERRCODE_ARRAY_SUBSCRIPT_ERROR for a bit position which is
> out-of-bounds seems somewhat bizarre to me -- I'd probably have
> chosen ERRCODE_INVALID_PARAMETER_VALUE in a green field -- but again,
> it matches the bytea implementation.


Again, same as above: consistency; but I can change it if you want

> This last point is probably more a matter of C coding style than
> anything, and I'm way to rusty in C to want to be the final arbiter
> of something like this, but it seems to me that oldByte and newByte
> local variables are just cluttering things up rather than clarifying:
>
>     int            oldByte,
>                 newByte;
> [...]
>     oldByte = ((unsigned char *) r)[byteNo];
>
>     if (newBit == 0)
>         newByte = oldByte & (~(1 << bitNo));
>     else
>         newByte = oldByte | (1 << bitNo);
>
>     ((unsigned char *) r)[byteNo] = newByte;
>
> vs
>
>     if (newBit == 0)
>         ((unsigned char *) r)[byteNo] &= (~(1 << bitNo));
>     else
>         ((unsigned char *) r)[byteNo] |= (1 << bitNo);
>


I agree, you version is cleaner.


Leonardo





pgsql-hackers by date:

Previous
From: Boszormenyi Zoltan
Date:
Subject: Re: Fix auto-prepare #2
Next
From: Magnus Hagander
Date:
Subject: Re: Clearing global statistics