Thread: Bytea string operator support
> > > I agree that it would be better to *not* allow implicit coercions. Given
> > > that, any preferences on function names? Are text_to_bytea() and
> > > bytea_to_text() too ugly?
> >
> > They're pretty ugly, but more importantly they're only suitable if we
> > have exactly one conversion function each way. If we have two, what
> > will we call the second one?
>
> Why not just stick these things into encode() and name them
> "my-cool-encoding" or whatever. There is no truly natural conversion
> between text and bytea, so encode/decode seem like the proper place.
>

(I'm sending directly to Peter, Tom, and Bruce because you were all involved
in this thread, and the list seems to be down.)

Here's a patch for bytea string functions. As discussed:

  text encode(bytea, 'escape')
  bytea decode(text, 'escape')

to allow bytea-to-text and text-to-bytea conversion.

Also implemented (SQL99 defines Binary Strings with all of these operators):

  byteacat and "||" operator
  substring
  trim (only did trim(bytea, bytea) since there is no default trim
    character for binary per SQL99)
  length (just aliased octet_length, which is correct for bytea, I think)
  position
  like and "~~" operator
  not like and "!~~" operator

I think that's it. Passes all regression tests.

Based on the discussion, I did not create functions to allow casting
text-to-bytea or bytea-to-text -- it sounded like we just want people to use
encode/decode. I'm still planning to write PQescapeBytea, but that will come
later as a separate patch.

One operator defined by SQL99, but not implemented here (or for the text
datatype, as far as I could see), is the "overlay" function (modifies the
string argument by replacing a substring, given start and length, with a
replacement string). It sounds useful -- any interest?

Review and comments much appreciated!

-- Joe
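For readers unfamiliar with the "overlay" operation mentioned above, its behaviour can be sketched in a few lines. This is only a minimal Python illustration of the description in the message (replace a substring, given start and length, with a replacement string); it is not part of the patch. The name overlay_bytes and its argument order are hypothetical, and the start position is assumed to be 1-based, as in SQL string functions.

```python
def overlay_bytes(data: bytes, replacement: bytes, start: int, length: int) -> bytes:
    """Replace `length` bytes of `data`, beginning at 1-based `start`,
    with `replacement`. Hypothetical helper, for illustration only."""
    if start < 1:
        raise ValueError("start must be >= 1 (positions are 1-based)")
    if length < 0:
        raise ValueError("length must be >= 0")
    head = data[:start - 1]           # bytes before the replaced span
    tail = data[start - 1 + length:]  # bytes after the replaced span
    return head + replacement + tail


# Replace 4 bytes starting at position 2.
print(overlay_bytes(b"Txxxxas", b"hom", 2, 4))  # prints: b'Thomas'
```

The replacement does not need to be the same length as the span it replaces, so the result can be shorter or longer than the input.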
On Wed, Sep 05, 2001 at 01:34:06PM -0700, Joe Conway wrote:
> > Why not just stick these things into encode() and name them
> > "my-cool-encoding" or whatever. There is no truly natural conversion
> > between text and bytea, so encode/decode seem like the proper place.
>
> Here's a patch for bytea string functions. As discussed:
>
> text encode(bytea, 'escape')
> bytea decode(text, 'escape')

Why are you using \xxx encoding there? As the 'escape' encoding
is supposed to be 'minimalistic' as it escapes only 2
problematic values, then IMHO it would be better to use
\0 and \\ as escapes - takes less room.

--
marko
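The size argument can be made concrete with a small comparison. The Python sketch below is an illustration only, not code from the patch: it assumes, as the message states, that only the zero byte and the backslash are escaped, and the helper names (encode_octal, encode_short) are hypothetical. Bytes are mapped one-to-one to characters for simplicity.

```python
def _escape(data: bytes, zero: str, backslash: str) -> str:
    """Escape only the zero byte and the backslash; pass everything else through."""
    out = []
    for b in data:
        if b == 0:
            out.append(zero)
        elif b == 0x5C:  # backslash
            out.append(backslash)
        else:
            out.append(chr(b))
    return "".join(out)


def encode_octal(data: bytes) -> str:
    # Three-digit octal style: \000 for zero, \134 for backslash.
    return _escape(data, "\\000", "\\134")


def encode_short(data: bytes) -> str:
    # Proposed short style: \0 for zero, \\ for backslash.
    return _escape(data, "\\0", "\\\\")


sample = b"a\x00\\b"
print(len(encode_octal(sample)), len(encode_short(sample)))  # prints: 10 6
```

Each escaped byte costs four characters in the octal style and two in the short style.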
> On Wed, Sep 05, 2001 at 01:34:06PM -0700, Joe Conway wrote:
> > > Why not just stick these things into encode() and name them
> > > "my-cool-encoding" or whatever. There is no truly natural conversion
> > > between text and bytea, so encode/decode seem like the proper place.
> >
> > Here's a patch for bytea string functions. As discussed:
> >
> > text encode(bytea, 'escape')
> > bytea decode(text, 'escape')
>
> Why are you using \xxx encoding there? As the 'escape' encoding
> is supposed to be 'minimalistic' as it escapes only 2
> problematic values, then IMHO it would be better to use
> \0 and \\ as escapes - takes less room.

Agreed, and I have documented this in the SGML pages. Knowing this,
bytea becomes a much easier format to use.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> > > Here's a patch for bytea string functions. As discussed:
> > >
> > > text encode(bytea, 'escape')
> > > bytea decode(text, 'escape')
> >
> > Why are you using \xxx encoding there? As the 'escape' encoding
> > is supposed to be 'minimalistic' as it escapes only 2
> > problematic values, then IMHO it would be better to use
> > \0 and \\ as escapes - takes less room.
>
> Agreed, and I have documented this in the SGML pages. Knowing this,
> bytea becomes a much easier format to use.

No problem -- I kind of like the octal style better, but I can see your
point. I'll wait a while for more comments, and then send in a new patch.

-- Joe
> > > > Here's a patch for bytea string functions. As discussed:
> > > >
> > > > text encode(bytea, 'escape')
> > > > bytea decode(text, 'escape')
> > >
> > > Why are you using \xxx encoding there? As the 'escape' encoding
> > > is supposed to be 'minimalistic' as it escapes only 2
> > > problematic values, then IMHO it would be better to use
> > > \0 and \\ as escapes - takes less room.
> >
> > Agreed, and I have documented this in the SGML pages. Knowing this,
> > bytea becomes a much easier format to use.
>
> No problem -- I kind of like the octal style better, but I can see your
> point. I'll wait a while for more comments, and then send in a new patch.

Here's a revised patch. Changes:

1. Now outputs '\\' instead of '\134' when using encode(bytea, 'escape').
   Note that I ended up leaving \0 as \000 so that there are no ambiguities
   when decoding something like, for example, \0123.

2. Fixed a bug in byteain which allowed input values that were not valid
   octals (e.g. \789) to be parsed as if they were octals.

Joe
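The two changes in the revised patch can be illustrated with a short round-trip sketch. This is a Python approximation of the behaviour described in the message, not the patch's C code. It assumes that only the zero byte (as \000) and the backslash (as \\) are escaped on output, that input accepts \\ and three-digit octal escapes, and that bytes map one-to-one to characters. The function names are hypothetical.

```python
def escape_encode(data: bytes) -> str:
    """Sketch of the 'escape' output form: \\000 for zero, \\\\ for backslash."""
    out = []
    for b in data:
        if b == 0:
            out.append("\\000")
        elif b == 0x5C:  # backslash
            out.append("\\\\")
        else:
            out.append(chr(b))
    return "".join(out)


def escape_decode(text: str) -> bytes:
    """Sketch of the input side: accept \\\\ or exactly three octal digits."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ord(ch))  # raises ValueError for code points above 255
            i += 1
        elif text[i + 1:i + 2] == "\\":
            out.append(0x5C)
            i += 2
        else:
            digits = text[i + 1:i + 4]
            valid = (
                len(digits) == 3
                and digits[0] in "0123"  # keeps the value within one byte
                and all(d in "01234567" for d in digits[1:])
            )
            if not valid:
                raise ValueError(f"invalid escape sequence at offset {i}")
            out.append(int(digits, 8))
            i += 4
    return bytes(out)


print(escape_encode(b"a\x00b\\c"))  # prints: a\000b\\c
print(escape_decode("\\0123"))      # prints: b'\n3'
```

Because every octal escape is exactly three digits wide, \0123 can only be read as \012 followed by the character 3. If a two-character \0 were also accepted, the same input could equally mean a zero byte followed by "123". An input such as \789 raises an error in this sketch rather than being interpreted as an octal value.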