Thread: Bytea string operator support

Bytea string operator support

From

"Joe Conway"

Date:

05 September 2001, 20:13:38

> > > I agree that it would be better to *not* allow implicit coercions.
> > > Given that, any preferences on function names? Are text_to_bytea() and
> > > bytea_to_text() too ugly?
> >
> > They're pretty ugly, but more importantly they're only suitable if we
> > have exactly one conversion function each way.  If we have two, what
> > will we call the second one?
>
> Why not just stick these things into encode() and name them
> "my-cool-encoding" or whatever.  There is no truly natural conversion
> between text and bytea, so encode/decode seem like the proper place.
>
(I'm sending directly to Peter, Tom, and Bruce because you were all involved
in this thread, and the list seems to be down)

Here's a patch for bytea string functions. As discussed:

text encode(bytea, 'escape')
bytea decode(text, 'escape')

to allow bytea-to-text and text-to-bytea conversion. Also implemented
(SQL99 defines Binary Strings with all of these operators):

byteacat and "||" operator
substring
trim (only did trim(bytea, bytea) since there is no default trim character
for binary per SQL99)
length (just aliased octet_length, which is correct for bytea, I think)
position
like and "~~" operator
not like and "!~~" operator

I think that's it.
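
For readers who want a feel for the semantics of the operators listed above, here is a rough sketch using Python `bytes` as a stand-in for bytea. The function names are hypothetical and are not taken from the patch; positions are 1-based as in SQL, and the trim analogue assumes the second argument is treated as a set of octets to strip from both ends. LIKE matching is left out.

```python
# Illustrative analogues only; none of these names come from the patch.

def bytea_cat(a: bytes, b: bytes) -> bytes:
    """Analogue of byteacat / the "||" operator."""
    return a + b

def bytea_substring(s: bytes, start: int, length: int) -> bytes:
    """Analogue of substring; 'start' is 1-based as in SQL."""
    if length < 0:
        raise ValueError("negative substring length not allowed")
    begin = max(start - 1, 0)
    end = max(start - 1 + length, 0)
    return s[begin:end]

def bytea_trim(s: bytes, chars: bytes) -> bytes:
    """Analogue of trim(bytea, bytea); strips the given octets from both ends (assumption)."""
    return s.strip(chars)

def bytea_length(s: bytes) -> int:
    """For binary data, length and octet_length are the same thing."""
    return len(s)

def bytea_position(sub: bytes, s: bytes) -> int:
    """Analogue of position; 1-based, 0 when 'sub' does not occur."""
    return s.find(sub) + 1
```

Note that embedded zero bytes and backslashes are ordinary octets here, which is the point of having binary-safe versions of these operators.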

Passes all regression tests. Based on the discussion, I did not create
functions to allow casting text-to-bytea or bytea-to-text -- it sounded like
we just want people to use encode/decode. I'm still planning to write
PQescapeBytea, but that will come later as a separate patch. One operator
defined by SQL99, but not implemented here (or for text datatype, that I
could see) is the "overlay" function (modifies string argument by replacing
a substring given start and length with a replacement string). It sounds
useful -- any interest?
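
As a minimal sketch of what "overlay" would do, based only on the description above (replace the substring given by start and length with a replacement string), here is a Python version over `bytes`. The function name, argument order, and error handling are assumptions made for illustration; start is taken to be 1-based as elsewhere in SQL.

```python
def overlay(s: bytes, replacement: bytes, start: int, length: int) -> bytes:
    """Replace 'length' octets of 's', beginning at 1-based 'start', with 'replacement'.

    Hypothetical helper; not part of the patch discussed in this thread.
    """
    if start < 1:
        raise ValueError("start must be >= 1")
    if length < 0:
        raise ValueError("length must be >= 0")
    head = s[:start - 1]            # everything before the replaced span
    tail = s[start - 1 + length:]   # everything after the replaced span
    return head + replacement + tail
```

For example, `overlay(b"Txxxxas", b"hom", 2, 4)` replaces the four octets starting at position 2 and yields `b"Thomas"`. The replacement does not have to be the same length as the span it replaces.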

Review and comments much appreciated!

-- Joe

Attachment

bytea_string_funcs_r00.diff

Re: Bytea string operator support

From

Marko Kreen

Date:

06 September 2001, 13:24:58

On Wed, Sep 05, 2001 at 01:34:06PM -0700, Joe Conway wrote:
> > Why not just stick these things into encode() and name them
> > "my-cool-encoding" or whatever.  There is no truly natural conversion
> > between text and bytea, so encode/decode seem like the proper place.
>
> Here's a patch for bytea string functions. As discussed:
>
> text encode(bytea, 'escape')
> bytea decode(text, 'escape')

Why are you using \xxx encoding there?  As the 'escape' encoding
is supposed to be 'minimalistic' as it escapes only 2
problematic values, then IMHO it would be better to use
\0 and \\ as escapes - takes less room.

--
marko

Re: Bytea string operator support

From

Bruce Momjian

Date:

06 September 2001, 13:44:09

> On Wed, Sep 05, 2001 at 01:34:06PM -0700, Joe Conway wrote:
> > > Why not just stick these things into encode() and name them
> > > "my-cool-encoding" or whatever.  There is no truly natural conversion
> > > between text and bytea, so encode/decode seem like the proper place.
> >
> > Here's a patch for bytea string functions. As discussed:
> >
> > text encode(bytea, 'escape')
> > bytea decode(text, 'escape')
>
> Why are you using \xxx encoding there?  As the 'escape' encoding
> is supposed to be 'minimalistic' as it escapes only 2
> problematic values, then IMHO it would be better to use
> \0 and \\ as escapes - takes less room.

Agreed, and I have documented this in the SGML pages.  Knowing this,
bytea becomes a much easier format to use.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Bytea string operator support

From

"Joe Conway"

Date:

06 September 2001, 14:08:32

> > > Here's a patch for bytea string functions. As discussed:
> > >
> > > text encode(bytea, 'escape')
> > > bytea decode(text, 'escape')
> >
> > Why are you using \xxx encoding there?  As the 'escape' encoding
> > is supposed to be 'minimalistic' as it escapes only 2
> > problematic values, then IMHO it would be better to use
> > \0 and \\ as escapes - takes less room.
>
> Agreed, and I have documented this in the SGML pages.  Knowing this,
> bytea becomes a much easier format to use.

No problem -- I kind of like the octal style better, but I can see your
point. I'll wait a while for more comments, and then send in a new patch.

-- Joe

Re: Bytea string operator support

From

"Joe Conway"

Date:

07 September 2001, 02:47:25

> > > > Here's a patch for bytea string functions. As discussed:
> > > >
> > > > text encode(bytea, 'escape')
> > > > bytea decode(text, 'escape')
> > >
> > > Why are you using \xxx encoding there?  As the 'escape' encoding
> > > is supposed to be 'minimalistic' as it escapes only 2
> > > problematic values, then IMHO it would be better to use
> > > \0 and \\ as escapes - takes less room.
> >
> > Agreed, and I have documented this in the SGML pages.  Knowing this,
> > bytea becomes a much easier format to use.
>
> No problem -- I kind of like the octal style better, but I can see your
> point. I'll wait for awhile for more comments, and then send in a new
> patch.

Here's a revised patch. Changes:

1. Now outputs '\\' instead of '\134' when using encode(bytea, 'escape').
Note that I ended up leaving \0 as \000 so that there are no ambiguities
when decoding something like, for example, \0123.

2. Fixed a bug in byteain that allowed input values that were not valid
octal (e.g. \789) to be parsed as if they were.
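
The two changes above can be sketched in Python as follows. This is an illustration of the behaviour described in this thread, not the C code from the patch: it assumes that only the zero byte and the backslash are escaped (the "2 problematic values" mentioned earlier), that every other byte maps to the character with the same code, and that the decoder accepts only `\\` or exactly three octal digits in the range 000 to 377. The function names are hypothetical.

```python
def encode_escape(data: bytes) -> str:
    """Sketch of encode(bytea, 'escape'): NUL becomes \\000, backslash is doubled."""
    out = []
    for b in data:
        if b == 0:
            out.append("\\000")    # fixed-width octal keeps decoding unambiguous
        elif b == 0x5C:
            out.append("\\\\")     # two backslashes instead of \134
        else:
            out.append(chr(b))     # assumption: other bytes pass through unchanged
    return "".join(out)

def decode_escape(text: str) -> bytes:
    """Sketch of decode(text, 'escape') with strict validation of octal escapes."""
    out = bytearray()
    i = 0
    while i < len(text):
        c = text[i]
        if c != "\\":
            out.append(ord(c))     # raises ValueError for code points above 255
            i += 1
        elif text[i + 1:i + 2] == "\\":
            out.append(0x5C)
            i += 2
        else:
            digits = text[i + 1:i + 4]
            valid = (
                len(digits) == 3
                and digits[0] in "0123"                    # keeps the value <= 0o377
                and all(d in "01234567" for d in digits)
            )
            if not valid:
                raise ValueError("invalid escape sequence at offset %d" % i)
            out.append(int(digits, 8))
            i += 4
    return bytes(out)
```

With a short `\0` escape, the text `\0123` could be read either as a zero byte followed by "123" or as octal `\012` followed by "3". With the fixed three-digit form, a zero byte followed by "123" is written `\000123` and decodes only one way. A sequence such as `\789` is rejected rather than silently interpreted.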

Joe

Attachment

bytea_ops_r01.diff