Thread: Speeding up loops in pl/pgsql function

Speeding up loops in pl/pgsql function

From

"Reuven M. Lerner"

Date:

25 May 2011, 17:05:33

Hi, everyone.  I'm working on a project that's using PostgreSQL 8.3,
that requires me to translate strings of octal digits into strings of
characters -- so '141142143' should become 'abc', although the database
column containing this data (both before and after) is a bytea.


While the function I've written is accurate, it turns out that it's also
ridiculously slow.  I've managed to speed it up a fair amount, to twice
what it was previously doing, by folding a helper function into a main
one, and appending to an array (which I then join into a string at the
end of the function) instead of concatenating a string onto itself time
after time.


I realize that pl/pgsql is not a good choice for doing this sort of
task, and that another language -- say, one with direct support for
octal digits, or with built-in, speedy array functions such as pop() and
push()  -- would be a better choice.  But that's not an option at this
point.


I should also note that I'm not manipulating a huge amount of data
here.  We're talking about 300 or so rows, each of which contains about
250 KB of data.  (Hmm, could the problem be that I'm constantly forcing
the system to compress and uncompress the data in TOAST?  I hadn't
thought of that until just now...)


I thus have two basic questions:


(1) Are there any good guidelines for what operations in pl/pgsql are
optimized for which data structures?  For example, it turns out that a
great deal of time is being spent in the substring() function, which
surprised me.  I thought that by switching to an array, it might be
faster, but that wasn't the case, at least in my tests.  Having a sense
of what I should and shouldn't be trying, and which built-in functions
are particularly fast or slow, would be useful to know.


(2) Is there any configuration setting that would (perhaps) speed things
up a bit?  I thought that maybe work_mem would help, but the
documentation didn't indicate this at all, and sure enough, nothing
really changed when I increased it.


Of course, any suggestions for how to deal with octal digits in
PostgreSQL 8.3, such as an octal equivalent to the x'ff' syntax, would
be more than welcome.


Thanks in advance,


Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

Re: Speeding up loops in pl/pgsql function

From

"Reuven M. Lerner"

Date:

25 May 2011, 18:55:38

Hi, Alex.  You wrote:
> Have you tried something like:
> SELECT  encode(regexp_replace('141142143', '(\d{3})', '\\\1',
> 'g')::bytea, 'escape');
Hmm, forgot about regexp_replace.  It might do the trick, but without a
full-blown eval that I can run on the replacement side, it'll be a bit
more challenging.  But that's a good direction to consider, for sure.

> I think select E'\XXX' is what you are looking for (per the fine
> manual: http://www.postgresql.org/docs/current/static/datatype-binary.html)
I didn't think that I could (easily) build a string like that from
digits in a variable or a column, but I'll poke around and see if it can
work.

Thanks,

Reuven


--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

Re: Speeding up loops in pl/pgsql function

From

Alex Hunsaker

Date:

25 May 2011, 18:56:12

On Wed, May 25, 2011 at 12:45, Reuven M. Lerner <reuven@lerner.co.il> wrote:
> Hi, Alex.  You wrote:

>> I think select E'\XXX' is what you are looking for (per the fine
>> manual:
>> http://www.postgresql.org/docs/current/static/datatype-binary.html)
>
> I didn't think that I could (easily) build a string like that from digits in
> a variable or a column, but I'll poke around and see if it can work.

Well, if you build '\XXX' you can call escape(..., 'escape') on it
like I did with the regex above.

Re: Speeding up loops in pl/pgsql function

From

Pavel Stehule

Date:

25 May 2011, 19:30:40

Hello

>
> (1) Are there any good guidelines for what operations in pl/pgsql are
> optimized for which data structures?  For example, it turns out that a great
> deal of time is being spent in the substring() function, which surprised me.
>  I thought that by switching to an array, it might be faster, but that
> wasn't the case, at least in my tests.  Having a sense of what I should and
> shouldn't be trying, and which built-in functions are particularly fast or
> slow, would be useful to know.
>

PL/pgSQL is perfect like glue for SQL. For all other isn't good

http://okbob.blogspot.com/2010/04/frequent-mistakes-in-plpgsql-design.html
http://www.pgsql.cz/index.php/PL/pgSQL_%28en%29#When_PL.2FpgSQL_is_not_applicable

>
> (2) Is there any configuration setting that would (perhaps) speed things up
> a bit?  I thought that maybe work_mem would help, but the documentation
> didn't indicate this at all, and sure enough, nothing really changed when I
> increased it.
>
>

probably not

Just PL/pgSQL is not C, and you cannot do some heavy string or array operations.

Regards

Pavel Stehule

Re: Speeding up loops in pl/pgsql function

From

Alex Hunsaker

Date:

25 May 2011, 19:42:12

On Wed, May 25, 2011 at 10:59, Reuven M. Lerner <reuven@lerner.co.il> wrote:
> Hi, everyone.  I'm working on a project that's using PostgreSQL 8.3, that
> requires me to translate strings of octal digits into strings of characters
> -- so '141142143' should become 'abc', although the database column
> containing this data (both before and after) is a bytea.

Have you tried something like:
SELECT  encode(regexp_replace('141142143', '(\d{3})', '\\\1',
'g')::bytea, 'escape');

> ...
> Of course, any suggestions for how to deal with octal digits in PostgreSQL
> 8.3, such as an octal equivalent to the x'ff' syntax, would be more than
> welcome.

I think select E'\XXX' is what you are looking for (per the fine
manual: http://www.postgresql.org/docs/current/static/datatype-binary.html)

Re: Speeding up loops in pl/pgsql function

From

Merlin Moncure

Date:

25 May 2011, 21:09:30

On Wed, May 25, 2011 at 11:59 AM, Reuven M. Lerner <reuven@lerner.co.il> wrote:
> Hi, everyone.  I'm working on a project that's using PostgreSQL 8.3, that
> requires me to translate strings of octal digits into strings of characters
> -- so '141142143' should become 'abc', although the database column
> containing this data (both before and after) is a bytea.
>
>
> While the function I've written is accurate, it turns out that it's also
> ridiculously slow.  I've managed to speed it up a fair amount, to twice what
> it was previously doing, by folding a helper function into a main one, and
> appending to an array (which I then join into a string at the end of the
> function) instead of concatenating a string onto itself time after time.
>
>
> I realize that pl/pgsql is not a good choice for doing this sort of task,
> and that another language -- say, one with direct support for octal digits,
> or with built-in, speedy array functions such as pop() and push()  -- would
> be a better choice.  But that's not an option at this point.
>
>
> I should also note that I'm not manipulating a huge amount of data here.
>  We're talking about 300 or so rows, each of which contains about 250 KB of
> data.  (Hmm, could the problem be that I'm constantly forcing the system to
> compress and uncompress the data in TOAST?  I hadn't thought of that until
> just now...)
>
>
> I thus have two basic questions:
>
>
> (1) Are there any good guidelines for what operations in pl/pgsql are
> optimized for which data structures?  For example, it turns out that a great
> deal of time is being spent in the substring() function, which surprised me.
>  I thought that by switching to an array, it might be faster, but that
> wasn't the case, at least in my tests.  Having a sense of what I should and
> shouldn't be trying, and which built-in functions are particularly fast or
> slow, would be useful to know.
>
>
> (2) Is there any configuration setting that would (perhaps) speed things up
> a bit?  I thought that maybe work_mem would help, but the documentation
> didn't indicate this at all, and sure enough, nothing really changed when I
> increased it.
>
>
> Of course, any suggestions for how to deal with octal digits in PostgreSQL
> 8.3, such as an octal equivalent to the x'ff' syntax, would be more than
> welcome.

let's see the source.  I bet we can get this figured out.

merlin

Re: Speeding up loops in pl/pgsql function

From

"Reuven M. Lerner"

Date:

25 May 2011, 22:26:34

Hi, everyone.  Merlin wrote:

> let's see the source.  I bet we can get this figured out.

Here you go... it looked nicer before I started to make optimizations;
I've gotten it to run about 2x as fast as the previous version, but now
I'm sorta stuck, looking for further optimizations, including possible
use of builtin functions.

Thanks for any suggestions you can offer.

CREATE OR REPLACE FUNCTION translate_octals_into_decimals(bytea_string
BYTEA) RETURNS BYTEA AS $$
DECLARE
   bytea_string_length INTEGER := length(bytea_string);
   current_substring TEXT := '';
   translated_string_array BYTEA[];

   output_number INTEGER := 0;
   output_number_text TEXT := '';
   current_digit TEXT := '';
BEGIN
   RAISE NOTICE '[translate_octals_into_decimals] start at %, string of
length %', clock_timestamp(), pg_size_pretty(length(bytea_string));

   FOR i IN 1..length(bytea_string) BY 3 LOOP
     current_substring := substring(bytea_string from i for 3);

     output_number := 0;

     FOR j IN 0..(length(current_substring) - 1) LOOP
       current_digit := substring(current_substring from
(length(current_substring) - j) for 1);
       output_number := output_number + current_digit::integer * (8 ^ j);
     END LOOP;

     output_number_text = lpad(output_number::text, 3, '0');

     IF output_number_text::int = 92 THEN
       translated_string_array := array_append(translated_string_array,
E'\\\\'::bytea);
     ELSIF output_number_text::int = 0 THEN
       translated_string_array := array_append(translated_string_array,
E'\\000'::bytea);
     ELSE
       translated_string_array := array_append( translated_string_array,
chr(output_number_text::integer)::bytea );
     END IF;

   END LOOP;

   RETURN array_to_string(translated_string_array, '');
END;
$$ LANGUAGE 'plpgsql';

Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

Re: Speeding up loops in pl/pgsql function

From

Steve Crawford

Date:

26 May 2011, 00:03:13

On 05/25/2011 11:45 AM, Reuven M. Lerner wrote:
> Hi, Alex.  You wrote:
>> Have you tried something like:
>> SELECT  encode(regexp_replace('141142143', '(\d{3})', '\\\1',
>> 'g')::bytea, 'escape');
> Hmm, forgot about regexp_replace.  It might do the trick, but without
> a full-blown eval that I can run on the replacement side, it'll be a
> bit more challenging.  But that's a good direction to consider, for sure.

The function given didn't work exactly as written for me but it is on
the right track. See if this works for you (input validation is left as
an exercise for the reader...:)):

create or replace function octal_string_to_text(someoctal text) returns
text as $$
declare
     binstring text;
begin
     execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1',
'g') || '''' into binstring;
return binstring;
end
$$ language plpgsql;

Cheers,
Steve

Re: Speeding up loops in pl/pgsql function

From

Merlin Moncure

Date:

26 May 2011, 01:20:52

On Wed, May 25, 2011 at 8:03 PM, Steve Crawford
<scrawford@pinpointresearch.com> wrote:
> On 05/25/2011 11:45 AM, Reuven M. Lerner wrote:
>>
>> Hi, Alex.  You wrote:
>>>
>>> Have you tried something like:
>>> SELECT  encode(regexp_replace('141142143', '(\d{3})', '\\\1',
>>> 'g')::bytea, 'escape');
>>
>> Hmm, forgot about regexp_replace.  It might do the trick, but without a
>> full-blown eval that I can run on the replacement side, it'll be a bit more
>> challenging.  But that's a good direction to consider, for sure.
>
> The function given didn't work exactly as written for me but it is on the
> right track. See if this works for you (input validation is left as an
> exercise for the reader...:)):
>
> create or replace function octal_string_to_text(someoctal text) returns text
> as $$
> declare
>    binstring text;
> begin
>    execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1', 'g')
> || '''' into binstring;
> return binstring;
> end
> $$ language plpgsql;

four points (minor suggestions btw):
1. if you are dealing with strings that have backslashes in them,
don't escape, but dollar quote.  Also try not to use dollar parameter
notation if you can help it:
($1, E'(\\d{3})', E'\\\\\\1', 'g') -> (someoctal , $q$(\d{3})$q$,
$q$\\\1$q$, 'g')

this is particularly true with feeding strings to regex: that way you
can use the same string pg as in various validators.

2. there is no need for execute here.
execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1', 'g')
becomes:
binstring := 'E''' || regexp_replace($1, $q$(\d{3})$q$, $q$\\\1$q$,
'g')  /* I *think* I got this right */

3. if your function does not scribble on tables and has no or is not
influenced by any side effects, mark it as IMMUTABLE. always.
$$ language plpgsql IMMUTABLE;

4. since all we are doing is generating a variable, prefer sql
function vs plpgsql. this is particularly true in pre 8.4 postgres
(IIRC) where you can call the function much more flexibly (select
func(); vs select * from func();) if that's the case.  Putting it all
together,

create or replace function octal_string_to_text(someoctal text)
returns text as $$
   SELECT  'E''' || regexp_replace($1, $q$(\d{3})$q$, $q$\\\1$q$, 'g');
$$ sql immutable;

Note I didn't actually check to see what your regex is donig (I'm
assuming it's correct)...

merlin

Re: Speeding up loops in pl/pgsql function

From

Santhakumaran

Date:

26 May 2011, 04:01:59

Dkloskxe

Steve Crawford <scrawford@pinpointresearch.com> wrote:

>On 05/25/2011 11:45 AM, Reuven M. Lerner wrote:
>> Hi, Alex.  You wrote:
>>> Have you tried something like:
>>> SELECT  encode(regexp_replace('141142143', '(\d{3})', '\\\1',
>>> 'g')::bytea, 'escape');
>> Hmm, forgot about regexp_replace.  It might do the trick, but without
>> a full-blown eval that I can run on the replacement side, it'll be a
>> bit more challenging.  But that's a good direction to consider, for sure.
>
>The function given didn't work exactly as written for me but it is on
>the right track. See if this works for you (input validation is left as
>an exercise for the reader...:)):
>
>create or replace function octal_string_to_text(someoctal text) returns
>text as $$
>declare
>     binstring text;
>begin
>     execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1',
>'g') || '''' into binstring;
>return binstring;
>end
>$$ language plpgsql;
>
>Cheers,
>Steve
>
>
>--
>Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
>To make changes to your subscription:
>http://www.postgresql.org/mailpref/pgsql-performance

Re: Speeding up loops in pl/pgsql function

From

Jochen Erwied

Date:

26 May 2011, 09:07:20

Thursday, May 26, 2011, 12:26:17 AM you wrote:

> Here you go... it looked nicer before I started to make optimizations;
> I've gotten it to run about 2x as fast as the previous version, but now
> I'm sorta stuck, looking for further optimizations, including possible
> use of builtin functions.

I've got only a 9.0.4 to play with, and bytea's are passed as an
hexadecimal string, so I resorted to writing the function with TEXT as
parameters, but maybe the following helps a bit, avoiding a few IMHO
useless string/int-operations:

CREATE OR REPLACE FUNCTION translate_octals_into_decimals(bytea_string text) RETURNS text AS $$
DECLARE
   bytea_string_length INTEGER := length(bytea_string);
   translated_string_array BYTEA[];

   output_number INTEGER := 0;
   num1 INTEGER;
   num2 INTEGER;
   num3 INTEGER;
   npos INTEGER;
   nlen INTEGER;
BEGIN
   RAISE NOTICE '[translate_octals_into_decimals] start at %, string of
length %', clock_timestamp(), pg_size_pretty(length(bytea_string));

   npos := 1;
   FOR i IN 1..bytea_string_length BY 3 LOOP
     num1 := substring(bytea_string from i for 1);
     num2 := substring(bytea_string from i+1 for 1);
     num3 := substring(bytea_string from i+2 for 1);
     output_number := 64*num1 + 8*num2 + num3;

     IF output_number = 0 THEN
        translated_string_array[npos] := E'\\000'::bytea;
     ELSIF output_number = 92 THEN
        translated_string_array[npos] := E'\\\\'::bytea;
     ELSE
        translated_string_array[npos] := chr(output_number)::bytea;
     END IF;
     npos := npos+1;
   END LOOP;

   RETURN array_to_string(translated_string_array, '');
END;
$$ LANGUAGE 'plpgsql';


--
Jochen Erwied     |   home: jochen@erwied.eu     +49-208-38800-18, FAX: -19
Sauerbruchstr. 17 |   work: joe@mbs-software.de  +49-2151-7294-24, FAX: -50
D-45470 Muelheim  | mobile: jochen.erwied@vodafone.de       +49-173-5404164

Re: Speeding up loops in pl/pgsql function

From

Merlin Moncure

Date:

26 May 2011, 12:11:41

On Wed, May 25, 2011 at 9:20 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, May 25, 2011 at 8:03 PM, Steve Crawford
> <scrawford@pinpointresearch.com> wrote:
>> On 05/25/2011 11:45 AM, Reuven M. Lerner wrote:
>>>
>>> Hi, Alex.  You wrote:
>>>>
>>>> Have you tried something like:
>>>> SELECT  encode(regexp_replace('141142143', '(\d{3})', '\\\1',
>>>> 'g')::bytea, 'escape');
>>>
>>> Hmm, forgot about regexp_replace.  It might do the trick, but without a
>>> full-blown eval that I can run on the replacement side, it'll be a bit more
>>> challenging.  But that's a good direction to consider, for sure.
>>
>> The function given didn't work exactly as written for me but it is on the
>> right track. See if this works for you (input validation is left as an
>> exercise for the reader...:)):
>>
>> create or replace function octal_string_to_text(someoctal text) returns text
>> as $$
>> declare
>>    binstring text;
>> begin
>>    execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1', 'g')
>> || '''' into binstring;
>> return binstring;
>> end
>> $$ language plpgsql;
>
> four points (minor suggestions btw):
> 1. if you are dealing with strings that have backslashes in them,
> don't escape, but dollar quote.  Also try not to use dollar parameter
> notation if you can help it:
> ($1, E'(\\d{3})', E'\\\\\\1', 'g') -> (someoctal , $q$(\d{3})$q$,
> $q$\\\1$q$, 'g')
>
> this is particularly true with feeding strings to regex: that way you
> can use the same string pg as in various validators.
>
> 2. there is no need for execute here.
> execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1', 'g')
> becomes:
> binstring := 'E''' || regexp_replace($1, $q$(\d{3})$q$, $q$\\\1$q$,
> 'g')  /* I *think* I got this right */
>
> 3. if your function does not scribble on tables and has no or is not
> influenced by any side effects, mark it as IMMUTABLE. always.
> $$ language plpgsql IMMUTABLE;
>
> 4. since all we are doing is generating a variable, prefer sql
> function vs plpgsql. this is particularly true in pre 8.4 postgres
> (IIRC) where you can call the function much more flexibly (select
> func(); vs select * from func();) if that's the case.  Putting it all
> together,
>
> create or replace function octal_string_to_text(someoctal text)
> returns text as $$
>   SELECT  'E''' || regexp_replace($1, $q$(\d{3})$q$, $q$\\\1$q$, 'g');
> $$ sql immutable;
>
> Note I didn't actually check to see what your regex is donig (I'm
> assuming it's correct)...

hm, I slept on this and had the vague unsettling feeling I had said
something stupid -- and I did.  Double +1 to you for being cleverer
than me -- you are using 'execute' to eval the string back in to the
string.  Only plpgsql can do that, so point 4 is also moot.  Still,
the above points hold in principle, so if a way could be figured out
to do this without execute, that would be nice.

merlin

Re: Speeding up loops in pl/pgsql function

From

Merlin Moncure

Date:

26 May 2011, 12:36:23

On Thu, May 26, 2011 at 8:11 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, May 25, 2011 at 9:20 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Wed, May 25, 2011 at 8:03 PM, Steve Crawford
>> <scrawford@pinpointresearch.com> wrote:
>>> On 05/25/2011 11:45 AM, Reuven M. Lerner wrote:
>>>>
>>>> Hi, Alex.  You wrote:
>>>>>
>>>>> Have you tried something like:
>>>>> SELECT  encode(regexp_replace('141142143', '(\d{3})', '\\\1',
>>>>> 'g')::bytea, 'escape');
>>>>
>>>> Hmm, forgot about regexp_replace.  It might do the trick, but without a
>>>> full-blown eval that I can run on the replacement side, it'll be a bit more
>>>> challenging.  But that's a good direction to consider, for sure.
>>>
>>> The function given didn't work exactly as written for me but it is on the
>>> right track. See if this works for you (input validation is left as an
>>> exercise for the reader...:)):
>>>
>>> create or replace function octal_string_to_text(someoctal text) returns text
>>> as $$
>>> declare
>>>    binstring text;
>>> begin
>>>    execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1', 'g')
>>> || '''' into binstring;
>>> return binstring;
>>> end
>>> $$ language plpgsql;
>>
>> four points (minor suggestions btw):
>> 1. if you are dealing with strings that have backslashes in them,
>> don't escape, but dollar quote.  Also try not to use dollar parameter
>> notation if you can help it:
>> ($1, E'(\\d{3})', E'\\\\\\1', 'g') -> (someoctal , $q$(\d{3})$q$,
>> $q$\\\1$q$, 'g')
>>
>> this is particularly true with feeding strings to regex: that way you
>> can use the same string pg as in various validators.
>>
>> 2. there is no need for execute here.
>> execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1', 'g')
>> becomes:
>> binstring := 'E''' || regexp_replace($1, $q$(\d{3})$q$, $q$\\\1$q$,
>> 'g')  /* I *think* I got this right */
>>
>> 3. if your function does not scribble on tables and has no or is not
>> influenced by any side effects, mark it as IMMUTABLE. always.
>> $$ language plpgsql IMMUTABLE;
>>
>> 4. since all we are doing is generating a variable, prefer sql
>> function vs plpgsql. this is particularly true in pre 8.4 postgres
>> (IIRC) where you can call the function much more flexibly (select
>> func(); vs select * from func();) if that's the case.  Putting it all
>> together,
>>
>> create or replace function octal_string_to_text(someoctal text)
>> returns text as $$
>>   SELECT  'E''' || regexp_replace($1, $q$(\d{3})$q$, $q$\\\1$q$, 'g');
>> $$ sql immutable;
>>
>> Note I didn't actually check to see what your regex is donig (I'm
>> assuming it's correct)...
>
> hm, I slept on this and had the vague unsettling feeling I had said
> something stupid -- and I did.  Double +1 to you for being cleverer
> than me -- you are using 'execute' to eval the string back in to the
> string.  Only plpgsql can do that, so point 4 is also moot.  Still,
> the above points hold in principle, so if a way could be figured out
> to do this without execute, that would be nice.

got it:
select decode(regexp_replace('141142143', '([0-9][0-9][0-9])',
$q$\\\1$q$ , 'g'), 'escape');
 decode
--------
 abc
(1 row)

merlin

Re: Speeding up loops in pl/pgsql function

From

"Reuven M. Lerner"

Date:

26 May 2011, 12:49:59

Wow.

Color me impressed and grateful.  I've been working on a different
project today, but I'll test these tonight.

I'll never underestimate the regexp functionality in PostgreSQL again!

Reuven

Re: Speeding up loops in pl/pgsql function

From

Steve Crawford

Date:

26 May 2011, 15:25:14

On 05/26/2011 05:36 AM, Merlin Moncure wrote:
> ...
> got it:
> select decode(regexp_replace('141142143', '([0-9][0-9][0-9])',
> $q$\\\1$q$ , 'g'), 'escape');
>   decode
> --------
>   abc
> (1 row)
>
> merlin
>
Nice. A word of warning, in 9.0 this returns a hex string:

select decode(regexp_replace('141142143', '([0-9][0-9][0-9])',
$q$\\\1$q$ , 'g'), 'escape');
   decode
----------
  \x616263

See http://www.postgresql.org/docs/9.0/static/release-9-0.html:

E.5.2.3. Data Types
     bytea output now appears in hex format by default (Peter Eisentraut)
     The server parameter bytea_output can be used to select the
traditional output format if needed for compatibility.

Another wrinkle, the function I wrote sort of ignored the bytea issue by
using text. But text is subject to character-encoding (for both good and
bad) while bytea is not so the ultimate solution will depend on whether
the input string is the octal representation of an un-encoded sequence
of bytes or represents a string of ASCII/UTF-8/whatever... encoded text.

Cheers,
Steve

Re: Speeding up loops in pl/pgsql function

From

"Reuven M. Lerner"

Date:

01 June 2011, 09:19:52

Hi, everyone.

First of all, thanks for all of your help several days ago.  The
improvements to our program were rather dramatic (in a positive sense).

Based on the help that everyone gave, I'm working on something similar,
trying to use regexp_replace to transform a string into the result of
invoking a function on each character.  For example, I'd like to do the
following:

regexp_replace('abc', '(.)', ascii(E'\\1')::text, 'g');

Unfortunately, the above invokes ascii() on the literal string E'\\1',
rather than on the value of the backreference, which isn't nearly as
useful.  I'd like to get '979899' back as a string.  And of course, once
I can get back the value of ascii(), I figure that it should work for
any function that I define.

Thanks again for any suggestions everyone might have.

(And if this should go to pgsql-general, then I'll understand.  If it
helps, my alternative to regexp_replace is a super-slow function, akin
to the one that I showed here last week.)

Reuven

Re: Speeding up loops in pl/pgsql function

From

Merlin Moncure

Date:

01 June 2011, 13:26:38

On Wed, Jun 1, 2011 at 4:19 AM, Reuven M. Lerner <reuven@lerner.co.il> wrote:
> Hi, everyone.
>
> First of all, thanks for all of your help several days ago.  The
> improvements to our program were rather dramatic (in a positive sense).
>
> Based on the help that everyone gave, I'm working on something similar,
> trying to use regexp_replace to transform a string into the result of
> invoking a function on each character.  For example, I'd like to do the
> following:
>
> regexp_replace('abc', '(.)', ascii(E'\\1')::text, 'g');
>
> Unfortunately, the above invokes ascii() on the literal string E'\\1',
> rather than on the value of the backreference, which isn't nearly as useful.
>  I'd like to get '979899' back as a string.  And of course, once I can get
> back the value of ascii(), I figure that it should work for any function
> that I define.
>
> Thanks again for any suggestions everyone might have.
>
> (And if this should go to pgsql-general, then I'll understand.  If it helps,
> my alternative to regexp_replace is a super-slow function, akin to the one
> that I showed here last week.)

select string_agg(v, '') from (select
ascii(regexp_split_to_table('abc', $$\s*$$))::text as v) q;

(what about 3 digit ascii codes?)

merlin

Re: Speeding up loops in pl/pgsql function

From

"Reuven M. Lerner"

Date:

02 June 2011, 06:49:46

Hi, Merlin.  You wrote:

> select string_agg(v, '') from (select
> ascii(regexp_split_to_table('abc', $$\s*$$))::text as v) q;
Wow.   I've been programming with pl/pgsql for a good number of years,
but only now do I see the amazing usefulness of regexp_split_to_table
and string_agg, neither of which I really used until now.  Thanks for
both the solution and for opening my eyes.
> (what about 3 digit ascii codes?)
I have to put the number into a text field anyway, so I've been
converting the resulting number to text, and then using lpad to add
leading zeroes as necessary.

Thanks again,

Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner