Re: Speeding up loops in pl/pgsql function - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: Speeding up loops in pl/pgsql function
Date
Msg-id BANLkTin8u2fM+RjE4Ao78v2W83V2Z9iHUA@mail.gmail.com
Whole thread Raw
In response to Re: Speeding up loops in pl/pgsql function  (Steve Crawford <scrawford@pinpointresearch.com>)
Responses Re: Speeding up loops in pl/pgsql function
List pgsql-performance
On Wed, May 25, 2011 at 8:03 PM, Steve Crawford
<scrawford@pinpointresearch.com> wrote:
> On 05/25/2011 11:45 AM, Reuven M. Lerner wrote:
>>
>> Hi, Alex.  You wrote:
>>>
>>> Have you tried something like:
>>> SELECT  encode(regexp_replace('141142143', '(\d{3})', '\\\1',
>>> 'g')::bytea, 'escape');
>>
>> Hmm, forgot about regexp_replace.  It might do the trick, but without a
>> full-blown eval that I can run on the replacement side, it'll be a bit more
>> challenging.  But that's a good direction to consider, for sure.
>
> The function given didn't work exactly as written for me but it is on the
> right track. See if this works for you (input validation is left as an
> exercise for the reader...:)):
>
> create or replace function octal_string_to_text(someoctal text) returns text
> as $$
> declare
>    binstring text;
> begin
>    execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1', 'g')
> || '''' into binstring;
> return binstring;
> end
> $$ language plpgsql;

four points (minor suggestions btw):
1. if you are dealing with strings that have backslashes in them,
don't escape, but dollar quote.  Also try not to use dollar parameter
notation if you can help it:
($1, E'(\\d{3})', E'\\\\\\1', 'g') -> (someoctal , $q$(\d{3})$q$,
$q$\\\1$q$, 'g')

this is particularly true with feeding strings to regex: that way you
can use the same string pg as in various validators.

2. there is no need for execute here.
execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1', 'g')
becomes:
binstring := 'E''' || regexp_replace($1, $q$(\d{3})$q$, $q$\\\1$q$,
'g')  /* I *think* I got this right */

3. if your function does not scribble on tables and has no or is not
influenced by any side effects, mark it as IMMUTABLE. always.
$$ language plpgsql IMMUTABLE;

4. since all we are doing is generating a variable, prefer sql
function vs plpgsql. this is particularly true in pre 8.4 postgres
(IIRC) where you can call the function much more flexibly (select
func(); vs select * from func();) if that's the case.  Putting it all
together,

create or replace function octal_string_to_text(someoctal text)
returns text as $$
   SELECT  'E''' || regexp_replace($1, $q$(\d{3})$q$, $q$\\\1$q$, 'g');
$$ sql immutable;

Note I didn't actually check to see what your regex is donig (I'm
assuming it's correct)...

merlin

pgsql-performance by date:

Previous
From: Steve Crawford
Date:
Subject: Re: Speeding up loops in pl/pgsql function
Next
From: Craig Ringer
Date:
Subject: Re: Hash Anti Join performance degradation