Re: Allow multi-byte characters as escape in SIMILAR TO and SUBSTRING - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Allow multi-byte characters as escape in SIMILAR TO and SUBSTRING
Date
Msg-id 10393.1408974539@sss.pgh.pa.us
Whole thread Raw
In response to Re: Allow multi-byte characters as escape in SIMILAR TO and SUBSTRING  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Allow multi-byte characters as escape in SIMILAR TO and SUBSTRING
List pgsql-hackers
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> On 07/12/2014 05:16 AM, Jeff Davis wrote:
>> I was able to see about a 2% increase in runtime when using the
>> similar_escape function directly. I made a 10M tuple table and did:
>> 
>> explain analyze
>> select
>> similar_escape('ΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣ','#') from t;
>> 
>> which was the worst reasonable case I could think of. (It appears that
>> selecting from a table is faster than from generate_series. I'm curious
>> what you use when testing the performance of an individual function at
>> the SQL level.)

> A large table like that is what I usually do. A large generate_series() 
> spends a lot of time building the tuplestore, especially if it doesn't 
> fit in work_mem and spills to disk. Sometimes I use this to avoid it:

> explain analyze
>        select
> similar_escape('ΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣΣ','#') 
> from generate_series(1, 10000) a, generate_series(1,1000);

> although in my experience it still has somewhat more overhead than a 
> straight seqscan because.

[ scratches head... ]  Surely similar_escape is marked immutable, and
will therefore be executed exactly once in either of these formulations,
because the planner will fold the expression to a constant.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: Hardening pg_upgrade
Next
From: Petr Jelinek
Date:
Subject: Re: Built-in binning functions