Thread: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

The first time a Perl function executes in a connection it is very slow, but if
you then execute a Perl function (not necessarily the one executed first) again, it
runs about 1000 times faster. Why? How can I optimize it?
Setting shared_preload_libraries = '$libdir/plperl' or
local_preload_libraries = '$libdir/plugins/plperl' does not help.
Reproduce code:
-- First of all, creating functions
CREATE OR REPLACE FUNCTION perl_test_speed() RETURNS void AS
$body$;$body$ LANGUAGE plperl VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;

CREATE OR REPLACE FUNCTION perl_test_speed_two() RETURNS void AS
$body$;$body$ LANGUAGE plperl VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;

-- Reopen the connection (THIS IS AN IMPORTANT STEP!)

-- Execute:
EXPLAIN ANALYZE SELECT perl_test_speed();
-- Result:
--   Result  (cost=0.00..0.26 rows=1 width=0) (actual time=149.994..149.999 rows=1 loops=1)
--   Total runtime: 150.363 ms  -- WTF?
-- Other calls in this connection:
EXPLAIN ANALYZE SELECT perl_test_speed();
-- Result:
--   Result  (cost=0.00..0.26 rows=1 width=0) (actual time=0.121..0.122 rows=1 loops=1)
--   Total runtime: 0.176 ms


-- Reopen the connection (THIS IS AN IMPORTANT STEP!)

EXPLAIN ANALYZE SELECT perl_test_speed_two();
-- Result  (cost=0.00..0.26 rows=1 width=0) (actual time=173.713..173.716 rows=1 loops=1)
-- Total runtime: 174.073 ms  -- WTF?

EXPLAIN ANALYZE SELECT perl_test_speed();
-- Result  (cost=0.00..0.26 rows=1 width=0) (actual time=0.973..0.975 rows=1 loops=1)
-- Total runtime: 1.035 ms


EXPLAIN ANALYZE SELECT perl_test_speed();
-- Result  (cost=0.00..0.26 rows=1 width=0) (actual time=0.122..0.123 rows=1 loops=1)
-- Total runtime: 0.171 ms
-- END;

How can I make it run faster after every reconnect?
Thanks!
PS: Sorry for my English.



Oleg Serov wrote:
> The first time a Perl function executes in a connection it is very slow, but if
> you then execute a Perl function (not necessarily the one executed first) again, it
> runs about 1000 times faster. Why? How can I optimize it?
> Setting shared_preload_libraries = '$libdir/plperl' or
> local_preload_libraries = '$libdir/plugins/plperl' does not help.

The function is recompiled on the first call in each backend. (The same 
is true of most PLs, BTW, it's not a perl-only problem.) There is no 
immediate cure, unfortunately. Using pooled connections might help to 
mitigate the effect.
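For example, a session pooler such as PgBouncer keeps server backends (and their already-initialized Perl interpreters) alive across client connects, so the startup cost is paid once per pooled backend rather than once per application connect. A minimal sketch, with hypothetical database name, host, and ports:

```ini
; pgbouncer.ini -- session pooling keeps each pooled server
; connection's plperl interpreter warm across client connects
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = trust
pool_mode = session
```

Clients then connect to port 6432 instead of the server directly.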

One might imagine providing for certain functions to be loaded on 
backend startup, or even (where we aren't using EXEC_BACKEND) on 
postmaster start. But that would be a new feature, and we are past 
feature freeze for the upcoming release, so it's not going to happen any 
time soon.

cheers

andrew


Why doesn't pl/pgsql have this effect?

2008/11/16 Andrew Dunstan <andrew@dunslane.net>:
> The function is recompiled on the first call in each backend. (The same is
> true of most PLs, BTW, it's not a perl-only problem.) There is no immediate
> cure, unfortunately. Using pooled connections might help to mitigate the
> effect.
>
> One might imagine providing for certain functions to be loaded on backend
> startup, or even (where we aren't using EXEC_BACKEND) on postmaster start.
> But that would be a new feature, and we are past feature freeze for the
> upcoming release, so it's not going to happen any time soon.
>
> cheers
>
> andrew
>


Hey, you are wrong: compile time is 1 ms, not 100 ms. It happens on the
first plperl function call; it is a Perl init problem, not function
compilation.

2008/11/16 Oleg Serov <serovov@gmail.com>:
> Why doesn't pl/pgsql have this effect?




Oleg Serov wrote:
> Hey, you are wrong: compile time is 1 ms, not 100 ms. It happens on the
> first plperl function call; it is a Perl init problem, not function
> compilation.

Please do NOT top-post - it makes threads totally unreadable.

Is a one-off cost per connection of 100ms so bothersome?

The alternative would be to start it up when the backend starts, but 
backend startup cost is already high, so that doesn't seem like a win.

cheers

andrew




Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection

From: Martijn van Oosterhout
On Sun, Nov 16, 2008 at 09:15:28AM -0500, Andrew Dunstan wrote:
> Oleg Serov wrote:
> >Hey, you are wrong: compile time is 1 ms, not 100 ms. It happens on the
> >first plperl function call; it is a Perl init problem, not function
> >compilation.
> >
>
> Please do NOT top-post - it makes threads totally unreadable.
>
> Is a one-off cost per connection of 100ms so bothersome?
>
> The alternative would be to start it up when the backend starts, but
> backend startup cost is already high, so that doesn't seem like a win.

If per-connection costs are a problem, then connection pooling usually
helps.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Andrew Dunstan <andrew@dunslane.net> writes:
> Is a one-off cost per connection of 100ms so bothersome?

I think the complaint is that there's no obvious reason for it to be so
high.

On my Fedora 9 machine, the overhead to start plperl seems to be about
40 msec.  This compares unfavorably to the time to start perl from the
command line, which is under 4 msec.  I see that /usr/bin/perl pulls in
libperl.so, so it's paying the same shlib overhead as we do.  How is it
that we take ten times longer to start up?
        regards, tom lane


I wrote:
> On my Fedora 9 machine, the overhead to start plperl seems to be about
> 40 msec.  This compares unfavorably to the time to start perl from the
> command line, which is under 4 msec.  I see that /usr/bin/perl pulls in
> libperl.so, so it's paying the same shlib overhead as we do.  How is it
> that we take ten times longer to start up?

The above number was for plperl in a SQL_ASCII database.
Some more data points (startup overhead, msec):

            plperl    plperlu
SQL_ASCII     40        18
UTF8          67        18

which leads to the conclusion that those random little startup things
plperl does are just unbelievably expensive.
        regards, tom lane



Tom Lane wrote:
> I wrote:
>   
>> On my Fedora 9 machine, the overhead to start plperl seems to be about
>> 40 msec.  This compares unfavorably to the time to start perl from the
>> command line, which is under 4 msec.  I see that /usr/bin/perl pulls in
>> libperl.so, so it's paying the same shlib overhead as we do.  How is it
>> that we take ten times longer to start up?
>>     
>
> The above number was for plperl in a SQL_ASCII database.
> Some more data points:
>
>         plperl        plperlu
>
> SQL_ASCII    40        18
> UTF8        67        18
>
> which leads to the conclusion that those random little startup things
> plperl does are just unbelievably expensive.

I suspect much of this time is taken in loading perl modules (strict, 
Safe, utf8, SPI, among others), not in anything very special that we do:

   [andrew@constanza plperl]$ time perl -e ';'

   real    0m0.004s
   user    0m0.003s
   sys     0m0.001s

   [andrew@constanza plperl]$ time perl -e 'use strict; use Safe; use utf8;'

   real    0m0.053s
   user    0m0.016s
   sys     0m0.007s

cheers

andrew






So, how should I optimize it?

2008/11/16 Andrew Dunstan <andrew@dunslane.net>:
>
>
> I suspect much of this time is taken in loading perl modules, (strict, Safe,
> utf8, SPI among others), not in anything very special that we do.:
>
>   andrew@constanza plperl]$ time perl -e ';'
>
>   real    0m0.004s
>   user    0m0.003s
>   sys     0m0.001s
>   [andrew@constanza plperl]$ time perl -e 'use strict; use Safe;use utf8;'
>
>   real    0m0.053s
>   user    0m0.016s
>   sys     0m0.007s
>
> cheers
>
> andrew
>
>
>
>
>


Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> ... which leads to the conclusion that those random little startup things
>> plperl does are just unbelievably expensive.

> I suspect much of this time is taken in loading perl modules, (strict, 
> Safe, utf8, SPI among others), not in anything very special that we do.:

I did some more poking and determined that on my machine, it takes about
14 msec to do plperl_init_interp and about 20 msec to do
plperl_safe_init.

Most of the former time (over 9 msec) is going into the perl_parse call;
I'm not entirely sure why it's so expensive.  Taking out the SPI module
initialization does *not* save anything worth noticing.  I tried
trimming various bits of the PERLBOOT script and the only thing that
made a dent was removing "use vars qw(%_SHARED);", but that was still
not much (about 5 msec).

In plperl_safe_init, it's entirely clear that "require Safe" is the
expensive part; I can duplicate a comparable time expenditure when doing
that from the command line.

So about the only real answer is going to be preloading.  It seems worth
considering that on machines where can_run_two is true, we should just
go ahead and initialize both interps at _PG_init time, so as to allow
the "require Safe" overhead to be bought back by preloading.  I'm not
sure how to deal with the UTF8 hack though.
        regards, tom lane


"Oleg Serov" <serovov@gmail.com> writes:
> So, how should I optimize it?

The short-term answer seems to be "preload and use plperlu".
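As configuration, that suggestion looks like the sketch below. Note that in the release current at the time, preloading only maps in the plperl shared library; later releases (9.0 and up) added plperl.on_init, which lets preloading also run Perl code, so expensive module loads can happen up front (the helper file name here is hypothetical):

```ini
# postgresql.conf -- load plperl once in the postmaster so every
# backend inherits it via fork()
shared_preload_libraries = '$libdir/plperl'

# PostgreSQL 9.0+ only: run Perl code at interpreter init time,
# pulling in expensive modules before any function is called
#plperl.on_init = 'require "mylibs.pl";'
```

Then define the hot functions with LANGUAGE plperlu to skip the Safe setup that trusted plperl pays for.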
        regards, tom lane


We need to use shared memory for passing one BIGINT value (within a
session, through triggers); can you advise a method to get/set it with the
best performance?

2008/11/16 Tom Lane <tgl@sss.pgh.pa.us>:
> "Oleg Serov" <serovov@gmail.com> writes:
>> So, how should I optimize it?
>
> The short-term answer seems to be "preload and use plperlu".
>
>                        regards, tom lane
>



Tom Lane wrote:
>
> So about the only real answer is going to be preloading.  It seems worth
> considering that on machines where can_run_two is true, we should just
> go ahead and initialize both interps at _PG_init time, so as to allow
> the "require Safe" overhead to be bought back by preloading. 

Even if only one language is defined?

cheers

andrew




Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> So about the only real answer is going to be preloading.  It seems worth
>> considering that on machines where can_run_two is true, we should just
>> go ahead and initialize both interps at _PG_init time, so as to allow
>> the "require Safe" overhead to be bought back by preloading. 

> Even if only one language is defined?

The point here is to do the work at postmaster start time.  You won't
get a chance to find out whether both languages are defined in some
database or other.  (This is the same thing as the point about the
UTF8 hack --- you can't tell if it's needed or not.)
        regards, tom lane



Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Even if only one language is defined?
>
> The point here is to do the work at postmaster start time.  You won't
> get a chance to find out whether both languages are defined in some
> database or other.  (This is the same thing as the point about the
> UTF8 hack --- you can't tell if it's needed or not.)

w.r.t. UTF8, I guess we'll need a way of knowing if we're preloading or 
not, and if so we'd need to skip the calls to GetDatabaseEncoding().

I assume this will all happen in the 8.5 dev cycle (or later).

cheers

andrew


Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> The point here is to do the work at postmaster start time.  You won't
>> get a chance to find out whether both languages are defined in some
>> database or other.  (This is the same thing as the point about the
>> UTF8 hack --- you can't tell if it's needed or not.)

> w.r.t. UTF8, I guess we'll need a way of knowing if we're preloading or 
> not, and if so we'd need to skip the calls to GetDatabaseEncoding().

If you mean that you want to still do the UTF8 hack at backend start
time, I think you might as well not bother with any of this.  That hack
in itself is slow enough to be 50% of the problem (presumably because
it's loading some modules down in there somewhere).  We need to arrange
to either not do it at all or do it at preload.
        regards, tom lane


On Sun, 2008-11-16 at 23:20 +0300, Oleg Serov wrote:
> We need to use shared memory for passing one BIGINT value (within a
> session, through triggers); can you advise a method to get/set it with the
> best performance?

Have you tried "setval(seq, value)" to set and "select last_value from
seq" to read it?
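A sketch of that sequence-based approach (the sequence name is hypothetical; setval() takes effect immediately and is never rolled back, which is what makes it cheap but also non-transactional):

```sql
-- one-time setup: a sequence used as a cheap per-database bigint slot
CREATE SEQUENCE shared_bigint MINVALUE 0;

-- write the value (non-transactional, visible to other sessions at once)
SELECT setval('shared_bigint', 42);

-- read it back
SELECT last_value FROM shared_bigint;
```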

or just use a table, if concurrency control is important

hannu=# create table shvaltable(id int, value bigint);
CREATE TABLE
Time: 34.704 ms
hannu=# insert into shvaltable values(1,0);
INSERT 0 1
Time: 0.742 ms
hannu=# explain analyse update shvaltable set value=47 where id=1;
                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------
 Seq Scan on shvaltable  (cost=0.00..34.25 rows=10 width=10) (actual time=0.015..0.017 rows=1 loops=1)
   Filter: (id = 1)
 Total runtime: 0.073 ms
(3 rows)

Time: 3.503 ms
hannu=# explain analyse update shvaltable set value=47 where id=1;
                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------
 Seq Scan on shvaltable  (cost=0.00..34.25 rows=10 width=10) (actual time=0.014..0.016 rows=1 loops=1)
   Filter: (id = 1)
 Total runtime: 0.058 ms
(3 rows)

Time: 3.298 ms






Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> w.r.t. UTF8, I guess we'll need a way of knowing if we're preloading or
>> not, and if so we'd need to skip the calls to GetDatabaseEncoding().
>
> If you mean that you want to still do the UTF8 hack at backend start
> time, I think you might as well not bother with any of this.  That hack
> in itself is slow enough to be 50% of the problem (presumably because
> it's loading some modules down in there somewhere).  We need to arrange
> to either not do it at all or do it at preload.


I don't mean that we should do it at backend start even if we are 
preloading. But you're not intending to make preloading compulsory, are 
you?

The whole cost of loading that has been complained about is 1/10 of a 
second per backend (or maybe more if you use both plperl and plperlu). 
I'm fine with providing a preloading solution, but plenty of us who use 
plperl from time to time are not disturbed by the current cost, 
especially when it's used in web apps that use connection pooling, where 
it disappears entirely in the noise. 

Does preloading work at all in the EXEC_BACKEND case?

cheers

andrew



Hannu Krosing wrote:
> On Sun, 2008-11-16 at 23:20 +0300, Oleg Serov wrote:
>   
>> We need to use shared memory for passing one BIGINT value (within a
>> session, through triggers); can you advise a method to get/set it with the
>> best performance?
>>     
>
> have you tried "setval(seq, value)" to set and "select last_value from
> seq" to read it.
>
> or just use a table, if concurrency control is important
>   

Neither of these would be the same as using plperl's %_SHARED, which I 
gather is what is being done. But using a temp table would be.
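For reference, the %_SHARED usage I'm assuming looks something like this (function names are hypothetical); %_SHARED lives inside one backend's Perl interpreter, so it is per-connection state, not cross-backend shared memory:

```sql
-- stash a bigint in plperl's per-backend %_SHARED hash
CREATE OR REPLACE FUNCTION set_shared_val(bigint) RETURNS void AS $$
    $_SHARED{val} = $_[0];
$$ LANGUAGE plperl;

-- read it back later in the same connection (e.g. from a trigger)
CREATE OR REPLACE FUNCTION get_shared_val() RETURNS bigint AS $$
    return $_SHARED{val};
$$ LANGUAGE plperl;
```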

cheers

andrew




Andrew Dunstan <andrew@dunslane.net> writes:
> I don't mean that we should do it at backend start even if we are 
> preloading. But you're not intending to make preloading compulsory, are 
> you?

No, certainly not.  But people who are doing preloading should get as
much benefit out of doing so as possible.

> Does preloading work at all in the EXEC_BACKEND case?

No.
        regards, tom lane


> The point here is to do the work at postmaster start time.

What happens if I upgrade one of the perl modules that is getting
preloaded here?  It seems this change will require the entire database
to be shut down to pick up the changes, vs. just needing to terminate
individual backends.

...Robert


"Robert Haas" <robertmhaas@gmail.com> writes:
>> The point here is to do the work at postmaster start time.

> What happens if I upgrade one of the perl modules that is getting
> preloaded here?  It seems this change will require the entire database
> to be shut down to pick up the changes, vs. just needing to terminate
> individual backends.

Well, that would be one of the tradeoffs to consider when deciding
to preload plperl.
        regards, tom lane


On Mon, 2008-11-17 at 12:37 -0500, Andrew Dunstan wrote:
> Neither of these would be the same as using plperl's %_SHARED, which I 
> gather is what is being done.

He asked for a method to get/set a shared value "with best performance"
and updating a table row seems about twice as fast as calling a plperl
function.

>  But using a temp table would be.

You can't access a temp table from several sessions, which is what I think
they want.


-- 
------------------------------------------
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability   Services, Consulting and Training




Hannu Krosing wrote:
>>
>> Neither of these would be the same as using plperl's %_SHARED, which I 
>> gather is what is being done.
>>     
>
> He asked for a method to get/set a shared value "with best performance"
> and updating a table row seems about twice as fast as calling a plperl
> function.
>
>   
>>  But using a temp table would be.
>>     
>
> You can't access temp table from several sessions, which is what I think
> they want.
>
>   

Well, neither is %_SHARED shared across backends, so if that's what they 
want I'm wondering what exactly they are doing. Maybe using plperlu to 
talk to some shared memory, or memcached, or some such.

cheers

andrew



Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> I don't mean that we should do it at backend start even if we are 
>> preloading. But you're not intending to make preloading compulsory, are 
>> you?
>>     
>
> No, certainly not.  But people who are doing preloading should get as
> much benefit out of doing so as possible.

OK, I think we're in furious agreement.

cheers

andrew