Re: simple functions, huge overhead, no cache - Mailing list pgsql-general

From Craig Ringer
Subject Re: simple functions, huge overhead, no cache
Date
Msg-id 4C381C2A.4000301@postnewspapers.com.au
Whole thread Raw
In response to simple functions, huge overhead, no cache  (Josip Rodin <joy@entuzijast.net>)
Responses Re: simple functions, huge overhead, no cache  (Scott Ribe <scott_ribe@killerbytes.com>)
Re: simple functions, huge overhead, no cache  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On 09/07/10 22:26, Josip Rodin wrote:

> db=# explain analyze select issuperuser(id) from users;
>                                                  QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
>  Seq Scan on users  (cost=0.00..6624.00 rows=23000 width=4) (actual time=0.186..644.488 rows=23000 loops=1)
>  Total runtime: 664.486 ms
> (2 rows)
>
> db=# explain analyze select userInGroup(id, 1000) from users;
>                                                  QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
>  Seq Scan on users  (cost=0.00..6624.00 rows=23000 width=4) (actual time=0.125..417.948 rows=23000 loops=1)
>  Total runtime: 437.594 ms
> (2 rows)

That's within expected bounds for PL/PgSQL function overhead.

regress=> CREATE OR REPLACE FUNCTION noop(int) returns int as $$
begin
return $1;
end;
$$ language 'plpgsql';

regress=> explain analyze select x from generate_series(0,23000) as x;
                                                       QUERY PLAN


-------------------------------------------------------------------------------------------------------------------------
 Function Scan on generate_series x  (cost=0.00..12.50 rows=1000
width=4) (actual time=9.990..44.339 rows=23001 loops=1)
 Total runtime: 78.061 ms
(2 rows)

regress=> explain analyze select noop(x) from generate_series(0,23000) as x;
                                                         QUERY PLAN


----------------------------------------------------------------------------------------------------------------------------
 Function Scan on generate_series x  (cost=0.00..262.50 rows=1000
width=4) (actual time=10.846..104.445 rows=23001 loops=1)
 Total runtime: 139.622 ms
(2 rows)



The comparison to unnesting two layers of PL/PgSQL function calls is
harsher, but then there *are* two layers of expensive calls around very
simple expressions.

Your problem boils down to the fact that PL/PgSQL function calls are
expensive. PL/PgSQL is good for complex work, but bad for wrapping up
simple expressions because setting up / tearing down the function call
context is so expensive.

For such simple expressions, you should use 'SQL' functions. These can
often be inlined to allow the query planner to avoid call overheads
entirely, and are WAY cheaper even if they can't be inlined. They're
less flexible, but much faster.

So you might write:

create or replace function
usercandoonobject(integer,integer,character,integer) returns boolean as $$
    select case  when isSuperuser(p_user_id) then true
                 else userCanDoOnObjectCheckGod($1, $2, $3, $4) end;
$$ language 'sql' stable;

create or replace function
issuperuser(integer) returns boolean as $$
    SELECT userInGroup($1, 1000)
$$ language 'sql' stable;

... and so on.

--
Craig Ringer

pgsql-general by date:

Previous
From: Felipe de Jesús Molina Bravo
Date:
Subject: Re: pl-perl for 64 bits in Solaris 9
Next
From: "Daniel Verite"
Date:
Subject: Re: weird empty return from select problem; periodically get no data returned - could it be a network issue?