Thread: SQL functions vs. PL/PgSQL functions

SQL functions vs. PL/PgSQL functions

From

"Reuven M. Lerner"

Date:

13 October 2010, 04:40:11

Hi, everyone.  I'm working with a client to try to optimize their use of
PostgreSQL.  They're running 8.3 on a Windows platform, packaged as part
of a physical product that is delivered to customers.

We're planning to upgrade to 9.0 at some point in the coming months, but
this question is relevant for 8.3 (and perhaps beyond).

All of the database-related logic for this application is in server-side
functions, written in PL/PgSQL.  That is, the application never issues a
SELECT or INSERT; rather, it invokes a function with parameters, and the
function handles the query.   It's not unusual for a function to invoke
one or more other PL/PgSQL functions as part of its execution.

Since many of these PL/PgSQL functions are just acting as wrappers around
queries, I thought that it would be a cheap speedup for us to change some
of them to SQL functions, rather than PL/PgSQL.  After all, PL/PgSQL is (I
thought) interpreted, whereas SQL functions can be inlined and handled
directly by the optimizer and such.

We made the change to one or two functions, and were rather surprised to
see the performance drop by quite a bit.

My question is whether this is somehow to be expected.  Under what
conditions will SQL functions be slower than PL/PgSQL functions?  Is there
a heuristic that I can/should use to know this in advance?  Does it matter
if the SELECT being executed operates against a table, or a PL/PgSQL
function?

Thanks in advance for any insights everyone can offer.

Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

Re: SQL functions vs. PL/PgSQL functions

From

Craig Ringer

Date:

13 October 2010, 05:12:09

On 13/10/2010 3:30 PM, Reuven M. Lerner wrote:

> My question is whether this is somehow to be expected.  Under what
> conditions will SQL functions be slower than PL/PgSQL functions?

The main cases I can think of:

- Where the SQL function is inlined (PL/PgSQL functions can't be
inlined, some SQL functions can) and the inlining turns out to be a
performance loss rather than a gain.

- Where the PL/PgSQL function was constructing queries dynamically for
EXECUTE ... USING, so each query contained its parameters directly. If
converted to an SQL function (or a PL/PgSQL function using SELECT /
PERFORM instead of EXECUTE ... USING) the planner will make more generic
choices because it doesn't have stats on specific parameter values.
These choices are sometimes not all that great.

Beyond that, I'd have to wait to hear from someone who has more real
knowledge than my hand-waving can provide.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

Re: SQL functions vs. PL/PgSQL functions

From

Merlin Moncure

Date:

13 October 2010, 10:50:43

On Wed, Oct 13, 2010 at 3:30 AM, Reuven M. Lerner <reuven@lerner.co.il> wrote:
> Hi, everyone.  I'm working with a client to try to optimize their use of
> PostgreSQL.  They're running 8.3 on a Windows platform, packaged as part
> of a physical product that is delivered to customers.
>
> We're planning to upgrade to 9.0 at some point in the coming months, but
> this question is relevant for 8.3 (and perhaps beyond).
>
> All of the database-related logic for this application is in server-side
> functions, written in PL/PgSQL.  That is, the application never issues a
> SELECT or INSERT; rather, it invokes a function with parameters, and the
> function handles the query.   It's not unusual for a function to invoke
> one or more other PL/PgSQL functions as part of its execution.
>
> Since many of these PL/PgSQL functions are just acting as wrappers around
> queries, I thought that it would be a cheap speedup for us to change some
> of them to SQL functions, rather than PL/PgSQL.  After all, PL/PgSQL is (I
> thought) interpreted, whereas SQL functions can be inlined and handled
> directly by the optimizer and such.
>
> We made the change to one or two functions, and were rather surprised to
> see the performance drop by quite a bit.
>
> My question is whether this is somehow to be expected.  Under what
> conditions will SQL functions be slower than PL/PgSQL functions?  Is there
> a heuristic that I can/should use to know this in advance?  Does it matter
> if the SELECT being executed operates against a table, or a PL/PgSQL
> function?
>
> Thanks in advance for any insights everyone can offer.

*) SQL functions require you to use $n notation for input arguments vs
the argument name.
*) SQL functions are fairly transparent to planning/execution. They
are re-planned every time they are run (as are views)
*) simple SQL functions can be inlined, allowing for much smarter
plans where they are called (especially if they are immutable/stable)
*) SQL functions are much more forcefully validated when created.
This is of course very nice, but can occasionally be a pain, if you
want the function to apply to a search path other than the default
search path.  This forces me to disable body checking in particular
cases.
*) In the not so old days, SQL functions could be called in more
conexts (select func() vs select * from func()). This is now changed
though.
*) SQL returning setof functions, can send RETURNING from
insert/update to the output of the function.  This is the ONLY way to
do this at present (until we get wCTE) w/o involving the client.

*) plpgsql functions are completely planned and that plan is held for
the duration of the session, or until a invalidation event occurs
(statistics driven, table dropping, etc).  This adds overhead to first
call but reduces overhead in subsequent calls since you don't have to
re-plan.  This also means you can't float the function over multiple
search paths on the same connection (EVER, even if you DISCARD).  This
also means you have to be aware of temp table interactions w/plans if
you are concerned about performance.
*) plpgsql allows dynamic execution (can use to get around above),
specific variable names, sane error handling, and all kinds of other
wonderful things too numerous to mention.
*) plpgsql simple expressions (like n:=n+1) can bypass SPI, and
therefore run pretty quickly.

both sql and plpgsql functions create a mvcc snapshot as soon as the
function is entered.  This can and will cause headaches if you are
writing highly concurrent systems utilizing serializable transactions.
(this is one of the biggest annoyances with a 100% pl interface to
your db).

when you make the jump to 9.0, you might want to check out libpqtypes
if you are writing your client in C.  it will greatly easy sending
complex data to/from the database to receiving functions.  certain
other db interfaces can also do this, for example python has a very
good database driver for postgres.

merlin

Re: SQL functions vs. PL/PgSQL functions

From

Tom Lane

Date:

13 October 2010, 11:14:59

"Reuven M. Lerner" <reuven@lerner.co.il> writes:
> All of the database-related logic for this application is in server-side
> functions, written in PL/PgSQL.  That is, the application never issues a
> SELECT or INSERT; rather, it invokes a function with parameters, and the
> function handles the query.   It's not unusual for a function to invoke
> one or more other PL/PgSQL functions as part of its execution.

> Since many of these PL/PgSQL functions are just acting as wrappers around
> queries, I thought that it would be a cheap speedup for us to change some
> of them to SQL functions, rather than PL/PgSQL.  After all, PL/PgSQL is (I
> thought) interpreted, whereas SQL functions can be inlined and handled
> directly by the optimizer and such.

> We made the change to one or two functions, and were rather surprised to
> see the performance drop by quite a bit.

> My question is whether this is somehow to be expected.

It's not particularly surprising, especially not if your past
development has tended to tune the app so that plpgsql works well.

In the first place, SQL operations issued in plpgsql aren't somehow
"interpreted" when everything else is "compiled".  It's the same
execution engine.  It would be fair to speak of control logic in
plpgsql as being interpreted; but since SQL functions don't have any
ability to include control logic at all, you're not going to be moving
anything of that description over.  Besides, the control logic usually
takes next to no time compared to the SQL operations.

The reason that plpgsql-issued queries are sometimes slower than queries
executed directly is that plpgsql parameterizes the queries according
to whatever plpgsql variables/parameters they use, and sometimes you get
a worse plan if the planner can't see the exact values of particular
variables used in a query.

The reason plpgsql does that is that it saves the plans for individual
SQL queries within a function for the life of the session.  SQL
functions involve no such state --- either they get inlined into the
calling query, in which case they have to be planned when that query is,
or else they are planned on-the-fly at beginning of execution.  So your
change has definitely de-optimized things in the sense of introducing
more planning work.

Now you could have seen a win anyway, if plpgsql's parameterized
query plans were sufficiently inefficient that planning on-the-fly
with actual variable values would beat them out.  But that's evidently
not the case for (most of?) your usage patterns.  In places where it is
the case, the usual advice is to fix it by using EXECUTE, not by giving
up plpgsql's ability to cache plans everywhere else.

It's possible that at some point we'll try to introduce plan caching
for non-inlined SQL functions.  But at best this would put them on a
par with plpgsql speed-wise.  Really the only place where a SQL function
will be a win for performance is if it can be inlined into the calling
query, and that's pretty much never the case in the usage pattern you're
talking about.  (The sort of inlining we're talking about is more or
less textual substitution, and you can't insert an INSERT/UPDATE/DELETE
in a SELECT.)

            regards, tom lane

Re: SQL functions vs. PL/PgSQL functions

From

"Reuven M. Lerner"

Date:

14 October 2010, 01:59:43

  Wow.  Thanks so much to all of you for the thoughtful and helpful
responses!

Reuven

Re: SQL functions vs. PL/PgSQL functions

From

Merlin Moncure

Date:

14 October 2010, 11:30:08

On Wed, Oct 13, 2010 at 10:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It's possible that at some point we'll try to introduce plan caching
> for non-inlined SQL functions.

hm, I think the search_path/function plan issue would have to be dealt
with before doing this -- a while back IIRC you suggested function
plans might be organized around search_path setting at plan time, or
this would break a fair amount of code (for example, mine) :-).

merlin

Re: SQL functions vs. PL/PgSQL functions

From

Tom Lane

Date:

14 October 2010, 11:43:12

Merlin Moncure <mmoncure@gmail.com> writes:
> On Wed, Oct 13, 2010 at 10:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It's possible that at some point we'll try to introduce plan caching
>> for non-inlined SQL functions.

> hm, I think the search_path/function plan issue would have to be dealt
> with before doing this --

Yeah, perhaps.  There doesn't seem to be any groundswell of demand for
doing anything about that anyway.  Particularly since plpgsql is now
installed by default, a reasonable answer to "I'd like the system to
cache plans for this" is now "so write it in plpgsql instead".

            regards, tom lane