Thread: plpgsql functions organisation

plpgsql functions organisation

From

Yves Dorfsman

Date:

02 May 2015, 16:54:04

I find my plpgsql functions becomes unreadable very quickly. I want to break
them up in smaller functions.

What is the best way to organised them?
Is there any way to define functions inside functions?
When I list functions in psql, I can see them all at the same level, is there
any way to organise them in packages like with python and other languages, so
the smaller functions are hidden away in a package/directory?


Thanks.

--
http://yves.zioup.com
gpg: 4096R/32B0F416

Re: plpgsql functions organisation

From

Melvin Davidson

Date:

02 May 2015, 17:12:15

AFAIK, you cannot "package" functions in PostgreSQL, but it is possible to
call a function from within a function.

That being said, I would seriously look at how and why you are writing your functions
as functions that call other functions are not very efficient.

Also note that PostgreSQL allows you define functions using Python, so that might be a possible compromise.

On Sat, May 2, 2015 at 12:53 PM, Yves Dorfsman <yves@zioup.com> wrote:

I find my plpgsql functions becomes unreadable very quickly. I want to break
them up in smaller functions.

What is the best way to organised them?
Is there any way to define functions inside functions?
When I list functions in psql, I can see them all at the same level, is there
any way to organise them in packages like with python and other languages, so
the smaller functions are hidden away in a package/directory?

Thanks.

--
http://yves.zioup.com
gpg: 4096R/32B0F416

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Re: plpgsql functions organisation

From

Adrian Klaver

Date:

02 May 2015, 19:32:49

On 05/02/2015 09:53 AM, Yves Dorfsman wrote:
>
> I find my plpgsql functions becomes unreadable very quickly. I want to break
> them up in smaller functions.
>
> What is the best way to organised them?
> Is there any way to define functions inside functions?
> When I list functions in psql, I can see them all at the same level, is there
> any way to organise them in packages like with python and other languages, so
> the smaller functions are hidden away in a package/directory?

The  only thing I can think of is to use SCHEMAs;

http://www.postgresql.org/docs/9.4/interactive/sql-createschema.html

>
>
> Thanks.
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: plpgsql functions organisation

From

Yves Dorfsman

Date:

02 May 2015, 19:34:38

On 2015-05-02 11:12, Melvin Davidson wrote:
> AFAIK, you cannot "package" functions in  PostgreSQL, but it is possible to
> call a function from within a function.
>
> That being said, I would seriously look at how and why you are writing your
> functions
> as functions that call other functions are not very efficient.

Simply to make long procedures easier to read and follow the logic.

>
> Also note that PostgreSQL allows you define functions using Python, so that
> might be a possible compromise.


Thanks.

--
http://yves.zioup.com
gpg: 4096R/32B0F416

Re: plpgsql functions organisation

From

Adrian Klaver

Date:

02 May 2015, 20:05:08

On 05/02/2015 10:12 AM, Melvin Davidson wrote:
> AFAIK, you cannot "package" functions in  PostgreSQL, but it is possible to
> call a function from within a function.
>
> That being said, I would seriously look at how and why you are writing
> your functions
> as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out
'units of work' so they can be combined as needed. Part of that is using
existing functions in new functions/classes. In fact in the Postgres
source I see this in many places. Now it is entirely possible I missed a
memo, so I am open to a more detailed explanation of the inefficiencies
involved.

>
> Also note that PostgreSQL allows you define functions using Python, so
> that might be a possible compromise.
>
> On Sat, May 2, 2015 at 12:53 PM, Yves Dorfsman <yves@zioup.com
> <mailto:yves@zioup.com>> wrote:
>
>
>     I find my plpgsql functions becomes unreadable very quickly. I want
>     to break
>     them up in smaller functions.
>
>     What is the best way to organised them?
>     Is there any way to define functions inside functions?
>     When I list functions in psql, I can see them all at the same level,
>     is there
>     any way to organise them in packages like with python and other
>     languages, so
>     the smaller functions are hidden away in a package/directory?
>
>
>     Thanks.
>
>     --
>     http://yves.zioup.com
>     gpg: 4096R/32B0F416
>
>
>
>     --
>     Sent via pgsql-general mailing list (pgsql-general@postgresql.org
>     <mailto:pgsql-general@postgresql.org>)
>     To make changes to your subscription:
>     http://www.postgresql.org/mailpref/pgsql-general
>
>
>
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: plpgsql functions organisation

From

Jeff Janes

Date:

02 May 2015, 21:07:39

On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 05/02/2015 10:12 AM, Melvin Davidson wrote:
AFAIK, you cannot "package" functions in PostgreSQL, but it is possible to
call a function from within a function.

That being said, I would seriously look at how and why you are writing
your functions
as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out 'units of work' so they can be combined as needed. Part of that is using existing functions in new functions/classes. In fact in the Postgres source I see this in many places. Now it is entirely possible I missed a memo, so I am open to a more detailed explanation of the inefficiencies involved.

The Postgres source is written in C, not in plpgsql. C has a good optimizing compiler and plpgsql doesn't.

Cheers,

Jeff

Re: plpgsql functions organisation

From

Bill Moran

Date:

02 May 2015, 21:17:37

On Sat, 2 May 2015 14:07:31 -0700
Jeff Janes <jeff.janes@gmail.com> wrote:

> On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com>
> wrote:
>
> > On 05/02/2015 10:12 AM, Melvin Davidson wrote:
> >
> >> AFAIK, you cannot "package" functions in  PostgreSQL, but it is possible
> >> to
> >> call a function from within a function.
> >>
> >> That being said, I would seriously look at how and why you are writing
> >> your functions
> >> as functions that call other functions are not very efficient.
> >>
> >
> > I am not following. That is what packaging is about, separating out 'units
> > of work' so they can be combined as needed. Part of that is using existing
> > functions in new functions/classes. In fact in the Postgres source I see
> > this in many places. Now it is entirely possible I missed a memo, so I am
> > open to a more detailed explanation of the inefficiencies involved.
> >
>
> The Postgres source is written in C, not in plpgsql.  C has a good
> optimizing compiler and plpgsql doesn't.

Maybe that's a roundabout way of saying that if your functions are
complex enough to require calling "sub-functions" they might be
justifying being writting in C?

--
Bill Moran

Re: plpgsql functions organisation

From

Adrian Klaver

Date:

02 May 2015, 22:06:32

On 05/02/2015 02:07 PM, Jeff Janes wrote:
> On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 05/02/2015 10:12 AM, Melvin Davidson wrote:
>
>         AFAIK, you cannot "package" functions in  PostgreSQL, but it is
>         possible to
>         call a function from within a function.
>
>         That being said, I would seriously look at how and why you are
>         writing
>         your functions
>         as functions that call other functions are not very efficient.
>
>
>     I am not following. That is what packaging is about, separating out
>     'units of work' so they can be combined as needed. Part of that is
>     using existing functions in new functions/classes. In fact in the
>     Postgres source I see this in many places. Now it is entirely
>     possible I missed a memo, so I am open to a more detailed
>     explanation of the inefficiencies involved.
>
>
> The Postgres source is written in C, not in plpgsql.  C has a good
> optimizing compiler and plpgsql doesn't.

Does this actually matter?  I am a biologist that backed into computing,
so I realize I am weak on the fundamentals. Still the scientist in me
wants data backing assertions. As I understand it plpgsql works close to
the server and is optimized to do so.  I know writing in C would be a
better solution. Still is calling plpgsql functions inside plpgsql
really a bad thing when just considering plpgsql?



> Cheers,
>
> Jeff


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: plpgsql functions organisation

From

Melvin Davidson

Date:

02 May 2015, 22:11:00

Further to the point of saying functions are ineffiencent, consider the fact that as of the current version of PostgreSQL, plpgsql

functions cannot be pre-optimized. So when they are referenced in a SQL statement, PostgreSQL (optimizer) has load the

function from the catalogs, which involves overhead. If the function calls another function, then the process has to be repeated, which involves additional overhead. Ergo, that is not the most efficient way of doing things.

On Sat, May 2, 2015 at 5:17 PM, Bill Moran <wmoran@potentialtech.com> wrote:

On Sat, 2 May 2015 14:07:31 -0700
Jeff Janes <jeff.janes@gmail.com> wrote:

> On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com>
> wrote:
>
> > On 05/02/2015 10:12 AM, Melvin Davidson wrote:
> >
> >> AFAIK, you cannot "package" functions in PostgreSQL, but it is possible
> >> to
> >> call a function from within a function.
> >>
> >> That being said, I would seriously look at how and why you are writing
> >> your functions
> >> as functions that call other functions are not very efficient.
> >>
> >
> > I am not following. That is what packaging is about, separating out 'units
> > of work' so they can be combined as needed. Part of that is using existing
> > functions in new functions/classes. In fact in the Postgres source I see
> > this in many places. Now it is entirely possible I missed a memo, so I am
> > open to a more detailed explanation of the inefficiencies involved.
> >
>
> The Postgres source is written in C, not in plpgsql. C has a good
> optimizing compiler and plpgsql doesn't.

Maybe that's a roundabout way of saying that if your functions are
complex enough to require calling "sub-functions" they might be
justifying being writting in C?

--
Bill Moran

Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Re: plpgsql functions organisation

From

Bill Moran

Date:

02 May 2015, 22:28:17

On Sat, 02 May 2015 15:06:24 -0700
Adrian Klaver <adrian.klaver@aklaver.com> wrote:

> On 05/02/2015 02:07 PM, Jeff Janes wrote:
> > On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com
> > <mailto:adrian.klaver@aklaver.com>> wrote:
> >
> >     On 05/02/2015 10:12 AM, Melvin Davidson wrote:
> >
> >         AFAIK, you cannot "package" functions in  PostgreSQL, but it is
> >         possible to
> >         call a function from within a function.
> >
> >         That being said, I would seriously look at how and why you are
> >         writing
> >         your functions
> >         as functions that call other functions are not very efficient.
> >
> >
> >     I am not following. That is what packaging is about, separating out
> >     'units of work' so they can be combined as needed. Part of that is
> >     using existing functions in new functions/classes. In fact in the
> >     Postgres source I see this in many places. Now it is entirely
> >     possible I missed a memo, so I am open to a more detailed
> >     explanation of the inefficiencies involved.
> >
> >
> > The Postgres source is written in C, not in plpgsql.  C has a good
> > optimizing compiler and plpgsql doesn't.
>
> Does this actually matter?  I am a biologist that backed into computing,
> so I realize I am weak on the fundamentals. Still the scientist in me
> wants data backing assertions. As I understand it plpgsql works close to
> the server and is optimized to do so.  I know writing in C would be a
> better solution. Still is calling plpgsql functions inside plpgsql
> really a bad thing when just considering plpgsql?

The answer to that is the same answer to so many other things: it depends.

plpgsql functions are slower than C. They also lack a lot of language
features that C has. That being said, if they're meeting your needs, then
don't worry about it. plpgsql is around because for most people, it works
well enough. There are certainly cases when you want to create very complex
logic in the database and plpgsql is liable to make that difficult. But
there are a lot of cases where having to manage pointers and a build
environment and all the things that go with C aren't justified, because
plpgsql has none of that complexity. There are advantages both ways.

The beauty of PostgreSQL is that you have both available and you
can choose whichever is best for your situation.

--
Bill Moran

Re: plpgsql functions organisation

From

Adrian Klaver

Date:

02 May 2015, 23:36:32

On 05/02/2015 03:10 PM, Melvin Davidson wrote:
> Further to the point of saying functions are ineffiencent, consider the
> fact that as of the current version of PostgreSQL, plpgsql
> functions cannot be pre-optimized. So when they are referenced in a SQL
> statement, PostgreSQL (optimizer) has load the
> function from the catalogs, which involves overhead. If the function
> calls another function, then the process has to be repeated, which
> involves additional overhead. Ergo, that is not the most efficient way
> of doing things.
>

Yeah, I see the explanation here:

http://www.postgresql.org/docs/9.4/interactive/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING

Which has this:

"As each expression and SQL command is first executed in the function,
the PL/pgSQL interpreter parses and analyzes the command to create a
prepared statement, using the SPI manager's SPI_prepare function.
Subsequent visits to that expression or command reuse the prepared
statement. Thus, a function with conditional code paths that are seldom
visited will never incur the overhead of analyzing those commands that
are never executed within the current session"

So it still not clear to me whether a monolithic function is better or
worse than one that calls other functions as needed. Probably over
thinking this, but it would make a good experiment. Just have to figure
out a realistic scenario to test. Thanks for the input.

--
Adrian Klaver
adrian.klaver@aklaver.com

Re: plpgsql functions organisation

From

Adrian Klaver

Date:

02 May 2015, 23:37:47

On 05/02/2015 03:28 PM, Bill Moran wrote:
> On Sat, 02 May 2015 15:06:24 -0700
> Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
>> On 05/02/2015 02:07 PM, Jeff Janes wrote:
>>> On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com
>>> <mailto:adrian.klaver@aklaver.com>> wrote:
>>>
>>>      On 05/02/2015 10:12 AM, Melvin Davidson wrote:
>>>
>>>          AFAIK, you cannot "package" functions in  PostgreSQL, but it is
>>>          possible to
>>>          call a function from within a function.
>>>
>>>          That being said, I would seriously look at how and why you are
>>>          writing
>>>          your functions
>>>          as functions that call other functions are not very efficient.
>>>
>>>
>>>      I am not following. That is what packaging is about, separating out
>>>      'units of work' so they can be combined as needed. Part of that is
>>>      using existing functions in new functions/classes. In fact in the
>>>      Postgres source I see this in many places. Now it is entirely
>>>      possible I missed a memo, so I am open to a more detailed
>>>      explanation of the inefficiencies involved.
>>>
>>>
>>> The Postgres source is written in C, not in plpgsql.  C has a good
>>> optimizing compiler and plpgsql doesn't.
>>
>> Does this actually matter?  I am a biologist that backed into computing,
>> so I realize I am weak on the fundamentals. Still the scientist in me
>> wants data backing assertions. As I understand it plpgsql works close to
>> the server and is optimized to do so.  I know writing in C would be a
>> better solution. Still is calling plpgsql functions inside plpgsql
>> really a bad thing when just considering plpgsql?
>
> The answer to that is the same answer to so many other things: it depends.
>
> plpgsql functions are slower than C. They also lack a lot of language
> features that C has. That being said, if they're meeting your needs, then
> don't worry about it. plpgsql is around because for most people, it works
> well enough. There are certainly cases when you want to create very complex
> logic in the database and plpgsql is liable to make that difficult. But
> there are a lot of cases where having to manage pointers and a build
> environment and all the things that go with C aren't justified, because
> plpgsql has none of that complexity. There are advantages both ways.
>
> The beauty of PostgreSQL is that you have both available and you
> can choose whichever is best for your situation.

Agreed, though in my case I drop into plpythonu when I want more complex
solutions.

>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: plpgsql functions organisation

From

Melvin Davidson

Date:

03 May 2015, 00:56:37

OK, Here is a simple example that shows the difference between using a self contained function and
one that calls sub functions.

After loading all the functions below, repeat each of the EXPLAIN statements a few times and note that
callsubs takes almost TWICE as long to execute as nosub.

CREATE OR REPLACE FUNCTION nosub(text)
RETURNS void AS
$BODY$

DECLARE

    p_in_str    ALIAS FOR $1;

BEGIN
    IF LENGTH(p_in_str) <= 6
        THEN RAISE NOTICE 'Hi %', p_in_str;
    ELSE
        RAISE NOTICE 'Hello %', p_in_str;
    END IF;

    RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION nosub(text)
OWNER TO postgres;

CREATE OR REPLACE FUNCTION called1(text)
RETURNS void AS
$BODY$

DECLARE

    p_in_str1    ALIAS FOR $1;

BEGIN
    RAISE NOTICE 'Hi %', p_in_str1;

    RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION called1(text)
OWNER TO postgres;

CREATE OR REPLACE FUNCTION called2(text)
RETURNS void AS
$BODY$

DECLARE

    p_in_str2    ALIAS FOR $1;

BEGIN
    RAISE NOTICE 'Hello %', p_in_str2;

    RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION called2(text)
OWNER TO postgres;

CREATE OR REPLACE FUNCTION callsubs(text)
RETURNS void AS
$BODY$

DECLARE

    p_in_str    ALIAS FOR $1;

BEGIN
    IF LENGTH(p_in_str) <= 6
        THEN PERFORM CALLED1(p_in_str);
    ELSE
        PERFORM CALLED2(p_in_str);
    END IF;

    RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION callsubs(text)
OWNER TO postgres;

EXPLAIN ANALYZE SELECT nosub('melvin');

EXPLAIN ANALYZE SELECT callsubs('melvin');

On Sat, May 2, 2015 at 7:37 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 05/02/2015 03:28 PM, Bill Moran wrote:
On Sat, 02 May 2015 15:06:24 -0700
Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 05/02/2015 02:07 PM, Jeff Janes wrote:
On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 05/02/2015 10:12 AM, Melvin Davidson wrote:

AFAIK, you cannot "package" functions in PostgreSQL, but it is
possible to
call a function from within a function.

That being said, I would seriously look at how and why you are
writing
your functions
as functions that call other functions are not very efficient.

I am not following. That is what packaging is about, separating out
'units of work' so they can be combined as needed. Part of that is
using existing functions in new functions/classes. In fact in the
Postgres source I see this in many places. Now it is entirely
possible I missed a memo, so I am open to a more detailed
explanation of the inefficiencies involved.

The Postgres source is written in C, not in plpgsql. C has a good
optimizing compiler and plpgsql doesn't.

Does this actually matter? I am a biologist that backed into computing,
so I realize I am weak on the fundamentals. Still the scientist in me
wants data backing assertions. As I understand it plpgsql works close to
the server and is optimized to do so. I know writing in C would be a
better solution. Still is calling plpgsql functions inside plpgsql
really a bad thing when just considering plpgsql?

The answer to that is the same answer to so many other things: it depends.

plpgsql functions are slower than C. They also lack a lot of language
features that C has. That being said, if they're meeting your needs, then
don't worry about it. plpgsql is around because for most people, it works
well enough. There are certainly cases when you want to create very complex
logic in the database and plpgsql is liable to make that difficult. But
there are a lot of cases where having to manage pointers and a build
environment and all the things that go with C aren't justified, because
plpgsql has none of that complexity. There are advantages both ways.

The beauty of PostgreSQL is that you have both available and you
can choose whichever is best for your situation.

Agreed, though in my case I drop into plpythonu when I want more complex solutions.

--
Adrian Klaver
adrian.klaver@aklaver.com

Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Re: plpgsql functions organisation

From

Alban Hertroys

Date:

03 May 2015, 09:26:11

> On 03 May 2015, at 2:56, Melvin Davidson <melvin6925@gmail.com> wrote:
>
> OK, Here is a simple example that shows the difference between using a self contained function  and
> one that calls sub functions.
>
> After loading all the functions below, repeat each of the EXPLAIN statements a few times and note that
> callsubs takes almost TWICE as long to execute as nosub.
>
> CREATE OR REPLACE FUNCTION nosub(text)
>   RETURNS void AS
> $BODY$
...
>     IF LENGTH(p_in_str) <= 6
>         THEN RAISE NOTICE 'Hi %', p_in_str;
>     ELSE
>         RAISE NOTICE 'Hello %', p_in_str;
>     END IF;
>
>     RETURN;
> END;
> $BODY$
…


>  CREATE OR REPLACE FUNCTION called1(text)
>   RETURNS void AS
> $BODY$
...
>     RAISE NOTICE 'Hi %', p_in_str1;
>
>     RETURN;
> END;
> $BODY$
…

>  CREATE OR REPLACE FUNCTION called2(text)
>   RETURNS void AS
> $BODY$
...
>     RAISE NOTICE 'Hello %', p_in_str2;
>
>     RETURN;
> END;
...


That's a rather uninteresting experiment, as all it does is call a function and raise a notice. Relative to what the
functionsdo, the function call itself takes a significant amount of time. No surprise there, you'll see something
similarin any language, even C. All you're showing is that calling a function takes some amount of time > 0. 

In C, a function call needs to look up an address to jump to, in plpgsql the database needs to look up the function
bodyin a table. If the function is small and atomic it often gets called from multiple other functions and is probably
cachedanyway. The main difference between C and plpgsql here is that the latter is an interpreted language, so it does
needto read in the entire function body after a call - which I'd expect to be quite a bit faster with a smaller
(atomic)function body, especially when it hasn't been cached yet. 

So far I haven't been convinced.

An actual use-case where the functions actually do something would be far more interesting. I doubt anybody writes
functionsjust to raise a notice. I expect that in reality most plpgsql functions perform database queries and do
somethingwith the result. In such cases, function call overhead could be significant if the call is done for each
recordin a result set, for example. And even then it's worth considering whether that matters to your situation enough
thatit outweighs the usual benefits of code separation. 


> On Sat, May 2, 2015 at 7:37 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> On 05/02/2015 03:28 PM, Bill Moran wrote:
> On Sat, 02 May 2015 15:06:24 -0700
> Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> On 05/02/2015 02:07 PM, Jeff Janes wrote:
> On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
>      On 05/02/2015 10:12 AM, Melvin Davidson wrote:
>
>          AFAIK, you cannot "package" functions in  PostgreSQL, but it is
>          possible to
>          call a function from within a function.
>
>          That being said, I would seriously look at how and why you are
>          writing
>          your functions
>          as functions that call other functions are not very efficient.
>
>
>      I am not following. That is what packaging is about, separating out
>      'units of work' so they can be combined as needed. Part of that is
>      using existing functions in new functions/classes. In fact in the
>      Postgres source I see this in many places. Now it is entirely
>      possible I missed a memo, so I am open to a more detailed
>      explanation of the inefficiencies involved.
>
>
> The Postgres source is written in C, not in plpgsql.  C has a good
> optimizing compiler and plpgsql doesn't.
>
> Does this actually matter?  I am a biologist that backed into computing,
> so I realize I am weak on the fundamentals. Still the scientist in me
> wants data backing assertions. As I understand it plpgsql works close to
> the server and is optimized to do so.  I know writing in C would be a
> better solution. Still is calling plpgsql functions inside plpgsql
> really a bad thing when just considering plpgsql?
>
> The answer to that is the same answer to so many other things: it depends.
>
> plpgsql functions are slower than C. They also lack a lot of language
> features that C has. That being said, if they're meeting your needs, then
> don't worry about it. plpgsql is around because for most people, it works
> well enough. There are certainly cases when you want to create very complex
> logic in the database and plpgsql is liable to make that difficult. But
> there are a lot of cases where having to manage pointers and a build
> environment and all the things that go with C aren't justified, because
> plpgsql has none of that complexity. There are advantages both ways.
>
> The beauty of PostgreSQL is that you have both available and you
> can choose whichever is best for your situation.
>
> Agreed, though in my case I drop into plpythonu when I want more complex solutions.
>
>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
>
>
> --
> Melvin Davidson
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

Re: plpgsql functions organisation

From

Melvin Davidson

Date:

03 May 2015, 14:14:50

The point was to show that yes, function calls take time, and using sub functions take even more time. I am not about to write an additional more detailed example just to show the same results. If you are in doubt, I respectfully suggest you do your own testing.

On Sun, May 3, 2015 at 5:26 AM, Alban Hertroys <haramrae@gmail.com> wrote:

> On 03 May 2015, at 2:56, Melvin Davidson <melvin6925@gmail.com> wrote:
>
> OK, Here is a simple example that shows the difference between using a self contained function and
> one that calls sub functions.
>
> After loading all the functions below, repeat each of the EXPLAIN statements a few times and note that
> callsubs takes almost TWICE as long to execute as nosub.
>
> CREATE OR REPLACE FUNCTION nosub(text)
> RETURNS void AS
> $BODY$
...
> IF LENGTH(p_in_str) <= 6
> THEN RAISE NOTICE 'Hi %', p_in_str;
> ELSE
> RAISE NOTICE 'Hello %', p_in_str;
> END IF;
>
> RETURN;
> END;
> $BODY$
…

> CREATE OR REPLACE FUNCTION called1(text)
> RETURNS void AS
> $BODY$
...
> RAISE NOTICE 'Hi %', p_in_str1;
>
> RETURN;
> END;
> $BODY$
…

> CREATE OR REPLACE FUNCTION called2(text)
> RETURNS void AS
> $BODY$
...
> RAISE NOTICE 'Hello %', p_in_str2;
>
> RETURN;
> END;
...

That's a rather uninteresting experiment, as all it does is call a function and raise a notice. Relative to what the functions do, the function call itself takes a significant amount of time. No surprise there, you'll see something similar in any language, even C. All you're showing is that calling a function takes some amount of time > 0.

In C, a function call needs to look up an address to jump to, in plpgsql the database needs to look up the function body in a table. If the function is small and atomic it often gets called from multiple other functions and is probably cached anyway. The main difference between C and plpgsql here is that the latter is an interpreted language, so it does need to read in the entire function body after a call - which I'd expect to be quite a bit faster with a smaller (atomic) function body, especially when it hasn't been cached yet.

So far I haven't been convinced.

An actual use-case where the functions actually do something would be far more interesting. I doubt anybody writes functions just to raise a notice. I expect that in reality most plpgsql functions perform database queries and do something with the result. In such cases, function call overhead could be significant if the call is done for each record in a result set, for example. And even then it's worth considering whether that matters to your situation enough that it outweighs the usual benefits of code separation.

> On Sat, May 2, 2015 at 7:37 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> On 05/02/2015 03:28 PM, Bill Moran wrote:
> On Sat, 02 May 2015 15:06:24 -0700
> Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> On 05/02/2015 02:07 PM, Jeff Janes wrote:
> On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
> On 05/02/2015 10:12 AM, Melvin Davidson wrote:
>
> AFAIK, you cannot "package" functions in PostgreSQL, but it is
> possible to
> call a function from within a function.
>
> That being said, I would seriously look at how and why you are
> writing
> your functions
> as functions that call other functions are not very efficient.
>
>
> I am not following. That is what packaging is about, separating out
> 'units of work' so they can be combined as needed. Part of that is
> using existing functions in new functions/classes. In fact in the
> Postgres source I see this in many places. Now it is entirely
> possible I missed a memo, so I am open to a more detailed
> explanation of the inefficiencies involved.
>
>
> The Postgres source is written in C, not in plpgsql. C has a good
> optimizing compiler and plpgsql doesn't.
>
> Does this actually matter? I am a biologist that backed into computing,
> so I realize I am weak on the fundamentals. Still the scientist in me
> wants data backing assertions. As I understand it plpgsql works close to
> the server and is optimized to do so. I know writing in C would be a
> better solution. Still is calling plpgsql functions inside plpgsql
> really a bad thing when just considering plpgsql?
>
> The answer to that is the same answer to so many other things: it depends.
>
> plpgsql functions are slower than C. They also lack a lot of language
> features that C has. That being said, if they're meeting your needs, then
> don't worry about it. plpgsql is around because for most people, it works
> well enough. There are certainly cases when you want to create very complex
> logic in the database and plpgsql is liable to make that difficult. But
> there are a lot of cases where having to manage pointers and a build
> environment and all the things that go with C aren't justified, because
> plpgsql has none of that complexity. There are advantages both ways.
>
> The beauty of PostgreSQL is that you have both available and you
> can choose whichever is best for your situation.
>
> Agreed, though in my case I drop into plpythonu when I want more complex solutions.
>
>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
>
>
> --
> Melvin Davidson
> I reserve the right to fantasize. Whether or not you
> wish to share my fantasy is entirely up to you.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Re: plpgsql functions organisation

From

Adrian Klaver

Date:

03 May 2015, 15:18:01

On 05/03/2015 07:14 AM, Melvin Davidson wrote:
> The point was to show that yes, function calls take time, and using sub
> functions take even more time. I am not about to write an additional
> more detailed example just to show the same results. If you are in
> doubt, I respectfully suggest you do your own testing.

Can't resist a challenge. I took an existing function that calculates an
aggregated attendance count for a student or all enrolled students over
a date period and modified it to call sub functions. There are two sub
functions, one that finds the students enrolled over a period(which by
the way calls another function) and dates range they where enrolled. The
other calculates the aggregate values for each student. The original
function is student_attendance, the modified student_attendance_sub. The
results are below, where the first argument is the student_id(where 0
equals all students). The all students version returns 600 rows, the
single student 16 rows.


hplc=> select * from student_attendance(0, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.204865s
hplc=> select * from student_attendance_sub(0, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.014101s
hplc=> select * from student_attendance(0, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.041182s
hplc=> select * from student_attendance_sub(0, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.011385s
hplc=> select * from student_attendance(0, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.040762s
hplc=> select * from student_attendance_sub(0, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.016506s

hplc=> select * from student_attendance(1, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.00291s
hplc=> select * from student_attendance_sub(1, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.004125s
hplc=> select * from student_attendance(1, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.001907s
hplc=> select * from student_attendance_sub(1, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.003476s
hplc=> select * from student_attendance(1, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.00597s
hplc=> select * from student_attendance_sub(1, '2005-01-17', '2008-01-31');
NOTICE:  Time 00:00:00.003986s

Definite difference in the all students run, probably because one of the
called functions is used in a LOOP and caching applies.


>
> On Sun, May 3, 2015 at 5:26 AM, Alban Hertroys <haramrae@gmail.com
> <mailto:haramrae@gmail.com>> wrote:
>
>
>     > On 03 May 2015, at 2:56, Melvin Davidson <melvin6925@gmail.com <mailto:melvin6925@gmail.com>> wrote:
>     >
>     > OK, Here is a simple example that shows the difference between using a self contained function  and
>     > one that calls sub functions.
>     >
>     > After loading all the functions below, repeat each of the EXPLAIN statements a few times and note that
>     > callsubs takes almost TWICE as long to execute as nosub.
>     >
>     > CREATE OR REPLACE FUNCTION nosub(text)
>     >   RETURNS void AS
>     > $BODY$
>     ...
>     >     IF LENGTH(p_in_str) <= 6
>     >         THEN RAISE NOTICE 'Hi %', p_in_str;
>     >     ELSE
>     >         RAISE NOTICE 'Hello %', p_in_str;
>     >     END IF;
>     >
>     >     RETURN;
>     > END;
>     > $BODY$
>     …
>
>
>     >  CREATE OR REPLACE FUNCTION called1(text)
>     >   RETURNS void AS
>     > $BODY$
>     ...
>     >     RAISE NOTICE 'Hi %', p_in_str1;
>     >
>     >     RETURN;
>     > END;
>     > $BODY$
>     …
>
>     >  CREATE OR REPLACE FUNCTION called2(text)
>     >   RETURNS void AS
>     > $BODY$
>     ...
>     >     RAISE NOTICE 'Hello %', p_in_str2;
>     >
>     >     RETURN;
>     > END;
>     ...
>
>
>     That's a rather uninteresting experiment, as all it does is call a
>     function and raise a notice. Relative to what the functions do, the
>     function call itself takes a significant amount of time. No surprise
>     there, you'll see something similar in any language, even C. All
>     you're showing is that calling a function takes some amount of time > 0.
>
>     In C, a function call needs to look up an address to jump to, in
>     plpgsql the database needs to look up the function body in a table.
>     If the function is small and atomic it often gets called from
>     multiple other functions and is probably cached anyway. The main
>     difference between C and plpgsql here is that the latter is an
>     interpreted language, so it does need to read in the entire function
>     body after a call - which I'd expect to be quite a bit faster with a
>     smaller (atomic) function body, especially when it hasn't been
>     cached yet.
>
>     So far I haven't been convinced.
>
>     An actual use-case where the functions actually do something would
>     be far more interesting. I doubt anybody writes functions just to
>     raise a notice. I expect that in reality most plpgsql functions
>     perform database queries and do something with the result. In such
>     cases, function call overhead could be significant if the call is
>     done for each record in a result set, for example. And even then
>     it's worth considering whether that matters to your situation enough
>     that it outweighs the usual benefits of code separation.
>
>
>      > On Sat, May 2, 2015 at 7:37 PM, Adrian Klaver
>     <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>      > On 05/02/2015 03:28 PM, Bill Moran wrote:
>      > On Sat, 02 May 2015 15:06:24 -0700
>      > Adrian Klaver <adrian.klaver@aklaver.com
>     <mailto:adrian.klaver@aklaver.com>> wrote:
>      >
>      > On 05/02/2015 02:07 PM, Jeff Janes wrote:
>      > On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver
>     <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>      > <mailto:adrian.klaver@aklaver.com
>     <mailto:adrian.klaver@aklaver.com>>> wrote:
>      >
>      >      On 05/02/2015 10:12 AM, Melvin Davidson wrote:
>      >
>      >          AFAIK, you cannot "package" functions in  PostgreSQL,
>     but it is
>      >          possible to
>      >          call a function from within a function.
>      >
>      >          That being said, I would seriously look at how and why
>     you are
>      >          writing
>      >          your functions
>      >          as functions that call other functions are not very
>     efficient.
>      >
>      >
>      >      I am not following. That is what packaging is about,
>     separating out
>      >      'units of work' so they can be combined as needed. Part of
>     that is
>      >      using existing functions in new functions/classes. In fact
>     in the
>      >      Postgres source I see this in many places. Now it is entirely
>      >      possible I missed a memo, so I am open to a more detailed
>      >      explanation of the inefficiencies involved.
>      >
>      >
>      > The Postgres source is written in C, not in plpgsql.  C has a good
>      > optimizing compiler and plpgsql doesn't.
>      >
>      > Does this actually matter?  I am a biologist that backed into
>     computing,
>      > so I realize I am weak on the fundamentals. Still the scientist in me
>      > wants data backing assertions. As I understand it plpgsql works
>     close to
>      > the server and is optimized to do so.  I know writing in C would be a
>      > better solution. Still is calling plpgsql functions inside plpgsql
>      > really a bad thing when just considering plpgsql?
>      >
>      > The answer to that is the same answer to so many other things: it
>     depends.
>      >
>      > plpgsql functions are slower than C. They also lack a lot of language
>      > features that C has. That being said, if they're meeting your
>     needs, then
>      > don't worry about it. plpgsql is around because for most people,
>     it works
>      > well enough. There are certainly cases when you want to create
>     very complex
>      > logic in the database and plpgsql is liable to make that
>     difficult. But
>      > there are a lot of cases where having to manage pointers and a build
>      > environment and all the things that go with C aren't justified,
>     because
>      > plpgsql has none of that complexity. There are advantages both ways.
>      >
>      > The beauty of PostgreSQL is that you have both available and you
>      > can choose whichever is best for your situation.
>      >
>      > Agreed, though in my case I drop into plpythonu when I want more
>     complex solutions.
>      >
>      >
>      >
>      >
>      > --
>      > Adrian Klaver
>      > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>      >
>      >
>      >
>      > --
>      > Melvin Davidson
>      > I reserve the right to fantasize.  Whether or not you
>      > wish to share my fantasy is entirely up to you.
>
>     Alban Hertroys
>     --
>     If you can't see the forest for the trees,
>     cut the trees and you'll find there is no forest.
>
>
>
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: plpgsql functions organisation

From

Jim Nasby

Date:

04 May 2015, 20:22:05

On 5/2/15 2:32 PM, Adrian Klaver wrote:
> On 05/02/2015 09:53 AM, Yves Dorfsman wrote:
>>
>> I find my plpgsql functions becomes unreadable very quickly. I want to
>> break
>> them up in smaller functions.
>>
>> What is the best way to organised them?
>> Is there any way to define functions inside functions?
>> When I list functions in psql, I can see them all at the same level,
>> is there
>> any way to organise them in packages like with python and other
>> languages, so
>> the smaller functions are hidden away in a package/directory?
>
> The  only thing I can think of is to use SCHEMAs;
>
> http://www.postgresql.org/docs/9.4/interactive/sql-createschema.html

You can do a crude form of public vs private methods using schemas; I
frequently define schemas that start with a _ and don't grant USAGE to
general users for those schemas as a way to do that (not defining USAGE
means the schemas won't show up in things like \df). I do wish I could
control visibility separately from USAGE... maybe someday.

As for performance concerns, in 99% of cases code maintainability is
going to be way more important than performance microoptimization. If
you're *that* concerned about performance than plpgsql probably isn't
the right answer anyway.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: plpgsql functions organisation

From

Yves Dorfsman

Date:

04 May 2015, 20:33:07

>
> As for performance concerns, in 99% of cases code maintainability is going to
> be way more important than performance microoptimization. If you're *that*
> concerned about performance than plpgsql probably isn't the right answer anyway.

Isn't one of the advantage of running on the server to avoid data round trip?

What would you recommend for better performance?

--
http://yves.zioup.com
gpg: 4096R/32B0F416

Re: plpgsql functions organisation

From

Jan de Visser

Date:

04 May 2015, 23:36:04

On May 4, 2015 02:32:14 PM Yves Dorfsman wrote:
> > As for performance concerns, in 99% of cases code maintainability is going
> > to be way more important than performance microoptimization. If you're
> > *that* concerned about performance than plpgsql probably isn't the right
> > answer anyway.
> Isn't one of the advantage of running on the server to avoid data round
> trip?
>
> What would you recommend for better performance?

You can run python or perl on the server. That's what Jim meant with 'If
you're *that* concerned about performance than plpgsql probably isn't the
right answer anyway.'

Additionally: Many moons ago I did extensive and aggressive performance
analysis on a system that did many recursive queries. This was before CTEs
(WITH statements) and we settled on recursive plpgsql functions. The queries
in the functions were trivial, but nevertheless the bottleneck was in the
query and data processing, and never in the surrounding infrastructure.