Thread: a plpgsql bug

a plpgsql bug

From

"daidewei@highgo.com"

Date:

19 September 2023, 02:59:07

Hello!

I found a problem in plpgsql. When a plpgsql loop runs for many iterations and search_path is changed inside the loop, memory is exhausted and the connection is dropped.

The test examples are as follows:

1 create a schema

create schema test_schema;

2 create a function

create or replace function test_schema.test_f(id integer) returns integer as $$
declare
  var2 integer := 1;
begin
  if id % 4 = 1 then
    return var2 + 2;
  elseif id % 4 = 2 then
    return var2 + 3;
  elseif id % 4 = 3 then
    return var2 + 4;
  else
    return var2;
  end if;
end;
$$ language plpgsql;

3 run a loop in plpgsql, which results in disconnection

do $$
declare
  var1 integer;
begin
  for id in 1 .. 10000000 LOOP
    set search_path to test_schema;
    var1 = test_schema.test_f(id);
    set search_path to public;
    var1 = test_schema.test_f(id);
  end loop;
end;
$$ language plpgsql;

daidewei@highgo.com

Attachment

Catch.jpg

Re: a plpgsql bug

From

"David G. Johnston"

Date:

19 September 2023, 15:24:25

On Mon, Sep 18, 2023 at 11:46 PM daidewei@highgo.com <daidewei@highgo.com> wrote:

> I found a problem in plpgsql. When there is a large loop in plpgsql, it is found that the change of search_path will cause memory exhaustion and thus disconnect the connection.

It is impossible to prevent queries from exhausting memory so the fact that this one does isn't an indication of a bug on its own. I'm having trouble imagining a reasonable use case for this pattern of changing search_path frequently in a loop within a single execution of a function. That said, I'm not in a position to judge how easy or difficult an improvement in this area may be.

David J.

Re: a plpgsql bug

From

Tom Lane

Date:

19 September 2023, 16:12:11

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Mon, Sep 18, 2023 at 11:46 PM daidewei@highgo.com <daidewei@highgo.com>
> wrote:
>> I found a problem in plpgsql. When there is a large loop in plpgsql,
>> it is found that the change of search_path will cause memory exhaustion and
>> thus disconnect the connection.

> It is impossible to prevent queries from exhausting memory so the fact that
> this one does isn't an indication of a bug on its own.  I'm having trouble
> imagining a reasonable use case for this pattern of changing search_path
> frequently in a loop within a single execution of a function.  That said,
> I'm not in a position to judge how easy or difficult an improvement in this
> area may be.

I poked into this a bit with valgrind.  It seems that the problem
is that the changes to search_path thrash the "simple expression"
mechanism in plpgsql, such that it has to re-plan the various
expressions in the called function each time through.  It's good about
tracking the actual cached plans and not leaking those, but what is
getting leaked into transaction-lifespan memory is the data structures
made by

        expr->expr_simple_state =
            ExecInitExprWithParams(expr->expr_simple_expr,
                                   econtext->ecxt_param_list_info);

We could conceivably reclaim that data if we were willing to set up
yet another per-expression memory context to hold it.  That seems
like rather a high overhead though.
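
[Editor's sketch] The trade-off described above can be illustrated with a toy model in C. This is not PostgreSQL code and does not use the real MemoryContext API; `Arena`, `Expr`, `STATE_SIZE`, and `simulate` are hypothetical names invented for illustration. It assumes each re-plan rebuilds a fixed-size state object, and compares allocating that state in a transaction-lifespan arena (old states pile up until transaction end) with allocating it in a per-expression arena that is reset before each rebuild.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a memory context: chunks are freed all at once on reset. */
typedef struct Chunk { struct Chunk *next; } Chunk;

typedef struct Arena {
    Chunk  *chunks;      /* every allocation made in this arena */
    size_t  live_bytes;  /* payload bytes currently held */
} Arena;

static void *arena_alloc(Arena *a, size_t size)
{
    assert(size > 0);
    Chunk *c = malloc(sizeof(Chunk) + size);
    if (c == NULL)
        abort();
    c->next = a->chunks;
    a->chunks = c;
    a->live_bytes += size;
    return c + 1;        /* payload follows the header (never dereferenced here) */
}

static void arena_reset(Arena *a)
{
    while (a->chunks != NULL) {
        Chunk *next = a->chunks->next;
        free(a->chunks);
        a->chunks = next;
    }
    a->live_bytes = 0;
}

#define STATE_SIZE 256   /* assumed size of one expression state */

/* Hypothetical model of a plpgsql expression record. */
typedef struct Expr {
    void  *simple_state; /* models expr->expr_simple_state */
    Arena  state_arena;  /* the extra per-expression context */
} Expr;

/* Leaky variant: the old state stays in the transaction arena. */
static void rebuild_state_leaky(Expr *e, Arena *txn)
{
    e->simple_state = arena_alloc(txn, STATE_SIZE);
}

/* Scoped variant: the old state is released before the new one is built. */
static void rebuild_state_scoped(Expr *e)
{
    arena_reset(&e->state_arena);
    e->simple_state = arena_alloc(&e->state_arena, STATE_SIZE);
}

/* Returns the bytes still held after `replans` rebuilds, before cleanup. */
size_t simulate(int replans, int scoped)
{
    Arena txn = {NULL, 0};
    Expr  e = {NULL, {NULL, 0}};

    for (int i = 0; i < replans; i++) {
        if (scoped)
            rebuild_state_scoped(&e);
        else
            rebuild_state_leaky(&e, &txn);
    }

    size_t held = txn.live_bytes + e.state_arena.live_bytes;
    arena_reset(&txn);              /* models end of transaction */
    arena_reset(&e.state_arena);
    return held;
}
```

In this model, `simulate(n, 0)` grows linearly with the number of re-plans, while `simulate(n, 1)` stays at one state object. The cost the model hides is the one raised above: every expression would carry its own arena, whether or not it is ever re-planned.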

The given test case is obviously a bit artificial, but I think it
may be a simplification of fairly plausible use-cases.  The triggering
condition is that the same textual expression in a plpgsql function
gets executed repeatedly with different search_path settings, which
doesn't seem that unreasonable.

Perhaps another approach could be to assume that only a small number
of distinct search_path settings will be used in any one transaction,
and cache a separate plan and estate for each one.  That would have
the nice side-effect of avoiding the replanning overhead, but then
we'd have to figure out how to manage the cache and keep it from
blowing out.
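
[Editor's sketch] One way to picture a bounded per-search_path cache is a small fixed-size table with least-recently-used replacement. The C below is a toy model, not a proposal for the actual plpgsql data structures; `PlanCache`, `PlanSlot`, `cache_lookup`, `CACHE_SLOTS`, and the integer `plan_id` (standing in for a cached plan plus estate) are all assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define CACHE_SLOTS 4    /* assumed bound on distinct search_path settings */
#define PATH_LEN    64   /* assumed maximum key length in this toy */

typedef struct PlanSlot {
    char          path[PATH_LEN]; /* search_path this entry was planned under */
    unsigned long last_used;      /* logical clock value; 0 means empty */
    int           plan_id;        /* stand-in for cached plan + estate */
} PlanSlot;

typedef struct PlanCache {
    PlanSlot      slots[CACHE_SLOTS];
    unsigned long clock;          /* incremented on every lookup */
    int           plans_built;    /* how many times we had to "plan" */
} PlanCache;

/*
 * Return the plan for the given search_path, building one on a miss.
 * On a miss the empty or least recently used slot is overwritten, so
 * the cache never holds more than CACHE_SLOTS entries.
 */
int cache_lookup(PlanCache *c, const char *search_path)
{
    PlanSlot *victim = &c->slots[0];

    assert(strlen(search_path) < PATH_LEN);
    c->clock++;

    for (int i = 0; i < CACHE_SLOTS; i++) {
        PlanSlot *s = &c->slots[i];

        if (s->last_used != 0 && strcmp(s->path, search_path) == 0) {
            s->last_used = c->clock;   /* hit: refresh recency */
            return s->plan_id;
        }
        if (s->last_used < victim->last_used)
            victim = s;                /* remember the oldest (or empty) slot */
    }

    /* Miss: "plan" once and store the result in the victim slot. */
    strcpy(victim->path, search_path);
    victim->last_used = c->clock;
    victim->plan_id = ++c->plans_built;
    return victim->plan_id;
}

/* Models the reported loop: two search_path settings alternating. */
int plans_built_for_alternating_loop(int iterations)
{
    PlanCache c;

    memset(&c, 0, sizeof c);
    for (int i = 0; i < iterations; i++) {
        cache_lookup(&c, "test_schema");
        cache_lookup(&c, "public");
    }
    return c.plans_built;
}
```

With two alternating settings, the toy plans only twice no matter how long the loop runs. A workload that cycles through more than `CACHE_SLOTS` settings would miss on every lookup, which is the cache-management problem left open above.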

            regards, tom lane

Re: Re: a plpgsql bug

From

"daidewei@highgo.com"

Date:

20 September 2023, 06:34:06

I found another example that doesn't change search_path but uses a SECURITY DEFINER function.

1 create a user

./psql -dpostgres

create user test_user;
grant all on database postgres to test_user;

2 log in as the new user and create the functions

./psql -dpostgres -U test_user

create schema test_user;

create or replace function test_user.test_m(id integer) returns integer as
$$
declare
  var2 integer := 1;
begin
  if id % 4 = 1 then
    return var2 + 2;
  elseif id % 4 = 2 then
    return var2 + 3;
  elseif id % 4 = 3 then
    return var2 + 4;
  else
    return var2;
  end if;
end; $$ language plpgsql;

create or replace function test_user.test_f(id integer) returns integer SECURITY DEFINER as
$$
declare
  var1 integer := 1;
begin
  var1 := test_user.test_m(23);
  return var1;
end; $$ language plpgsql;

3 execute

./psql -dpostgres

do $$
declare
  var1 integer;
begin
  for id in 1 .. 10000000 LOOP
    var1 = test_user.test_m(id);
    var1 = test_user.test_f(id);
  end loop;
end; $$ language plpgsql;

daidewei@highgo.com


Attachment

Catch4E91.jpg

Re: Re: a plpgsql bug

From

"daidewei@highgo.com"

Date:

21 September 2023, 01:22:33

Maybe, when compiling the function, switch to its owner?

daidewei@highgo.com


Attachment

Catch4E91%2809-2.jpg