Re: Stack overflow issue - Mailing list pgsql-hackers

From: Egor Chindyaskin
Subject: Re: Stack overflow issue
Msg-id: c01f3c6a-97a9-41ea-7bbb-d47f5a77b8d9@mail.ru
In response to: Re: Stack overflow issue (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Stack overflow issue (Егор Чиндяскин <kyzevan23@mail.ru>)
List: pgsql-hackers
24.08.2022 20:58, Tom Lane writes:
> Nice work!  I wonder if you can make the regex crashes reachable by
> reducing the value of max_stack_depth enough that it's hit before
> reaching the "regular expression is too complex" limit.
>
>             regards, tom lane
Hello everyone! It's been a while since Alexander Lakhin and I published a 
list of functions that suffer from stack overflow. We are back to tell you 
more about such places.
During our analysis we concluded that some functions can be crashed without 
changing any parameters, while others can be crashed only after adjusting 
certain settings.

The first function crashes without any changes:

# CheckAttributeType

(n=60000; printf "create domain dint as int; create domain dinta0 as dint[];";
 for ((i=1;i<=$n;i++)); do printf "create domain dinta$i as dinta$(( $i - 1 ))[]; "; done; ) | psql
psql -c "create table t(f1 dinta60000[]);"

Some of the others crash only if we change the "max_locks_per_transaction" 
parameter:

# findDependentObjects

max_locks_per_transaction = 200

(n=10000; printf "create table t (i int); create view v0 as select * from t;";
 for ((i=1;i<$n;i++)); do printf "create view v$i as select * from v$(( $i - 1 )); "; done; ) | psql
psql -c "drop table t"

# ATExecDropColumn

max_locks_per_transaction = 300

(n=50000; printf "create table t0 (a int, b int); ";
 for ((i=1;i<=$n;i++)); do printf "create table t$i() inherits(t$(( $i - 1 ))); "; done;
 printf "alter table t0 drop b;" ) | psql

# ATExecDropConstraint

max_locks_per_transaction = 300

(n=50000; printf "create table t0 (a int, b int, constraint bc check (b > 0));";
 for ((i=1;i<=$n;i++)); do printf "create table t$i() inherits(t$(( $i - 1 ))); "; done;
 printf "alter table t0 drop constraint bc;" ) | psql

# ATExecAddColumn

max_locks_per_transaction = 200

(n=50000; printf "create table t0 (a int, b int);";
 for ((i=1;i<=$n;i++)); do printf "create table t$i() inherits(t$(( $i - 1 ))); "; done;
 printf "alter table t0 add column c int;" ) | psql

# ATExecAlterConstrRecurse

max_locks_per_transaction = 300

(n=50000;
 printf "create table t(a int primary key); create table pt (a int primary key, foreign key(a) references t) partition by range (a);";
 printf "create table pt0 partition of pt for values from (0) to (100000) partition by range (a);";
 for ((i=1;i<=$n;i++)); do printf "create table pt$i partition of pt$(( $i - 1 )) for values from ($i) to (100000) partition by range (a); "; done;
 printf "alter table pt alter constraint pt_a_fkey deferrable initially deferred;"
) | psql

This is where the fun begins. According to Tom Lane, a decrease in 
max_stack_depth could lead to new crashes, but it turned out that Alexander 
was able to find new crashes precisely by increasing this parameter. Also, 
ulimit -s was left at its default value of 8MB.
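
To reproduce, here is a rough sketch of how such a configuration can be 
applied (the restart step and superuser access are assumptions about a 
typical setup; the repros below only state the values we used):

# ulimit -s must be in effect in the shell that starts the server process
ulimit -s 8192                                         # the 8MB default we kept
pg_ctl -D "$PGDATA" restart
psql -c "alter system set max_stack_depth = '7000kB';"
psql -c "select pg_reload_conf();"
psql -c "show max_stack_depth;"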

# eval_const_expressions_mutator

max_stack_depth = '7000kB'

(n=10000; printf "select 'a' "; for ((i=1;i<$n;i++)); do printf " collate \"C\" "; done; ) | psql

If, like me, you didn't get a crash when Alexander shared his find, you 
probably built your cluster with the -Og optimization flag. While trying to 
break this function, we concluded that how quickly the stack is exhausted 
depends on the optimization level (-O0 vs. -Og). As it turned out, with 
optimization each function's stack frame becomes smaller, so the limit is 
reached more slowly and the system can withstand deeper recursion. Therefore, 
this query should fail on a cluster built with the -O0 optimization flag.
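
If you are not sure which flags your cluster was built with, something like 
the following can show them (assuming the pg_config belonging to the same 
installation is in your PATH):

# show the compiler flags and configure options the server was built with
pg_config --cflags
pg_config --configure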

The crash of the next function depends not only on the optimization flag but 
also on a number of other things. While researching, we noticed that postgres 
enforces a margin of roughly 400kB between max_stack_depth and ulimit -s. We 
thought we could approach the max_stack_depth limit and then hit the OS limit 
as well. So Alexander wrote a recursive plpgsql function that eats up the 
stack within max_stack_depth and, at each level, runs a query that can eat up 
the remaining ~400kB. Once the recursion is deep enough, this causes a crash.

# executeBoolItem

max_stack_depth = '7600kB'

create function infinite_recurse(i int) returns int as $$
begin
   raise notice 'Level %', i;
   begin
     perform jsonb_path_query('{"a":[1]}'::jsonb,
       ('$.a[*] ? (' || repeat('!(', 4800) || '@ == @' || repeat(')', 4800) || ')')::jsonpath);
   exception
     when others then raise notice 'jsonb_path_query error at level %, %', i, sqlerrm;
   end;
   begin
     select infinite_recurse(i + 1) into i;
   exception
     when others then raise notice 'Max stack depth reached at level %, %', i, sqlerrm;
   end;
   return i;
end;
$$ language plpgsql;

select infinite_recurse(1);

To sum it all up, we have not yet decided on a general approach to such 
functions. Some functions are definitely subject to stack overflow, and some 
are definitely not; this can be seen from the code, where a recurse flag is 
passed or check_stack_depth() is called before a recursive call. Some require 
special conditions: for example, the query first has to be parsed and planned, 
and at that stage the stack is consumed (and checked) faster than by the 
function we are interested in.

We are continuing our research and hope to come up with a good solution 
sooner or later.


