Re: crash in plancache with subtransactions - Mailing list pgsql-hackers

From Tom Lane
Subject Re: crash in plancache with subtransactions
Date
Msg-id 12392.1287717047@sss.pgh.pa.us
Whole thread Raw
In response to Re: crash in plancache with subtransactions  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: crash in plancache with subtransactions
List pgsql-hackers
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Tom Lane's message of jue oct 21 19:36:07 -0300 2010:
>> I don't believe that it's plancache's fault; the real problem is that
>> plpgsql is keeping "simple expression" execution trees around longer
>> than it should.  Your patch masks the problem by forcing those trees to
>> be rebuilt, but it's the execution trees not the plan trees that contain
>> stale data.

> Ahh, this probably explains why I wasn't been able to reproduce the
> problem without involving subxacts, or prepared plans, that seemed to
> follow mostly the same paths around plancache cleanup.

> It's also the likely cause that this hasn't ben reported earlier.

I traced through the details and found that the proximate cause of the
crash is that this bit in fmgr_sql() gets confused:
   /*    * Convert params to appropriate format if starting a fresh execution. (If    * continuing execution, we can
re-useprior params.)    */   if (es && es->status == F_EXEC_START)       postquel_sub_params(fcache, fcinfo);
 

After the error in the first subtransaction, the execution state tree
for the "public.dummy(p_name_table)" expression has a fn_extra link
that is pointing to a SQLFunctionCache that's in F_EXEC_RUN state.
So when the second call of broken() tries to re-use the state tree,
fmgr_sql() thinks it's continuing the execution of a set-returning
function, and doesn't bother to re-initialize its ParamListInfo
struct.  So it merrily tries to execute using a text datum that's
pointing at long-since-pfree'd storage for the original
'nonexistant.stuffs' argument string.

I had always felt a tad uncomfortable with the way that plpgsql re-uses
execution state trees for simple expressions; it seemed to me that it
was entirely unsafe in recursive usage.  But I'd never been able to
prove that it was broken.  Now that I've seen this example, I know how
to break it: recurse indirectly through a SQL function.  For instance,
this will dump core immediately:

create or replace function recurse(float8) returns float8 as
$$
begin raise notice 'recurse(%)', $1; if ($1 < 10) then   return sql_recurse($1 + 1); else   return $1; end if;
end
$$ language plpgsql;

-- "limit" is to prevent this from being inlined
create or replace function sql_recurse(float8) returns float8 as
$$ select recurse($1) limit 1; $$ language sql;

select recurse(0);

Notice the lack of any subtransaction or error condition.  The reason
this fails is *not* plancache misfeasance or failure to clean up after
error.  Rather, it's that the inner execution of recurse() is trying to
re-use an execution state tree that is pointing at an already-active
execution of sql_recurse().  In general, what plpgsql is doing is
entirely unsafe whenever a called function tries to keep changeable
execution state in storage pointed to by fn_extra.  We've managed to
miss the problem because plpgsql doesn't try to use this technique on
functions returning set (see exec_simple_check_node), and the vast
majority of non-SRF functions that use fn_extra at all use it to cache
catalog lookup results, which don't change from call to call.  But
there's no convention that says a function can't keep execution status
data in fn_extra --- in fact, there isn't anyplace else for it to keep
such data.

Right at the moment I'm not seeing any way that the present
exec_eval_simple_expr approach can be fixed to work safely in the
presence of recursion.  What I think we might have to do is give up
on the idea of caching execution state trees across calls, instead
using them just for the duration of a single plpgsql function call.
I'm not sure what sort of runtime penalty might ensue.  The whole design
predates the plancache, and I think it was mostly intended to prevent
having to re-parse-and-plan simple expressions every time.  So a lot of
that overhead has gone away anyway given the plancache, and maybe we
shouldn't sweat too much about paying what remains.  (But on the third
hand, what are we gonna do for back-patching to versions without the
plancache?)


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: serializable lock consistency patch
Next
From: Greg Stark
Date:
Subject: Re: Creation of temporary tables on read-only standby servers