Re: cache in plpgsql - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: cache in plpgsql
Date
Msg-id 3FF59802.4020607@Yahoo.com
Whole thread Raw
In response to Re: cache in plpgsql  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> Jan Wieck <JanWieck@Yahoo.com> writes:
>> But can you tell me how to actually check if a saved SPI plan
>> references that particular catalog object or not?
> 
> This is certainly doable in principle.  recordDependencyOnExpr()
> contains much of the logic that would be needed -- it would need to be
> modified so that the dependency info can just be put in memory and not
> stored into pg_depend, but that's surely not hard.  (Note that to use
> it directly, we'd want to examine the parsed querytree not the plan
> tree, but that isn't a problem AFAICS.)
> 
> Whether it's *practical* is another question.  Cache inval events
> happen often enough that speed of response to 'em is an issue.
> Maybe we could use a hashtable of dependencies to avoid expensive
> searches for cached plans that must be invalidated.
> 
> Another little problem is that plpgsql doesn't really have any mechanism
> for invalidating cached stuff at all; it will leak memory like there's
> no tomorrow if we start dropping cached subplans.  plpgsql needs to be
> rewritten so that everything it allocates lives in per-function memory
> contexts that can be dropped when an invalidation happens.

Everyone seems to look at it as a PL/pgSQL specific problem. It is not!

The very same problem with cached plans exists for everything that uses 
SPI. Referential integrity is a good example, where we lately had a 
report from someone that dropping an index that was in fact not required 
for the constraint broke a cached plan. PL/Tcl has the ability to save 
prepared plans too.

Thus, this invalidation and recompilation of plans has to happen inside 
of SPI_execp() I'd say. Imagine SPI_prepare() would save the original 
query along with the parameter types, and the execution plan itself is 
stored in a sub memory context (and this separation is kept during 
SPI_saveplan()). Also we have a hash table holding all the dependencies 
with pointers to these plans, so that cache invalidation can quickly set 
a "recompile" flag in all the plans that depend on a dropped object. Now 
if SPI_execp() is called for a plan that is marked "recompile", all it 
has to do is remove all references for this plan from the hashtable, 
throw away the sub memory context, recompile the plan and store new 
dependencies in the hashtable. Note, that this whole process would only 
occur at the moment where the current system errors out with a rather 
cryptic error message.

This "hashtable" of dependencies would contain linked lists. The lookup 
key at the time of syscache invalidation is the object type and id. And 
each of this can be referenced by any number of plans.

> 
> As far as I can see a dependency-based solution is possible.  It's
> just a Small Matter Of Programming.
> http://www.catb.org/~esr/jargon/html/S/SMOP.html

As said, the idea is neither bad, nor new. And please let's not forget 
to add temp object detection into the dependency collector so that SPI 
automagically will handle temp tables used in PL/pgSQL by NOT storing 
prepared plans at all. I'm not sure, do we have a TODO item for this?


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



pgsql-hackers by date:

Previous
From: Jan Wieck
Date:
Subject: Re: cache in plpgsql
Next
From: Alvaro Herrera
Date:
Subject: Re: Postgres + Xapian (was Re: fulltext searching via a custom index type )