Re: Plan invalidation design - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Plan invalidation design
Date
Msg-id 7839.1171820158@sss.pgh.pa.us
Whole thread Raw
In response to Re: Plan invalidation design  (Russell Smith <mr-russ@pws.com.au>)
Responses Re: Plan invalidation design  (Gregory Stark <stark@enterprisedb.com>)
Re: Plan invalidation design  ("Andrew Hammond" <andrew.george.hammond@gmail.com>)
List pgsql-hackers
Russell Smith <mr-russ@pws.com.au> writes:
> Tom Lane wrote:
>> 2. Given a handle for a previously stored query, check to see if the plan
>> is still up to date; if not, regenerate it from the raw parse tree (note
>> this could result in failure, eg if a column used by the query has been
>> dropped).  Then return the plan tree.
>> 
> What do we do in the case of failure?  Die in the same way we do now 
> when you can't use the plan that's been made?

Well, the difference is that at plan use you might get an error that
currently could only occur at initial query parsing.  I don't see that
this is a big deal, but it will be a change in behavior.

One thing I realized since yesterday is that it'll have to be possible
for the caller to tell whether the plan has changed since he last saw
it (perhaps via a re-plan counter included in the cache entry).  It's
entirely possible that the set of output columns will have changed,
and so the caller may need to re-do derived work.  For example plpgsql
will need to re-do its analysis of whether a plan is "simple".

What we might want in some cases is for the caller to decide to error
out if the set of output columns changes.  I think this is likely
appropriate for queries prepared via the Parse protocol message, because
we may have already told the client what the column set is, and it won't
be prepared to deal with getting a different set of columns back.  I'm
not sure now whether that's appropriate for every call site, but if it
is then we could avoid some of these definitional issues.

>> We probably want to return a direct pointer to the cached plan tree
>> instead of making a copy.  This should be safe, because the executor now
>> treats plan trees as read-only, but it does mean that when plan
>> invalidation occurs the cached plan tree might still be in use.

> excuse my ignorance here, but under what circumstances is a plan in use 
> for a single backend at the same time as it's invalidated.

There shouldn't be any structural changes in a table once you've
acquired lock on it, but there could be statistics changes, eg from
VACUUM; and the relcache inval mechanism currently doesn't distinguish
those cases.  We'd need some such concept anyway if we ever extend
the invalidation to cover functions, because there's no locking on them.

> What other circumstances could you have a syntax error from a query that 
> has been successfully planned and parsed?

DROP COLUMN, DROP FUNCTION, ... lots of possibilities.

> I've read this paragraph 3 times now and am still quite unclear about 
> the requirements for the original query to be stored.  Is the plan cache 
> going to replace the syntax check which I thought would have been done 
> in gram.y. 

We don't need to re-do that syntax check, precisely because it's purely
a syntax check and doesn't involve any database state.

> If you replan and immutable function, aren't you possibly messing up a 
> functional index that is using the old function.  Hey, if you change an 
> immutable function that has an index, you are in trouble already.

True.

> Replanning pl/pgsql with CREATE TEMP TABLE would be a good use here.  
> You loose the preplanning benefits, but we remove the ongoing problem 
> where people report that their temp-table isn't working.

Yeah, that's one of the main arguments why this is worth the trouble.

> Even function alterations to pl/pgsql should a replan.  But of more 
> interest is being able to use the old function for currently running 
> transactions when the function is changed.  Last time I tried to edit a 
> pl/pgsql function while it was being used by a transaction, the 
> transaction failed because the function definition changed.

I fixed a couple of bugs in that area recently --- the current behavior
should be that any active execution of a plpgsql function will finish
out using the function definition that was current when it started.  But
that's something that's local to the PL function manager and doesn't
really have anything to do with plans using the function.  Inlined SQL
functions are the exception to the rule that a plan doesn't know exactly
what a function it calls does.

> Is the race condition here any more likely to happen than the failure of 
> a re plan when something has changed from underneath the original query?

It's not really the same thing --- the problem is making sure that your
check for table changes is accurate, and doesn't miss a change that
commits just after you look.
        regards, tom lane


pgsql-hackers by date:

Previous
From: "Knut P. Lehre"
Date:
Subject: FETCH from scroll cursor does not return count in libpq PQcmdStatus
Next
From: Hannu Krosing
Date:
Subject: Re: RFC: Temporal Extensions for PostgreSQL