Thread: Syscache/relcache invalidation event callbacks

Syscache/relcache invalidation event callbacks

From
Tom Lane
Date:
I'm planning to add a mechanism to backend/utils/cache/inval.c that will
allow additional modules to register to get notification of syscache and
relcache invalidation events.  Right now, only the syscache and relcache
get told about it --- but there's no reason we couldn't call additional
routines here.

The immediate thing I want this for is so that the namespace.c routines
can safely use a cached list of OIDs for accessible namespaces.  Without
this cache, we'll have to repeatedly look up pg_namespace entries and
check their USAGE privilege bits.  That strikes me as a fairly expensive
operation to have to do for each table, function, and operator name in
every query, when in practice the results aren't going to be changing
very often.  I'd like to only do the lookups when search_path changes
or we receive a notification of an update in pg_namespace.

This mechanism would also allow us to solve the plpgsql-cached-plans
problem that keeps coming up.  If plpgsql registers a callback routine
for relcache events, then it'll get a notification every time something
happens to a relation schema.  It could search its cached plans to see
if there is any reference to that relation OID.  If so, blow away the
cached plan.  (Or at least prevent it from being used again.  There'd
be some possibility of this happening for a plan that's currently in
use, I believe, so you'd probably need to avoid discarding the plan
until the active call is done.)

We'll have the same problem with the PREPAREd-plan feature that Neil is
working on, so it seems like time to do this.  Comments?
        regards, tom lane


Re: Syscache/relcache invalidation event callbacks

From
Karel Zak
Date:
On Mon, Apr 29, 2002 at 03:43:30PM -0400, Tom Lane wrote:
> I'm planning to add a mechanism to backend/utils/cache/inval.c that will
> allow additional modules to register to get notification of syscache and
> relcache invalidation events.  Right now, only the syscache and relcache
> get told about it --- but there's no reason we couldn't call additional
> routines here.
> 
> The immediate thing I want this for is so that the namespace.c routines
> can safely use a cached list of OIDs for accessible namespaces.  Without
> this cache, we'll have to repeatedly look up pg_namespace entries and
> check their USAGE privilege bits.  That strikes me as a fairly expensive
> operation to have to do for each table, function, and operator name in
> every query, when in practice the results aren't going to be changing
> very often.  I'd like to only do the lookups when search_path changes
> or we receive a notification of an update in pg_namespace.
> 
> This mechanism would also allow us to solve the plpgsql-cached-plans
> problem that keeps coming up.  If plpgsql registers a callback routine
> for relcache events, then it'll get a notification every time something
> happens to a relation schema.  It could search its cached plans to see
> if there is any reference to that relation OID.  If so, blow away the
IMHO is clean call a callback if a change is relavant for cachedplan -- it means if Oid is used in plan.

> cached plan.  (Or at least prevent it from being used again.  There'd
> be some possibility of this happening for a plan that's currently in
> use, I believe, so you'd probably need to avoid discarding the plan
> until the active call is done.)
> 
> We'll have the same problem with the PREPAREd-plan feature that Neil is
> working on, so it seems like time to do this.  Comments?
Wanted! It's very good idea.
I have a question, how I will know how changes are relevant for my query plan? IMHO is needful some hight-level API,
like
   list = ExtractQueryPlanOids( plan );   reg = RegisterOidsCallback( list, mycallback, mycallbackdata );   and now I
cando:
 
mycallback(reg, mycallbackdata){   remove_plan_from_my_cache( (MyKey *) mycallbackdata );
UnregisterOidsCallback(reg);}

      Karel

-- Karel Zak  <zakkr@zf.jcu.cz>http://home.zf.jcu.cz/~zakkr/C, PostgreSQL, PHP, WWW, http://docs.linux.cz,
http://mape.jcu.cz


Re: Syscache/relcache invalidation event callbacks

From
Tom Lane
Date:
Karel Zak <zakkr@zf.jcu.cz> writes:
>  I have a question, how I will know how changes are relevant for my 
>  query plan? IMHO is needful some hight-level API, like

>     list = ExtractQueryPlanOids( plan );
>     reg = RegisterOidsCallback( list, mycallback, mycallbackdata );

Yes, some kind of routine to extract all the referenced relation OIDs
in a plan tree would be a good idea.  I can provide that.  The inval
callback just tells you the OID of the relation that got flushed; it's
up to you to get from there to the plans you need to rebuild.  Perhaps
a hash table would work well.

BTW, the inval callback had better just mark the plans invalid, not
try to rebuild them right away.  I don't think it's safe to try to
do more database accesses while we're in the relcache invalidation
path.  "Rebuild plan on next attempted use" seems like a better idea.
        regards, tom lane


Re: Syscache/relcache invalidation event callbacks

From
Karel Zak
Date:
On Tue, Apr 30, 2002 at 09:43:29AM -0400, Tom Lane wrote:
> Karel Zak <zakkr@zf.jcu.cz> writes:
> >  I have a question, how I will know how changes are relevant for my 
> >  query plan? IMHO is needful some hight-level API, like
> 
> >     list = ExtractQueryPlanOids( plan );
> >     reg = RegisterOidsCallback( list, mycallback, mycallbackdata );
> 
> Yes, some kind of routine to extract all the referenced relation OIDs
The relations or others things like operators, functions, etc. Right?

> in a plan tree would be a good idea.  I can provide that.  The inval
> callback just tells you the OID of the relation that got flushed; it's
> up to you to get from there to the plans you need to rebuild.  Perhaps
> a hash table would work well.
There must be possible define some callback specific data too, thecallback maybe will search query in some own specific
cacheand itrequire some key.
 
> BTW, the inval callback had better just mark the plans invalid, not
> try to rebuild them right away.  I don't think it's safe to try to
Hmm, it can depend on action, I can imagine:
DROP TABLE tab;
ERROR: mycallback(): can't rebuild the query used in PL/SQL function       'xyz'. Please drop this function first.
...tabledrop failed. This is maybe not possible implement now, but it's ideal conception:-)
 
> do more database accesses while we're in the relcache invalidation
> path.  "Rebuild plan on next attempted use" seems like a better idea.
Agree. It means store to cache query string too (I not sure if it'sused in current qcache, but it's simple).
There can be one query cache and one cached planns checking only, for RI, SPI, PL/SQL, PREPARE/EXECUTE. Or not? I think
implement4x same things is terrible.
 
       Karel

-- Karel Zak  <zakkr@zf.jcu.cz>http://home.zf.jcu.cz/~zakkr/C, PostgreSQL, PHP, WWW, http://docs.linux.cz,
http://mape.jcu.cz


Re: Syscache/relcache invalidation event callbacks

From
Tom Lane
Date:
Karel Zak <zakkr@zf.jcu.cz> writes:
>  There must be possible define some callback specific data too, the
>  callback maybe will search query in some own specific cache and it
>  require some key.

Yeah, I have it set up similarly to on_proc_exit callbacks: when you
register the callback function you can also provide a Datum (typically
a pointer).  But I'm not sure that there's any real value in this.
AFAICS, such callbacks are always going to be interested in looking
through a cache data structure of some kind, and so there will always
be global variables pointing to what they need to get to.  (I'm
envisioning *one* callback function handling a whole cache, not a
callback for each entry!)
>> BTW, the inval callback had better just mark the plans invalid, not
>> try to rebuild them right away.  I don't think it's safe to try to

>  Hmm, it can depend on action, I can imagine:

>  DROP TABLE tab;

>  ERROR: mycallback(): can't rebuild the query used in PL/SQL function
>         'xyz'. Please drop this function first.

>  ...table drop failed.

Nope, that is NOT okay.

(1) The drop might have been done by some other backend.

(2) Even if it was your own backend, the drop is already committed   by the time you hear about it.

You invalidate the plan, and then when and if that function is called
again, you'll flag an error as a natural result of trying to recompute
the plan.  No shortcuts.

(If we do want to prevent a drop in cases like this, it has to be done
via the kind of dependency mechanism that Rod is working on.)

>  There can be one query cache and one cached planns checking only, for RI, 
>  SPI, PL/SQL, PREPARE/EXECUTE. Or not?

Hmm.  Seems like it might be a good idea, but I'm not certain that all
of these have exactly the same requirements.  If they can all share one
cache that'd definitely be a win.
        regards, tom lane


Re: Syscache/relcache invalidation event callbacks

From
Karel Zak
Date:
On Tue, Apr 30, 2002 at 10:39:38AM -0400, Tom Lane wrote:
> Karel Zak <zakkr@zf.jcu.cz> writes:
> >  There must be possible define some callback specific data too, the
> >  callback maybe will search query in some own specific cache and it
> >  require some key.
> 
> Yeah, I have it set up similarly to on_proc_exit callbacks: when you
> register the callback function you can also provide a Datum (typically
> a pointer).  But I'm not sure that there's any real value in this.
> AFAICS, such callbacks are always going to be interested in looking
> through a cache data structure of some kind, and so there will always
> be global variables pointing to what they need to get to.  (I'm
> envisioning *one* callback function handling a whole cache, not a
> callback for each entry!)
I understand. 
> >> BTW, the inval callback had better just mark the plans invalid, not
> >> try to rebuild them right away.  I don't think it's safe to try to
> 
> >  Hmm, it can depend on action, I can imagine:
> 
> >  DROP TABLE tab;
> 
> >  ERROR: mycallback(): can't rebuild the query used in PL/SQL function
> >         'xyz'. Please drop this function first.
> 
> >  ...table drop failed.
> 
> Nope, that is NOT okay.
It was dream.

> (If we do want to prevent a drop in cases like this, it has to be done
> via the kind of dependency mechanism that Rod is working on.)
It's good that someone works on dreams:-)

> >  There can be one query cache and one cached planns checking only, for RI, 
> >  SPI, PL/SQL, PREPARE/EXECUTE. Or not?
> 
> Hmm.  Seems like it might be a good idea, but I'm not certain that all
> of these have exactly the same requirements.  If they can all share one
> cache that'd definitely be a win.
All they needs save query planns -- IMHO it's enough. It's alwayscouple "plan-identificator" + "plan-tree".
There is not problem create branchs of the cache, separate for SPI,PL/SQL ..etc. But the _routines_ and API will
_same_.
branch = qCache_CreateBranch("PL/SQL");qCache_AddPlan(branch, plan, hashkey);
       Karel

-- Karel Zak  <zakkr@zf.jcu.cz>http://home.zf.jcu.cz/~zakkr/C, PostgreSQL, PHP, WWW, http://docs.linux.cz,
http://mape.jcu.cz