Thread: Syscache/relcache invalidation event callbacks
I'm planning to add a mechanism to backend/utils/cache/inval.c that will allow additional modules to register to get notification of syscache and relcache invalidation events. Right now, only the syscache and relcache get told about it --- but there's no reason we couldn't call additional routines here. The immediate thing I want this for is so that the namespace.c routines can safely use a cached list of OIDs for accessible namespaces. Without this cache, we'll have to repeatedly look up pg_namespace entries and check their USAGE privilege bits. That strikes me as a fairly expensive operation to have to do for each table, function, and operator name in every query, when in practice the results aren't going to be changing very often. I'd like to only do the lookups when search_path changes or we receive a notification of an update in pg_namespace. This mechanism would also allow us to solve the plpgsql-cached-plans problem that keeps coming up. If plpgsql registers a callback routine for relcache events, then it'll get a notification every time something happens to a relation schema. It could search its cached plans to see if there is any reference to that relation OID. If so, blow away the cached plan. (Or at least prevent it from being used again. There'd be some possibility of this happening for a plan that's currently in use, I believe, so you'd probably need to avoid discarding the plan until the active call is done.) We'll have the same problem with the PREPAREd-plan feature that Neil is working on, so it seems like time to do this. Comments? regards, tom lane
On Mon, Apr 29, 2002 at 03:43:30PM -0400, Tom Lane wrote: > I'm planning to add a mechanism to backend/utils/cache/inval.c that will > allow additional modules to register to get notification of syscache and > relcache invalidation events. Right now, only the syscache and relcache > get told about it --- but there's no reason we couldn't call additional > routines here. > > The immediate thing I want this for is so that the namespace.c routines > can safely use a cached list of OIDs for accessible namespaces. Without > this cache, we'll have to repeatedly look up pg_namespace entries and > check their USAGE privilege bits. That strikes me as a fairly expensive > operation to have to do for each table, function, and operator name in > every query, when in practice the results aren't going to be changing > very often. I'd like to only do the lookups when search_path changes > or we receive a notification of an update in pg_namespace. > > This mechanism would also allow us to solve the plpgsql-cached-plans > problem that keeps coming up. If plpgsql registers a callback routine > for relcache events, then it'll get a notification every time something > happens to a relation schema. It could search its cached plans to see > if there is any reference to that relation OID. If so, blow away the IMHO is clean call a callback if a change is relavant for cachedplan -- it means if Oid is used in plan. > cached plan. (Or at least prevent it from being used again. There'd > be some possibility of this happening for a plan that's currently in > use, I believe, so you'd probably need to avoid discarding the plan > until the active call is done.) > > We'll have the same problem with the PREPAREd-plan feature that Neil is > working on, so it seems like time to do this. Comments? Wanted! It's very good idea. I have a question, how I will know how changes are relevant for my query plan? IMHO is needful some hight-level API, like list = ExtractQueryPlanOids( plan ); reg = RegisterOidsCallback( list, mycallback, mycallbackdata ); and now I cando: mycallback(reg, mycallbackdata){ remove_plan_from_my_cache( (MyKey *) mycallbackdata ); UnregisterOidsCallback(reg);} Karel -- Karel Zak <zakkr@zf.jcu.cz>http://home.zf.jcu.cz/~zakkr/C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
Karel Zak <zakkr@zf.jcu.cz> writes: > I have a question, how I will know how changes are relevant for my > query plan? IMHO is needful some hight-level API, like > list = ExtractQueryPlanOids( plan ); > reg = RegisterOidsCallback( list, mycallback, mycallbackdata ); Yes, some kind of routine to extract all the referenced relation OIDs in a plan tree would be a good idea. I can provide that. The inval callback just tells you the OID of the relation that got flushed; it's up to you to get from there to the plans you need to rebuild. Perhaps a hash table would work well. BTW, the inval callback had better just mark the plans invalid, not try to rebuild them right away. I don't think it's safe to try to do more database accesses while we're in the relcache invalidation path. "Rebuild plan on next attempted use" seems like a better idea. regards, tom lane
On Tue, Apr 30, 2002 at 09:43:29AM -0400, Tom Lane wrote: > Karel Zak <zakkr@zf.jcu.cz> writes: > > I have a question, how I will know how changes are relevant for my > > query plan? IMHO is needful some hight-level API, like > > > list = ExtractQueryPlanOids( plan ); > > reg = RegisterOidsCallback( list, mycallback, mycallbackdata ); > > Yes, some kind of routine to extract all the referenced relation OIDs The relations or others things like operators, functions, etc. Right? > in a plan tree would be a good idea. I can provide that. The inval > callback just tells you the OID of the relation that got flushed; it's > up to you to get from there to the plans you need to rebuild. Perhaps > a hash table would work well. There must be possible define some callback specific data too, thecallback maybe will search query in some own specific cacheand itrequire some key. > BTW, the inval callback had better just mark the plans invalid, not > try to rebuild them right away. I don't think it's safe to try to Hmm, it can depend on action, I can imagine: DROP TABLE tab; ERROR: mycallback(): can't rebuild the query used in PL/SQL function 'xyz'. Please drop this function first. ...tabledrop failed. This is maybe not possible implement now, but it's ideal conception:-) > do more database accesses while we're in the relcache invalidation > path. "Rebuild plan on next attempted use" seems like a better idea. Agree. It means store to cache query string too (I not sure if it'sused in current qcache, but it's simple). There can be one query cache and one cached planns checking only, for RI, SPI, PL/SQL, PREPARE/EXECUTE. Or not? I think implement4x same things is terrible. Karel -- Karel Zak <zakkr@zf.jcu.cz>http://home.zf.jcu.cz/~zakkr/C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
Karel Zak <zakkr@zf.jcu.cz> writes: > There must be possible define some callback specific data too, the > callback maybe will search query in some own specific cache and it > require some key. Yeah, I have it set up similarly to on_proc_exit callbacks: when you register the callback function you can also provide a Datum (typically a pointer). But I'm not sure that there's any real value in this. AFAICS, such callbacks are always going to be interested in looking through a cache data structure of some kind, and so there will always be global variables pointing to what they need to get to. (I'm envisioning *one* callback function handling a whole cache, not a callback for each entry!) >> BTW, the inval callback had better just mark the plans invalid, not >> try to rebuild them right away. I don't think it's safe to try to > Hmm, it can depend on action, I can imagine: > DROP TABLE tab; > ERROR: mycallback(): can't rebuild the query used in PL/SQL function > 'xyz'. Please drop this function first. > ...table drop failed. Nope, that is NOT okay. (1) The drop might have been done by some other backend. (2) Even if it was your own backend, the drop is already committed by the time you hear about it. You invalidate the plan, and then when and if that function is called again, you'll flag an error as a natural result of trying to recompute the plan. No shortcuts. (If we do want to prevent a drop in cases like this, it has to be done via the kind of dependency mechanism that Rod is working on.) > There can be one query cache and one cached planns checking only, for RI, > SPI, PL/SQL, PREPARE/EXECUTE. Or not? Hmm. Seems like it might be a good idea, but I'm not certain that all of these have exactly the same requirements. If they can all share one cache that'd definitely be a win. regards, tom lane
On Tue, Apr 30, 2002 at 10:39:38AM -0400, Tom Lane wrote: > Karel Zak <zakkr@zf.jcu.cz> writes: > > There must be possible define some callback specific data too, the > > callback maybe will search query in some own specific cache and it > > require some key. > > Yeah, I have it set up similarly to on_proc_exit callbacks: when you > register the callback function you can also provide a Datum (typically > a pointer). But I'm not sure that there's any real value in this. > AFAICS, such callbacks are always going to be interested in looking > through a cache data structure of some kind, and so there will always > be global variables pointing to what they need to get to. (I'm > envisioning *one* callback function handling a whole cache, not a > callback for each entry!) I understand. > >> BTW, the inval callback had better just mark the plans invalid, not > >> try to rebuild them right away. I don't think it's safe to try to > > > Hmm, it can depend on action, I can imagine: > > > DROP TABLE tab; > > > ERROR: mycallback(): can't rebuild the query used in PL/SQL function > > 'xyz'. Please drop this function first. > > > ...table drop failed. > > Nope, that is NOT okay. It was dream. > (If we do want to prevent a drop in cases like this, it has to be done > via the kind of dependency mechanism that Rod is working on.) It's good that someone works on dreams:-) > > There can be one query cache and one cached planns checking only, for RI, > > SPI, PL/SQL, PREPARE/EXECUTE. Or not? > > Hmm. Seems like it might be a good idea, but I'm not certain that all > of these have exactly the same requirements. If they can all share one > cache that'd definitely be a win. All they needs save query planns -- IMHO it's enough. It's alwayscouple "plan-identificator" + "plan-tree". There is not problem create branchs of the cache, separate for SPI,PL/SQL ..etc. But the _routines_ and API will _same_. branch = qCache_CreateBranch("PL/SQL");qCache_AddPlan(branch, plan, hashkey); Karel -- Karel Zak <zakkr@zf.jcu.cz>http://home.zf.jcu.cz/~zakkr/C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz