Thread: DISCARD ALL ; stored procedures
Greetings, Looking at discard all, I was a bit suprised that 'DISCARD PLANS;' doesn't clear out cached stored procedures. To be honest,that's one of the main reasons I'd see to use it. I thought there had been some discussion in the archives relatedto invalidating stored procedure plans due to catalog or other changes, I would have thought it'd be possible to hookinto that to do the same on a DISCARD PLANS;. Thoughts? Is there an issue doing that? It certainly seems like it'd be a lot better than what he current documentationrequires: When necessary, the cache can be flushed by starting a fresh database session. Maybe we could use 'DISCARD PLPLANS;' or something, if people feel there needs to be a seperate way to clear those. Thanks, Stephen
On Thu, Jan 6, 2011 at 3:20 PM, Stephen Frost <sfrost@snowman.net> wrote: > Greetings, > > Looking at discard all, I was a bit suprised that 'DISCARD PLANS;' > doesn't clear out cached stored procedures. To be honest, that's one > of the main reasons I'd see to use it. I thought there had been some > discussion in the archives related to invalidating stored procedure > plans due to catalog or other changes, I would have thought it'd be > possible to hook into that to do the same on a DISCARD PLANS;. > > Thoughts? Is there an issue doing that? It certainly seems like it'd > be a lot better than what he current documentation requires: > > When necessary, the cache can be flushed by starting a fresh database > session. > > Maybe we could use 'DISCARD PLPLANS;' or something, if people feel > there needs to be a seperate way to clear those. this is a problem. under what circumstances would you want to discard them and why? the main problem I see with cached plpgsql plans is interactions with search_path -- but DISCARD might not be the best way to attack that problem. There might be other reasons though. merlin
* Merlin Moncure (mmoncure@gmail.com) wrote: > this is a problem. under what circumstances would you want to discard > them and why? the main problem I see with cached plpgsql plans is > interactions with search_path -- but DISCARD might not be the best way > to attack that problem. There might be other reasons though. interaction w/ search_path (or, rather, lack of respect for it..) is exactly the issue here for me. Thanks, Stephen
On Thu, Jan 6, 2011 at 4:30 PM, Stephen Frost <sfrost@snowman.net> wrote: > * Merlin Moncure (mmoncure@gmail.com) wrote: >> this is a problem. under what circumstances would you want to discard >> them and why? the main problem I see with cached plpgsql plans is >> interactions with search_path -- but DISCARD might not be the best way >> to attack that problem. There might be other reasons though. > > interaction w/ search_path (or, rather, lack of respect for it..) is > exactly the issue here for me. this has been discussed a couple of times -- a plausible alternative might be to adjust the plan caching mechanism to organize the plan cache around search_path. that way you get a separate plan per search_path instance. discard has zero backwards compatibility issues but has one big problem -- if you are using combination of connection pooling, lots of plpgsql and search_path manipulation, you take a big performance hit. in other words, even if you can discard everything., do you really want to? merlin
* Merlin Moncure (mmoncure@gmail.com) wrote: > this has been discussed a couple of times -- a plausible alternative > might be to adjust the plan caching mechanism to organize the plan > cache around search_path. that way you get a separate plan per > search_path instance. That would certainly be fine for me. To be honest, I feel like I've even suggested that in the past, somewhere. > discard has zero backwards compatibility issues but has one big > problem -- if you are using combination of connection pooling, lots of > plpgsql and search_path manipulation, you take a big performance hit. > in other words, even if you can discard everything., do you really > want to? I don't see how this can be an unnecessary performance hit. You might argue that I should redesign things to not work this way, but that's a whole different discussion. At the moment the options are: - switch to using execute - force a full database reconnect To get the 'correct' behavior. If it's "performance" vs. "correctness", you can guess what I'm going to vote for, however, in this case, I can't see how either of the other options would perform better than a discard-like approach. If people are already using 'discard all;' then they're already throwing away their plans for prepared queries, it strikes me as unlikely that they'd have an issue with also getting rid of stored procedure plans. If they do, they could certainly use the individual 'discard' statements instead (presuming we implement this with a new discard argument). Thanks Stephen
On Thu, Jan 6, 2011 at 5:22 PM, Stephen Frost <sfrost@snowman.net> wrote: > If it's "performance" vs. "correctness", you can guess what I'm going to > vote for, however, in this case, I can't see how either of the other > options would perform better than a discard-like approach. If people > are already using 'discard all;' then they're already throwing away > their plans for prepared queries, it strikes me as unlikely that they'd > have an issue with also getting rid of stored procedure plans. If they > do, they could certainly use the individual 'discard' statements > instead (presuming we implement this with a new discard argument). If DISCARD ALL doesn't flush this stuff, I'd consider that an outright bug. Does it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote: > If DISCARD ALL doesn't flush this stuff, I'd consider that an outright > bug. Does it? No, it does not, based on my testing against 8.4.5: Simple function: ---------------------------- CREATE OR REPLACE FUNCTION test_func() RETURNS boolean AS $_$ DECLARE rec RECORD; BEGIN SELECT INTO rec statefp FROM edges LIMIT 1; raise notice 'rec: %', rec; RETURN TRUE; END; $_$ LANGUAGE plpgsql; ---------------------------- Couple of tables, note the CHECK constraints on statefp,which is what the function pulls out: ---------------------------- gis*=> \d tiger_02.addrfn Table "tiger_01.addrfn" Column | Type | Modifiers ----------+-----------------------+------------------------------------------------------gid | integer | not null default nextval('addrfn_gid_seq'::regclass)arid | character varying(22) | linearid | character varying(22)| statefp | character varying(2) | not null default '01'::character varying Indexes: "addrfn_pkey" PRIMARY KEY, btree (gid) Check constraints: "addrfn_statefp_check" CHECK (statefp::text = '01'::text) Inherits: tiger_us.addrfn ---------------------------- ---------------------------- gis*=> \d tiger_02.addrfn Table "tiger_02.addrfn" Column | Type | Modifiers ----------+-----------------------+------------------------------------------------------gid | integer | not null default nextval('addrfn_gid_seq'::regclass)arid | character varying(22) | linearid | character varying(22)| statefp | character varying(2) | not null default '02'::character varying Indexes: "addrfn_pkey" PRIMARY KEY, btree (gid) Check constraints: "addrfn_statefp_check" CHECK (statefp::text = '02'::text) Inherits: tiger_us.addrfn ---------------------------- See the results: gis=> \i ./qs.sql CREATE FUNCTION gis*=> set search_path to tiger_01; SET gis*=> set search_path to tiger_01,sfrost; SET gis*=> select test_func(); NOTICE: rec: (01)test_func -----------t (1 row) gis*=> commit; COMMIT gis=> discard all; DISCARD ALL gis=> set search_path to tiger_02,sfrost; SET gis*=> select test_func(); NOTICE: rec: (01)test_func -----------t (1 row) The addrfn table in the tiger_02 schema certainly can not have a statefp of 01 due to the CHECK constraint (and it believe me, it's right). To be honest, I agree it's a bug, and I would *love* to have it back-patched, but I could see an argument for it to be something explicit from DISCARD PLANS; and would hence require a grammar change which isn't something we'd typically back-patch. Making it part of DISCARD PLANS; and back-patching it to 8.3 where DISCARD was introduced would be awesome for me. :) Thanks, Stephen
On Fri, Jan 7, 2011 at 8:43 AM, Stephen Frost <sfrost@snowman.net> wrote: > To be honest, I agree it's a bug, and I would *love* to have it > back-patched, but I could see an argument for it to be something > explicit from DISCARD PLANS; and would hence require a grammar > change which isn't something we'd typically back-patch. That argument doesn't carry much water with me, because it's hard for me to imagine a situation where you need to discard one of these things but discarding the other one also causes a problem. I'm also pretty unexcited about adding more stuff to the grammar to cover such a marginal borderline case. Adding unreserved keywords is pretty cheap, but it's not totally free. If we were going to do it I'd suggest DISCARD LANGUAGE PLANS or something like that, but I don't really see a reason to go there at all. Really it seems to me that changing the search path ought to discard anything that might have been done differently had the search path been different, but I don't think that's back-patch material. > Making it part of DISCARD PLANS; and back-patching it to 8.3 where > DISCARD was introduced would be awesome for me. :) I'd need to look at this more closely before committing anything, but at first blush I think that's reasonable. Have a patch? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote: > Really it seems to me that changing the search path ought to discard > anything that might have been done differently had the search path > been different, but I don't think that's back-patch material. I like that idea, but I agree it wouldn't be back-patchable, and I could see arguments against it also (convoluting the GUC mechanics, etc). > > Making it part of DISCARD PLANS; and back-patching it to 8.3 where > > DISCARD was introduced would be awesome for me. :) > > I'd need to look at this more closely before committing anything, but > at first blush I think that's reasonable. Have a patch? Sadly, no.. To be honest, I was fully expecting a response of "that's hard to do." I'm not sure we have any mechanics in place for throwing away stored procedure plans, but I'll go look and see if I can come up with something. Would *love* to get this fixed. Thanks! Stephen
* Stephen Frost (sfrost@snowman.net) wrote: > * Robert Haas (robertmhaas@gmail.com) wrote: > > > Making it part of DISCARD PLANS; and back-patching it to 8.3 where > > > DISCARD was introduced would be awesome for me. :) > > > > I'd need to look at this more closely before committing anything, but > > at first blush I think that's reasonable. Have a patch? > > To be honest, I was fully expecting a response of "that's hard to do." Soo, yeah, I found the "this is hard" part. Basically, the plan cacheing, etc, is all handled by the stored procedures themselves, and we havn't got any way to tell a PL "destroy all your plans." We might be able to hack up fmgr to throw away all references to functions, but that wouldn't release the memory they use up, making 'discard plans;' leak like a sieve. What's worse is that most PLs actually *also* allocate a bunch of stuff in TopMemoryContext, meaning even if we figured out what contexts are used by the PLs, we still couldn't nuke them. Personally, I'm not a fan of the PLs doing that (perhaps there's some performance reason? dunno). A few approaches come to mind for dealing with this- #1. Add a new 'Top-but-removed-on-DISCARD' context and modify the PLs to use that instead of TopMemoryContext and requireany other contextsthey create to be children of it. #2. Add another entry point to the PLs in pg_pltemplate.h which is a function to be called on DISCARD. #3. Add a way for PLs to request a context similar to #1, but would be on a per-PL basis. This would involve adding anew function tommgr/, or adding a new parameter for creating a context. I like the idea of having a context-per-PL, but I don't know that I can justify the complications necessary to make it happen. Given that they're all already dumping things in TopMemoryContext, at least moving them out of *that* would improve the situation, so my inclination is to do #1. Of course, user-added PLs wouldn't get this benefit and would still leak memory after a DISCARD until they're updated. After the above is figured out, we should be able to go through and make fmgr get cleaned up after a DISCARD. Thoughts? Stephen
On Fri, Jan 7, 2011 at 1:29 PM, Stephen Frost <sfrost@snowman.net> wrote: > #1. Add a new 'Top-but-removed-on-DISCARD' context and modify the PLs to > use that instead of TopMemoryContext and require any other contexts > they create to be children of it. I'm guessing that just resetting the memory context is going to result in things breaking all over the place - the PL might have dangling pointers into the context. And it might have other resources that we don't know about. Thus I think we need: > #2. Add another entry point to the PLs in pg_pltemplate.h which is a > function to be called on DISCARD. ...except I think that the principal thing you need to modify is pl_language, rather than pl_pltemplate. If we go this route, then (1) it can't be back-patched, obviously, and (2) we need to think a little bit harder about what we're asking to have discarded, because I think it's going to be a lot more than just cached plans. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote: > On Fri, Jan 7, 2011 at 1:29 PM, Stephen Frost <sfrost@snowman.net> wrote: > > #1. Add a new 'Top-but-removed-on-DISCARD' context and modify the PLs to > > use that instead of TopMemoryContext and require any other contexts > > they create to be children of it. > > I'm guessing that just resetting the memory context is going to result > in things breaking all over the place - the PL might have dangling > pointers into the context. After looking through the code more, we actually already say "use this context for stuff you allocate in fn_extra", but it doesn't look like the PLs are respecting or using that. We do have a function which resets fn_extra already (fmgr_finfo_copy) but I'm not sure under what conditions it's used and I'm not sure why it doesn't leak memory by doing that. If we can figure out the list of functions that have been called, get at all of their fn_extra pointers to set them to NULL, and nuke the context that they're created in, that should work. The PLs in core appear to be good about using fn_extra and resetting it should be sufficient to force a recompile of the stored procedures. It also looks like they shouldn't have any issue surviving that reset. > And it might have other resources that we > don't know about. Thus I think we need: This is certainly a concern and would be a reason to offer a seperate function for the PLs to use, but I'm not sure we need to jump there right away. I'd like to see if the core/contrib PLs can all handle the above approach and then see if third-party PLs complain. > > #2. Add another entry point to the PLs in pg_pltemplate.h which is a > > function to be called on DISCARD. > > ...except I think that the principal thing you need to modify is > pl_language, rather than pl_pltemplate. Right, sorry. > If we go this route, then (1) it can't be back-patched, obviously, and > (2) we need to think a little bit harder about what we're asking to > have discarded, because I think it's going to be a lot more than just > cached plans. I'm not ready to give up quite yet, but I agree that we might end up there. Thanks, Stephen
All, Alright, so, the whole fn_extra stuff seems to be unrelated.. I'm not sure when it's used (perhaps multiple calls to the same function in a given query?), but the PLs have their own hash tables that they use for storing functions that have been called. I had assumed that was done through fmgr, but apparently not (or at least, I can't find where..). I'm starting to wonder if we're trying to do too much with this though. If all the PLs have to go through SPI to *get* plans (at least ones we care about), perhaps we could just use SPI to implement the plan invalidation? Consider if we saved the DISCARD's transaction ID and store the last-discard txn (or whenever the function was first prepared) in the result of the SPI prepare and then detect if we need to switch to replanning the query in SPI_execute_plan instead of just executing it. Of course, we'd have to have enough info *to* replan it, but we should be able to manage that. Thoughts? Stephen
On Sat, Jan 8, 2011 at 4:28 PM, Stephen Frost <sfrost@snowman.net> wrote: > Thoughts? Unfortunately, we've officially exceeded my level of knowledge to the point where I can't comment intelligently. Sorry.... :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Excerpts from Stephen Frost's message of vie ene 07 15:29:52 -0300 2011: > * Stephen Frost (sfrost@snowman.net) wrote: > > * Robert Haas (robertmhaas@gmail.com) wrote: > > > > Making it part of DISCARD PLANS; and back-patching it to 8.3 where > > > > DISCARD was introduced would be awesome for me. :) > > > > > > I'd need to look at this more closely before committing anything, but > > > at first blush I think that's reasonable. Have a patch? > > > > To be honest, I was fully expecting a response of "that's hard to do." > > Soo, yeah, I found the "this is hard" part. Basically, the plan > cacheing, etc, is all handled by the stored procedures themselves, and > we havn't got any way to tell a PL "destroy all your plans." We might > be able to hack up fmgr to throw away all references to functions, but > that wouldn't release the memory they use up, making 'discard plans;' > leak like a sieve. What this discussion suggests to me is that cached plans need to be tracked by a resource owner that's linked to the function. The problem I see with this idea is figure out what module would keep track of resowners that need to be reset ... Other than that I think it should be straightforward :-) -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support