Thread: DISCARD ALL ; stored procedures

DISCARD ALL ; stored procedures

From
Stephen Frost
Date:
Greetings,
 Looking at discard all, I was a bit suprised that 'DISCARD PLANS;' doesn't clear out cached stored procedures.  To be
honest,that's one of the main reasons I'd see to use it.  I thought there had been some discussion in the archives
relatedto invalidating stored procedure plans due to catalog or other changes, I would have thought it'd be possible to
hookinto that to do the same on a DISCARD PLANS;.
 
 Thoughts?  Is there an issue doing that?  It certainly seems like it'd be a lot better than what he current
documentationrequires:
 
 When necessary, the cache can be flushed by starting a fresh database session.
 Maybe we could use 'DISCARD PLPLANS;' or something, if people feel there needs to be a seperate way to clear those.
     Thanks,
    Stephen

Re: DISCARD ALL ; stored procedures

From
Merlin Moncure
Date:
On Thu, Jan 6, 2011 at 3:20 PM, Stephen Frost <sfrost@snowman.net> wrote:
> Greetings,
>
>  Looking at discard all, I was a bit suprised that 'DISCARD PLANS;'
>  doesn't clear out cached stored procedures.  To be honest, that's one
>  of the main reasons I'd see to use it.  I thought there had been some
>  discussion in the archives related to invalidating stored procedure
>  plans due to catalog or other changes, I would have thought it'd be
>  possible to hook into that to do the same on a DISCARD PLANS;.
>
>  Thoughts?  Is there an issue doing that?  It certainly seems like it'd
>  be a lot better than what he current documentation requires:
>
>  When necessary, the cache can be flushed by starting a fresh database
>  session.
>
>  Maybe we could use 'DISCARD PLPLANS;' or something, if people feel
>  there needs to be a seperate way to clear those.

this is a problem. under what circumstances would you want to discard
them and why?  the main problem I see with cached plpgsql plans is
interactions with search_path -- but DISCARD might not be the best way
to attack that problem.  There might be other reasons though.

merlin


Re: DISCARD ALL ; stored procedures

From
Stephen Frost
Date:
* Merlin Moncure (mmoncure@gmail.com) wrote:
> this is a problem. under what circumstances would you want to discard
> them and why?  the main problem I see with cached plpgsql plans is
> interactions with search_path -- but DISCARD might not be the best way
> to attack that problem.  There might be other reasons though.

interaction w/ search_path (or, rather, lack of respect for it..) is
exactly the issue here for me.
Thanks,
    Stephen

Re: DISCARD ALL ; stored procedures

From
Merlin Moncure
Date:
On Thu, Jan 6, 2011 at 4:30 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Merlin Moncure (mmoncure@gmail.com) wrote:
>> this is a problem. under what circumstances would you want to discard
>> them and why?  the main problem I see with cached plpgsql plans is
>> interactions with search_path -- but DISCARD might not be the best way
>> to attack that problem.  There might be other reasons though.
>
> interaction w/ search_path (or, rather, lack of respect for it..) is
> exactly the issue here for me.

this has been discussed a couple of times -- a plausible alternative
might be to adjust the plan caching mechanism to organize the plan
cache around search_path.  that way you get a separate plan per
search_path instance.

discard has zero backwards compatibility issues but has one big
problem -- if you are using combination of connection pooling, lots of
plpgsql and search_path manipulation, you take a big performance hit.
in other words, even if you can discard everything., do you really
want to?

merlin


Re: DISCARD ALL ; stored procedures

From
Stephen Frost
Date:
* Merlin Moncure (mmoncure@gmail.com) wrote:
> this has been discussed a couple of times -- a plausible alternative
> might be to adjust the plan caching mechanism to organize the plan
> cache around search_path.  that way you get a separate plan per
> search_path instance.

That would certainly be fine for me.  To be honest, I feel like I've
even suggested that in the past, somewhere.

> discard has zero backwards compatibility issues but has one big
> problem -- if you are using combination of connection pooling, lots of
> plpgsql and search_path manipulation, you take a big performance hit.
> in other words, even if you can discard everything., do you really
> want to?

I don't see how this can be an unnecessary performance hit.  You might
argue that I should redesign things to not work this way, but that's a
whole different discussion.  At the moment the options are:

- switch to using execute
- force a full database reconnect

To get the 'correct' behavior.

If it's "performance" vs. "correctness", you can guess what I'm going to
vote for, however, in this case, I can't see how either of the other
options would perform better than a discard-like approach.  If people
are already using 'discard all;' then they're already throwing away
their plans for prepared queries, it strikes me as unlikely that they'd
have an issue with also getting rid of stored procedure plans.  If they
do, they could certainly use the individual 'discard' statements
instead (presuming we implement this with a new discard argument).
Thanks
    Stephen

Re: DISCARD ALL ; stored procedures

From
Robert Haas
Date:
On Thu, Jan 6, 2011 at 5:22 PM, Stephen Frost <sfrost@snowman.net> wrote:
> If it's "performance" vs. "correctness", you can guess what I'm going to
> vote for, however, in this case, I can't see how either of the other
> options would perform better than a discard-like approach.  If people
> are already using 'discard all;' then they're already throwing away
> their plans for prepared queries, it strikes me as unlikely that they'd
> have an issue with also getting rid of stored procedure plans.  If they
> do, they could certainly use the individual 'discard' statements
> instead (presuming we implement this with a new discard argument).

If DISCARD ALL doesn't flush this stuff, I'd consider that an outright
bug.  Does it?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: DISCARD ALL ; stored procedures

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> If DISCARD ALL doesn't flush this stuff, I'd consider that an outright
> bug.  Does it?

No, it does not, based on my testing against 8.4.5:

Simple function:

----------------------------
CREATE OR REPLACE FUNCTION test_func() RETURNS boolean
AS $_$
DECLARE rec RECORD;
BEGIN
   SELECT INTO rec statefp FROM edges LIMIT 1;
   raise notice 'rec: %', rec;
   RETURN TRUE;
END;
$_$ LANGUAGE plpgsql;
----------------------------

Couple of tables, note the CHECK constraints on statefp,which is what the function pulls out:
----------------------------
gis*=> \d tiger_02.addrfn                                Table "tiger_01.addrfn" Column  |         Type          |
               Modifiers                        
----------+-----------------------+------------------------------------------------------gid      | integer
 | not null default nextval('addrfn_gid_seq'::regclass)arid     | character varying(22) | linearid | character
varying(22)| statefp  | character varying(2)  | not null default '01'::character varying 
Indexes:   "addrfn_pkey" PRIMARY KEY, btree (gid)
Check constraints:   "addrfn_statefp_check" CHECK (statefp::text = '01'::text)
Inherits: tiger_us.addrfn
----------------------------

----------------------------
gis*=> \d tiger_02.addrfn                                Table "tiger_02.addrfn" Column  |         Type          |
               Modifiers                        
----------+-----------------------+------------------------------------------------------gid      | integer
 | not null default nextval('addrfn_gid_seq'::regclass)arid     | character varying(22) | linearid | character
varying(22)| statefp  | character varying(2)  | not null default '02'::character varying 
Indexes:   "addrfn_pkey" PRIMARY KEY, btree (gid)
Check constraints:   "addrfn_statefp_check" CHECK (statefp::text = '02'::text)
Inherits: tiger_us.addrfn
----------------------------

See the results:
gis=> \i ./qs.sql
CREATE FUNCTION
gis*=> set search_path to tiger_01;
SET
gis*=> set search_path to tiger_01,sfrost;
SET
gis*=> select test_func();
NOTICE:  rec: (01)test_func
-----------t
(1 row)

gis*=> commit;
COMMIT
gis=> discard all;
DISCARD ALL
gis=> set search_path to tiger_02,sfrost;
SET
gis*=> select test_func();
NOTICE:  rec: (01)test_func
-----------t
(1 row)

The addrfn table in the tiger_02 schema certainly can not have a
statefp of 01 due to the CHECK constraint (and it believe me, it's
right).

To be honest, I agree it's a bug, and I would *love* to have it
back-patched, but I could see an argument for it to be something
explicit from DISCARD PLANS; and would hence require a grammar
change which isn't something we'd typically back-patch.

Making it part of DISCARD PLANS; and back-patching it to 8.3 where
DISCARD was introduced would be awesome for me. :)
Thanks,
    Stephen

Re: DISCARD ALL ; stored procedures

From
Robert Haas
Date:
On Fri, Jan 7, 2011 at 8:43 AM, Stephen Frost <sfrost@snowman.net> wrote:
> To be honest, I agree it's a bug, and I would *love* to have it
> back-patched, but I could see an argument for it to be something
> explicit from DISCARD PLANS; and would hence require a grammar
> change which isn't something we'd typically back-patch.

That argument doesn't carry much water with me, because it's hard for
me to imagine a situation where you need to discard one of these
things but discarding the other one also causes a problem.  I'm also
pretty unexcited about adding more stuff to the grammar to cover such
a marginal borderline case.  Adding unreserved keywords is pretty
cheap, but it's not totally free.  If we were going to do it I'd
suggest DISCARD LANGUAGE PLANS or something like that, but I don't
really see a reason to go there at all.

Really it seems to me that changing the search path ought to discard
anything that might have been done differently had the search path
been different, but I don't think that's back-patch material.

> Making it part of DISCARD PLANS; and back-patching it to 8.3 where
> DISCARD was introduced would be awesome for me. :)

I'd need to look at this more closely before committing anything, but
at first blush I think that's reasonable.  Have a patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: DISCARD ALL ; stored procedures

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> Really it seems to me that changing the search path ought to discard
> anything that might have been done differently had the search path
> been different, but I don't think that's back-patch material.

I like that idea, but I agree it wouldn't be back-patchable, and I could
see arguments against it also (convoluting the GUC mechanics, etc).

> > Making it part of DISCARD PLANS; and back-patching it to 8.3 where
> > DISCARD was introduced would be awesome for me. :)
>
> I'd need to look at this more closely before committing anything, but
> at first blush I think that's reasonable.  Have a patch?

Sadly, no..  To be honest, I was fully expecting a response of "that's
hard to do."  I'm not sure we have any mechanics in place for throwing
away stored procedure plans, but I'll go look and see if I can come up
with something.  Would *love* to get this fixed.
Thanks!
    Stephen

Re: DISCARD ALL ; stored procedures

From
Stephen Frost
Date:
* Stephen Frost (sfrost@snowman.net) wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
> > > Making it part of DISCARD PLANS; and back-patching it to 8.3 where
> > > DISCARD was introduced would be awesome for me. :)
> >
> > I'd need to look at this more closely before committing anything, but
> > at first blush I think that's reasonable.  Have a patch?
>
> To be honest, I was fully expecting a response of "that's hard to do."

Soo, yeah, I found the "this is hard" part.  Basically, the plan
cacheing, etc, is all handled by the stored procedures themselves, and
we havn't got any way to tell a PL "destroy all your plans."  We might
be able to hack up fmgr to throw away all references to functions, but
that wouldn't release the memory they use up, making 'discard plans;'
leak like a sieve.

What's worse is that most PLs actually *also* allocate a bunch of stuff
in TopMemoryContext, meaning even if we figured out what contexts are
used by the PLs, we still couldn't nuke them.  Personally, I'm not a fan
of the PLs doing that (perhaps there's some performance reason?  dunno).

A few approaches come to mind for dealing with this-

#1. Add a new 'Top-but-removed-on-DISCARD' context and modify the PLs to   use that instead of TopMemoryContext and
requireany other contextsthey create to be children of it. 

#2. Add another entry point to the PLs in pg_pltemplate.h which is a   function to be called on DISCARD.

#3. Add a way for PLs to request a context similar to #1, but would be   on a per-PL basis.  This would involve adding
anew function tommgr/, or adding a new parameter for creating a context. 

I like the idea of having a context-per-PL, but I don't know that I can
justify the complications necessary to make it happen.  Given that
they're all already dumping things in TopMemoryContext, at least moving
them out of *that* would improve the situation, so my inclination is to
do #1.  Of course, user-added PLs wouldn't get this benefit and would
still leak memory after a DISCARD until they're updated.

After the above is figured out, we should be able to go through and
make fmgr get cleaned up after a DISCARD.

Thoughts?
Stephen

Re: DISCARD ALL ; stored procedures

From
Robert Haas
Date:
On Fri, Jan 7, 2011 at 1:29 PM, Stephen Frost <sfrost@snowman.net> wrote:
> #1. Add a new 'Top-but-removed-on-DISCARD' context and modify the PLs to
>    use that instead of TopMemoryContext and require any other contexts
>        they create to be children of it.

I'm guessing that just resetting the memory context is going to result
in things breaking all over the place - the PL might have dangling
pointers into the context.  And it might have other resources that we
don't know about.  Thus I think we need:

> #2. Add another entry point to the PLs in pg_pltemplate.h which is a
>    function to be called on DISCARD.

...except I think that the principal thing you need to modify is
pl_language, rather than pl_pltemplate.

If we go this route, then (1) it can't be back-patched, obviously, and
(2) we need to think a little bit harder about what we're asking to
have discarded, because I think it's going to be a lot more than just
cached plans.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: DISCARD ALL ; stored procedures

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Fri, Jan 7, 2011 at 1:29 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > #1. Add a new 'Top-but-removed-on-DISCARD' context and modify the PLs to
> >    use that instead of TopMemoryContext and require any other contexts
> >        they create to be children of it.
>
> I'm guessing that just resetting the memory context is going to result
> in things breaking all over the place - the PL might have dangling
> pointers into the context.

After looking through the code more, we actually already say "use this
context for stuff you allocate in fn_extra", but it doesn't look like
the PLs are respecting or using that.  We do have a function which
resets fn_extra already (fmgr_finfo_copy) but I'm not sure under what
conditions it's used and I'm not sure why it doesn't leak memory by
doing that.

If we can figure out the list of functions that have been called, get at
all of their fn_extra pointers to set them to NULL, and nuke the context
that they're created in, that should work.  The PLs in core appear to be
good about using fn_extra and resetting it should be sufficient to force
a recompile of the stored procedures.  It also looks like they shouldn't
have any issue surviving that reset.

> And it might have other resources that we
> don't know about.  Thus I think we need:

This is certainly a concern and would be a reason to offer a seperate
function for the PLs to use, but I'm not sure we need to jump there
right away.  I'd like to see if the core/contrib PLs can all handle the
above approach and then see if third-party PLs complain.

> > #2. Add another entry point to the PLs in pg_pltemplate.h which is a
> >    function to be called on DISCARD.
>
> ...except I think that the principal thing you need to modify is
> pl_language, rather than pl_pltemplate.

Right, sorry.

> If we go this route, then (1) it can't be back-patched, obviously, and
> (2) we need to think a little bit harder about what we're asking to
> have discarded, because I think it's going to be a lot more than just
> cached plans.

I'm not ready to give up quite yet, but I agree that we might end up
there.
Thanks,
    Stephen

Re: DISCARD ALL ; stored procedures

From
Stephen Frost
Date:
All,

Alright, so, the whole fn_extra stuff seems to be unrelated..  I'm not
sure when it's used (perhaps multiple calls to the same function in a
given query?), but the PLs have their own hash tables that they use for
storing functions that have been called.  I had assumed that was done
through fmgr, but apparently not (or at least, I can't find where..).
I'm starting to wonder if we're trying to do too much with this
though.  If all the PLs have to go through SPI to *get* plans (at least
ones we care about), perhaps we could just use SPI to implement the
plan invalidation?

Consider if we saved the DISCARD's transaction ID and store the
last-discard txn (or whenever the function was first prepared) in the
result of the SPI prepare and then detect if we need to switch to
replanning the query in SPI_execute_plan instead of just executing it.
Of course, we'd have to have enough info *to* replan it, but we should
be able to manage that.

Thoughts?
Stephen

Re: DISCARD ALL ; stored procedures

From
Robert Haas
Date:
On Sat, Jan 8, 2011 at 4:28 PM, Stephen Frost <sfrost@snowman.net> wrote:
> Thoughts?

Unfortunately, we've officially exceeded my level of knowledge to the
point where I can't comment intelligently.  Sorry.... :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: DISCARD ALL ; stored procedures

From
Alvaro Herrera
Date:
Excerpts from Stephen Frost's message of vie ene 07 15:29:52 -0300 2011:
> * Stephen Frost (sfrost@snowman.net) wrote:
> > * Robert Haas (robertmhaas@gmail.com) wrote:
> > > > Making it part of DISCARD PLANS; and back-patching it to 8.3 where
> > > > DISCARD was introduced would be awesome for me. :)
> > > 
> > > I'd need to look at this more closely before committing anything, but
> > > at first blush I think that's reasonable.  Have a patch?
> > 
> > To be honest, I was fully expecting a response of "that's hard to do."
> 
> Soo, yeah, I found the "this is hard" part.  Basically, the plan
> cacheing, etc, is all handled by the stored procedures themselves, and
> we havn't got any way to tell a PL "destroy all your plans."  We might
> be able to hack up fmgr to throw away all references to functions, but
> that wouldn't release the memory they use up, making 'discard plans;'
> leak like a sieve.

What this discussion suggests to me is that cached plans need to be
tracked by a resource owner that's linked to the function.  The problem
I see with this idea is figure out what module would keep track of
resowners that need to be reset ...  Other than that I think it should
be straightforward :-)

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support