Thread: pl/python do not delete function arguments
(continuing the flurry of patches) Here's a patch that stops PL/Python from removing the function's arguments from its globals dict after calling it. It's an incremental patch on top of the plpython-refactor patch sent in http://archives.postgresql.org/message-id/4D135170.3080705@wulczer.org. Git branch for this patch: https://github.com/wulczer/postgres/tree/dont-remove-arguments Apart from being useless, as the whole dict is unreffed and thus freed in PLy_procedure_delete, removing args actively breaks things for recursive invocation of the same function. The recursive callee after returning will remove the args from globals, and subsequent access to the arguments in the caller will cause a NameError (see new regression test in patch). Cheers, Jan
Attachment
2010/12/31 Jan Urbański <wulczer@wulczer.org>: > (continuing the flurry of patches) > > Here's a patch that stops PL/Python from removing the function's > arguments from its globals dict after calling it. It's > an incremental patch on top of the plpython-refactor patch sent in > http://archives.postgresql.org/message-id/4D135170.3080705@wulczer.org. > > Git branch for this patch: > https://github.com/wulczer/postgres/tree/dont-remove-arguments > > Apart from being useless, as the whole dict is unreffed and thus freed > in PLy_procedure_delete, removing args actively breaks things for > recursive invocation of the same function. The recursive callee after > returning will remove the args from globals, and subsequent access to > the arguments in the caller will cause a NameError (see new regression > test in patch). I've reviewed this. The patch is old enough to be rejected by patch command, but I manged to apply it by hand. It compiles clean. Added tests pass. I created fibonacci function similar to recursion_test in the patch and confirmed the recursion raises error on 9.0 but not on 9.1. Doc is not with the patch since this change is to remove unnecessary optimization internally. "Ready for Committer" Regards, -- Hitoshi Harada
On 09/02/11 04:52, Hitoshi Harada wrote: > 2010/12/31 Jan Urbański <wulczer@wulczer.org>: >> (continuing the flurry of patches) >> >> Here's a patch that stops PL/Python from removing the function's >> arguments from its globals dict after calling it. It's >> an incremental patch on top of the plpython-refactor patch sent in >> http://archives.postgresql.org/message-id/4D135170.3080705@wulczer.org. >> >> Git branch for this patch: >> https://github.com/wulczer/postgres/tree/dont-remove-arguments >> >> Apart from being useless, as the whole dict is unreffed and thus freed >> in PLy_procedure_delete, removing args actively breaks things for >> recursive invocation of the same function. The recursive callee after >> returning will remove the args from globals, and subsequent access to >> the arguments in the caller will cause a NameError (see new regression >> test in patch). > > I've reviewed this. The patch is old enough to be rejected by patch > command, but I manged to apply it by hand. > It compiles clean. Added tests pass. > I created fibonacci function similar to recursion_test in the patch > and confirmed the recursion raises error on 9.0 but not on 9.1. > Doc is not with the patch since this change is to remove unnecessary > optimization internally. > > "Ready for Committer" Thanks, patch merged with HEAD attached. Jan
Attachment
On ons, 2011-02-09 at 10:02 +0100, Jan Urbański wrote: > On 09/02/11 04:52, Hitoshi Harada wrote: > > 2010/12/31 Jan Urbański <wulczer@wulczer.org>: > >> (continuing the flurry of patches) > >> > >> Here's a patch that stops PL/Python from removing the function's > >> arguments from its globals dict after calling it. It's > >> an incremental patch on top of the plpython-refactor patch sent in > >> http://archives.postgresql.org/message-id/4D135170.3080705@wulczer.org. > >> > >> Git branch for this patch: > >> https://github.com/wulczer/postgres/tree/dont-remove-arguments > >> > >> Apart from being useless, as the whole dict is unreffed and thus freed > >> in PLy_procedure_delete, removing args actively breaks things for > >> recursive invocation of the same function. The recursive callee after > >> returning will remove the args from globals, and subsequent access to > >> the arguments in the caller will cause a NameError (see new regression > >> test in patch). > > > > I've reviewed this. The patch is old enough to be rejected by patch > > command, but I manged to apply it by hand. > > It compiles clean. Added tests pass. > > I created fibonacci function similar to recursion_test in the patch > > and confirmed the recursion raises error on 9.0 but not on 9.1. > > Doc is not with the patch since this change is to remove unnecessary > > optimization internally. > > > > "Ready for Committer" > > Thanks, > > patch merged with HEAD attached. Curiously, without the patch the recursion_test(4) call fails but recursion_test(5) passes. Anyone know why? Btw., I get a KeyError, not a NameError as you say above.
On 14/02/11 21:06, Peter Eisentraut wrote: > On ons, 2011-02-09 at 10:02 +0100, Jan Urbański wrote: >> On 09/02/11 04:52, Hitoshi Harada wrote: >>> 2010/12/31 Jan Urbański <wulczer@wulczer.org>: >>>> (continuing the flurry of patches) >>>> >>>> Here's a patch that stops PL/Python from removing the function's >>>> arguments from its globals dict after calling it. It's >>>> an incremental patch on top of the plpython-refactor patch sent in >>>> http://archives.postgresql.org/message-id/4D135170.3080705@wulczer.org. >>>> >>>> Git branch for this patch: >>>> https://github.com/wulczer/postgres/tree/dont-remove-arguments >>>> >>>> Apart from being useless, as the whole dict is unreffed and thus freed >>>> in PLy_procedure_delete, removing args actively breaks things for >>>> recursive invocation of the same function. The recursive callee after >>>> returning will remove the args from globals, and subsequent access to >>>> the arguments in the caller will cause a NameError (see new regression >>>> test in patch). >>> >>> I've reviewed this. The patch is old enough to be rejected by patch >>> command, but I manged to apply it by hand. >>> It compiles clean. Added tests pass. >>> I created fibonacci function similar to recursion_test in the patch >>> and confirmed the recursion raises error on 9.0 but not on 9.1. >>> Doc is not with the patch since this change is to remove unnecessary >>> optimization internally. >>> >>> "Ready for Committer" >> >> Thanks, >> >> patch merged with HEAD attached. > > Curiously, without the patch the recursion_test(4) call fails but > recursion_test(5) passes. Anyone know why? Damn, I remember that bug and thought I fixed it. I think calls with even numbers passed and calls with odd numbers failed... I'll try to see if a merge error did not introduce that bug back. > Btw., I get a KeyError, not a NameError as you say above. Will look into that too. Cheers, Jan
On 14/02/11 22:13, Jan Urbański wrote: > On 14/02/11 21:06, Peter Eisentraut wrote: >> On ons, 2011-02-09 at 10:02 +0100, Jan Urbański wrote: >>> On 09/02/11 04:52, Hitoshi Harada wrote: >>>> 2010/12/31 Jan Urbański <wulczer@wulczer.org>: >>>>> (continuing the flurry of patches) >>>>> >>>>> Here's a patch that stops PL/Python from removing the function's >>>>> arguments from its globals dict after calling it. It's >>>>> an incremental patch on top of the plpython-refactor patch sent in >>>>> http://archives.postgresql.org/message-id/4D135170.3080705@wulczer.org. >>>>> >>>>> Git branch for this patch: >>>>> https://github.com/wulczer/postgres/tree/dont-remove-arguments >>>>> >>>>> Apart from being useless, as the whole dict is unreffed and thus freed >>>>> in PLy_procedure_delete, removing args actively breaks things for >>>>> recursive invocation of the same function. The recursive callee after >>>>> returning will remove the args from globals, and subsequent access to >>>>> the arguments in the caller will cause a NameError (see new regression >>>>> test in patch). >>>> >>>> I've reviewed this. The patch is old enough to be rejected by patch >>>> command, but I manged to apply it by hand. >>>> It compiles clean. Added tests pass. >>>> I created fibonacci function similar to recursion_test in the patch >>>> and confirmed the recursion raises error on 9.0 but not on 9.1. >>>> Doc is not with the patch since this change is to remove unnecessary >>>> optimization internally. >>>> >>>> "Ready for Committer" >>> >>> Thanks, >>> >>> patch merged with HEAD attached. >> >> Curiously, without the patch the recursion_test(4) call fails but >> recursion_test(5) passes. Anyone know why? Baah, damn, did not read the "without patch" part. The problem is that every *second* call to the function fails, regardless of the number. The first execution succeeds, but then PLy_delete_args deletes the argument from the globals, and when the next execution tries to fetch "n" from it, it raises a KeyError. Jan
On mån, 2011-02-14 at 22:22 +0100, Jan Urbański wrote: > The problem is that every *second* call to the function fails, > regardless of the number. The first execution succeeds, but then > PLy_delete_args deletes the argument from the globals, and when the > next execution tries to fetch "n" from it, it raises a KeyError. This isn't quite right either, because it obviously depends on the recursion somehow. So in SELECT recursion_test(5); SELECT recursion_test(4); it is the first recursive invocation of the (4) call that fails. If you just do SELECT recursion_test(1); SELECT recursion_test(1); SELECT recursion_test(1); nothing fails. (We'd have noticed that sooner, obviously. ;-) ) But in SELECT recursion_test(1); SELECT recursion_test(4); SELECT recursion_test(1); it's the last (1) call, which is not recursive, that fails. Your patch obviously fixes all that, but I'm wondering if we have another problem with the procedure caching somehow. And I'm also wondering what to put into the commit message. :-/
----- Original message ----- > On mån, 2011-02-14 at 22:22 +0100, Jan Urbański wrote: > > The problem is that every *second* call to the function fails, > > regardless of the number. The first execution succeeds, but then > > PLy_delete_args deletes the argument from the globals, and when the > > next execution tries to fetch "n" from it, it raises a KeyError. > > This isn't quite right either, because it obviously depends on the > recursion somehow. So in > > SELECT recursion_test(5); > SELECT recursion_test(4); > > it is the first recursive invocation of the (4) call that fails. If you > just do > > SELECT recursion_test(1); > SELECT recursion_test(1); > SELECT recursion_test(1); > > nothing fails. (We'd have noticed that sooner, obviously. ;-) ) Isn't that because with 1 there is no recursion, i.e. plpy.execute never gets called from Python? > But in > > SELECT recursion_test(1); > SELECT recursion_test(4); > SELECT recursion_test(1); > > it's the last (1) call, which is not recursive, that fails. Because the invocation that actually recurses sets up the scene for failure. Jan
On tis, 2011-02-15 at 09:58 +0100, Jan Urbański wrote: > Because the invocation that actually recurses sets up the scene for > failure. That's what we're observing, but I can't figure out why it is. If you can, could you explain it? It actually makes sense to me that the arguments should be deleted at the end of the call. The data belongs to that call only, and PLy_procedure_delete() that would otherwise clean it up is only called rarely. Apparently, the recursive call ends up deleting the wrong arguments, but it's not clear to me why that would affect the next top-level call, because that would set up its own arguments again anyway. In any case, perhaps the right fix is to fix PLy_function_delete_args() to delete the args correctly.
On 15/02/11 20:39, Peter Eisentraut wrote: > On tis, 2011-02-15 at 09:58 +0100, Jan Urbański wrote: >> Because the invocation that actually recurses sets up the scene for >> failure. > > That's what we're observing, but I can't figure out why it is. If you > can, could you explain it? > > It actually makes sense to me that the arguments should be deleted at > the end of the call. The data belongs to that call only, and > PLy_procedure_delete() that would otherwise clean it up is only called > rarely. > > Apparently, the recursive call ends up deleting the wrong arguments, but > it's not clear to me why that would affect the next top-level call, > because that would set up its own arguments again anyway. In any case, > perhaps the right fix is to fix PLy_function_delete_args() to delete the > args correctly. Aaah, ok, I got it (again). Let me write this in full before I forget and spend another hour chasing that bug (and boy, bugs that disappear because you're doing things in the debugger are so annoying). And actually, my patch doesn't fix it fully :| Let me demonstrate: CREATE FUNCTION rec(n integer) RETURNS integer AS $$ if n == 0: return plpy.notice("before n is %d" % n) plpy.execute("select rec(0)") plpy.notice("after n is %d" % n) $$ LANGUAGE plpythonu; Without the patch the second plpy.notice raises a NameError. With the patch the output is: NOTICE: before n is 4 CONTEXT: PL/Python function "rec" NOTICE: after n is 0 CONTEXT: PL/Python function "rec" What happens? In PLy_function_handler, PLy_function_build_args is called, and proc->globals is set. After that PLy_procedure_call is called, which starts executing Python code. The Python code does a call into C with plpy.execute, and PLy_function_handler gets called (a reentrant call). Then PLy_function_build_args is called again. It overwrites the "n" entry in proc->globals and then PLy_procedure_call gets called, which drops us back into Python (on the stack there's now C, Python, C, Python). This second invocation exits quickly because n == 0, and we're back in C. Now without my patch, the next thing to happen was deleting the arguments, which removed "n" from the proc->globals dict. The rest of C code runs and finally plpy.execute returns and we;re back in Python (the stack is C, Python). The second plpy.notice is run, which fetches "n" from the globals, and not finding it, raises a NameError. With the patch it simply fetches the overwritten value, namely 0. The KeyError was a red herring - that's how Python reacted when evaluating "n in (0, 1)", and if you look in the server log you'll see a RuntimeWarning complaining about something internal, that doesn't matter. The bottom line is that PLy_procedure_call is not reentrant because of proc->globals, and it has to be. Now when fixing this bug I tries copying the globals dict and restoring it, but ran into issues (I think the problem was that the function didn't like running with different globals then the one it has been compiled with). Not sure what to do with this :( Document it as a caveat (with or without my patch) and carry on? That sucks quite badly... Jan
On Tue, Feb 15, 2011 at 6:04 PM, Jan Urbański <wulczer@wulczer.org> wrote: > On 15/02/11 20:39, Peter Eisentraut wrote: >> On tis, 2011-02-15 at 09:58 +0100, Jan Urbański wrote: >>> Because the invocation that actually recurses sets up the scene for >>> failure. >> >> That's what we're observing, but I can't figure out why it is. If you >> can, could you explain it? >> >> It actually makes sense to me that the arguments should be deleted at >> the end of the call. The data belongs to that call only, and >> PLy_procedure_delete() that would otherwise clean it up is only called >> rarely. >> >> Apparently, the recursive call ends up deleting the wrong arguments, but >> it's not clear to me why that would affect the next top-level call, >> because that would set up its own arguments again anyway. In any case, >> perhaps the right fix is to fix PLy_function_delete_args() to delete the >> args correctly. > > Aaah, ok, I got it (again). Let me write this in full before I forget > and spend another hour chasing that bug (and boy, bugs that disappear > because you're doing things in the debugger are so annoying). And > actually, my patch doesn't fix it fully :| Let me demonstrate: > > CREATE FUNCTION rec(n integer) RETURNS integer AS $$ > if n == 0: > return > plpy.notice("before n is %d" % n) > plpy.execute("select rec(0)") > plpy.notice("after n is %d" % n) > $$ LANGUAGE plpythonu; > > Without the patch the second plpy.notice raises a NameError. With the > patch the output is: > > NOTICE: before n is 4 > CONTEXT: PL/Python function "rec" > NOTICE: after n is 0 > CONTEXT: PL/Python function "rec" > > What happens? In PLy_function_handler, PLy_function_build_args is > called, and proc->globals is set. After that PLy_procedure_call is > called, which starts executing Python code. The Python code does a call > into C with plpy.execute, and PLy_function_handler gets called (a > reentrant call). > > Then PLy_function_build_args is called again. It overwrites the "n" > entry in proc->globals and then PLy_procedure_call gets called, which > drops us back into Python (on the stack there's now C, Python, C, > Python). This second invocation exits quickly because n == 0, and we're > back in C. > > Now without my patch, the next thing to happen was deleting the > arguments, which removed "n" from the proc->globals dict. The rest of C > code runs and finally plpy.execute returns and we;re back in Python (the > stack is C, Python). > > The second plpy.notice is run, which fetches "n" from the globals, and > not finding it, raises a NameError. With the patch it simply fetches the > overwritten value, namely 0. > > The KeyError was a red herring - that's how Python reacted when > evaluating "n in (0, 1)", and if you look in the server log you'll see a > RuntimeWarning complaining about something internal, that doesn't > matter. The bottom line is that PLy_procedure_call is not reentrant > because of proc->globals, and it has to be. > > Now when fixing this bug I tries copying the globals dict and restoring > it, but ran into issues (I think the problem was that the function > didn't like running with different globals then the one it has been > compiled with). Not sure what to do with this :( Document it as a caveat > (with or without my patch) and carry on? That sucks quite badly... From this discussion I gather that we have a problem here that we don't exactly know how to fix, so I'm inclined to suggest that we mark this Returned with Feedback in the CommitFest and instead add it to the TODO. Since this is a pre-existing bug and not a new regression, it should not be something we hold up beta for. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
----- Original message ----- > On Tue, Feb 15, 2011 at 6:04 PM, Jan Urbański <wulczer@wulczer.org> > wrote: > > On 15/02/11 20:39, Peter Eisentraut wrote: > > > On tis, 2011-02-15 at 09:58 +0100, Jan Urbański wrote: > > > > [a bug that we don't know how to fix] > From this discussion I gather that we have a problem here that we > don't exactly know how to fix, so I'm inclined to suggest that we mark > this Returned with Feedback in the CommitFest and instead add it to > the TODO. Since this is a pre-existing bug and not a new regression, > it should not be something we hold up beta for. I'm officially at a loss on how to fix that bug without some serious gutting of how PL/Python arguments work. If someonecomes up with a brilliant way to solve this problem, we can commit it after beta, or even during the 9.2 cycle (shouldthe brilliant solution be backpatcheable). Jan
2011/2/26 Jan Urbański <wulczer@wulczer.org>: > ----- Original message ----- >> On Tue, Feb 15, 2011 at 6:04 PM, Jan Urbański <wulczer@wulczer.org> >> wrote: >> > On 15/02/11 20:39, Peter Eisentraut wrote: >> > > On tis, 2011-02-15 at 09:58 +0100, Jan Urbański wrote: >> > > > [a bug that we don't know how to fix] > >> From this discussion I gather that we have a problem here that we >> don't exactly know how to fix, so I'm inclined to suggest that we mark >> this Returned with Feedback in the CommitFest and instead add it to >> the TODO. Since this is a pre-existing bug and not a new regression, >> it should not be something we hold up beta for. > > I'm officially at a loss on how to fix that bug without some serious gutting of how PL/Python arguments work. If someonecomes up with a brilliant way to solve this problem, we can commit it after beta, or even during the 9.2 cycle (shouldthe brilliant solution be backpatcheable). Is this discussion related to the following todo item: Create a new restricted execution class that will allow passing function arguments in as locals. Passing them as globals means functions cannot be called recursively. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> > > On Tue, Feb 15, 2011 at 6:04 PM, Jan Urbański <wulczer@wulczer.org> > > > wrote: > > > > On 15/02/11 20:39, Peter Eisentraut wrote: > > > > > On tis, 2011-02-15 at 09:58 +0100, Jan Urbański wrote: > > > > > > [a bug that we don't know how to fix] > > > > > From this discussion I gather that we have a problem here that we > > > don't exactly know how to fix, so I'm inclined to suggest that we > > > mark this Returned with Feedback in the CommitFest and instead add > > > it to the TODO. Since this is a pre-existing bug and not a new > > > regression, it should not be something we hold up beta for. > > > > I'm officially at a loss on how to fix that bug without some serious > > gutting of how PL/Python arguments work. If someone comes up with a > > brilliant way to solve this problem, we can commit it after beta, or > > even during the 9.2 cycle (should the brilliant solution be > > backpatcheable). > > Is this discussion related to the following todo item: > > Create a new restricted execution class that will allow passing > function arguments in as locals. Passing them as globals means > functions cannot be called recursively. Yep, the bug it's more or less an emanation of this problem.
On lör, 2011-02-26 at 09:43 +0100, Jan Urbański wrote: > I'm officially at a loss on how to fix that bug without some serious > gutting of how PL/Python arguments work. If someone comes up with a > brilliant way to solve this problem, we can commit it after beta, or > even during the 9.2 cycle (should the brilliant solution be > backpatcheable). We'd essentially be trading off freeing something too soon with freeing it not at all. I'm not sure how good that tradeoff is.