Re: Fix memleaks and error handling in jsonb_plpython - Mailing list pgsql-hackers
From:           Nikita Glukhov
Subject:        Re: Fix memleaks and error handling in jsonb_plpython
Msg-id:         fa61eb4a-43be-0573-764d-ccd244688217@postgrespro.ru
In response to: Re: Fix memleaks and error handling in jsonb_plpython (Michael Paquier <michael@paquier.xyz>)
List:           pgsql-hackers
On 05.03.2019 6:45, Michael Paquier wrote:
> On Fri, Mar 01, 2019 at 05:24:39AM +0300, Nikita Glukhov wrote:
>> Unfortunately, contrib/jsonb_plpython still contains a lot of problems in
>> error handling that can lead to memory leaks:
>> - not all Python function calls are checked for success;
>> - PG exceptions are not caught in all the places where Python references
>>   need to be released.
>> But it seems that these errors can happen only in OOM cases. Attached is a
>> patch with the fix. Back-patch to PG11 is needed.
>
> That looks right to me. Here are some comments. One thing to be really
> careful of when using PG_TRY/PG_CATCH blocks is that variables modified in
> the try block and then referenced in the catch block need to be marked as
> volatile. If you don't do that, the value when reaching the catch part is
> indeterminate. With your patch the result variable used in two places of
> PLyObject_FromJsonbContainer() is not marked as volatile. Similarly, it
> seems that "items" in PLyMapping_ToJsonbValue() and "seq" in
> PLySequence_ToJsonbValue() should be volatile because they get changed in
> the try block and referenced afterwards.
I know about these volatility issues, but maybe I incorrectly understand what should be marked as volatile for pointer variables: the pointer itself and/or the memory it references. I thought that only the pointer needs to be marked, and there is also a message [1] clearly describing what needs to be marked. Previously in PLyMapping_ToJsonbValue() the whole contents of the PyObject were marked as volatile, not the pointer itself, which is not modified in PG_TRY:
-        /* We need it volatile, since we use it after longjmp */
-        volatile PyObject *items_v = NULL;
So, I removed the volatile qualifier here. The variable 'result' is also not modified in PG_TRY, so it is non-volatile too. I marked only the 'key' variable in PLyObject_FromJsonbContainer() as volatile, because it really is modified in the loop inside PG_TRY(), and the PLyObject_FromJsonbValue(&v2) call after its assignment can throw a PG exception:

+        PyObject   *volatile key = NULL;
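To illustrate the rule from [1] with a self-contained sketch: plain setjmp/longjmp stands in here for PG_TRY/PG_CATCH, and all names are illustrative. The pointer is assigned between setjmp() and longjmp(), so the pointer itself must be volatile ("char *volatile key", not "volatile char *key", which would instead qualify the pointed-to bytes):

```c
#include <setjmp.h>
#include <stddef.h>

static jmp_buf env;

/*
 * Simplified stand-in for a PG_TRY/PG_CATCH block.  Returns 1 only if
 * the "catch" path reliably observes the assignment made in "try",
 * which the volatile qualifier on the pointer guarantees.
 */
static int
try_catch_demo(void)
{
    char *volatile key = NULL;      /* modified in "try", read in "catch" */

    if (setjmp(env) == 0)
    {
        key = "assigned in try";    /* modification inside the try block */
        longjmp(env, 1);            /* analogue of ereport(ERROR) */
    }

    /* "catch": without volatile, key's value here would be indeterminate */
    return key != NULL;
}
```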
Also I have an idea to introduce a global list of Python objects that need to be released in PG_CATCH inside PLy_exec_function() in case of an exception. Then typical code would look like this:

    PyObject   *list = PLy_RegisterObject(PyList_New());

    if (!list)
        return NULL;

    /* ... code that can throw a PG exception; PG_TRY/PG_CATCH is not needed ... */

    return PLy_UnregisterObject(list);      /* returns list */
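A minimal sketch of what such a registry could look like. Only PLy_RegisterObject and PLy_UnregisterObject are names from the idea above; everything else is assumed. Plain void pointers stand in for PyObject references, and in plpython the release step would be Py_XDECREF on each remaining entry:

```c
#include <stddef.h>

#define REGISTRY_MAX 32

static void    *registry[REGISTRY_MAX];
static int      registry_len = 0;

/* Track an object so the error handler can release it later. */
static void *
PLy_RegisterObject(void *obj)
{
    if (obj != NULL && registry_len < REGISTRY_MAX)
        registry[registry_len++] = obj;
    return obj;
}

/* Stop tracking an object on the normal (non-error) path. */
static void *
PLy_UnregisterObject(void *obj)
{
    int         i;

    for (i = 0; i < registry_len; i++)
    {
        if (registry[i] == obj)
        {
            registry[i] = registry[--registry_len];
            break;
        }
    }
    return obj;
}

/*
 * Called from the PG_CATCH in PLy_exec_function(): drop every object
 * still registered (Py_XDECREF in the real thing) and report how many
 * there were.
 */
static int
PLy_ReleaseRegistered(void)
{
    int         n = registry_len;

    registry_len = 0;
    return n;
}
```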
> Another issue: in ltree_plpython we don't check the return status of
> PyList_SetItem(), which we should complain about, I think.
Yes, PyList_SetItem() and PyString_FromStringAndSize() should be checked, but CPython's PyList_SetItem() really should not fail here, because the list storage is preallocated:
    int
    PyList_SetItem(PyObject *op, Py_ssize_t i, PyObject *newitem)
    {
        PyObject **p;

        if (!PyList_Check(op))
        {
            Py_XDECREF(newitem);
            PyErr_BadInternalCall();
            return -1;
        }
        if (!valid_index(i, Py_SIZE(op)))
        {
            Py_XDECREF(newitem);
            PyErr_SetString(PyExc_IndexError,
                            "list assignment index out of range");
            return -1;
        }
        p = ((PyListObject *) op)->ob_item + i;
        Py_XSETREF(*p, newitem);
        return 0;
    }
[1] https://www.postgresql.org/message-id/31436.1483415248%40sss.pgh.pa.us