Thread: plpython intermittent ImportErrors
Hi, I have a plpython stored procedure which sometimes fails when I run my applications automated test suite. The procedure is called hundreds of times during the tests but only fails a few times, often with the following ImportError: Traceback (most recent call last): File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 73, in <module> __boot() File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 2, in __boot import sys, imp, os, os.path File "/Users/jinty/src/mp/lib/python2.7/os.py", line 49, in <module> import posixpath as path File "/Users/jinty/src/mp/lib/python2.7/posixpath.py", line 15, in <module> import stat ImportError: No module named stat Changing the order in which the tests are run, or running tests individually makes the error move/change or disappear. The behaviour is the same with PostgreSQL versions 9.2.2 and 9.1.7. I have tried (but failed) to reproduce this error in a simple .sql script. Outside of the tests, it always seems to work. Having run into a brick wall debugging this, I'm hoping there's someone here who can help? -- Brian Sutherland
On 01/14/2013 08:30 AM, Brian Sutherland wrote: > Hi, > > I have a plpython stored procedure which sometimes fails when I run my > applications automated test suite. The procedure is called hundreds of > times during the tests but only fails a few times, often with the > following ImportError: > > Traceback (most recent call last): > File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 73, in <module> > __boot() > File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 2, in __boot > import sys, imp, os, os.path > File "/Users/jinty/src/mp/lib/python2.7/os.py", line 49, in <module> > import posixpath as path > File "/Users/jinty/src/mp/lib/python2.7/posixpath.py", line 15, in <module> > import stat > ImportError: No module named stat > > Changing the order in which the tests are run, or running tests > individually makes the error move/change or disappear. The behaviour is > the same with PostgreSQL versions 9.2.2 and 9.1.7. > > I have tried (but failed) to reproduce this error in a simple .sql > script. Outside of the tests, it always seems to work. > > Having run into a brick wall debugging this, I'm hoping there's someone > here who can help? Since order seems to be important what test is run prior to the function failing versus the test run when it succeeds? > -- Adrian Klaver adrian.klaver@gmail.com
On Mon, Jan 14, 2013 at 09:05:09AM -0800, Adrian Klaver wrote: > On 01/14/2013 08:30 AM, Brian Sutherland wrote: > >Hi, > > > >I have a plpython stored procedure which sometimes fails when I run my > >applications automated test suite. The procedure is called hundreds of > >times during the tests but only fails a few times, often with the > >following ImportError: > > > > Traceback (most recent call last): > > File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 73, in <module> > > __boot() > > File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 2, in __boot > > import sys, imp, os, os.path > > File "/Users/jinty/src/mp/lib/python2.7/os.py", line 49, in <module> > > import posixpath as path > > File "/Users/jinty/src/mp/lib/python2.7/posixpath.py", line 15, in <module> > > import stat > > ImportError: No module named stat > > > >Changing the order in which the tests are run, or running tests > >individually makes the error move/change or disappear. The behaviour is > >the same with PostgreSQL versions 9.2.2 and 9.1.7. > > > >I have tried (but failed) to reproduce this error in a simple .sql > >script. Outside of the tests, it always seems to work. > > > >Having run into a brick wall debugging this, I'm hoping there's someone > >here who can help? > > Since order seems to be important what test is run prior to the > function failing versus the test run when it succeeds? Experimenting, I can get it down to about 3 tests. At that point it succeeds about 80% of the time and the errors start being more random (i.e. different modules are unimportable). It also starts erroring inside the stored procedure itself rather than at "import site" time. The database backend stops closing the connection immediately. The 2 preceding tests, in this case, do not call the stored procedure (or any plpython code) at all. I'm guessing that it's some kind of race condition, but I wouldn't know where to start looking. -- Brian Sutherland
On 01/14/2013 09:55 AM, Brian Sutherland wrote: >>> >>> Changing the order in which the tests are run, or running tests >>> individually makes the error move/change or disappear. The behaviour is >>> the same with PostgreSQL versions 9.2.2 and 9.1.7. >>> >>> I have tried (but failed) to reproduce this error in a simple .sql >>> script. Outside of the tests, it always seems to work. >>> >>> Having run into a brick wall debugging this, I'm hoping there's someone >>> here who can help? >> >> Since order seems to be important what test is run prior to the >> function failing versus the test run when it succeeds? > > Experimenting, I can get it down to about 3 tests. At that point it > succeeds about 80% of the time and the errors start being more random > (i.e. different modules are unimportable). It also starts erroring > inside the stored procedure itself rather than at "import site" time. > The database backend stops closing the connection immediately. What are the stored procedure errors? > > The 2 preceding tests, in this case, do not call the stored procedure > (or any plpython code) at all. > > I'm guessing that it's some kind of race condition, but I wouldn't know > where to start looking. Just a guess, something is screwing around with sys.path. > -- Adrian Klaver adrian.klaver@gmail.com
On Tue, Jan 15, 2013 at 4:55 AM, Brian Sutherland <brian@vanguardistas.net> wrote: > I'm guessing that it's some kind of race condition, but I wouldn't know > where to start looking. Look for a recursive import (A imports B, B imports A) or multiple threads trying to import simultaneously - Python sometimes has issues with that. Quite a few of those issues were sorted out in recent 3.x versions, but you're using 2.7. ChrisA
On Wed, Jan 16, 2013 at 08:10:26AM +1100, Chris Angelico wrote: > On Tue, Jan 15, 2013 at 4:55 AM, Brian Sutherland > <brian@vanguardistas.net> wrote: > > I'm guessing that it's some kind of race condition, but I wouldn't know > > where to start looking. > > Look for a recursive import (A imports B, B imports A) I've always seen circular imports as deterministic. But I don't think it's this, because at least some of the tracebacks occur when importing standard library code during "import site" at interpreter startup. It's very unlikely there's a circular import there. > or multiple > threads trying to import simultaneously - Python sometimes has issues > with that. Quite a few of those issues were sorted out in recent 3.x > versions, but you're using 2.7. I thought Python, even in 2.7, had an import lock to prevent multiple threads importing simultaneously: http://docs.python.org/2/library/imp.html#imp.lock_held But yes, that could be a lead onto the issue, if the import lock were broken that could result in the behaviour I see. Hmm, but checking in the various modules shows that the import lock is being correctly acquired. > > ChrisA > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- Brian Sutherland
On Mon, Jan 14, 2013 at 09:05:09AM -0800, Adrian Klaver wrote: > On 01/14/2013 08:30 AM, Brian Sutherland wrote: > >Hi, > > > >I have a plpython stored procedure which sometimes fails when I run my > >applications automated test suite. The procedure is called hundreds of > >times during the tests but only fails a few times, often with the > >following ImportError: > > > > Traceback (most recent call last): > > File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 73, in <module> > > __boot() > > File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 2, in __boot > > import sys, imp, os, os.path > > File "/Users/jinty/src/mp/lib/python2.7/os.py", line 49, in <module> > > import posixpath as path > > File "/Users/jinty/src/mp/lib/python2.7/posixpath.py", line 15, in <module> > > import stat > > ImportError: No module named stat > > > >Changing the order in which the tests are run, or running tests > >individually makes the error move/change or disappear. The behaviour is > >the same with PostgreSQL versions 9.2.2 and 9.1.7. > > > >I have tried (but failed) to reproduce this error in a simple .sql > >script. Outside of the tests, it always seems to work. > > > >Having run into a brick wall debugging this, I'm hoping there's someone > >here who can help? > > Since order seems to be important what test is run prior to the > function failing versus the test run when it succeeds? I finally got out the big hammer. I applied the attached patch to Python/import.c and started postgres with PYTHONVERBOSE set. I discovered that the import was failing because the fopen() call on the module fails with: # trying /Users/jinty/src/mp/lib/python2.7/linecache.py Error opening file: Too many open files So there's at least one bug here, Python should probably raise an intelligent error message if an import fails because of too many open files. Reported that here: http://bugs.python.org/issue16981 I had a look at the files open by the process, there were not that many, so no leaks or anything. Just an utterly insane OSX default maximum open file descriptors. Running: ulimit -n 4096 before starting PostgreSQL resolved my issue completely. Many thanks to all who helped out! -- Brian Sutherland
Attachment
On 01/16/2013 08:20 AM, Brian Sutherland wrote: > On Mon, Jan 14, 2013 at 09:05:09AM -0800, Adrian Klaver wrote: >> On 01/14/2013 08:30 AM, Brian Sutherland wrote: > > I had a look at the files open by the process, there were not that many, > so no leaks or anything. Just an utterly insane OSX default maximum open > file descriptors. > > Running: > > ulimit -n 4096 > > before starting PostgreSQL resolved my issue completely. > > Many thanks to all who helped out! > Well that came from left field. That was some detective work. Thanks for the follow up. -- Adrian Klaver adrian.klaver@gmail.com
On Mon, Jan 14, 2013 at 11:30 PM, Brian Sutherland <brian@vanguardistas.net> wrote: > Hi, > > I have a plpython stored procedure which sometimes fails when I run my > applications automated test suite. The procedure is called hundreds of > times during the tests but only fails a few times, often with the > following ImportError: > > Traceback (most recent call last): > File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 73, in <module> > __boot() > File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 2, in __boot > import sys, imp, os, os.path > File "/Users/jinty/src/mp/lib/python2.7/os.py", line 49, in <module> > import posixpath as path > File "/Users/jinty/src/mp/lib/python2.7/posixpath.py", line 15, in <module> > import stat > ImportError: No module named stat The first thing that pops out here is that the paths contain references to buildout. Can you confirm that the exception is actually being received from PostgreSQL, or is the error coming from your test harness? Is your plpython stored procedure supposed to be using the environment constructed by buildout, or the system default environment? (we use buildout for our Python code, but our plpythonu stored procedures use the stock standard Python environment, as provided by the Ubuntu packages). If this is the correct environment, it sounds like you are triggering some sort of race condition in the buildout generated .py files. You might be able to confirm and/or work around the issue by getting your own stanza added to the top of the generated site.py, explicitly importing the problematic modules right at the top before any buildout magic happens. -- Stuart Bishop <stuart@stuartbishop.net> http://www.stuartbishop.net/
On Thu, Jan 17, 2013 at 03:18:09PM +0700, Stuart Bishop wrote: > On Mon, Jan 14, 2013 at 11:30 PM, Brian Sutherland > <brian@vanguardistas.net> wrote: > > Hi, > > > > I have a plpython stored procedure which sometimes fails when I run my > > applications automated test suite. The procedure is called hundreds of > > times during the tests but only fails a few times, often with the > > following ImportError: > > > > Traceback (most recent call last): > > File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 73, in <module> > > __boot() > > File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 2, in __boot > > import sys, imp, os, os.path > > File "/Users/jinty/src/mp/lib/python2.7/os.py", line 49, in <module> > > import posixpath as path > > File "/Users/jinty/src/mp/lib/python2.7/posixpath.py", line 15, in <module> > > import stat > > ImportError: No module named stat > > The first thing that pops out here is that the paths contain > references to buildout. Yeah, my test environment is undeniably just too complex :( > Can you confirm that the exception is actually being received from > PostgreSQL, or is the error coming from your test harness? It was definitely from PostgreSQL. > Is your plpython stored procedure supposed to be using the environment > constructed by buildout, or the system default environment? I set PYTHONPATH and PYTHONHOME so that postgres uses my buildout environment (which is inside a virtualenv :(:(). The hack I use is undeniably ugly, but does work... > (we use buildout for our Python code, but our plpythonu stored > procedures use the stock standard Python environment, as provided by > the Ubuntu packages). Sadly, I need to get this running on OSX as that's what our developers use. On Ubuntu/Debian, I would have definitely used the system python environment. > If this is the correct environment, it sounds like you are triggering > some sort of race condition in the buildout generated .py files. You > might be able to confirm and/or work around the issue by getting your > own stanza added to the top of the generated site.py, explicitly > importing the problematic modules right at the top before any buildout > magic happens. I mentioned in a different post, but I did manage to resolve the issue. OSX has insanely low limits for max open files. Python hit that limit during the import and hid the real error. Increasing the open file limit with ulimit before starting postgres resolved the issue. I reported a bug about it here: http://bugs.python.org/issue16981 Many thanks for having a look anyway. -- Brian Sutherland
On 17 January 2013 12:30, Brian Sutherland <brian@vanguardistas.net> wrote:
> (we use buildout for our Python code, but our plpythonu stored> procedures use the stock standard Python environment, as provided bySadly, I need to get this running on OSX as that's what our developers
> the Ubuntu packages).
use. On Ubuntu/Debian, I would have definitely used the system python
environment.
Why wouldn't you do the same on OSX? It has Python 2.7 in the base system, doesn't it?
On Thu, Jan 17, 2013 at 01:25:54PM +0100, Alban Hertroys wrote: > On 17 January 2013 12:30, Brian Sutherland <brian@vanguardistas.net> wrote: > > > > (we use buildout for our Python code, but our plpythonu stored > > > procedures use the stock standard Python environment, as provided by > > > the Ubuntu packages). > > > > Sadly, I need to get this running on OSX as that's what our developers > > use. On Ubuntu/Debian, I would have definitely used the system python > > environment. > > > Why wouldn't you do the same on OSX? It has Python 2.7 in the base system, > doesn't it? But the base system doesn't have many Python libraries, so you need a packaging system. Ubuntu/Debian has APT which I am personally very confortable with. -- Brian Sutherland