Thread: plpython intermittent ImportErrors

plpython intermittent ImportErrors

From
Brian Sutherland
Date:
Hi,

I have a plpython stored procedure which sometimes fails when I run my
applications automated test suite. The procedure is called hundreds of
times during the tests but only fails a few times, often with the
following ImportError:

    Traceback (most recent call last):
      File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 73, in <module>
        __boot()
      File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 2, in __boot
        import sys, imp, os, os.path
      File "/Users/jinty/src/mp/lib/python2.7/os.py", line 49, in <module>
        import posixpath as path
      File "/Users/jinty/src/mp/lib/python2.7/posixpath.py", line 15, in <module>
        import stat
    ImportError: No module named stat

Changing the order in which the tests are run, or running tests
individually makes the error move/change or disappear. The behaviour is
the same with PostgreSQL versions 9.2.2 and 9.1.7.

I have tried (but failed) to reproduce this error in a simple .sql
script. Outside of the tests, it always seems to work.

Having run into a brick wall debugging this, I'm hoping there's someone
here who can help?

--
Brian Sutherland


Re: plpython intermittent ImportErrors

From
Adrian Klaver
Date:
On 01/14/2013 08:30 AM, Brian Sutherland wrote:
> Hi,
>
> I have a plpython stored procedure which sometimes fails when I run my
> applications automated test suite. The procedure is called hundreds of
> times during the tests but only fails a few times, often with the
> following ImportError:
>
>      Traceback (most recent call last):
>        File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 73, in <module>
>          __boot()
>        File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 2, in __boot
>          import sys, imp, os, os.path
>        File "/Users/jinty/src/mp/lib/python2.7/os.py", line 49, in <module>
>          import posixpath as path
>        File "/Users/jinty/src/mp/lib/python2.7/posixpath.py", line 15, in <module>
>          import stat
>      ImportError: No module named stat
>
> Changing the order in which the tests are run, or running tests
> individually makes the error move/change or disappear. The behaviour is
> the same with PostgreSQL versions 9.2.2 and 9.1.7.
>
> I have tried (but failed) to reproduce this error in a simple .sql
> script. Outside of the tests, it always seems to work.
>
> Having run into a brick wall debugging this, I'm hoping there's someone
> here who can help?

Since order seems to be important what test is run prior to the function
failing versus the test run when it succeeds?

>


--
Adrian Klaver
adrian.klaver@gmail.com


Re: plpython intermittent ImportErrors

From
Brian Sutherland
Date:
On Mon, Jan 14, 2013 at 09:05:09AM -0800, Adrian Klaver wrote:
> On 01/14/2013 08:30 AM, Brian Sutherland wrote:
> >Hi,
> >
> >I have a plpython stored procedure which sometimes fails when I run my
> >applications automated test suite. The procedure is called hundreds of
> >times during the tests but only fails a few times, often with the
> >following ImportError:
> >
> >     Traceback (most recent call last):
> >       File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 73, in <module>
> >         __boot()
> >       File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 2, in __boot
> >         import sys, imp, os, os.path
> >       File "/Users/jinty/src/mp/lib/python2.7/os.py", line 49, in <module>
> >         import posixpath as path
> >       File "/Users/jinty/src/mp/lib/python2.7/posixpath.py", line 15, in <module>
> >         import stat
> >     ImportError: No module named stat
> >
> >Changing the order in which the tests are run, or running tests
> >individually makes the error move/change or disappear. The behaviour is
> >the same with PostgreSQL versions 9.2.2 and 9.1.7.
> >
> >I have tried (but failed) to reproduce this error in a simple .sql
> >script. Outside of the tests, it always seems to work.
> >
> >Having run into a brick wall debugging this, I'm hoping there's someone
> >here who can help?
>
> Since order seems to be important what test is run prior to the
> function failing versus the test run when it succeeds?

Experimenting, I can get it down to about 3 tests. At that point it
succeeds about 80% of the time and the errors start being more random
(i.e. different modules are unimportable).  It also starts erroring
inside the stored procedure itself rather than at "import site" time.
The database backend stops closing the connection immediately.

The 2 preceding tests, in this case, do not call the stored procedure
(or any plpython code) at all.

I'm guessing that it's some kind of race condition, but I wouldn't know
where to start looking.

--
Brian Sutherland


Re: plpython intermittent ImportErrors

From
Adrian Klaver
Date:
On 01/14/2013 09:55 AM, Brian Sutherland wrote:

>>>
>>> Changing the order in which the tests are run, or running tests
>>> individually makes the error move/change or disappear. The behaviour is
>>> the same with PostgreSQL versions 9.2.2 and 9.1.7.
>>>
>>> I have tried (but failed) to reproduce this error in a simple .sql
>>> script. Outside of the tests, it always seems to work.
>>>
>>> Having run into a brick wall debugging this, I'm hoping there's someone
>>> here who can help?
>>
>> Since order seems to be important what test is run prior to the
>> function failing versus the test run when it succeeds?
>
> Experimenting, I can get it down to about 3 tests. At that point it
> succeeds about 80% of the time and the errors start being more random
> (i.e. different modules are unimportable).  It also starts erroring
> inside the stored procedure itself rather than at "import site" time.
> The database backend stops closing the connection immediately.

What are the stored procedure errors?

>
> The 2 preceding tests, in this case, do not call the stored procedure
> (or any plpython code) at all.
>
> I'm guessing that it's some kind of race condition, but I wouldn't know
> where to start looking.

Just a guess, something is screwing around with sys.path.

>


--
Adrian Klaver
adrian.klaver@gmail.com


Re: plpython intermittent ImportErrors

From
Chris Angelico
Date:
On Tue, Jan 15, 2013 at 4:55 AM, Brian Sutherland
<brian@vanguardistas.net> wrote:
> I'm guessing that it's some kind of race condition, but I wouldn't know
> where to start looking.

Look for a recursive import (A imports B, B imports A) or multiple
threads trying to import simultaneously - Python sometimes has issues
with that. Quite a few of those issues were sorted out in recent 3.x
versions, but you're using 2.7.

ChrisA


Re: plpython intermittent ImportErrors

From
Brian Sutherland
Date:
On Wed, Jan 16, 2013 at 08:10:26AM +1100, Chris Angelico wrote:
> On Tue, Jan 15, 2013 at 4:55 AM, Brian Sutherland
> <brian@vanguardistas.net> wrote:
> > I'm guessing that it's some kind of race condition, but I wouldn't know
> > where to start looking.
>
> Look for a recursive import (A imports B, B imports A)

I've always seen circular imports as deterministic.

But I don't think it's this, because at least some of the tracebacks
occur when importing standard library code during "import site" at
interpreter startup. It's very unlikely there's a circular import there.

> or multiple
> threads trying to import simultaneously - Python sometimes has issues
> with that. Quite a few of those issues were sorted out in recent 3.x
> versions, but you're using 2.7.

I thought Python, even in 2.7, had an import lock to prevent multiple
threads importing simultaneously:

    http://docs.python.org/2/library/imp.html#imp.lock_held

But yes, that could be a lead onto the issue, if the import lock were
broken that could result in the behaviour I see.

Hmm, but checking in the various modules shows that the import lock is
being correctly acquired.

>
> ChrisA
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

--
Brian Sutherland


Re: plpython intermittent ImportErrors RESOLVED

From
Brian Sutherland
Date:
On Mon, Jan 14, 2013 at 09:05:09AM -0800, Adrian Klaver wrote:
> On 01/14/2013 08:30 AM, Brian Sutherland wrote:
> >Hi,
> >
> >I have a plpython stored procedure which sometimes fails when I run my
> >applications automated test suite. The procedure is called hundreds of
> >times during the tests but only fails a few times, often with the
> >following ImportError:
> >
> >     Traceback (most recent call last):
> >       File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 73, in <module>
> >         __boot()
> >       File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 2, in __boot
> >         import sys, imp, os, os.path
> >       File "/Users/jinty/src/mp/lib/python2.7/os.py", line 49, in <module>
> >         import posixpath as path
> >       File "/Users/jinty/src/mp/lib/python2.7/posixpath.py", line 15, in <module>
> >         import stat
> >     ImportError: No module named stat
> >
> >Changing the order in which the tests are run, or running tests
> >individually makes the error move/change or disappear. The behaviour is
> >the same with PostgreSQL versions 9.2.2 and 9.1.7.
> >
> >I have tried (but failed) to reproduce this error in a simple .sql
> >script. Outside of the tests, it always seems to work.
> >
> >Having run into a brick wall debugging this, I'm hoping there's someone
> >here who can help?
>
> Since order seems to be important what test is run prior to the
> function failing versus the test run when it succeeds?

I finally got out the big hammer. I applied the attached patch to
Python/import.c and started postgres with PYTHONVERBOSE set. I
discovered that the import was failing because the fopen() call on the
module fails with:

    # trying /Users/jinty/src/mp/lib/python2.7/linecache.py
    Error opening file: Too many open files

So there's at least one bug here, Python should probably raise an
intelligent error message if an import fails because of too many open
files. Reported that here: http://bugs.python.org/issue16981

I had a look at the files open by the process, there were not that many,
so no leaks or anything. Just an utterly insane OSX default maximum open
file descriptors.

Running:

    ulimit -n 4096

before starting PostgreSQL resolved my issue completely.

Many thanks to all who helped out!

--
Brian Sutherland

Attachment

Re: plpython intermittent ImportErrors RESOLVED

From
Adrian Klaver
Date:
On 01/16/2013 08:20 AM, Brian Sutherland wrote:
> On Mon, Jan 14, 2013 at 09:05:09AM -0800, Adrian Klaver wrote:
>> On 01/14/2013 08:30 AM, Brian Sutherland wrote:

>
> I had a look at the files open by the process, there were not that many,
> so no leaks or anything. Just an utterly insane OSX default maximum open
> file descriptors.
>
> Running:
>
>      ulimit -n 4096
>
> before starting PostgreSQL resolved my issue completely.
>
> Many thanks to all who helped out!
>

Well that came from left field. That was some detective work. Thanks for
the follow up.

--
Adrian Klaver
adrian.klaver@gmail.com


Re: plpython intermittent ImportErrors

From
Stuart Bishop
Date:
On Mon, Jan 14, 2013 at 11:30 PM, Brian Sutherland
<brian@vanguardistas.net> wrote:
> Hi,
>
> I have a plpython stored procedure which sometimes fails when I run my
> applications automated test suite. The procedure is called hundreds of
> times during the tests but only fails a few times, often with the
> following ImportError:
>
>     Traceback (most recent call last):
>       File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 73, in <module>
>         __boot()
>       File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 2, in __boot
>         import sys, imp, os, os.path
>       File "/Users/jinty/src/mp/lib/python2.7/os.py", line 49, in <module>
>         import posixpath as path
>       File "/Users/jinty/src/mp/lib/python2.7/posixpath.py", line 15, in <module>
>         import stat
>     ImportError: No module named stat

The first thing that pops out here is that the paths contain
references to buildout.

Can you confirm that the exception is actually being received from
PostgreSQL, or is the error coming from your test harness?

Is your plpython stored procedure supposed to be using the environment
constructed by buildout, or the system default environment?

(we use buildout for our Python code, but our plpythonu stored
procedures use the stock standard Python environment, as provided by
the Ubuntu packages).

If this is the correct environment, it sounds like you are triggering
some sort of race condition in the buildout generated .py files. You
might be able to confirm and/or work around the issue by getting your
own stanza added to the top of the generated site.py, explicitly
importing the problematic modules right at the top before any buildout
magic happens.


--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/


Re: plpython intermittent ImportErrors

From
Brian Sutherland
Date:
On Thu, Jan 17, 2013 at 03:18:09PM +0700, Stuart Bishop wrote:
> On Mon, Jan 14, 2013 at 11:30 PM, Brian Sutherland
> <brian@vanguardistas.net> wrote:
> > Hi,
> >
> > I have a plpython stored procedure which sometimes fails when I run my
> > applications automated test suite. The procedure is called hundreds of
> > times during the tests but only fails a few times, often with the
> > following ImportError:
> >
> >     Traceback (most recent call last):
> >       File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 73, in <module>
> >         __boot()
> >       File "/Users/jinty/.buildout/eggs/setuptools-0.6c11-py2.7.egg/site.py", line 2, in __boot
> >         import sys, imp, os, os.path
> >       File "/Users/jinty/src/mp/lib/python2.7/os.py", line 49, in <module>
> >         import posixpath as path
> >       File "/Users/jinty/src/mp/lib/python2.7/posixpath.py", line 15, in <module>
> >         import stat
> >     ImportError: No module named stat
>
> The first thing that pops out here is that the paths contain
> references to buildout.

Yeah, my test environment is undeniably just too complex :(

> Can you confirm that the exception is actually being received from
> PostgreSQL, or is the error coming from your test harness?

It was definitely from PostgreSQL.

> Is your plpython stored procedure supposed to be using the environment
> constructed by buildout, or the system default environment?

I set PYTHONPATH and PYTHONHOME so that postgres uses my buildout
environment (which is inside a virtualenv :(:(). The hack I use is
undeniably ugly, but does work...

> (we use buildout for our Python code, but our plpythonu stored
> procedures use the stock standard Python environment, as provided by
> the Ubuntu packages).

Sadly, I need to get this running on OSX as that's what our developers
use. On Ubuntu/Debian, I would have definitely used the system python
environment.

> If this is the correct environment, it sounds like you are triggering
> some sort of race condition in the buildout generated .py files. You
> might be able to confirm and/or work around the issue by getting your
> own stanza added to the top of the generated site.py, explicitly
> importing the problematic modules right at the top before any buildout
> magic happens.

I mentioned in a different post, but I did manage to resolve the issue.
OSX has insanely low limits for max open files. Python hit that limit
during the import and hid the real error. Increasing the open file limit
with ulimit before starting postgres resolved the issue. I reported a
bug about it here:

    http://bugs.python.org/issue16981

Many thanks for having a look anyway.

--
Brian Sutherland


Re: plpython intermittent ImportErrors

From
Alban Hertroys
Date:
On 17 January 2013 12:30, Brian Sutherland <brian@vanguardistas.net> wrote:
> (we use buildout for our Python code, but our plpythonu stored
> procedures use the stock standard Python environment, as provided by
> the Ubuntu packages).

Sadly, I need to get this running on OSX as that's what our developers
use. On Ubuntu/Debian, I would have definitely used the system python
environment.
 
Why wouldn't you do the same on OSX? It has Python 2.7 in the base system, doesn't it?

Re: plpython intermittent ImportErrors

From
Brian Sutherland
Date:
On Thu, Jan 17, 2013 at 01:25:54PM +0100, Alban Hertroys wrote:
> On 17 January 2013 12:30, Brian Sutherland <brian@vanguardistas.net> wrote:
>
> > > (we use buildout for our Python code, but our plpythonu stored
> > > procedures use the stock standard Python environment, as provided by
> > > the Ubuntu packages).
> >
> > Sadly, I need to get this running on OSX as that's what our developers
> > use. On Ubuntu/Debian, I would have definitely used the system python
> > environment.
>
>
> Why wouldn't you do the same on OSX? It has Python 2.7 in the base system,
> doesn't it?

But the base system doesn't have many Python libraries, so you need a
packaging system.  Ubuntu/Debian has APT which I am personally very
confortable with.

--
Brian Sutherland