Thread: plpython issue with Win64 (PG 9.2)
Hi,
On Windows 7 64bit, plpython is causing server crash with the following test case i.e.
CREATE PROCEDURAL LANGUAGE 'plpython3u';
CREATE OR REPLACE FUNCTION pymax (a integer, b integer)
RETURNS integer
AS $$
if a > b:
return a
return b
$$ LANGUAGE plpython3u;
SELECT pymax(1, 2);
Server exit with the following exception i.e.
Unhandled exception at 0x777a3483 in postgres.exe: 0xC00000FD: Stack overflow.
plpython3.dll!PLy_traceback(char * * xmsg, char * * tbmsg, int * tb_depth) Line 174 C
plpython3.dll!PLy_elog(int elevel, const char * fmt, ...) Line 67 C
plpython3.dll!PLyUnicode_AsString(_object * unicode) Line 96 C
plpython3.dll!PLy_traceback(char * * xmsg, char * * tbmsg, int * tb_depth) Line 176 + 0x8 bytes C
plpython3.dll!PLy_elog(int elevel, const char * fmt, ...) Line 67 C
plpython3.dll!PLyUnicode_AsString(_object * unicode) Line 96 C
...
...
plpython3.dll!PLy_traceback(char * * xmsg, char * * tbmsg, int * tb_depth) Line 176 + 0x8 bytes C
plpython3.dll!PLy_elog(int elevel, const char * fmt, ...) Line 67 C
plpython3.dll!PLyUnicode_AsString(_object * unicode) Line 96 C
plpython3.dll!PLy_traceback(char * * xmsg, char * * tbmsg, int * tb_depth) Line 176 + 0x8 bytes C
Dbserver get stuck in the following call loop i.e.
... PLy_elog() -> PLy_traceback() -> PLyUnicode_AsString() -> PLyUnicode_Bytes() -> PLy_elog() ...
I think primary reason that trigger this issue is when Function PLyUnicode_Bytes() calls "PyUnicode_AsEncodedString( ,WIN1252 /*Server encoding*/, ) " it fails with null. I built latest pg 9.2 source code with python 3.2.2.3 by using Visual Studio 2010. Thanks.
Best Regards,
Muhammad Asif Naeem
On 27/06/12 11:51, Asif Naeem wrote: > Hi, > > On Windows 7 64bit, plpython is causing server crash with the following > test case i.e. > > CREATE PROCEDURAL LANGUAGE 'plpython3u'; >> CREATE OR REPLACE FUNCTION pymax (a integer, b integer) >> RETURNS integer >> AS $$ >> if a> b: >> return a >> return b >> $$ LANGUAGE plpython3u; >> SELECT pymax(1, 2); > > I think primary reason that trigger this issue is when Function > PLyUnicode_Bytes() calls "PyUnicode_AsEncodedString( ,WIN1252 /*Server > encoding*/, ) " it fails with null. I built latest pg 9.2 source code with > python 3.2.2.3 by using Visual Studio 2010. Thanks. I'll try to reproduce this on Linux, which should be possible given the results of your investigation. Cheers, Jan
On 27/06/12 13:57, Jan Urbański wrote: > On 27/06/12 11:51, Asif Naeem wrote: >> Hi, >> >> On Windows 7 64bit, plpython is causing server crash with the following >> test case i.e. >> >> CREATE PROCEDURAL LANGUAGE 'plpython3u'; >>> CREATE OR REPLACE FUNCTION pymax (a integer, b integer) >>> RETURNS integer >>> AS $$ >>> if a> b: >>> return a >>> return b >>> $$ LANGUAGE plpython3u; >>> SELECT pymax(1, 2); > >> >> I think primary reason that trigger this issue is when Function >> PLyUnicode_Bytes() calls "PyUnicode_AsEncodedString( ,WIN1252 /*Server >> encoding*/, ) " it fails with null. I built latest pg 9.2 source code >> with >> python 3.2.2.3 by using Visual Studio 2010. Thanks. > > I'll try to reproduce this on Linux, which should be possible given the > results of your investigation. Your analysis is correct, I managed to reproduce this by injecting serverenc = "win1252"; into PLyUnicode_Bytes. The comment in that function says that Python understands all PostgreSQL encoding names except for SQL_ASCII, but that's not really true. In your case GetDatabaseEncodingName() returns "WIN1252" and Python accepts "CP125". I'm wondering how this should be fixed. Just by adding more special cases in PLyUnicode_Bytes? Even if we add a switch statement that would convert PG_WIN1250 into "CP1250", Python can still raise an exception when encoding (for various reasons). How about replacing the PLy_elog there with just an elog? This loses traceback support and the Python exception message, which could be helpful for debugging (something like "invalid character <foo> for encoding cp1250"). OTOH, I'm uneasy about invoking the entire PLy_elog machinery from a function that's as low-level as PLyUnicode_Bytes. Lastly, we map SQL_ASCII to "ascii" which is arguably wrong. The function is supposed to return bytes in the server encoding, and under SQL_ASCII that probably means we can return anything (ie. use any encoding we deem useful). Using "ascii" as the Python codec name will raise an error on anything that has the high bit set. So: I'd add code to translate WINxxx into CPxxx when choosing the Python to use, change PLy_elog to elog in PLyUnicode_Bytes and leave the SQL_ASCII case alone, as there were no complaints and people using SQL_ASCII are asking for it anyway. Cheers, Jan
Thank you. Please do let me know once fix check-in. I will test it and share feedback with you. Thanks.
Asif Naeem
On Fri, Jun 29, 2012 at 3:36 AM, Jan Urbański <wulczer@wulczer.org> wrote:
Your analysis is correct, I managed to reproduce this by injectingOn 27/06/12 13:57, Jan Urbański wrote:On 27/06/12 11:51, Asif Naeem wrote:Hi,
On Windows 7 64bit, plpython is causing server crash with the following
test case i.e.
CREATE PROCEDURAL LANGUAGE 'plpython3u';CREATE OR REPLACE FUNCTION pymax (a integer, b integer)
RETURNS integer
AS $$
if a> b:
return a
return b
$$ LANGUAGE plpython3u;
SELECT pymax(1, 2);
I think primary reason that trigger this issue is when Function
PLyUnicode_Bytes() calls "PyUnicode_AsEncodedString( ,WIN1252 /*Server
encoding*/, ) " it fails with null. I built latest pg 9.2 source code
with
python 3.2.2.3 by using Visual Studio 2010. Thanks.
I'll try to reproduce this on Linux, which should be possible given the
results of your investigation.
serverenc = "win1252";
into PLyUnicode_Bytes. The comment in that function says that Python understands all PostgreSQL encoding names except for SQL_ASCII, but that's not really true. In your case GetDatabaseEncodingName() returns "WIN1252" and Python accepts "CP125".
I'm wondering how this should be fixed. Just by adding more special cases in PLyUnicode_Bytes?
Even if we add a switch statement that would convert PG_WIN1250 into "CP1250", Python can still raise an exception when encoding (for various reasons). How about replacing the PLy_elog there with just an elog? This loses traceback support and the Python exception message, which could be helpful for debugging (something like "invalid character <foo> for encoding cp1250"). OTOH, I'm uneasy about invoking the entire PLy_elog machinery from a function that's as low-level as PLyUnicode_Bytes.
Lastly, we map SQL_ASCII to "ascii" which is arguably wrong. The function is supposed to return bytes in the server encoding, and under SQL_ASCII that probably means we can return anything (ie. use any encoding we deem useful). Using "ascii" as the Python codec name will raise an error on anything that has the high bit set.
So: I'd add code to translate WINxxx into CPxxx when choosing the Python to use, change PLy_elog to elog in PLyUnicode_Bytes and leave the SQL_ASCII case alone, as there were no complaints and people using SQL_ASCII are asking for it anyway.
Cheers,
Jan
On 29/06/12 00:36, Jan Urbański wrote: > On 27/06/12 13:57, Jan Urbański wrote: >> On 27/06/12 11:51, Asif Naeem wrote: >>> Hi, >>> >>> On Windows 7 64bit, plpython is causing server crash with the following >>> test case i.e. > So: I'd add code to translate WINxxx into CPxxx when choosing the Python > to use, change PLy_elog to elog in PLyUnicode_Bytes and leave the > SQL_ASCII case alone, as there were no complaints and people using > SQL_ASCII are asking for it anyway. Since no one commented, I'll produce a patch to that effect. I believe this should go into 9.2 given that otherwise PL/Python will basically crash any database using the CP12xx encoding. Cheers, Jan
On 03/07/12 17:45, Jan Urbański wrote: > On 29/06/12 00:36, Jan Urbański wrote: >> On 27/06/12 13:57, Jan Urbański wrote: >>> On 27/06/12 11:51, Asif Naeem wrote: >>>> Hi, >>>> >>>> On Windows 7 64bit, plpython is causing server crash with the following >>>> test case i.e. > >> So: I'd add code to translate WINxxx into CPxxx when choosing the Python >> to use, change PLy_elog to elog in PLyUnicode_Bytes and leave the >> SQL_ASCII case alone, as there were no complaints and people using >> SQL_ASCII are asking for it anyway. > > Since no one commented, I'll produce a patch to that effect. I believe > this should go into 9.2 given that otherwise PL/Python will basically > crash any database using the CP12xx encoding. Patch attached. Asif, could you try a few things on a CP1252 database? First verify if your original test case now works and then try this: create function enctest() returns text as $$ return b'tr\xc3\xb3spido'.decode('utf-8') $$ language plpython3u; select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex'); Thanks, Jan
Attachment
Patch attached. Asif, could you try a few things on a CP1252 database?
First verify if your original test case now works and then try this:
I have test the patch on Win64. postgres server is working fine now for WIN1252. Thanks.
create function enctest() returns text as $$
return b'tr\xc3\xb3spido'.decode('utf-8')
$$ language plpython3u;
select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex');
create function enctest() returns text as $$
return b'tr\xc3\xb3spido'.decode('utf-8')
$$ language plpython3u;
select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex');
enctest | encode
----------+--------------------
tróspido | 7472c3b3737069646f
(1 row)
Please do let me know If you have any other query. Thanks.
Best Regards,
Muhammad Asif Naeem
On 04/07/12 13:58, Asif Naeem wrote: >> Patch attached. Asif, could you try a few things on a CP1252 database? > > First verify if your original test case now works and then try this: >> >> > I have test the patch on Win64. postgres server is working fine now for > WIN1252. Thanks. > > >> create function enctest() returns text as $$ >> return b'tr\xc3\xb3spido'.decode('**utf-8') >> $$ language plpython3u; >> >> select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex'); >> > > create function enctest() returns text as $$ > return b'tr\xc3\xb3spido'.decode('utf-8') > $$ language plpython3u; > select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex'); > enctest | encode > ----------+-------------------- > tróspido | 7472c3b3737069646f > (1 row) > > Please do let me know If you have any other query. Thanks. Great, this looks correct. Can we apply this to 9.2? Cheers, Jan
On 04.07.2012 15:11, Jan Urbański wrote: > On 04/07/12 13:58, Asif Naeem wrote: >> I have test the patch on Win64. postgres server is working fine now for >> WIN1252. Thanks. >> >>> create function enctest() returns text as $$ >>> return b'tr\xc3\xb3spido'.decode('**utf-8') >>> $$ language plpython3u; >>> >>> select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex'); >>> >> >> create function enctest() returns text as $$ >> return b'tr\xc3\xb3spido'.decode('utf-8') >> $$ language plpython3u; >> select enctest(), encode(convert_to(enctest(), 'utf-8'), 'hex'); >> enctest | encode >> ----------+-------------------- >> tróspido | 7472c3b3737069646f >> (1 row) >> >> Please do let me know If you have any other query. Thanks. > > Great, this looks correct. > > Can we apply this to 9.2? Committed. This bug was present in versions >= 9.0, so backpatched. I used ereport() rather than elog() in the error message. Correct me if that was wrong, but the point was to avoid PLy_elog(), because that might cause recursion, and ereport() should be ok. I believe the message should be translated, as it's quite possible to get that error, at least if you use SQL_ASCII, so ereport() is more approriate than elog(). Thanks! -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On 05/07/12 21:37, Heikki Linnakangas wrote: > Committed. This bug was present in versions >= 9.0, so backpatched. Thanks! > I used ereport() rather than elog() in the error message. Correct me if > that was wrong, but the point was to avoid PLy_elog(), because that > might cause recursion, and ereport() should be ok. I believe the message > should be translated, as it's quite possible to get that error, at least > if you use SQL_ASCII, so ereport() is more approriate than elog(). Yes, you're absolutely right. Cheers, Jan