Re: [PATCH] Fix conversion for Decimal arguments in plpython functions - Mailing list pgsql-hackers

From Szymon Guz
Subject Re: [PATCH] Fix conversion for Decimal arguments in plpython functions
Date
Msg-id CAFjNrYsotkr0PZQ=t=cbdp8d2vnnY_UNGxP5NswzxXPcCC9cmw@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Fix conversion for Decimal arguments in plpython functions  (Ronan Dunklau <rdunklau@gmail.com>)
List pgsql-hackers
Well, I really don't like the idea of such a dependency.

However it could be added as configuration option, so you could compile postgres with e.g. --with-cdecimal, and then it would be user dependent.
Maybe it is a good idea for another patch.

On 25 June 2013 14:23, Ronan Dunklau <rdunklau@gmail.com> wrote:
Concerning the efficiency problem, it should be noted that the latest 3.3 release of cpython introduces an "accelerator" for decimal data types, as a C-module.  This module was previously available from the Python package index at: https://pypi.python.org/pypi/cdecimal/2.2

It may be overkill to try to include such a dependency, but the performance overhead from using decimal is really mitigated by this implementation.


2013/6/25 Szymon Guz <mabewlun@gmail.com>
On 25 June 2013 05:16, Steve Singer <steve@ssinger.info> wrote:
On 05/28/2013 04:41 PM, Szymon Guz wrote:
Hi,
I've got a patch.

This is for a plpython enhancement.

There is an item at the TODO list http://wiki.postgresql.org/wiki/Todo#Server-Side_Languages
"Fix loss of information during conversion of numeric type to Python float"

This patch uses a decimal.Decimal type from Python standard library for the plpthon function numeric argument instead of float.

Patch contains changes in code, documentation and tests.

Most probably there is something wrong, as this is my first Postgres patch :)


Thanks for contributing.

This patch applies cleanly against master and compiles with warnings

plpy_main.c: In function ‘PLy_init_interp’:
plpy_main.c:157:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
plpy_main.c:161:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]

You can avoid this by moving the declaration of decimal and decimal_dict to be at the top of the function where mainmod is declared.

Also in this function you've introduced places where it returns with an error (the PLy_elog(ERROR...) calls before decrementing the reference to mainmod. I think you can decrement the mainmod reference after the call to SetItemString  before your changes that import the Decimal module.


The patch works as expected, I am able to write python functions that take numerics as arguments and work with them.  I can adjust the decimal context precision inside of  my function.

One concern I have is that this patch makes pl/python functions involving numerics more than 3 times as slow as before.


create temp table b(a numeric);
insert into b select generate_series(1,10000);

create or replace function x(a numeric,b numeric) returns numeric as $$
if a==None:
  return b
return a+b
$$ language plpythonu;
create aggregate sm(basetype=numeric, sfunc=x,stype=numeric);


test=# select sm(a) from b;
    sm
----------
 50005000
(1 row)

Time: 565.650 ms

versus before the patch this was taking in the range of 80ms.

Would it be faster to call numeric_send instead of numeric_out and then convert the sequence of Int16's to a tuple of digits that can be passed into the Decimal constructor? I think this is worth trying and testing,


Documentation
=================
Your patched version of the docs say

  PostgreSQL <type>real</type>, <type>double</type>, and <type>numeric</type> are converted to
   Python <type>Decimal</type>. This type is imported from<literal>decimal.Decimal</literal>.


I don't think this is correct, as far as I can tell your not changing the behaviour for postgresql real and double types, they continue to use floating point.



<listitem>
<para>
       PostgreSQL <type>real</type> and <type>double</type>are converted to
       Python <type>float</type>.
</para>
</listitem>

<listitem>
<para>
       PostgreSQL <type>numeric</type> is converted to
       Python <type>Decimal</type>. This type is imported from
<literal>decimal.Decimal</literal>.
</para>
</listitem>


Hi,
I've attached a new patch. I've fixed all the problems you've found, except for the efficiency problem, which has been described in previous email.

thanks,
Szymon 


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



pgsql-hackers by date:

Previous
From: Ronan Dunklau
Date:
Subject: Re: [PATCH] Fix conversion for Decimal arguments in plpython functions
Next
From: "Yuri Levinsky"
Date:
Subject: Hash partitioning.