Re: [PATCH] Fix conversion for Decimal arguments in plpython functions - Mailing list pgsql-hackers

From Szymon Guz
Subject Re: [PATCH] Fix conversion for Decimal arguments in plpython functions
Date
Msg-id CAFjNrYuZL74uRAey2_7Dk3CYNfhO7mzS3a_8cPYntds3=V1HrA@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Fix conversion for Decimal arguments in plpython functions  (Steve Singer <steve@ssinger.info>)
List pgsql-hackers
On 25 June 2013 05:16, Steve Singer <steve@ssinger.info> wrote:

One concern I have is that this patch makes pl/python functions involving numerics more than 3 times as slow as before.


create temp table b(a numeric);
insert into b select generate_series(1,10000);

create or replace function x(a numeric,b numeric) returns numeric as $$
if a==None:
  return b
return a+b
$$ language plpythonu;
create aggregate sm(basetype=numeric, sfunc=x,stype=numeric);


test=# select sm(a) from b;
    sm
----------
 50005000
(1 row)

Time: 565.650 ms

versus before the patch this was taking in the range of 80ms.

Would it be faster to call numeric_send instead of numeric_out and then convert the sequence of Int16's to a tuple of digits that can be passed into the Decimal constructor? I think this is worth trying and testing,


Hi,
thanks for all the remarks.

I think I cannot do anything about speeding up the code. What I've found so far is:

I cannot use simple fields from NumericVar in my code, so to not waste time on something not sensible, I've tried to found out if using the tuple constructor for decimal.Decimal will be faster. I've changed the function to something like this:

static PyObject *
PLyDecimal_FromNumeric(PLyDatumToOb *arg, Datum d)
{       
        PyObject *digits = PyTuple_New(4);
        PyTuple_SetItem(digits, 0, PyInt_FromLong(1));
        PyTuple_SetItem(digits, 1, PyInt_FromLong(4));
        PyTuple_SetItem(digits, 2, PyInt_FromLong(1));
        PyTuple_SetItem(digits, 3, PyInt_FromLong(4));

        PyObject *tuple = PyTuple_New(3);
        PyTuple_SetItem(tuple, 0, PyInt_FromLong(1));
        PyTuple_SetItem(tuple, 1, digits);
        PyTuple_SetItem(tuple, 2, PyInt_FromLong(-3));
       
        value = PyObject_CallFunctionObjArgs(PLy_decimal_ctor_global, tuple, NULL);

        return value;
}

Yes, it returns the same value regardless the params. The idea is to call Python code like:

Decimal((0, (1,  4, 1, 4), -3))

which is simply:

Decimal('1.414')

Unfortunately this is not faster. It is as slow as it was with string constructor.

I've checked the speed of decimal.Decimal using pure python. For this I used a simple function, similar to yours:


def x(a, b):
    if a is None:
        return b
    return a + b

I've run the tests using simple ints:


def test():
    a = 0
    for i in xrange(0, 10000):
        a += x(a, i)


for a in xrange(1, 100):
    test()


And later I've run the same function, but with converting the arguments to Decimals:

from decimal import Decimal


def x(a, b):
    if a is None:
        return b
    return a + b


def test():
    a = 0
    for i in xrange(0, 10000):
        a += x(Decimal(a), Decimal(i))


for a in xrange(1, 100):
    test()

It was run 100 times for decreasing the impact of test initialization.

The results for both files are:
int:  0.697s
decimal: 38.859s

What gives average time for one function call of:
int: 69ms
decimal: 380ms


For me the problem is with slow code at Python's side, the Decimal constructors are pretty slow, and there is nothing I can do with that at the Postgres' side.


I will send patch with fixes later.


thanks,
Szymon

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: PostgreSQL 9.3 latest dev snapshot
Next
From: Pavel Stehule
Date:
Subject: Re: proposal: enable new error fields in plpgsql (9.4)