Thread: TODO request: multi-dimensional arrays in PL/pythonU
All,

Currently PL/python has 1 dimension hardcoded for returning arrays:

create or replace function nparr ()
returns float[][]
language plpythonu
as $f$
from numpy import array
x = ((1.0,2.0),(3.0,4.0),(5.0,6.0),)
return x
$f$;

josh=# select nparr() ;
ERROR: invalid input syntax for type double precision: "(1.0, 2.0)"
CONTEXT: while creating return value
PL/Python function "nparr"
josh=#

I'd like to add the following TODO to the TODO list:

PL/Python
[] Allow functions to return multi-dimensional arrays from lists or numpy arrays.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

On Tue, 2013-08-13 at 14:30 -0700, Josh Berkus wrote:
> Currently PL/python has 1 dimension hardcoded for returning arrays:
>
> create or replace function nparr ()
> returns float[][]
> language plpythonu
> as $f$
> from numpy import array
> x = ((1.0,2.0),(3.0,4.0),(5.0,6.0),)
> return x
> $f$;

There is no way to know how many dimensions the function expects to get
back. (float[][] doesn't actually mean anything.) So when converting
the return value back to SQL, you'd have to guess, is the first element
convertible to float (how do you know?), if not, does it support the
sequence protocol, if yes, so let's try to construct a multidimensional
array. What if the first element is a float but the second is not?

It would be useful to have a solution for that, but it would need to be
more principled than what I just wrote.
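
To make the guessing problem in the message above concrete, here is a minimal sketch in plain Python of one stricter alternative: instead of looking only at the first element, walk the whole return value and accept it only if every level of nesting is consistent. The helper name infer_dims is hypothetical and is not part of PL/Python; treating strings as scalars is likewise an assumption made for the sketch.

```python
from collections.abc import Sequence


def infer_dims(value):
    """Return the dimension lengths of a rectangular nested sequence.

    Hypothetical helper, not PL/Python code. Raises ValueError when the
    nesting is ragged or mixes scalars with sequences at the same level.
    """
    def is_seq(obj):
        # Assumption: strings are sequences in Python, but count as scalars here.
        return isinstance(obj, Sequence) and not isinstance(obj, (str, bytes))

    if not is_seq(value):
        return ()  # a scalar has zero dimensions

    # Every element must report the same sub-dimensions as its siblings.
    sub_dims = None
    for item in value:
        dims = infer_dims(item)
        if sub_dims is None:
            sub_dims = dims
        elif dims != sub_dims:
            raise ValueError(
                "ragged or mixed nesting: %r vs %r" % (sub_dims, dims)
            )

    if sub_dims is None:
        sub_dims = ()  # empty sequence: one dimension of length zero
    return (len(value),) + sub_dims
```

With the tuple from the original report, infer_dims(x) gives (3, 2). A value such as [1.0, (2.0,)], where the first element is a float but the second is not, raises ValueError rather than being guessed at. Whether rejecting such input is the right behaviour is exactly the open question in this thread.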

On Wed, Aug 14, 2013 at 9:34 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On Tue, 2013-08-13 at 14:30 -0700, Josh Berkus wrote:
>> Currently PL/python has 1 dimension hardcoded for returning arrays:
>>
>> create or replace function nparr ()
>> returns float[][]
>> language plpythonu
>> as $f$
>> from numpy import array
>> x = ((1.0,2.0),(3.0,4.0),(5.0,6.0),)
>> return x
>> $f$;
>
> There is no way to know how many dimensions the function expects to get
> back. (float[][] doesn't actually mean anything.) So when converting
> the return value back to SQL, you'd have to guess, is the first element
> convertible to float (how do you know?), if not, does it support the
> sequence protocol, if yes, so let's try to construct a multidimensional
> array. What if the first element is a float but the second is not?
>
> It would be useful to have a solution for that, but it would need to be
> more principled than what I just wrote.

ndarray has a shape attribute. Perhaps they could be supported if they
follow the ndarray-like protocol? (ie: have a shape attribute)
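
As a rough illustration of the shape-attribute idea, the sketch below reads the dimensions from any object that exposes a shape attribute, without importing numpy. The names dims_from_shape and FakeArray are hypothetical stand-ins invented for this example. The only fact relied on is that a numpy ndarray's shape is a tuple of integers, so any object following the same convention would be accepted.

```python
def dims_from_shape(obj):
    """Return obj.shape as a tuple of ints, or None if there is no usable shape.

    Hypothetical helper: duck-types on a "shape" attribute the way an
    ndarray-like object would expose it.
    """
    shape = getattr(obj, "shape", None)
    if shape is None:
        return None
    try:
        dims = tuple(int(n) for n in shape)
    except (TypeError, ValueError):
        # shape was not an iterable of integer-like values
        return None
    if any(n < 0 for n in dims):
        return None
    return dims


class FakeArray:
    """Stand-in for an ndarray-like object; only the shape attribute matters."""

    def __init__(self, shape):
        self.shape = shape
```

For example, dims_from_shape(FakeArray((3, 2))) returns (3, 2), while a plain list, which has no shape attribute, returns None and would have to fall back to some other rule.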

> There is no way to know how many dimensions the function expects to get
> back. (float[][] doesn't actually mean anything.) So when converting
> the return value back to SQL, you'd have to guess, is the first element
> convertible to float (how do you know?), if not, does it support the
> sequence protocol, if yes, so let's try to construct a multidimensional
> array. What if the first element is a float but the second is not?
>
> It would be useful to have a solution for that, but it would need to be
> more principled than what I just wrote.

Well, PL/R is able to return multi-dim arrays. So we have some code
precedent for this. Mind you, there are fewer checks required for PL/R,
because like Postgres it requires each dimension of the array to have
identical length and all items to be the same type. Given that, it might
be easier to support this first for numpy, which also has the same
restrictions.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com