Re: PL/Python adding support for multi-dimensional arrays - Mailing list pgsql-hackers

From Dave Cramer
Subject Re: PL/Python adding support for multi-dimensional arrays
Date
Msg-id CADK3HH+8bgY4s=9m7GrKyknr+24CAGqo36Dfca043V_q89rHsw@mail.gmail.com
Whole thread Raw
In response to Re: PL/Python adding support for multi-dimensional arrays  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: PL/Python adding support for multi-dimensional arrays
List pgsql-hackers

On 10 August 2016 at 01:53, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

2016-08-03 13:54 GMT+02:00 Alexey Grishchenko <agrishchenko@pivotal.io>:
On Wed, Aug 3, 2016 at 12:49 PM, Alexey Grishchenko <agrishchenko@pivotal.io> wrote:
Hi

Current implementation of PL/Python does not allow the use of multi-dimensional arrays, for both input and output parameters. This forces end users to introduce workarounds like casting arrays to text before passing them to the functions and parsing them after, which is an error-prone approach

This patch adds support for multi-dimensional arrays as both input and output parameters for PL/Python functions. The number of dimensions supported is limited by Postgres MAXDIM macrovariable, by default equal to 6. Both input and output multi-dimensional arrays should have fixed dimension sizes, i.e. 2-d arrays should represent MxN matrix, 3-d arrays represent MxNxK cube, etc.

This patch does not support multi-dimensional arrays of composite types, as composite types in Python might be represented as iterators and there is no obvious way to find out when the nested array stops and composite type structure starts. For example, if we have a composite type of (int, text), we can try to return "[ [ [1,'a'], [2,'b'] ], [ [3,'c'], [4,'d'] ] ]", and it is hard to find out that the first two lists are lists, and the third one represents structure. Things are getting even more complex when you have arrays as members of composite type. This is why I think this limitation is reasonable.

Given the function:
CREATE FUNCTION test_type_conversion_array_int4(x int4[]) RETURNS int4[] AS $$
plpy.info(x, type(x))
return x
$$ LANGUAGE plpythonu;

Before patch:
# SELECT * FROM test_type_conversion_array_int4(ARRAY[[1,2,3],[4,5,6]]);
ERROR:  cannot convert multidimensional array to Python list
DETAIL:  PL/Python only supports one-dimensional arrays.
CONTEXT:  PL/Python function "test_type_conversion_array_int4"

After patch:
# SELECT * FROM test_type_conversion_array_int4(ARRAY[[1,2,3],[4,5,6]]);
INFO:  ([[1, 2, 3], [4, 5, 6]], <type 'list'>)
 test_type_conversion_array_int4 
---------------------------------
 {{1,2,3},{4,5,6}}
(1 row)

--
Best regards,
Alexey Grishchenko

Also this patch incorporates the fix for https://www.postgresql.org/message-id/CAH38_tkwA5qgLV8zPN1OpPzhtkNKQb30n3xq-2NR9jUfv3qwHA%40mail.gmail.com, as they touch the same piece of code - array manipulation in PL/Python


I am sending review of this patch:

1. The implemented functionality is clearly benefit - passing MD arrays, pretty faster passing bigger arrays
2. I was able to use this patch cleanly without any errors or warnings
3. There is no any error or warning
4. All tests passed - I tested Python 2.7 and Python 3.5
5. The code is well commented and clean
6. For this new functionality the documentation is not necessary

7. I invite more regress tests for both directions (Python <-> Postgres) for more than two dimensions

My only one objection is not enough regress tests - after fixing this patch will be ready for commiters.

Good work, Alexey

Thank you

Regards

Pavel
 
--
Best regards,
Alexey Grishchenko



Pavel,

I will pick this up.

pgsql-hackers by date:

Previous
From: Dave Cramer
Date:
Subject: Re: Fix for PL/Python slow input arrays traversal issue
Next
From: Pavel Stehule
Date:
Subject: Re: IF (NOT) EXISTS in psql-completion