Thread: Array of tuples as a parameter, with type casts

Array of tuples as a parameter, with type casts

From
Vladimir Ryabtsev
Date:
I have a query like this:

query = '''
insert into t
select * from unnest (%s)
as t1(c1 timestamp, c2 int)
'''

The reason for this approach is, obviously, to reduce the number of server round trips when inserting many rows.
Usage:

from datetime import datetime
import psycopg2
db = psycopg2.connect('postgres://postgres:******@host/postgres?sslmode=prefer')
cur = db.cursor()
cur.execute(query, ([(datetime.now(), 1), (datetime.now(), 2)],))
db.commit()
db.close()

Recently the column set needed to be extended with a text and a bigint column. Neither of them works:

from datetime import datetime
import psycopg2
query = '''insert into t
select * from unnest (%s)
as t1(c1 timestamp, c2 int, c3 text, c4 bigint)
'''
db = psycopg2.connect('postgres://postgres@localhost/postgres')
cur = db.cursor()
cur.execute(query, ([(datetime.now(), 1, 'abc', 100), (datetime.now(), 2, 'xyz', 200)],))
db.commit()
db.close()

It throws:
psycopg2.errors.DatatypeMismatch: function return row and query-specified return row do not match
DETAIL:  Returned type unknown at ordinal position 3, but query expects text.

The problem is that the library sends the following request to the DBMS:

insert into t
select * from unnest (ARRAY[('2019-11-29T19:48:19.683310'::timestamp, 1, 'abc', 100),('2019-11-29T19:48:19.683320'::timestamp, 2, 'xyz', 200)])
as t1(c1 timestamp, c2 int, c3 text, c4 bigint)

For whatever reason this fails with the above error, but OK, that is related to Postgres, not to psycopg2.
I can make it work by specifying type casts for text and bigint columns:

insert into t
select * from unnest (ARRAY[('2019-11-29T19:48:19.683310'::timestamp, 1, 'abc'::text, 100::bigint),('2019-11-29T19:48:19.683320'::timestamp, 2, 'xyz'::text, 200::bigint)])
as t1(c1 timestamp, c2 int, c3 text, c4 bigint);

But I need a way to make the psycopg2 module do that. Another workaround would be creating a row type for the desired set of columns and casting %s to this type[], but I would rather not create additional objects in the database, because the database is pretty much out of my control.

Your help is much appreciated.

P.S. I am aware of other solutions such as execute_batch(), execute_values(), etc. Take this question as a theoretical one; I just want to understand whether the user is able to control this particular aspect of the module.
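
For completeness, here is a minimal sketch of the execute_values() helper mentioned above (it lives in psycopg2.extras; given a cursor cur as in the examples):

from psycopg2.extras import execute_values

# expands the single %s into a multi-row VALUES list, one statement total
execute_values(cur, 'insert into t values %s',
               [(datetime.now(), 1, 'abc', 100),
                (datetime.now(), 2, 'xyz', 200)])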

psycopg2-binary==2.8.4
Postgres ~ any

Re: Array of tuples as a parameter, with type casts

From
Adrian Klaver
Date:
On 11/29/19 8:30 PM, Vladimir Ryabtsev wrote:
> I have a query like this:
> 
> query = '''
> insert into t
> select * from unnest (%s)
> as t1(c1 timestamp, c2 int)
> '''
> 
> The reason for this approach is, obviously, to reduce the number of 
> server round trips when inserting many rows.
> Usage:
> 
> from datetime import datetime
> import psycopg2
> db = psycopg2.connect('postgres://postgres:******@host/postgres?sslmode=prefer')
> cur = db.cursor()
> cur.execute(query, ([(datetime.now(), 1), (datetime.now(), 2)],))
> db.commit()
> db.close()
> 
> Recently the column set needed to be extended with a text and a bigint 
> column. Neither of them works:
> 
> from datetime import datetime
> import psycopg2
> query = '''insert into t
> select * from unnest (%s)
> as t1(c1 timestamp, c2 int, c3 text, c4 bigint)
> '''
> db = psycopg2.connect('postgres://postgres@localhost/postgres')
> cur = db.cursor()
> cur.execute(query, ([(datetime.now(), 1, 'abc', 100), (datetime.now(), 2, 'xyz', 200)],))
> db.commit()
> db.close()
> 
> It throws:
> psycopg2.errors.DatatypeMismatch: function return row and query-specified return row do not match
> DETAIL:  Returned type unknown at ordinal position 3, but query expects text.
> 
> The problem is that the library sends the following request to the DBMS:
> 
> insert into t
> select * from unnest (ARRAY[('2019-11-29T19:48:19.683310'::timestamp, 1, 'abc', 100),('2019-11-29T19:48:19.683320'::timestamp, 2, 'xyz', 200)])
> as t1(c1 timestamp, c2 int, c3 text, c4 bigint)
> 
> For whatever reason this fails with the above error, but OK, that is 
> related to Postgres, not to psycopg2.
> I can make it work by specifying type casts for text and bigint columns:
> 
> insert into t
> select * from unnest (ARRAY[('2019-11-29T19:48:19.683310'::timestamp, 1, 'abc'::text, 100::bigint),('2019-11-29T19:48:19.683320'::timestamp, 2, 'xyz'::text, 200::bigint)])
> as t1(c1 timestamp, c2 int, c3 text, c4 bigint);

On a hunch try changing:

'''insert into t
select * from unnest (%s)
as t1(c1 timestamp, c2 int, c3 text, c4 bigint)'''

to

'''insert into t
select * from unnest (%s)
as t1(c1 timestamp, c2 int, c3 varchar, c4 bigint)'''



-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Array of tuples as a parameter, with type casts

From
Vladimir Ryabtsev
Date:
It does not change much; the error message just says it now expects 'character varying'. And anyway, the bigint problem is next in line.

--
Regards,
Vladimir

Re: Array of tuples as a parameter, with type casts

From
Adrian Klaver
Date:
On 12/3/19 4:00 PM, Vladimir Ryabtsev wrote:
> It does not change much; the error message just says it now expects 
> 'character varying'. And anyway, the bigint problem is next in line.

Yeah it is related to this:

https://www.postgresql.org/docs/11/functions-array.html
unnest(anyarray)     setof anyelement

Dealing with anyelement can be tricky.

I tried out an alternate solution:

json_query = "select * from json_to_recordset(%s) as t(a timestamp, b varchar, c bigint)"

cur.execute(json_query, [Json([{'a': datetime.now().isoformat(), 'b': 'abc', 'c': 100}, {'a': datetime.now().isoformat(), 'b': 'xyz', 'c': 200}])])

Which results in:

test_(postgres)# select * from json_to_recordset(E'[{"a": "2019-12-04T10:40:22.823910", "b": "abc", "c": 100}, {"a": "2019-12-04T10:40:22.823918", "b": "xyz", "c": 200}]') as t(a timestamp, b varchar, c bigint);
              a              |  b  |  c
----------------------------+-----+-----
  2019-12-04 10:40:22.82391  | abc | 100
  2019-12-04 10:40:22.823918 | xyz | 200
(2 rows)



-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Array of tuples as a parameter, with type casts

From
Adrian Klaver
Date:
On 12/4/19 10:44 AM, Adrian Klaver wrote:
> On 12/3/19 4:00 PM, Vladimir Ryabtsev wrote:
>> It does not change much; the error message just says it now expects 
>> 'character varying'. And anyway, the bigint problem is next in line.
> 
> Yeah it is related to this:
> 
> https://www.postgresql.org/docs/11/functions-array.html
> unnest(anyarray)     setof anyelement
> 
> Dealing with anyelement can be tricky.
> 
> I tried out an alternate solution:
> 
> json_query = "select * from json_to_recordset(%s) as t(a timestamp, b varchar, c bigint)"
> 
> cur.execute(json_query, [Json([{'a': datetime.now().isoformat(), 'b': 'abc', 'c': 100}, {'a': datetime.now().isoformat(), 'b': 'xyz', 'c': 200}])])

Should have added that Json comes from:

from psycopg2.extras import Json

http://initd.org/psycopg/docs/extras.html?highlight=json#json-adaptation
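
Putting the two messages together, a complete runnable sketch of this approach, adapted to the four-column table from the original post, might look like:

from datetime import datetime
import psycopg2
from psycopg2.extras import Json

query = '''insert into t
select * from json_to_recordset(%s)
as t1(c1 timestamp, c2 int, c3 text, c4 bigint)'''

db = psycopg2.connect('postgres://postgres@localhost/postgres')
cur = db.cursor()
# json_to_recordset() takes the column types from the AS clause, so no
# per-value casts are needed; the JSON keys must match the column names.
cur.execute(query, [Json([
    {'c1': datetime.now().isoformat(), 'c2': 1, 'c3': 'abc', 'c4': 100},
    {'c1': datetime.now().isoformat(), 'c2': 2, 'c3': 'xyz', 'c4': 200},
])])
db.commit()
db.close()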




-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Array of tuples as a parameter, with type casts

From
Vladimir Ryabtsev
Date:
Yes, I know it can be done with Json, too. I was particularly interested in whether it can be controlled in the psycopg2 module, because the serialization to the ARRAY[...] string is performed by the module, and in this particular case an explicit specification of types is required; but it seems the module does not provide any control over it...
Thanks anyway!

--
Regards,
Vladimir


Re: Array of tuples as a parameter, with type casts

From
Daniele Varrazzo
Date:
On Wed, Dec 4, 2019 at 9:33 PM Vladimir Ryabtsev <greatvovan@gmail.com> wrote:
>
> Yes, I know it can be done with Json, too. I was particularly interested
> in whether it can be controlled in the psycopg2 module, because the
> serialization to the ARRAY[...] string is performed by the module, and in
> this particular case an explicit specification of types is required; but
> it seems the module does not provide any control over it...

You can have control by adding casts next to the placeholders, such as
'%s::type'. For your case you can create a Postgres composite type and
cast the placeholder to '%s::mytype[]'.
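
For illustration, a minimal sketch of that workaround (the composite type name 'mytype' is hypothetical, matching the columns from the original post):

-- once, in the database:
CREATE TYPE mytype AS (c1 timestamp, c2 int, c3 text, c4 bigint);

# then, in Python, one cast in the query resolves all the element types:
cur.execute('insert into t select * from unnest(%s::mytype[])',
            ([(datetime.now(), 1, 'abc', 100),
              (datetime.now(), 2, 'xyz', 200)],))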

The problem in mapping Python lists to Postgres arrays is that in
Python the list is _the_ type, whereas in Postgres arrays are "arrays
of a type". In Python "type([1]) == type(['a'])"; in Postgres they are
not the same: "pg_typeof('{}'::text[]) != pg_typeof('{}'::int[])" -
arrays are homogeneous, and arrays of different element types are
themselves different types. There is no sane or efficient algorithm to
attach a Postgres cast by looking at a generic Python list. Do you look
at the first element? What if it's None (which in Python is always the
same NoneType, whereas in Postgres NULLs are typed)? What if the array
is empty? The whole idea of attaching a cast to the values, as psycopg
does for dates - typing '2019-12-31'::date into the query - is actually
a bad idea, because that's passing an SQL snippet to the query rather
than a value, so it couldn't be used with server-side placeholders etc.

However you can use the same trick, subclassing one of your types to
make it adapt into a typed expression and so resolve type mismatches or
ambiguities. If instead of a tuple you use a type of yours, e.g. a
namedtuple, you can make it adapt like super() does but appending a
'::type', and ARRAY[] will work because it will try to sniff the type
from the first element (bets are off if it's None - no idea whether
Postgres will look at the first non-NULL or will barf, and if it does
look, what if your list is empty...). Or you can subclass the list
instead and make it spew a '::mytype[]', which would work even if the
list is empty or contains None.
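
A minimal sketch of the list-subclass variant (the class name and the 'mytype' composite type are hypothetical; this relies on psycopg2's standard adaptation protocol):

from psycopg2.extensions import adapt, register_adapter, AsIs

class MyTypeList(list):
    # Marker class: a list that should be sent as mytype[].
    pass

def adapt_mytype_list(lst):
    # Let psycopg2 build the ARRAY[...] literal as it normally would,
    # then append the array cast, so Postgres knows the element type
    # even if the list is empty or contains None.
    quoted = adapt(list(lst)).getquoted().decode()
    return AsIs(quoted + '::mytype[]')

register_adapter(MyTypeList, adapt_mytype_list)

# usage: cur.execute('insert into t select * from unnest(%s)',
#                    (MyTypeList(rows),))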

Using a '%s::mytype[]' *in the query* is the best option IMO: usually
if you are writing a certain query you know what types are involved,
whereas the values might be coming from far away.

-- Daniele



RE: Array of tuples as a parameter, with type casts

From
David Raymond
Date:

> Another workaround would be creating a row type for the desired set of columns and casting %s to this type[], but I would rather not create additional objects in the database, because the database is pretty much out of my control.

Isn’t every table its own type? So could you do this?

insert into t
select * from unnest(%s::t[])...

https://www.postgresql.org/docs/current/rowtypes.html

 

Whenever you create a table, a composite type is also automatically created, with the same name as the table, to represent the table's row type. For example, had we said:

CREATE TABLE inventory_item (
    name            text,
    supplier_id     integer REFERENCES suppliers,
    price           numeric CHECK (price > 0)
);

then the same inventory_item composite type shown above would come into being as a byproduct, and could be used just as above.

...

Re: Array of tuples as a parameter, with type casts

From
Vladimir Ryabtsev
Date:
> Isn’t every table its own type? So could you do this?

Wow, you are right! I completely forgot about it!
I don't even need the "as t(...)" column list.
Thanks!
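
For the record, a complete sketch of the resulting solution might look like this (assuming table t has exactly the four columns used above, so that its implicit row type matches the tuples):

from datetime import datetime
import psycopg2

# Casting the parameter to t[] (the table's own row type) tells Postgres
# the type of every column at once, with no per-value casts needed.
query = 'insert into t select * from unnest(%s::t[])'

db = psycopg2.connect('postgres://postgres@localhost/postgres')
cur = db.cursor()
cur.execute(query, ([(datetime.now(), 1, 'abc', 100),
                     (datetime.now(), 2, 'xyz', 200)],))
db.commit()
db.close()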

--
Regards,
Vladimir