Thread: libpq test suite

libpq test suite

From
Manlio Perillo
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi.

Sorry for the question, but where can I find the libpq test suite?
I can not find it in the PostgreSQL sources; it seems that there are
only some examples, in src/test/examples.


I'm planning to add some new features to libpq:
   * make PQsendPrepare send a "Describe Portal" protocol message   * add support for setting per column binary result
format  * add direct support for portals
 

(I think I need these for a project I'm working on [1]),

How can I check if the new code does not break existing usage?


[1] A new Python PostgreSQL driver, implemented following   http://wiki.postgresql.org/wiki/Driver_development   and
withmany optimization (compared to psycopg2) enabled by the   use of the extended query protocol
 


Thanks  Manlio Perillo
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlEc0iAACgkQscQJ24LbaURiVQCfaANOEaIJHdq/ZoQYx1Hu8wZr
On8An202Fqc928K9NuDRen/0B06sN/RS
=Qngt
-----END PGP SIGNATURE-----



Re: libpq test suite

From
Albe Laurenz
Date:
Manlio Perillo wrote:
> Sorry for the question, but where can I find the libpq test suite?
> I can not find it in the PostgreSQL sources; it seems that there are
> only some examples, in src/test/examples.

The regression tests are in src/interfaces/libpq/test
and currently contain only URL parsing tests.

> I'm planning to add some new features to libpq:
> 
>     * make PQsendPrepare send a "Describe Portal" protocol message
>     * add support for setting per column binary result format

I suggested exactly that here:
http://www.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C208A4EDD4@exadv11.host.magwien.gv.at
and met resistance:
- one can use libpqtypes
- I couldn't find a convincing use case
- it clutters up the API

>     * add direct support for portals
> 
> (I think I need these for a project I'm working on [1]),
> 
> How can I check if the new code does not break existing usage?
> 
> 
> [1] A new Python PostgreSQL driver, implemented following
>     http://wiki.postgresql.org/wiki/Driver_development
>     and with many optimization (compared to psycopg2) enabled by the
>     use of the extended query protocol

I think that you'll need to explain in more detail why
your proposed additions would be necessary for your project.
Especially since many good drivers have been written against
libpq as it is.

Yours,
Laurenz Albe

[RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)

From
Manlio Perillo
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Il 14/02/2013 14:06, Albe Laurenz ha scritto:
> Manlio Perillo wrote:
>> Sorry for the question, but where can I find the libpq test suite?
>> I can not find it in the PostgreSQL sources; it seems that there are
>> only some examples, in src/test/examples.
> 
> The regression tests are in src/interfaces/libpq/test
> and currently contain only URL parsing tests.
> 

Ok, thanks.

Since I'm not sure if I should add a new test here, I'll use the test
suite of my project, since it contains an (almost) 1:1 wrapper around libpq.

>> I'm planning to add some new features to libpq:
>>
>>     * make PQsendPrepare send a "Describe Portal" protocol message
>>     * add support for setting per column binary result format
> 
> I suggested exactly that here:
> http://www.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C208A4EDD4@exadv11.host.magwien.gv.at
> and met resistance:
> - one can use libpqtypes
> - I couldn't find a convincing use case
> - it clutters up the API
> 

For my Python DBAPI2 PostgreSQL driver I plan the following optimizations:

1) always use PQsendQueryParams functions.
  This will avoid having to escape parameters, as it is done in  psycopg2  (IMHO it still use simple query protocol for
compatibilitypurpose)
 

2) when the driver detects a Python string is being sent to the  database, use binary format.
  As a special case, this will avoid having to use PQescapeByteaConn  when sending binary string (e.g. byte strings in
Python3.x)
 

3) enable use of prepared statements, but only if the user requested it,  using setinputsizes function (used to set the
Oidsof the parameters)
 

4) when using a prepared statement, check the Oids of the result tuple.
  In order to make this efficient, I proposed a patch to send a  Describe Portal message in PQsendPrepare function.
  When the driver detects that one of the result column is a string  type, set the result format for that column to
binary.
  As a special case, this will avoid having to use PQunescapeBytea  when receiving a bytea data.
  This is currently impossible, using libpq API.

5) when returning the result set of a query, after a call to  cursor.fetchall(), do not convert all the data to Python
objects.
  This will be done only "on request".
  This should optimize memory usage, as reported in:  http://wiki.postgresql.org/wiki/Python_PostgreSQL_Driver_TODO

6) make available the use of PQsetSingleRowMode, to optimize large  result set (as an option to the connection.cursor
method)

7) as a generalization of PQsetSingleRowMode, expose in libpq API some  of protocol internal portal API.
  One possible idea is to add a PQsetRowSize function, that will set  the size of the result set, to be used in the
Executeprotocol  message (currently libpq always set it to 0, to get the entire  result set, and it does not support
thePortal Suspended message)
 
  This will avoid having to use named cursor, as it is done in psycopg.
  I'll try to make a patch to check if this is feasible, can be  done efficiently, and the new API has a minimal impact
onexisting  API
 

Note that I will have to code these features, in order to check they
will work as I expect.


> [...]
>>
>> [1] A new Python PostgreSQL driver, implemented following
>>     http://wiki.postgresql.org/wiki/Driver_development
>>     and with many optimization (compared to psycopg2) enabled by the
>>     use of the extended query protocol
> 
> I think that you'll need to explain in more detail why
> your proposed additions would be necessary for your project.
> Especially since many good drivers have been written against
> libpq as it is.
> 
> Yours,
> Laurenz Albe
> 

Thanks   Manlio Perillo

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlEc81MACgkQscQJ24LbaURO9ACfctOREoaAtMDm06Sg+qv5jesj
iW0An1CVAOaHzYaSn+P1AIJvXpI7nVT0
=rK4j
-----END PGP SIGNATURE-----



Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)

From
Jonathan Rogers
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

A number of the described features sound quite useful. Is it not
practical to extend an existing library such as psycopg2? What method
will you use to call libpq functions? As you are no doubt aware,
psycopg2 uses the traditional CPython API but there is a fork which uses
ctypes instead, enabling it to work with CPython and Pypy.

Manlio Perillo wrote:

> For my Python DBAPI2 PostgreSQL driver I plan the following optimizations:
> 
> 2) when the driver detects a Python string is being sent to the
>    database, use binary format.

What exactly do you mean by "Python string"?


- -- 
Jonathan Ross Rogers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with undefined - http://www.enigmail.net/

iEYEARECAAYFAlEdHFAACgkQVmXTv6uMqqOlzgCgxcGtjwFqK1AMKAED9rK5YSOa
3AMAoJhw4197HPBrjpxG/iSLK43B1B3j
=RnaP
-----END PGP SIGNATURE-----



Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)

From
Manlio Perillo
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Il 14/02/2013 18:18, Jonathan Rogers ha scritto:
> A number of the described features sound quite useful. Is it not
> practical to extend an existing library such as psycopg2?

I suspect there are compatibility issues.

> What method
> will you use to call libpq functions? As you are no doubt aware,
> psycopg2 uses the traditional CPython API but there is a fork which uses
> ctypes instead, enabling it to work with CPython and Pypy.
> 

I'm implementing a prototype version, in pure Python with ctypes.
When the prototype is ready, I will implement a CPython extension module
(since the purpose of writing yet another driver is to make it more
efficient than the current best driver).

I will also discuss about porting some of the features to psycopg2 (but
first I need to have a working driver).

> Manlio Perillo wrote:
> 
>> For my Python DBAPI2 PostgreSQL driver I plan the following optimizations:
> 
>> 2) when the driver detects a Python string is being sent to the
>>    database, use binary format.
> 
> What exactly do you mean by "Python string"?
> 

A Python string object.

The libpq interface will implement the functions pg_str_encode(string object) -> bytes pg_str_decode(bytes, result
object)-> string object
 
as described in
http://wiki.postgresql.org/wiki/Driver_development

but with some differences.

The pg_str_encode function will return a tuple with the raw bytes and
the "suggested" parameter format.

As an example, pg_str_encode(<Python 3.x byte string>) will return the
byte string and 1 (since it is a binary string, and I want to avoid to
use PQescapeBytea function).

For the DBAPI interface, I plan to implement a generic API to map
PostgreSQL types to Python types (different from the one implemented in
psycopg2); something like:

pgtype = connection.create_type_map(   <python type object>, <postgresql type name>,   <input text function>, <output
textfunction>,   <input binary function - optional>,   <input binary function - optional>)
 

but I have yet to decide how to detect the param format to use.

Maybe there will be only one output function, that will decide the best
format to be used.



Regards  Manlio
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlEdIYQACgkQscQJ24LbaUSfBACfWR7eD6pdyipC3/fljUiuelx6
GV4An3agVt4tx0E/JDUvO0iM8/BiZT1o
=xckI
-----END PGP SIGNATURE-----



Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)

From
Peter Eisentraut
Date:
On 2/14/13 9:23 AM, Manlio Perillo wrote:
> 1) always use PQsendQueryParams functions.
> 
>    This will avoid having to escape parameters, as it is done in
>    psycopg2
>    (IMHO it still use simple query protocol for compatibility purpose)

I think the reason this doesn't work is that in order to prepare a query
you need to know the parameter types, but you don't know that in Python,
or at least with the way the DB-API works.  For example, if you write

cur.execute("SELECT * FROM tbl WHERE a = %s AND b = %s", (val1, val2))

what types will you pass to PQsendQueryParams?

You can make some obvious heuristics, but this sort of thing can get
complicated pretty quickly.



Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)

From
Manlio Perillo
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Il 14/02/2013 20:01, Peter Eisentraut ha scritto:
> On 2/14/13 9:23 AM, Manlio Perillo wrote:
>> 1) always use PQsendQueryParams functions.
>>
>>    This will avoid having to escape parameters, as it is done in
>>    psycopg2
>>    (IMHO it still use simple query protocol for compatibility purpose)
> 
> I think the reason this doesn't work is that in order to prepare a query
> you need to know the parameter types, but you don't know that in Python,
> or at least with the way the DB-API works.

Hint: .setinputsizes.

In my implementation, prepared queries are **only** used if user calls
setinputsizes; if setinputsizes is not called, preparing a query can
cause performance loss, so it is better to not prepare one.

>  For example, if you write
> 
> cur.execute("SELECT * FROM tbl WHERE a = %s AND b = %s", (val1, val2))
> 
> what types will you pass to PQsendQueryParams?
> 

1) if setinputsizes is used, use the type specified here.

2) if setinputsizes is not used, query the driver's type catalog, to  get the postgresql type oid associated with a
Pythonobject; e.g.:
 
      pg_type = connection.get_type_by_object(val1)      buf, format = pg_type.output_function(val1)      param_oid =
pg_type.oid

> You can make some obvious heuristics, but this sort of thing can get
> complicated pretty quickly.

A non trivial case if when val is a list, that should be mapped to a
PostgreSQL array.

However, you can always set the oid to 0, and let PostgreSQL deduce the
type, as it is done in psycopg2.  If user called setinputsizes, we are
happy.


Regards   Manlio
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlEdO54ACgkQscQJ24LbaURFWACgkG3dkDYUA3tHooiXpGDf8hm0
Fz0AnjhFDwJU/XSqtgPyHwuHw9+GiRlv
=m68+
-----END PGP SIGNATURE-----



Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)

From
Marko Tiikkaja
Date:
On 14/02/2013 20:01, Peter Eisentraut wrote:
> On 2/14/13 9:23 AM, Manlio Perillo wrote:
>> 1) always use PQsendQueryParams functions.
>>
>>     This will avoid having to escape parameters, as it is done in
>>     psycopg2
>>     (IMHO it still use simple query protocol for compatibility purpose)
>
> I think the reason this doesn't work is that in order to prepare a query
> you need to know the parameter types, but you don't know that in Python,
> or at least with the way the DB-API works.  For example, if you write
>
> cur.execute("SELECT * FROM tbl WHERE a = %s AND b = %s", (val1, val2))
>
> what types will you pass to PQsendQueryParams?

Pardon me if this is obvious, but why would you need to pass any types 
at all?  Assuming we're still talking about PQsendQueryParams and not an 
explicit prepare/execute cycle..


Regards,
Marko Tiikkaja



Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)

From
Andrew McNamara
Date:
>For my Python DBAPI2 PostgreSQL driver I plan the following optimizations:

I suggest you have a look at my Python ocpgdb driver:
   http://code.google.com/p/ocpgdb/

It uses the v3 binary protocol exclusively (to avoid the usual escaping
security issues). A number of gotchyas were discovered along the way -
in particular, you must be a lot more careful about types (as you note
in a later reply). There were also some issues with the v3 protocol,
most of which have been fixed now.

ocpgdb does not implement everything, just the bits I needed. That said,
other people/projects are using it in production, and it's proven to be
fast and stable.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/



Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)

From
Manlio Perillo
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Il 15/02/2013 02:45, Andrew McNamara ha scritto:
>> For my Python DBAPI2 PostgreSQL driver I plan the following optimizations:
> 
> I suggest you have a look at my Python ocpgdb driver:
> 
>     http://code.google.com/p/ocpgdb/
> 

Thanks, I did not know it.

> It uses the v3 binary protocol exclusively (to avoid the usual escaping
> security issues). A number of gotchyas were discovered along the way -
> in particular, you must be a lot more careful about types (as you note
> in a later reply).

Note that this query: curs.execute('SELECT * FROM foo WHERE bah > %s', '2006-1-1')

is IMHO incorrect, as per DBAPI 2.0:
http://www.python.org/dev/peps/pep-0249/#type-objects-and-constructors


> There were also some issues with the v3 protocol,
> most of which have been fixed now.
> 

I hope the issues are discussed in the commit log messages; I'm creating
a git local mirror of the svn repository.

> ocpgdb does not implement everything, just the bits I needed. That said,
> other people/projects are using it in production, and it's proven to be
> fast and stable.
> 


Regards  Manlio
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlEemTYACgkQscQJ24LbaUTujgCfZhrNTsqy/PvRJ4qwLVqy8QVT
lNwAnjsJooEv/vss32RNMKEISOdZ16F1
=nKO5
-----END PGP SIGNATURE-----



Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)

From
Peter Eisentraut
Date:
On 2/14/13 2:42 PM, Marko Tiikkaja wrote:
>> I think the reason this doesn't work is that in order to prepare a query
>> you need to know the parameter types, but you don't know that in Python,
>> or at least with the way the DB-API works.  For example, if you write
>>
>> cur.execute("SELECT * FROM tbl WHERE a = %s AND b = %s", (val1, val2))
>>
>> what types will you pass to PQsendQueryParams?
> 
> Pardon me if this is obvious, but why would you need to pass any types
> at all?  Assuming we're still talking about PQsendQueryParams and not an
> explicit prepare/execute cycle..

Well, PQsendQueryParams() requires types to be passed, doesn't it?




Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)

From
Daniele Varrazzo
Date:
On Fri, Feb 15, 2013 at 9:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On 2/14/13 2:42 PM, Marko Tiikkaja wrote:
>>> I think the reason this doesn't work is that in order to prepare a query
>>> you need to know the parameter types, but you don't know that in Python,
>>> or at least with the way the DB-API works.  For example, if you write
>>>
>>> cur.execute("SELECT * FROM tbl WHERE a = %s AND b = %s", (val1, val2))
>>>
>>> what types will you pass to PQsendQueryParams?
>>
>> Pardon me if this is obvious, but why would you need to pass any types
>> at all?  Assuming we're still talking about PQsendQueryParams and not an
>> explicit prepare/execute cycle..
>
> Well, PQsendQueryParams() requires types to be passed, doesn't it?

No, not necessarily: they are inferred by the context if they are not specified.

I've had in mind for a long time to use the *Params() functions in
psycopg (although it would be largely not backwards compatible, hence
to be done on user request and not by default). Psycopg has all the
degrees of freedom in keeping the two implementations alive (the
non-*params for backward compatibility, the *params for future usage).
I'd drafted a plan on the psycopg ML some times ago. But I don't have
a timeline for that: it's a major work and without pressing
motivations to do it.

-- Daniele



Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)

From
"P. Christeas"
Date:
On Thursday 14 February 2013, Manlio Perillo wrote:
> Il 14/02/2013 14:06, Albe Laurenz ha scritto:
> > Manlio Perillo wrote:
> >> Sorry for the question, but where can I find the libpq test suite?
> >> I can not find it in the PostgreSQL sources; it seems that there are
> >> only some examples, in src/test/examples.
> > 
> For my Python DBAPI2 PostgreSQL driver I plan the following optimizations:
> 
> 1) always use PQsendQueryParams functions.
> 
>    This will avoid having to escape parameters, as it is done in
>    psycopg2
>    (IMHO it still use simple query protocol for compatibility purpose)
> 
> 2) when the driver detects a Python string is being sent to the
>    database, use binary format.
> 
>    As a special case, this will avoid having to use PQescapeByteaConn
>    when sending binary string (e.g. byte strings in Python 3.x)
> ....

Perhaps you could also see some attempt I'd made to support binary protocol 
inside psycopg2, some time ago:

https://github.com/xrg/psycopg/tree/execparams2



-- 
Say NO to spam and viruses. Stop using Microsoft Windows!