Thread: libpq test suite
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi. Sorry for the question, but where can I find the libpq test suite? I can not find it in the PostgreSQL sources; it seems that there are only some examples, in src/test/examples. I'm planning to add some new features to libpq: * make PQsendPrepare send a "Describe Portal" protocol message * add support for setting per column binary result format * add direct support for portals (I think I need these for a project I'm working on [1]), How can I check if the new code does not break existing usage? [1] A new Python PostgreSQL driver, implemented following http://wiki.postgresql.org/wiki/Driver_development and withmany optimization (compared to psycopg2) enabled by the use of the extended query protocol Thanks Manlio Perillo -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlEc0iAACgkQscQJ24LbaURiVQCfaANOEaIJHdq/ZoQYx1Hu8wZr On8An202Fqc928K9NuDRen/0B06sN/RS =Qngt -----END PGP SIGNATURE-----
Manlio Perillo wrote: > Sorry for the question, but where can I find the libpq test suite? > I can not find it in the PostgreSQL sources; it seems that there are > only some examples, in src/test/examples. The regression tests are in src/interfaces/libpq/test and currently contain only URL parsing tests. > I'm planning to add some new features to libpq: > > * make PQsendPrepare send a "Describe Portal" protocol message > * add support for setting per column binary result format I suggested exactly that here: http://www.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C208A4EDD4@exadv11.host.magwien.gv.at and met resistance: - one can use libpqtypes - I couldn't find a convincing use case - it clutters up the API > * add direct support for portals > > (I think I need these for a project I'm working on [1]), > > How can I check if the new code does not break existing usage? > > > [1] A new Python PostgreSQL driver, implemented following > http://wiki.postgresql.org/wiki/Driver_development > and with many optimization (compared to psycopg2) enabled by the > use of the extended query protocol I think that you'll need to explain in more detail why your proposed additions would be necessary for your project. Especially since many good drivers have been written against libpq as it is. Yours, Laurenz Albe
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Il 14/02/2013 14:06, Albe Laurenz ha scritto: > Manlio Perillo wrote: >> Sorry for the question, but where can I find the libpq test suite? >> I can not find it in the PostgreSQL sources; it seems that there are >> only some examples, in src/test/examples. > > The regression tests are in src/interfaces/libpq/test > and currently contain only URL parsing tests. > Ok, thanks. Since I'm not sure if I should add a new test here, I'll use the test suite of my project, since it contains an (almost) 1:1 wrapper around libpq. >> I'm planning to add some new features to libpq: >> >> * make PQsendPrepare send a "Describe Portal" protocol message >> * add support for setting per column binary result format > > I suggested exactly that here: > http://www.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C208A4EDD4@exadv11.host.magwien.gv.at > and met resistance: > - one can use libpqtypes > - I couldn't find a convincing use case > - it clutters up the API > For my Python DBAPI2 PostgreSQL driver I plan the following optimizations: 1) always use PQsendQueryParams functions. This will avoid having to escape parameters, as it is done in psycopg2 (IMHO it still use simple query protocol for compatibilitypurpose) 2) when the driver detects a Python string is being sent to the database, use binary format. As a special case, this will avoid having to use PQescapeByteaConn when sending binary string (e.g. byte strings in Python3.x) 3) enable use of prepared statements, but only if the user requested it, using setinputsizes function (used to set the Oidsof the parameters) 4) when using a prepared statement, check the Oids of the result tuple. In order to make this efficient, I proposed a patch to send a Describe Portal message in PQsendPrepare function. When the driver detects that one of the result column is a string type, set the result format for that column to binary. As a special case, this will avoid having to use PQunescapeBytea when receiving a bytea data. This is currently impossible, using libpq API. 5) when returning the result set of a query, after a call to cursor.fetchall(), do not convert all the data to Python objects. This will be done only "on request". This should optimize memory usage, as reported in: http://wiki.postgresql.org/wiki/Python_PostgreSQL_Driver_TODO 6) make available the use of PQsetSingleRowMode, to optimize large result set (as an option to the connection.cursor method) 7) as a generalization of PQsetSingleRowMode, expose in libpq API some of protocol internal portal API. One possible idea is to add a PQsetRowSize function, that will set the size of the result set, to be used in the Executeprotocol message (currently libpq always set it to 0, to get the entire result set, and it does not support thePortal Suspended message) This will avoid having to use named cursor, as it is done in psycopg. I'll try to make a patch to check if this is feasible, can be done efficiently, and the new API has a minimal impact onexisting API Note that I will have to code these features, in order to check they will work as I expect. > [...] >> >> [1] A new Python PostgreSQL driver, implemented following >> http://wiki.postgresql.org/wiki/Driver_development >> and with many optimization (compared to psycopg2) enabled by the >> use of the extended query protocol > > I think that you'll need to explain in more detail why > your proposed additions would be necessary for your project. > Especially since many good drivers have been written against > libpq as it is. > > Yours, > Laurenz Albe > Thanks Manlio Perillo -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlEc81MACgkQscQJ24LbaURO9ACfctOREoaAtMDm06Sg+qv5jesj iW0An1CVAOaHzYaSn+P1AIJvXpI7nVT0 =rK4j -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 A number of the described features sound quite useful. Is it not practical to extend an existing library such as psycopg2? What method will you use to call libpq functions? As you are no doubt aware, psycopg2 uses the traditional CPython API but there is a fork which uses ctypes instead, enabling it to work with CPython and Pypy. Manlio Perillo wrote: > For my Python DBAPI2 PostgreSQL driver I plan the following optimizations: > > 2) when the driver detects a Python string is being sent to the > database, use binary format. What exactly do you mean by "Python string"? - -- Jonathan Ross Rogers -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEYEARECAAYFAlEdHFAACgkQVmXTv6uMqqOlzgCgxcGtjwFqK1AMKAED9rK5YSOa 3AMAoJhw4197HPBrjpxG/iSLK43B1B3j =RnaP -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Il 14/02/2013 18:18, Jonathan Rogers ha scritto: > A number of the described features sound quite useful. Is it not > practical to extend an existing library such as psycopg2? I suspect there are compatibility issues. > What method > will you use to call libpq functions? As you are no doubt aware, > psycopg2 uses the traditional CPython API but there is a fork which uses > ctypes instead, enabling it to work with CPython and Pypy. > I'm implementing a prototype version, in pure Python with ctypes. When the prototype is ready, I will implement a CPython extension module (since the purpose of writing yet another driver is to make it more efficient than the current best driver). I will also discuss about porting some of the features to psycopg2 (but first I need to have a working driver). > Manlio Perillo wrote: > >> For my Python DBAPI2 PostgreSQL driver I plan the following optimizations: > >> 2) when the driver detects a Python string is being sent to the >> database, use binary format. > > What exactly do you mean by "Python string"? > A Python string object. The libpq interface will implement the functions pg_str_encode(string object) -> bytes pg_str_decode(bytes, result object)-> string object as described in http://wiki.postgresql.org/wiki/Driver_development but with some differences. The pg_str_encode function will return a tuple with the raw bytes and the "suggested" parameter format. As an example, pg_str_encode(<Python 3.x byte string>) will return the byte string and 1 (since it is a binary string, and I want to avoid to use PQescapeBytea function). For the DBAPI interface, I plan to implement a generic API to map PostgreSQL types to Python types (different from the one implemented in psycopg2); something like: pgtype = connection.create_type_map( <python type object>, <postgresql type name>, <input text function>, <output textfunction>, <input binary function - optional>, <input binary function - optional>) but I have yet to decide how to detect the param format to use. Maybe there will be only one output function, that will decide the best format to be used. Regards Manlio -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlEdIYQACgkQscQJ24LbaUSfBACfWR7eD6pdyipC3/fljUiuelx6 GV4An3agVt4tx0E/JDUvO0iM8/BiZT1o =xckI -----END PGP SIGNATURE-----
Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)
From
Peter Eisentraut
Date:
On 2/14/13 9:23 AM, Manlio Perillo wrote: > 1) always use PQsendQueryParams functions. > > This will avoid having to escape parameters, as it is done in > psycopg2 > (IMHO it still use simple query protocol for compatibility purpose) I think the reason this doesn't work is that in order to prepare a query you need to know the parameter types, but you don't know that in Python, or at least with the way the DB-API works. For example, if you write cur.execute("SELECT * FROM tbl WHERE a = %s AND b = %s", (val1, val2)) what types will you pass to PQsendQueryParams? You can make some obvious heuristics, but this sort of thing can get complicated pretty quickly.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Il 14/02/2013 20:01, Peter Eisentraut ha scritto: > On 2/14/13 9:23 AM, Manlio Perillo wrote: >> 1) always use PQsendQueryParams functions. >> >> This will avoid having to escape parameters, as it is done in >> psycopg2 >> (IMHO it still use simple query protocol for compatibility purpose) > > I think the reason this doesn't work is that in order to prepare a query > you need to know the parameter types, but you don't know that in Python, > or at least with the way the DB-API works. Hint: .setinputsizes. In my implementation, prepared queries are **only** used if user calls setinputsizes; if setinputsizes is not called, preparing a query can cause performance loss, so it is better to not prepare one. > For example, if you write > > cur.execute("SELECT * FROM tbl WHERE a = %s AND b = %s", (val1, val2)) > > what types will you pass to PQsendQueryParams? > 1) if setinputsizes is used, use the type specified here. 2) if setinputsizes is not used, query the driver's type catalog, to get the postgresql type oid associated with a Pythonobject; e.g.: pg_type = connection.get_type_by_object(val1) buf, format = pg_type.output_function(val1) param_oid = pg_type.oid > You can make some obvious heuristics, but this sort of thing can get > complicated pretty quickly. A non trivial case if when val is a list, that should be mapped to a PostgreSQL array. However, you can always set the oid to 0, and let PostgreSQL deduce the type, as it is done in psycopg2. If user called setinputsizes, we are happy. Regards Manlio -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlEdO54ACgkQscQJ24LbaURFWACgkG3dkDYUA3tHooiXpGDf8hm0 Fz0AnjhFDwJU/XSqtgPyHwuHw9+GiRlv =m68+ -----END PGP SIGNATURE-----
On 14/02/2013 20:01, Peter Eisentraut wrote: > On 2/14/13 9:23 AM, Manlio Perillo wrote: >> 1) always use PQsendQueryParams functions. >> >> This will avoid having to escape parameters, as it is done in >> psycopg2 >> (IMHO it still use simple query protocol for compatibility purpose) > > I think the reason this doesn't work is that in order to prepare a query > you need to know the parameter types, but you don't know that in Python, > or at least with the way the DB-API works. For example, if you write > > cur.execute("SELECT * FROM tbl WHERE a = %s AND b = %s", (val1, val2)) > > what types will you pass to PQsendQueryParams? Pardon me if this is obvious, but why would you need to pass any types at all? Assuming we're still talking about PQsendQueryParams and not an explicit prepare/execute cycle.. Regards, Marko Tiikkaja
>For my Python DBAPI2 PostgreSQL driver I plan the following optimizations: I suggest you have a look at my Python ocpgdb driver: http://code.google.com/p/ocpgdb/ It uses the v3 binary protocol exclusively (to avoid the usual escaping security issues). A number of gotchyas were discovered along the way - in particular, you must be a lot more careful about types (as you note in a later reply). There were also some issues with the v3 protocol, most of which have been fixed now. ocpgdb does not implement everything, just the bits I needed. That said, other people/projects are using it in production, and it's proven to be fast and stable. -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Il 15/02/2013 02:45, Andrew McNamara ha scritto: >> For my Python DBAPI2 PostgreSQL driver I plan the following optimizations: > > I suggest you have a look at my Python ocpgdb driver: > > http://code.google.com/p/ocpgdb/ > Thanks, I did not know it. > It uses the v3 binary protocol exclusively (to avoid the usual escaping > security issues). A number of gotchyas were discovered along the way - > in particular, you must be a lot more careful about types (as you note > in a later reply). Note that this query: curs.execute('SELECT * FROM foo WHERE bah > %s', '2006-1-1') is IMHO incorrect, as per DBAPI 2.0: http://www.python.org/dev/peps/pep-0249/#type-objects-and-constructors > There were also some issues with the v3 protocol, > most of which have been fixed now. > I hope the issues are discussed in the commit log messages; I'm creating a git local mirror of the svn repository. > ocpgdb does not implement everything, just the bits I needed. That said, > other people/projects are using it in production, and it's proven to be > fast and stable. > Regards Manlio -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlEemTYACgkQscQJ24LbaUTujgCfZhrNTsqy/PvRJ4qwLVqy8QVT lNwAnjsJooEv/vss32RNMKEISOdZ16F1 =nKO5 -----END PGP SIGNATURE-----
Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)
From
Peter Eisentraut
Date:
On 2/14/13 2:42 PM, Marko Tiikkaja wrote: >> I think the reason this doesn't work is that in order to prepare a query >> you need to know the parameter types, but you don't know that in Python, >> or at least with the way the DB-API works. For example, if you write >> >> cur.execute("SELECT * FROM tbl WHERE a = %s AND b = %s", (val1, val2)) >> >> what types will you pass to PQsendQueryParams? > > Pardon me if this is obvious, but why would you need to pass any types > at all? Assuming we're still talking about PQsendQueryParams and not an > explicit prepare/execute cycle.. Well, PQsendQueryParams() requires types to be passed, doesn't it?
Re: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)
From
Daniele Varrazzo
Date:
On Fri, Feb 15, 2013 at 9:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On 2/14/13 2:42 PM, Marko Tiikkaja wrote: >>> I think the reason this doesn't work is that in order to prepare a query >>> you need to know the parameter types, but you don't know that in Python, >>> or at least with the way the DB-API works. For example, if you write >>> >>> cur.execute("SELECT * FROM tbl WHERE a = %s AND b = %s", (val1, val2)) >>> >>> what types will you pass to PQsendQueryParams? >> >> Pardon me if this is obvious, but why would you need to pass any types >> at all? Assuming we're still talking about PQsendQueryParams and not an >> explicit prepare/execute cycle.. > > Well, PQsendQueryParams() requires types to be passed, doesn't it? No, not necessarily: they are inferred by the context if they are not specified. I've had in mind for a long time to use the *Params() functions in psycopg (although it would be largely not backwards compatible, hence to be done on user request and not by default). Psycopg has all the degrees of freedom in keeping the two implementations alive (the non-*params for backward compatibility, the *params for future usage). I'd drafted a plan on the psycopg ML some times ago. But I don't have a timeline for that: it's a major work and without pressing motivations to do it. -- Daniele
On Thursday 14 February 2013, Manlio Perillo wrote: > Il 14/02/2013 14:06, Albe Laurenz ha scritto: > > Manlio Perillo wrote: > >> Sorry for the question, but where can I find the libpq test suite? > >> I can not find it in the PostgreSQL sources; it seems that there are > >> only some examples, in src/test/examples. > > > For my Python DBAPI2 PostgreSQL driver I plan the following optimizations: > > 1) always use PQsendQueryParams functions. > > This will avoid having to escape parameters, as it is done in > psycopg2 > (IMHO it still use simple query protocol for compatibility purpose) > > 2) when the driver detects a Python string is being sent to the > database, use binary format. > > As a special case, this will avoid having to use PQescapeByteaConn > when sending binary string (e.g. byte strings in Python 3.x) > .... Perhaps you could also see some attempt I'd made to support binary protocol inside psycopg2, some time ago: https://github.com/xrg/psycopg/tree/execparams2 -- Say NO to spam and viruses. Stop using Microsoft Windows!