Re: Confusion over Python drivers - Mailing list pgsql-hackers
From | Jeff Davis |
---|---|
Subject | Re: Confusion over Python drivers |
Date | |
Msg-id | 1265661520.29919.2261.camel@jdavis Whole thread Raw |
In response to | Re: Confusion over Python drivers (Florian Weimer <fw@deneb.enyo.de>) |
Responses |
Re: Confusion over Python drivers
|
List | pgsql-hackers |
On Mon, 2010-02-08 at 20:29 +0100, Florian Weimer wrote: > I'm contemplating to create a new language binding for libpq (or, to > be more precise, turn an existing language binding into something that > can be published). I've been agonizing a bit over how to create a > bridge between the host language type system and the PostgreSQL type > system. If I understand you correctly, you suggest to leave > everything as strings. This solution has the appeal of being > implemented easily. It also sidesteps a lot of issues revolving > around different representation choices for numbers. Agreed. Ultimately, the conversion has to be done somewhere, but I don't believe the driver is the place for it. Type conversions are always going to be imperfect, and this has some important consequences:* The type conversion system will be endlessly tweaked toimprove it* Developers will always run into problems with it in any complex application, so we need to allow them to circumvent the system and do it themselves when necessary. Both of these things point to another layer on top of the driver itself. It could be some extra convenience functions that come with the driver, or an entirely separate layer (like ActiveRecord). But if we always let the developer have access to the full power of libpq, it limits the damage that can be done by a slightly-too-creative API on top of it. > Do you really suggest to preserve the PQexecParams API verbatim, that > is, passing in three arrays containing type, value, and format? That > seems to be a bit problematic. I suspect the common case will be to > use unknown types, text format, and the default conversion from values > to strings. I tried to address this specifically in the document: "For example: it should be easy to pass parameters so that PQexecParams (and others) can be used, avoiding SQL injection risks. The important thing is to maintain close to a one-to-one mapping between libpq and the driver's API, and to provide all of the functionality of libpq." In ruby-pg, you can just do: conn.exec("INSERT INTO foo VALUES($1)", ["Jeff"]) And I think that's appropriate. What I'm saying is that there should still exist some way to pass explicit types or formats (although that should still be easier than it is in C ;). Here's the long form: conn.exec("INSERT INTO foo VALUES($1)", [{:value => "Jeff", :format => 0, :type => 0}]) The nice thing about that format is that you can do the "easy" thing for most of the parameters in a query, but then choose binary format for that one BYTEA parameter. That's because, in ruby, you can mix strings and hashes in the same array. So I'm not saying we should make everyone code ruby that looks like C. I'm saying that the job of the driver is to provide full access to libpq, and anything beyond that should be an optional convenience routine, and should be free of magic and cleverness (that's the job of a higher layer). > Conversely, for result sets, I'm tempted to transparently decode > escaped BYTEA columns. Consider the following ruby-pg program, where you have two empty tables foo and bar, each with a single BYTEA column "b": conn = PGconn.connect(...) conn.exec("INSERT INTO foo VALUES($1)", ["\\\\000"]) # copy the single value in foo into bar val = conn.exec("SELECT b FROM foo LIMIT 1")[0]["b"] conn.exec("INSERT INTO barVALUES($1)", [val]) That copies value so that foo and bar have the same contents: a 4 byte value "\000". What would happen though, if val was transparently decoded? It would decode it once in ruby, and again inside of postgres (in byteain), leaving you with a one byte value in bar, even though foo has a four-byte value. I really think that only higher layers should implement that kind of magic, no matter how "obvious" it may seem that the user wants something extra. > > Note that the ruby-pg driver doesn't 100% adhere to those standards > > (encoding is the primary problem, and that will be fixed). > > Lack of Unicode support means that I can punt that to application > authors, I guess. Ruby 1.9+ and Python 3.0+ both have string encoding models that can't just be ignored. We could punt it by always returning byte sequences rather than strings, but I think that's a particularly extreme version of my philosophy of not trying to convert between types. > By the way, the downside of using strings everywhere is that your > binding API will most likely not work with SQLite (or any other > SQL-like database which lacks column type information). I am trying to develop standards suitable for PostgreSQL drivers based on libpq. These are not meant to be standards for a database-agnostic API, standards for a high-level database adapter, or even standards for a driver written against something other than libpq (like the JDBC driver). Thank you for your comments. I will try to integrate these thoughts into the document. Regards,Jeff Davis
pgsql-hackers by date: