Thread: Practical impediment to supporting multiple SSL libraries
Just quickly going through what might be needed to support multiple SSL
libraries revealed one big problem in libpq-fe.h.

#ifdef USE_SSL
/* Get the SSL structure associated with a connection */
extern SSL *PQgetssl(PGconn *conn);
#else
extern void *PQgetssl(PGconn *conn);
#endif

The return type of the function changes depending on whether SSL is
compiled in or not. :( So, libpq exposes to its users the underlying
SSL library, which seems wrong. Now, options include:

1. Changing it to always return (void*), irrespective of SSL
2. Creating a PGsslcontext type that varies depending on what library
   you use (or not).
3. Removing the function entirely because the only user appears to be
   psql (in tree anyway).
4. Only declare the function if the user has #included openssl
   themselves.

Or alternatively we could do nothing because:

5. It's not a problem
6. It's a backward incompatible change

Personally, I'm in favour of 1, because then we can get rid of the
#include for openssl, so users don't have to have openssl headers
installed to compile postgresql programs. Options 2, 3 and 4 have
varying levels of evilness attached. However, I can see how 5 or 6
might be attractive.

Thoughts?
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of
> Martijn van Oosterhout
> Sent: 12 April 2006 16:48
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> Just quickly going through what might be needed to support
> multiple SSL libraries revealed one big problem in libpq-fe.h.
>
> #ifdef USE_SSL
> /* Get the SSL structure associated with a connection */
> extern SSL *PQgetssl(PGconn *conn);
> #else
> extern void *PQgetssl(PGconn *conn);
> #endif
>
> The return type of the function changes depending on whether
> SSL is compiled in or not. :( So, libpq exposes to its users
> the underlying SSL library, which seems wrong. Now, options include:
>
> 1. Changing it to always return (void*), irrespective of SSL
> 2. Creating a PGsslcontext type that varies depending on what
>    library you use (or not).
> 3. Removing the function entirely because the only user
>    appears to be psql (in tree anyway).
> 4. Only declare the function if the user has #included
>    openssl themselves.
>
> Or alternatively we could do nothing because:
>
> 5. It's not a problem
> 6. It's a backward incompatible change

The next version of psqlODBC (that has just gone into CVS tip after
months of work and debate) uses it, and would break almost completely
should it be removed, therefore any backwards-incompatible change
should be avoided imho. And 2 or 4 could cause chaos for Windows users
if different DLL builds get mixed up.

Regards, Dave.
On Wed, Apr 12, 2006 at 05:03:32PM +0100, Dave Page wrote:

<about the declaration of PQgetssl>
> The next version of psqlODBC (that has just gone into CVS tip after
> months of work and debate) uses it, and would break almost completely
> should it be removed, therefore any backwards incompatible change
> should be avoided imho. And 2 or 4 could cause chaos for Windows users
> if different DLL builds get mixed up.

Hmm, may I ask what it uses it for? Just to get information, or
something more substantial?

Thanks in advance,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: 12 April 2006 17:15
> To: Dave Page
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> On Wed, Apr 12, 2006 at 05:03:32PM +0100, Dave Page wrote:
>
> <about the declaration of PQgetssl>
> > The next version of psqlODBC (that has just gone into CVS tip after
> > months of work and debate) uses it, and would break almost completely
> > should it be removed, therefore any backwards incompatible change
> > should be avoided imho. And 2 or 4 could cause chaos for Windows
> > users if different DLL builds get mixed up.
>
> Hmm, may I ask what it uses it for? Just to get information,
> or something more substantial?

The driver implements all versions of the wire protocol itself, but if
libpq is available at runtime (it will dynamically load it on platforms
that support it) it can use it for connection setup so features like
SSL can be provided easily. I'm still not overly familiar with how it
works yet, but I'm sure Hiroshi (CC'd) can provide further details if
you need them.

Regards, Dave.
Martijn van Oosterhout <kleptog@svana.org> writes:
> 1. Changing it to always return (void*), irrespective of SSL
> ...
> Personally, I'm in favour of 1, because then we can get rid of the
> #include for openssl, so users don't have to have openssl headers
> installed to compile postgresql programs.

I like that too. I've never been very happy about having libpq-fe.h
depending on USE_SSL.

There is a more serious issue here though: if we allow more than one
SSL library, what exactly can an application safely do with the
returned pointer? It strikes me as very dangerous for the app to assume
it knows which SSL library is underneath libpq. It's not at all hard to
imagine an app getting an OpenSSL struct pointer and trying to pass it
to GnuTLS or vice versa. To the extent that there are apps out there
that depend on doing something with this function, I think that even
contemplating supporting multiple SSL libraries is a threat.

			regards, tom lane
Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
>
>> 1. Changing it to always return (void*), irrespective of SSL
>> ...
>> Personally, I'm in favour of 1, because then we can get rid of the
>> #include for openssl, so users don't have to have openssl headers
>> installed to compile postgresql programs.
>
> I like that too. I've never been very happy about having libpq-fe.h
> depending on USE_SSL.
>
> There is a more serious issue here though: if we allow more than one
> SSL library, what exactly can an application safely do with the
> returned pointer? It strikes me as very dangerous for the app to
> assume it knows which SSL library is underneath libpq. It's not at all
> hard to imagine an app getting an OpenSSL struct pointer and trying to
> pass it to GnuTLS or vice versa. To the extent that there are apps out
> there that depend on doing something with this function, I think that
> even contemplating supporting multiple SSL libraries is a threat.

I wonder if there are apps that actually use the ssl pointer, beyond
detection of encrypted connections. So interpreting the result as bool
would be sufficient.

Regards,
Andreas
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > 1. Changing it to always return (void*), irrespective of SSL
> > ...
> > Personally, I'm in favour of 1, because then we can get rid of the
> > #include for openssl, so users don't have to have openssl headers
> > installed to compile postgresql programs.
>
> I like that too. I've never been very happy about having libpq-fe.h
> depending on USE_SSL.

I'm all in favor of dropping the dependency on OpenSSL headers from
libpq, just to throw my 2 cents in there.

> There is a more serious issue here though: if we allow more than one
> SSL library, what exactly can an application safely do with the
> returned pointer? It strikes me as very dangerous for the app to
> assume it knows which SSL library is underneath libpq. It's not at all
> hard to imagine an app getting an OpenSSL struct pointer and trying to
> pass it to GnuTLS or vice versa. To the extent that there are apps out
> there that depend on doing something with this function, I think that
> even contemplating supporting multiple SSL libraries is a threat.

I'm afraid the way to do this would probably be to have it return a
Postgres-defined structure (without depending on if it's compiled with
SSL or not) which then indicates if the connection is SSL-enabled or
not, and then probably other 'common' information (remote DN, remote
CA, ASN.1-formatted certificate perhaps, etc...).

	Thanks,

		Stephen
* Andreas Pflug (pgadmin@pse-consulting.de) wrote:
> I wonder if there are apps that actually use the ssl pointer, beyond
> detection of encrypted connections. So interpreting the result as bool
> would be sufficient.

I'm not sure if there are apps out there which use it for anything but
a bool, but there's certainly a potential for apps to want to do things
like get the DN of the remote server...

	Thanks,

		Stephen
On Wed, Apr 12, 2006 at 01:42:51PM -0400, Stephen Frost wrote:
> * Andreas Pflug (pgadmin@pse-consulting.de) wrote:
> > I wonder if there are apps that actually use the ssl pointer, beyond
> > detection of encrypted connections. So interpreting the result as
> > bool would be sufficient.
>
> I'm not sure if there are apps out there which use it for anything but
> a bool but there's certainly a potential for apps to want to do things
> like get the DN of the remote server...

Strangely enough, the SSL code in libpq stores the peer DN and CN,
except they don't appear to be available to the client...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
On Wed, Apr 12, 2006 at 12:32:01PM -0400, Tom Lane wrote:
> There is a more serious issue here though: if we allow more than one
> SSL library, what exactly can an application safely do with the
> returned pointer? It strikes me as very dangerous for the app to
> assume it knows which SSL library is underneath libpq. It's not at all
> hard to imagine an app getting an OpenSSL struct pointer and trying to
> pass it to GnuTLS or vice versa. To the extent that there are apps out
> there that depend on doing something with this function, I think that
> even contemplating supporting multiple SSL libraries is a threat.

The only real way to a solution is to work out why people want the
pointer. So far I've found two reasons:

- People want to hijack the connection after libpq has set it up to do
  their own processing.

- People want to examine the certificates more closely.

The first would be easily handled by providing a formal interface for
libpq to hijack the connection with, providing read/write and maybe a
few others. The latter is trickier. You're invariably going to run into
the problem where the app uses one lib and libpq the other.

Other than DN and CN, what else would people want?
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> > There is a more serious issue here though: if we allow more than one
> > SSL library, what exactly can an application safely do with the
> > returned pointer? It strikes me as very dangerous for the app to
> > assume it knows which SSL library is underneath libpq. It's not at
> > all hard to imagine an app getting an OpenSSL struct pointer and
> > trying to pass it to GnuTLS or vice versa. To the extent that there
> > are apps out there that depend on doing something with this
> > function, I think that even contemplating supporting multiple SSL
> > libraries is a threat.
>
> The only real way to a solution is to work out why people
> want the pointer. So far I've found two reasons:
>
> - People want to hijack the connection after libpq has set it
>   up to do their own processing.
>
> - People want to examine the certificates more closely.
>
> The first would be easily handled by providing a formal
> interface for libpq to hijack the connection with, providing
> read/write and maybe a few others. The latter is trickier.
> You're invariably going to run into the problem where the app
> uses one lib and libpq the other.
>
> Other than DN and CN, what else would people want?

Issuer (name and certificate), validity dates, basic constraints, key
usage, possibly fingerprint.

//Magnus
On Wed, Apr 12, 2006 at 08:14:58PM +0200, Magnus Hagander wrote:
> > Other than DN and CN, what else would people want?
>
> Issuer (name and certificate), validity dates, basic constraints, key
> usage, possibly fingerprint.

GnuTLS handles this with just one function:

gnutls_x509_crt_get_dn_by_oid( cert, oid, index, raw, &data, &length )

And a whole pile of #defines:

#define GNUTLS_OID_X520_COUNTRY_NAME "2.5.4.6"
#define GNUTLS_OID_X520_ORGANIZATION_NAME "2.5.4.10"
#define GNUTLS_OID_X520_ORGANIZATIONAL_UNIT_NAME "2.5.4.11"
etc...

Which is nice, because then end users can code in the attributes they
want and we don't have to deal with the endless variations. I don't
however know enough to know if this (with a function to get OIDs by
index) is sufficient to extract all the information from the
certificate. Presumably OpenSSL can do this too...
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
"Magnus Hagander" <mha@sollentuna.net> writes:
>> Other than DN and CN, what else would people want?

> Issuer (name and certificate), validity dates, basic constraints, key
> usage, possibly fingerprint.

I think that way madness lies --- do we really want to commit to
re-inventing an SSL API that will cover anything someone might want
to do with either underlying library?

Moreover, this does not fix the problem: an existing app that thinks it
can pass the returned pointer to an OpenSSL routine will still crash
the moment a GnuTLS version of libpq is put under it. Case in point:
psql, as currently coded.

An idea that just occurred to me is to define PQgetssl as "return SSL*
if we are using OpenSSL for this connection; else return NULL". Then
add a parallel routine (maybe PQgetgnussl?) defined as returning the
equivalent GnuTLS handle, only if we are using GnuTLS for this
connection. (Presumably, in any one build of libpq, one of the pair of
routines would be an always-returns-null stub.)

The advantage of this is that an app knows what it'll get, and an app
that's only familiar with one of the two SSL libraries will not be
given a pointer it can't use.

I'd still want to adopt Martijn's idea of declaring both of 'em as
returning void *, to avoid depending on other packages' include files.

			regards, tom lane
Martijn van Oosterhout wrote:
> On Wed, Apr 12, 2006 at 05:03:32PM +0100, Dave Page wrote:
>
> <about the declaration of PQgetssl>
>
>> The next version of psqlODBC (that has just gone into CVS tip after
>> months of work and debate) uses it, and would break almost completely
>> should it be removed, therefore any backwards incompatible change
>> should be avoided imho. And 2 or 4 could cause chaos for Windows
>> users if different DLL builds get mixed up.
>
> Hmm, may I ask what it uses it for? Just to get information, or
> something more substantial?

In case of SSL mode, the driver gets the communication path using
PQsocket() or PQgetssl() after calling PQconnectdb(). The driver
communicates with the server by itself using the path. In case of
non-SSL mode, the driver never calls the libpq API at all.

regards,
Hiroshi Inoue
On Wed, Apr 12, 2006 at 05:00:17PM -0400, Tom Lane wrote:
> > Issuer (name and certificate), validity dates, basic constraints,
> > key usage, possibly fingerprint.
>
> I think that way madness lies --- do we really want to commit to
> re-inventing an SSL API that will cover anything someone might want
> to do with either underlying library?

Indeed. There's also the issue that the underlying system may not be
using what you think it is. e.g. GnuTLS can authenticate on PGP keys
rather than x509 certificates.

There's still the mystery of libpq extracting the peer DN and CN but
never passing them on to the user.

> An idea that just occurred to me is to define PQgetssl as "return SSL*
> if we are using OpenSSL for this connection; else return NULL". Then
> add a parallel routine (maybe PQgetgnussl?) defined as returning the
> equivalent GnuTLS handle, only if we are using GnuTLS for this
> connection. (Presumably, in any one build of libpq, one of the pair of
> routines would be an always-returns-null stub.)

Alternatively, create a new function PQgetsslinfo() that returns both
the library name and a (void) pointer. In any case the old interface
can never return anything other than a pointer for OpenSSL.

> I'd still want to adopt Martijn's idea of declaring both of 'em as
> returning void *, to avoid depending on other packages' include files.

Ack, at least we can get that out of the way. It doesn't change
anything from the user's point of view, other than they know for sure
what the signature is.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
On Wed, Apr 12, 2006 at 05:25:47PM +0100, Dave Page wrote:
> The driver implements all versions of the wire protocol itself, but if
> libpq is available at runtime (it will dynamically load it on
> platforms that support it) it can use it for connection setup so
> features like SSL can be provided easily. I'm still not overly
> familiar with how it works yet, but I'm sure Hiroshi (CC'd) can
> provide further details if you need them.

Right, so what you're basically doing is setting up the connection via
libpq, then grabbing the SSL pointer and using that to continue
communicating. If it's not SSL, you use PQsocket to get the socket and
continue from there.

Unorthodox usage, but it should work.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: 13 April 2006 07:58
> To: Dave Page
> Cc: pgsql-hackers@postgresql.org; Hiroshi Inoue
> Subject: Re: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> On Wed, Apr 12, 2006 at 05:25:47PM +0100, Dave Page wrote:
> > The driver implements all versions of the wire protocol itself, but
> > if libpq is available at runtime (it will dynamically load it on
> > platforms that support it) it can use it for connection setup so
> > features like SSL can be provided easily. I'm still not overly
> > familiar with how it works yet, but I'm sure Hiroshi (CC'd) can
> > provide further details if you need them.
>
> Right, so what you're basically doing is setting up the
> connection via libpq then grabbing the SSL pointer and using
> that to continue communicating. If it's not SSL you use
> PQsocket to get the socket and continue from there.

Yup.

> Unorthodox usage, but it should work.

Well, we had a pure custom implementation of the protocol, had a pure
libpq-based version, and after much discussion decided that the best
version of all was the hybrid, as it allowed us to hijack features like
SSL, Kerberos, pgpass et al, yet not be constrained by the limitations
of libpq, or copy query results about so much.

Regards, Dave
On Thu, Apr 13, 2006 at 08:48:54AM +0100, Dave Page wrote:
> Well, we had a pure custom implementation of the protocol, had a pure
> libpq based version and after much discussion decided that the best
> version of all was the hybrid as it allowed us to hijack features like
> SSL, Kerberos, pgpass et al, yet not be constrained by the limitations
> of libpq, or copy query results about so much.

Right. Would you see value in a more formal libpq "hijack-me" interface
that would support making the initial connection and then handing off
the rest to something else?

I'm wondering because obviously with the current setup, if libpq is
compiled with SSL support, psqlODBC must also be. Are there any points
where you have to fight libpq over control of the socket?

I'm thinking that such an interface would need to provide the
following:

read (sync/async)
write (sync/async)
getfd (for select/poll)
ispending (is there stuff to do)
release (for when you're finished)

Is there anything else you might need?
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: 13 April 2006 09:15
> To: Dave Page
> Cc: pgsql-hackers@postgresql.org; Hiroshi Inoue
> Subject: Re: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> On Thu, Apr 13, 2006 at 08:48:54AM +0100, Dave Page wrote:
> > Well, we had a pure custom implementation of the protocol, had a
> > pure libpq based version and after much discussion decided that the
> > best version of all was the hybrid as it allowed us to hijack
> > features like SSL, Kerberos, pgpass et al, yet not be constrained by
> > the limitations of libpq, or copy query results about so much.
>
> Right. Would you see value in a more formal libpq "hijack-me"
> interface that would support making the initial connection
> and then handing off the rest to something else?
>
> I'm wondering because obviously with the current setup, if
> libpq is compiled with SSL support, psqlODBC must also be.
> Are there any points where you have to fight libpq over
> control of the socket?
>
> I'm thinking that such an interface would need to provide the
> following:
>
> read (sync/async)
> write (sync/async)
> getfd (for select/poll)
> ispending (is there stuff to do)
> release (for when you're finished)
>
> Is there anything else you might need?

I'll have to let Hiroshi comment on that as he wrote the code. I've
only skimmed over it a few times so far.

Regards, Dave.
* Martijn van Oosterhout (kleptog@svana.org) wrote:
> On Thu, Apr 13, 2006 at 08:48:54AM +0100, Dave Page wrote:
> > Well, we had a pure custom implementation of the protocol, had a
> > pure libpq based version and after much discussion decided that the
> > best version of all was the hybrid as it allowed us to hijack
> > features like SSL, Kerberos, pgpass et al, yet not be constrained by
> > the limitations of libpq, or copy query results about so much.
>
> Right. Would you see value in a more formal libpq "hijack-me"
> interface that would support making the initial connection and then
> handing off the rest to something else?
>
> I'm wondering because obviously with the current setup, if libpq is
> compiled with SSL support, psqlODBC must also be. Are there any points
> where you have to fight libpq over control of the socket?
[...]
> Is there anything else you might need?

Instead of having it hijack the libpq connection and implement the
wireline protocol itself, why don't we work on fixing the problems
(such as the double-copying that libpq requires) in libpq, to allow the
driver (and others!) to use it in the 'orthodox' way?

I would have spoken up on the ODBC list if I understood that 'hybrid'
really meant 'just using libpq for connection/authentication'. I really
think it's a bad idea to have the ODBC driver reimplement the wireline
protocol, because that protocol does change from time to time, and
someone using libpq will hopefully have fewer changes (and thus
easier-to-maintain code) than someone implementing the wireline
protocol themselves (just causing more busy-work that, as we saw in the
past with the ODBC driver, may end up taking *forever* for someone to
be able to commit the extra required time to implement).

	Thanks,

		Stephen
On Thu, Apr 13, 2006 at 06:44:12AM -0400, Stephen Frost wrote:
> Instead of having it hijack the libpq connection and implement the
> wireline protocol itself, why don't we work on fixing the problems
> (such as the double-copying that libpq requires) in libpq to allow the
> driver (and others!) to use it in the 'orthodox' way?

Ok. I'm not sure what this "double copying" you're referring to is, but
I'd certainly like to know why people are reimplementing the protocol
(psqlODBC is hardly the only one).

Is it that people want to use completely different interaction models?
Like working around the wait-for-whole-resultset-before-returning
issue? Or maybe better notice handling? What is it that's so deficient?
Or maybe it's portability? Like the DBI PgPP module?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> -----Original Message-----
> From: Stephen Frost [mailto:sfrost@snowman.net]
> Sent: 13 April 2006 11:44
> To: Martijn van Oosterhout
> Cc: Dave Page; pgsql-hackers@postgresql.org; Hiroshi Inoue
> Subject: Re: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> Instead of having it hijack the libpq connection and
> implement the wireline protocol itself, why don't we work on
> fixing the problems (such as the double-copying that libpq
> requires) in libpq to allow the driver (and others!) to use
> it in the 'orthodox' way?
>
> I would have spoken up on the ODBC list if I understood that 'hybrid'
> really meant 'just using libpq for connection/authentication'.
> I really think it's a bad idea to have the ODBC driver reimplement
> the wireline protocol because that protocol does change from time
> to time and someone using libpq will hopefully have fewer changes
> (and thus makes the code easier to maintain) than someone
> implementing the wireline protocol themselves (just causing
> more busy-work that, at least we saw in the past with the
> ODBC driver, may end up taking *forever* for someone to be
> able to commit the extra required time to implement).

This has been the subject of discussion for many months, and the
consensus was that the most effective approach was the hybrid one,
which has now been moved into CVS tip. Those involved are fully aware
of the maintenance issues of implementing the wire protocol in the
driver, as well as the difficulties using libpq entirely caused (that
is how the 08.01.xxxx driver works). Changing direction again simply
isn't going to happen.

Regards, Dave
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: 13 April 2006 11:54
> To: Dave Page; pgsql-hackers@postgresql.org; Hiroshi Inoue
> Subject: Re: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> On Thu, Apr 13, 2006 at 06:44:12AM -0400, Stephen Frost wrote:
> > Instead of having it hijack the libpq connection and implement the
> > wireline protocol itself, why don't we work on fixing the problems
> > (such as the double-copying that libpq requires) in libpq to allow
> > the driver (and others!) to use it in the 'orthodox' way?
>
> Ok. I'm not sure what this "double copying" you're referring
> to is, but I'd certainly like to know why people are
> reimplementing the protocol (psqlODBC is hardly the only one).

The libpq driver copies results out of the PGresult struct into the
internal QueryResult classes. With libpq out of the loop, data can go
straight from the wire into the QR.

There are elements of the wire protocol that libpq doesn't actually
implement, from what I recall. IIRC, they were added specifically for
JDBC, but also intended to be used by psqlODBC as well. I forget the
details though, as I wasn't so involved with the ODBC development back
then.

In addition of course, implementing the protocol natively does allow
for maximum flexibility.

Regards, Dave.
On Thu, Apr 13, 2006 at 12:12:25PM +0100, Dave Page wrote:
> > Ok. I'm not sure what this "double copying" you're referring
> > to is,
>
> The libpq driver copies results out of the PGresult struct into the
> internal QueryResult classes. With libpq out of the loop, data can go
> straight from the wire into the QR.

Hmm, the simplest improvement I can think of is one where you register
a callback that libpq calls whenever it has received a new tuple.

However, w.r.t. the copying, the pointers in the PGresult are in memory
belonging to that result. As long as that PGresult hangs around, you
should be able to just copy the pointers rather than the data? Or is
this unacceptable? The only alternative I can think of is to let users
provide a callback that is given the number of bytes and returns memory
to store the data into. But that just seems unnecessarily complex,
considering you could just copy the pointers.

> There are elements of the wire protocol that libpq doesn't actually
> implement from what I recall. IIRC, they were added specifically for
> JDBC but also intended to be used by psqlODBC as well. I forget the
> details though as I wasn't so involved with the ODBC development back
> then.

Ugh, that's terrible. How do these features get tested if nothing
within the main tree implements them?

> In addition of course, implementing the protocol natively does allow
> for maximum flexibility.

Maybe, but it should be possible to have a lot of flexibility without
having many projects jump through all sorts of hoops every time a new
protocol version is created.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: 13 April 2006 12:34
> To: Dave Page
> Cc: pgsql-hackers@postgresql.org; Hiroshi Inoue
> Subject: Re: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> However, w.r.t. the copying, the pointers in the PGresult are
> in memory belonging to that result. As long as that PGresult
> hangs around, you should be able to just copy the pointers
> rather than the data? Or is this unacceptable?

It copies the data. I can't think offhand why it was implemented that
way, but then I didn't write the code (Anoop & Siva @ Pervasive did).
Anyhoo, as I've said, that approach has now been abandoned anyway in
favour of Hiroshi's, so it's him you'd need to convince to change. The
rest of us have only just started re-learning the code.

Regards, Dave
* Dave Page (dpage@vale-housing.co.uk) wrote:
> This has been the subject of discussion for many months and the
> consensus was that the most effective approach was the hybrid one
> which has now been moved into CVS tip. Those involved are fully aware
> of the maintenance issues of implementing the wire protocol in the
> driver, as well as the difficulties using libpq entirely caused (that
> is how the 08.01.xxxx driver works). Changing direction again simply
> isn't going to happen.

There was barely any discussion at all about this... I do follow the
lists involved, even though I didn't respond to the question regarding
this (either time it was asked), because I didn't understand that
'hybrid' meant 'only using libpq for the connection'. I'm curious how
many others of those being asked understood this... I think the fact
that you had to ask twice to get any response at all is a good
indication.

Does the latest version in CVS support V3 of the wireline protocol? If
I recall correctly, the version it was based on still only supported
V2...

What does the wireline protocol implementation in the ODBC driver do
that it can't get through libpq? I can certainly understand the
double-copying issue (I complained about that myself when first
starting to use libpq), but I think that could be fixed without that
much difficulty. Were there other things?

	Thanks,

		Stephen
> -----Original Message----- > From: Stephen Frost [mailto:sfrost@snowman.net] > Sent: 13 April 2006 12:56 > To: Dave Page > Cc: Martijn van Oosterhout; pgsql-hackers@postgresql.org; > Hiroshi Inoue > Subject: Re: [HACKERS] Practical impediment to supporting > multiple SSL libraries > > There was barely any discussion at all about this... I do > follow the lists involved even though I didn't respond to the > question regarding this (either time it was asked) because I > didn't understand that 'hybrid' meant 'only using libpq for > the connection'. I'm curious how many others of those being > asked understood this... I think the fact that you had to > ask twice to get any response at all is a good indication. There was extensive off-list discussion between all the active developers before we explained the situation on list, created the test builds, announced the fact that the code was in CVS and asked for feedback from users. Most of the initial discussion occurred off-list because there were issues of commercial support to consider that at the time should not have been done in public (in a nutshell, we didn't want to piss Pervasive off). > Does the latest verion in CVS support V3 of the wireline > protocol? If I recall correctly, the version it was based on > still only supported V2... Yes, it supports v3. > What does the wireline protocol implementation in the ODBC > driver do that it can't get through libpq? I can certainly > understand the double-copying issue (I complained about that > myself when first starting to use libpq) but I think that > could be fixed without that much difficulty. Were there other things? I don't know if we are currently using any features that libpq cannot offer. I do know that although the older driver basically worked with libpq, major features (such as updateable cursors) were broken beyond feasible repair. 
They would have had to have been almost entirely redesigned, and given that we have enough trouble finding developers with enough time and the ability to fix even relatively simple bugs in the driver it seemed more sensible to go with the solution that worked properly, yet still offered the features (v3, SSL, Kerberos) that we wanted from libpq. The only downside is that we might have to update for any future protocols again, but even that is not essential given that the server will fall back to v2 and presumably v3 when v4 is written. Regards, Dave
* Martijn van Oosterhout (kleptog@svana.org) wrote:
> On Thu, Apr 13, 2006 at 12:12:25PM +0100, Dave Page wrote:
> > > Ok. I'm not sure what this "double copying" you're referring
> > > to is,
> >
> > The libpq driver copies results out of the PGresult struct into the
> > internal QueryResult classes. With libpq out of the loop, data can go
> > straight from the wire into the QR.
>
> Hmm, the simplest improvement I can think of is one where you
> register a callback that libpq calls whenever it has received a new
> tuple.

You wouldn't want it on every tuple as that'd get expensive through function calls.

> However, w.r.t. the copying, the pointers you get in a PGresult are in memory
> belonging to that result. As long as that PGresult hangs around, you
> should be able to just copy the pointers rather than the data? Or is
> this unacceptable?

It's actually pretty common (or seems to be, anyway) to want to store the data from the query result into your own data structure. Yes, you could just use pointers all over the place, but that means you're going to have to use things which understand PGresult everywhere, as opposed to having a generic 'storage manager' with other generic things (index creator, aggregator, etc) which can be used with more than just PGresults.

> The only alternative I can think of is let users provide a callback
> that is given the number of bytes and it returns memory to store the
> data into. But that just seems unnecessarily complex, considering you
> could just copy the pointers.

You don't provide a callback, you have the user provide a memory region to libpq which libpq can then fill in.
It's really not that difficult, the API would really look quite a bit like PQexecParams, ie:

int PQgetTuples(PGresult *res,      // Returns number of tuples populated
      const int max_ntuples,        // Basically buffer size
      char *result_set,             // Destination buffer
      const int *columnOffsets,     // integer array of offsets
      const int *columnLengths,     // integer array of lengths, for checks
      const int record_len,         // Length of each structure
      int *columnNulls,             // 0/1 for is not null / is null
      int resultFormat);            // Or maybe just binary?

If we want to do conversion of the data in some way then we may need to expand this to include that ability (but I don't think PQgetvalue does, so...).

> > There are elements of the wire protocol that libpq doesn't actually
> > implement from what I recall. IIRC, they were added specifically for
> > JDBC but also intended to be used by psqlODBC as well. I forget the
> > details though as I wasn't so involved with the ODBC development back
> > then.
>
> Ugh, that's terrible. How do these features get tested if nothing
> within the main tree implements them.

I fully agree with this sentiment...

> > In addition of course, implementing the protocol natively does allow for
> > maximum flexibility.
>
> Maybe, but it should be possible to have a lot of flexibility without
> having many projects jump through all sorts of hoops every time a new
> protocol version is created.

Indeed.

Thanks, Stephen
On Thu, Apr 13, 2006 at 12:48:06PM +0100, Dave Page wrote:
> Anyhoo, as I've said, that approach has now been abandoned anyway in
> favour of Hiroshi's, so it's him you'd need to convince to change. The
> rest of us have only just started re-learning the code.

Well, I quickly scanned the code in CVS to see what I could find out. There are a few features the psqlodbc tuplereader has that libpq doesn't.

1. It reads tuples as you go through the data. The resultset has a cursor; it only processes the data as you request it.
2. It reads directly from the socket into a per-tuple malloc()ed field.
3. It extracts per-row tids directly into a separate array.
4. The resulting resultset can be updated and modified as well as appended to. This requires freeing and adding rows, and committing the result. This is probably your updatable cursors.

So in fact what you really want is libpq as a protocol decoder, but you want to manage your resultset yourself. And you want to be able to let users handle incoming data as it comes rather than waiting for the whole set.

I don't think the zero-copy is relevant; the code is not written in a way that suggests speed was an issue. Rather, I think the way you want to use the resultset is the issue. You can't use the memory in the PGresult because then you'd need to track which tuples were allocated by you and which were allocated by libpq. The resulting copying is needless, along with the fact that you double your memory usage.

In fact, I can think of a number of other projects that would like an alternative. For example, a Perl module would want to load the strings directly into blessed perl strings rather than keep a copy of the resultset around. I think this would be a worthwhile addition to the libpq interface. I'll see if I can come up with a proposal (whether it'll get implemented is another issue entirely).

Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
* Dave Page (dpage@vale-housing.co.uk) wrote:
> > What does the wireline protocol implementation in the ODBC
> > driver do that it can't get through libpq? I can certainly
> > understand the double-copying issue (I complained about that
> > myself when first starting to use libpq) but I think that
> > could be fixed without that much difficulty. Were there other things?
>
> I don't know if we are currently using any features that libpq cannot
> offer.
>
> I do know that although the older driver basically worked with libpq,
> major features (such as updateable cursors) were broken beyond feasible
> repair. They would have had to have been almost entirely redesigned, and
> given that we have enough trouble finding developers with enough time
> and the ability to fix even relatively simple bugs in the driver it
> seemed more sensible to go with the solution that worked properly, yet
> still offered the features (v3, SSL, Kerberos) that we wanted from

Updatable cursors aren't something supported in the core system yet, but they're clearly useful and appear to be part of the spec. It'd be really nice to implement this as part of core instead of having it reimplemented by multiple different people. I think it'd be time better spent to implement it in core than to redesign it in the ODBC driver to work with the current libpq. As I understand it, this would probably also be useful to the JDBC people.

> libpq. The only downside is that we might have to update for any future
> protocols again, but even that is not essential given that the server
> will fall back to v2 and presumably v3 when v4 is written.

Perhaps not *essential*, but certainly a good thing to do, as it can provide performance and functionality improvements...

Thanks, Stephen
* Martijn van Oosterhout (kleptog@svana.org) wrote: > Well, I quickly scanned the code in CVS to see what I could find out. Wow, that was quick. :) > So in fact what you really want is libpq as a protocol decoder but want > to manage your resultset yourself. And you want to be able to let users > handle incoming data as it comes rather than waiting for the whole set. The data-as-it-comes bit could be done w/ a Postgres cursor, couldn't it? But then you have to read through all the data using PQgetResult, which isn't much fun. > I don't think the zero-copy is relevent, the code is not written in a > way that suggests speed was an issue. Rather I think the way you want > to use the resultset is the issue. You can't use the memory in the > PGresult because then'd you need to track which tuples were allocated > by you and which we allocated by libpq. The resulting copying is > needless, along with the fact that you double your memory usage. The double memory usage definitely sucks but I really think speed would also be greatly improved by removing the double copying and all the function calls dealing with PQgetResult, etc... > In fact, can think that a number of other projects would like an > alternative. For example, a Perl module would want to load the strings > directly into blessed perl strings rather than keep a copy of the > resultset around. I think this would be a worthwhile addition to the > libpq interface. Me too. :) > I'll see if I can come up with a proposal (whether it'll get > implemented is another issue entirely). I'd be interested in trying to help with this too.. Thanks, Stephen
> -----Original Message----- > From: Stephen Frost [mailto:sfrost@snowman.net] > Sent: 13 April 2006 14:03 > To: Martijn van Oosterhout > Cc: Dave Page; pgsql-hackers@postgresql.org; Hiroshi Inoue > Subject: Re: [HACKERS] Practical impediment to supporting > multiple SSL libraries > > * Martijn van Oosterhout (kleptog@svana.org) wrote: > > Well, I quickly scanned the code in CVS to see what I could > find out. > > Wow, that was quick. :) Yes :-) > > I don't think the zero-copy is relevent, the code is not > written in a > > way that suggests speed was an issue. Rather I think the > way you want > > to use the resultset is the issue. You can't use the memory in the > > PGresult because then'd you need to track which tuples were > allocated > > by you and which we allocated by libpq. The resulting copying is > > needless, along with the fact that you double your memory usage. > > The double memory usage definitely sucks but I really think > speed would also be greatly improved by removing the double > copying and all the function calls dealing with PQgetResult, etc... Don't forget that the code now in CVS-tip is not the code that had the copy issue. As of last Saturday, the hybrid version was moved to tip. Regards, Dave.
On Thu, Apr 13, 2006 at 08:32:34AM -0400, Stephen Frost wrote:
> * Martijn van Oosterhout (kleptog@svana.org) wrote:
> > Hmm, the simplest improvement I can think of is one where you
> > register a callback that libpq calls whenever it has received a new
> > tuple.
>
> You wouldn't want it on every tuple as that'd get expensive through
> function calls.

Why not? Internally we call pqAddTuple for every tuple; calling a user function instead is hardly going to be more expensive. Also, I was thinking of the situation where the user function could set a flag so the eventual caller of (perhaps) PQconsumeInput knows that it's got enough for now.

> It's actually pretty common (or seems to be anyway) to want to store the
> data from the query result into your own data structure. Yes, you could
> just use pointers all over the place but that means you're going to have
> to use things which understand PQresult everywhere as opposed to having a
> generic 'storage manager' with other generic things (index creator,
> aggregator, etc) which can be used with more than just PQresults.

<snip>

> You don't provide a callback, you have the user provide a memory region
> to libpq which libpq can then fill in. It's really not that difficult,
> the API would really look quite a bit like PQexecParams, ie:

Except in the case of psqlODBC, it wants to be able to malloc()/free() each field, which your method doesn't solve. Also, it doesn't solve the duplicate memory use, nor the retrieving of rows before the resultset is complete.

> If we want to do conversion of the data in some way then we may need to
> expand this to include that ability (but I don't think PQgetvalue does,
> so...).

I think a callback is much easier. As a bonus, the user could specify that libpq doesn't need to remember the rows. Memory savings.

Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
* Martijn van Oosterhout (kleptog@svana.org) wrote:
> On Thu, Apr 13, 2006 at 08:32:34AM -0400, Stephen Frost wrote:
> > You wouldn't want it on every tuple as that'd get expensive through
> > function calls.
>
> Why not? Internally we call pqAddTuple for every tuple, calling a user
> function instead is hardly going to be more expensive. Also, I was
> thinking of the situation where the user function could set a flag
> so the eventual caller of (perhaps) PQconsumeInput knows that it's got
> enough for now.

Hrmpf, the fact that we have a different call we make for every tuple anyway isn't exactly encouraging to me.

> > You don't provide a callback, you have the user provide a memory region
> > to libpq which libpq can then fill in. It's really not that difficult,
> > the API would really look quite a bit like PQexecParams, ie:
>
> Except in the case of psqlODBC, it wants to be able to malloc/free()
> each field, which your method doesn't solve. Also, it doesn't solve the
> duplicate memory use, nor the retrieving of rows before the resultset
> is complete.

I don't entirely follow why you think it wouldn't solve the duplicate memory use (except perhaps in the psqlODBC case, if they decide to just grab a bunch of tuples into one area and then go through and malloc/free each one after that; not exactly what I'd suggest...). The basic idea was actually modeled on read(): you get back what's currently available, which might not be the full set you asked for so far.

I think perhaps you're assuming that my suggestion would just be an overlay on top of the existing libpq PQgetResult which would just turn around and call PQgetResult to fill in the memory region provided by the user. Entirely *not* the case... Perhaps I should have used 'PGconn' instead of 'PGresult' as the first argument and that would have been clearer.

Additionally, honestly, this is very similar to how Oracle's multi-row retrieval works...
It uses two functions (one for setup into its own structure and then one for actually getting rows) but the basic idea is the same. > > If we want to do conversion of the data in some way then we may need to > > expand this to include that ability (but I don't think PQgetvalue does, > > so...). > > I think a callback is much easier. As a bonus the user could specify > that libpq doesn't need to remember the rows. Memory savings. My solution didn't have libpq remembering the rows... Thanks, Stephen
Martijn van Oosterhout <kleptog@svana.org> writes: > Right. Would you see value in a more formal libpq "hijack-me" interface > that would support making the initial connection and then handing off > the rest to something else? I think this would just be busywork... the way ODBC is doing it seems fine to me. In any case, do we really want to encourage random apps to bypass the library? For one thing, with an API such as you suggest, it would really be libpq's problem to figure out what to do with regular vs passthrough calls. As it stands, it's very obviously not libpq's problem anymore once you hijack the socket. regards, tom lane
Martijn van Oosterhout <kleptog@svana.org> writes: > On Thu, Apr 13, 2006 at 12:12:25PM +0100, Dave Page wrote: > > > Ok. I'm not sure what this "double copying" you're referring > > > to is, > > > > The libpq driver copies results out of the PGresult struct into the > > internal QueryResult classes. With libpq out of the loop, data can go > > straight from the wire into the QR. > > Hmm, the simplest improvement I can think of is one where you > register a callback that libpq calls whenever it has received a new > tuple. That could be useful for applications but I think a driver really wants to retain control of the flow of control. To make use of a callback it would have to have an awkward dance of calling whatever function gives libpq license to call the callback, having the callback stuff the data in a temporary space, then checking for new data in the temporary space, and returning it to the user. -- greg
* Martijn van Oosterhout (kleptog@svana.org) wrote:
> Why not? Internally we call pqAddTuple for every tuple, calling a user
> function instead is hardly going to be more expensive. Also, I was
> thinking of the situation where the user function could set a flag
> so the eventual caller of (perhaps) PQconsumeInput knows that it's got
> enough for now.

I went ahead and looked through the libpq source a bit. What I was suggesting looks like it would primarily change getAnotherTuple to, instead of allocating the result memory itself, just store the result into the appropriate place in the user-provided memory space. Thus, getAnotherTuple wouldn't do any allocation and wouldn't call pqAddTuple at all. It would need to keep track of where it is in the user-provided memory area and, if it runs out of space, return back through the 'outofmemory' mechanism.

The new function would basically set up the appropriate structures in the PGconn and then call parseInput(), which would then handle any recently-arrived data and call getAnotherTuple, which would detect that it's dumping data into a user-provided area and would do so until it's finished being called by parseInput() or it runs out of user memory space.

This would be used with the async command processing. A drawback, of course, is that this degenerates to busy-waiting if the application has nothing better to do. Any clue as to whether PQsocket could safely be used in a select()-based system? I'm guessing it could, just never tried that myself. :) Also not sure how to know if there's data which needs to be sent and hasn't been yet for some reason.

Thanks! Stephen
On Thu, Apr 13, 2006 at 09:34:10AM -0400, Stephen Frost wrote:
> * Martijn van Oosterhout (kleptog@svana.org) wrote:
> > Except in the case of psqlODBC, it wants to be able to malloc/free()
> > each field, which your method doesn't solve. Also, it doesn't solve the
> > duplicate memory use, nor the retrieving of rows before the resultset
> > is complete.
>
> I don't entirely follow why you think it wouldn't solve the duplicate
> memory use (except perhaps in the psqlODBC case if they decide to just
> grab a bunch of tuples into one area and then go through and malloc/free
> each one after that, not exactly what I'd suggest...).

Right, I didn't understand that you meant to be doing this synchronously, as the data came in. I thought it was just another way of retrieving the data already received. But given that a stated reason psqlODBC didn't use the libpq interface was the copying of all the data, it would be nice if we had something for that. From looking at your declaration:

int PQgetTuples(PGresult *res,      // Returns number of tuples populated
      const int max_ntuples,        // Basically buffer size
      char *result_set,             // Destination buffer
      const int *columnOffsets,     // integer array of offsets
      const int *columnLengths,     // integer array of lengths, for checks
      const int record_len,         // Length of each structure
      int *columnNulls,             // 0/1 for is not null / is null
      int resultFormat);            // Or maybe just binary?

you seem to be suggesting that all the data be stored in one big memory block at result_set. What do you do if the data is longer than the given length? What does record_len mean (what structures)? Also, you can't specify binary/non-binary here; that's done in the query request. libpq doesn't handle the data differently depending on binaryness. Also, how can you find out the actual length of each value after the call?

Frankly, I'm not seeing much improvement over normal processing. It just seems like yet another data model that won't fit most users.
The definition of PQgetvalue is merely:

    return res->tuples[tup_num][field_num].value;

So we could achieve the same effect by letting people look into the PGresult before the query is finished. The function you suggest would be especially difficult for something like psqlODBC, which has no idea beforehand how long a value could be.

I'm still of the opinion that letting people supply an alternative to pqAddTuple would be cleaner. The interface would look like:

typedef struct pgresAttValue
{
    int   len;      /* length in bytes of the value */
    char *value;    /* actual value, plus terminating zero byte */
} PGresAttValue;

typedef int (*PQtuplecallback)( PGresult *res, PGresAttValue *fields );
int PQsettuplecallback( PGresult *res, PQtuplecallback cb );

fields is simply a pointer to an array of nfields such structures. Users can do whatever they want with the info: store it in their own structure, parse it, throw it away, send it over a network, etc. With this callback I could probably implement your function above fairly straightforwardly.

Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
* Greg Stark (gsstark@mit.edu) wrote: > Martijn van Oosterhout <kleptog@svana.org> writes: > > Hmm, the simplest improvement I can think of is one where you > > register a callback that libpq calls whenever it has received a new > > tuple. > > That could be useful for applications but I think a driver really wants to > retain control of the flow of control. To make use of a callback it would have > to have an awkward dance of calling whatever function gives libpq license to > call the callback, having the callback stuff the data in a temporary space, > then checking for new data in the temporary space, and returning it to the > user. I doubt the callback would be called at some inopportune time... Probably the callback would be passed into a libpq call which then directly calls the callback and is done with it when it returns. The libpq function would certainly need a parameter which is just passed to the callback to allow the system to maintain state (such as how many tuples the callback has processed so far) to avoid ugly global variables but otherwise I don't really see that this is changing the flow of control all that much... I can see how having a callback would be useful though I think for a good number of cases it's just going to be populating a memory region with it and we could cover that common case by providing an API for exactly that. The other issue with a callback is that libpq would have to either call the callback for each value (not my preference) or have some way to pass a whole variable-length tuple to the callback, which would require libpq to allocate memory for the tuple (hopefully only once and not per-tuple) and then build up whatever structure it's going to give to the callback in memory (copy once) and then call the callback which would be required to copy the tuple somewhere else (copy again). 
Of course, all of this is after an initial copy from read() into the read buffer, but I doubt that could be helped (and read()'ing small enough amounts to make it happen wouldn't really improve things). Thanks, Stephen
On Thu, Apr 13, 2006 at 11:14:57AM -0400, Greg Stark wrote:
> That could be useful for applications but I think a driver really wants to
> retain control of the flow of control. To make use of a callback it would have
> to have an awkward dance of calling whatever function gives libpq license to
> call the callback, having the callback stuff the data in a temporary space,
> then checking for new data in the temporary space, and returning it to the
> user.

We have an asynchronous interface. I was thinking like:

PQsendQuery( conn, query );
res = PQgetResult( conn );
gotenough = FALSE;
PQsetcallback( res, mycallback );
while( !gotenough )
    PQconsumeInput(conn);
/* When we reach here we have at least five rows in our data structure */

sub mycallback(res,data)
{
    /* stuff data in memory structure */
    if( row_count > 5 )
        gotenough = TRUE;
}

If you set non-blocking you can even go off and do other things while waiting. No need for temporary space... Does this seem too complex?

-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Martijn van Oosterhout <kleptog@svana.org> writes: > ...you seem to be suggesting that all the data be stored in one big memory > block at resultset. I didn't like that either; it assumes far too much about what the application needs to do. I think what's wanted is a callback hook that lets the app decide where and how to store the data. Not sure what the hook's API should be exactly, though. regards, tom lane
* Martijn van Oosterhout (kleptog@svana.org) wrote: > Right, I didn't understand that you meant to be doing this > synchronously, as the data came in. I thought it was just another way > of retreiving the data already received. But given that a stated reason > that psqlODBC didn't use the libpq interface was due to the copying of > all the data, it would be nice if we had something for that. From > looking at your declaration: > > int PQgetTuples(PGresult *res, // Returns number of tuples populated > const int max_ntuples, // Basically buffer size > char *result_set, // Destination buffer > const int *columnOffsets, // integer array of offsets > const int *columnLengths, // integer array of lengths, for checks > const int record_len, // Length of each structure > int *columnNulls, // 0/1 for is not null / is null > int resultFormat); // Or maybe just binary? > > you seem to be suggesting that all the data be stored in one big memory > block at resultset. The current block would be stored in one big memory block, yes. Basically a malloc(BUF_SIZE*sizeof(my_structure)); > What do you do if the data is longer than the given length? What does > record_len mean (what structures)? Also, you can't specify > binary/non-binary here, that's done in the query request. libpq doesn't > handle the data differently depending on binaryness. Also, how can you > find out the actual length of each value after the call? hmm, ok, binary/non-binary can be dropped then. If the data is longer than the length then you return and let the caller figure out what it wants to do (realloc, malloc another area, etc). Record_len is just the size of each record, so the amount to skip from the start to get to record #2. libpq just needs it in getAnotherTuple to calculate the place to put the next value. 
Finding the actual length is a good point; I should have included (as execParams has) an integer array for this, which could actually replace columnNulls and carry a special indication when a column's value is null.

> Frankly I'm not seeing much improvement over normal processing. It just
> seems like yet another data-model that won't fit most users. The
> definition of PQgetvalue is merely:
>
> return res->tuples[tup_num][field_num].value;
>
> So we could achieve the same effect by letting people look into
> PQresult before the query is finished. The function you suggest would
> be especially difficult for something like psqlODBC which has no idea
> beforehand how long a value could be.

I don't think it's quite the same effect... :P It's not exactly uncommon for people to know their data structure and to have defined a struct for it, allocate a block of memory and then want to just loop through the memory in a for() loop based on the structure size. This has been pretty common in standalone applications I've seen, and it's really nice to be able to have a database just dump the results of a query into such a structure.

> I'm still of the opinion that letting people supply an alternative to
> pqAddTuple would be cleaner. The interface would look like:
>
> typedef struct pgresAttValue
> {
>     int len;      /* length in bytes of the value */
>     char *value;  /* actual value, plus terminating zero byte */
> } PGresAttValue;
>
> typedef int (*PQtuplecallback)( PQresult *res, PGresAttValue *fields );
> int PQsettuplecallback( PQresult *res, PQtuplecallback cb );
>
> fields is simply a pointer to an array of nfields such structures.
> Users can do whatever they want with the info, store it in their own
> structure, parse it, throw it away, send it over a network, etc. With
> this callback I could probably implement your function above fairly
> straightforwardly.
Sure you could, but you're forced to do more copying around of the data (copy into the PGresAttValue, copy out of it into your structure array). If you want something more complex then a callback makes more sense, but I'm of the opinion that we're talking about a 90/10 or 80/20 split here in terms of dump-into-memory-array vs. do-something-more-complicated. And that opinion isn't *solely* based on Oracle providing a similar mechanism such that probably quite a few Oracle apps are written exactly as I suggest (I don't think OCI8 has a callback like you're proposing at all...).

Just to point out, we could do what you're proposing by letting people look at the PGresult during an async query too... ;) Except, of course, libpq would need to allocate/deallocate all the PGresults and have some way of knowing which have been used by the caller and which haven't.

Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > I can see how having a callback would be useful though I think for a > good number of cases it's just going to be populating a memory region > with it and we could cover that common case by providing an API for > exactly that. We already have that: it's called the existing libpq API. The only reason I can see for offering any new feature in this area is to cater to apps that want to transform the data representation on-the-fly, not merely dump it into an area that will be the functional equivalent of a PGresult. So it really has to be a callback. > The other issue with a callback is that libpq would have > to either call the callback for each value (not my preference) Why not? That would eliminate a number of problems. regards, tom lane
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Thu, Apr 13, 2006 at 11:14:57AM -0400, Greg Stark wrote:
> > That could be useful for applications but I think a driver really wants to
> > retain control of the flow of control. To make use of a callback it would have
> > to have an awkward dance of calling whatever function gives libpq license to
> > call the callback, having the callback stuff the data in a temporary space,
> > then checking for new data in the temporary space, and returning it to the
> > user.
>
> We have an asynchronous interface. I was thinking like:
>
> sub mycallback(res,data)
> {
>     /* stuff data in memory structure */
>     if( row_count > 5 )
>         gotenough = TRUE;
> }
>
> If you set non-blocking you can even go off and do other things while
> waiting. No need for temporary space...
>
> Does this seem too complex?

There's nothing wrong with a callback interface for applications. They can generally have the callback function update the display or output to a file or whatever they're planning to do with the data. However, drivers don't generally work that way. Drivers have functions like:

$q = prepare("select ...");
$q->execute();
while ($row = $q->fetch()) {
    print $row->{column};
}

To handle that using a callback interface would require that $q->fetch invoke some kind of pqCheckForData() which would upcall to the callback with the available data. The callback would have to stuff the data somewhere. Then fetch() would check to see if there was data there and return it to the user. It's doable, but dealing with this impedance mismatch between the interfaces necessitates extra steps. That means extra copying and extra function calls.

-- greg
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > I can see how having a callback would be useful though I think for a
> > good number of cases it's just going to be populating a memory region
> > with it and we could cover that common case by providing an API for
> > exactly that.
>
> We already have that: it's called the existing libpq API.

Right, and it sucks for doing large amounts of transfer through it.

> The only reason I can see for offering any new feature in this area is
> to cater to apps that want to transform the data representation
> on-the-fly, not merely dump it into an area that will be the functional
> equivalent of a PGresult. So it really has to be a callback.

It's only the functional equivalent when you think all the world is a Postgres app, which is just not the case.

> > The other issue with a callback is that libpq would have
> > to either call the callback for each value (not my preference)
>
> Why not? That would eliminate a number of problems.

For one thing, it's certainly possible the callback (to do a data transform like you're suggesting) would want access to the other information in a given tuple. Having to store a partial tuple in a temporary area which has to be built up to the full tuple before you can actually process it wouldn't be all that great. This concern applies much less to an entire table's worth of results (it's unlikely you'd need the whole table in hand before performing the transforms). It would also be an awful lot of calls.

Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> The only reason I can see for offering any new feature in this area is >> to cater to apps that want to transform the data representation >> on-the-fly, not merely dump it into an area that will be the functional >> equivalent of a PGresult. So it really has to be a callback. > It's only the functional equivalent when you think all the world is a > Postgres app, which is just not the case. If we are dumping data into a simple memory block in a format dictated by libpq, then we haven't done a thing to make the app's use of that data independent of libpq. Furthermore, because that format has to be generalized (variable-length fields, etc), it will not be noticeably easier to use than the existing PQresult API. What I would envision as a typical use of a callback is to convert the data and store it in a C struct designed specifically for a particular query's known result structure (say, a few ints, a string of a known maximum length, etc). libpq can't do that, but a callback could do it easily. The fixed-memory-block approach also falls over when considering results of uncertain maximum size. Lastly, it doesn't seem to me to respond at all to the ODBC needs that started this thread: IIUC, they want each row separately malloc'd so that they can free selected rows from the completed resultset. >>> The other issue with a callback is that libpq would have >>> to either call the callback for each value (not my preference) >> >> Why not? That would eliminate a number of problems. > For one thing, it's certainly possible the callback (to do a data > transform like you're suggesting) would want access to the other > information in a given tuple. Having to store a partial tuple in a > temporary area which has to be built up to the full tuple before you can > actually process it wouldn't be all that great. 
So instead, you'd prefer to *always* store partial tuples in a temporary area, thereby making sure the independent-field-conversions case has performance just as bad as the dependent-conversions case. I can't follow that reasoning. regards, tom lane
On Thu, Apr 13, 2006 at 11:54:33AM -0400, Stephen Frost wrote:
<snip>
> Sure you could but you're forced to do more copying around of the data
> (copy into the PGresAttValue, copy out of it into your structure array).
> If you want something more complex then a callback makes more sense but
> I'm of the opinion that we're talking about a 90/10 or 80/20 split here
> in terms of dump-into-memory array vs. do-something-more-complicated.

I think we're talking cross-purposes here. You seem to be interested in making another way to get the data. What I'm trying to do is create an interface flexible enough that no-one would ever want to write their own wire-protocol parser because they can get libpq to do it. This probably falls into the 10% portion.

The use of PGresAttValue was deliberate. libpq already uses this so it costs nothing. Also, the memory pointed to is allocated very cheaply within libpq. The intention is that users can either choose to use that (great for read-only, i.e. 90% of the time) or copy it *only* if they want to (what psqlODBC wants to do). Basically, your solution doesn't handle the use case of psqlODBC, which is specifically what I'm aiming at here...

> Just to point out, we could do what you're proposing by letting people
> look at PQresult during an async too.. ;) Except, of course, libpq
> would need to allocate/deallocate all the PQresults and have some way of
> knowing which have been used by the caller and which haven't.

Eh? You only need one PQresult...

Have a nice day,
-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
On Thu, Apr 13, 2006 at 12:02:56PM -0400, Greg Stark wrote:
> There's nothing wrong with a callback interface for applications. They can
> generally have the callback function update the display or output to a file or
> whatever they're planning to do with the data.
>
> However drivers don't generally work that way. Drivers have functions like:

As I pointed out in another email, this change is not aimed at applications doing fetch_next, but specifically at drivers like psqlODBC which have a very special way of handling resultsets, in this case, updateable resultsets. The aim is to work out why people are writing their own wire-protocol parsers. To find out the deficiency in libpq that prevents them using it.

I agree, for what you're talking about I don't think a callback is at all relevant.

Have a nice day,
-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Tom Lane <tgl@sss.pgh.pa.us> writes: > So instead, you'd prefer to *always* store partial tuples in a temporary > area, thereby making sure the independent-field-conversions case has > performance just as bad as the dependent-conversions case. > I can't follow that reasoning. I think there's some confusion about what problem this is aiming to solve. I thought the primary problem ODBC and other drivers have is just that they want to be able to fetch whatever records are available instead of waiting for the entire query results to be ready. All it sounded like to me was a need for a function that would wait until n records were available (or perhaps n bytes worth of records) then return. You seem to be talking about a much broader set of problems to solve. -- greg
Greg Stark <gsstark@mit.edu> writes: > I think there's some confusion about what problem this is aiming to solve. I > thought the primary problem ODBC and other drivers have is just that they want > to be able to fetch whatever records are available instead of waiting for the > entire query results to be ready. No, that's not what I'm thinking about at all, and I don't think Martijn is either. The point here is that ODBC wants to store the resultset in a considerably different format from what libpq natively provides, and we'd like to avoid the conversion overhead. Now, a callback function could be (ab)used for the purpose of not waiting, very easily: either do real processing on each row for itself, or signal the main app via some outside-the-API mechanism whenever it has stored N rows. The question the app author would have to ask himself is whether he needs to undo that processing if the query fails further on, and if so how to do that. But that need not be our problem. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > It's only the functional equivalent when you think all the world is a
> > Postgres app, which is just not the case.
>
> If we are dumping data into a simple memory block in a format dictated
> by libpq, then we haven't done a thing to make the app's use of that
> data independent of libpq. Furthermore, because that format has to be
> generalized (variable-length fields, etc), it will not be noticeably
> easier to use than the existing PQresult API.

The format of the structure *isn't* really dictated by libpq. The offsets, value length and record size are intended to support almost any C array structure. Variable-length fields have a max size; if a value goes over it, an error is returned or indicated through the indicator array. It also gets the data into the structure quite a few applications would like to have it in (which is certainly not PQresult).

> What I would envision as a typical use of a callback is to convert the
> data and store it in a C struct designed specifically for a particular
> query's known result structure (say, a few ints, a string of a known
> maximum length, etc). libpq can't do that, but a callback could do it
> easily.

Heh, this is exactly what I'm proposing we make libpq capable of doing, which is a relatively simple thing to do. I agree that it's often a goal of application developers to get the data into this kind of structure. The one downside is that at the moment I think the binary results from libpq come back in network byte order instead of host byte order. Oracle provided a way to indicate the types of the fields in the structure and performed some conversions (such as these) for you. The constants they used started with "SQL_" but I'm not entirely sure if they were actually defined in the standard or not.

> The fixed-memory-block approach also falls over when considering results
> of uncertain maximum size.
> Lastly, it doesn't seem to me to respond at
> all to the ODBC needs that started this thread: IIUC, they want each row
> separately malloc'd so that they can free selected rows from the
> completed resultset.

Results of uncertain maximum size aren't a problem at all... The caller can do the exact same thing libpq does (realloc), or it could allocate another array. *Each* call to the libpq function would return the number of elements actually populated into the memory block; the caller would then be expected to pass in a *fresh* memory block for the next call (which could just be a simply calculated offset into the block they allocated, or could be a realloc'd block + offset, or a brand new block, etc...).

I'm really not sure why there seems to be this "this won't work!" reaction. This isn't something I came up with out of whole cloth; it's an API that isn't unlike PQexecParams, is similar to something Oracle does (which I've used quite a bit for doing *exactly* what's mentioned above: I've got an array of pre-defined C structs that I know match the query and I want that array filled in) and is really not that complicated.

> > For one thing, it's certainly possible the callback (to do a data
> > transform like you're suggesting) would want access to the other
> > information in a given tuple. Having to store a partial tuple in a
> > temporary area which has to be built up to the full tuple before you can
> > actually process it wouldn't be all that great.
>
> So instead, you'd prefer to *always* store partial tuples in a temporary
> area, thereby making sure the independent-field-conversions case has
> performance just as bad as the dependent-conversions case.
> I can't follow that reasoning.

I haven't been ruling out providing a callback mechanism as well, but I think it's the 10% case and the 90% case is being shoe-horned into the 10% case with a performance degradation to boot.
They're also not partial tuples, it's not a temporary area, and there's demonstrably less copying around of the data. It seems ODBC may be in the 10% piece here but I haven't looked at the ODBC source code yet.

Thanks, Stephen
* Greg Stark (gsstark@mit.edu) wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > So instead, you'd prefer to *always* store partial tuples in a temporary
> > area, thereby making sure the independent-field-conversions case has
> > performance just as bad as the dependent-conversions case.
> > I can't follow that reasoning.
>
> I think there's some confusion about what problem this is aiming to solve. I
> thought the primary problem ODBC and other drivers have is just that they want
> to be able to fetch whatever records are available instead of waiting for the
> entire query results to be ready.

Honestly, I think that may be part of it, but it seems they're more interested in storing the tuples in their own structure right away instead of keeping a PQresult around and using it everywhere.

> All it sounded like to me was a need for a function that would wait until n
> records were available (or perhaps n bytes worth of records) then return.

I'm not sure that you'd actually want to block until there was a certain amount returned, but that would be doable I suppose.

> You seem to be talking about a much broader set of problems to solve.

I'd like to improve the API in general to cover a set of use-cases that I've run into quite a few times (and apparently some others have too, as other DBs offer a similar API). I'd also like the ODBC driver to be able to use libpq instead of having its own implementation of the wireline protocol. I was hoping these would overlap, but it's possible they won't, in which case it might be sensible to add two new methods to the API (though I'm sure to get flak about that idea).

Thanks, Stephen
Tom Lane <tgl@sss.pgh.pa.us> writes: > Greg Stark <gsstark@mit.edu> writes: > > I think there's some confusion about what problem this is aiming to solve. I > > thought the primary problem ODBC and other drivers have is just that they want > > to be able to fetch whatever records are available instead of waiting for the > > entire query results to be ready. > > No, that's not what I'm thinking about at all, and I don't think Martijn > is either. The point here is that ODBC wants to store the resultset in > a considerably different format from what libpq natively provides, and > we'd like to avoid the conversion overhead. So how would you provide the data to the callback? And how does having a callback instead of a regular downcall give you any more flexibility in how you present the data? -- greg
* Greg Stark (gsstark@mit.edu) wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > Greg Stark <gsstark@mit.edu> writes:
> > > I think there's some confusion about what problem this is aiming to solve. I
> > > thought the primary problem ODBC and other drivers have is just that they want
> > > to be able to fetch whatever records are available instead of waiting for the
> > > entire query results to be ready.
> >
> > No, that's not what I'm thinking about at all, and I don't think Martijn
> > is either. The point here is that ODBC wants to store the resultset in
> > a considerably different format from what libpq natively provides, and
> > we'd like to avoid the conversion overhead.
>
> So how would you provide the data to the callback? And how does having a
> callback instead of a regular downcall give you any more flexibility in how
> you present the data?

The callback can be called for each record without having to store any more than one tuple's worth of information in libpq. I suppose you could change things such that a call using the new interface only processes one tuple's worth from the input stream, and just not read any more data from the socket until there have been enough calls to process tuples. That's really more the double-memory issue though. There's also the double-copying that's happening, and having to wait for all the data to come in before being able to read it; of course that last could be handled by cursors...

Thanks, Stephen
On Thu, Apr 13, 2006 at 03:42:44PM -0400, Stephen Frost wrote:
> > You seem to be talking about a much broader set of problems to solve.
>
> I'd like to improve the API in general to cover a set of use-cases that
> I've run into quite a few times (and apparently some others have too as
> other DBs offer a similar API). I'd also like the ODBC driver to be
> able to use libpq instead of having its own implementation of the
> wireline protocol. I was hoping these would overlap but it's possible
> they won't in which case it might be sensible to add two new methods to
> the API (though I'm sure to get flak about that idea).

Well, the psqlODBC driver apparently ran into a number of problems with libpq that resulted in them not using it for their purpose. Given libpq's primary purpose is to connect to PostgreSQL, it failing at that is something that should be fixed.

The problem you're trying to solve is also important; it would be nice to find a good solution to that. I'm just not sure if it was relevant to the decision to bypass libpq.

Have a nice day,
-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Martijn van Oosterhout wrote: -- Start of PGP signed section. > On Thu, Apr 13, 2006 at 03:42:44PM -0400, Stephen Frost wrote: > > > You seem to be talking about a much broader set of problems to solve. > > > > I'd like to improve the API in general to cover a set of use-cases that > > I've run into quite a few times (and apparently some others have too as > > other DBs offer a similar API). I'd also like the ODBC driver to be > > able to use libpq instead of having its own implementation of the > > wireline protocol. I was hoping these would overlap but it's possible > > they won't in which case it might be sensible to add two new metheds to > > the API (though I'm sure to get flak about that idea). > > Well, the psqlODBC driver apparently ran into a number of problems with > libpq that resulted in them not using it for their purpose. Given libpq > primary purpose is to connect to PostgreSQL, it failing at that is > something that should be fixed. > > The problem you're trying to solve is also important, it would be nice > to find a good solution to that. I'm just not sure if it was relevent > to the decision to bypass libpq. I know there was a lot of confusion over parallel development of psqlODBC and my guess is that current CVS is the best solution at this time. Of course, that doesn't invalidate the idea that this can be revisited as things settle down and improvements made. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Greg Stark <gsstark@mit.edu> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> No, that's not what I'm thinking about at all, and I don't think Martijn >> is either. The point here is that ODBC wants to store the resultset in >> a considerably different format from what libpq natively provides, and >> we'd like to avoid the conversion overhead. > So how would you provide the data to the callback? And how does having a > callback instead of a regular downcall give you any more flexibility in how > you present the data? You'd hand the callback the raw data coming off the wire (pointer and byte count, probably), and then it could do whatever's appropriate. For instance, if the callback knows this field is to be converted to int, it could do atoi() and then store the integer. (Or if it knows the data is transmitted in binary, ntohl() would be the thing instead.) The basic point here is that the callback should replace all the parts of getAnotherTuple() that are responsible for storing data into the PGresult structure, including all of pqAddTuple. If you aren't satisfied with the PGresult representation, that's the level of flexibility you need, IMHO. I don't see the point of half-measures. Probably there would need to be at least three callbacks involved: one for setup, called just after the tuple descriptor info has been received; one for per-field data receipt, and one for per-tuple operations (called after all the fields of the current tuple have been passed to the per-field callback). Maybe you'd want a shutdown callback too, although that's probably not strictly necessary since whatever you might need it to do could be done equally well in the app after PQgetResult returns. (You still want to return a PGresult to carry command success/failure info, and probably the tuple descriptor info, even though use of the callbacks would leave it containing none of the data.) 
A useful finger exercise for validating the design would be to code up the default callbacks, ie, code to build the current PGresult structure using this API. regards, tom lane
Stephen Frost wrote:
> * Martijn van Oosterhout (kleptog@svana.org) wrote:
> > On Thu, Apr 13, 2006 at 08:48:54AM +0100, Dave Page wrote:
> > > Well, we had a pure custom implementation of the protocol, had a pure
> > > libpq based version and after much discussion decided that the best
> > > version of all was the hybrid as it allowed us to hijack features like
> > > SSL, Kerberos, pgpass et al, yet not be constrained by the limitations
> > > of libpq, or copy query results about so much.
> >
> > Right. Would you see value in a more formal libpq "hijack-me" interface
> > that would support making the initial connection and then handing off
> > the rest to something else?
> >
> > I'm wondering because obviously with the current setup, if libpq is
> > compiled with SSL support, psqlODBC must also be. Are there any points
> > where you have to fight libpq over control of the socket?
> [...]
> > Is there anything else you might need?
>
> Instead of having it hijack the libpq connection and implement the
> wireline protocol itself, why don't we work on fixing the problems (such
> as the double-copying that libpq requires) in libpq to allow the driver
> (and others!) to use it in the 'orthodox' way?
>
> I would have spoken up on the ODBC list if I understood that 'hybrid'
> really meant 'just using libpq for connection/authentication'. I really
> think it's a bad idea to have the ODBC driver reimplement the wireline
> protocol because that protocol does change from time to time and someone
> using libpq will hopefully have fewer changes (and thus makes the code
> easier to maintain) than someone implementing the wireline protocol
> themselves (just causing more busy-work that, at least we saw in the
> past with the ODBC driver, may end up taking *forever* for someone to
> be able to commit the extra required time to implement).

Libpq and the psqlODBC driver have walked different roads for a very long time. In 6.3 or before, there wasn't a libpq library under Windows.
In 6.4 we had the libpq library under Windows but unfortunately it wasn't able to talk to 6.3 or before.... At last in 7.4 libpq was able to speak both protocol v3 and protocol v2, but it is pretty hard work, at least for me, to transfer all the accumulated work to a libpq-based version. I'm not sure what kind of functionality is required for libpq to make the transfer easy. Of course the double-copying issue is a big one.

regards, Hiroshi Inoue
From: "Zeugswetter Andreas DCP SD"
Subject: Re: Practical impediment to supporting multiple SSL libraries
> Well, the psqlODBC driver apparently ran into a number of problems with
> libpq that resulted in them not using it for their purpose.
> Given libpq's primary purpose is to connect to PostgreSQL, it failing at that is
> something that should be fixed.

I think you are forgetting that e.g. a JDBC driver will not want to depend on an external C dll at all. It will want a native Java implementation (Group 4). Thus imho it is necessary to have a defined wire protocol, which we have. So if a driver needs to use the wire protocol it is imho not a problem. If applications started using it because they couldn't find a suitable driver, now that would be a problem.

Andreas
"Zeugswetter Andreas DCP SD" <ZeugswetterA@spardat.at> writes: > > Well, the psqlODBC driver apparently ran into a number of problems with > > libpq that resulted in them not using it for their purpose. Given libpq > > primary purpose is to connect to PostgreSQL, it failing at that is > > something that should be fixed. > > I think you are forgetting, that e.g. a JDBC driver will not want to depend > on an external C dll at all. It will want a native Java implementation > (Group 4). Thus imho it is necessary to have a defined wire protocol, which > we have. I think you are forgetting that this is a complete nonsequitor. Nobody suggested eliminating the defined wire protocol. Nor was anybody even discussing JDBC. Java folks' fetish for reimplementing everything in Java is entirely irrelevant. -- greg
Greg Stark <gsstark@MIT.EDU> writes: > "Zeugswetter Andreas DCP SD" <ZeugswetterA@spardat.at> writes: > > > > Well, the psqlODBC driver apparently ran into a number of problems with > > > libpq that resulted in them not using it for their purpose. Given libpq > > > primary purpose is to connect to PostgreSQL, it failing at that is > > > something that should be fixed. > > > > I think you are forgetting, that e.g. a JDBC driver will not want to depend > > on an external C dll at all. It will want a native Java implementation > > (Group 4). Thus imho it is necessary to have a defined wire protocol, which > > we have. > > I think you are forgetting that this is a complete nonsequitor. Hm, now that I've had some sleep I think I see where you're going with this. As long as there's a defined wire protocol (and there will always be one) then there's nothing wrong with what the psqlODBC driver is doing and having a libpq mode that hands off small bits of the unparsed stream isn't really any different than just having the driver read the unparsed data from the socket. I'm not sure whether that's true or not but it's certainly a reasonable point. Sorry for my quick response last night. -- greg
On Thu, Apr 13, 2006 at 09:00:10PM -0400, Tom Lane wrote:
> Probably there would need to be at least three callbacks involved:
> one for setup, called just after the tuple descriptor info has been
> received; one for per-field data receipt, and one for per-tuple
> operations (called after all the fields of the current tuple have
> been passed to the per-field callback). Maybe you'd want a shutdown
> callback too, although that's probably not strictly necessary since
> whatever you might need it to do could be done equally well in the
> app after PQgetResult returns. (You still want to return a PGresult
> to carry command success/failure info, and probably the tuple descriptor
> info, even though use of the callbacks would leave it containing none of
> the data.)

Sounds really good. The only thing now is that the main author of the wire-protocol code in psqlODBC has not yet made any comment on any of this, so we don't want to set anything in stone until we know it would solve their problem...

Have a nice day,
-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
On Fri, Apr 14, 2006 at 10:42:33AM -0400, Greg Stark wrote:
> Hm, now that I've had some sleep I think I see where you're going with this.
>
> As long as there's a defined wire protocol (and there will always be one) then
> there's nothing wrong with what the psqlODBC driver is doing and having a
> libpq mode that hands off small bits of the unparsed stream isn't really any
> different than just having the driver read the unparsed data from the socket.

Well, the main motivation for this is that when a new version of the protocol appears, libpq will support it but psqlODBC won't. If libpq provides a way to get these small bits of the unparsed stream in a protocol-independent way, then that problem goes away.

There are a number of other (primarily driver) projects that would benefit from being able to bypass the PGresult structure for storing data.

Have a nice day,
-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Martijn van Oosterhout <kleptog@svana.org> writes: > On Fri, Apr 14, 2006 at 10:42:33AM -0400, Greg Stark wrote: >> As long as there's a defined wire protocol (and there will always be >> one) then there's nothing wrong with what the psqlODBC driver is doing > Well, the main motivation for this is that when a new version of the > protocol appears, libpq will support it but psqlODBC won't. If libpq > provides a way to get these small bits of the unparsed stream in a > protocol independant way, then that problem goes away. Greg's observation is correct, so maybe we are overthinking this problem. A fair question to ask is whether psqlODBC would consider going back to a non-hybrid implementation if these features did exist in libpq. > There are a number of other (primarily driver) projects that would > benefit from being able to bypass the PGresult structure for storing > data. Please mention some specific examples. We need some examples as a reality check. regards, tom lane
On Fri, Apr 14, 2006 at 04:53:53PM +0200, Martijn van Oosterhout wrote:
> Sounds really good.

<snip> There's a message on the pgsql-odbc mailing list[1] with some reasons for not using libpq:

1. The driver sets some session default parameters (DateStyle, client_encoding etc) using the start-up message.

As far as I can see it only does this when the environment variables are set, which IMHO is the correct behaviour. If psqlODBC doesn't honour them, that does violate the principle of least surprise. OTOH, the users of ODBC possibly shouldn't be affected by the environment variables of the user, given the user of ODBC likely doesn't know (or care) that PostgreSQL is involved.

2. You can try the V2 protocol implementation when the V3 implementation has some bugs or performance issues.

Well, there is a point here: you can't force the version. It always defaults to 3 if available.

3. Quote: I don't know what libraries libpq would need in the future but it's quite unpleasant for me if the psqlODBC driver can't be loaded for the lack of needless libraries.

It's a reason, just not a good one IMHO. If the user has installed libpq with a number of libraries, then that's what the user wants. I'm not sure why psqlODBC is worried about that.

So while this thread has produced several good ideas (which possibly should be implemented regardless), perhaps we should focus on these issues also.

Have a nice day,

[1] http://archives.postgresql.org/pgsql-odbc/2006-04/msg00052.php

-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
-----Original Message----- From: "Tom Lane"<tgl@sss.pgh.pa.us> Sent: 14/04/06 16:22:45 To: "Martijn van Oosterhout"<kleptog@svana.org> Cc: "Greg Stark"<gsstark@mit.edu>, "Zeugswetter Andreas DCP SD"<ZeugswetterA@spardat.at>, "Dave Page"<dpage@vale-housing.co.uk>, "pgsql-hackers@postgresql.org"<pgsql-hackers@postgresql.org>, "Hiroshi Inoue"<inoue@tpf.co.jp> Subject: Re: [HACKERS] Practical impediment to supporting multiple SSL libraries > A fair question to ask is whether psqlODBC would consider > going back to a non-hybrid implementation if these features did exist > in libpq. It's not something I want to spend any more time on, and Hiroshi made it quite clear on -odbc yesterday that he doesn't want libpq to become a requirement of psqlODBC (it's dynamically loaded atm, thus is optional). Regards, Dave
Dave Page wrote: > > -----Original Message----- From: "Tom Lane"<tgl@sss.pgh.pa.us> Sent: > 14/04/06 16:22:45 To: "Martijn van Oosterhout"<kleptog@svana.org> Cc: > "Greg Stark"<gsstark@mit.edu>, "Zeugswetter Andreas DCP > SD"<ZeugswetterA@spardat.at>, "Dave Page"<dpage@vale-housing.co.uk>, > "pgsql-hackers@postgresql.org"<pgsql-hackers@postgresql.org>, "Hiroshi > Inoue"<inoue@tpf.co.jp> Subject: Re: [HACKERS] Practical impediment to > supporting multiple SSL libraries > > > A fair question to ask is whether psqlODBC would consider > > going back to a non-hybrid implementation if these features did exist > > in libpq. > > It's not something I want to spend any more time on, and Hiroshi made > it quite clear on -odbc yesterday that he doesn't want libpq to become > a requirement of psqlODBC (it's dynamically loaded atm, thus is > optional). Hiroshi does not speak for the psqlODBC project. It is a community project. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
-----Original Message----- From: "Bruce Momjian"<pgman@candle.pha.pa.us> Sent: 14/04/06 16:42:08 To: "Dave Page"<dpage@vale-housing.co.uk> Cc: "tgl@sss.pgh.pa.us"<tgl@sss.pgh.pa.us>, "kleptog@svana.org"<kleptog@svana.org>, "gsstark@mit.edu"<gsstark@mit.edu>, "ZeugswetterA@spardat.at"<ZeugswetterA@spardat.at>, "pgsql-hackers@postgresql.org"<pgsql-hackers@postgresql.org>, "inoue@tpf.co.jp"<inoue@tpf.co.jp> Subject: Re: [HACKERS] Practical impediment to supporting multiple SSL libraries > Hiroshi does not speak for the psqlODBC project. It is a community > project. I am well aware of that, but as he is by far the most experienced and productive ODBC developer we have, it would not be particularly sensible to not give his opinion the weight it deserves - especially as there is unlikely to be anyone else to undertake such a project (again). Regards, Dave
Dave Page wrote: > > Hiroshi does not speak for the psqlODBC project. It is a community > > project. > > I am well aware of that, but as he is by far the most experienced and > productive ODBC developer we have it would not be particularly sensible > to not give his opinion the weight it deserves - especially as there > is unlikely to be anyone else to undertake such a project (again). Right, sure he has weight. It is the concept that "If Hiroshi doesn't want it, it isn't going to happen", that I objected to. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
> > It's not something I want to spend any more time on, and Hiroshi made > > it quite clear on -odbc yesterday that he doesn't want libpq to become > > a requirement of psqlODBC (it's dynamically loaded atm, thus is > > optional). > > Hiroshi does not speak for the psqlODBC project. It is a community > project. Well yes, it is a community project, but whoever is doing the development is going to make the decision on what direction to go. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Please mention some specific examples. We need some examples as a
> reality check.

Just took a look through a couple of Debian packages which depend on libpq4:

libpam-pgsql: pam_pgsql.c, line 473: it uses PQgetvalue() as one would expect, but doesn't actually save the pointer anywhere, just uses it to do comparisons against (all it stores, apparently, is a password in the DB).

libnss-pgsql: src/backend.c, line 228:

    sptr = PQgetvalue(res, row, colnum);
    slen = strlen(sptr);
    if (*buflen < slen + 1) {
        return NSS_STATUS_TRYAGAIN;
    }
    strncpy(*buffer, sptr, slen);
    (*buffer)[slen] = '\0';
    *valptr = *buffer;
    *buffer += slen + 1;
    *buflen -= slen + 1;
    return NSS_STATUS_SUCCESS;

That really seems to be the classic example to me: get the data from the PGresult, store it in something else, work on it.

mapserver: mappostgis.c, starting from line 1340:

    shape->values = (char **) malloc(sizeof(char *) * layer->numitems);
    for (t = 0; t < layer->numitems; t++) {
        temp1 = (char *) PQgetvalue(query_result, 0, t);
        size = PQgetlength(query_result, 0, t);
        temp2 = (char *) malloc(size + 1);
        memcpy(temp2, temp1, size);
        temp2[size] = 0;    /* null terminate it */
        shape->values[t] = temp2;
    }

This same code repeats in another place (line 1139). They also appear to forget to PQclear() in some cases. :( They don't appear to ever save the pointer returned by PQgetvalue() for anything.

postfix: src/global/dict_pgsql.c, starting from line 349:

    numcols = PQnfields(query_res);
    for (expansion = i = 0; i < numrows && dict_errno == 0; i++) {
        for (j = 0; j < numcols; j++) {
            r = PQgetvalue(query_res, i, j);
            if (db_common_expand(dict_pgsql->ctx, dict_pgsql->result_format,
                                 r, name, result, 0)
                && dict_pgsql->expansion_limit > 0
                && ++expansion > dict_pgsql->expansion_limit) {
                msg_warn("%s: %s: Expansion limit exceeded for key: '%s'",
                         myname, dict_pgsql->parser->name, name);
                dict_errno = DICT_ERR_RETRY;
                break;
            }
        }
    }
    PQclear(query_res);
    r = vstring_str(result);
    return ((dict_errno == 0 && *r) ? r : 0);

exim does something similar to postfix too. It really seems unlikely that anyone keeps PGresults around for very long, and they all seem to want to stick the data into their own memory structure. I don't know how many people would move to a new API should one be provided, though. Callbacks can be kind of a pain in the butt to code, which makes the amount of effort required to move to using them a bit higher.

This all means double memory usage, though, and that really makes me want some kind of API that can be used to process data as it comes in.

Another thought along these lines: perhaps a 'PQgettuple' which can be used to process one tuple at a time. This would be used in an async fashion and libpq just wouldn't read/accept more than a tuple's worth each time, which it could do into a fixed area (in general, for a variable-length field it could default to an initial size and then only grow it when necessary, and grow it larger than the current request by some amount to hopefully avoid more malloc/reallocs later).

Thanks,

Stephen
-----Original Message----- From: "Bruce Momjian"<pgman@candle.pha.pa.us> Sent: 14/04/06 16:57:58 To: "Dave Page"<dpage@vale-housing.co.uk> Cc: "tgl@sss.pgh.pa.us"<tgl@sss.pgh.pa.us>, "kleptog@svana.org"<kleptog@svana.org>, "gsstark@mit.edu"<gsstark@mit.edu>, "ZeugswetterA@spardat.at"<ZeugswetterA@spardat.at>, "pgsql-hackers@postgresql.org"<pgsql-hackers@postgresql.org>, "inoue@tpf.co.jp"<inoue@tpf.co.jp> Subject: Re: [HACKERS] Practical impediment to supporting multiple SSL libraries > Right, sure he has weight. It is the concept that "If Hiroshi doesn't > want it, it isn't going to happen", that I objected to. I don't believe I said that - and you ought to know me well enough by now to know I wouldn't have said it! :-) Regards, Dave
On Fri, Apr 14, 2006 at 11:22:23AM -0400, Tom Lane wrote: > Greg's observation is correct, so maybe we are overthinking this > problem. A fair question to ask is whether psqlODBC would consider > going back to a non-hybrid implementation if these features did exist > in libpq. Well, it is an issue. It's not a specific problem per se that psqlODBC implements the protocol itself. If you remember right back at the beginning of the thread (see subject), there was the issue of users using libpq to connect and then continuing themselves. The issue being that the pointer from PQgetssl() wouldn't work if we had different SSL libraries available. Perhaps a far easier approach would be to indeed just have a hijack interface that provides read/write over whatever protocol libpq negotiated. Then people could write their own protocol parsers to suit their needs while still using libpq for the connection. Have the cake and eat it too? Note, we would have to allow users of libpq to force the version, otherwise libpq would connect using a version the user doesn't understand. > Please mention some specific examples. We need some examples as a > reality check. Well, psqlODBC is the obvious case. Besides that it becomes tricky. I would think that DBD::Pg could benefit, I just don't understand the code well enough to know if it's directly useful. I would expect drivers in particular to benefit, and some complex applications, but if you're asking for specific examples, I don't have any... That doesn't change the fact that it's a nice idea, just that definite beneficiaries are harder to find. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Martijn van Oosterhout <kleptog@svana.org> writes: > Perhaps a far easier approach would be to indeed just have a hijack > interface that provides read/write over whatever protocol libpq > negotiated. Well, there's a precedent to look at: the original implementation of COPY mode was pretty nearly exactly that. And it sucked, and eventually we changed it. So I'd be pretty leery of repeating the experience... regards, tom lane
On Fri, Apr 14, 2006 at 01:05:11PM -0400, Tom Lane wrote: > Martijn van Oosterhout <kleptog@svana.org> writes: > > Perhaps a far easier approach would be to indeed just have a hijack > > interface that provides read/write over whatever protocol libpq > > negotiated. > > Well, there's a precedent to look at: the original implementation of > COPY mode was pretty nearly exactly that. And it sucked, and eventually > we changed it. So I'd be pretty leery of repeating the experience... As I remember, the main issue was with the loss of control over the error state and recovering if stuff went wrong. In this case, once someone hijacks a connection they can't hand it back. Its only option is to close. I was just thinking of providing pointers to pqsecure_read/write and maybe a few other things, but that's it. Or was there something else? Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Stephen Frost <sfrost@snowman.net> writes: > Another thought along these lines: Perhaps a 'PQgettuple' which can be > used to process one tuple at a time. This would be used in an ASYNC > fashion and libpq just wouldn't read/accept more than a tuple's worth > each time, which it could do into a fixed area (in general, for a > variable-length field it could default to an initial size and then only > grow it when necessary, and grow it larger than the current request by > some amount to hopefully avoid more malloc/reallocs later). I know DBD::Oracle uses an interface somewhat like this but more sophisticated. It provides a buffer and Oracle fills it with as many records as it can. It's blocking though (by default) and DBD::Oracle tries to adjust the size of the buffer to keep the network pipeline full, but if the application is slow at reading the data then the network buffers fill and it pushes back to the database which blocks writing. This is normally a good thing though. One of the main problems with the current libpq interface is that if you have a very large result set it flows in as fast as it can and the library buffers it *all*. If you're trying to avoid forcing the user to eat millions of records at once you don't want to be buffering them anywhere all at once. You want a constant pipeline of records streaming out as fast as they can be processed and no faster. -- greg
* Greg Stark (gsstark@mit.edu) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > Another thought along these lines: Perhaps a 'PQgettuple' which can be > > used to process one tuple at a time. This would be used in an ASYNC > > fashion and libpq just wouldn't read/accept more than a tuple's worth > > each time, which it could do into a fixed area (in general, for a > > variable-length field it could default to an initial size and then only > > grow it when necessary, and grow it larger than the current request by > > some amount to hopefully avoid more malloc/reallocs later). > > I know DBD::Oracle uses an interface somewhat like this but more > sophisticated. It provides a buffer and Oracle fills it with as many records > as it can. The API I suggested originally did this, actually. I'm not sure if it would be used in these cases though, which is why I was backing away from it a bit. I think it's great if you're grabbing a lot of data, but these seem to be cases when you're not. Then again, that's probably because of the kind of things I was looking at (you don't generally see large data-analysis tools in a distribution like Debian simply because those tools are usually specialized to a given data set, as is actually the case with some tools we use here at my work which make use of the Oracle buffer system and I'd love to move to something similar for Postgres...). > It's blocking though (by default) and DBD::Oracle tries to adjust the size of > the buffer to keep the network pipeline full, but if the application is slow > at reading the data then the network buffers fill and it pushes back to the > database which blocks writing. It could be done as blocking or non-blocking and could be an option in the API, really. I do prefer the idea that if the application is slow at reading the data then it pushes back to the database to block writing. I also *really* prefer to minimize the amount of memory used by libraries... I've never felt it's appropriate for libpq to allocate huge amounts of memory in response to a large query. :/ I know this can be worked around using cursors, but I still feel it's a terrible thing for a library to do. > This is normally a good thing though. One of the main problems with the > current libpq interface is that if you have a very large result set it flows > in as fast as it can and the library buffers it *all*. If you're trying to > avoid forcing the user to eat millions of records at once you don't want to be > buffering them anywhere all at once. You want a constant pipeline of records > streaming out as fast as they can be processed and no faster. Right... As I mentioned, the application can use cursors to *work-around* this foolishness in libpq, but that doesn't really make it any less silly. Thanks! Stephen
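For reference, the cursor workaround mentioned above looks roughly like this with ordinary libpq calls. This is a sketch only: `bigtable` is a placeholder, connection parameters are taken from the usual PG* environment variables, and it needs a live server plus linking with -lpq to actually run.

```
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    /* empty conninfo string: use PGHOST, PGDATABASE, etc. */
    PGconn *conn = PQconnectdb("");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    /* cursors only exist inside a transaction */
    PQclear(PQexec(conn, "BEGIN"));
    PQclear(PQexec(conn, "DECLARE c NO SCROLL CURSOR FOR "
                         "SELECT * FROM bigtable"));

    for (;;) {
        /* only 1000 rows are ever buffered in the client at once */
        PGresult *res = PQexec(conn, "FETCH 1000 FROM c");
        if (PQresultStatus(res) != PGRES_TUPLES_OK) {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            PQclear(res);
            break;
        }
        int n = PQntuples(res);
        for (int i = 0; i < n; i++)
            printf("%s\n", PQgetvalue(res, i, 0));  /* process the batch */
        PQclear(res);
        if (n == 0)
            break;                                  /* cursor exhausted */
    }

    PQclear(PQexec(conn, "CLOSE c"));
    PQclear(PQexec(conn, "COMMIT"));
    PQfinish(conn);
    return 0;
}
```

It works, but as Stephen says it pushes the batching burden onto every application, and a mid-stream failure simply surfaces as a failed FETCH.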
Stephen Frost <sfrost@snowman.net> writes: > Right... As I mentioned, the application can use cursors to > *work-around* this foolishness in libpq but that doesn't really make it > any less silly. Before you define libpq's behavior as "foolishness", you really ought to have a watertight semantics for what will happen in your proposal when a SELECT fails partway through (ie, after delivering some but not all of the tuples). In my mind the main reason for all-or-nothing PGresult behavior is exactly to save applications from having to deal with that case. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > Right... As I mentioned, the application can use cursors to > > *work-around* this foolishness in libpq but that doesn't really make it > > any less silly. > > Before you define libpq's behavior as "foolishness", you really ought to > have a watertight semantics for what will happen in your proposal when a > SELECT fails partway through (ie, after delivering some but not all of > the tuples). In my mind the main reason for all-or-nothing PGresult > behavior is exactly to save applications from having to deal with that > case. The library would report an error when trying to finish reading the data. Honestly, that just isn't libpq's problem; it's the application's problem to deal with it, and I know that *I* certainly wouldn't have any expectation (or faith...) in libpq doing the right thing for my particular application in any given failure case. The library should report error conditions, not assume that I'd only want all-or-nothing anyway. I'm not all about breaking backwards compatibility, though, so I'm not suggesting we change the existing behavior in this regard. This should not be an impediment to an addition to the API to allow for reading the data as it comes in. This certainly isn't unheard of or unexpected in the database world either, as (at least) Oracle's library doesn't do this collect-everything and make-sure-it's-all-happy before returning data to the user. Not to mention the potential for something bad to happen *while* reading the data out of libpq. For example, having the machine crash because you've run it out of memory because you've got at least 2 and probably 3 copies of the data in memory (i.e., ODBC under Windows with libpq). libpq might have been correct to provide data to the client since it was sure it had it all, but it doesn't help a bit when, because of libpq, the box runs out of memory. Thanks, Stephen
Martijn van Oosterhout wrote:
> On Fri, Apr 14, 2006 at 04:53:53PM +0200, Martijn van Oosterhout wrote:
>> Sounds really good.
> <snip>
> There's a message on the pgsql-odbc mailing list[1] with some reasons
> for not using libpq:
>
> 1. The driver sets some session default parameters (DateStyle,
> client_encoding etc) using the start-up message.
>
> As far as I can see it only does this when the environment variables
> are set. Which IMHO is the correct behaviour.

IMHO if libpq is to be a generic library it should first provide exactly what it can do using the protocol. *Environment variables* are not appropriate for per-application/datasource settings at all.

> 3. Quote: I don't know what libraries the libpq would need in the
> future but it's quite unpleasant for me if the psqlodbc driver can't be
> loaded with the lack of needeless librairies.
>
> It's a reason, just not a good one IMHO. If the user has installed
> libpq with a number of libraries, then that's what the user wants. I'm
> not sure why psqlODBC is worried about that.

It's very important to clarify what the libraries are needed for, and my basic policy is to provide appropriate bindings (linkage) between the libraries for the current dependency relation. As for SSL mode, it is only a mere extra for the current enhanced driver. My main purpose was to finish up my unfinished work before 7.4 using the V3 protocol, holdable cursors etc. The current driver under Windows is available without the existence of libpq.

regards, Hiroshi Inoue
Martijn van Oosterhout wrote:
> On Thu, Apr 13, 2006 at 09:00:10PM -0400, Tom Lane wrote:
>> Probably there would need to be at least three callbacks involved:
>> one for setup, called just after the tuple descriptor info has been
>> received; one for per-field data receipt, and one for per-tuple
>> operations (called after all the fields of the current tuple have
>> been passed to the per-field callback). Maybe you'd want a shutdown
>> callback too, although that's probably not strictly necessary since
>> whatever you might need it to do could be done equally well in the
>> app after PQgetResult returns. (You still want to return a PGresult
>> to carry command success/failure info, and probably the tuple descriptor
>> info, even though use of the callbacks would leave it containing none of
>> the data.)
>
> Sounds really good. The only thing now is that the main author of the
> wire-protocol code in psqlODBC has not yet made any comment on any of
> this. So we don't want to set anything in stone until we know it would
> solve their problem...

Unfortunately I don't have much time to examine it. Though the double-copying issue may be the biggest one, I'm pretty sure it's not the unique one. We would be happy to be able to replace the current code with libpq API calls one by one, but it's impossible because the driver can't go back to libpq mode once it has gone into hijacking mode. As for the hijacking mode used in the driver, it would be better to be able to use encapsulated recv/send than to get the pointer to the SSL object or socket.

regards, Hiroshi Inoue