Thread: Practical impediment to supporting multiple SSL libraries
Just quickly going through what might be needed to support multiple SSL
libraries revealed one big problem in libpq-fe.h.

#ifdef USE_SSL
/* Get the SSL structure associated with a connection */
extern SSL *PQgetssl(PGconn *conn);
#else
extern void *PQgetssl(PGconn *conn);
#endif

The return type of the function changes depending on whether SSL is
compiled in or not. :( So, libpq exposes to its users the underlying
SSL library, which seems wrong. Now, options include:

1. Changing it to always return (void*), irrespective of SSL
2. Creating a PGsslcontext type that varies depending on what library
   you use (or not).
3. Removing the function entirely because the only user appears to be
   psql (in tree anyway).
4. Only declare the function if the user has #included openssl
   themselves.

Or alternatively we could do nothing because:

5. It's not a problem
6. It's a backward incompatible change

Personally, I'm in favour of 1, because then we can get rid of the
#include for openssl, so users don't have to have openssl headers
installed to compile postgresql programs. Options 2, 3 and 4 have
varying levels of evilness attached. However, I can see how 5 or 6
might be attractive.

Thoughts?
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of
> Martijn van Oosterhout
> Sent: 12 April 2006 16:48
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> Just quickly going through what might be needed to support
> multiple SSL libraries revealed one big problem in libpq-fe.h.
>
> #ifdef USE_SSL
> /* Get the SSL structure associated with a connection */
> extern SSL *PQgetssl(PGconn *conn);
> #else
> extern void *PQgetssl(PGconn *conn);
> #endif
>
> The return type of the function changes depending on whether
> SSL is compiled in or not. :( So, libpq exposes to its users
> the underlying SSL library, which seems wrong. Now, options include:
>
> 1. Changing it to always return (void*), irrespective of SSL
> 2. Creating a PGsslcontext type that varies depending on what
>    library you use (or not).
> 3. Removing the function entirely because the only user
>    appears to be psql (in tree anyway).
> 4. Only declare the function if the user has #included
>    openssl themselves.
>
> Or alternatively we could do nothing because:
>
> 5. It's not a problem
> 6. It's a backward incompatible change

The next version of psqlODBC (that has just gone into CVS tip after
months of work and debate) uses it, and would break almost completely
should it be removed, therefore any backwards-incompatible change
should be avoided imho. And 2 or 4 could cause chaos for Windows users
if different DLL builds get mixed up.

Regards, Dave.
On Wed, Apr 12, 2006 at 05:03:32PM +0100, Dave Page wrote:

<about the declaration of PQgetssl>
> The next version of psqlODBC (that has just gone into CVS tip after
> months of work and debate) uses it, and would break almost completely
> should it be removed, therefore any backwards incompatible change
> should be avoided imho. And 2 or 4 could cause chaos for Windows users
> if different DLL builds get mixed up.

Hmm, may I ask what it uses it for? Just to get information, or
something more substantial?

Thanks in advance,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: 12 April 2006 17:15
> To: Dave Page
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> On Wed, Apr 12, 2006 at 05:03:32PM +0100, Dave Page wrote:
>
> <about the declaration of PQgetssl>
> > The next version of psqlODBC (that has just gone into CVS tip after
> > months of work and debate) uses it, and would break almost completely
> > should it be removed, therefore any backwards incompatible change
> > should be avoided imho. And 2 or 4 could cause chaos for Windows
> > users if different DLL builds get mixed up.
>
> Hmm, may I ask what it uses it for? Just to get information,
> or something more substantial?

The driver implements all versions of the wire protocol itself, but if
libpq is available at runtime (it will dynamically load it on platforms
that support it) it can use it for connection setup so features like
SSL can be provided easily. I'm still not overly familiar with how it
works yet, but I'm sure Hiroshi (CC'd) can provide further details if
you need them.

Regards, Dave.
Martijn van Oosterhout <kleptog@svana.org> writes:
> 1. Changing it to always return (void*), irrespective of SSL
> ...
> Personally, I'm in favour of 1, because then we can get rid of the
> #include for openssl, so users don't have to have openssl headers
> installed to compile postgresql programs.

I like that too. I've never been very happy about having libpq-fe.h
depending on USE_SSL.

There is a more serious issue here though: if we allow more than one
SSL library, what exactly can an application safely do with the
returned pointer? It strikes me as very dangerous for the app to assume
it knows which SSL library is underneath libpq. It's not at all hard to
imagine an app getting an OpenSSL struct pointer and trying to pass it
to GnuTLS or vice versa. To the extent that there are apps out there
that depend on doing something with this function, I think that even
contemplating supporting multiple SSL libraries is a threat.

			regards, tom lane
Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
>
>> 1. Changing it to always return (void*), irrespective of SSL
>> ...
>> Personally, I'm in favour of 1, because then we can get rid of the
>> #include for openssl, so users don't have to have openssl headers
>> installed to compile postgresql programs.
>
> I like that too. I've never been very happy about having libpq-fe.h
> depending on USE_SSL.
>
> There is a more serious issue here though: if we allow more than one
> SSL library, what exactly can an application safely do with the
> returned pointer? It strikes me as very dangerous for the app to
> assume it knows which SSL library is underneath libpq. It's not at all
> hard to imagine an app getting an OpenSSL struct pointer and trying to
> pass it to GnuTLS or vice versa. To the extent that there are apps out
> there that depend on doing something with this function, I think that
> even contemplating supporting multiple SSL libraries is a threat.

I wonder if there are apps that actually use the ssl pointer, beyond
detection of encrypted connections. So interpreting the result as bool
would be sufficient.

Regards,
Andreas
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > 1. Changing it to always return (void*), irrespective of SSL
> > ...
> > Personally, I'm in favour of 1, because then we can get rid of the
> > #include for openssl, so users don't have to have openssl headers
> > installed to compile postgresql programs.
>
> I like that too. I've never been very happy about having libpq-fe.h
> depending on USE_SSL.

I'm all in favor of dropping the dependency on OpenSSL headers from
libpq, just to throw my 2 cents in there.

> There is a more serious issue here though: if we allow more than one
> SSL library, what exactly can an application safely do with the
> returned pointer? It strikes me as very dangerous for the app to
> assume it knows which SSL library is underneath libpq. It's not at all
> hard to imagine an app getting an OpenSSL struct pointer and trying to
> pass it to GnuTLS or vice versa. To the extent that there are apps out
> there that depend on doing something with this function, I think that
> even contemplating supporting multiple SSL libraries is a threat.

I'm afraid the way to do this would probably be to have it return a
Postgres-defined structure (without depending on if it's compiled with
SSL or not) which then indicates if the connection is SSL-enabled or
not, and then probably other 'common' information (remote DN, remote
CA, ASN.1-formatted certificate perhaps, etc...).

	Thanks,

		Stephen
* Andreas Pflug (pgadmin@pse-consulting.de) wrote:
> I wonder if there are apps that actually use the ssl pointer, beyond
> detection of encrypted connections. So interpreting the result as bool
> would be sufficient.

I'm not sure if there are apps out there which use it for anything but
a bool, but there's certainly a potential for apps to want to do things
like get the DN of the remote server...

	Thanks,

		Stephen
On Wed, Apr 12, 2006 at 01:42:51PM -0400, Stephen Frost wrote:
> * Andreas Pflug (pgadmin@pse-consulting.de) wrote:
> > I wonder if there are apps that actually use the ssl pointer, beyond
> > detection of encrypted connections. So interpreting the result as
> > bool would be sufficient.
>
> I'm not sure if there are apps out there which use it for anything but
> a bool but there's certainly a potential for apps to want to do things
> like get the DN of the remote server...

Strangely enough, the SSL code in libpq stores the peer DN and CN,
except they don't appear to be available to the client...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
On Wed, Apr 12, 2006 at 12:32:01PM -0400, Tom Lane wrote:
> There is a more serious issue here though: if we allow more than one
> SSL library, what exactly can an application safely do with the
> returned pointer? It strikes me as very dangerous for the app to
> assume it knows which SSL library is underneath libpq. It's not at all
> hard to imagine an app getting an OpenSSL struct pointer and trying to
> pass it to GnuTLS or vice versa. To the extent that there are apps out
> there that depend on doing something with this function, I think that
> even contemplating supporting multiple SSL libraries is a threat.

The only real way to a solution is to work out why people want the
pointer. So far I've found two reasons:

- People want to hijack the connection after libpq has set it up to do
  their own processing.

- People want to examine the certificates more closely.

The first would be easily handled by providing a formal interface for
libpq to hijack the connection with, providing read/write and maybe a
few others. The latter is trickier. You're invariably going to run into
the problem where the app uses one lib and libpq the other.

Other than DN and CN, what else would people want?
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> > There is a more serious issue here though: if we allow more than one
> > SSL library, what exactly can an application safely do with the
> > returned pointer? It strikes me as very dangerous for the app to
> > assume it knows which SSL library is underneath libpq. It's not at
> > all hard to imagine an app getting an OpenSSL struct pointer and
> > trying to pass it to GnuTLS or vice versa. To the extent that there
> > are apps out there that depend on doing something with this
> > function, I think that even contemplating supporting multiple SSL
> > libraries is a threat.
>
> The only real way to a solution is to work out why people
> want the pointer. So far I've found two reasons:
>
> - People want to hijack the connection after libpq has set it
>   up to do their own processing.
>
> - People want to examine the certificates more closely.
>
> The first would be easily handled by providing a formal
> interface for libpq to hijack the connection with, providing
> read/write and maybe a few others. The latter is trickier.
> You're invariably going to run into the problem where the app
> uses one lib and libpq the other.
>
> Other than DN and CN, what else would people want?

Issuer (name and certificate), validity dates, basic constraints, key
usage, possibly fingerprint.

//Magnus
On Wed, Apr 12, 2006 at 08:14:58PM +0200, Magnus Hagander wrote:
> > Other than DN and CN, what else would people want?
>
> Issuer (name and certificate), validity dates, basic constraints, key
> usage, possibly fingerprint.

GnuTLS handles this with just one function:

gnutls_x509_crt_get_dn_by_oid( cert, oid, index, raw, &data, &length )

And a whole pile of #defines:

#define GNUTLS_OID_X520_COUNTRY_NAME "2.5.4.6"
#define GNUTLS_OID_X520_ORGANIZATION_NAME "2.5.4.10"
#define GNUTLS_OID_X520_ORGANIZATIONAL_UNIT_NAME "2.5.4.11"
etc...

Which is nice, because then end users can code in the attributes they
want and we don't have to deal with the endless variations. I don't
however know enough to know if this (with a function to get OIDs by
index) is sufficient to extract all the information from the
certificate. Presumably OpenSSL can do this too...
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
"Magnus Hagander" <mha@sollentuna.net> writes:
>> Other than DN and CN, what else would people want?

> Issuer (name and certificate), validity dates, basic constraints, key
> usage, possibly fingerprint.

I think that way madness lies --- do we really want to commit to
re-inventing an SSL API that will cover anything someone might want
to do with either underlying library?

Moreover, this does not fix the problem: an existing app that thinks it
can pass the returned pointer to an OpenSSL routine will still crash
the moment a GnuTLS version of libpq is put under it. Case in point:
psql, as currently coded.

An idea that just occurred to me is to define PQgetssl as "return SSL*
if we are using OpenSSL for this connection; else return NULL". Then
add a parallel routine (maybe PQgetgnussl?) defined as returning the
equivalent GnuTLS handle, only if we are using GnuTLS for this
connection. (Presumably, in any one build of libpq, one of the pair of
routines would be an always-returns-null stub.)

The advantage of this is that an app knows what it'll get, and an app
that's only familiar with one of the two SSL libraries will not be
given a pointer it can't use.

I'd still want to adopt Martijn's idea of declaring both of 'em as
returning void *, to avoid depending on other packages' include files.

			regards, tom lane
Martijn van Oosterhout wrote:
> On Wed, Apr 12, 2006 at 05:03:32PM +0100, Dave Page wrote:
>
> <about the declaration of PQgetssl>
>
>> The next version of psqlODBC (that has just gone into CVS tip after
>> months of work and debate) uses it, and would break almost completely
>> should it be removed, therefore any backwards incompatible change
>> should be avoided imho. And 2 or 4 could cause chaos for Windows
>> users if different DLL builds get mixed up.
>
> Hmm, may I ask what it uses it for? Just to get information, or
> something more substantial?

In case of SSL mode, the driver gets the communication path using
PQsocket() or PQgetssl() after calling PQconnectdb(). The driver
communicates with the server by itself using the path. In case of
non-SSL mode, the driver never calls the libpq API at all.

regards,
Hiroshi Inoue
On Wed, Apr 12, 2006 at 05:00:17PM -0400, Tom Lane wrote:
> > Issuer (name and certificate), validity dates, basic constraints,
> > key usage, possibly fingerprint.
>
> I think that way madness lies --- do we really want to commit to
> re-inventing an SSL API that will cover anything someone might want
> to do with either underlying library?

Indeed. There's also the issue that the underlying system may not be
using what you think it is. e.g. GnuTLS can authenticate on PGP keys
rather than x509 certificates.

There's still the mystery of libpq extracting the peer DN and CN but
never passing them on to the user.

> An idea that just occurred to me is to define PQgetssl as "return SSL*
> if we are using OpenSSL for this connection; else return NULL". Then
> add a parallel routine (maybe PQgetgnussl?) defined as returning the
> equivalent GnuTLS handle, only if we are using GnuTLS for this
> connection. (Presumably, in any one build of libpq, one of the pair of
> routines would be an always-returns-null stub.)

Alternatively, create a new function PQgetsslinfo() that returns both
the library name and a (void) pointer. In any case the old interface
can never return anything other than a pointer for OpenSSL.

> I'd still want to adopt Martijn's idea of declaring both of 'em as
> returning void *, to avoid depending on other packages' include files.

Ack, at least we can get that out of the way. It doesn't change
anything from the user's point of view, other than they know for sure
what the signature is.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
On Wed, Apr 12, 2006 at 05:25:47PM +0100, Dave Page wrote:
> The driver implements all versions of the wire protocol itself, but if
> libpq is available at runtime (it will dynamically load it on
> platforms that support it) it can use it for connection setup so
> features like SSL can be provided easily. I'm still not overly
> familiar with how it works yet, but I'm sure Hiroshi (CC'd) can
> provide further details if you need them.

Right, so what you're basically doing is setting up the connection via
libpq, then grabbing the SSL pointer and using that to continue
communicating. If it's not SSL, you use PQsocket to get the socket and
continue from there.

Unorthodox usage, but it should work.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: 13 April 2006 07:58
> To: Dave Page
> Cc: pgsql-hackers@postgresql.org; Hiroshi Inoue
> Subject: Re: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> On Wed, Apr 12, 2006 at 05:25:47PM +0100, Dave Page wrote:
> > The driver implements all versions of the wire protocol itself, but
> > if libpq is available at runtime (it will dynamically load it on
> > platforms that support it) it can use it for connection setup so
> > features like SSL can be provided easily. I'm still not overly
> > familiar with how it works yet, but I'm sure Hiroshi (CC'd) can
> > provide further details if you need them.
>
> Right, so what you're basically doing is setting up the
> connection via libpq then grabbing the SSL pointer and using
> that to continue communicating. If it's not SSL you use
> PQsocket to get the socket and continue from there.

Yup.

> Unorthodox usage, but it should work.

Well, we had a pure custom implementation of the protocol, had a pure
libpq-based version, and after much discussion decided that the best
version of all was the hybrid, as it allowed us to hijack features like
SSL, Kerberos, pgpass et al, yet not be constrained by the limitations
of libpq, or copy query results about so much.

Regards, Dave
On Thu, Apr 13, 2006 at 08:48:54AM +0100, Dave Page wrote:
> Well, we had a pure custom implementation of the protocol, had a pure
> libpq based version and after much discussion decided that the best
> version of all was the hybrid as it allowed us to hijack features like
> SSL, Kerberos, pgpass et al, yet not be constrained by the limitations
> of libpq, or copy query results about so much.

Right. Would you see value in a more formal libpq "hijack-me" interface
that would support making the initial connection and then handing off
the rest to something else?

I'm wondering because obviously with the current setup, if libpq is
compiled with SSL support, psqlODBC must also be. Are there any points
where you have to fight libpq over control of the socket?

I'm thinking that such an interface would need to provide the
following:

read (sync/async)
write (sync/async)
getfd (for select/poll)
ispending (is there stuff to do)
release (for when you're finished)

Is there anything else you might need?
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: 13 April 2006 09:15
> To: Dave Page
> Cc: pgsql-hackers@postgresql.org; Hiroshi Inoue
> Subject: Re: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> On Thu, Apr 13, 2006 at 08:48:54AM +0100, Dave Page wrote:
> > Well, we had a pure custom implementation of the protocol, had a
> > pure libpq based version and after much discussion decided that the
> > best version of all was the hybrid as it allowed us to hijack
> > features like SSL, Kerberos, pgpass et al, yet not be constrained by
> > the limitations of libpq, or copy query results about so much.
>
> Right. Would you see value in a more formal libpq "hijack-me"
> interface that would support making the initial connection
> and then handing off the rest to something else?
>
> I'm wondering because obviously with the current setup, if
> libpq is compiled with SSL support, psqlODBC must also be.
> Are there any points where you have to fight libpq over
> control of the socket?
>
> I'm thinking that such an interface would need to provide the
> following:
>
> read (sync/async)
> write (sync/async)
> getfd (for select/poll)
> ispending (is there stuff to do)
> release (for when you're finished)
>
> Is there anything else you might need?

I'll have to let Hiroshi comment on that as he wrote the code. I've
only skimmed over it a few times so far.

Regards, Dave.
* Martijn van Oosterhout (kleptog@svana.org) wrote:
> On Thu, Apr 13, 2006 at 08:48:54AM +0100, Dave Page wrote:
> > Well, we had a pure custom implementation of the protocol, had a
> > pure libpq based version and after much discussion decided that the
> > best version of all was the hybrid as it allowed us to hijack
> > features like SSL, Kerberos, pgpass et al, yet not be constrained by
> > the limitations of libpq, or copy query results about so much.
>
> Right. Would you see value in a more formal libpq "hijack-me"
> interface that would support making the initial connection and then
> handing off the rest to something else?
>
> I'm wondering because obviously with the current setup, if libpq is
> compiled with SSL support, psqlODBC must also be. Are there any points
> where you have to fight libpq over control of the socket?
[...]
> Is there anything else you might need?

Instead of having it hijack the libpq connection and implement the
wireline protocol itself, why don't we work on fixing the problems
(such as the double-copying that libpq requires) in libpq, to allow the
driver (and others!) to use it in the 'orthodox' way?

I would have spoken up on the ODBC list if I understood that 'hybrid'
really meant 'just using libpq for connection/authentication'. I really
think it's a bad idea to have the ODBC driver reimplement the wireline
protocol, because that protocol does change from time to time, and
someone using libpq will hopefully have fewer changes (and thus
easier-to-maintain code) than someone implementing the wireline
protocol themselves (just causing more busy-work that, as we saw in the
past with the ODBC driver, may end up taking *forever* for someone to
be able to commit the extra required time to implement).

	Thanks,

		Stephen
On Thu, Apr 13, 2006 at 06:44:12AM -0400, Stephen Frost wrote:
> Instead of having it hijack the libpq connection and implement the
> wireline protocol itself, why don't we work on fixing the problems
> (such as the double-copying that libpq requires) in libpq to allow the
> driver (and others!) to use it in the 'orthodox' way?

Ok. I'm not sure what this "double copying" you're referring to is, but
I'd certainly like to know why people are reimplementing the protocol
(psqlODBC is hardly the only one).

Is it that people want to use completely different interaction models?
Like working around the wait-for-whole-resultset-before-returning
issue? Or maybe better notice handling? What is it that's so deficient?
Or maybe it's portability? Like the DBI PgPP module?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> -----Original Message-----
> From: Stephen Frost [mailto:sfrost@snowman.net]
> Sent: 13 April 2006 11:44
> To: Martijn van Oosterhout
> Cc: Dave Page; pgsql-hackers@postgresql.org; Hiroshi Inoue
> Subject: Re: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> Instead of having it hijack the libpq connection and
> implement the wireline protocol itself, why don't we work on
> fixing the problems (such as the double-copying that libpq
> requires) in libpq to allow the driver (and others!) to use
> it in the 'orthodox' way?
>
> I would have spoken up on the ODBC list if I understood that 'hybrid'
> really meant 'just using libpq for connection/authentication'.
> I really think it's a bad idea to have the ODBC driver reimplement
> the wireline protocol because that protocol does change from time
> to time and someone using libpq will hopefully have fewer changes
> (and thus makes the code easier to maintain) than someone
> implementing the wireline protocol themselves (just causing
> more busy-work that, at least we saw in the past with the
> ODBC driver, may end up taking *forever* for someone to be
> able to commit the extra required time to implement).

This has been the subject of discussion for many months, and the
consensus was that the most effective approach was the hybrid one,
which has now been moved into CVS tip. Those involved are fully aware
of the maintenance issues of implementing the wire protocol in the
driver, as well as the difficulties using libpq entirely caused (that
is how the 08.01.xxxx driver works). Changing direction again simply
isn't going to happen.

Regards, Dave
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: 13 April 2006 11:54
> To: Dave Page; pgsql-hackers@postgresql.org; Hiroshi Inoue
> Subject: Re: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> On Thu, Apr 13, 2006 at 06:44:12AM -0400, Stephen Frost wrote:
> > Instead of having it hijack the libpq connection and implement the
> > wireline protocol itself, why don't we work on fixing the problems
> > (such as the double-copying that libpq requires) in libpq to allow
> > the driver (and others!) to use it in the 'orthodox' way?
>
> Ok. I'm not sure what this "double copying" you're referring
> to is, but I'd certainly like to know why people are
> reimplementing the protocol (psqlODBC is hardly the only one).

The libpq driver copies results out of the PGresult struct into the
internal QueryResult classes. With libpq out of the loop, data can go
straight from the wire into the QR.

There are elements of the wire protocol that libpq doesn't actually
implement, from what I recall. IIRC, they were added specifically for
JDBC, but also intended to be used by psqlODBC as well. I forget the
details though, as I wasn't so involved with the ODBC development back
then.

In addition of course, implementing the protocol natively does allow
for maximum flexibility.

Regards, Dave.
On Thu, Apr 13, 2006 at 12:12:25PM +0100, Dave Page wrote:
> > Ok. I'm not sure what this "double copying" you're referring
> > to is,
>
> The libpq driver copies results out of the PGresult struct into the
> internal QueryResult classes. With libpq out of the loop, data can go
> straight from the wire into the QR.

Hmm, the simplest improvement I can think of is one where you register
a callback that libpq calls whenever it has received a new tuple.

However, w.r.t. the copying, the pointers in the PGresult are in memory
belonging to that result. As long as that PGresult hangs around, you
should be able to just copy the pointers rather than the data? Or is
this unacceptable? The only alternative I can think of is to let users
provide a callback that is given the number of bytes and returns memory
to store the data into. But that just seems unnecessarily complex,
considering you could just copy the pointers.

> There are elements of the wire protocol that libpq doesn't actually
> implement from what I recall. IIRC, they were added specifically for
> JDBC but also intended to be used by psqlODBC as well. I forget the
> details though as I wasn't so involved with the ODBC development back
> then.

Ugh, that's terrible. How do these features get tested if nothing
within the main tree implements them?

> In addition of course, implementing the protocol natively does allow
> for maximum flexibility.

Maybe, but it should be possible to have a lot of flexibility without
having many projects jump through all sorts of hoops every time a new
protocol version is created.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: 13 April 2006 12:34
> To: Dave Page
> Cc: pgsql-hackers@postgresql.org; Hiroshi Inoue
> Subject: Re: [HACKERS] Practical impediment to supporting
> multiple SSL libraries
>
> However, w.r.t. the copying, the pointers in the PGresult are
> in memory belonging to that result. As long as that PGresult
> hangs around, you should be able to just copy the pointers
> rather than the data? Or is this unacceptable?

It copies the data. I can't think offhand why it was implemented that
way, but then I didn't write the code (Anoop & Siva @ Pervasive did).
Anyhoo, as I've said, that approach has now been abandoned anyway in
favour of Hiroshi's, so it's him you'd need to convince to change. The
rest of us have only just started re-learning the code.

Regards, Dave
* Dave Page (dpage@vale-housing.co.uk) wrote:
> This has been the subject of discussion for many months and the
> consensus was that the most effective approach was the hybrid one
> which has now been moved into CVS tip. Those involved are fully aware
> of the maintenance issues of implementing the wire protocol in the
> driver, as well as the difficulties using libpq entirely caused (that
> is how the 08.01.xxxx driver works). Changing direction again simply
> isn't going to happen.

There was barely any discussion at all about this... I do follow the
lists involved, even though I didn't respond to the question regarding
this (either time it was asked), because I didn't understand that
'hybrid' meant 'only using libpq for the connection'. I'm curious how
many others of those being asked understood this... I think the fact
that you had to ask twice to get any response at all is a good
indication.

Does the latest version in CVS support V3 of the wireline protocol? If
I recall correctly, the version it was based on still only supported
V2...

What does the wireline protocol implementation in the ODBC driver do
that it can't get through libpq? I can certainly understand the
double-copying issue (I complained about that myself when first
starting to use libpq), but I think that could be fixed without that
much difficulty. Were there other things?

	Thanks,

		Stephen
> -----Original Message----- > From: Stephen Frost [mailto:sfrost@snowman.net] > Sent: 13 April 2006 12:56 > To: Dave Page > Cc: Martijn van Oosterhout; pgsql-hackers@postgresql.org; > Hiroshi Inoue > Subject: Re: [HACKERS] Practical impediment to supporting > multiple SSL libraries > > There was barely any discussion at all about this... I do > follow the lists involved even though I didn't respond to the > question regarding this (either time it was asked) because I > didn't understand that 'hybrid' meant 'only using libpq for > the connection'. I'm curious how many others of those being > asked understood this... I think the fact that you had to > ask twice to get any response at all is a good indication. There was extensive off-list discussion between all the active developers before we explained the situation on list, created the test builds, announced the fact that the code was in CVS and asked for feedback from users. Most of the initial discussion occurred off-list because there were issues of commercial support to consider that at the time should not have been done in public (in a nutshell, we didn't want to piss Pervasive off). > Does the latest verion in CVS support V3 of the wireline > protocol? If I recall correctly, the version it was based on > still only supported V2... Yes, it supports v3. > What does the wireline protocol implementation in the ODBC > driver do that it can't get through libpq? I can certainly > understand the double-copying issue (I complained about that > myself when first starting to use libpq) but I think that > could be fixed without that much difficulty. Were there other things? I don't know if we are currently using any features that libpq cannot offer. I do know that although the older driver basically worked with libpq, major features (such as updateable cursors) were broken beyond feasible repair. 
They would have had to have been almost entirely redesigned, and given that we have enough trouble finding developers with enough time and the ability to fix even relatively simple bugs in the driver it seemed more sensible to go with the solution that worked properly, yet still offered the features (v3, SSL, Kerberos) that we wanted from libpq. The only downside is that we might have to update for any future protocols again, but even that is not essential given that the server will fall back to v2 and presumably v3 when v4 is written. Regards, Dave
* Martijn van Oosterhout (kleptog@svana.org) wrote:
> On Thu, Apr 13, 2006 at 12:12:25PM +0100, Dave Page wrote:
> > > Ok. I'm not sure what this "double copying" you're referring
> > > to is,
> >
> > The libpq driver copies results out of the PGresult struct into the
> > internal QueryResult classes. With libpq out of the loop, data can go
> > straight from the wire into the QR.
>
> Hmm, the simplest improvement I can think of is one where you
> register a callback that libpq calls whenever it has received a new
> tuple.

You wouldn't want it on every tuple as that'd get expensive through function calls.

> However, w.r.t. the copying, the pointers you get in a PGresult are in memory
> belonging to that result. As long as that PGresult hangs around, you
> should be able to just copy the pointers rather than the data? Or is
> this unacceptable?

It's actually pretty common (or seems to be, anyway) to want to store the data from the query result into your own data structure. Yes, you could just use pointers all over the place, but that means you're going to have to use things which understand PGresult everywhere, as opposed to having a generic 'storage manager' with other generic things (index creator, aggregator, etc) which can be used with more than just PGresults.

> The only alternative I can think of is let users provide a callback
> that is given the number of bytes and it returns memory to store the
> data into. But that just seems unnecessarily complex, considering you
> could just copy the pointers.

You don't provide a callback, you have the user provide a memory region to libpq which libpq can then fill in.
It's really not that difficult, the API would really look quite a bit like PQexecParams, ie:

int PQgetTuples(PGresult *res,      // Returns number of tuples populated
      const int max_ntuples,        // Basically buffer size
      char *result_set,             // Destination buffer
      const int *columnOffsets,     // integer array of offsets
      const int *columnLengths,     // integer array of lengths, for checks
      const int record_len,         // Length of each structure
      int *columnNulls,             // 0/1 for is not null / is null
      int resultFormat);            // Or maybe just binary?

If we want to do conversion of the data in some way then we may need to expand this to include that ability (but I don't think PQgetvalue does, so...).

> > There are elements of the wire protocol that libpq doesn't actually
> > implement from what I recall. IIRC, they were added specifically for
> > JDBC but also intended to be used by psqlODBC as well. I forget the
> > details though as I wasn't so involved with the ODBC development back
> > then.
>
> Ugh, that's terrible. How do these features get tested if nothing
> within the main tree implements them.

I fully agree with this sentiment...

> > In addition of course, implementing the protocol natively does allow for
> > maximum flexibility.
>
> Maybe, but it should be possible to have a lot of flexibility without
> having many projects jump through all sorts of hoops every time a new
> protocol version is created.

Indeed.

Thanks, Stephen
On Thu, Apr 13, 2006 at 12:48:06PM +0100, Dave Page wrote:
> Anyhoo, as I've said, that approach has now been abandoned anyway in
> favour of Hiroshi's, so it's him you'd need to convince to change. The
> rest of us have only just started re-learning the code.

Well, I quickly scanned the code in CVS to see what I could find out. There are a few features the psqlodbc tuplereader has that libpq doesn't.

1. It reads tuples as you go through the data. The resultset has a cursor; it only processes the data as you request it.
2. It reads directly from the socket into a per-tuple malloc()ed field.
3. It extracts per-row tids directly into a separate array.
4. The resulting resultset can be updated and modified as well as appended to. This requires freeing and adding rows, and committing the result. This is probably your updatable cursors.

So in fact what you really want is libpq as a protocol decoder, but you want to manage your resultset yourself. And you want to be able to let users handle incoming data as it comes rather than waiting for the whole set.

I don't think the zero-copy is relevant; the code is not written in a way that suggests speed was an issue. Rather, I think the way you want to use the resultset is the issue. You can't use the memory in the PGresult because then you'd need to track which tuples were allocated by you and which were allocated by libpq. The resulting copying is needless, along with the fact that you double your memory usage.

In fact, I can think of a number of other projects that would like an alternative. For example, a Perl module would want to load the strings directly into blessed perl strings rather than keep a copy of the resultset around. I think this would be a worthwhile addition to the libpq interface. I'll see if I can come up with a proposal (whether it'll get implemented is another issue entirely).

Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
* Dave Page (dpage@vale-housing.co.uk) wrote:
> > What does the wireline protocol implementation in the ODBC
> > driver do that it can't get through libpq? I can certainly
> > understand the double-copying issue (I complained about that
> > myself when first starting to use libpq) but I think that
> > could be fixed without that much difficulty. Were there other things?
>
> I don't know if we are currently using any features that libpq cannot
> offer.
>
> I do know that although the older driver basically worked with libpq,
> major features (such as updateable cursors) were broken beyond feasible
> repair. They would have had to have been almost entirely redesigned, and
> given that we have enough trouble finding developers with enough time
> and the ability to fix even relatively simple bugs in the driver it
> seemed more sensible to go with the solution that worked properly, yet
> still offered the features (v3, SSL, Kerberos) that we wanted from

Updatable cursors aren't something supported in the core system yet, but they're clearly useful and appear to be part of the spec. It'd be really nice to implement this as part of core instead of having it reimplemented by multiple different people. I think it'd be time better spent to implement it in core than to redesign it in the ODBC driver to work with the current libpq. As I understand it, this would probably also be useful to the JDBC people.

> libpq. The only downside is that we might have to update for any future
> protocols again, but even that is not essential given that the server
> will fall back to v2 and presumably v3 when v4 is written.

Perhaps not *essential*, but certainly a good thing to do, as it can provide performance and functionality improvements...

Thanks, Stephen
* Martijn van Oosterhout (kleptog@svana.org) wrote: > Well, I quickly scanned the code in CVS to see what I could find out. Wow, that was quick. :) > So in fact what you really want is libpq as a protocol decoder but want > to manage your resultset yourself. And you want to be able to let users > handle incoming data as it comes rather than waiting for the whole set. The data-as-it-comes bit could be done w/ a Postgres cursor, couldn't it? But then you have to read through all the data using PQgetResult, which isn't much fun. > I don't think the zero-copy is relevent, the code is not written in a > way that suggests speed was an issue. Rather I think the way you want > to use the resultset is the issue. You can't use the memory in the > PGresult because then'd you need to track which tuples were allocated > by you and which we allocated by libpq. The resulting copying is > needless, along with the fact that you double your memory usage. The double memory usage definitely sucks but I really think speed would also be greatly improved by removing the double copying and all the function calls dealing with PQgetResult, etc... > In fact, can think that a number of other projects would like an > alternative. For example, a Perl module would want to load the strings > directly into blessed perl strings rather than keep a copy of the > resultset around. I think this would be a worthwhile addition to the > libpq interface. Me too. :) > I'll see if I can come up with a proposal (whether it'll get > implemented is another issue entirely). I'd be interested in trying to help with this too.. Thanks, Stephen
> -----Original Message----- > From: Stephen Frost [mailto:sfrost@snowman.net] > Sent: 13 April 2006 14:03 > To: Martijn van Oosterhout > Cc: Dave Page; pgsql-hackers@postgresql.org; Hiroshi Inoue > Subject: Re: [HACKERS] Practical impediment to supporting > multiple SSL libraries > > * Martijn van Oosterhout (kleptog@svana.org) wrote: > > Well, I quickly scanned the code in CVS to see what I could > find out. > > Wow, that was quick. :) Yes :-) > > I don't think the zero-copy is relevent, the code is not > written in a > > way that suggests speed was an issue. Rather I think the > way you want > > to use the resultset is the issue. You can't use the memory in the > > PGresult because then'd you need to track which tuples were > allocated > > by you and which we allocated by libpq. The resulting copying is > > needless, along with the fact that you double your memory usage. > > The double memory usage definitely sucks but I really think > speed would also be greatly improved by removing the double > copying and all the function calls dealing with PQgetResult, etc... Don't forget that the code now in CVS-tip is not the code that had the copy issue. As of last Saturday, the hybrid version was moved to tip. Regards, Dave.
On Thu, Apr 13, 2006 at 08:32:34AM -0400, Stephen Frost wrote:
> * Martijn van Oosterhout (kleptog@svana.org) wrote:
> > Hmm, the simplest improvement I can think of is one where you
> > register a callback that libpq calls whenever it has received a new
> > tuple.
>
> You wouldn't want it on every tuple as that'd get expensive through
> function calls.

Why not? Internally we call pqAddTuple for every tuple; calling a user function instead is hardly going to be more expensive. Also, I was thinking of the situation where the user function could set a flag so the eventual caller of (perhaps) PQconsumeInput knows that it's got enough for now.

> It's actually pretty common (or seems to be anyway) to want to store the
> data from the query result into your own data structure. Yes, you could
> just use pointers all over the place but that means you're going to have
> to use things which understand PQresult everywhere as opposed to having a
> generic 'storage manager' with other generic things (index creator,
> aggregator, etc) which can be used with more than just PQresults.

<snip>

> You don't provide a callback, you have the user provide a memory region
> to libpq which libpq can then fill in. It's really not that difficult,
> the API would really look quite a bit like PQexecParams, ie:

Except in the case of psqlODBC, it wants to be able to malloc()/free() each field, which your method doesn't solve. Also, it doesn't solve the duplicate memory use, nor the retrieving of rows before the resultset is complete.

> If we want to do conversion of the data in some way then we may need to
> expand this to include that ability (but I don't think PQgetvalue does,
> so...).

I think a callback is much easier. As a bonus, the user could specify that libpq doesn't need to remember the rows. Memory savings.

Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
* Martijn van Oosterhout (kleptog@svana.org) wrote:
> On Thu, Apr 13, 2006 at 08:32:34AM -0400, Stephen Frost wrote:
> > You wouldn't want it on every tuple as that'd get expensive through
> > function calls.
>
> Why not? Internally we call pqAddTuple for every tuple, calling a user
> function instead is hardly going to be more expensive. Also, I was
> thinking of the situation where the user function could set a flag
> so the eventual caller of (perhaps) PQconsumeInput knows that it's got
> enough for now.

Hrmpf, the fact that we have a different call we make for every tuple anyway isn't exactly encouraging to me.

> > You don't provide a callback, you have the user provide a memory region
> > to libpq which libpq can then fill in. It's really not that difficult,
> > the API would really look quite a bit like PQexecParams, ie:
>
> Except in the case of psqlODBC, it wants to be able to malloc/free()
> each field, which your method doesn't solve. Also, it doesn't solve the
> duplicate memory use, nor the retrieving of rows before the resultset
> is complete.

I don't entirely follow why you think it wouldn't solve the duplicate memory use (except perhaps in the psqlODBC case, if they decide to just grab a bunch of tuples into one area and then go through and malloc/free each one after that; not exactly what I'd suggest...). The basic idea was actually modeled on read(): you get back what's currently available, which might not be the full set you asked for so far.

I think perhaps you're assuming that my suggestion would just be an overlay on top of the existing libpq PQgetResult which would just turn around and call PQgetResult to fill in the memory region provided by the user. Entirely *not* the case... Perhaps I should have used 'PGconn' instead of 'PGresult' as the first argument and that would have been clearer.

Additionally, honestly, this is very similar to how Oracle's multi-row retrieval works...
It uses two functions (one for setup into its own structure and then one for actually getting rows) but the basic idea is the same. > > If we want to do conversion of the data in some way then we may need to > > expand this to include that ability (but I don't think PQgetvalue does, > > so...). > > I think a callback is much easier. As a bonus the user could specify > that libpq doesn't need to remember the rows. Memory savings. My solution didn't have libpq remembering the rows... Thanks, Stephen
Martijn van Oosterhout <kleptog@svana.org> writes: > Right. Would you see value in a more formal libpq "hijack-me" interface > that would support making the initial connection and then handing off > the rest to something else? I think this would just be busywork... the way ODBC is doing it seems fine to me. In any case, do we really want to encourage random apps to bypass the library? For one thing, with an API such as you suggest, it would really be libpq's problem to figure out what to do with regular vs passthrough calls. As it stands, it's very obviously not libpq's problem anymore once you hijack the socket. regards, tom lane
Martijn van Oosterhout <kleptog@svana.org> writes: > On Thu, Apr 13, 2006 at 12:12:25PM +0100, Dave Page wrote: > > > Ok. I'm not sure what this "double copying" you're referring > > > to is, > > > > The libpq driver copies results out of the PGresult struct into the > > internal QueryResult classes. With libpq out of the loop, data can go > > straight from the wire into the QR. > > Hmm, the simplest improvement I can think of is one where you > register a callback that libpq calls whenever it has received a new > tuple. That could be useful for applications but I think a driver really wants to retain control of the flow of control. To make use of a callback it would have to have an awkward dance of calling whatever function gives libpq license to call the callback, having the callback stuff the data in a temporary space, then checking for new data in the temporary space, and returning it to the user. -- greg
* Martijn van Oosterhout (kleptog@svana.org) wrote:
> Why not? Internally we call pqAddTuple for every tuple, calling a user
> function instead is hardly going to be more expensive. Also, I was
> thinking of the situation where the user function could set a flag
> so the eventual caller of (perhaps) PQconsumeInput knows that it's got
> enough for now.

I went ahead and looked through the libpq source a bit. What I was suggesting looks like it would primarily change getAnotherTuple to, instead of allocating the result memory itself, just store the result into the appropriate place in the user-provided memory space. Thus, getAnotherTuple wouldn't do any allocation and wouldn't call pqAddTuple at all. It would need to keep track of where it is in the user-provided memory area and, if it runs out of space, return back through the 'outofmemory' mechanism.

The new function would basically set up the appropriate structures in the PGconn and then call parseInput(), which would then handle any recently-arrived data and call getAnotherTuple, which would detect that it's dumping data into a user-provided area and would do so until it's finished being called by parseInput() or it runs out of user memory space.

This would be used with the async command processing. A drawback, of course, is that this degenerates to busy-waiting if the application has nothing better to do. Any clue as to whether PQsocket could safely be used in a select()-based system? I'm guessing it could, just never tried that myself. :) Also not sure how to know if there's data which needs to be sent and hasn't been yet for some reason.

Thanks! Stephen
On Thu, Apr 13, 2006 at 09:34:10AM -0400, Stephen Frost wrote:
> * Martijn van Oosterhout (kleptog@svana.org) wrote:
> > Except in the case of psqlODBC, it wants to be able to malloc/free()
> > each field, which your method doesn't solve. Also, it doesn't solve the
> > duplicate memory use, nor the retrieving of rows before the resultset
> > is complete.
>
> I don't entirely follow why you think it wouldn't solve the duplicate
> memory use (except perhaps in the psqlODBC case if they decide to just
> grab a bunch of tuples into one area and then go through and malloc/free
> each one after that, not exactly what I'd suggest...).

Right, I didn't understand that you meant to be doing this synchronously, as the data came in. I thought it was just another way of retrieving the data already received. But given that a stated reason psqlODBC didn't use the libpq interface was the copying of all the data, it would be nice if we had something for that. From looking at your declaration:

int PQgetTuples(PGresult *res,      // Returns number of tuples populated
      const int max_ntuples,        // Basically buffer size
      char *result_set,             // Destination buffer
      const int *columnOffsets,     // integer array of offsets
      const int *columnLengths,     // integer array of lengths, for checks
      const int record_len,         // Length of each structure
      int *columnNulls,             // 0/1 for is not null / is null
      int resultFormat);            // Or maybe just binary?

you seem to be suggesting that all the data be stored in one big memory block at result_set. What do you do if the data is longer than the given length? What does record_len mean (what structures)? Also, you can't specify binary/non-binary here; that's done in the query request. libpq doesn't handle the data differently depending on binaryness. Also, how can you find out the actual length of each value after the call?

Frankly, I'm not seeing much improvement over normal processing. It just seems like yet another data model that won't fit most users.
The definition of PQgetvalue is merely:

    return res->tuples[tup_num][field_num].value;

So we could achieve the same effect by letting people look into the PGresult before the query is finished. The function you suggest would be especially difficult for something like psqlODBC, which has no idea beforehand how long a value could be.

I'm still of the opinion that letting people supply an alternative to pqAddTuple would be cleaner. The interface would look like:

typedef struct pgresAttValue
{
    int   len;      /* length in bytes of the value */
    char *value;    /* actual value, plus terminating zero byte */
} PGresAttValue;

typedef int (*PQtuplecallback)( PGresult *res, PGresAttValue *fields );
int PQsettuplecallback( PGresult *res, PQtuplecallback cb );

fields is simply a pointer to an array of nfields such structures. Users can do whatever they want with the info: store it in their own structure, parse it, throw it away, send it over a network, etc. With this callback I could probably implement your function above fairly straightforwardly.

Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
* Greg Stark (gsstark@mit.edu) wrote: > Martijn van Oosterhout <kleptog@svana.org> writes: > > Hmm, the simplest improvement I can think of is one where you > > register a callback that libpq calls whenever it has received a new > > tuple. > > That could be useful for applications but I think a driver really wants to > retain control of the flow of control. To make use of a callback it would have > to have an awkward dance of calling whatever function gives libpq license to > call the callback, having the callback stuff the data in a temporary space, > then checking for new data in the temporary space, and returning it to the > user. I doubt the callback would be called at some inopportune time... Probably the callback would be passed into a libpq call which then directly calls the callback and is done with it when it returns. The libpq function would certainly need a parameter which is just passed to the callback to allow the system to maintain state (such as how many tuples the callback has processed so far) to avoid ugly global variables but otherwise I don't really see that this is changing the flow of control all that much... I can see how having a callback would be useful though I think for a good number of cases it's just going to be populating a memory region with it and we could cover that common case by providing an API for exactly that. The other issue with a callback is that libpq would have to either call the callback for each value (not my preference) or have some way to pass a whole variable-length tuple to the callback, which would require libpq to allocate memory for the tuple (hopefully only once and not per-tuple) and then build up whatever structure it's going to give to the callback in memory (copy once) and then call the callback which would be required to copy the tuple somewhere else (copy again). 
Of course, all of this is after an initial copy from read() into the read buffer, but I doubt that could be helped (and read()'ing small enough amounts to make it happen wouldn't really improve things). Thanks, Stephen
On Thu, Apr 13, 2006 at 11:14:57AM -0400, Greg Stark wrote:
> That could be useful for applications but I think a driver really wants to
> retain control of the flow of control. To make use of a callback it would have
> to have an awkward dance of calling whatever function gives libpq license to
> call the callback, having the callback stuff the data in a temporary space,
> then checking for new data in the temporary space, and returning it to the
> user.

We have an asynchronous interface. I was thinking like:

PQsendQuery( conn, query );
res = PQgetResult( conn );
gotenough = FALSE;
PQsetcallback( res, mycallback );
while( !gotenough )
    PQconsumeInput(conn);
/* When we reach here we have at least five rows in our data structure */

sub mycallback(res,data)
{
    /* stuff data in memory structure */
    if( row_count > 5 )
        gotenough = TRUE;
}

If you set non-blocking you can even go off and do other things while waiting. No need for temporary space... Does this seem too complex?

-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Martijn van Oosterhout <kleptog@svana.org> writes: > ...you seem to be suggesting that all the data be stored in one big memory > block at resultset. I didn't like that either; it assumes far too much about what the application needs to do. I think what's wanted is a callback hook that lets the app decide where and how to store the data. Not sure what the hook's API should be exactly, though. regards, tom lane
* Martijn van Oosterhout (kleptog@svana.org) wrote: > Right, I didn't understand that you meant to be doing this > synchronously, as the data came in. I thought it was just another way > of retreiving the data already received. But given that a stated reason > that psqlODBC didn't use the libpq interface was due to the copying of > all the data, it would be nice if we had something for that. From > looking at your declaration: > > int PQgetTuples(PGresult *res, // Returns number of tuples populated > const int max_ntuples, // Basically buffer size > char *result_set, // Destination buffer > const int *columnOffsets, // integer array of offsets > const int *columnLengths, // integer array of lengths, for checks > const int record_len, // Length of each structure > int *columnNulls, // 0/1 for is not null / is null > int resultFormat); // Or maybe just binary? > > you seem to be suggesting that all the data be stored in one big memory > block at resultset. The current block would be stored in one big memory block, yes. Basically a malloc(BUF_SIZE*sizeof(my_structure)); > What do you do if the data is longer than the given length? What does > record_len mean (what structures)? Also, you can't specify > binary/non-binary here, that's done in the query request. libpq doesn't > handle the data differently depending on binaryness. Also, how can you > find out the actual length of each value after the call? hmm, ok, binary/non-binary can be dropped then. If the data is longer than the length then you return and let the caller figure out what it wants to do (realloc, malloc another area, etc). Record_len is just the size of each record, so the amount to skip from the start to get to record #2. libpq just needs it in getAnotherTuple to calculate the place to put the next value. 
Finding the actual length is a good point; I should have included (as execParams has) an integer array for this, which could actually replace columnNulls and carry a special indication when a column's value is null.

> Frankly I'm not seeing much improvement over normal processing. It just
> seems like yet another data-model that won't fit most users. The
> definition of PQgetvalue is merely:
>
> return res->tuples[tup_num][field_num].value;
>
> So we could achieve the same effect by letting people look into
> PQresult before the query is finished. The function you suggest would
> be especially difficult for something like psqlODBC which has no idea
> beforehand how long a value could be.

I don't think it's quite the same effect... :P It's not exactly uncommon for people to know their data structure and to have defined a struct for it, allocate a block of memory and then want to just loop through the memory in a for() loop based on the structure size. This has been pretty common in standalone applications I've seen, and it's really nice to be able to have a database just dump the results of a query into such a structure.

> I'm still of the opinion that letting people supply an alternative to
> pqAddTuple would be cleaner. The interface would look like:
>
> typedef struct pgresAttValue
> {
>     int len;      /* length in bytes of the value */
>     char *value;  /* actual value, plus terminating zero byte */
> } PGresAttValue;
>
> typedef int (*PQtuplecallback)( PQresult *res, PGresAttValue *fields );
> int PQsettuplecallback( PQresult *res, PQtuplecallback cb );
>
> fields is simply a pointer to an array of nfields such structures.
> Users can do whatever they want with the info, store it in their own
> structure, parse it, throw it away, send it over a network, etc. With
> this callback I could probably implement your function above fairly
> straightforwardly.
Sure you could, but you're forced to do more copying around of the data (copy into the PGresAttValue, copy out of it into your structure array). If you want something more complex then a callback makes more sense, but I'm of the opinion that we're talking about a 90/10 or 80/20 split here in terms of dump-into-memory-array vs. do-something-more-complicated. And that opinion isn't *solely* based on Oracle providing a similar mechanism such that probably quite a few Oracle apps are written exactly as I suggest (I don't think OCI8 has a callback like you're proposing at all...).

Just to point out, we could do what you're proposing by letting people look at the PGresult during an async query too... ;) Except, of course, libpq would need to allocate/deallocate all the PGresults and have some way of knowing which have been used by the caller and which haven't.

Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > I can see how having a callback would be useful though I think for a > good number of cases it's just going to be populating a memory region > with it and we could cover that common case by providing an API for > exactly that. We already have that: it's called the existing libpq API. The only reason I can see for offering any new feature in this area is to cater to apps that want to transform the data representation on-the-fly, not merely dump it into an area that will be the functional equivalent of a PGresult. So it really has to be a callback. > The other issue with a callback is that libpq would have > to either call the callback for each value (not my preference) Why not? That would eliminate a number of problems. regards, tom lane
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Thu, Apr 13, 2006 at 11:14:57AM -0400, Greg Stark wrote:
> > That could be useful for applications but I think a driver really wants to
> > retain control of the flow of control. To make use of a callback it would have
> > to have an awkward dance of calling whatever function gives libpq license to
> > call the callback, having the callback stuff the data in a temporary space,
> > then checking for new data in the temporary space, and returning it to the
> > user.
>
> We have an asynchronous interface. I was thinking like:
>
> sub mycallback(res,data)
> {
>     /* stuff data in memory structure */
>     if( row_count > 5 )
>         gotenough = TRUE;
> }
>
> If you set non-blocking you can even go off and do other things while
> waiting. No need for temporary space...
>
> Does this seem too complex?

There's nothing wrong with a callback interface for applications. They can generally have the callback function update the display or output to a file or whatever they're planning to do with the data. However, drivers don't generally work that way. Drivers have functions like:

$q = prepare("select ...");
$q->execute();
while ($row = $q->fetch()) {
    print $row->{column};
}

To handle that using a callback interface would require that $q->fetch invoke some kind of pqCheckForData() which would upcall to the callback with the available data. The callback would have to stuff the data somewhere. Then fetch() would check to see if there was data there and return it to the user. It's doable, but dealing with this impedance mismatch between the interfaces necessitates extra steps. That means extra copying and extra function calls.

-- greg
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > I can see how having a callback would be useful though I think for a
> > good number of cases it's just going to be populating a memory region
> > with it and we could cover that common case by providing an API for
> > exactly that.
>
> We already have that: it's called the existing libpq API.

Right, and it sucks for doing large amounts of transfer through it.

> The only reason I can see for offering any new feature in this area is
> to cater to apps that want to transform the data representation
> on-the-fly, not merely dump it into an area that will be the functional
> equivalent of a PGresult. So it really has to be a callback.

It's only the functional equivalent when you think all the world is a Postgres app, which is just not the case.

> > The other issue with a callback is that libpq would have
> > to either call the callback for each value (not my preference)
>
> Why not? That would eliminate a number of problems.

For one thing, it's certainly possible the callback (to do a data transform like you're suggesting) would want access to the other information in a given tuple. Having to store a partial tuple in a temporary area which has to be built up to the full tuple before you can actually process it wouldn't be all that great. This concern applies much less to an entire table's worth of results (it's unlikely you'd need the whole table in hand before performing the transforms). It would also be an awful lot of calls.

Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> The only reason I can see for offering any new feature in this area is >> to cater to apps that want to transform the data representation >> on-the-fly, not merely dump it into an area that will be the functional >> equivalent of a PGresult. So it really has to be a callback. > It's only the functional equivalent when you think all the world is a > Postgres app, which is just not the case. If we are dumping data into a simple memory block in a format dictated by libpq, then we haven't done a thing to make the app's use of that data independent of libpq. Furthermore, because that format has to be generalized (variable-length fields, etc), it will not be noticeably easier to use than the existing PQresult API. What I would envision as a typical use of a callback is to convert the data and store it in a C struct designed specifically for a particular query's known result structure (say, a few ints, a string of a known maximum length, etc). libpq can't do that, but a callback could do it easily. The fixed-memory-block approach also falls over when considering results of uncertain maximum size. Lastly, it doesn't seem to me to respond at all to the ODBC needs that started this thread: IIUC, they want each row separately malloc'd so that they can free selected rows from the completed resultset. >>> The other issue with a callback is that libpq would have >>> to either call the callback for each value (not my preference) >> >> Why not? That would eliminate a number of problems. > For one thing, it's certainly possible the callback (to do a data > transform like you're suggesting) would want access to the other > information in a given tuple. Having to store a partial tuple in a > temporary area which has to be built up to the full tuple before you can > actually process it wouldn't be all that great. 
So instead, you'd prefer to *always* store partial tuples in a temporary area, thereby making sure the independent-field-conversions case has performance just as bad as the dependent-conversions case. I can't follow that reasoning. regards, tom lane
On Thu, Apr 13, 2006 at 11:54:33AM -0400, Stephen Frost wrote:
<snip>
> Sure you could but you're forced to do more copying around of the data
> (copy into the PGresAttValue, copy out of it into your structure array).
> If you want something more complex then a callback makes more sense but
> I'm of the opinion that we're talking about a 90/10 or 80/20 split here
> in terms of dump-into-memory array vs. do-something-more-complicated.

I think we're talking cross-purposes here. You seem to be interested in making another way to get the data. What I'm trying to do is create an interface flexible enough that no-one would ever want to write their own wire-protocol parser because they can get libpq to do it. This probably falls into the 10% portion.

The use of PGresAttValue was deliberate. libpq already uses this so it costs nothing. Also, the memory pointed to is allocated very cheaply within libpq. The intention is that users can either choose to use that (great for read-only, i.e. 90% of the time) or copy it *only* if they want to (what psqlODBC wants to do). Basically, your solution doesn't handle the use case of psqlODBC, which is specifically what I'm aiming at here...

> Just to point out, we could do what you're proposing by letting people
> look at PQresult during an async too.. ;) Except, of course, libpq
> would need to allocate/deallocate all the PQresults and have some way of
> knowing which have been used by the caller and which haven't.

Eh? You only need one PQresult...

Have a nice day,
-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
On Thu, Apr 13, 2006 at 12:02:56PM -0400, Greg Stark wrote:
> There's nothing wrong with a callback interface for applications. They can
> generally have the callback function update the display or output to a file or
> whatever they're planning to do with the data.
>
> However drivers don't generally work that way. Drivers have functions like:

As I pointed out in another email, this change is not aimed at applications doing fetch_next, but specifically at drivers like psqlODBC which have a very special way of handling resultsets, in this case, updateable resultsets. The aim is to work out why people are writing their own wire-protocol parsers. To find out the deficiency in libpq that prevents them using it.

I agree, for what you're talking about I don't think a callback is at all relevant.

Have a nice day,
-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Tom Lane <tgl@sss.pgh.pa.us> writes: > So instead, you'd prefer to *always* store partial tuples in a temporary > area, thereby making sure the independent-field-conversions case has > performance just as bad as the dependent-conversions case. > I can't follow that reasoning. I think there's some confusion about what problem this is aiming to solve. I thought the primary problem ODBC and other drivers have is just that they want to be able to fetch whatever records are available instead of waiting for the entire query results to be ready. All it sounded like to me was a need for a function that would wait until n records were available (or perhaps n bytes worth of records) then return. You seem to be talking about a much broader set of problems to solve. -- greg
Greg Stark <gsstark@mit.edu> writes: > I think there's some confusion about what problem this is aiming to solve. I > thought the primary problem ODBC and other drivers have is just that they want > to be able to fetch whatever records are available instead of waiting for the > entire query results to be ready. No, that's not what I'm thinking about at all, and I don't think Martijn is either. The point here is that ODBC wants to store the resultset in a considerably different format from what libpq natively provides, and we'd like to avoid the conversion overhead. Now, a callback function could be (ab)used for the purpose of not waiting, very easily: either do real processing on each row for itself, or signal the main app via some outside-the-API mechanism whenever it has stored N rows. The question the app author would have to ask himself is whether he needs to undo that processing if the query fails further on, and if so how to do that. But that need not be our problem. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > It's only the functional equivalent when you think all the world is a
> > Postgres app, which is just not the case.
>
> If we are dumping data into a simple memory block in a format dictated
> by libpq, then we haven't done a thing to make the app's use of that
> data independent of libpq. Furthermore, because that format has to be
> generalized (variable-length fields, etc), it will not be noticeably
> easier to use than the existing PQresult API.

The format of the structure *isn't* really dictated by libpq. The offsets, value length and record size are intended to support almost any C array structure. Variable-length fields have a max size; if a value goes over it, an error is returned or indicated through the indicator array. It also gets the data into the structure quite a few applications would like to have it in (which is certainly not PQresult).

> What I would envision as a typical use of a callback is to convert the
> data and store it in a C struct designed specifically for a particular
> query's known result structure (say, a few ints, a string of a known
> maximum length, etc). libpq can't do that, but a callback could do it
> easily.

Heh, this is exactly what I'm proposing we make libpq capable of doing, which is a relatively simple thing to do. I agree that it's often a goal of application developers to get the data into this kind of structure. The one downside is that at the moment I think the binary results from libpq come back in network byte order instead of host byte order. Oracle provided a way to indicate the types of the fields in the structure and performed some conversions (such as these) for you. The constants they used started with "SQL_" but I'm not entirely sure if they were actually defined in the standard or not.

> The fixed-memory-block approach also falls over when considering results
> of uncertain maximum size.
> Lastly, it doesn't seem to me to respond at
> all to the ODBC needs that started this thread: IIUC, they want each row
> separately malloc'd so that they can free selected rows from the
> completed resultset.

Results of uncertain maximum size aren't a problem at all... The caller can do the exact same thing libpq does (realloc), or it could allocate another array. *Each* call to the libpq function would return the number of elements actually populated into the memory block; the caller would then be expected to pass in a *fresh* memory block for the next call (which could just be a simply calculated offset into the block they allocated, or could be a realloc'd block + offset, or a brand new block, etc...).

I'm really not sure why there seems to be this "this won't work!" reaction. This isn't something I came up with out of whole cloth; it's an API that isn't unlike PQexecParams, is similar to something Oracle does (which I've used quite a bit for doing *exactly* what's mentioned above: I've got an array of pre-defined C structs that I know match the query and I want that array filled in) and is really not that complicated.

> > For one thing, it's certainly possible the callback (to do a data
> > transform like you're suggesting) would want access to the other
> > information in a given tuple. Having to store a partial tuple in a
> > temporary area which has to be built up to the full tuple before you can
> > actually process it wouldn't be all that great.
>
> So instead, you'd prefer to *always* store partial tuples in a temporary
> area, thereby making sure the independent-field-conversions case has
> performance just as bad as the dependent-conversions case.
> I can't follow that reasoning.

I haven't been ruling out providing a callback mechanism as well, but I think it's the 10% case and the 90% case is being shoe-horned into the 10% case with a performance degradation to boot.
They're also not partial tuples, it's not a temporary area, and there's demonstrably less copying around of the data. It seems ODBC may be in the 10% piece here but I haven't looked at the ODBC source code yet.

Thanks, Stephen
* Greg Stark (gsstark@mit.edu) wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > So instead, you'd prefer to *always* store partial tuples in a temporary
> > area, thereby making sure the independent-field-conversions case has
> > performance just as bad as the dependent-conversions case.
> > I can't follow that reasoning.
>
> I think there's some confusion about what problem this is aiming to solve. I
> thought the primary problem ODBC and other drivers have is just that they want
> to be able to fetch whatever records are available instead of waiting for the
> entire query results to be ready.

Honestly, I think that may be part of it, but it seems they're more interested in storing the tuples in their own structure right away instead of keeping a PQresult around and using it everywhere.

> All it sounded like to me was a need for a function that would wait until n
> records were available (or perhaps n bytes worth of records) then return.

I'm not sure that you'd actually want to block until there was a certain amount returned, but that would be doable I suppose.

> You seem to be talking about a much broader set of problems to solve.

I'd like to improve the API in general to cover a set of use-cases that I've run into quite a few times (and apparently some others have too, as other DBs offer a similar API). I'd also like the ODBC driver to be able to use libpq instead of having its own implementation of the wireline protocol. I was hoping these would overlap, but it's possible they won't, in which case it might be sensible to add two new methods to the API (though I'm sure to get flak about that idea).

Thanks, Stephen
Tom Lane <tgl@sss.pgh.pa.us> writes: > Greg Stark <gsstark@mit.edu> writes: > > I think there's some confusion about what problem this is aiming to solve. I > > thought the primary problem ODBC and other drivers have is just that they want > > to be able to fetch whatever records are available instead of waiting for the > > entire query results to be ready. > > No, that's not what I'm thinking about at all, and I don't think Martijn > is either. The point here is that ODBC wants to store the resultset in > a considerably different format from what libpq natively provides, and > we'd like to avoid the conversion overhead. So how would you provide the data to the callback? And how does having a callback instead of a regular downcall give you any more flexibility in how you present the data? -- greg
* Greg Stark (gsstark@mit.edu) wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > Greg Stark <gsstark@mit.edu> writes:
> > > I think there's some confusion about what problem this is aiming to solve. I
> > > thought the primary problem ODBC and other drivers have is just that they want
> > > to be able to fetch whatever records are available instead of waiting for the
> > > entire query results to be ready.
> >
> > No, that's not what I'm thinking about at all, and I don't think Martijn
> > is either. The point here is that ODBC wants to store the resultset in
> > a considerably different format from what libpq natively provides, and
> > we'd like to avoid the conversion overhead.
>
> So how would you provide the data to the callback? And how does having a
> callback instead of a regular downcall give you any more flexibility in how
> you present the data?

The callback can be called for each record without having to store any more than one tuple's worth of information in libpq. I suppose you could change things such that a call using the new interface only processes one tuple's worth from the input stream, and just not read any more data from the socket until there have been enough calls to process tuples. That's really more the double-memory issue though. There's also the double-copying that's happening, and having to wait for all the data to come in before being able to read it; of course that last could be handled by cursors...

Thanks, Stephen
On Thu, Apr 13, 2006 at 03:42:44PM -0400, Stephen Frost wrote:
> > You seem to be talking about a much broader set of problems to solve.
>
> I'd like to improve the API in general to cover a set of use-cases that
> I've run into quite a few times (and apparently some others have too as
> other DBs offer a similar API). I'd also like the ODBC driver to be
> able to use libpq instead of having its own implementation of the
> wireline protocol. I was hoping these would overlap but it's possible
> they won't in which case it might be sensible to add two new methods to
> the API (though I'm sure to get flak about that idea).

Well, the psqlODBC driver apparently ran into a number of problems with libpq that resulted in them not using it for their purpose. Given libpq's primary purpose is to connect to PostgreSQL, it failing at that is something that should be fixed.

The problem you're trying to solve is also important; it would be nice to find a good solution to that. I'm just not sure if it was relevant to the decision to bypass libpq.

Have a nice day,
-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Martijn van Oosterhout wrote: -- Start of PGP signed section. > On Thu, Apr 13, 2006 at 03:42:44PM -0400, Stephen Frost wrote: > > > You seem to be talking about a much broader set of problems to solve. > > > > I'd like to improve the API in general to cover a set of use-cases that > > I've run into quite a few times (and apparently some others have too as > > other DBs offer a similar API). I'd also like the ODBC driver to be > > able to use libpq instead of having its own implementation of the > > wireline protocol. I was hoping these would overlap but it's possible > > they won't in which case it might be sensible to add two new metheds to > > the API (though I'm sure to get flak about that idea). > > Well, the psqlODBC driver apparently ran into a number of problems with > libpq that resulted in them not using it for their purpose. Given libpq > primary purpose is to connect to PostgreSQL, it failing at that is > something that should be fixed. > > The problem you're trying to solve is also important, it would be nice > to find a good solution to that. I'm just not sure if it was relevent > to the decision to bypass libpq. I know there was a lot of confusion over parallel development of psqlODBC and my guess is that current CVS is the best solution at this time. Of course, that doesn't invalidate the idea that this can be revisited as things settle down and improvements made. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Greg Stark <gsstark@mit.edu> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> No, that's not what I'm thinking about at all, and I don't think Martijn >> is either. The point here is that ODBC wants to store the resultset in >> a considerably different format from what libpq natively provides, and >> we'd like to avoid the conversion overhead. > So how would you provide the data to the callback? And how does having a > callback instead of a regular downcall give you any more flexibility in how > you present the data? You'd hand the callback the raw data coming off the wire (pointer and byte count, probably), and then it could do whatever's appropriate. For instance, if the callback knows this field is to be converted to int, it could do atoi() and then store the integer. (Or if it knows the data is transmitted in binary, ntohl() would be the thing instead.) The basic point here is that the callback should replace all the parts of getAnotherTuple() that are responsible for storing data into the PGresult structure, including all of pqAddTuple. If you aren't satisfied with the PGresult representation, that's the level of flexibility you need, IMHO. I don't see the point of half-measures. Probably there would need to be at least three callbacks involved: one for setup, called just after the tuple descriptor info has been received; one for per-field data receipt, and one for per-tuple operations (called after all the fields of the current tuple have been passed to the per-field callback). Maybe you'd want a shutdown callback too, although that's probably not strictly necessary since whatever you might need it to do could be done equally well in the app after PQgetResult returns. (You still want to return a PGresult to carry command success/failure info, and probably the tuple descriptor info, even though use of the callbacks would leave it containing none of the data.) 
A useful finger exercise for validating the design would be to code up the default callbacks, ie, code to build the current PGresult structure using this API. regards, tom lane
Stephen Frost wrote:
> * Martijn van Oosterhout (kleptog@svana.org) wrote:
> > On Thu, Apr 13, 2006 at 08:48:54AM +0100, Dave Page wrote:
> > > Well, we had a pure custom implementation of the protocol, had a pure
> > > libpq based version and after much discussion decided that the best
> > > version of all was the hybrid as it allowed us to hijack features like
> > > SSL, Kerberos, pgpass et al, yet not be constrained by the limitations
> > > of libpq, or copy query results about so much.
> >
> > Right. Would you see value in a more formal libpq "hijack-me" interface
> > that would support making the initial connection and then handing off
> > the rest to something else?
> >
> > I'm wondering because obviously with the current setup, if libpq is
> > compiled with SSL support, psqlODBC must also be. Are there any points
> > where you have to fight libpq over control of the socket?
> [...]
> > Is there anything else you might need?
>
> Instead of having it hijack the libpq connection and implement the
> wireline protocol itself, why don't we work on fixing the problems (such
> as the double-copying that libpq requires) in libpq to allow the driver
> (and others!) to use it in the 'orthodox' way?
>
> I would have spoken up on the ODBC list if I understood that 'hybrid'
> really meant 'just using libpq for connection/authentication'. I really
> think it's a bad idea to have the ODBC driver reimplement the wireline
> protocol because that protocol does change from time to time and someone
> using libpq will hopefully have fewer changes (and thus makes the code
> easier to maintain) than someone implementing the wireline protocol
> themselves (just causing more busy-work that, at least we saw in the
> past with the ODBC driver, may end up taking *forever* for someone to
> be able to commit the extra required time to implement).

Libpq and the psqlODBC driver have walked different roads for a very long time. In 6.3 or before, there wasn't a libpq library under Windows.
In 6.4 we had the libpq library under Windows but unfortunately it wasn't able to talk to 6.3 or before.... At last in 7.4 libpq was able to speak both protocol v3 and protocol v2, but it is pretty hard work, at least for me, to transfer all the accumulated work to a libpq-based version. I'm not sure what kind of functionality is required for libpq to make the transfer easy. Of course the double-copying issue is a big one.

regards, Hiroshi Inoue
From: "Zeugswetter Andreas DCP SD"
Subject: Re: Practical impediment to supporting multiple SSL libraries
> Well, the psqlODBC driver apparently ran into a number of problems with
> libpq that resulted in them not using it for their purpose.
> Given libpq's primary purpose is to connect to PostgreSQL, it failing at that is
> something that should be fixed.

I think you are forgetting that e.g. a JDBC driver will not want to depend on an external C dll at all. It will want a native Java implementation (Group 4). Thus imho it is necessary to have a defined wire protocol, which we have. So if a driver needs to use the wire protocol it is imho not a problem. If applications started using it because they couldn't find a suitable driver, now that would be a problem.

Andreas
"Zeugswetter Andreas DCP SD" <ZeugswetterA@spardat.at> writes: > > Well, the psqlODBC driver apparently ran into a number of problems with > > libpq that resulted in them not using it for their purpose. Given libpq > > primary purpose is to connect to PostgreSQL, it failing at that is > > something that should be fixed. > > I think you are forgetting, that e.g. a JDBC driver will not want to depend > on an external C dll at all. It will want a native Java implementation > (Group 4). Thus imho it is necessary to have a defined wire protocol, which > we have. I think you are forgetting that this is a complete nonsequitor. Nobody suggested eliminating the defined wire protocol. Nor was anybody even discussing JDBC. Java folks' fetish for reimplementing everything in Java is entirely irrelevant. -- greg
Greg Stark <gsstark@MIT.EDU> writes: > "Zeugswetter Andreas DCP SD" <ZeugswetterA@spardat.at> writes: > > > > Well, the psqlODBC driver apparently ran into a number of problems with > > > libpq that resulted in them not using it for their purpose. Given libpq > > > primary purpose is to connect to PostgreSQL, it failing at that is > > > something that should be fixed. > > > > I think you are forgetting, that e.g. a JDBC driver will not want to depend > > on an external C dll at all. It will want a native Java implementation > > (Group 4). Thus imho it is necessary to have a defined wire protocol, which > > we have. > > I think you are forgetting that this is a complete nonsequitor. Hm, now that I've had some sleep I think I see where you're going with this. As long as there's a defined wire protocol (and there will always be one) then there's nothing wrong with what the psqlODBC driver is doing and having a libpq mode that hands off small bits of the unparsed stream isn't really any different than just having the driver read the unparsed data from the socket. I'm not sure whether that's true or not but it's certainly a reasonable point. Sorry for my quick response last night. -- greg
On Thu, Apr 13, 2006 at 09:00:10PM -0400, Tom Lane wrote:
> Probably there would need to be at least three callbacks involved:
> one for setup, called just after the tuple descriptor info has been
> received; one for per-field data receipt, and one for per-tuple
> operations (called after all the fields of the current tuple have
> been passed to the per-field callback). Maybe you'd want a shutdown
> callback too, although that's probably not strictly necessary since
> whatever you might need it to do could be done equally well in the
> app after PQgetResult returns. (You still want to return a PGresult
> to carry command success/failure info, and probably the tuple descriptor
> info, even though use of the callbacks would leave it containing none of
> the data.)

Sounds really good. The only thing now is that the main author of the wire-protocol code in psqlODBC has not yet made any comment on any of this, so we don't want to set anything in stone until we know it would solve their problem...

Have a nice day,
-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
On Fri, Apr 14, 2006 at 10:42:33AM -0400, Greg Stark wrote:
> Hm, now that I've had some sleep I think I see where you're going with this.
>
> As long as there's a defined wire protocol (and there will always be one) then
> there's nothing wrong with what the psqlODBC driver is doing and having a
> libpq mode that hands off small bits of the unparsed stream isn't really any
> different than just having the driver read the unparsed data from the socket.

Well, the main motivation for this is that when a new version of the protocol appears, libpq will support it but psqlODBC won't. If libpq provides a way to get these small bits of the unparsed stream in a protocol-independent way, then that problem goes away.

There are a number of other (primarily driver) projects that would benefit from being able to bypass the PGresult structure for storing data.

Have a nice day,
-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Martijn van Oosterhout <kleptog@svana.org> writes: > On Fri, Apr 14, 2006 at 10:42:33AM -0400, Greg Stark wrote: >> As long as there's a defined wire protocol (and there will always be >> one) then there's nothing wrong with what the psqlODBC driver is doing > Well, the main motivation for this is that when a new version of the > protocol appears, libpq will support it but psqlODBC won't. If libpq > provides a way to get these small bits of the unparsed stream in a > protocol independant way, then that problem goes away. Greg's observation is correct, so maybe we are overthinking this problem. A fair question to ask is whether psqlODBC would consider going back to a non-hybrid implementation if these features did exist in libpq. > There are a number of other (primarily driver) projects that would > benefit from being able to bypass the PGresult structure for storing > data. Please mention some specific examples. We need some examples as a reality check. regards, tom lane
On Fri, Apr 14, 2006 at 04:53:53PM +0200, Martijn van Oosterhout wrote:
> Sounds really good.

<snip> There's a message on the pgsql-odbc mailing list[1] with some reasons for not using libpq:

1. The driver sets some session default parameters (DateStyle, client_encoding etc) using the start-up message.

As far as I can see it only does this when the environment variables are set, which IMHO is the correct behaviour. If psqlODBC doesn't honour them, that does violate the principle of least surprise. OTOH, the users of ODBC possibly shouldn't be affected by the environment variables of the user, given the user of ODBC likely doesn't know (or care) that PostgreSQL is involved.

2. You can try the V2 protocol implementation when the V3 implementation has some bugs or performance issues.

Well, there is a point here: you can't force the version. It always defaults to 3 if available.

3. Quote: I don't know what libraries libpq would need in the future but it's quite unpleasant for me if the psqlODBC driver can't be loaded for the lack of needless libraries.

It's a reason, just not a good one IMHO. If the user has installed libpq with a number of libraries, then that's what the user wants. I'm not sure why psqlODBC is worried about that.

So while this thread has produced several good ideas (which possibly should be implemented regardless), perhaps we should focus on these issues also.

Have a nice day,

[1] http://archives.postgresql.org/pgsql-odbc/2006-04/msg00052.php

-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
-----Original Message----- From: "Tom Lane"<tgl@sss.pgh.pa.us> Sent: 14/04/06 16:22:45 To: "Martijn van Oosterhout"<kleptog@svana.org> Cc: "Greg Stark"<gsstark@mit.edu>, "Zeugswetter Andreas DCP SD"<ZeugswetterA@spardat.at>, "Dave Page"<dpage@vale-housing.co.uk>, "pgsql-hackers@postgresql.org"<pgsql-hackers@postgresql.org>, "Hiroshi Inoue"<inoue@tpf.co.jp> Subject: Re: [HACKERS] Practical impediment to supporting multiple SSL libraries > A fair question to ask is whether psqlODBC would consider > going back to a non-hybrid implementation if these features did exist > in libpq. It's not something I want to spend any more time on, and Hiroshi made it quite clear on -odbc yesterday that he doesn't want libpq to become a requirement of psqlODBC (it's dynamically loaded atm, thus is optional). Regards, Dave
Dave Page wrote: > > -----Original Message----- From: "Tom Lane"<tgl@sss.pgh.pa.us> Sent: > 14/04/06 16:22:45 To: "Martijn van Oosterhout"<kleptog@svana.org> Cc: > "Greg Stark"<gsstark@mit.edu>, "Zeugswetter Andreas DCP > SD"<ZeugswetterA@spardat.at>, "Dave Page"<dpage@vale-housing.co.uk>, > "pgsql-hackers@postgresql.org"<pgsql-hackers@postgresql.org>, "Hiroshi > Inoue"<inoue@tpf.co.jp> Subject: Re: [HACKERS] Practical impediment to > supporting multiple SSL libraries > > > A fair question to ask is whether psqlODBC would consider > > going back to a non-hybrid implementation if these features did exist > > in libpq. > > It's not something I want to spend any more time on, and Hiroshi made > it quite clear on -odbc yesterday that he doesn't want libpq to become > a requirement of psqlODBC (it's dynamically loaded atm, thus is > optional). Hiroshi does not speak for the psqlODBC project. It is a community project. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
-----Original Message----- From: "Bruce Momjian"<pgman@candle.pha.pa.us> Sent: 14/04/06 16:42:08 To: "Dave Page"<dpage@vale-housing.co.uk> Cc: "tgl@sss.pgh.pa.us"<tgl@sss.pgh.pa.us>, "kleptog@svana.org"<kleptog@svana.org>, "gsstark@mit.edu"<gsstark@mit.edu>, "ZeugswetterA@spardat.at"<ZeugswetterA@spardat.at>, "pgsql-hackers@postgresql.org"<pgsql-hackers@postgresql.org>, "inoue@tpf.co.jp"<inoue@tpf.co.jp> Subject: Re: [HACKERS] Practical impediment to supporting multiple SSL libraries > Hiroshi does not speak for the psqlODBC project. It is a community > project. I am well aware of that, but as he is by far the most experienced and productive ODBC developer we have, it would not be particularly sensible to not give his opinion the weight it deserves - especially as there is unlikely to be anyone else to undertake such a project (again). Regards, Dave
Dave Page wrote: > > Hiroshi does not speak for the psqlODBC project. It is a community > > project. > > I am well aware of that, but as he is by far the most experienced and > productive ODBC developer we have it would not be particularly sensible > to not give his opinion the weight it deserves - especially as there > is unlikely to be anyone else to undertake such a project (again). Right, sure he has weight. It is the concept that "If Hiroshi doesn't want it, it isn't going to happen", that I objected to. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
> > It's not something I want to spend any more time on, and Hiroshi made > > it quite clear on -odbc yesterday that he doesn't want libpq to become > > a requirement of psqlODBC (it's dynamically loaded atm, thus is > > optional). > > Hiroshi does not speak for the psqlODBC project. It is a community > project. Well yes, it is a community project, but whoever is doing the development is going to make the decision on what direction to go. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Please mention some specific examples. We need some examples as a
> reality check.

Just took a look through a couple of Debian packages which depend on libpq4:

libpam-pgsql: pam_pgsql.c, line 473: it uses PQgetvalue() as one would expect, but doesn't actually save the pointer anywhere, just uses it to do comparisons against (all it stores, apparently, is a password in the DB).

libnss-pgsql: src/backend.c, line 228:

    sptr = PQgetvalue(res, row, colnum);
    slen = strlen(sptr);
    if (*buflen < slen + 1) {
        return NSS_STATUS_TRYAGAIN;
    }
    strncpy(*buffer, sptr, slen);
    (*buffer)[slen] = '\0';
    *valptr = *buffer;
    *buffer += slen + 1;
    *buflen -= slen + 1;
    return NSS_STATUS_SUCCESS;

That really seems to be the classic example to me: get the data from the PGresult, store it in something else, work on it.

mapserver: mappostgis.c, starting from line 1340:

    shape->values = (char **) malloc(sizeof(char *) * layer->numitems);
    for (t = 0; t < layer->numitems; t++) {
        temp1 = (char *) PQgetvalue(query_result, 0, t);
        size = PQgetlength(query_result, 0, t);
        temp2 = (char *) malloc(size + 1);
        memcpy(temp2, temp1, size);
        temp2[size] = 0;    /* null terminate it */
        shape->values[t] = temp2;
    }

This same code repeats in another place (line 1139). They also appear to forget to PQclear() in some cases. :( They don't appear to ever save the pointer returned by PQgetvalue() for anything.

postfix: src/global/dict_pgsql.c, starting from line 349:

    numcols = PQnfields(query_res);
    for (expansion = i = 0; i < numrows && dict_errno == 0; i++) {
        for (j = 0; j < numcols; j++) {
            r = PQgetvalue(query_res, i, j);
            if (db_common_expand(dict_pgsql->ctx, dict_pgsql->result_format,
                                 r, name, result, 0)
                && dict_pgsql->expansion_limit > 0
                && ++expansion > dict_pgsql->expansion_limit) {
                msg_warn("%s: %s: Expansion limit exceeded for key: '%s'",
                         myname, dict_pgsql->parser->name, name);
                dict_errno = DICT_ERR_RETRY;
                break;
            }
        }
    }
    PQclear(query_res);
    r = vstring_str(result);
    return ((dict_errno == 0 && *r) ? r : 0);

exim does something similar to postfix too. It really seems unlikely that anyone keeps PGresults around for very long, and they all seem to want to stick the data into their own memory structure. I don't know how many people would move to a new API should one be provided, though. Callbacks can be kind of a pain in the butt to code, which makes the amount of effort required to move to using them a bit higher.

This all means double memory usage, though, and that really makes me want some kind of API that can be used to process data as it comes in.

Another thought along these lines: perhaps a 'PQgettuple' which can be used to process one tuple at a time. This would be used in an async fashion and libpq just wouldn't read/accept more than a tuple's worth each time, which it could do into a fixed area (in general, for a variable-length field it could default to an initial size and then only grow it when necessary, and grow it larger than the current request by some amount to hopefully avoid more malloc/reallocs later).

Thanks,

Stephen
-----Original Message----- From: "Bruce Momjian"<pgman@candle.pha.pa.us> Sent: 14/04/06 16:57:58 To: "Dave Page"<dpage@vale-housing.co.uk> Cc: "tgl@sss.pgh.pa.us"<tgl@sss.pgh.pa.us>, "kleptog@svana.org"<kleptog@svana.org>, "gsstark@mit.edu"<gsstark@mit.edu>, "ZeugswetterA@spardat.at"<ZeugswetterA@spardat.at>, "pgsql-hackers@postgresql.org"<pgsql-hackers@postgresql.org>, "inoue@tpf.co.jp"<inoue@tpf.co.jp> Subject: Re: [HACKERS] Practical impediment to supporting multiple SSL libraries > Right, sure he has weight. It is the concept that "If Hiroshi doesn't > want it, it isn't going to happen", that I objected to. I don't believe I said that - and you ought to know me well enough by now to know I wouldn't have said it! :-) Regards, Dave
On Fri, Apr 14, 2006 at 11:22:23AM -0400, Tom Lane wrote: > Greg's observation is correct, so maybe we are overthinking this > problem. A fair question to ask is whether psqlODBC would consider > going back to a non-hybrid implementation if these features did exist > in libpq. Well, it is an issue. It's not a specific problem per se that psqlODBC implements the protocol itself. If you remember right back at the beginning of the thread (see subject), there was the issue of users using libpq to connect and then continuing themselves. The issue being that the pointer from PQgetssl() wouldn't work if we had different SSL libraries available. Perhaps a far easier approach would be to indeed just have a hijack interface that provides read/write over whatever protocol libpq negotiated. Then people could write their own protocol parsers to suit their needs while still using libpq for the connection. Have the cake and eat it too? Note, we would have to allow users of libpq to force the version, otherwise libpq would connect using a version the user doesn't understand. > Please mention some specific examples. We need some examples as a > reality check. Well, psqlODBC is the obvious case. Besides that it becomes tricky. I would think that DBD::Pg could benefit, I just don't understand the code well enough to know if it's directly useful. I would expect drivers in particular to benefit, and some complex applications, but if you're asking for specific examples, I don't have any... That doesn't change the fact that it's a nice idea, just that definite beneficiaries are harder to find. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Martijn van Oosterhout <kleptog@svana.org> writes: > Perhaps a far easier approach would be to indeed just have a hijack > interface that provides read/write over whatever protocol libpq > negotiated. Well, there's a precedent to look at: the original implementation of COPY mode was pretty nearly exactly that. And it sucked, and eventually we changed it. So I'd be pretty leery of repeating the experience... regards, tom lane
On Fri, Apr 14, 2006 at 01:05:11PM -0400, Tom Lane wrote: > Martijn van Oosterhout <kleptog@svana.org> writes: > > Perhaps a far easier approach would be to indeed just have a hijack > > interface that provides read/write over whatever protocol libpq > > negotiated. > > Well, there's a precedent to look at: the original implementation of > COPY mode was pretty nearly exactly that. And it sucked, and eventually > we changed it. So I'd be pretty leery of repeating the experience... As I remember, the main issue was with the loss of control over the error state and recovering if stuff went wrong. In this case, once someone hijacks a connection they can't hand it back. Its only option is to close. I was just thinking of providing pointers to pqsecure_read/write and maybe a few other things, but that's it. Or was there something else? Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Stephen Frost <sfrost@snowman.net> writes: > Another thought along these lines: Perhaps a 'PQgettuple' which can be > used to process one tuple at a time. This would be used in an ASYNC > fashion and libpq just wouldn't read/accept more than a tuple's worth > each time, which it could do into a fixed area (in general, for a > variable-length field it could default to an initial size and then only > grow it when necessary, and grow it larger than the current request by > some amount to hopefully avoid more malloc/reallocs later). I know DBD::Oracle uses an interface somewhat like this but more sophisticated. It provides a buffer and Oracle fills it with as many records as it can. It's blocking though (by default) and DBD::Oracle tries to adjust the size of the buffer to keep the network pipeline full, but if the application is slow at reading the data then the network buffers fill and it pushes back to the database which blocks writing. This is normally a good thing though. One of the main problems with the current libpq interface is that if you have a very large result set it flows in as fast as it can and the library buffers it *all*. If you're trying to avoid forcing the user to eat millions of records at once you don't want to be buffering them anywhere all at once. You want a constant pipeline of records streaming out as fast as they can be processed and no faster. -- greg
* Greg Stark (gsstark@mit.edu) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > Another thought along these lines: Perhaps a 'PQgettuple' which can be > > used to process one tuple at a time. This would be used in an ASYNC > > fashion and libpq just wouldn't read/accept more than a tuple's worth > > each time, which it could do into a fixed area (in general, for a > > variable-length field it could default to an initial size and then only > > grow it when necessary, and grow it larger than the current request by > > some amount to hopefully avoid more malloc/reallocs later). > > I know DBD::Oracle uses an interface somewhat like this but more > sophisticated. It provides a buffer and Oracle fills it with as many records > as it can. The API I suggested originally did this, actually. I'm not sure if it would be used in these cases though, which is why I was backing away from it a bit. I think it's great if you're grabbing a lot of data, but these seem to be cases when you're not. Then again, that's probably because of the kind of things I was looking at (you don't generally see large data-analysis tools in a distribution like Debian simply because those tools are usually specialized to a given data set, as is actually the case with some tools we use here at my work which make use of the Oracle buffer system and I'd love to move to something similar for Postgres...). > It's blocking though (by default) and DBD::Oracle tries to adjust the size of > the buffer to keep the network pipeline full, but if the application is slow > at reading the data then the network buffers fill and it pushes back to the > database which blocks writing. It could be done as blocking or non-blocking and could be an option in the API, really. I do prefer the idea that if the application is slow at reading the data then it pushes back to the database to block writing. I also *really* prefer to minimize the amount of memory used by libraries... I've never felt it's appropriate for libpq to allocate huge amounts of memory in response to a large query. :/ I know this can be worked around using cursors, but I still feel it's a terrible thing for a library to do. > This is normally a good thing though. One of the main problems with the > current libpq interface is that if you have a very large result set it flows > in as fast as it can and the library buffers it *all*. If you're trying to > avoid forcing the user to eat millions of records at once you don't want to be > buffering them anywhere all at once. You want a constant pipeline of records > streaming out as fast as they can be processed and no faster. Right... As I mentioned, the application can use cursors to *work-around* this foolishness in libpq, but that doesn't really make it any less silly. Thanks! Stephen
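For reference, the cursor workaround mentioned above looks roughly like this with ordinary libpq calls. This is a sketch only: `bigtable` is a placeholder, connection parameters are taken from the usual PG* environment variables, and it needs a live server plus linking with -lpq to actually run.

```
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    /* empty conninfo string: use PGHOST, PGDATABASE, etc. */
    PGconn *conn = PQconnectdb("");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    /* cursors only exist inside a transaction */
    PQclear(PQexec(conn, "BEGIN"));
    PQclear(PQexec(conn, "DECLARE c NO SCROLL CURSOR FOR "
                         "SELECT * FROM bigtable"));

    for (;;) {
        /* only 1000 rows are ever buffered in the client at once */
        PGresult *res = PQexec(conn, "FETCH 1000 FROM c");
        if (PQresultStatus(res) != PGRES_TUPLES_OK) {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            PQclear(res);
            break;
        }
        int n = PQntuples(res);
        for (int i = 0; i < n; i++)
            printf("%s\n", PQgetvalue(res, i, 0));  /* process the batch */
        PQclear(res);
        if (n == 0)
            break;                                  /* cursor exhausted */
    }

    PQclear(PQexec(conn, "CLOSE c"));
    PQclear(PQexec(conn, "COMMIT"));
    PQfinish(conn);
    return 0;
}
```

It works, but as Stephen says it pushes the batching burden onto every application, and a mid-stream failure simply surfaces as a failed FETCH.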
Stephen Frost <sfrost@snowman.net> writes: > Right... As I mentioned, the application can use cursors to > *work-around* this foolishness in libpq but that doesn't really make it > any less silly. Before you define libpq's behavior as "foolishness", you really ought to have a watertight semantics for what will happen in your proposal when a SELECT fails partway through (ie, after delivering some but not all of the tuples). In my mind the main reason for all-or-nothing PGresult behavior is exactly to save applications from having to deal with that case. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > Right... As I mentioned, the application can use cursors to > > *work-around* this foolishness in libpq but that doesn't really make it > > any less silly. > > Before you define libpq's behavior as "foolishness", you really ought to > have a watertight semantics for what will happen in your proposal when a > SELECT fails partway through (ie, after delivering some but not all of > the tuples). In my mind the main reason for all-or-nothing PGresult > behavior is exactly to save applications from having to deal with that > case. The library would report an error when trying to finish reading the data. Honestly, that just isn't libpq's problem; it's the application's problem to deal with it, and I know that *I* certainly wouldn't have any expectation (or faith...) in libpq doing the right thing for my particular application in any given failure case. The library should report error conditions, not assume that I'd only want all-or-nothing anyway. I'm not all about breaking backwards compatibility, though, so I'm not suggesting we change the existing behavior in this regard. This should not be an impediment to an addition to the API to allow for reading the data as it comes in. This certainly isn't unheard of or unexpected in the database world either, as (at least) Oracle's library doesn't do this collect-everything and make-sure-it's-all-happy before returning data to the user. Not to mention the potential for something bad to happen *while* reading the data out of libpq. For example, having the machine crash because you've run it out of memory because you've got at least 2 and probably 3 copies of the data in memory (i.e., ODBC under Windows with libpq). libpq might have been correct to provide data to the client since it was sure it had it all, but it doesn't help a bit when, because of libpq, the box runs out of memory. Thanks, Stephen
Martijn van Oosterhout wrote:
> On Fri, Apr 14, 2006 at 04:53:53PM +0200, Martijn van Oosterhout wrote:
>> Sounds really good.
> <snip>
> There's a message on the pgsql-odbc mailing list[1] with some reasons
> for not using libpq:
>
> 1. The driver sets some session default parameters (DateStyle,
> client_encoding etc) using the start-up message.
>
> As far as I can see it only does this when the environment variables
> are set. Which IMHO is the correct behaviour.

IMHO if libpq is to be a generic library it should first provide exactly what it can do using the protocol. *Environment variables* are not appropriate for per-application/datasource settings at all.

> 3. Quote: I don't know what libraries the libpq would need in the
> future but it's quite unpleasant for me if the psqlodbc driver can't be
> loaded with the lack of needeless librairies.
>
> It's a reason, just not a good one IMHO. If the user has installed
> libpq with a number of libraries, then that's what the user wants. I'm
> not sure why psqlODBC is worried about that.

It's very important to clarify what the libraries are needed for, and my basic policy is to provide appropriate bindings (linkage) between the libraries for the current dependency relation. As for SSL mode, it is only a mere extra for the current enhanced driver. My main purpose was to finish up my unfinished work before 7.4 using the V3 protocol, holdable cursors etc. The current driver under Windows is available without the existence of libpq.

regards, Hiroshi Inoue
Martijn van Oosterhout wrote:
> On Thu, Apr 13, 2006 at 09:00:10PM -0400, Tom Lane wrote:
>> Probably there would need to be at least three callbacks involved:
>> one for setup, called just after the tuple descriptor info has been
>> received; one for per-field data receipt, and one for per-tuple
>> operations (called after all the fields of the current tuple have
>> been passed to the per-field callback). Maybe you'd want a shutdown
>> callback too, although that's probably not strictly necessary since
>> whatever you might need it to do could be done equally well in the
>> app after PQgetResult returns. (You still want to return a PGresult
>> to carry command success/failure info, and probably the tuple descriptor
>> info, even though use of the callbacks would leave it containing none of
>> the data.)
>
> Sounds really good. The only thing now is that the main author of the
> wire-protocol code in psqlODBC has not yet made any comment on any of
> this. So we don't want to set anything in stone until we know it would
> solve their problem...

Unfortunately I don't have much time to examine it. Though the double-copying issue may be the biggest one, I'm pretty sure it's not the unique one. We would be happy to be able to replace the current code with libpq API calls one by one, but it's impossible because the driver can't go back to libpq mode once it has gone into hijacking mode. As for the hijacking mode used in the driver, it would be better to be able to use encapsulated recv/send than to get the pointer to the SSL object or socket.

regards, Hiroshi Inoue