Thread: libpq type system 0.9a

libpq type system 0.9a

From
"Merlin Moncure"
Date:
Yesterday, we notified -hackers of the latest version of the libpq
type system.  Just to be sure the right people are getting notified,
we are posting the latest patch here as well.  Would love to get some
feedback on this.

The latest version of libpq type system is available here:
http://www.esilo.com/projects/postgresql/libpq/typesys-0.9a.tar.gz

The following modifications were made:
*) documentation fixes
*) parameter resets are no longer automatic
*) updated to patch vs. REL8_3_STABLE

Merlin Moncure & Andrew Chernow
eSilo

Re: libpq type system 0.9a

From
"Florian G. Pflug"
Date:
Merlin Moncure wrote:
> Yesterday, we notified -hackers of the latest version of the libpq
> type system.  Just to be sure the right people are getting notified,
> we are posting the latest patch here as well.  Would love to get some
> feedback on this.
Sorry if this has been discussed before, but why is it necessary
to specify the type when calling PQgetf on a result? It seems that this
formatting string *always* has to match the type list of your select
statement, no?

regards, Florian Pflug


Re: libpq type system 0.9a

From
"Merlin Moncure"
Date:
On Wed, Mar 5, 2008 at 5:47 PM, Florian G. Pflug <fgp@phlo.org> wrote:
> Merlin Moncure wrote:
>  > Yesterday, we notified -hackers of the latest version of the libpq
>  > type system.  Just to be sure the right people are getting notified,
>  > we are posting the latest patch here as well.  Would love to get some
>  > feedback on this.
>  Sorry if this has been discussed before, but why is it necessary
>  to specify the type when calling PQgetf on a result? It seems that this
>  formatting string *always* has to match the type list of your select
>  statement, no?

Yes, it always has to match.  The format string requirements could in
theory be relaxed (for 'get'), but this would break symmetry with 'put'
and you would lose a sanity check.  getf, like scanf, writes directly
into application memory, so the double-specifying (directly in the
format string and indirectly in the query) isn't necessarily a bad
thing.  Imagine if your application ran 'select * from table' and one
of the field types changed...disaster.

merlin

Re: libpq type system 0.9a

From
Andrew Chernow
Date:
Merlin Moncure wrote:
> On Wed, Mar 5, 2008 at 5:47 PM, Florian G. Pflug <fgp@phlo.org> wrote:
>> Merlin Moncure wrote:
>>  > Yesterday, we notified -hackers of the latest version of the libpq
>>  > type system.  Just to be sure the right people are getting notified,
>>  > we are posting the latest patch here as well.  Would love to get some
>>  > feedback on this.
>>  Sorry if this has been discussed before, but why is it necessary
>>  to specify the type when calling PQgetf on a result? It seems that this
>>  formatting string *always* has to match the type list of your select
>>  statement, no?
>
> yes...it always has to match.  the format string requirements could in
> theory be relaxed (for 'get') but this would break symmetry with 'put'
> and you would lose a sanity check...getf like scanf writes directly
> into application memory so the double-specifying (directly in the
> format string and indirectly in the query) isn't necessarily a bad
> thing.  imagine if your application was 'select * from table' and one
> of the field types changed...disaster.
>
> merlin
>
>

A few other reasons....

 >>why is it necessary to specify the type when calling PQgetf on a result

Unlike PQgetvalue, all values returned by PQgetf are either native C types or
structures ... not C strings.  When you call getf you must tell it what types to
read out of the result object.  Like scanf, they must be the correctly sized
data types.

PGdate date;
int i4;
PQgetf(result, tup_num, "%date %int4", 0, &date, 1, &i4);

Specifying anything other than a %date or %int4 in the above example is a
programming error.  You would be asking to fetch a value of the wrong type.
Without the formatting string, libpq would have to va_arg(ASSUME_T) your value.

// no specifier
int i;
PQgetf(result, tup, field, &i);

In the above, libpq would have to use PQftype to determine the native C
type of your variable argument.  If PQftype returned INT8OID, you begin to
clobber your application's memory space ... va_arg(ap, long long) on a 32-bit
value.  This problem is solved by telling libpq what data type you want from a
field.
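
To make the va_arg point concrete, here is a minimal, self-contained sketch of
a format-driven variadic getter.  It is not libpq code: toy_field and toy_getf
are hypothetical names invented for illustration, and only two types are
handled.  The format string is what tells the callee which pointer type to pull
from the argument list, and a mismatch is rejected before anything is written
to caller memory.

```c
#include <stdarg.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one field of a result row; not a libpq type. */
typedef struct {
    const char *type_name;  /* type reported by the "server": "int4" or "int8" */
    int64_t     value;      /* stored wide enough for either type */
} toy_field;

/*
 * Toy getf.  For every "%type" token in fmt the caller passes a field
 * index (int) followed by a pointer to storage of the matching C type.
 * Returns 0 on success, -1 on a bad format, bad index or type mismatch.
 */
int toy_getf(const toy_field *fields, int nfields, const char *fmt, ...)
{
    va_list ap;
    int rc = 0;

    va_start(ap, fmt);
    while (*fmt != '\0') {
        if (*fmt == ' ') { fmt++; continue; }
        if (*fmt != '%') { rc = -1; break; }
        fmt++;

        size_t len = strcspn(fmt, " ");   /* length of the type name token */
        int idx = va_arg(ap, int);
        if (idx < 0 || idx >= nfields) { rc = -1; break; }

        const toy_field *f = &fields[idx];
        if (strlen(f->type_name) != len ||
            strncmp(f->type_name, fmt, len) != 0) {
            rc = -1;                      /* mismatch: stop before writing */
            break;
        }

        /* Only now is it safe to decide how wide the destination is. */
        if (strcmp(f->type_name, "int4") == 0)
            *va_arg(ap, int32_t *) = (int32_t) f->value;
        else if (strcmp(f->type_name, "int8") == 0)
            *va_arg(ap, int64_t *) = f->value;
        else { rc = -1; break; }          /* type not handled by this sketch */

        fmt += len;
    }
    va_end(ap);
    return rc;
}
```

With a row holding an int4 and an int8, toy_getf(row, 2, "%int4 %int8", 0, &a,
1, &b) fills both variables, while asking for %int4 on the int8 field returns
-1 and leaves the destination untouched.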

Also, the libpq type system enforces strict type checking when performing getf
calls.  This protects against type mismatches (programming errors):

For example:

-- create table t (a int8);
PGresult *result = PQexec(conn, "SELECT a FROM t");
char *val = PQgetvalue(result, ...);
int a = atoi(val); // assumed it's an int4

In the above example, the libpq user thinks the 'a' column of the 't' table is
an int4 when in fact it's an int8.  The above may work most of the time but will
eventually truncate the value and nip you in the butt.  With PQgetf, you would
get an error saying the server returned an int8 and you are asking for an int4.
Thus, the programming bug would be squashed immediately.
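
The failure mode with text results can be sketched without a server.  The
helper below is hypothetical (parse_int4_checked is not part of libpq or the
patch) and it is not how PQgetf works internally, since PQgetf compares types
rather than parsing text.  It only shows the kind of range check that a bare
atoi() call silently skips when an int8 value arrives where an int4 was
assumed.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Convert a text value, such as one returned by PQgetvalue, to a 32-bit
 * integer.  Returns 0 on success and -1 if the text is not a plain
 * decimal number or does not fit in an int4.  *out is only written on
 * success.
 */
int parse_int4_checked(const char *text, int32_t *out)
{
    char *end;
    long long v;

    errno = 0;
    v = strtoll(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return -1;                  /* empty, trailing junk or overflow */
    if (v < INT32_MIN || v > INT32_MAX)
        return -1;                  /* valid int8, but too big for int4 */

    *out = (int32_t) v;
    return 0;
}
```

A value such as "5000000000" is a legal int8 but is rejected here instead of
being quietly mangled.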

Also, user-defined types are not known to libpq, so PQftype would not really
work.  It could if the libpq type system referenced data types by OID, but
this is not portable to other servers.  It is more portable to use the type
name.  For example, consider a company with 15 postgresql servers that use the
same collection of company-specific user-defined data types.  The type names
would be the same across the 15 servers but there is no guarantee the OIDs
would be.
Composites and arrays caused a few issues as well.

We also tried to provide as much protection as possible ... in the spirit of the
backend.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

Re: libpq type system 0.9a

From
Alvaro Herrera
Date:
Merlin Moncure escribió:

> The latest version of libpq type system is available here:
> http://www.esilo.com/projects/postgresql/libpq/typesys-0.9a.tar.gz

This patch is not in diff -c format ... please provide a diff -c patch,
and add the URL to the wiki patch queue:
http://wiki.postgresql.org/wiki/CommitFest:March

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: libpq type system 0.9a

From
"Merlin Moncure"
Date:
On Tue, Mar 25, 2008 at 4:21 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Merlin Moncure escribió:
>  > The latest version of libpq type system is available here:
>  > http://www.esilo.com/projects/postgresql/libpq/typesys-0.9a.tar.gz
>
>  This patch is not in diff -c format ... please provide a diff -c patch,
>  and add the URL to the wiki patch queue:
>  http://wiki.postgresql.org/wiki/CommitFest:March

converted to context diff (0.9b):
http://www.esilo.com/projects/postgresql/libpq/typesys-0.9b.tar.gz

will have wiki set up soon...need to get an account over there.

merlin

Re: libpq type system 0.9a

From
Alvaro Herrera
Date:
Merlin Moncure escribió:
> Yesterday, we notified -hackers of the latest version of the libpq
> type system.  Just to be sure the right people are getting notified,
> we are posting the latest patch here as well.  Would love to get some
> feedback on this.

I had a look at this patch some days ago, and the first question in my
mind was: why is it explicitly in libpq?  Why not have it as a separate
library (say libpqtypes)?  That way, applications not using it would not
need to link to it.  Applications interested in using it would just need
to add another -l switch to their link line.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: libpq type system 0.9a

From
Joe Conway
Date:
Alvaro Herrera wrote:
> Merlin Moncure escribió:
>> Yesterday, we notified -hackers of the latest version of the libpq
>> type system.  Just to be sure the right people are getting notified,
>> we are posting the latest patch here as well.  Would love to get some
>> feedback on this.
>
> I had a look at this patch some days ago, and the first question in my
> mind was: why is it explicitly in libpq?  Why not have it as a separate
> library (say libpqtypes)?  That way, applications not using it would not
> need to link to it.  Applications interested in using it would just need
> to add another -l switch to their link line.
>

+1

Joe


Re: libpq type system 0.9a

From
Andrew Chernow
Date:
Joe Conway wrote:
> Alvaro Herrera wrote:
>> Merlin Moncure escribió:
>>> Yesterday, we notified -hackers of the latest version of the libpq
>>> type system.  Just to be sure the right people are getting notified,
>>> we are posting the latest patch here as well.  Would love to get some
>>> feedback on this.
>>
>> I had a look at this patch some days ago, and the first question in my
>> mind was: why is it explicitly in libpq?  Why not have it as a separate
>> library (say libpqtypes)?  That way, applications not using it would not
>> need to link to it.  Applications interested in using it would just need
>> to add another -l switch to their link line.
>>
>
> +1
>
> Joe
>
>

What is gained by having a separate library?  Our changes don't bloat the
library size, so I'm curious what the benefits of not linking with it are.  If
someone doesn't want to use it, they don't have to.  Similar to the backend, there
is stuff in there I personally don't use (like geo types), but I'm not sure that
justifies a link option -lgeotypes.

The changes we made are closely tied to libpq's functionality: we added PQputf to
simplify the parameterized API, added PQgetf to complement PQgetvalue, and added
the ability to register user-defined type handlers (used by putf and getf).
PQgetf makes extensive use of PGresult's internal API, especially for arrays and
composites.  Breaking this into a separate library would require an external
library to access the private internals of libpq.

Personally, I am not really in favor of this idea because it breaks apart code
that is closely related.  It is doable, though.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

Re: libpq type system 0.9a

From
Andrew Chernow
Date:
Andrew Chernow wrote:
> Joe Conway wrote:
>> Alvaro Herrera wrote:
>>> Merlin Moncure escribió:
>>>> Yesterday, we notified -hackers of the latest version of the libpq
>>>> type system.  Just to be sure the right people are getting notified,
>>>> we are posting the latest patch here as well.  Would love to get some
>>>> feedback on this.
>>>
>>> I had a look at this patch some days ago, and the first question in my
>>> mind was: why is it explicitly in libpq?  Why not have it as a separate
>>> library (say libpqtypes)?  That way, applications not using it would not
>>> need to link to it.  Applications interested in using it would just need
>>> to add another -l switch to their link line.
>>>
>>
>> +1
>>
>> Joe
>>
>>
>
> What is gained by having a separate library?  Our changes don't bloat
> the library size so I'm curious what the benefits are to not linking
> with it?  If someone doesn't want to use it, they don't have to.  Similar
> to the backend, there is stuff in there I personally don't use (like geo
> types), but I'm not sure that justifies a link option -lgeotypes.
>
> The changes we made are closely tied to libpq's functionality.  Adding
> PQputf to simplify the parameterized API, adding PQgetf to complement
> PQgetvalue and added the ability to register user-defined type handlers
> (used by putf and getf). PQgetf makes extensive use of PGresult's
> internal API, especially for arrays and composites.  Breaking this into
> a separate library would require an external library to access the
> private internals of libpq.
>
> Personally, I am not really in favor of this idea because it breaks
> apart code that is very related.  Although, it is doable.
>

I poked around to see how this would work.  There are some problems.

1. members were added to PGconn so connection-based type handler information can
be copied to PGparam and PGresult objects.

2. members were added to PGresult, referenced in #1.  To properly getf values,
the connection-based type handler information must be available to PGresult.
Otherwise, PQgetf would require an additional argument, a PGconn, which may have
been closed already.

3. PQfinish calls pqClearTypeHandler to free type info assigned to the PGconn.

4. PQclear also calls pqClearTypeHandlers.

It would also remove some of the simplicity.  Creating a connection would no
longer initialize type info, which gets copied to PGparam and PGresult.  Type
info includes a list of built-in handlers and backend config, like
integer_datetimes, server-version, etc...  That means an additional function
must be called after PQconnectdb.  But where would the type info be stored?  It
wouldn't exist in PGconn anymore.  Also, this would require double frees.  You
would have to free the result as well as the type info since they are no longer
one object.  The same holds true for a PGconn.
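
The ownership argument can be sketched with toy structures.  None of the names
below come from libpq or the patch; they are hypothetical and only model the
idea that a result carries its own copy of the connection's type info, so it
stays usable after the connection is gone and a single call frees everything.

```c
#include <stdlib.h>

/* Hypothetical per-connection type configuration. */
typedef struct {
    int integer_datetimes;
    int server_version;
} toy_typeinfo;

/* Hypothetical connection and result objects that embed the type info. */
typedef struct { toy_typeinfo types; } toy_conn;
typedef struct { toy_typeinfo types; int ntuples; } toy_result;

/* Build a result that owns a copy of the connection's type info. */
toy_result *toy_make_result(const toy_conn *conn, int ntuples)
{
    toy_result *res = malloc(sizeof *res);

    if (res == NULL)
        return NULL;
    res->types = conn->types;   /* copied, not referenced */
    res->ntuples = ntuples;
    return res;
}

/* One call releases the result together with its type info. */
void toy_clear(toy_result *res)
{
    free(res);
}
```

If the type info lived in a separate object instead, the caller would need a
second handle and a second free for every result and every connection.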

There is something elegant about not requiring additional API calls to perform a
putf or getf.  It'll just work if you want to use it.  You can use PQgetf on a
result returned by PQexec, and you can use PQputf and PQparamExec followed by
PQgetvalue.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

Re: libpq type system 0.9a

From
"Merlin Moncure"
Date:
On Fri, Apr 4, 2008 at 6:56 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Merlin Moncure escribió:
>
> > Yesterday, we notified -hackers of the latest version of the libpq
>  > type system.  Just to be sure the right people are getting notified,
>  > we are posting the latest patch here as well.  Would love to get some
>  > feedback on this.
>
>  I had a look at this patch some days ago, and the first question in my
>  mind was: why is it explicitly in libpq?  Why not have it as a separate
>  library (say libpqtypes)?  That way, applications not using it would not
>  need to link to it.  Applications interested in using it would just need
>  to add another -l switch to their link line.

I think that is oversimplifying things a little bit.  As Andrew stated,
there are some aspects of the type system that would not be so easily
abstracted out into a separate library.  The type handlers themselves
could be moved out...but since they are basically hardcoded in the
server (for the built-in types), why not do it in the client as well?
The libpq type system was deliberately designed so that user types
could be 'plugged in'.

I think a reasonable objective would be to organize the types a little
bit better in both the client and the server so there would be more
code reuse.  We would support this change, but it would require some
changes to the server as well.

OTOH, we are proposing to extend the libpq interface.  IMO, breaking
the libpq interface to separate libraries would only cause confusion.

merlin