Re: Libpq support for precision and scale - Mailing list pgsql-hackers

From Fernando Nasser
Subject Re: Libpq support for precision and scale
Date
Msg-id 3C88153A.6077EBE4@redhat.com
Whole thread Raw
In response to Re: Libpq support for precision and scale  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Libpq support for precision and scale
Re: Libpq support for precision and scale
Re: Libpq support for precision and scale
List pgsql-hackers
Bruce Momjian wrote:
> 
> I have reviewed this patch and it clearly has features I would like to get
> into 7.3.  We have been pushing too much type knowledge into apps and
> this will give people a libpq solution that we can manage.  Here are my
> comments.
> 

We definitely want this to go into 7.3.  I am planning to update
this patch next week.


> > > These seem okay, but I don't like the API detail that "0 is returned if
> > > information is not available".  0 is a valid result, at least for
> > > PQfscale.  I would recommend returning -1.  If you really want to
> > > distinguish bad parameters from non-numeric datatype, then return -1
> > > and -2 for those two cases.
> > >
> >
> > This seems to be the libpq convention.  On calls such as PQfsize and
> > PQfmod, for instance, zero is a valid result and is also returned if
> > the information is not available.
> >
> > Please note that we did not make this convention -- our original version
> > did return -1.  But we decided that following a different rule for these
> > two routines was even more confusing.  And changing the return convention
> > for the whole set of functions at this point seems out of the question.
> >
> > P.S.: Maybe whoever originally designed the libpq interface was trying
> > to accomplish some sort of "soft fail" by returning zero.  Just a guess
> > of course.
> 
> I think the problem stems from the fact that some of our functions
> legitimately can return -1, so zero was chosen as a failure code, while
> others use -1 for failure.  In fact, Tom mentioned that there are now
> some types that have a valid atttypmod of 0 (timestamp?) meaning we may
> have a problem there anyway.  Any ideas on how to fix it?
> 

We have agreed to change the error return code to -2.  It will be in the
REPOST of the patch next week.

> In hindsight, we should have defined a macro equal to -2 and report that
> as the failure return for all functions that need it.
> 

Note that -2 is a valid result for some other functions :-(

There is no way of picking a value that works for all.  Maybe these
functions should just return the value and set some global
'libpqerr' variable that would have to be checked to confirm the
result was valid.  Anyway, it is too late for that now, as backward
compatibility makes it difficult to change the API that much.


> > > > Most programs won't need this information and may not be willing
> > > > to pay the overhead for metadata retrieval. Thus, we added
> > > > an alternative function to be used instead of PQexec if one
> > > > wishes extra metadata to be retrieved along with the query
> > > > results:
> > >
> > > > PGresult *PQexecIncludeMetadata(PGconn *conn, const char *query);
> > >
> > > This strikes me as very ugly, and unnecessary, and inefficient since
> > > it retrieves metadata for all columns even though the client might
> > > only need to know about some of them.
> >
> > This part I would not worry about.  The new routines are for result sets
> > (not arbitrary columns) so the fields present in it have already been
> > pre-selected.  Also, this kind of information is useful for tools as
> > they don't know beforehand what the fields will be.  In all cases
> > we can think of, the tool will always want metadata about all the
> > fields.
> 
> I hesitate to add another PQexec function.  That could complicate the
> API.
> 

We have agreed to add another call to set a flag for including the
metadata on the PQexec call (which would make it work like the
PQexecIncludeMetadata described above).  It will be in the REPOST patch.

Question: should it affect only the next PQexec(), or should we require
the user to reset it?

What should we call it?  I am thinking of PQsetIncludeMetadata().
Name suggestions for this call are welcome.

> > > An even worse problem is that
> > > it'll fail entirely with a multi-query query string.
> > >
> >
> > This is a bummer.  But I see no solution for this besides documenting
> > the restriction in the manual.  If I am not mistaken we already have
> > the limitation of returning just the last result anyway (we just
> > collect the error messages).
> >
> >
> > > What I think would be cleaner would be to do the metadata queries
> > > on-the-fly as needed.  With the caching that you already have in there,
> > > on-the-fly queries wouldn't be any less efficient.
> > >
> > > But to do a metadata query we must have access to the connection.
> > > We could handle it two ways:
> > >
> > > 1. Add a PGconn parameter to the querying functions.
> > >
> >
> > The problem is that results may be kept longer than connections
> > (see below).  The current solution did not require the connection
> > as the metadata is for the result set, not tables.
> >
> > The PGconn parameter would be reasonable for retrieving metadata
> > about table columns, for instance.
> 
> I think this is the way to go.  We just require the connection be valid.
> If it isn't, we throw an error.  I don't see this as a major restriction.
> In fact, it would be interesting to just call this function
> automatically when someone requests type info.
> 

Sorry, but we disagree on this one.  The metadata is related to (part
of) a result, which is a different object with a different life span.
There is no way to be certain that the connection is still around,
and no reliable way of testing for it.  If the conn field is a
dangling pointer, any check based on it depends on that heap memory
not having been reallocated already.  Well, we could keep a list of
results associated with a connection and clear the conn pointers in it
_if_ the user calls PQfinish() properly.  A little bit dangerous IMO.

I would stick with Tom Lane's decision to deprecate the conn field and
leave the metadata independent of it.


> > > 2. Make use of the PGconn link that's stored in PGresults, and
> > > specify that these functions can only be used on PGresults that
> > > came from a still-open connection.
> > >
> >
> > That field has been deprecated (see comments in the source code)
> > because a result may be kept even after the connection is closed.
> >
> >
> > > I think I prefer the first, since it makes it more visible to the
> > > programmer that queries may get executed.  But it's a judgment call
> > > probably; I could see an argument for the second as well.  Any comments,
> > > anyone?
> > >
> >
> > It would have to be the former (to avoid the stale pointer problem).
> >
> > But requiring a connection adds a restriction to the use of this info
> > and makes it have a different life span than the object it refers to
> > (a PGresult), which is very weird.
> 
> Yes, but how often is this going to happen?  If we can throw a reliable
> error message when it happens, it seems quite safe.  "If you are going to
> get type info, keep the connection open so we can get it."
> 

There is no reliable way of detecting this error (see above).


> > > > The PQftypename function returns the internal PostgreSQL type name.
> > > > As some programs may prefer something more user friendly than the
> > > > internal type names, we've thrown in a conversion routine as well:
> > > > char *PQtypeint2ext(const char *intname);
> > > > This routine converts from the internal type name to a more user
> > > > friendly type name convention.
> > >
> > > This seems poorly designed.  Pass it the type OID and typmod, both of
> > > which are readily available from a PQresult without extra computation.
> > > That will let you call the backend's format_type ... of course you'll
> > > need a PGconn too for that.
> > >
> >
> > Requiring the PGconn is bad. But we still could have a PQFtypeExt()
> > returning the "external" type if people prefer it that way.
> > We thought that this should be kept as an explicit conversion
> > operation to make clear the distinction of what the backend knows
> > about and this outside world view of things.
> 
> If they want more info about the result set, keeping the connection open
> so we can get that information seems perfectly logical.  If we put it in
> the manual in its own section as MetaData functions, and mention they
> need a valid connection to work, I think it will be clear to everyone.
> 

This may be possible for this specific conversion routine.  The
advantage is that we don't need to keep this translation in the
clients (so we don't have to track changes, etc.).  I will take a look
into this possibility.  We would have conn as a parameter for this
call, though (we will not use the dangling pointer inside the result).

-- 
Fernando Nasser
Red Hat Canada Ltd.                     E-Mail:  fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9

