Re: quoting psql varible as identifier - Mailing list pgsql-hackers

From Robert Haas
Subject Re: quoting psql varible as identifier
Date
Msg-id 603c8f071001191124r4cb2d76fs823a42b07f8ba04b@mail.gmail.com
Whole thread Raw
In response to Re: quoting psql varible as identifier  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: quoting psql varible as identifier
List pgsql-hackers
On Tue, Jan 19, 2010 at 2:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> ... I think as
>> long as we're adding a new function, we should make it behave sanely.
>> We could even take the opportunity to go back and add a saner version
>> of PQescapeStringConn.
>
> Well, it's a bit late in the devel cycle to be inventing from scratch,
> but if we did want to do something "saner" what would it look like?
> I can see the following possibilities:
>
> * include boundary quotes (and E too in the literal case).  This would
> imply telling people they should leave whitespace around the value in
> the constructed query ... or should we insert leading/trailing spaces
> to prevent that mistake too?

Does the existing PQescapeStringConn() require E'' rather than just
''?  It isn't documented so.

I think inserting the whitespace would be overkill, but that's another
thing that's not mentioned in the present documentation and should be,
because it would not be difficult to screw it up.

> * malloc our own result value instead of expecting caller to provide
> space.  This adds another failure case (out of memory) but it's not
> really much different from OOM on the caller side.

I think that would be an anti-improvement - caller might have their
own malloc substitute, might want to stack-allocate, etc.  Plus we'd
either have to malloc the worst-case buffer size or else read the
string twice.  But see below...

> * anything else you want to change?

Instead of relying on the output buffer to be exactly 2n+1 or 2n+3 or
whatever, I suggest that we add an explicit  argument for the size of
the output buffer.  If the buffer the caller provides is too short,
ideally we should let the caller know about the error and also
indicate how much space would have been necessary (like sprintf).

> I suggest we could use PQescapeLiteral + PQescapeIdentifier as names
> if we provide a new pair of functions defined with a different API.

I like those names.  Maybe something like:

size_t PQescapeLiteral(PGconn *conn, char *to, size_t tolen, char
*from, size_t fromlen, int *error);
size_t PQescapeIdentifier(PGconn *conn, char *to, size_t tolen, char
*from, size_t fromlen, int *error);

On success, *error is 0 and size_t is the number of bytes actually
written.  On failure, *error is non-zero, conn->errorMessage is an
appropriate error, and the return value is the shortest value of tolen
adequate to hold the result, or 0 if we bombed out for some reason
other than lack of output space.  That way, if people are willing to
incur the overhead of an extra call, they can malloc() (or whatever)
exactly the right amount of space.  Or they can just size the buffer
according to the documentation and then they're safe.

Another option would be to swap the return value and *error:

int PQescapeLiteral(PGconn *conn, char *to, int tolen, char *from, int
fromlen, size_t *needed);
int PQescapeIdentifier(PGconn *conn, char *to, int tolen, char *from,
int fromlen, size_t *needed);

...Robert


pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Helping the CommitFest re PL/Perl changes
Next
From: Jeff Davis
Date:
Subject: Re: MySQL-ism help patch for psql