Thread: pg_dump vs PQescapeBytea

pg_dump vs PQescapeBytea

From

Tom Lane

Date:

04 August 2009, 17:36:03

I mentioned in
http://archives.postgresql.org/message-id/21837.1248215656@sss.pgh.pa.us
that pg_dump has a bug in its handling of large objects, which is
directly related to the fact that it's still using the deprecated
function PQescapeBytea.  The reason PQescapeBytea is deprecated is
that it has to guess at the setting of standard_conforming_strings,
and it is not guessing right in this one case.

I speculated about fixing that by inventing a "PQescapeByteaHex"
function, but that was pure brain fade :-(.  The hex format for
bytea still requires one backslash in the value, so you still have
to know standard_conforming_strings to get it right.

AFAICS the only way to make this work correctly and still have libpq
do the work is to create a libpq function that takes an explicit
parameter for the value of standard_conforming_strings to assume.
(We can't pass a PQconn because we don't have one.)  That seems a tad
ugly, so I'm thinking we might just as well have pg_dump solve this
problem for itself.

Another issue related to this code is that ideally we'd like it to
use hex format for speed reasons.  The problem there is that when
writing text output in pg_dump or pg_restore, we don't know for
sure what server version the output will be loaded into.  If we
use hex format then it will be impossible to load the output into
a pre-8.5 server.  This wouldn't be the first time we've broken
backwards compatibility of dump output, of course, but it's something
that would be entirely impractical to fix by manual editing of the
dump output.  So it seems a bit worse than our average breakage.

If we aren't worried about backwards compatibility then the solution
is easy: have pg_dump emit LO data in the formE'\\x ... hex data here ...'
which is cheap to generate and will work regardless of GUC settings.
If we want it to generate backwards-compatible output then we'll
essentially end up duplicating libpq's PQescapeBytea code in pg_dump.

Thoughts, better ideas?
        regards, tom lane

Re: pg_dump vs PQescapeBytea

From

Jeff Davis

Date:

04 August 2009, 20:21:04

On Tue, 2009-08-04 at 13:35 -0400, Tom Lane wrote:
> I speculated about fixing that by inventing a "PQescapeByteaHex"
> function, but that was pure brain fade :-(.  The hex format for
> bytea still requires one backslash in the value, so you still have
> to know standard_conforming_strings to get it right.

On a related topic: right now it's impossible to safely escape (using
libpq alone) bytea data for use as a text-format argument to a
parameterized query. This is because PQescapeBytea returns a string that
is both escaped for bytea and escaped for inclusion in a SQL string (am
I wrong here?).

It's a pretty minor issue, but a little inconsistent. Is it worth
addressing?

Regards,Jeff Davis

Re: pg_dump vs PQescapeBytea

From

Tom Lane

Date:

04 August 2009, 20:28:08

Jeff Davis <pgsql@j-davis.com> writes:
> On a related topic: right now it's impossible to safely escape (using
> libpq alone) bytea data for use as a text-format argument to a
> parameterized query. This is because PQescapeBytea returns a string that
> is both escaped for bytea and escaped for inclusion in a SQL string (am
> I wrong here?).

True, but I would think that most people would prefer to use binary
format and not do any conversion/escaping at all in that situation.
I doubt it's worth producing still another variant of PQescapeBytea
for the people who insist on a text conversion for that case.  (The
hex mode already made their lives a lot easier, anyway.)
        regards, tom lane