Thread: Gripe: bytea_output default => data corruption
Defaulting bytea output from the backend to use hex mode encoding, which is incompatible with pre-9.0 interfaces, wasn't a friendly thing to do. The default should have been escape mode. Or else you needed a big warning in HISTORY that we must either change bytea_output, or upgrade all clients before servers. Because using a 9.0 server with a 8.x libpq-based client results in undetected data corruption when selecting BYTEA objects. By default, the 9.0 server encodes a bytea using hex mode, but an 8.x libpq-based client will decode that using escape mode, with no error detected on either end. For example, start with "A", encode to "\x40" decode to "x40". There are good reasons to break backward compatibility, like security or standards compliance, but not performance. Please think twice next time you consider breaking stuff just because you think the new way should be faster.
On 13/10/2010 01:37, ljb wrote: > Defaulting bytea output from the backend to use hex mode encoding, which is > incompatible with pre-9.0 interfaces, wasn't a friendly thing to do. The > default should have been escape mode. Or else you needed a big warning in > HISTORY that we must either change bytea_output, or upgrade all clients > before servers. Because using a 9.0 server with a 8.x libpq-based client > results in undetected data corruption when selecting BYTEA objects. > > By default, the 9.0 server encodes a bytea using hex mode, but an 8.x > libpq-based client will decode that using escape mode, with no error detected > on either end. For example, start with "A", encode to "\x40" decode to "x40". > > There are good reasons to break backward compatibility, like security or > standards compliance, but not performance. Please think twice next time you > consider breaking stuff just because you think the new way should be faster. In fairness, it *is* flagged in the release note - it's the first item under "data types" in the list of incompatibilities. Ray. -- Raymond O'Donnell :: Galway :: Ireland rod@iol.ie
Hey all,
And it is 9.0, i.e another major version. IMO the higher major versions not
necessarily must be 100% backward compatible. So, think twice next time
you update the major version.
--
// Dmitriy.
And it is 9.0, i.e another major version. IMO the higher major versions not
necessarily must be 100% backward compatible. So, think twice next time
you update the major version.
2010/10/13 Raymond O'Donnell <rod@iol.ie>
On 13/10/2010 01:37, ljb wrote:In fairness, it *is* flagged in the release note - it's the first item under "data types" in the list of incompatibilities.Defaulting bytea output from the backend to use hex mode encoding, which is
incompatible with pre-9.0 interfaces, wasn't a friendly thing to do. The
default should have been escape mode. Or else you needed a big warning in
HISTORY that we must either change bytea_output, or upgrade all clients
before servers. Because using a 9.0 server with a 8.x libpq-based client
results in undetected data corruption when selecting BYTEA objects.
By default, the 9.0 server encodes a bytea using hex mode, but an 8.x
libpq-based client will decode that using escape mode, with no error detected
on either end. For example, start with "A", encode to "\x40" decode to "x40".
There are good reasons to break backward compatibility, like security or
standards compliance, but not performance. Please think twice next time you
consider breaking stuff just because you think the new way should be faster.
Ray.
--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
// Dmitriy.
rod@iol.ie wrote: >... > In fairness, it *is* flagged in the release note - it's the first item > under "data types" in the list of incompatibilities. Quote: "bytea output now appears in hex format by default (Peter Eisentraut) The server parameter bytea_output can be used to select the traditional output format if needed for compatibility." This is inadequate, because it fails to warn that pre-9.0 clients will decode the data incorrectly without reporting an error.
On 13/10/2010 22:03, ljb wrote: > rod@iol.ie wrote: >> ... >> In fairness, it *is* flagged in the release note - it's the first item >> under "data types" in the list of incompatibilities. > > Quote: > "bytea output now appears in hex format by default (Peter Eisentraut) > The server parameter bytea_output can be used to select the > traditional output format if needed for compatibility." > > This is inadequate, because it fails to warn that pre-9.0 clients will > decode the data incorrectly without reporting an error. I thought myself that the word "incompatibility" was fairly self-explanatory... :-) Ray. -- Raymond O'Donnell :: Galway :: Ireland rod@iol.ie
Dear Griper!, fortunately someone showed a easy 'fix' ALTER DATABASE foo SET bytea_output='escape' ; Regds Rajesh Kumar Mallah. On Wed, Oct 13, 2010 at 5:03 PM, ljb <ljb9832@pobox.com> wrote: > rod@iol.ie wrote: >>... >> In fairness, it *is* flagged in the release note - it's the first item >> under "data types" in the list of incompatibilities. > > Quote: > "bytea output now appears in hex format by default (Peter Eisentraut) > The server parameter bytea_output can be used to select the > traditional output format if needed for compatibility." > > This is inadequate, because it fails to warn that pre-9.0 clients will > decode the data incorrectly without reporting an error. > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
Rajesh Kumar Mallah wrote: > Dear Griper!, > > fortunately someone showed a easy 'fix' > > ALTER DATABASE foo SET bytea_output='escape' ; Yes, we mentioned that setting in the release notes too: E.2.3.5. Data Types * Allow bytea values to be written in hex notation (Peter Eisentraut) The server parameter bytea_output controls whether hex or traditional format is used for bytea output. Libpq's PQescapeByteaConn() function automatically uses the hex format when connected to PostgreSQL 9.0 or newer servers. The new hex format will be directly compatible with more applications that use binary data, allowing them to store and retrieve it without extra conversion. It is also significantly faster to read and write than the traditional format. While the "Incompatibilities" section mentions only the first paragraph, this remention lower down has even more details. Not sure what else you wanted us to do. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
bruce@momjian.us wrote: >... > Yes, we mentioned that setting in the release notes too: >... > While the "Incompatibilities" section mentions only the first paragraph, > this remention lower down has even more details. Not sure what else you > wanted us to do. Here's how I would have written that first paragraph. My addition starts at "Libpq's PQunescapeBytea() function..." | * Allow bytea values to be written in hex notation (Peter Eisentraut) | | The server parameter bytea_output controls whether hex or | traditional format is used for bytea output. Libpq's PQescapeByteaConn() | function automatically uses the hex format when connected to PostgreSQL | 9.0 or newer servers. Libpq's PQunescapeBytea() function from version 9.0 | and newer will properly decode both hex and traditional format. However, | in versions of Libpq older than 9.0, the PQunescapeByte() function can only | decode traditional format, and will corrupt bytea data received in hex | format without reporting an error. To avoid loss of data, you must either | upgrade all clients to 9.0.x, or set the server's bytea_output parameter | to 'escape'. Again: My complaint is that pre-9.0 libpq-based clients mis-decode the new default hex format bytea data without reporting an error, and this danger is insufficiently documented in the release notes. Speaking of documentation, go read the 9.0.x reference manual sections for Libpq's PQescapeByteaConn() and PQunescapeBytea(). These descriptions of escaping and unescaping are incorrect for 9.0, which can add to any confusion.
On Thu, Oct 21, 2010 at 7:12 PM, ljb <ljb9832@pobox.com> wrote:
Again: My complaint is that pre-9.0 libpq-based clients mis-decode the new
default hex format bytea data without reporting an error, and this danger is
insufficiently documented in the release notes.
I had some hoops thru which I had to jump to make our app compatible with both 8.x and 9.x so we could safely migrate our servers without having to coordinate a code push. It wasn't that bad, but part of the problem is that the DBD::Pg driver does not understand the new format... but it does unescape the "\x" marker for me to a simple "x", since that follows the traditional un-escaping rules for values returned from Pg. I don't like overriding defaults in the DB settings unless I *really* have to.
So my code now looks something like this:
if (substr($value,2,100) =~ m/^[0-9a-f]+$/) {
# hex coding of bytea from Postgres 9.0+
$self->log_debug('bytea hex decode');
# remove the leading \x. DBD::Pg descapes \x to x, so just remove x
$value =~ s/^[^0-9a-f]+//;
$value = pack('H*',$value); # convert hex to bytes
} else {
$self->log_debug('bytea escape decoded');
# Postgres < 9.0 encode auto handled by DBD::Pg
}
Unfortunately, Greg is saying that he has no time right now to release an updated DBD::Pg despite the fact that the code is written and merged into the development source tree... so it may be a while before perl people are happy. Luckily my above code will work even when DBD::Pg learns to do auto-escaping of bytea new format.
At least the current $dbh->quote() method still seems to create acceptable escaping for postgres 9.0 when you tell it you have a bytea field type.
vivek@khera.org wrote: >... > I had some hoops thru which I had to jump to make our app compatible with > both 8.x and 9.x so we could safely migrate our servers without having to > coordinate a code push. It wasn't that bad, but part of the problem is that > the DBD::Pg driver does not understand the new format... but it does > unescape the "\x" marker for me to a simple "x", since that follows the > traditional un-escaping rules for values returned from Pg. I don't like > overriding defaults in the DB settings unless I *really* have to. Although DBD::Pg uses libpq, linking it with the PostgreSQL-9.0 libpq doesn't fix the escape/unescape problem. (I'm sure you already know this.) That's because DBD::Pg implements its own string and bytea escape and unescape functions, for some reason, and does not use PQescapeStringConn etc. (I wonder if that makes it vulnerable to the security problems fixed 8.1.4?) I probably would have just gone ahead and deployed 9.0 servers with "bytea_output='escape'" in their postgresql.conf files. Permanent work-around.