Re: pg_upgrade diffs on WIndows - Mailing list pgsql-hackers
From | Andrew Dunstan |
---|---|
Subject | Re: pg_upgrade diffs on WIndows |
Date | |
Msg-id | 5047B47A.1020706@dunslane.net |
In response to | Re: pg_upgrade diffs on WIndows (Andrew Dunstan <andrew@dunslane.net>) |
Responses |
Re: pg_upgrade diffs on WIndows
|
List | pgsql-hackers |
On 09/05/2012 03:50 PM, Andrew Dunstan wrote:
>
> On 09/05/2012 03:40 PM, Bruce Momjian wrote:
>> On Wed, Sep 5, 2012 at 03:17:40PM -0400, Andrew Dunstan wrote:
>>>> The PG_BINARY_W change has only been verified on a non-buildfarm
>>>> setup on my laptop (Mingw)
>>>>
>>>> Note that while it does look like there's a bug either in
>>>> pg_upgrade or pg_dumpall, it's probably mostly harmless (adding
>>>> some spurious CRs to function code bodies on Windows). I'd feel
>>>> happier if it didn't, and happier still if I knew for sure the
>>>> ultimate origin. Your pg_dumpall discovery above is interesting. I
>>>> might have time later on today to delve into all this. I'm out of
>>>> contact for the next few hours.
>>>
>>> OK, I now have a complete handle on what's going on here, and
>>> withdraw my earlier statement that I am confused on this issue :-)
>>>
>>> First, one lot of CRs is produced because the pg_upgrade test script
>>> calls pg_dumpall without -f and redirects that to a file, which
>>> Windows kindly opens on text mode. The solution to that is to change
>>> the test script to use pg_dumpall -f instead.
>>>
>>> The second lot of CRs (seen in the second dump file in the diff i
>>> previously sent) is produced by pg_upgrade writing its output in
>>> text mode, which turns LF into CRLF. The solution to that is the
>>> patch to dump.c I posted, which, as Bruce observed, does the same
>>> thing that pg_dumpall does. Arguably, it should also open the input
>>> file in binary, so that if there really is a CRLF in the dump it
>>> won't be eaten.
>> So, right now we are only add \r for function bodies, which is mostly
>> harmless, but what if a function body has strings with an embedded
>> newlines? What about creating a table with newlines in its identifiers:
>>
>> CREATE TABLE "a
>> b" ("c
>> d" int);
>>
>> If \r is added in there, it would be a data corruption problem. Can you
>> test that?
>
> These are among the reasons why I am suggesting opening the file in
> binary mode. You're right, that would be data corruption.
>
> I can set up a check, but it will take a bit of time.

As expected, we get a difference in field names. Here's the extract from
the dumps diff (* again represents CR):

    ***************
    *** 5220,5228 ****
      --
      CREATE TABLE hasnewline (
    !     "x
      y" integer,
    !     "a
      b" text
      );
    --- 5220,5228 ----
      --
      CREATE TABLE hasnewline (
    !     "x*
      y" integer,
    !     "a*
      b" text
      );

If we open the input and output files in binary mode in pg_upgrade's
dump.c this disappears.

Given this, I think we have no choice but to apply the patch, all the
way back to 9.0 in fact.

cheers

andrew
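For readers following the thread, here is a minimal sketch in C of the effect described above. It is not the dump.c patch itself. The helper names simulate_text_mode_write and write_binary are hypothetical and chosen only for illustration. The first function imitates, on any platform, the LF to CRLF expansion that a Windows text-mode stream applies on output, so the change to an identifier with an embedded newline can be seen directly. The second shows the kind of fix discussed: opening the output with "wb" so the C library does no newline translation (as far as I know, PostgreSQL's PG_BINARY_W macro expands to "wb" on Windows for this purpose).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy src into dst while expanding every LF to CRLF,
 * which is what a Windows text-mode stream does when writing.
 * Returns the number of bytes stored (excluding the terminating NUL),
 * or (size_t) -1 if dst is too small.
 */
size_t
simulate_text_mode_write(const char *src, char *dst, size_t dstlen)
{
    size_t      n = 0;

    assert(src != NULL && dst != NULL && dstlen > 0);

    for (; *src != '\0'; src++)
    {
        size_t      need = (*src == '\n') ? 2 : 1;

        /* always keep one byte free for the terminating NUL */
        if (n + need >= dstlen)
            return (size_t) -1;

        if (*src == '\n')
            dst[n++] = '\r';    /* the spurious CR shown as '*' in the diff */
        dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}

/*
 * Hypothetical helper: write len bytes to path in binary mode, so that no
 * newline translation happens on any platform.  Returns 0 on success and
 * -1 on failure.
 */
int
write_binary(const char *path, const char *buf, size_t len)
{
    FILE       *fp = fopen(path, "wb");
    size_t      written;

    if (fp == NULL)
        return -1;

    written = fwrite(buf, 1, len, fp);

    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}
```

Passing the column definition "x<LF>y" integer through simulate_text_mode_write yields "x<CR><LF>y" integer, which is a different identifier as far as the server is concerned. Writing the same bytes with write_binary leaves them untouched.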