Thread: pg_upgrade diffs on WIndows
I realized this morning that I might have been a bit cavalier in using dos2unix to smooth away differences in the dumpfiles produced by pg_upgrade. Attached is a dump of the diff if this isn't done, with Carriage Returns printed as '*' to make them visible. As can be seen, in function bodies dump2 has the Carriage Returns doubled. I have not had time to delve into how this comes about, and I need to attend to some income-producing activity for a bit, but I'd like to get it cleaned up ASAP. We are under the hammer for 9.2, so any help other people can give on this would be appreciated. cheers andrew
Attachment
On 09/04/2012 03:09 PM, Andrew Dunstan wrote: > I realized this morning that I might have been a bit cavalier in using > dos2unix to smooth away differences in the dumpfiles produced by > pg_upgrade. Attached is a dump of the diff if this isn't done, with > Carriage Returns printed as '*' to make them visible. As can be seen, > in function bodies dump2 has the Carriage Returns doubled. I have not > had time to delve into how this comes about, and I need to attend to > some income-producing activity for a bit, but I'd like to get it > cleaned up ASAP. We are under the hammer for 9.2, so any help other > people can give on this would be appreciated. > Actually, I have the answer - it's quite simple. We just need to open the output files in binary mode when we split the dumpall file. The attached patch fixes it. I think we should backpatch the first part to 9.0. cheers andrew
Attachment
On 09/04/2012 03:44 PM, Andrew Dunstan wrote: > > On 09/04/2012 03:09 PM, Andrew Dunstan wrote: >> I realized this morning that I might have been a bit cavalier in >> using dos2unix to smooth away differences in the dumpfiles produced >> by pg_upgrade. Attached is a dump of the diff if this isn't done, >> with Carriage Returns printed as '*' to make them visible. As can be >> seen, in function bodies dump2 has the Carriage Returns doubled. I >> have not had time to delve into how this comes about, and I need to >> attend to some income-producing activity for a bit, but I'd like to >> get it cleaned up ASAP. We are under the hammer for 9.2, so any help >> other people can give on this would be appreciated. >> > > > Actually, I have the answer - it's quite simple. We just need to open > the output files in binary mode when we split the dumpall file. The > attached patch fixes it. I think we should backpatch the first part to > 9.0. > OK, nobody else has reacted. I've spoken to Bruce and he seems happy with it, although, TBH, whe I talked to him I thought I understood it and now I'm not so sure. So we have 3 possibilities: leave it as is with an error-hiding hack in the test script, apply this patch which removes the hack and applies a fix that apparently works but which confuses us a bit, or go back to generating errors. The last choice would mean I would need to turn off pg_ugrade testing on Windows pending a fix. And we have to decide pretty much now so we can get 9.2 out the door. Thoughts? cheers andrew
On Tue, Sep 4, 2012 at 08:46:53PM -0400, Andrew Dunstan wrote: > > On 09/04/2012 03:44 PM, Andrew Dunstan wrote: > > > >On 09/04/2012 03:09 PM, Andrew Dunstan wrote: > >>I realized this morning that I might have been a bit cavalier in > >>using dos2unix to smooth away differences in the dumpfiles > >>produced by pg_upgrade. Attached is a dump of the diff if this > >>isn't done, with Carriage Returns printed as '*' to make them > >>visible. As can be seen, in function bodies dump2 has the > >>Carriage Returns doubled. I have not had time to delve into how > >>this comes about, and I need to attend to some income-producing > >>activity for a bit, but I'd like to get it cleaned up ASAP. We > >>are under the hammer for 9.2, so any help other people can give > >>on this would be appreciated. > >> > > > > > >Actually, I have the answer - it's quite simple. We just need to > >open the output files in binary mode when we split the dumpall > >file. The attached patch fixes it. I think we should backpatch the > >first part to 9.0. > > > > > OK, nobody else has reacted. I've spoken to Bruce and he seems happy > with it, although, TBH, whe I talked to him I thought I understood > it and now I'm not so sure. So we have 3 possibilities: leave it as > is with an error-hiding hack in the test script, apply this patch > which removes the hack and applies a fix that apparently works but > which confuses us a bit, or go back to generating errors. The last > choice would mean I would need to turn off pg_ugrade testing on > Windows pending a fix. And we have to decide pretty much now so we > can get 9.2 out the door. I am very concerned about putting something into pg_upgrade that we don't fully understand. Adding stuff to pg_upgrade that we think we understand is risky enough, as we have seen in the pg_upgrade churn of the past week. Let's work on chat to find the complete details --- same goes for the log file change we are not sure about either. pg_upgrade is so complicated that I have learned that if we don't fully understand something, it can affect things we don't anticipate. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Tue, 2012-09-04 at 20:46 -0400, Andrew Dunstan wrote: > OK, nobody else has reacted. I've spoken to Bruce and he seems happy > with it, although, TBH, whe I talked to him I thought I understood it > and now I'm not so sure. So we have 3 possibilities: leave it as is with > an error-hiding hack in the test script, apply this patch which removes > the hack and applies a fix that apparently works but which confuses us a > bit, or go back to generating errors. The last choice would mean I would > need to turn off pg_ugrade testing on Windows pending a fix. And we have > to decide pretty much now so we can get 9.2 out the door. I think now is not the time to cram in poorly understood changes into a release candidate. There is no requirement to have the tests running now or in time for the release, seeing also that no one has been particularly bothered about it for the past 11 months.
Peter Eisentraut <peter_e@gmx.net> writes: > On Tue, 2012-09-04 at 20:46 -0400, Andrew Dunstan wrote: >> OK, nobody else has reacted. I've spoken to Bruce and he seems happy >> with it, although, TBH, whe I talked to him I thought I understood it >> and now I'm not so sure. So we have 3 possibilities: leave it as is with >> an error-hiding hack in the test script, apply this patch which removes >> the hack and applies a fix that apparently works but which confuses us a >> bit, or go back to generating errors. The last choice would mean I would >> need to turn off pg_ugrade testing on Windows pending a fix. And we have >> to decide pretty much now so we can get 9.2 out the door. > I think now is not the time to cram in poorly understood changes into a > release candidate. There is no requirement to have the tests running > now or in time for the release, seeing also that no one has been > particularly bothered about it for the past 11 months. Also, the tests *are* passing right now. I agree, let's not risk destabilizing it. pg_upgrade is way overdue for some quiet time so we can verify a full day's buildfarm cycle on it before the release wrap. regards, tom lane
On Tue, Sep 4, 2012 at 03:44:35PM -0400, Andrew Dunstan wrote: > > On 09/04/2012 03:09 PM, Andrew Dunstan wrote: > >I realized this morning that I might have been a bit cavalier in > >using dos2unix to smooth away differences in the dumpfiles > >produced by pg_upgrade. Attached is a dump of the diff if this > >isn't done, with Carriage Returns printed as '*' to make them > >visible. As can be seen, in function bodies dump2 has the Carriage > >Returns doubled. I have not had time to delve into how this comes > >about, and I need to attend to some income-producing activity for > >a bit, but I'd like to get it cleaned up ASAP. We are under the > >hammer for 9.2, so any help other people can give on this would be > >appreciated. > > > > > Actually, I have the answer - it's quite simple. We just need to > open the output files in binary mode when we split the dumpall file. > The attached patch fixes it. I think we should backpatch the first > part to 9.0. > diff --git a/contrib/pg_upgrade/dump.c b/contrib/pg_upgrade/dump.c > index b905ab0..0a96dde 100644 > --- a/contrib/pg_upgrade/dump.c > +++ b/contrib/pg_upgrade/dump.c > @@ -62,10 +62,10 @@ split_old_dump(void) > if ((all_dump = fopen(filename, "r")) == NULL) > pg_log(PG_FATAL, "Could not open dump file \"%s\": %s\n", filename, getErrorText(errno)); > snprintf(filename, sizeof(filename), "%s", GLOBALS_DUMP_FILE); > - if ((globals_dump = fopen_priv(filename, "w")) == NULL) > + if ((globals_dump = fopen_priv(filename, PG_BINARY_W)) == NULL) > pg_log(PG_FATAL, "Could not write to dump file \"%s\": %s\n", filename, getErrorText(errno)); > snprintf(filename, sizeof(filename), "%s", DB_DUMP_FILE); > - if ((db_dump = fopen_priv(filename, "w")) == NULL) > + if ((db_dump = fopen_priv(filename, PG_BINARY_W)) == NULL) > pg_log(PG_FATAL, "Could not write to dump file \"%s\": %s\n", filename, getErrorText(errno)); > > current_output = globals_dump; > diff --git a/contrib/pg_upgrade/test.sh b/contrib/pg_upgrade/test.sh > index d411ac6..3899600 100644 > --- a/contrib/pg_upgrade/test.sh > +++ b/contrib/pg_upgrade/test.sh > @@ -128,10 +128,6 @@ else > sh ./delete_old_cluster.sh > fi > > -if [ $testhost = Msys ] ; then > - dos2unix "$temp_root"/dump1.sql "$temp_root"/dump2.sql > -fi > - > if diff -q "$temp_root"/dump1.sql "$temp_root"/dump2.sql; then > echo PASSED > exit 0 I reviewed this idea and supports this patch's inclusion in 9.2. I was unclear why it was needed, but I see pg_dumpall, which is the file pg_upgrade splits apart, as also using binary mode to write this file: OPF = fopen(filename, PG_BINARY_W); I agree with Tom that pg_upgrade needs some quiet time. ;-) Andrew, have a sufficient number of buildfarm members verified our recent patches that this can be added. My patch from last night was mostly C comments so isn't something that needs testing. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 09/05/2012 09:11 AM, Bruce Momjian wrote: > > I reviewed this idea and supports this patch's inclusion in 9.2. I was > unclear why it was needed, but I see pg_dumpall, which is the file > pg_upgrade splits apart, as also using binary mode to write this file: > > OPF = fopen(filename, PG_BINARY_W); > > I agree with Tom that pg_upgrade needs some quiet time. ;-) Andrew, > have a sufficient number of buildfarm members verified our recent > patches that this can be added. My patch from last night was mostly C > comments so isn't something that needs testing. I am quite happy not committing anything for now. There are two buildfarm members doing pg_upgrade tests: crake (Fedora 16) and pitta (Windows/Mingw64). The buildfarm code is experimental and not in any release yet, and when it is the test will be optional. The PG_BINARY_W change has only been verified on a non-buildfarm setup on my laptop (Mingw) Note that while it does look like there's a bug either in pg_upgrade or pg_dumpall, it's probably mostly harmless (adding some spurious CRs to function code bodies on Windows). I'd feel happier if it didn't, and happier still if I knew for sure the ultimate origin. Your pg_dumpall discovery above is interesting. I might have time later on today to delve into all this. I'm out of contact for the next few hours. cheers andrew
On 09/05/2012 09:46 AM, Andrew Dunstan wrote: > > On 09/05/2012 09:11 AM, Bruce Momjian wrote: >> >> I reviewed this idea and supports this patch's inclusion in 9.2. I was >> unclear why it was needed, but I see pg_dumpall, which is the file >> pg_upgrade splits apart, as also using binary mode to write this file: >> >> OPF = fopen(filename, PG_BINARY_W); >> >> I agree with Tom that pg_upgrade needs some quiet time. ;-) Andrew, >> have a sufficient number of buildfarm members verified our recent >> patches that this can be added. My patch from last night was mostly C >> comments so isn't something that needs testing. > > > I am quite happy not committing anything for now. > > There are two buildfarm members doing pg_upgrade tests: crake (Fedora > 16) and pitta (Windows/Mingw64). The buildfarm code is experimental > and not in any release yet, and when it is the test will be optional. > > The PG_BINARY_W change has only been verified on a non-buildfarm setup > on my laptop (Mingw) > > Note that while it does look like there's a bug either in pg_upgrade > or pg_dumpall, it's probably mostly harmless (adding some spurious CRs > to function code bodies on Windows). I'd feel happier if it didn't, > and happier still if I knew for sure the ultimate origin. Your > pg_dumpall discovery above is interesting. I might have time later on > today to delve into all this. I'm out of contact for the next few hours. OK, I now have a complete handle on what's going on here, and withdraw my earlier statement that I am confused on this issue :-) First, one lot of CRs is produced because the pg_upgrade test script calls pg_dumpall without -f and redirects that to a file, which Windows kindly opens on text mode. The solution to that is to change the test script to use pg_dumpall -f instead. The second lot of CRs (seen in the second dump file in the diff i previously sent) is produced by pg_upgrade writing its output in text mode, which turns LF into CRLF. The solution to that is the patch to dump.c I posted, which, as Bruce observed, does the same thing that pg_dumpall does. Arguably, it should also open the input file in binary, so that if there really is a CRLF in the dump it won't be eaten. Another question is whether or not pg_dumpall (and pg_dump in text mode too for that matter) should be trying to suppress newline translation on its output even to stdout. It already does that for non-text formats (see call to setmode()) but I don't see why we shouldn't for text as well. But those are obviously longstanding bugs that we can leave to another day. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > OK, I now have a complete handle on what's going on here, and withdraw > my earlier statement that I am confused on this issue :-) > First, one lot of CRs is produced because the pg_upgrade test script > calls pg_dumpall without -f and redirects that to a file, which Windows > kindly opens on text mode. The solution to that is to change the test > script to use pg_dumpall -f instead. > The second lot of CRs (seen in the second dump file in the diff i > previously sent) is produced by pg_upgrade writing its output in text > mode, which turns LF into CRLF. The solution to that is the patch to > dump.c I posted, which, as Bruce observed, does the same thing that > pg_dumpall does. Arguably, it should also open the input file in binary, > so that if there really is a CRLF in the dump it won't be eaten. +1 to all the above. Do we want to risk squeezing this into 9.2.0, or is it better to delay? > Another question is whether or not pg_dumpall (and pg_dump in text mode > too for that matter) should be trying to suppress newline translation on > its output even to stdout. I'm inclined to think not - we've not heard any complaints from Windows users about its current behavior, and it's been like that forever. regards, tom lane
On Wed, Sep 5, 2012 at 03:17:40PM -0400, Andrew Dunstan wrote: > >The PG_BINARY_W change has only been verified on a non-buildfarm > >setup on my laptop (Mingw) > > > >Note that while it does look like there's a bug either in > >pg_upgrade or pg_dumpall, it's probably mostly harmless (adding > >some spurious CRs to function code bodies on Windows). I'd feel > >happier if it didn't, and happier still if I knew for sure the > >ultimate origin. Your pg_dumpall discovery above is interesting. I > >might have time later on today to delve into all this. I'm out of > >contact for the next few hours. > > > OK, I now have a complete handle on what's going on here, and > withdraw my earlier statement that I am confused on this issue :-) > > First, one lot of CRs is produced because the pg_upgrade test script > calls pg_dumpall without -f and redirects that to a file, which > Windows kindly opens on text mode. The solution to that is to change > the test script to use pg_dumpall -f instead. > > The second lot of CRs (seen in the second dump file in the diff i > previously sent) is produced by pg_upgrade writing its output in > text mode, which turns LF into CRLF. The solution to that is the > patch to dump.c I posted, which, as Bruce observed, does the same > thing that pg_dumpall does. Arguably, it should also open the input > file in binary, so that if there really is a CRLF in the dump it > won't be eaten. So, right now we are only add \r for function bodies, which is mostly harmless, but what if a function body has strings with an embedded newlines? What about creating a table with newlines in its identifiers: CREATE TABLE "a b" ("c d" int); If \r is added in there, it would be a data corruption problem. Can you test that? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 09/05/2012 03:36 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> OK, I now have a complete handle on what's going on here, and withdraw >> my earlier statement that I am confused on this issue :-) >> First, one lot of CRs is produced because the pg_upgrade test script >> calls pg_dumpall without -f and redirects that to a file, which Windows >> kindly opens on text mode. The solution to that is to change the test >> script to use pg_dumpall -f instead. >> The second lot of CRs (seen in the second dump file in the diff i >> previously sent) is produced by pg_upgrade writing its output in text >> mode, which turns LF into CRLF. The solution to that is the patch to >> dump.c I posted, which, as Bruce observed, does the same thing that >> pg_dumpall does. Arguably, it should also open the input file in binary, >> so that if there really is a CRLF in the dump it won't be eaten. > +1 to all the above. Do we want to risk squeezing this into 9.2.0, > or is it better to delay? When we (particularly Bruce and I) didn't fully understand what was happening there was a good argument for delay, but now I'd rather put it in so we can remove the error-hiding hack in the test script. I think the risk is minimal. cheers andrew
On 09/05/2012 03:40 PM, Bruce Momjian wrote: > On Wed, Sep 5, 2012 at 03:17:40PM -0400, Andrew Dunstan wrote: >>> The PG_BINARY_W change has only been verified on a non-buildfarm >>> setup on my laptop (Mingw) >>> >>> Note that while it does look like there's a bug either in >>> pg_upgrade or pg_dumpall, it's probably mostly harmless (adding >>> some spurious CRs to function code bodies on Windows). I'd feel >>> happier if it didn't, and happier still if I knew for sure the >>> ultimate origin. Your pg_dumpall discovery above is interesting. I >>> might have time later on today to delve into all this. I'm out of >>> contact for the next few hours. >> >> OK, I now have a complete handle on what's going on here, and >> withdraw my earlier statement that I am confused on this issue :-) >> >> First, one lot of CRs is produced because the pg_upgrade test script >> calls pg_dumpall without -f and redirects that to a file, which >> Windows kindly opens on text mode. The solution to that is to change >> the test script to use pg_dumpall -f instead. >> >> The second lot of CRs (seen in the second dump file in the diff i >> previously sent) is produced by pg_upgrade writing its output in >> text mode, which turns LF into CRLF. The solution to that is the >> patch to dump.c I posted, which, as Bruce observed, does the same >> thing that pg_dumpall does. Arguably, it should also open the input >> file in binary, so that if there really is a CRLF in the dump it >> won't be eaten. > So, right now we are only add \r for function bodies, which is mostly > harmless, but what if a function body has strings with an embedded > newlines? What about creating a table with newlines in its identifiers: > > CREATE TABLE "a > b" ("c > d" int); > > If \r is added in there, it would be a data corruption problem. Can you > test that? These are among the reasons why I am suggesting opening the file in binary mode. You're right, that would be data corruption. I can set up a check, but it will take a bit of time. cheers andrew
On Wed, Sep 5, 2012 at 03:50:13PM -0400, Andrew Dunstan wrote: > >>The second lot of CRs (seen in the second dump file in the diff i > >>previously sent) is produced by pg_upgrade writing its output in > >>text mode, which turns LF into CRLF. The solution to that is the > >>patch to dump.c I posted, which, as Bruce observed, does the same > >>thing that pg_dumpall does. Arguably, it should also open the input > >>file in binary, so that if there really is a CRLF in the dump it > >>won't be eaten. > >So, right now we are only add \r for function bodies, which is mostly > >harmless, but what if a function body has strings with an embedded > >newlines? What about creating a table with newlines in its identifiers: > > > >CREATE TABLE "a > >b" ("c > >d" int); > > > >If \r is added in there, it would be a data corruption problem. Can you > >test that? > > These are among the reasons why I am suggesting opening the file in > binary mode. You're right, that would be data corruption. > > I can set up a check, but it will take a bit of time. My only point is that this is no longer a buildfarm failure issue, it is a potential data corruption issue. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 09/05/2012 03:50 PM, Andrew Dunstan wrote: > > On 09/05/2012 03:40 PM, Bruce Momjian wrote: >> On Wed, Sep 5, 2012 at 03:17:40PM -0400, Andrew Dunstan wrote: >>>> The PG_BINARY_W change has only been verified on a non-buildfarm >>>> setup on my laptop (Mingw) >>>> >>>> Note that while it does look like there's a bug either in >>>> pg_upgrade or pg_dumpall, it's probably mostly harmless (adding >>>> some spurious CRs to function code bodies on Windows). I'd feel >>>> happier if it didn't, and happier still if I knew for sure the >>>> ultimate origin. Your pg_dumpall discovery above is interesting. I >>>> might have time later on today to delve into all this. I'm out of >>>> contact for the next few hours. >>> >>> OK, I now have a complete handle on what's going on here, and >>> withdraw my earlier statement that I am confused on this issue :-) >>> >>> First, one lot of CRs is produced because the pg_upgrade test script >>> calls pg_dumpall without -f and redirects that to a file, which >>> Windows kindly opens on text mode. The solution to that is to change >>> the test script to use pg_dumpall -f instead. >>> >>> The second lot of CRs (seen in the second dump file in the diff i >>> previously sent) is produced by pg_upgrade writing its output in >>> text mode, which turns LF into CRLF. The solution to that is the >>> patch to dump.c I posted, which, as Bruce observed, does the same >>> thing that pg_dumpall does. Arguably, it should also open the input >>> file in binary, so that if there really is a CRLF in the dump it >>> won't be eaten. >> So, right now we are only add \r for function bodies, which is mostly >> harmless, but what if a function body has strings with an embedded >> newlines? What about creating a table with newlines in its identifiers: >> >> CREATE TABLE "a >> b" ("c >> d" int); >> >> If \r is added in there, it would be a data corruption problem. Can you >> test that? > > These are among the reasons why I am suggesting opening the file in > binary mode. You're right, that would be data corruption. > > I can set up a check, but it will take a bit of time. As expected, we get a difference in field names. Here's the extract from the dumps diff (* again represents CR): *************** *** 5220,5228 **** -- CREATE TABLE hasnewline ( ! "x y" integer, ! "a b" text ); --- 5220,5228 ---- -- CREATE TABLE hasnewline ( ! "x* y" integer, ! "a* b" text ); If we open the input and output files in binary mode in pg_upgrade's dump.c this disappears. Given this, I think we have no choice but to apply the patch, all the way back to 9.0 in fact. cheers andrew
On Wed, Sep 5, 2012 at 04:22:18PM -0400, Andrew Dunstan wrote: > >>So, right now we are only add \r for function bodies, which is mostly > >>harmless, but what if a function body has strings with an embedded > >>newlines? What about creating a table with newlines in its identifiers: > >> > >>CREATE TABLE "a > >>b" ("c > >>d" int); > >> > >>If \r is added in there, it would be a data corruption problem. Can you > >>test that? > > > >These are among the reasons why I am suggesting opening the file > >in binary mode. You're right, that would be data corruption. > > > >I can set up a check, but it will take a bit of time. > > > As expected, we get a difference in field names. Here's the extract > from the dumps diff (* again represents CR): > > > *************** > *** 5220,5228 **** > -- > > CREATE TABLE hasnewline ( > ! "x > y" integer, > ! "a > b" text > ); > > --- 5220,5228 ---- > -- > > CREATE TABLE hasnewline ( > ! "x* > y" integer, > ! "a* > b" text > ); > > If we open the input and output files in binary mode in pg_upgrade's > dump.c this disappears. > > Given this, I think we have no choice but to apply the patch, all > the way back to 9.0 in fact. I think you are right. I think I could use some "quite time" right now, as Tom suggested. ;-) -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +