Thread: pg_dump/pg_restore seem broken on hamerkop
Buildfarm member hamerkop has been failing in the pg_upgrade regression test for the last several days. The problem looks like this: command: "C:/buildfarm/build_root/HEAD/pgsql.build/contrib/pg_upgrade/tmp_check/install/bin/pg_restore" --port 50432 --username"Administrator" --exit-on-error --verbose --dbname "postgres" "pg_upgrade_dump_12145.custom" >> "pg_upgrade_dump_12145.log"2>&1 pg_restore: connecting to database for restore pg_restore: [archiver (db)] Error while INITIALIZING: pg_restore: [archiver (db)] could not execute query: ERROR: invalid byte sequence for encoding "UTF8": 0x93 I can't help noticing that this started immediately after commit 0eea804 "pg_dump: Reduce use of global variables". No idea why the issue is only showing up on this one animal. regards, tom lane
> Buildfarm member hamerkop has been failing in the pg_upgrade regression > test for the last several days. The problem looks like this: > > command: "C:/buildfarm/build_root/HEAD/pgsql.build/contrib/pg_upgrade/tmp_check/install/bin/pg_restore" --port 50432 --username"Administrator" --exit-on-error --verbose --dbname "postgres" "pg_upgrade_dump_12145.custom" >> "pg_upgrade_dump_12145.log"2>&1 > pg_restore: connecting to database for restore > pg_restore: [archiver (db)] Error while INITIALIZING: > pg_restore: [archiver (db)] could not execute query: ERROR: invalid byte sequence for encoding "UTF8": 0x93 > > I can't help noticing that this started immediately after commit > 0eea804 "pg_dump: Reduce use of global variables". No idea why > the issue is only showing up on this one animal. I guess one of possibilities is there's garbage in memory which is related to restore the process. If so, coverity may be able to find something. The last coverity analysis was Oct 12. The commit 0eea804 was made on Oct 14. So let's see what new coverity scan reports... Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
>> Buildfarm member hamerkop has been failing in the pg_upgrade regression >> test for the last several days. The problem looks like this: >> >> command: "C:/buildfarm/build_root/HEAD/pgsql.build/contrib/pg_upgrade/tmp_check/install/bin/pg_restore" --port 50432 --username"Administrator" --exit-on-error --verbose --dbname "postgres" "pg_upgrade_dump_12145.custom" >> "pg_upgrade_dump_12145.log"2>&1 >> pg_restore: connecting to database for restore >> pg_restore: [archiver (db)] Error while INITIALIZING: >> pg_restore: [archiver (db)] could not execute query: ERROR: invalid byte sequence for encoding "UTF8": 0x93 >> >> I can't help noticing that this started immediately after commit >> 0eea804 "pg_dump: Reduce use of global variables". No idea why >> the issue is only showing up on this one animal. > > I guess one of possibilities is there's garbage in memory which is > related to restore the process. If so, coverity may be able to find > something. The last coverity analysis was Oct 12. The commit 0eea804 > was made on Oct 14. So let's see what new coverity scan reports... The latest coverity scan report did not include paticular defects related to pg_dump/pg_restore... Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Tatsuo Ishii wrote: > > Buildfarm member hamerkop has been failing in the pg_upgrade regression > > test for the last several days. The problem looks like this: > > > > command: "C:/buildfarm/build_root/HEAD/pgsql.build/contrib/pg_upgrade/tmp_check/install/bin/pg_restore" --port 50432--username "Administrator" --exit-on-error --verbose --dbname "postgres" "pg_upgrade_dump_12145.custom" >> "pg_upgrade_dump_12145.log"2>&1 > > pg_restore: connecting to database for restore > > pg_restore: [archiver (db)] Error while INITIALIZING: > > pg_restore: [archiver (db)] could not execute query: ERROR: invalid byte sequence for encoding "UTF8": 0x93 > > > > I can't help noticing that this started immediately after commit > > 0eea804 "pg_dump: Reduce use of global variables". No idea why > > the issue is only showing up on this one animal. > > I guess one of possibilities is there's garbage in memory which is > related to restore the process. That sounds most likely. The complete error in hamerkop's log is: pg_restore: connecting to database for restore pg_restore: [archiver (db)] Error while INITIALIZING: pg_restore: [archiver (db)] could not execute query: ERROR: invalid byte sequence for encoding "UTF8": 0x93 Command was:-- Started on 2014-10-26 03:06:00 “Œ‹ž (•W€Žž) This "Started on" business comes from pg_backup_archiver.c, which has if (AH->public.verbose) dumpTimestamp(AH, "Started on", AH->createDate); where dumpTimestamp is /** dumpTimestamp*/ static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim) { char buf[64]; if (strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S %z", localtime(&tim)) != 0) ahprintf(AH, "-- %s %s\n\n", msg,buf); } So this seems related to the %z part of the strftime() call. I have no explanation for this failure ATM; maybe pg_restore is failing to set the locale properly? I also notice pg_restore.c previously included pg_backup_archiver.h (which in turn includes <time.h>); strftime requires <time.h> so maybe this is causing a problem, but since pg_restore.c itself is not calling strftime, I don't see how this would be related. [Some more code and git-log reading later] I see that the %z is a very recent addition: it only got there as of commit ad5d46a449, of September 5th ... and now I also see that hamerkop's last green run before the failure, on Oct 13rd, did *not* include the pg_upgrade check. So I'm thinking this was broken much earlier than 0eea804. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > So this seems related to the %z part of the strftime() call. I have no > explanation for this failure ATM; maybe pg_restore is failing to set the > locale properly? I also notice pg_restore.c previously included > pg_backup_archiver.h (which in turn includes <time.h>); strftime > requires <time.h> so maybe this is causing a problem, but since > pg_restore.c itself is not calling strftime, I don't see how this would > be related. Hm. %z ought not be locale-dependent ... however, it has a bigger problem, which is that it's a C99-ism. It's not there in SUSv2, which is our normal baseline for what's portable. I think we need to get rid of that. %Z should be portable. (Is it possible that Windows' strftime() reads %z as doing something other than what C99 says?) > [Some more code and git-log reading later] I see that the %z is a very > recent addition: it only got there as of commit ad5d46a449, of September > 5th ... and now I also see that hamerkop's last green run before the > failure, on Oct 13rd, did *not* include the pg_upgrade check. So I'm > thinking this was broken much earlier than 0eea804. Ooohh ... you are right, the first failing build involved not only the pg_dump refactoring commit, but an upgrade in the buildfarm script that hamerkop was using (from 4.4 to 4.14). So it's entirely possible this issue was already there and we just weren't seeing it tested. regards, tom lane
I wrote: > Hm. %z ought not be locale-dependent ... however, it has a bigger > problem, which is that it's a C99-ism. It's not there in SUSv2, > which is our normal baseline for what's portable. I think we need > to get rid of that. %Z should be portable. > (Is it possible that Windows' strftime() reads %z as doing something > other than what C99 says?) A bit of googling leads me to Microsoft reference material saying that their strftime treats %z and %Z alike. So in point of fact, the assumption underlying commit ad5d46a4494b0b48 was flat out wrong. Switching to %z doesn't get you out of the problem noted in the comments it removed: /* * We don't print the timezone on Win32, because the names are long and * localized, which means they may containcharacters in various random * encodings; this has been seen to cause encoding errors when reading the * dumpscript. */ I'm going to go revert most of that commit and make the code like it was before: if (strftime(buf, sizeof(buf), #ifndef WIN32 "%Y-%m-%d %H:%M:%S %Z", #else "%Y-%m-%d %H:%M:%S", #endif localtime(&now)) != 0) If somebody wants to have timezones printed on Windows, they can try again, but they're gonna have to work harder than this. It's possible that fixing this will not fix whatever's bothering hamerkop, but given the lack of backward-portability of %z, this code has got to go anyway. regards, tom lane
I wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> [Some more code and git-log reading later] I see that the %z is a very >> recent addition: it only got there as of commit ad5d46a449, of September >> 5th ... and now I also see that hamerkop's last green run before the >> failure, on Oct 13rd, did *not* include the pg_upgrade check. So I'm >> thinking this was broken much earlier than 0eea804. > Ooohh ... you are right, the first failing build involved not only > the pg_dump refactoring commit, but an upgrade in the buildfarm script > that hamerkop was using (from 4.4 to 4.14). So it's entirely possible > this issue was already there and we just weren't seeing it tested. hamerkop is still failing in its runs done today. However, I'm not sure that that proves anything about our hoped-for fix, because the commit SHAs it's showing on the buildfarm status page are still from Oct 13. It looks like hamerkop has failed to pull any git updates since it was migrated to the new buildfarm script version. regards, tom lane
On 10/29/2014 12:26 AM, Tom Lane wrote: > I wrote: >> Alvaro Herrera <alvherre@2ndquadrant.com> writes: >>> [Some more code and git-log reading later] I see that the %z is a very >>> recent addition: it only got there as of commit ad5d46a449, of September >>> 5th ... and now I also see that hamerkop's last green run before the >>> failure, on Oct 13rd, did *not* include the pg_upgrade check. So I'm >>> thinking this was broken much earlier than 0eea804. >> Ooohh ... you are right, the first failing build involved not only >> the pg_dump refactoring commit, but an upgrade in the buildfarm script >> that hamerkop was using (from 4.4 to 4.14). So it's entirely possible >> this issue was already there and we just weren't seeing it tested. > hamerkop is still failing in its runs done today. However, I'm not sure > that that proves anything about our hoped-for fix, because the commit SHAs > it's showing on the buildfarm status page are still from Oct 13. It looks > like hamerkop has failed to pull any git updates since it was migrated to > the new buildfarm script version. > > This machine appears to be misconfigured for git. I will email the owner about fixing it. cheers andrew
On Sun, Oct 26, 2014 at 08:29:19PM -0400, Tom Lane wrote: > I wrote: > > Hm. %z ought not be locale-dependent ... however, it has a bigger > > problem, which is that it's a C99-ism. It's not there in SUSv2, > > which is our normal baseline for what's portable. I think we need > > to get rid of that. %Z should be portable. > > > (Is it possible that Windows' strftime() reads %z as doing something > > other than what C99 says?) > > A bit of googling leads me to Microsoft reference material saying that > their strftime treats %z and %Z alike. So in point of fact, the > assumption underlying commit ad5d46a4494b0b48 was flat out wrong. > Switching to %z doesn't get you out of the problem noted in the > comments it removed: > > /* > * We don't print the timezone on Win32, because the names are long and > * localized, which means they may contain characters in various random > * encodings; this has been seen to cause encoding errors when reading the > * dump script. > */ > > I'm going to go revert most of that commit and make the code like it > was before: Thanks. The major goal of the patch was to get a timezone designation in there, and you have done that were possible, which is non-Windows. Your C comment clearly explained why Windows is a problem, and centralized the code. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +