Thread: Re: [COMMITTERS] pgsql: Upgrade to Autoconf 2.69
On Fri, 2013-12-20 at 10:54 -0300, Alvaro Herrera wrote: > I don't see how can the pg_upgrade check fail in this way but not the > regular regression test. This patch includes the following hunk to > pg_config.h.in: > > +/* Enable large inode numbers on Mac OS X 10.5. */ > +#ifndef _DARWIN_USE_64_BIT_INODE > +# define _DARWIN_USE_64_BIT_INODE 1 > +#endif > > > Evidently something is not going well in ReadRecord. It should have > reported the read failure, but didn't. That seems a separate bug that > needs fixed. This is enabling large-file support on OS X, so that seems kind of important. It's not failing with newer versions of OS X, so that leaves the following possibilities, I think: - Large files never worked on 10.5. That would be strange because Autoconf explicitly refers to that version. - New OS X versions have this on by default (could someone check?), so it already worked before, and it's something specific to 10.5 that's broken. - It's something specific to PowerPC or endianness. - That build farm member has a strange setup. If anyone has any of these components, please help isolate the problem. It's conceivable that inconsistent include file order would cause this kind of problem, but we're pretty strict about including the right things first. It someone can reproduce the problem, it would also help dissecting the pg_upgrade test script to see exactly where the failure is introduced.
On 12/21/13, 9:39 AM, Peter Eisentraut wrote: > This is enabling large-file support on OS X, so that seems kind of > important. It's not failing with newer versions of OS X, so that leaves > the following possibilities, I think: > > - Large files never worked on 10.5. That would be strange because > Autoconf explicitly refers to that version. > > - New OS X versions have this on by default (could someone check?), so > it already worked before, and it's something specific to 10.5 that's > broken. Current OS X has 64-bit inodes on by default, so this code works in general. So it's probably one of the other causes. > - It's something specific to PowerPC or endianness. > > - That build farm member has a strange setup.
Peter Eisentraut <peter_e@gmx.net> writes: > On Fri, 2013-12-20 at 10:54 -0300, Alvaro Herrera wrote: >> Evidently something is not going well in ReadRecord. It should have >> reported the read failure, but didn't. That seems a separate bug that >> needs fixed. > This is enabling large-file support on OS X, so that seems kind of > important. It's not failing with newer versions of OS X, so that leaves > the following possibilities, I think: [ gets out ancient PPC laptop ... ] [ man, this thing is slow ... ] [ hours later ... ] There are three or four different bugs here, but the key points are: 1. pg_resetxlog is diligently trashing every single WAL file in pg_xlog/, and then failing (by virtue of some ancient OS X bug in readdir()), so that it doesn't get to the point of recreating a WAL file with a checkpoint record. 2. pg_resetxlog already rewrote pg_control, which might not be such a hot idea; but whether it had or not, pg_control's last-checkpoint pointer would be pointing to a WAL file that is not in pg_xlog/ now. 3. pg_upgrade ignores the fact that pg_resetxlog failed, and keeps going. 4. The server tries to start, and fails because it can't find a WAL file containing the last checkpoint record. This is pretty unsurprising given the facts above. The reason you don't see any "no such file" report is that XLogFileRead() will report any BasicOpenFile() failure *other than* ENOENT. And nothing else makes up for that. Re point 4: the logic, if you can call it that, in xlog.c and xlogreader.c is making my head spin. There are about four levels of overcomplicated and undercommented code before you ever get down to XLogFileRead, so I have no idea which level to blame for the lack of error reporting in this specific case. But there are pretty clearly some cases in which ignoring ENOENT in XLogFileRead isn't such a good idea, and XLogFileRead isn't being told when to do that or not. Re point 3: I already bitched about that. Re point 2: I wonder whether pg_resetxlog shouldn't do things in a different order. Rewriting pg_control to point to a file that doesn't exist yet doesn't sound great. I'm not sure though if there's any great improvement to be had here; basically, if we fail partway through, we're probably screwed no matter what. Re point 1: there are threads in our archives suggesting that EINVAL from readdir() after unlinking a directory entry is a known problem on some versions of OS X, eg http://www.postgresql.org/message-id/flat/47C45B07-8459-48D8-8FBE-291864019CC2@me.com and I also find stuff on the intertubes suggesting that renaming an entry can cause it too, eg http://www.dovecot.org/list/dovecot/2009-July/041009.html We thought at the time that the readdir bug was new in OS X 10.6, but I'll bet it was there in 10.5's 64-bit-inode support and was only exposed to default builds when 10.6 turned on 64-bit-inodes in userland by default. So Apple fixed it in 10.6.2 but never back-patched the fix into 10.5.x --- shame on them. I'm disinclined to contort our stuff to any great extent to fix it, because all the OS X versions mentioned are several years obsolete. Perhaps though we should override Autoconf's setting of _DARWIN_USE_64_BIT_INODE, if we can do that easily? It's clearly not nearly as problem-free on 10.5 as the Autoconf boys believe, and it's already enabled by default on the release series where it does work. regards, tom lane
I wrote: > Perhaps though we should override Autoconf's setting of > _DARWIN_USE_64_BIT_INODE, if we can do that easily? It's clearly > not nearly as problem-free on 10.5 as the Autoconf boys believe, > and it's already enabled by default on the release series where it > does work. I looked into this and found that _DARWIN_USE_64_BIT_INODE is being turned on by AC_SYS_LARGEFILE. Quite aside from the wisdom of doing this at all, it's got nothing to do with the advertised purpose of that macro: the width of inode_t would affect how many files you can put on one filesystem, not how large the individual files are. I don't think that is something that we need to concern ourselves with enabling when it's not the platform default. And just to add insult to injury, the implementation technique is such that the #define gets put into pg_config.h unconditionally, even if AC_SYS_LARGEFILE isn't executed by the configure script! So IMO this is brain-dead in at least three different ways, and I've pushed a patch to revert it. We still need to address the other issues enumerated in my previous message, but this should be enough to get buildfarm member locust happy again. regards, tom lane
On 2013-12-29 02:48:21 -0500, Tom Lane wrote: > 4. The server tries to start, and fails because it can't find a WAL file > containing the last checkpoint record. This is pretty unsurprising given > the facts above. The reason you don't see any "no such file" report is > that XLogFileRead() will report any BasicOpenFile() failure *other than* > ENOENT. And nothing else makes up for that. > > Re point 4: the logic, if you can call it that, in xlog.c and xlogreader.c > is making my head spin. There are about four levels of overcomplicated > and undercommented code before you ever get down to XLogFileRead, so I > have no idea which level to blame for the lack of error reporting in this > specific case. But there are pretty clearly some cases in which ignoring > ENOENT in XLogFileRead isn't such a good idea, and XLogFileRead isn't > being told when to do that or not. Yes, that code is pretty horrid. To Heikki's and my defense, I don't think the xlogreader.c split had much to do with it tho. I think the path erroring out essentially is ReadRecord()->XLogReadRecord()*->ReadPageInternal()*->XLogPageRead() ->WaitForWALToBecomeAvailable()->XLogFileReadAnyTLI()->XLogFileRead() The *ed functions are new, but it's really code that was in ReadRecord() before. So I don't think too much has changed since 9.0ish, although the timeline switch didn't make it simpler. As far as I can tell XLogFileRead() actually is told when it's ok to ignore an error - the notfoundOK parameter. It's just that we're always passing true for it we're not streaming... I think it might be sufficient to make passing that flag additionally conditional on fetching_ckpt, that's already passed to WaitForWALToBecomeAvailable(), so we'd just need to add it to XLogFileReadAnyTLI(). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Dec 29, 2013 at 02:48:21AM -0500, Tom Lane wrote: > There are three or four different bugs here, but the key points are: > > 1. pg_resetxlog is diligently trashing every single WAL file in pg_xlog/, > and then failing (by virtue of some ancient OS X bug in readdir()), so > that it doesn't get to the point of recreating a WAL file with a > checkpoint record. > > 2. pg_resetxlog already rewrote pg_control, which might not be such a hot > idea; but whether it had or not, pg_control's last-checkpoint pointer > would be pointing to a WAL file that is not in pg_xlog/ now. > > 3. pg_upgrade ignores the fact that pg_resetxlog failed, and keeps going. Does pg_resetxlog return a non-zero exit status? If so, pg_upgrade should have caught that and exited. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Bruce Momjian <bruce@momjian.us> writes: > On Sun, Dec 29, 2013 at 02:48:21AM -0500, Tom Lane wrote: >> 3. pg_upgrade ignores the fact that pg_resetxlog failed, and keeps going. > Does pg_resetxlog return a non-zero exit status? If so, pg_upgrade > should have caught that and exited. It certainly does: if (errno) { fprintf(stderr, _("%s: could not read from directory \"%s\": %s\n"), progname, XLOGDIR,strerror(errno)); exit(1); } The bug is that pg_upgrade appears to assume (in many places not just this one) that exec_prog() will abort if the called program fails, but *it doesn't*, contrary to the claim in its own header comment. This is because pg_log(FATAL, ...) doesn't call exit(). pg_fatal() does, but that's not what's being called in the throw_error case. I imagine that this used to work correctly and got broken by some ill-advised refactoring, but whatever the origin, it's 100% broken today. regards, tom lane
That was probably me. I'll look into it. > On Jan 6, 2014, at 11:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Bruce Momjian <bruce@momjian.us> writes: >>> On Sun, Dec 29, 2013 at 02:48:21AM -0500, Tom Lane wrote: >>> 3. pg_upgrade ignores the fact that pg_resetxlog failed, and keeps going. > >> Does pg_resetxlog return a non-zero exit status? If so, pg_upgrade >> should have caught that and exited. > > It certainly does: > > if (errno) > { > fprintf(stderr, _("%s: could not read from directory \"%s\": %s\n"), > progname, XLOGDIR, strerror(errno)); > exit(1); > } > > The bug is that pg_upgrade appears to assume (in many places not just this > one) that exec_prog() will abort if the called program fails, but *it > doesn't*, contrary to the claim in its own header comment. This is > because pg_log(FATAL, ...) doesn't call exit(). pg_fatal() does, but > that's not what's being called in the throw_error case. > > I imagine that this used to work correctly and got broken by some > ill-advised refactoring, but whatever the origin, it's 100% broken today. > > regards, tom lane >
On Tue, Jan 7, 2014 at 08:19:49AM -0500, Peter Eisentraut wrote: > That was probably me. I'll look into it. > > > > > On Jan 6, 2014, at 11:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Bruce Momjian <bruce@momjian.us> writes: > >>> On Sun, Dec 29, 2013 at 02:48:21AM -0500, Tom Lane wrote: > >>> 3. pg_upgrade ignores the fact that pg_resetxlog failed, and keeps going. > > > >> Does pg_resetxlog return a non-zero exit status? If so, pg_upgrade > >> should have caught that and exited. > > > > It certainly does: > > > > if (errno) > > { > > fprintf(stderr, _("%s: could not read from directory \"%s\": %s\n"), > > progname, XLOGDIR, strerror(errno)); > > exit(1); > > } > > > > The bug is that pg_upgrade appears to assume (in many places not just this > > one) that exec_prog() will abort if the called program fails, but *it > > doesn't*, contrary to the claim in its own header comment. This is > > because pg_log(FATAL, ...) doesn't call exit(). pg_fatal() does, but > > that's not what's being called in the throw_error case. > > > > I imagine that this used to work correctly and got broken by some > > ill-advised refactoring, but whatever the origin, it's 100% broken today. I know Peter is looking at this, but I looked at and I can't see the problem. Every call of exec_prog() that uses pg_resetxlog has throw_error = true, and the test there is: result = system(cmd); if (result != 0) ... pg_log(FATAL, ...) and in pg_log_v() I see: switch (type) ... case PG_FATAL: printf("\n%s", _(message)); printf("Failure, exiting\n"); --> exit(1); break; so I am not clear how you are seeing the return status of pg_resetxlog ignored. I tried the attached patch which causes pg_resetxlog -f to return -1, and got the proper error from pg_upgrade in git head: Performing Upgrade ------------------ Analyzing all rows in the new cluster ok Freezing all rows on the new cluster ok Deleting files from new pg_clog ok Copying old pg_clog to new server ok Setting next transaction ID for new cluster *failure* Consult the last few lines of "pg_upgrade_utility.log" for the probable cause of the failure. Failure, exiting and the last line in "pg_upgrade_utility.log" is: command: "/u/pgsql/bin/pg_resetxlog" -f -x 683 "/u/pgsql/data" >> "pg_upgrade_utility.log" 2>&1 -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Attachment
Bruce Momjian wrote: > I know Peter is looking at this, but I looked at and I can't see the > problem. Every call of exec_prog() that uses pg_resetxlog has > throw_error = true, and the test there is: > > result = system(cmd); > > if (result != 0) > ... > pg_log(FATAL, ...) > > and in pg_log_v() I see: > > switch (type) > ... > case PG_FATAL: > printf("\n%s", _(message)); > printf("Failure, exiting\n"); > --> exit(1); > break; This was fixed by Peter two days ago in commit http://git.postgresql.org/pg/commitdiff/ca607b155e86ce529fc9ac322a232f264cda9ab6 -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Bruce Momjian <bruce@momjian.us> writes: > On Tue, Jan 7, 2014 at 08:19:49AM -0500, Peter Eisentraut wrote: >> That was probably me. I'll look into it. > and in pg_log_v() I see: > switch (type) > ... > case PG_FATAL: > printf("\n%s", _(message)); > printf("Failure, exiting\n"); > --> exit(1); > break; Peter just fixed that; see commit ca607b155e86ce529fc9ac322a232f264cda9ab6 regards, tom lane
On Fri, Jan 10, 2014 at 04:06:11PM -0500, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Tue, Jan 7, 2014 at 08:19:49AM -0500, Peter Eisentraut wrote: > >> That was probably me. I'll look into it. > > > and in pg_log_v() I see: > > > switch (type) > > ... > > case PG_FATAL: > > printf("\n%s", _(message)); > > printf("Failure, exiting\n"); > > --> exit(1); > > break; > > Peter just fixed that; see commit ca607b155e86ce529fc9ac322a232f264cda9ab6 Oh, I guess I checked git log on the wrong file then. Thanks. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +