Thread: 7.3.5 initdb failure on Irix 6.5.18

7.3.5 initdb failure on Irix 6.5.18

From: Craig Ruff

I'm trying to use 7.3.5 (for an upgrade of 7.3.2) on Irix 6.5.18 using the
MIPSpro 7.4.1 compiler.  Everything compiles up ok, but 'make check' fails
at the "enabling unlimited row size for system tables..." step with
a core dump of postgres.

The failure is at /backend/access/transam/xlog.c:2544 with an
"unable to locate a valid checkpoint record" panic.  This happens
for both 7.3.4 and 7.3.5, either with -O or -g as the CFLAGS value.

Manually running the command being used by initdb:

    tmp_check/install/stmgr/pgsql-7.3.5/bin/postgres -F \
        -D/stmgr/src/postgresql-7.3.5/src/test/regress/data -O \
        -c search_path=pg_catalog template1

gives:

    LOG:  database system was shut down at 2004-01-15 11:20:44 MST
    LOG:  ReadRecord: invalid magic number 0000 in log file 0, segment 0, offset 32768
    LOG:  invalid primary checkpoint record
    LOG:  ReadRecord: record with zero length at 0/50
    LOG:  invalid secondary checkpoint record
    PANIC:  unable to locate a valid checkpoint record


Interestingly, using a copy of an existing database created by the 7.3.2
installation on the same system works fine.

Has anyone fixed this yet?  If not, does anyone have hints that I can
pursue since I have the source compiled up with debugging enabled?

--

Craig Ruff          NCAR            cruff@ucar.edu
(303) 497-1211      P.O. Box 3000
            Boulder, CO  80307

Re: 7.3.5 initdb failure on Irix 6.5.18

From: Tom Lane

Craig Ruff <cruff@ucar.edu> writes:
> I'm trying to use 7.3.5 (for an upgrade of 7.3.2) on Irix 6.5.18 using the
> MIPSpro 7.4.1 compiler.  Everything compiles up ok, but 'make check' fails
> at the "enabling unlimited row size for system tables..." step with
> a core dump of postgres.

Hmm, hard to see what could have broken between 7.3.2 and 7.3.4.

> Has anyone fixed this yet?

Nope, first we've heard of it.

> If not, does anyone have hints that I can pursue since I have the
> source compiled up with debugging enabled?

It would seem that the culprit must be somewhere in the 7.3.2-to-7.3.4
changes in xlog.c:

http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/access/transam/xlog.c.diff?r1=1.109&r2=1.109.2.3

but I sure don't see anything there that looks like a potential
portability issue.

            regards, tom lane

Re: 7.3.5 initdb failure on Irix 6.5.18

From: Craig Ruff

On Thu, Jan 15, 2004 at 04:42:50PM -0500, Tom Lane wrote:
> It would seem that the culprit must be somewhere in the 7.3.2-to-7.3.4
> changes in xlog.c:
> ...
> but I sure don't see anything there that looks like a potential
> portability issue.

I have some further info.  7.3.5 compiled with MIPSpro 7.4.1 is broken
with respect to the transaction log files.  Restarting my 7.3.5 install
results in similar errors.

However, when compiled with gcc, 7.3.5 initdb works correctly.  I'm
in the process of testing the import of the 7.3.2 database and running
some transactions to see if the restart works.

Also, PostgreSQL 7.4.1 compiled with MIPSpro 7.4.1 appears to work
(at least, it passes the regression test).

Re: 7.3.5 initdb failure on Irix 6.5.18

From: Craig Ruff

Ok, I have further information on this problem.  I believe it is a compiler
problem.  PostgreSQL version 7.3.3 is also affected when compiled with the
MIPSpro 7.4.1 compiler, but when compiled with MIPSpro 7.4 it is ok.

Using the gcc-compiled version of backend/access/transam/xlog.c, I have
gotten the regression test to work.  Next week I'll have to nail it down
further so I can send a bug report to SGI.  Just replacing XLogFlush
with the gcc-compiled version allows initdb to finish, but the regression
tests show there are other problems.

So, a note should probably be made in the documentation that for the
moment, MIPSpro 7.4.1 should probably be avoided.

Re: 7.3.5 initdb failure on Irix 6.5.18

From: Tom Lane

Craig Ruff <cruff@ucar.edu> writes:
> So, a note should probably be made in the documentation that for the
> moment, MIPSpro 7.4.1 should probably be avoided.

Appreciate the followup.  Let us know if it emerges that the PG code is
doing something unportable.  (It could be that the compiler is doing
something that's legal per the ANSI C standard but breaks our code.)

            regards, tom lane

Re: 7.3.5 initdb failure on Irix 6.5.18

From: Craig Ruff

Here is what I discovered about this problem.

The MIPSpro 7.4.1 C compiler apparently has a code generation bug in
structure assignment that is triggered at backend/access/transam/xlog.c:2683:

    LogwrtResult.Write = LogwrtResult.Flush = EndOfLog;

EndOfLog and LogwrtResult.Write are correct, but LogwrtResult.Flush ends
up corrupted.

I've opened a problem report with SGI (case ID 2505985 "MIPSpro 7.4.1 C
structure assignment bug") for those of you who need to track it.  From
what I can see, PostgreSQL 7.3.x is vulnerable.  PostgreSQL 7.4.1 seems
to pass its regression test, but I'd still think twice about using
it when compiled with MIPSpro 7.4.1.

Everything seems ok when compiled with the SGI-provided version of GCC 3.2.2.
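
For anyone who wants to check a compiler for the behaviour Craig describes,
the chained structure assignment can be isolated in a small standalone test.
What follows is only a sketch under stated assumptions: the type names RecPtr
and WrtResult, their two-field layout, and the helper functions are
hypothetical stand-ins, not code from the PostgreSQL tree. Only the shape of
the statement (Write = Flush = EndOfLog) comes from the thread. A compiler bug
of this kind may depend on the surrounding code, so a clean result here does
not prove a compiler is safe.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins. The thread only says the values are structures;
 * the two-field layout below is an assumption made for illustration. */
typedef struct
{
    uint32_t xlogid;   /* assumed: log file number */
    uint32_t xrecoff;  /* assumed: byte offset within that file */
} RecPtr;

typedef struct
{
    RecPtr Write;
    RecPtr Flush;
} WrtResult;

/* Assumed to be a file-scope variable; named after the one in the thread. */
static WrtResult LogwrtResult;

static int recptr_equal(RecPtr a, RecPtr b)
{
    return a.xlogid == b.xlogid && a.xrecoff == b.xrecoff;
}

/* Performs the chained structure assignment quoted in the thread.
 * Returns 1 if both members equal end_of_log afterwards, 0 otherwise. */
int chained_assignment_ok(RecPtr end_of_log)
{
    LogwrtResult.Write = LogwrtResult.Flush = end_of_log;
    return recptr_equal(LogwrtResult.Write, end_of_log)
        && recptr_equal(LogwrtResult.Flush, end_of_log);
}

/* The same effect written as two separate assignments, for comparison.
 * Whether this form sidesteps the reported bug is untested. */
int split_assignment_ok(RecPtr end_of_log)
{
    LogwrtResult.Flush = end_of_log;
    LogwrtResult.Write = LogwrtResult.Flush;
    return recptr_equal(LogwrtResult.Write, end_of_log)
        && recptr_equal(LogwrtResult.Flush, end_of_log);
}

/* Aborts via assert() if either form leaves a member corrupted. */
void check_struct_assignment(void)
{
    RecPtr end_of_log = { 0, 0x8050 };  /* arbitrary test value */

    assert(chained_assignment_ok(end_of_log));
    assert(split_assignment_ok(end_of_log));
}
```

To use it, call check_struct_assignment() from a main() and build the file
once with each compiler and optimization setting under test (the thread
mentions both -O and -g); a conforming compiler should let both assertions
pass.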