Re: pg_upgrade broken by xlog numbering - Mailing list pgsql-hackers

From Robert Haas
Subject Re: pg_upgrade broken by xlog numbering
Msg-id CA+TgmoYpLwLARH5yzOLir9uzpgGAsKN-tOo-ZpUF9KAqNP13Ug@mail.gmail.com
In response to Re: pg_upgrade broken by xlog numbering  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Mon, Jun 25, 2012 at 11:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On MacOS X, on latest sources, initdb fails:
>
>> creating directory /Users/rhaas/pgsql/src/test/regress/./tmp_check/data ... ok
>> creating subdirectories ... ok
>> selecting default max_connections ... 100
>> selecting default shared_buffers ... 32MB
>> creating configuration files ... ok
>> creating template1 database in
>> /Users/rhaas/pgsql/src/test/regress/./tmp_check/data/base/1 ... ok
>> initializing pg_authid ... ok
>> initializing dependencies ... ok
>> creating system views ... ok
>> loading system objects' descriptions ... ok
>> creating collations ... ok
>> creating conversions ... ok
>> creating dictionaries ... FATAL:  control file contains invalid data
>> child process exited with exit code 1
>
> Same for me.  It's crashing here:
>
>    if (ControlFile->state < DB_SHUTDOWNED ||
>        ControlFile->state > DB_IN_PRODUCTION ||
>        !XRecOffIsValid(ControlFile->checkPoint))
>        ereport(FATAL,
>                (errmsg("control file contains invalid data")));
>
> state == DB_SHUTDOWNED, so the problem is with the XRecOffIsValid test.
> ControlFile->checkPoint == 19972072 (0x130BFE8), what's wrong with that?
>
> (I suppose the reason this is only failing on some machines is
> platform-specific variations in xlog entry size, but it's still a bit
> distressing that this got committed in such a broken state.)

I'm guessing that the problem is as follows: in the old code, the
XLogRecord header could not be split, so any offset that was closer to
the end of the page than SizeOfXLogRecord was a sure sign of trouble.
But commit 061e7efb1b4c5b8a5d02122b7780531b8d5bf23d relaxed that
restriction, so now it IS legal for the checkpoint record to be where
it is.  But it seems that XRecOffIsValid() didn't get the memo.
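
To make the guess concrete, here is a rough sketch of the two rules applied to the pointer Tom reported. This is not the actual XRecOffIsValid() macro: the function names are made up for illustration, and the three sizes (an 8192-byte xlog page, a 24-byte short page header, a 32-byte record header) are assumed values that vary by platform and build.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes, for illustration only; real values are build-dependent. */
#define SKETCH_XLOG_BLCKSZ   8192u  /* assumed xlog page size */
#define SKETCH_SHORT_PHD       24u  /* assumed short page header size */
#define SKETCH_RECORD_HDR      32u  /* assumed record header size */

/* Old rule (hypothetical helper): the record header must fit on the page. */
static bool
sketch_old_offset_is_valid(uint64_t ptr)
{
    uint32_t off = (uint32_t) (ptr % SKETCH_XLOG_BLCKSZ);

    return off >= SKETCH_SHORT_PHD &&
           (SKETCH_XLOG_BLCKSZ - off) >= SKETCH_RECORD_HDR;
}

/* Relaxed rule (hypothetical helper): the header may span a page boundary,
 * so only "not inside the page header" is checked. */
static bool
sketch_relaxed_offset_is_valid(uint64_t ptr)
{
    uint32_t off = (uint32_t) (ptr % SKETCH_XLOG_BLCKSZ);

    return off >= SKETCH_SHORT_PHD;
}

/* Checks the pointer from the report, 19972072 (0x130BFE8). */
static void
sketch_check_reported_pointer(void)
{
    uint64_t checkpoint = 19972072u;

    /* Page offset is 8168, leaving 24 bytes before the page end. */
    assert(checkpoint % SKETCH_XLOG_BLCKSZ == 8168u);
    assert(!sketch_old_offset_is_valid(checkpoint));
    assert(sketch_relaxed_offset_is_valid(checkpoint));
}
```

With those assumed sizes, the reported checkpoint sits 24 bytes before the end of its page, so the old rule rejects it and the relaxed rule accepts it. A smaller record header on another platform would pass both tests, which would fit the failure showing up on only some machines.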

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

