Thread: Re: [COMMITTERS] pgsql: Upgrade to Autoconf 2.69

Re: [COMMITTERS] pgsql: Upgrade to Autoconf 2.69

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:

Heikki, Andres,

> Shortly after this patch was committed, buildfarm member locust (running
> Mac OS X 10.5 apparently) started failing the pg_upgrade check:
> 
> command:
> "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/install//Users/pgbuildfarm/Documents/workdir//HEAD/inst/bin/pg_ctl"
> -w -l "pg_upgrade_server.log" -D
> "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/data" -o "-p 57632 -b -c
> synchronous_commit=off -c fsync=off -c full_page_writes=off  -c listen_addresses='' -c unix_socket_permissions=0700 -c
> unix_socket_directories='/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade'" start >>
> "pg_upgrade_server.log" 2>&1
>
> waiting for server to start....LOG:  database system was shut down at 2013-12-19 12:51:16 CET
> LOG:  invalid primary checkpoint record
> LOG:  invalid secondary checkpoint link in control file
> PANIC:  could not locate a valid checkpoint record

Any comment on this problem?  Somehow ReadRecord is unable to find a
checkpoint, yet there's no error message to be seen anywhere, whereas
pg_resetxlog does report it:

> command:
> "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/install//Users/pgbuildfarm/Documents/workdir//HEAD/inst/bin/pg_resetxlog"
> -l 000000010000000000000009 "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/data" >>
> "pg_upgrade_utility.log" 2>&1
>
> pg_resetxlog: could not read from directory "pg_xlog": Invalid argument
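
For reference, a message of this shape is what the usual POSIX readdir() idiom produces when the loop ends with errno set. The following is only a minimal sketch of that idiom, not the pg_resetxlog source; count_dir_entries is a hypothetical helper made up for illustration.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): count the entries in a
 * directory.  Returns the count, or -1 after printing the system error.
 */
static long
count_dir_entries(const char *path)
{
    DIR        *dir;
    long        n = 0;
    int         save_errno;

    assert(path != NULL);

    dir = opendir(path);
    if (dir == NULL)
    {
        fprintf(stderr, "could not open directory \"%s\": %s\n",
                path, strerror(errno));
        return -1;
    }

    /*
     * readdir() returns NULL both at end-of-directory and on error, so
     * errno is cleared before every call and inspected after the loop.
     */
    errno = 0;
    while (readdir(dir) != NULL)
    {
        n++;
        errno = 0;
    }
    save_errno = errno;         /* closedir() may overwrite errno */
    closedir(dir);

    if (save_errno != 0)
    {
        fprintf(stderr, "could not read from directory \"%s\": %s\n",
                path, strerror(save_errno));
        return -1;
    }
    return n;
}
```

With a loop like this, "Invalid argument" means errno was EINVAL when the loop ended, which says the directory scan itself failed rather than that the directory was empty.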

I cannot but think xlogreader is at fault.

Regardless of the solution to the Mac OS X problem, ISTM this should be
fixed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [COMMITTERS] pgsql: Upgrade to Autoconf 2.69

From
Andres Freund
Date:
Hi,

On 2013-12-24 12:58:04 -0300, Alvaro Herrera wrote:
> > Shortly after this patch was committed, buildfarm member locust (running
> > Mac OS X 10.5 apparently) started failing the pg_upgrade check:
> > 
> > command:
> > "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/install//Users/pgbuildfarm/Documents/workdir//HEAD/inst/bin/pg_ctl"
> > -w -l "pg_upgrade_server.log" -D
> > "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/data" -o "-p 57632 -b -c
> > synchronous_commit=off -c fsync=off -c full_page_writes=off  -c listen_addresses='' -c unix_socket_permissions=0700 -c
> > unix_socket_directories='/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade'" start >>
> > "pg_upgrade_server.log" 2>&1
> >
> > waiting for server to start....LOG:  database system was shut down at 2013-12-19 12:51:16 CET
> > LOG:  invalid primary checkpoint record
> > LOG:  invalid secondary checkpoint link in control file
> > PANIC:  could not locate a valid checkpoint record
> 
> Any comment on this problem?  Somehow ReadRecord is unable to find a
> checkpoint, yet there's no error message to be seen anywhere, whereas
> pg_resetxlog does report it:
> 
> > command:
> > "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/install//Users/pgbuildfarm/Documents/workdir//HEAD/inst/bin/pg_resetxlog"
> > -l 000000010000000000000009 "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/data" >>
> > "pg_upgrade_utility.log" 2>&1
> >
> > pg_resetxlog: could not read from directory "pg_xlog": Invalid argument
> 
> I cannot but think xlogreader is at fault.
> 
> Regardless of the solution to the Mac OS X problem, ISTM this should be
> fixed.

I didn't look at any code, and I won't today, but it doesn't look
surprising - the report when starting the server above is presumably the
one in ReadCheckpoint() (or similar) and it probably just reports that
ReadRecord() didn't return a record.
pg_resetxlog (which doesn't use xlogreader!) reports that it couldn't
read from directory "pg_xlog", so there's something wonky independently
of xlogreader. I'd guess that the read_page callback in xlog.c errors out
without reporting an error. IIRC we're logging some failures as DEBUG
there, because they really aren't unexpected, and normally just signal
the end of WAL.
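
A rough sketch of the pattern described above, to make the distinction concrete. This is not the actual xlog.c code: SketchLevel, classify_read_failure and the message texts are all made up for illustration, and the rule "short read means end of WAL" is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical log levels, standing in for DEBUG and LOG. */
typedef enum
{
    SKETCH_DEBUG,
    SKETCH_LOG
} SketchLevel;

/*
 * Decide how loudly to report a failed page read.
 *
 * nread is what read() returned (-1 on error, otherwise a byte count
 * smaller than wanted); read_errno is errno saved right after the call.
 * The formatted message is written to msg.
 */
static SketchLevel
classify_read_failure(long nread, long wanted, int read_errno,
                      char *msg, size_t msglen)
{
    assert(msg != NULL && msglen > 0);

    if (nread < 0)
    {
        /* A real system error: always worth a visible message. */
        snprintf(msg, msglen, "could not read WAL page: %s",
                 strerror(read_errno));
        return SKETCH_LOG;
    }

    /* A short read is the expected way to hit the end of WAL. */
    snprintf(msg, msglen, "short read (%ld of %ld bytes), assuming end of WAL",
             nread, wanted);
    return SKETCH_DEBUG;
}
```

The failure mode guessed at above would correspond to the first branch being reported at the quieter level, or not at all.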

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [COMMITTERS] pgsql: Upgrade to Autoconf 2.69

From
Alvaro Herrera
Date:
Andres Freund wrote:
> Hi,
> 
> On 2013-12-24 12:58:04 -0300, Alvaro Herrera wrote:
> > > Shortly after this patch was committed, buildfarm member locust (running
> > > Mac OS X 10.5 apparently) started failing the pg_upgrade check:
> > > 
> > > command:
> > > "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/install//Users/pgbuildfarm/Documents/workdir//HEAD/inst/bin/pg_ctl"
> > > -w -l "pg_upgrade_server.log" -D
> > > "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/data" -o "-p 57632 -b -c
> > > synchronous_commit=off -c fsync=off -c full_page_writes=off  -c listen_addresses='' -c unix_socket_permissions=0700 -c
> > > unix_socket_directories='/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade'" start >>
> > > "pg_upgrade_server.log" 2>&1
> > >
> > > waiting for server to start....LOG:  database system was shut down at 2013-12-19 12:51:16 CET
> > > LOG:  invalid primary checkpoint record
> > > LOG:  invalid secondary checkpoint link in control file
> > > PANIC:  could not locate a valid checkpoint record
> > 
> > Any comment on this problem?  Somehow ReadRecord is unable to find a
> > checkpoint, yet there's no error message to be seen anywhere, whereas
> > pg_resetxlog does report it:
> > 
> > > command:
> > > "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/install//Users/pgbuildfarm/Documents/workdir//HEAD/inst/bin/pg_resetxlog"
> > > -l 000000010000000000000009 "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/data" >>
> > > "pg_upgrade_utility.log" 2>&1
> > >
> > > pg_resetxlog: could not read from directory "pg_xlog": Invalid argument
> > 
> > I cannot but think xlogreader is at fault.
> > 
> > Regardless of the solution to the Mac OS X problem, ISTM this should be
> > fixed.
> 
> I didn't look at any code, and I won't today, but it doesn't look
> surprising - the report when starting the server above is presumably the
> one in ReadCheckpoint() (or similar) and it probably just reports that
> ReadRecord() didn't return a record.

How is this not surprising?  Surely failing to find a checkpoint record
is not a problem to be taken lightly.

> pg_resetxlog (which doesn't use xlogreader!) reports that it couldn't
> read from directory "pg_xlog", so there's something wonky independently
> from xlogreader.

Yes, most likely there is.  My point is that the LOG messages above
should have logged the system error that caused the checkpoint record to
be unfindable.
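
To make the request concrete, here is a minimal sketch of carrying the low-level failure reason up to the place that emits the higher-level message. It is not the existing xlog.c or xlogreader code: SketchReader, sketch_open_segment and sketch_report_invalid_checkpoint are hypothetical names, and appending the detail to the "invalid primary checkpoint record" text is just one possible wording.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reader state that remembers why the last step failed. */
typedef struct SketchReader
{
    char        errormsg[256];  /* empty when no detail is available */
} SketchReader;

/* Try to open the file holding the record; on failure, remember why. */
static FILE *
sketch_open_segment(SketchReader *reader, const char *path)
{
    FILE       *fp;

    assert(reader != NULL && path != NULL);

    reader->errormsg[0] = '\0';
    fp = fopen(path, "rb");
    if (fp == NULL)
        snprintf(reader->errormsg, sizeof(reader->errormsg),
                 "could not open file \"%s\": %s", path, strerror(errno));
    return fp;
}

/* Build the caller's report, appending the saved detail if there is any. */
static void
sketch_report_invalid_checkpoint(const SketchReader *reader,
                                 char *out, size_t outlen)
{
    assert(reader != NULL && out != NULL && outlen > 0);

    if (reader->errormsg[0] != '\0')
        snprintf(out, outlen, "invalid primary checkpoint record: %s",
                 reader->errormsg);
    else
        snprintf(out, outlen, "invalid primary checkpoint record");
}
```

The point of the design is that the system error is captured where it happens, so whoever reads the LOG line does not have to guess which call failed.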

> I'd guess that the read_page callback in xlog.c errors out without
> reporting an error. IIRC we're logging some failures as DEBUG there,
> because they really aren't unexpected, and normally just signal the
> end of WAL.

Hmm?  At least, I recall that something like an "unexpected pageaddr"
message is sometimes logged when end-of-WAL is found.  Why would other
error messages be hidden?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services