Re: page 1 of relation global/11787 was uninitialized - Mailing list pgsql-hackers

From Andres Freund
Subject Re: page 1 of relation global/11787 was uninitialized
Date
Msg-id 20130409164240.GA11081@awork2.anarazel.de
In response to Re: page 1 of relation global/11787 was uninitialized  ("Joshua D. Drake" <jd@commandprompt.com>)
Responses Re: page 1 of relation global/11787 was uninitialized
Re: page 1 of relation global/11787 was uninitialized
List pgsql-hackers
On 2013-04-09 18:21:20 +0200, Stephen R. van den Berg wrote:
> Just today one of my systems experienced a kernel panic, and halted abruptly.
> Running Linux 3.1.9, PostgreSQL 9.0.4 (Debian 9.0.4-1+b1, to be precise).

That's an absolutely outdated version of 9.0. You shouldn't be running
this in production.

On 2013-04-09 09:27:52 -0700, Joshua D. Drake wrote:
> 
> On 04/09/2013 09:21 AM, Stephen R. van den Berg wrote:
> 
> >-------------------------
> >
> >Looking at global/11787, doesn't reveal any obvious corruption.

> >The server was running with:
> >  synchronous_commit = off
> >  full_page_writes = off
> 
> full_page_writes = off is the problem.

Yeah, and it can cause very hard-to-recover corruption. It's not that you
may only lose some of the last transactions, in contrast to
synchronous_commit=off, where you can lose the last transactions but
which should never cause corruption.

> From the docs:
> 
> Turning this parameter off speeds normal operation, but might lead to either
> unrecoverable data corruption, or silent data corruption, after a system
> failure. The risks are similar to turning off fsync, though smaller, and it
> should be turned off only based on the same circumstances recommended for
> that parameter.
> 
> http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html#GUC-FULL-PAGE-WRITES

That was my first thought as well, but whilst it sure can cause
corruption, I can't immediately see how it should be responsible for
this error. That seems to indicate another problem.

Stephen, could you check how big global/11787 exactly is? Too bad we
don't know what that relfilenode corresponds to and we can't easily find
out what it maps to.
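
A minimal sketch of such a size check, assuming the default 8 kB block
size (BLCKSZ = 8192) and assuming that an "uninitialized" page shows up on
disk as a page consisting entirely of zero bytes. The helper name
describe_relation_file is hypothetical, and the path is taken relative to
the data directory:

```python
import os
import sys

BLCKSZ = 8192  # assumption: default PostgreSQL block size


def describe_relation_file(path, blcksz=BLCKSZ):
    """Return (size_in_bytes, full_pages, trailing_bytes, zero_pages)."""
    size = os.path.getsize(path)
    full_pages, trailing = divmod(size, blcksz)
    zero_pages = []
    with open(path, "rb") as f:
        for pageno in range(full_pages):
            page = f.read(blcksz)
            # Assumption: an all-zero page is what gets reported as uninitialized.
            if not any(page):
                zero_pages.append(pageno)
    return size, full_pages, trailing, zero_pages


if __name__ == "__main__" and len(sys.argv) > 1:
    size, pages, trailing, zeros = describe_relation_file(sys.argv[1])
    print("size=%d bytes, full pages=%d, trailing bytes=%d" % (size, pages, trailing))
    print("all-zero pages:", zeros)
```

Run against a copy of global/11787 taken while the server is stopped. A
non-zero trailing byte count, or page 1 missing or listed as all-zero,
would narrow down what the error message is complaining about.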

Afaik we don't have any debugging utility to dump the pg_filenode.map
contents?
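
Lacking such a utility, a throwaway script could read the file directly. A
minimal sketch, assuming the on-disk layout matches the RelMapFile struct
in relmapper.c as commonly described: an int32 magic (0x592717), an int32
num_mappings, then up to 62 pairs of 4-byte OIDs (catalog OID, filenode),
followed by a CRC and padding, all in the machine's native byte order.
These constants are assumptions to verify against the source of the
version in use; read_relmap is a hypothetical name, and the CRC is not
checked:

```python
import struct
import sys

RELMAPPER_FILEMAGIC = 0x592717  # assumption: magic value from relmapper.c
MAX_MAPPINGS = 62               # assumption: 62 * 8 + 16 = 512 bytes total


def read_relmap(path):
    """Return a list of (catalog_oid, filenode) pairs from a pg_filenode.map."""
    with open(path, "rb") as f:
        data = f.read()
    # Header: two native-endian 32-bit signed integers.
    magic, num_mappings = struct.unpack_from("=ii", data, 0)
    if magic != RELMAPPER_FILEMAGIC:
        raise ValueError("unexpected magic 0x%x" % magic)
    if not 0 <= num_mappings <= MAX_MAPPINGS:
        raise ValueError("implausible num_mappings %d" % num_mappings)
    # Each mapping is a pair of unsigned 32-bit OIDs starting at offset 8.
    return [struct.unpack_from("=II", data, 8 + 8 * i)
            for i in range(num_mappings)]


if __name__ == "__main__" and len(sys.argv) > 1:
    for catalog_oid, filenode in read_relmap(sys.argv[1]):
        print("catalog oid %d -> filenode %d" % (catalog_oid, filenode))
```

Pointed at global/pg_filenode.map, the entry whose filenode is 11787 would
give the OID of the shared catalog in question.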

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


