Thread: Odd corruption issue reported on dba.stackexchange.com, need advice

Odd corruption issue reported on dba.stackexchange.com, need advice

From

Craig Ringer

Date:

19 July 2012, 00:35:38

Hi all

I've been following a strange issue on dba.stackexchange.com and I could use opinions from those more clueful than I on the log files posted there.

http://dba.stackexchange.com/questions/20959/recover-postgresql-database-from-wal-errors-on-startup/20961#comment34356_20961

The short version is that the person did a bulk-load of some PostGIS data using the osm2pgsql data-loader tool ( http://wiki.openstreetmap.org/wiki/Osm2pgsql) to populate a Pg 9.1 database. This appeared to succeed, but when the server was restarted it failed to come up, complaining that "WAL contains references to invalid pages" and "page 1493172 of relation base/16385/477861 was uninitialized" (for many different pages).

The logs of the shutdown suggest that a backend probably crashed, but that shouldn't cause the WAL and heap corruption observed by the OP.

It's interesting to observe the presence of both a fast and an immediate shutdown request in the log.

A trimmed log follows, original linked to in the article above:

LOG: received fast shutdown request
LOG: aborting any active transactions
LOG: received immediate shutdown request
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
LOG: could not send data to client: Broken pipe
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
LOG: could not send data to client: Broken pipe

[Note: The posted log of the shutdown ends here. The poster asserts that this is the complete log and that the following statement is the next one in their log. Very strange.]

LOG: database system was interrupted; last known up at 2012-07-13 00:15:20 UTC
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 4D/A25FE548
LOG: unexpected pageaddr 4D/6F5C6000 in log file 77, segment 196, offset 6053888
LOG: redo done at 4D/C45C5278
WARNING: page 1493172 of relation base/16385/477861 was uninitialized
WARNING: page 2247965 of relation base/16385/477861 was uninitialized
WARNING: page 1493172 of relation base/16385/477861 was uninitialized
PANIC: WAL contains references to invalid pages
LOG: startup process (PID 21574) was terminated by signal 6: Aborted
LOG: aborting startup due to startup process failure

--
Craig Ringer

Re: Odd corruption issue reported on dba.stackexchange.com, need advice

From

Jeff Davis

Date:

23 July 2012, 21:49:12

On Thu, 2012-07-19 at 11:35 +0800, Craig Ringer wrote:
> The short version is that the person did a bulk-load of some PostGIS
> data using the osm2pgsql data-loader tool
> ( http://wiki.openstreetmap.org/wiki/Osm2pgsql) to populate a Pg 9.1
> database. This appeared to succeed, but when the server was restarted
> it failed to come up, complaining that "WAL contains references to
> invalid pages" and "page 1493172 of relation base/16385/477861 was
> uninitialized" (for many different pages).
>
> The logs of the shutdown suggest that a backend probably crashed, but
> that shouldn't cause the WAL and heap corruption observed by the OP.

Is it possible that the machine has write cache enabled?

Regards,
    Jeff Davis

Re: Odd corruption issue reported on dba.stackexchange.com, need advice

From

Craig Ringer

Date:

23 July 2012, 22:16:42

On 07/24/2012 08:48 AM, Jeff Davis wrote:
> On Thu, 2012-07-19 at 11:35 +0800, Craig Ringer wrote:
>> The short version is that the person did a bulk-load of some PostGIS
>> data using the osm2pgsql data-loader tool
>> ( http://wiki.openstreetmap.org/wiki/Osm2pgsql) to populate a Pg 9.1
>> database. This appeared to succeed, but when the server was restarted
>> it failed to come up, complaining that "WAL contains references to
>> invalid pages" and "page 1493172 of relation base/16385/477861 was
>> uninitialized" (for many different pages).
>>
>> The logs of the shutdown suggest that a backend probably crashed, but
>> that shouldn't cause the WAL and heap corruption observed by the OP.
> Is it possible that the machine has write cache enabled?
It shouldn't matter - only PostgreSQL was restarted, not the whole
machine - and cleanly at that. Very strange.

--
Craig Ringer

Re: Odd corruption issue reported on dba.stackexchange.com, need advice

From

Marcin Mańk

Date:

24 July 2012, 06:31:52

On Tue, Jul 24, 2012 at 3:16 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:

It shouldn't matter - only PostgreSQL was restarted, not the whole machine - and cleanly at that. Very strange.

look at pg_ctlcluster, (which does the restart):

if (!fork()) {

close STDOUT;

exec $pg_ctl, '-D', $info{'pgdata'}, '-s', '-w', '-m', 'fast', @pg_ctl_opts_from_cli, 'stop';

} else {

wait;

$result = $? >> 8;

}

# try harder if "fast" mode does not work

if (-f $info{'pgdata'}.'/postmaster.pid') {

print "(does not shutdown gracefully, now stopping immediately)\n";

$result = system $pg_ctl, '-D', $info{'pgdata'}, '-s', '-w', '-m', 'immediate', @pg_ctl_opts_from_cli, 'stop';

}

# if that still not helps, use the big hammer

if (-f $info{'pgdata'}.'/postmaster.pid') {

print "(does not shutdown, killing the process)";

$pid = get_running_pid $info{'pgdata'}.'/postmaster.pid';

kill (9, $pid) if $pid;

unlink $info{'pgdata'}.'/postmaster.pid';

$result = 0;

}

Could the "big hammer mode" be what's killed the database?

Greetings

Marcin

Re: Odd corruption issue reported on dba.stackexchange.com, need advice

From

Craig Ringer

Date:

24 July 2012, 10:49:16

On 07/24/2012 05:31 PM, Marcin Mańk wrote:

On Tue, Jul 24, 2012 at 3:16 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:
It shouldn't matter - only PostgreSQL was restarted, not the whole machine - and cleanly at that. Very strange.
# if that still not helps, use the big hammer
if (-f $info{'pgdata'}.'/postmaster.pid') {
print "(does not shutdown, killing the process)";
$pid = get_running_pid $info{'pgdata'}.'/postmaster.pid';
kill (9, $pid) if $pid;
unlink $info{'pgdata'}.'/postmaster.pid';
$result = 0;
}

Could the "big hammer mode" be what's killed the database?

Gah, that's not very wise. While the DB is probably busy doing an immediate shutdown with I/O in flight and backends chatting, let's just KILL IT with no chance of recovery!

I don't know whether PostgreSQL is supposed to cope in this situation or if it counts as abuse beyond what's reasonable. Either way it this is the cause then either Pg or the shutdown script have a potential data corruption bug.

Tom?

--
Craig Ringer

Re: Odd corruption issue reported on dba.stackexchange.com, need advice

From

Scott Marlowe

Date:

24 July 2012, 11:20:46

On Tue, Jul 24, 2012 at 7:48 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:
> On 07/24/2012 05:31 PM, Marcin Mańk wrote:
>
> On Tue, Jul 24, 2012 at 3:16 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:
>>>
>>> It shouldn't matter - only PostgreSQL was restarted, not the whole
>>> machine - and cleanly at that. Very strange.
>
>         # if that still not helps, use the big hammer
>         if (-f $info{'pgdata'}.'/postmaster.pid') {
>             print "(does not shutdown, killing the process)";
>             $pid = get_running_pid $info{'pgdata'}.'/postmaster.pid';
>             kill (9, $pid) if $pid;
>             unlink $info{'pgdata'}.'/postmaster.pid';
>             $result = 0;
>         }
>
> Could the "big hammer mode" be what's killed the database?
>
>
> Gah, that's not very wise. While the DB is probably busy doing an immediate
> shutdown with I/O in flight and backends chatting, let's just KILL IT with
> no chance of recovery!
>
> I don't know whether PostgreSQL is supposed to cope in this situation or if
> it counts as abuse beyond what's reasonable. Either way it this is the cause
> then either Pg or the shutdown script have a potential data corruption bug.

Yes PG should theoretically survive be able to survive anything as
long as fsync is being properly honored.

Re: Odd corruption issue reported on dba.stackexchange.com, need advice

From

Tom Lane

Date:

24 July 2012, 11:45:42

Scott Marlowe <scott.marlowe@gmail.com> writes:
> On Tue, Jul 24, 2012 at 7:48 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:
>>>    # if that still not helps, use the big hammer
>>>    if (-f $info{'pgdata'}.'/postmaster.pid') {
>>>        print "(does not shutdown, killing the process)";
>>>        $pid = get_running_pid $info{'pgdata'}.'/postmaster.pid';
>>>        kill (9, $pid) if $pid;
>>>        unlink $info{'pgdata'}.'/postmaster.pid';
>>>        $result = 0;
>>>    }

>> Could the "big hammer mode" be what's killed the database?

> Yes PG should theoretically survive be able to survive anything as
> long as fsync is being properly honored.

I will tell you what is horridly, horridly dangerous and stupid about
that script, and it's not the kill -9 on the postmaster.  It's the
forced unlink on the postmaster.pid file, which (a) is entirely
unnecessary, and (b) defeats the safety interlock against starting
a new postmaster before all the old backends have flushed out.

Postgres will survive a postmaster kill just fine; that scenario
gets exercised fairly regularly, because of the Linux OOM killer :-(.
It will not survive having two independent sets of backends scribbling
on the same database, but that's what this script opens you up to.
If you ever used the "big hammer" and then started a new postmaster
before being entirely sure all the old postmaster's child processes
were gone, then that's why you have a corrupt database.

            regards, tom lane

Re: Odd corruption issue reported on dba.stackexchange.com, need advice

From

Marcin Mańk

Date:

25 July 2012, 04:43:55

I privately pointed Martin Pitt (Debian maintainer) to this
discussion, his response below.

Martin, I believe what happened is:
the original complainer did /etc/init.d/postgresql restart
this called pg_ctlcluster --force restart
this effectively did: pg_ctlcluster --force stop (which removed the
pidfile); pg_ctlcluster start;
boom.


On Wed, Jul 25, 2012 at 6:38 AM, Martin Pitt <mpitt@debian.org> wrote:
> Thanks for pointing this out!

> This mode (--force) is not on by default for command line operation,
> i. e. if you use pg_ctlcluster stop manually it will use the default
> "smart" mode and nothing else. --force is just used in the init
> script's "stop" mode, i. e. when the machine is shutting down. I tried
> to come up with something that guarantees that psql is down after that
> (we are restarting the machine -- it WILL be killed, the question is
> just how to make that as graceful as possible).

> I am happy about any improvements there. I saw Tom's mail about
> removing the .pid file; as the "start" mode checks for a stale pid
> file and cleans it up if the pid does not exist any more, or bails out
> if it still does, it is redundant. With the pointed out race condition
> I do agree that this is a dangerous thing to do, so I removed it in
> the development branch:

> http://anonscm.debian.org/loggerhead/pkg-postgresql/postgresql-common/trunk/revision/1181

> I'll see to getting that into Debian Wheezy as well (which is
> currently frozen).

> Please feel free to forward this to the thread, as I don't have the
> original mail to reply to.

Re: Odd corruption issue reported on dba.stackexchange.com, need advice

From

Marcin Mańk

Date:

25 July 2012, 04:48:24

> On Wed, Jul 25, 2012 at 6:38 AM, Martin Pitt <mpitt@debian.org> wrote:
>> http://anonscm.debian.org/loggerhead/pkg-postgresql/postgresql-common/trunk/revision/1181
>

http://anonscm.debian.org/loggerhead/pkg-postgresql/postgresql-common/trunk/annotate/1181/pg_ctlcluster

I still see some pid file unlinking there. Is this actually needed /
not dangerous?

Greetings
Marcin Mańk