BUG #18009: Postgres Recovery not happening - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18009: Postgres Recovery not happening
Date
Msg-id 18009-40a42f84af3fbda1@postgresql.org
Responses Re: BUG #18009: Postgres Recovery not happening  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18009
Logged by:          Vamshi krishna
Email address:      tvk1271@gmail.com
PostgreSQL version: 15.2
Operating system:   AIX
Description:

After an abrupt shutdown of the system, the Postgres database is unable to
find its checkpoint record.

2023-06-28 03:17:33.051 CDT|649bec9d.cc01b6|LOG:  database system was shut down at 2023-06-28 01:20:17 CDT
2023-06-28 03:17:33.054 CDT|649bec9d.cc01b6|LOG:  unexpected pageaddr 0/1000000 in log segment 000000010000000000000003, offset 0 <==
2023-06-28 03:17:33.054 CDT|649bec9d.cc01b6|LOG:  invalid primary checkpoint record
2023-06-28 03:17:33.054 CDT|649bec9d.cc01b6|PANIC:  could not locate a valid checkpoint record
2023-06-28 03:17:33.058 CDT|649bec9c.c6015c|LOG:  startup process (PID 13369782) was terminated by signal 6: IOT/Abort trap
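
For reference, the mismatch flagged in the second log line can be checked by
hand. The sketch below is only an illustration and assumes the default 16 MB
wal_segment_size, which the report does not state; the function names are
hypothetical helpers, not part of PostgreSQL. It decodes a WAL segment file
name into the LSN at which that segment should start, and maps an LSN back to
a segment number.

```python
# Assumption: default WAL segment size of 16 MB (not stated in the report).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def segment_start_lsn(filename: str, seg_size: int = WAL_SEGMENT_SIZE):
    """Return (timeline, start LSN) for a 24-hex-digit WAL segment file name."""
    if len(filename) != 24:
        raise ValueError("expected a 24-character WAL segment file name")
    timeline = int(filename[0:8], 16)
    log_id = int(filename[8:16], 16)    # high part of the segment number
    seg = int(filename[16:24], 16)      # low part of the segment number
    segs_per_log_id = 0x100000000 // seg_size
    segno = log_id * segs_per_log_id + seg
    start = segno * seg_size
    return timeline, f"{start >> 32:X}/{start & 0xFFFFFFFF:X}"


def lsn_to_segment_number(lsn: str, seg_size: int = WAL_SEGMENT_SIZE) -> int:
    """Return the number of the segment that contains the given LSN."""
    hi, lo = lsn.split("/")
    value = (int(hi, 16) << 32) | int(lo, 16)
    return value // seg_size


if __name__ == "__main__":
    print(segment_start_lsn("000000010000000000000003"))
    print(lsn_to_segment_number("0/1000000"))
```

Under that assumption, segment 000000010000000000000003 should begin at
0/3000000, while the page address reported in the log, 0/1000000, falls in
segment 1. So the first page of the file appears to carry the header of an
older segment. That would be consistent with a file whose new contents never
reached disk, but the calculation alone does not prove it.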

I verified on the OS side that no explicit fsync() call is made after
writing to the file "000000010000000000000003". I suspect the failure
occurs because the writes are still in the VMM page cache and have not been
synced to disk. After restarting my node, the database does not come up.
Can anyone help me figure out the underlying issue? The reset-WAL option
(pg_resetwal) is available, but I am afraid it will lead to data loss in the
database. The problem is easily reproducible on my node.
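
As a hedged illustration of the fsync() observation, the sketch below shows
two common ways a program can push written data from the page cache to disk.
It is not taken from PostgreSQL, and whether either path matches what the
server does on this AIX node is an assumption that would need to be checked
(for example by looking at the wal_sync_method setting). The function names
and file paths are hypothetical.

```python
import os
import tempfile


def write_then_fsync(path: str, data: bytes) -> None:
    """Write data, then flush it to stable storage with an explicit fsync()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, data)   # small payload; a full version would loop on short writes
        os.fsync(fd)         # visible as an fsync() call in a system call trace
    finally:
        os.close(fd)


def write_with_o_dsync(path: str, data: bytes) -> None:
    """Write data through a descriptor opened with O_DSYNC, where available."""
    # With O_DSYNC each write() returns only after the data is on stable
    # storage, so no separate fsync() call appears in a trace.
    # Falls back to a plain open (no durability guarantee) if O_DSYNC is absent.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_then_fsync(os.path.join(tmp, "a.bin"), b"explicit fsync")
        write_with_o_dsync(os.path.join(tmp, "b.bin"), b"synchronous open")
```

The point of the comparison is that the absence of fsync() calls in a trace
does not by itself show that data was left unsynced; a descriptor opened with
a synchronous flag would produce the same trace.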

Thanks
 Vamshi.

