Thread: BUG #17093: invalid primary checkpoint record

BUG #17093: invalid primary checkpoint record

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17093
Logged by:          Mohan nagandlla
Email address:      nagandllamohan@gmail.com
PostgreSQL version: 13.3
Operating system:   Alpine
Description:

Hi Team, I am hitting the error "invalid primary checkpoint record"
repeatedly. Each time it happens I run pg_resetwal -f; the reset reports
success and the server then runs fine. But after some time the issue comes
back. I am not sure why it keeps recurring so frequently.
```
2021-07-08 04:16:24.729 UTC [22167] LOG:  starting PostgreSQL 13.2 on
x86_64-pc-linux-musl, compiled by gcc (Alpine 10.2.1_pre1) 10.2.1 20201203,
64-bit
2021-07-08 04:16:24.729 UTC [22167] LOG:  listening on IPv4 address
"0.0.0.0", port 5432
2021-07-08 04:16:24.729 UTC [22167] LOG:  listening on IPv6 address "::",
port 5432
2021-07-08 04:16:24.732 UTC [22167] LOG:  listening on Unix socket
"/tmp/.s.PGSQL.5432"
2021-07-08 04:16:24.777 UTC [22168] LOG:  database system was interrupted;
last known up at 2021-07-08 01:52:08 UTC
2021-07-08 04:16:33.441 UTC [22168] LOG:  invalid primary checkpoint
record
2021-07-08 04:16:33.441 UTC [22168] PANIC:  could not locate a valid
checkpoint record
2021-07-08 04:16:33.443 UTC [22167] LOG:  startup process (PID 22168) was
terminated by signal 6: Aborted
2021-07-08 04:16:33.443 UTC [22167] LOG:  aborting startup due to startup
process failure
2021-07-08 04:16:33.448 UTC [22167] LOG:  database system is shut down
2021-07-08 04:16:53.617 UTC [22332] LOG:  starting PostgreSQL 13.2 on
x86_64-pc-linux-musl, compiled by gcc (Alpine 10.2.1_pre1) 10.2.1 20201203,
64-bit
2021-07-08 04:16:53.617 UTC [22332] LOG:  listening on IPv4 address
"0.0.0.0", port 5432
2021-07-08 04:16:53.617 UTC [22332] LOG:  listening on IPv6 address "::",
port 5432
2021-07-08 04:16:53.619 UTC [22332] LOG:  listening on Unix socket
"/tmp/.s.PGSQL.5432"
2021-07-08 04:16:53.662 UTC [22333] LOG:  database system was interrupted;
last known up at 2021-07-08 01:52:08 UTC
2021-07-08 04:16:57.053 UTC [22352] FATAL:  the database system is starting
up
2021-07-08 04:16:57.055 UTC [22353] FATAL:  the database system is starting
up
2021-07-08 04:17:00.985 UTC [22333] LOG:  invalid primary checkpoint
record
2021-07-08 04:17:00.985 UTC [22333] PANIC:  could not locate a valid
checkpoint record
2021-07-08 04:17:00.987 UTC [22332] LOG:  startup process (PID 22333) was
terminated by signal 6: Aborted
2021-07-08 04:17:00.987 UTC [22332] LOG:  aborting startup due to startup
process failure
2021-07-08 04:17:00.992 UTC [22332] LOG:  database system is shut down
2021-07-08 04:17:31.189 UTC [22540] LOG:  starting PostgreSQL 13.2 on
x86_64-pc-linux-musl, compiled by gcc (Alpine 10.2.1_pre1) 10.2.1 20201203,
64-bit
2021-07-08 04:17:31.189 UTC [22540] LOG:  listening on IPv4 address
"0.0.0.0", port 5432
2021-07-08 04:17:31.189 UTC [22540] LOG:  listening on IPv6 address "::",
port 5432
2021-07-08 04:17:31.192 UTC [22540] LOG:  listening on Unix socket
"/tmp/.s.PGSQL.5432"
2021-07-08 04:17:31.259 UTC [22541] LOG:  database system was interrupted;
last known up at 2021-07-08 01:52:08 UTC
2021-07-08 04:17:35.643 UTC [22560] FATAL:  the database system is starting
up
2021-07-08 04:17:35.644 UTC [22561] FATAL:  the database system is starting
up
2021-07-08 04:17:38.812 UTC [22541] LOG:  invalid primary checkpoint
record
2021-07-08 04:17:38.812 UTC [22541] PANIC:  could not locate a valid
checkpoint record
2021-07-08 04:17:38.813 UTC [22540] LOG:  startup process (PID 22541) was
terminated by signal 6: Aborted
2021-07-08 04:17:38.813 UTC [22540] LOG:  aborting startup due to startup
process failure
2021-07-08 04:17:38.818 UTC [22540] LOG:  database system is shut down
```
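
For readers scanning the log above: the mail client wrapped each entry across two or three lines, which makes it awkward to search. A minimal sketch, assuming every entry starts with a timestamp in the format seen in this log; the helper names `unwrap_log` and `entries_with_level` are hypothetical and not part of PostgreSQL. It rejoins the wrapped lines and picks out the PANIC entries:

```python
import re

# An entry starts with a timestamp, a time zone, and a PID in brackets,
# e.g. "2021-07-08 04:16:24.729 UTC [22167] ". Anything else is treated
# as a continuation of the previous entry (an assumption of this sketch).
ENTRY_START = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} \w+ \[\d+\] "
)


def unwrap_log(text):
    """Rejoin mail-wrapped continuation lines onto their log entry."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def entries_with_level(entries, level):
    """Return the entries logged at the given severity (e.g. "PANIC")."""
    marker = "] " + level + ":"
    return [entry for entry in entries if marker in entry]


# Three entries copied from the log in this thread, still wrapped.
SAMPLE = """\
2021-07-08 04:16:24.777 UTC [22168] LOG:  database system was interrupted;
last known up at 2021-07-08 01:52:08 UTC
2021-07-08 04:16:33.441 UTC [22168] LOG:  invalid primary checkpoint
record
2021-07-08 04:16:33.441 UTC [22168] PANIC:  could not locate a valid
checkpoint record
"""

if __name__ == "__main__":
    for entry in entries_with_level(unwrap_log(SAMPLE), "PANIC"):
        print(entry)
```

Run against the full log, this would list one PANIC entry per failed startup attempt, which makes it easier to see that each restart fails at the same point.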


Re: BUG #17093: invalid primary checkpoint record

From
Mohan Nagandlla
Date:
Hi team,
Any updates on this?
My server crashed again because of an invalid checkpoint.


Re: BUG #17093: invalid primary checkpoint record

From
Michael Paquier
Date:
On Mon, Jul 12, 2021 at 11:06:23AM +0530, Mohan Nagandlla wrote:
> Hi team,
> Any updates on this?
> My server crashed again because of an invalid checkpoint.

At first glance, your instance looks corrupted: the WAL segment
holding the checkpoint record that is read at the beginning of
recovery should already have been flushed to disk.  Please see here:
https://wiki.postgresql.org/wiki/Corruption

Also, note that running pg_resetwal blindly can cause corruption in
many ways.  On top of that, something may be wrong in your
configuration.  fsync = off could be an issue, for one.  In short, it
is not really possible to say what's wrong without knowing more about
how you are using PostgreSQL.
--
Michael
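
As a follow-up to the configuration point above, here is a minimal sketch of how one might scan a postgresql.conf for durability settings that are switched off. It is not an official tool: the function names are hypothetical, the sample configuration is invented for illustration (it is not the reporter's), and including `full_page_writes` alongside `fsync` is an assumption of this sketch, since only `fsync` is named in the thread. The parser is deliberately simplified and does not handle `#` inside quoted values or include directives.

```python
# Settings whose "off" value weakens crash safety. Only fsync is named in
# the thread; full_page_writes is an extra assumption of this sketch.
RISKY_WHEN_OFF = ("fsync", "full_page_writes")

# Common spellings of a false boolean in postgresql.conf.
OFF_VALUES = {"off", "false", "no", "0"}


def parse_conf(text):
    """Parse "name = value" lines into a dict (simplified parser)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        if "=" in line:
            name, value = line.split("=", 1)
        else:
            # The equals sign is optional in postgresql.conf.
            parts = line.split(None, 1)
            if len(parts) != 2:
                continue
            name, value = parts
        settings[name.strip().lower()] = value.strip().strip("'").lower()
    return settings


def risky_settings(text):
    """Return the names of durability settings that are turned off."""
    settings = parse_conf(text)
    return [name for name in RISKY_WHEN_OFF
            if settings.get(name) in OFF_VALUES]


# Hypothetical excerpt, NOT taken from the reporter's system.
SAMPLE_CONF = """\
fsync = off            # skips flushing WAL to disk
full_page_writes = on
shared_buffers = 128MB
"""

if __name__ == "__main__":
    for name in risky_settings(SAMPLE_CONF):
        print("durability setting turned off:", name)
```

On a running server, `SHOW fsync;` in psql reports the effective value, which is more reliable than reading the file because settings can also be overridden elsewhere.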
