Re: psql: FATAL: the database system is starting up - Mailing list pgsql-general
From | Tom K |
---|---|
Subject | Re: psql: FATAL: the database system is starting up |
Date | |
Msg-id | CAE3EmBB2Vut6Tyibj=kdrfDk1jvPXpVvD0gHJ1ysg9A6OD=6PA@mail.gmail.com Whole thread Raw |
In response to | Re: psql: FATAL: the database system is starting up (Adrian Klaver <adrian.klaver@aklaver.com>) |
Responses |
Re: psql: FATAL: the database system is starting up
|
List | pgsql-general |
On Sat, Jun 1, 2019 at 8:53 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 6/1/19 5:32 PM, Tom K wrote:
>
>
> Trying what we did above but on the second node:
>
Was this node the primary?
To me the below looks like there are replication slots set up that are
failing. Not sure how to deal with this at the moment. You might try
single-user mode:
https://www.postgresql.org/docs/10/app-postgres.html
Single-User Mode
...
and see if that at least gets the server started. This is a highly
restricted so do not expect much usability.
These servers did crash before however didn't' notice anything particularly wrong at the time. Perhaps a better check was in order.
Here is the result from attempting to start the DB in single user mode.
-bash-4.2$ /usr/pgsql-10/bin/postgres --single -D /data/patroni --config-file=/data/patroni/postgresql.conf --hot_standby=off --listen_addresses=192.168.0.124 --max_worker_processes=8 --max_locks_per_transaction=64 --wal_level=replica --cluster_name=postgres --wal_log_hints=on --max_wal_senders=10 --track_commit_timestamp=off --max_prepared_transactions=0 --port=5432 --max_replication_slots=10 --max_connections=20 -d 5 2>&1
2019-06-01 23:04:20.876 EDT [31831] DEBUG: invoking IpcMemoryCreate(size=144687104)
2019-06-01 23:04:20.877 EDT [31831] DEBUG: mmap(144703488) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
2019-06-01 23:04:20.890 EDT [31831] DEBUG: SlruScanDirectory invoking callback on pg_notify/0000
2019-06-01 23:04:20.890 EDT [31831] DEBUG: removing file "pg_notify/0000"
2019-06-01 23:04:20.890 EDT [31831] DEBUG: dynamic shared memory system will support 128 segments
2019-06-01 23:04:20.891 EDT [31831] DEBUG: created dynamic shared memory control segment 846930886 (3088 bytes)
2019-06-01 23:04:20.891 EDT [31831] DEBUG: InitPostgres
2019-06-01 23:04:20.891 EDT [31831] DEBUG: my backend ID is 1
2019-06-01 23:04:20.891 EDT [31831] LOG: database system was interrupted; last known up at 2019-04-28 06:06:24 EDT
2019-06-01 23:04:21.239 EDT [31831] LOG: invalid record length at 0/4C35CDF8: wanted 24, got 0
2019-06-01 23:04:21.239 EDT [31831] LOG: invalid primary checkpoint record
2019-06-01 23:04:21.239 EDT [31831] LOG: using previous checkpoint record at 0/4C34EDA8
2019-06-01 23:04:21.239 EDT [31831] DEBUG: redo record is at 0/4C34ED70; shutdown FALSE
2019-06-01 23:04:21.239 EDT [31831] DEBUG: next transaction ID: 0:1409831; next OID: 237578
2019-06-01 23:04:21.239 EDT [31831] DEBUG: next MultiXactId: 48; next MultiXactOffset: 174
2019-06-01 23:04:21.239 EDT [31831] DEBUG: oldest unfrozen transaction ID: 549, in database 1
2019-06-01 23:04:21.239 EDT [31831] DEBUG: oldest MultiXactId: 1, in database 1
2019-06-01 23:04:21.239 EDT [31831] DEBUG: commit timestamp Xid oldest/newest: 0/0
2019-06-01 23:04:21.239 EDT [31831] DEBUG: transaction ID wrap limit is 2147484196, limited by database with OID 1
2019-06-01 23:04:21.239 EDT [31831] DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 1
2019-06-01 23:04:21.239 EDT [31831] DEBUG: starting up replication slots
2019-06-01 23:04:21.239 EDT [31831] DEBUG: starting up replication origin progress state
2019-06-01 23:04:21.239 EDT [31831] PANIC: replication checkpoint has wrong magic 0 instead of 307747550
Aborted
-bash-4.2$
2019-06-01 23:04:20.876 EDT [31831] DEBUG: invoking IpcMemoryCreate(size=144687104)
2019-06-01 23:04:20.877 EDT [31831] DEBUG: mmap(144703488) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
2019-06-01 23:04:20.890 EDT [31831] DEBUG: SlruScanDirectory invoking callback on pg_notify/0000
2019-06-01 23:04:20.890 EDT [31831] DEBUG: removing file "pg_notify/0000"
2019-06-01 23:04:20.890 EDT [31831] DEBUG: dynamic shared memory system will support 128 segments
2019-06-01 23:04:20.891 EDT [31831] DEBUG: created dynamic shared memory control segment 846930886 (3088 bytes)
2019-06-01 23:04:20.891 EDT [31831] DEBUG: InitPostgres
2019-06-01 23:04:20.891 EDT [31831] DEBUG: my backend ID is 1
2019-06-01 23:04:20.891 EDT [31831] LOG: database system was interrupted; last known up at 2019-04-28 06:06:24 EDT
2019-06-01 23:04:21.239 EDT [31831] LOG: invalid record length at 0/4C35CDF8: wanted 24, got 0
2019-06-01 23:04:21.239 EDT [31831] LOG: invalid primary checkpoint record
2019-06-01 23:04:21.239 EDT [31831] LOG: using previous checkpoint record at 0/4C34EDA8
2019-06-01 23:04:21.239 EDT [31831] DEBUG: redo record is at 0/4C34ED70; shutdown FALSE
2019-06-01 23:04:21.239 EDT [31831] DEBUG: next transaction ID: 0:1409831; next OID: 237578
2019-06-01 23:04:21.239 EDT [31831] DEBUG: next MultiXactId: 48; next MultiXactOffset: 174
2019-06-01 23:04:21.239 EDT [31831] DEBUG: oldest unfrozen transaction ID: 549, in database 1
2019-06-01 23:04:21.239 EDT [31831] DEBUG: oldest MultiXactId: 1, in database 1
2019-06-01 23:04:21.239 EDT [31831] DEBUG: commit timestamp Xid oldest/newest: 0/0
2019-06-01 23:04:21.239 EDT [31831] DEBUG: transaction ID wrap limit is 2147484196, limited by database with OID 1
2019-06-01 23:04:21.239 EDT [31831] DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 1
2019-06-01 23:04:21.239 EDT [31831] DEBUG: starting up replication slots
2019-06-01 23:04:21.239 EDT [31831] DEBUG: starting up replication origin progress state
2019-06-01 23:04:21.239 EDT [31831] PANIC: replication checkpoint has wrong magic 0 instead of 307747550
Aborted
-bash-4.2$
-bash-4.2$ /usr/pgsql-10/bin/postgres --single -D /data/patroni --config-file=/data/patroni/postgresql.conf
2019-06-01 23:06:28.679 EDT [31969] LOG: database system was interrupted; last known up at 2019-04-28 06:06:24 EDT
2019-06-01 23:06:28.953 EDT [31969] LOG: invalid record length at 0/4C35CDF8: wanted 24, got 0
2019-06-01 23:06:28.953 EDT [31969] LOG: invalid primary checkpoint record
2019-06-01 23:06:28.953 EDT [31969] LOG: using previous checkpoint record at 0/4C34EDA8
2019-06-01 23:06:28.953 EDT [31969] PANIC: replication checkpoint has wrong magic 0 instead of 307747550
Aborted
-bash-4.2$
2019-06-01 23:06:28.679 EDT [31969] LOG: database system was interrupted; last known up at 2019-04-28 06:06:24 EDT
2019-06-01 23:06:28.953 EDT [31969] LOG: invalid record length at 0/4C35CDF8: wanted 24, got 0
2019-06-01 23:06:28.953 EDT [31969] LOG: invalid primary checkpoint record
2019-06-01 23:06:28.953 EDT [31969] LOG: using previous checkpoint record at 0/4C34EDA8
2019-06-01 23:06:28.953 EDT [31969] PANIC: replication checkpoint has wrong magic 0 instead of 307747550
Aborted
-bash-4.2$
> Removing the recovery.conf and restarting yields the following messages:
>
> 2019-06-01 20:31:12.231 EDT [21910] LOG: database system was
> interrupted; last known up at 2019-04-28 06:06:24 EDT
> 2019-06-01 20:31:12.505 EDT [21910] LOG: invalid record length at
> 0/4C35CDF8: wanted 24, got 0
> 2019-06-01 20:31:12.505 EDT [21910] LOG: invalid primary checkpoint record
> 2019-06-01 20:31:12.505 EDT [21910] LOG: using previous checkpoint
> record at 0/4C34EDA8
> 2019-06-01 20:31:12.505 EDT [21910] DEBUG: redo record is at
> 0/4C34ED70; shutdown FALSE
> 2019-06-01 20:31:12.505 EDT [21910] DEBUG: next transaction ID:
> 0:1409831; next OID: 237578
> 2019-06-01 20:31:12.505 EDT [21910] DEBUG: next MultiXactId: 48; next
> MultiXactOffset: 174
> 2019-06-01 20:31:12.505 EDT [21910] DEBUG: oldest unfrozen transaction
> ID: 549, in database 1
> 2019-06-01 20:31:12.505 EDT [21910] DEBUG: oldest MultiXactId: 1, in
> database 1
> 2019-06-01 20:31:12.505 EDT [21910] DEBUG: commit timestamp Xid
> oldest/newest: 0/0
> 2019-06-01 20:31:12.505 EDT [21910] DEBUG: transaction ID wrap limit is
> 2147484196, limited by database with OID 1
> 2019-06-01 20:31:12.505 EDT [21910] DEBUG: MultiXactId wrap limit is
> 2147483648, limited by database with OID 1
> 2019-06-01 20:31:12.505 EDT [21910] DEBUG: starting up replication slots
> 2019-06-01 20:31:12.505 EDT [21910] DEBUG: starting up replication
> origin progress state
> 2019-06-01 20:31:12.505 EDT [21910] PANIC: replication checkpoint has
> wrong magic 0 instead of 307747550
> 2019-06-01 20:31:12.506 EDT [21908] DEBUG: reaping dead processes
> 2019-06-01 20:31:12.506 EDT [21908] LOG: startup process (PID 21910)
> was terminated by signal 6: Aborted
> 2019-06-01 20:31:12.506 EDT [21908] LOG: aborting startup due to
> startup process failure
> 2019-06-01 20:31:12.506 EDT [21908] DEBUG: shmem_exit(1): 0
> before_shmem_exit callbacks to make
> 2019-06-01 20:31:12.506 EDT [21908] DEBUG: shmem_exit(1): 5
> on_shmem_exit callbacks to make
> 2019-06-01 20:31:12.506 EDT [21908] DEBUG: cleaning up dynamic shared
> memory control segment with ID 976986759
> 2019-06-01 20:31:12.508 EDT [21908] DEBUG: proc_exit(1): 2 callbacks to
> make
> 2019-06-01 20:31:12.508 EDT [21908] LOG: database system is shut down
> 2019-06-01 20:31:12.508 EDT [21908] DEBUG: exit(1)
> 2019-06-01 20:31:12.508 EDT [21908] DEBUG: shmem_exit(-1): 0
> before_shmem_exit callbacks to make
> 2019-06-01 20:31:12.508 EDT [21908] DEBUG: shmem_exit(-1): 0
> on_shmem_exit callbacks to make
> 2019-06-01 20:31:12.508 EDT [21908] DEBUG: proc_exit(-1): 0 callbacks
> to make
> 2019-06-01 20:31:12.510 EDT [21909] DEBUG: logger shutting down
> 2019-06-01 20:31:12.510 EDT [21909] DEBUG: shmem_exit(0): 0
> before_shmem_exit callbacks to make
> 2019-06-01 20:31:12.510 EDT [21909] DEBUG: shmem_exit(0): 0
> on_shmem_exit callbacks to make
> 2019-06-01 20:31:12.510 EDT [21909] DEBUG: proc_exit(0): 0 callbacks to
> make
> 2019-06-01 20:31:12.510 EDT [21909] DEBUG: exit(0)
> 2019-06-01 20:31:12.510 EDT [21909] DEBUG: shmem_exit(-1): 0
> before_shmem_exit callbacks to make
> 2019-06-01 20:31:12.510 EDT [21909] DEBUG: shmem_exit(-1): 0
> on_shmem_exit callbacks to make
> 2019-06-01 20:31:12.510 EDT [21909] DEBUG: proc_exit(-1): 0 callbacks
> to make
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
pgsql-general by date: