Thread: recovery long after startup

recovery long after startup

From
Rébeli-Szabó Tamás
Date:
Hello All,

I have noticed that a PostgreSQL 13 server went into recovery mode for 2
seconds during normal operation, not at startup, preventing users from
connecting.

  Here is a log excerpt:

2023-03-29 10:45:46.973 CEST [114782] [unknown]@[unknown] LOG:
  connection received: host=10.150.52.16 port=53965
2023-03-29 10:45:47.088 CEST [114782] appuser@registryapp LOG:
  connection authorized: user=appuser database=registryapp
2023-03-29 10:48:57.483 CEST [115282] [unknown]@[unknown] LOG:
  connection received: host=192.168.1.22 port=63283
023-03-29 10:48:57.516 CEST [114942] LOG:  database system was not
properly shut down; automatic recovery in progress
2023-03-29 10:48:57.552 CEST [114942] LOG:  redo starts at 25/CAB35218
2023-03-29 10:48:57.555 CEST [114942] LOG:  invalid record length at
25/CAB75DC0: wanted 24, got 0
2023-03-29 10:48:57.555 CEST [114942] LOG:  redo done at 25/CAB75D88
2023-03-29 10:48:57.575 CEST [115282] appuser@registryapp FATAL:  the
database system is in recovery mode
2023-03-29 10:48:57.590 CEST [115283] [unknown]@[unknown] LOG:
  connection received: host=10.150.1.1 port=63284
2023-03-29 10:48:57.875 CEST [114942] LOG:  checkpoint starting:
end-of-recovery immediate
2023-03-29 10:48:57.944 CEST [115287] appuser@registryapp FATAL:  the
database system is in recovery mode
2023-03-29 10:48:57.950 CEST [115288] [unknown]@[unknown] LOG:
  connection received: host=10.150.1.15 port=63288
2023-03-29 10:48:58.057 CEST [114942] LOG:  checkpoint complete: wrote
32 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recy
cled; write=0.008 s, sync=0.075 s, total=0.202 s; sync files=11,
longest=0.025 s, average=0.007 s; distance=258 kB, estimate=258 kB
2023-03-29 10:48:58.117 CEST [23615] LOG:  database system is ready to
accept connections

The server definitely did not restart:

postgres=# select pg_postmaster_start_time();
     pg_postmaster_start_time
---------------------------------------------
2023-02-14 20:18:30.689732+01


I am wondering why the server would do crash recovery long after it has
started. Is that normal? What would trigger such an event?

Looking at the source code, my understanding is that recovery is
implemented by the InitWalRecovery function, which is called only from
the StartupXLOG function, whose comment says: "This must be called ONCE
during postmaster or standalone-backend startup." I must be missing
something.

Regards,

tamas





Re: recovery long after startup

From
Tom Lane
Date:
=?utf-8?Q?R=C3=A9beli-Szab=C3=B3_Tam=C3=A1s?= <pub@rblst.info> writes:
> I have noticed that a PostgreSQL 13 server went into recovery mode for 2
> seconds during normal operation, not at startup, preventing users from
> connecting.

The most obvious explanation is that a backend (or other postmaster
child process) crashed.  I lack an explanation of why your log
trace doesn't show anything about that, but that seems like an
idea to pursue.  What is the logging mechanism you're using?

            regards, tom lane