Thread: Fix "database is ready" race condition

Fix "database is ready" race condition

From
Markus Schiltknecht
Date:
Hi,

is there a good reason to print the "database system is ready" message
in StartupXLOG() in xact.c? It has a) nothing to do with xlog and b)
opens a small race condition: the message gets printed, while it still
take some CPU cycles until the postmaster really gets the SIGCHLD signal
and sets StartupPID = 0. If you (or rather: an automated test program)
try to connect within this timespan, you get a "database is starting up"
error, which clearly contradicts the "is ready" message.

I admit this is not a real issue in the common case and only matters in
automated testing or some such. But in case this does not break
anything...  (ereport is used in the reaper, so I guess it's fine to use
that in signal handlers). I'm not sure if the message is needed at all
in BS_XLOG_BOOTSTRAP mode. Probably it should better say something
different.

Patch attached.

Regards

Markus
============================================================
*** src/backend/access/transam/xlog.c    2191ee8ca338d74f666b4d3509cc4361c44b4353
--- src/backend/access/transam/xlog.c    e77a26a26ec46d4479563ed7ff5885ea9c21135a
*************** StartupXLOG(void)
*** 5168,5176 ****
      /* Reload shared-memory state for prepared transactions */
      RecoverPreparedTransactions();

-     ereport(LOG,
-             (errmsg("database system is ready")));
-
      /* Shut down readFile facility, free space */
      if (readFile >= 0)
      {
--- 5168,5173 ----
============================================================
*** src/backend/bootstrap/bootstrap.c    55fd17241f51b6f23131a0d36d5ce583aa7a3488
--- src/backend/bootstrap/bootstrap.c    8a54e88b06acad46c83320ca8fe13caa75ad77b9
*************** BootstrapMain(int argc, char *argv[])
*** 418,423 ****
--- 418,425 ----
              bootstrap_signals();
              BootStrapXLOG();
              StartupXLOG();
+             ereport(LOG,
+                     (errmsg("database system is ready")));
              break;

          case BS_XLOG_STARTUP:
============================================================
*** src/backend/postmaster/postmaster.c    561d13618e62e95a32b42b2e9305a638edacf24f
--- src/backend/postmaster/postmaster.c    5a567893b0ed78d312a19e7054127dc5b6b69df3
*************** reaper(SIGNAL_ARGS)
*** 2040,2045 ****
--- 2040,2048 ----
          if (StartupPID != 0 && pid == StartupPID)
          {
              StartupPID = 0;
+             ereport(LOG,
+                     (errmsg("database system is ready")));
+
              /* Note: FATAL exit of startup is treated as catastrophic */
              if (!EXIT_STATUS_0(exitstatus))
              {

Re: Fix "database is ready" race condition

From
Tom Lane
Date:
Markus Schiltknecht <markus@bluegap.ch> writes:
> is there a good reason to print the "database system is ready" message
> in StartupXLOG() in xact.c? It has a) nothing to do with xlog and b)
> opens a small race condition: the message gets printed, while it still
> take some CPU cycles until the postmaster really gets the SIGCHLD signal
> and sets StartupPID = 0. If you (or rather: an automated test program)
> try to connect within this timespan, you get a "database is starting up"
> error, which clearly contradicts the "is ready" message.

I've applied a modified form of this patch.  The postmaster now says
"database system is ready to accept connections" after it's finished
reacting to the completion of the startup process.

            regards, tom lane

Re: Fix "database is ready" race condition

From
Markus Schiltknecht
Date:
Tom Lane wrote:
> I've applied a modified form of this patch.  The postmaster now says
> "database system is ready to accept connections" after it's finished
> reacting to the completion of the startup process.

Thank you, that's just perfect for me.

Markus