Re: [PATCHES] Fix "database is ready" race condition - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: [PATCHES] Fix "database is ready" race condition
Msg-id 1170674761.3645.274.camel@silverbirch.site
In response to Re: [PATCHES] Fix "database is ready" race condition  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PATCHES] Fix "database is ready" race condition  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, 2007-02-04 at 14:15 -0500, Tom Lane wrote:
> Markus Schiltknecht <markus@bluegap.ch> writes:
> > is there a good reason to print the "database system is ready" message 
> > in StartupXLOG() in xlog.c? It has a) nothing to do with xlog and b) 
> > opens a small race condition: the message gets printed, while it still 
> > takes some CPU cycles until the postmaster really gets the SIGCHLD signal 
> > and sets StartupPID = 0. If you (or rather: an automated test program) 
> > try to connect within this timespan, you get a "database is starting up" 
> > error, which clearly contradicts the "is ready" message.
> 
> I don't think there's any compelling reason for having that log message
> in its current form.  What about redefining it to mean "postmaster is
> ready to accept connections" --- either with that wording, or keeping
> the old wording?  Then we could just put it in one place in postmaster.c
> and be done.  I think your proposed patch is overcomplicated by trying
> to have it still come out in bootstrap/standalone cases.  For a
> standalone backend, getting a prompt is what tells you it's ready ;-)

I'm OK with moving the message to be executed from another place, but I
have some comments on the changed wording.

Firstly, "Database system" is great general wording. "Postmaster" only
means something if you know the architecture of PostgreSQL, which most
people don't. 

If we did change the wording, I'd want to have separate messages for the
two events of
- database can now accept connections
- recovery is complete

One of the TODO items is to allow the database server to be available
for read-only queries while still recovering, so any change to the
wording should be made with that in mind. That way we won't need to
change it again when that feature arrives.

My suggestions would be
1. "Database system has completed recovery" and
2. "Database system is ready to accept connections"

Currently those messages would occur in that order and be issued by
StartupXLOG() for (1) and postmaster for (2). In the future they may be
issued in a different order.
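
To make the ordering point concrete, here is a minimal sketch in C of
treating the two messages as independent events. All names here
(StartupEvent, StartupState, startup_event_message, startup_report) are
hypothetical and do not exist in the PostgreSQL source; this only
illustrates the idea and is not a proposed patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types and functions, for illustration only. */
typedef enum
{
    EVT_RECOVERY_COMPLETE,      /* message (1) */
    EVT_ACCEPTING_CONNECTIONS   /* message (2) */
} StartupEvent;

typedef struct
{
    bool recovery_complete;
    bool accepting_connections;
} StartupState;

/* Map each event to the wording suggested above. */
const char *
startup_event_message(StartupEvent evt)
{
    switch (evt)
    {
        case EVT_RECOVERY_COMPLETE:
            return "Database system has completed recovery";
        case EVT_ACCEPTING_CONNECTIONS:
            return "Database system is ready to accept connections";
    }
    return NULL;                /* unknown event */
}

/*
 * Record that an event happened and return the line to log.
 * Neither event checks the other's flag, so the two messages
 * can be reported in either order.
 */
const char *
startup_report(StartupState *st, StartupEvent evt)
{
    if (evt == EVT_RECOVERY_COMPLETE)
        st->recovery_complete = true;
    else if (evt == EVT_ACCEPTING_CONNECTIONS)
        st->accepting_connections = true;
    return startup_event_message(evt);
}
```

Because the two flags are tracked separately, a caller could report
EVT_ACCEPTING_CONNECTIONS before EVT_RECOVERY_COMPLETE without either
message becoming misleading, which is the situation a read-only-during-
recovery mode would create.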

If we stick with only a single message, we should keep it the same as
now, wherever the code lives and whatever the exact timing of its
execution.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com



