Thread: What is the best and easiest implementation to reliably wait for the completion of startup?
What is the best and easiest implementation to reliably wait for the completion of startup?
From
"MauMau"
Date:
Hello,

I've encountered a problem with PostgreSQL startup, and I can think of a simple solution for it. However, since I don't yet have much knowledge of the PostgreSQL implementation, I'd like to ask you what the best and easiest solution is. If it is easy enough for me to work on in my spare time at home, I'm willing to implement the patch.

[problem]
I can't reliably wait for the completion of PostgreSQL startup. I want pg_ctl to wait until the server completes startup and accepts connections. Yes, we have the "-w" and "-t wait_second" options of pg_ctl. However, what value should I specify for -t? I have to specify a long time, say 3600 seconds, in case startup processing takes long because of crash recovery or archive recovery. The bad thing is that pg_ctl continues to wait until the specified duration passes, even if postgres fails to start. For example, it is naturally desirable for pg_ctl to terminate when postgresql.conf contains a syntax error.

[solution idea]
Use unnamed pipes for postmaster to notify pg_ctl of the completion of startup. That is:

pg_ctl's steps:
1. create a pair of unnamed pipes.
2. start postgres.
3. read the pipe, waiting for a startup completion message from postmaster.

postmaster's steps:
1. inherit a pair of unnamed pipes from pg_ctl.
2. do startup processing.
3. write a startup completion message to the pipe, then close the pipe.

I'm wondering if this is correct and easy. One concern is whether postmaster can inherit pipes through a system() call. Please give me your ideas. Of course, I would be very happy if some experienced community member could address this problem.

And finally, do you think this should be handled as a bug, or as an improvement in 9.2?

Regards
MauMau
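The pipe handshake described in the steps above can be sketched roughly as follows. This is only an illustration in Python, not pg_ctl or postmaster code: the child script stands in for the server, and the names `READY`, `CHILD_SRC`, and `start_and_wait` as well as the message format are assumptions made for the example. It relies on POSIX descriptor inheritance (`pass_fds`), so it does not cover Windows.

```python
import os
import subprocess
import sys

READY = b"ready\n"  # hypothetical startup-completion message

# Stand-in for the server: reports readiness on the inherited descriptor,
# or exits early to imitate a failure such as a configuration syntax error.
CHILD_SRC = """
import os, sys
fd = int(sys.argv[1])
if sys.argv[2] == "fail":
    sys.exit(1)              # exit without reporting anything
os.write(fd, b"ready\\n")    # startup finished: notify the parent
os.close(fd)
"""


def start_and_wait(mode):
    """Start the child and return True only if it reports a completed startup."""
    read_fd, write_fd = os.pipe()
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SRC, str(write_fd), mode],
        pass_fds=(write_fd,),  # let the child inherit the write end
    )
    # The parent must close its own copy of the write end; otherwise the
    # read below never sees end-of-file when the child dies.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        # Blocks until every write end is closed, which happens either
        # after the child reports readiness or as soon as the child exits.
        data = pipe.read()
    child.wait()
    return data == READY
```

The point of the design is that no timeout value has to be guessed: a failed start closes the pipe without a message, so the reader returns immediately with an empty result instead of waiting out the -t duration. On POSIX systems, descriptors that are not marked close-on-exec survive fork and exec, which is the mechanism the system() concern depends on.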
Re: What is the best and easiest implementation to reliably wait for the completion of startup?
From
Tom Lane
Date:
"MauMau" <maumau307@gmail.com> writes:
> The bad thing is that pg_ctl continues to wait until the specified duration
> passes, even if postgres fails to start. For example, it is naturally
> desirable for pg_ctl to terminate when postgresql.conf contains a syntax
> error.

Hmm, I thought we'd fixed this in the last go-round of pg_ctl wait revisions, but testing proves it does not work desirably in HEAD: not only does pg_ctl wait till its timeout elapses, but it then reports "server started" even though the server didn't start. That's clearly a bug :-(

I think your proposal of a pipe-based solution might be overkill though. Seems like it would be sufficient for pg_ctl to give up if it doesn't see the postmaster.pid file present within a couple of seconds of postmaster startup. I don't really want to add logic to the postmaster to have the sort of reporting protocol you propose, because not everybody uses pg_ctl to start the postmaster. In any case, we need a fix in 9.1 ...

regards, tom lane
Re: What is the best and easiest implementation to reliably wait for the completion of startup?
From
"MauMau"
Date:
From: "Tom Lane" <tgl@sss.pgh.pa.us>
> "MauMau" <maumau307@gmail.com> writes:
>> The bad thing is that pg_ctl continues to wait until the specified
>> duration passes, even if postgres fails to start. For example, it is
>> naturally desirable for pg_ctl to terminate when postgresql.conf
>> contains a syntax error.
>
> Hmm, I thought we'd fixed this in the last go-round of pg_ctl wait
> revisions, but testing proves it does not work desirably in HEAD:
> not only does pg_ctl wait till its timeout elapses, but it then reports
> "server started" even though the server didn't start. That's clearly a
> bug :-(
>
> I think your proposal of a pipe-based solution might be overkill though.
> Seems like it would be sufficient for pg_ctl to give up if it doesn't
> see the postmaster.pid file present within a couple of seconds of
> postmaster startup. I don't really want to add logic to the postmaster
> to have the sort of reporting protocol you propose, because not
> everybody uses pg_ctl to start the postmaster. In any case, we need a
> fix in 9.1 ...

Yes, I was a bit afraid the pipe-based fix might be overkill, too, so I was wondering if there might be an easier solution.

"server started"... I missed that. It is certainly a bug, as you say.

I was also considering the postmaster.pid-based solution exactly as you suggest, but it has a problem -- how many seconds do we assume for "a couple of seconds"? If the system load is temporarily so high that postmaster takes many seconds to create postmaster.pid, pg_ctl mistakenly concludes that postmaster failed to start. I know this is a hypothetical, rare case. I don't like touching the postmaster logic and complicating it, but logical correctness needs to come first (Japanese users are very severe).

Another problem with the postmaster.pid-based solution arises after postmaster crashes. When postmaster crashes, postmaster.pid is left behind. If the pid in postmaster.pid has since been allocated to some non-postgres process and that process is still running, pg_ctl misjudges that postmaster is starting up, and waits for a long time.

Regards
MauMau
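The stale-pid problem can be made concrete with a small sketch. It assumes only that the first line of postmaster.pid holds the process ID; the helper names `read_pid` and `pid_is_alive` are hypothetical, and the signal-0 probe is POSIX-specific. The check can tell that some process owns the PID, but not that the process is a postmaster.

```python
import os


def read_pid(pid_file):
    """Return the PID on the first line of pid_file, or None if unreadable."""
    try:
        with open(pid_file) as f:
            return int(f.readline().strip())
    except (OSError, ValueError):
        return None


def pid_is_alive(pid):
    """Return True if any process, postgres or not, currently has this PID."""
    if pid is None or pid <= 0:
        return False          # 0 and negative values address process groups
    try:
        os.kill(pid, 0)       # signal 0: existence check, nothing is delivered
    except ProcessLookupError:
        return False          # no such process: the pid file is stale
    except PermissionError:
        return True           # process exists but belongs to another user
    return True
```

If a crashed postmaster leaves the file behind and its PID is later reused, `pid_is_alive(read_pid(path))` returns True for the unrelated process, which is exactly the misjudgment described above. A liveness probe on its own is therefore not enough to decide that the server is still starting.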