Thread: BUG #14199: The pg_ctl status check on server start is not compatible with the silent_mode=on

The following bug has been logged on the website:

Bug reference:      14199
Logged by:          Maksym Sobolyev
Email address:      sobomax@freebsd.org
PostgreSQL version: 9.1.22
Operating system:   FreeBSD 10.3-RELEASE amd64
Description:

There is a problem with pg_ctl when it tries to start server with the
"silent_mode=on" option enabled in postgresql.conf. Specifically, this
option causes postgres to fork once more after start. There are two problems
caused by that:

1. The pm_pid recorded by the pg_ctl when doing fork+execve no longer
matches the PID in the postmaster.pid file. This causes pg_ctl bail out
immediately.

2. Method that pg_ctl uses to poll if postgres exited prematurely no longer
works. In POSIX "child of my child is not my child", therefore it is
impossible to waitpid() on that process even if we use a correct PID from
the postmaster.pid file, while waitpid() on the original process would cause
race condition, since that process just does fork() and exit, so by the time
when real postgres has a chance to fully populate postmaster.pid it might
already be gone.

Attached patch fixes that issue by changing the way pg_ctl polls on the
child status. Instead of using waitpid(), which as described above could not
work even in principle for the "grand-children" processes, we create a
socketpair (i.e. pipe) one end of which is then passed into forked pg_ctl
and hence inherited by the postgres itself after execve.

In the unlikely event of postgres exiting prematurely that pipe would get
closed by the kernel and so that the pg_ctl would get EOF trying to do
non-blocking read on its own end, thereby being able to bail out quickly
instead of waiting for the timeout to happen. This should work nicely no
matter how many times child forks after execve().

This problem is exposed by the fact that FreeBSD port has that option set in
its default server configuration file for the version 9.1 .

https://svnweb.freebsd.org/ports/head/databases/postgresql91-server/files/patch-src%3Abackend%3Autils%3Amisc%3Apostgresql.conf.sample?annotate=340725

For some reason it was not a problem until recently. I think it might be
brought into the limelight by some internal changes in the PG's handling of
the said option.

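
The pipe-based liveness check that the report proposes can be illustrated
with a small sketch. This is not the attached patch, which is not reproduced
in this thread; it is a minimal Python illustration of the same idea, using
os.pipe() rather than a socketpair and select() with a timeout rather than a
non-blocking read. The names spawn_with_liveness_pipe, descendants_gone and
double_fork_then_sleep are hypothetical and exist only for this example.

```python
import os
import select
import time


def spawn_with_liveness_pipe(child_main):
    """Fork a child that holds the write end of a pipe; return (pid, read_fd)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: keep write_fd open. Every process forked from here on
        # inherits it, however many times the child forks again.
        os.close(read_fd)
        try:
            child_main()
        finally:
            os._exit(0)
    # Parent: close its own copy of the write end, otherwise EOF never arrives.
    os.close(write_fd)
    return pid, read_fd


def descendants_gone(read_fd, timeout):
    """Return True once every holder of the write end has exited."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if not ready:
        return False  # nothing to read yet: some descendant is still alive
    return os.read(read_fd, 1) == b""  # an empty read means EOF


def double_fork_then_sleep(seconds):
    """Hypothetical stand-in for a server that forks once more after start."""
    if os.fork() != 0:
        os._exit(0)      # intermediate process exits right away
    time.sleep(seconds)  # grandchild plays the role of the real server
    os._exit(0)
```

In this sketch the launcher can reap the intermediate process with
os.waitpid(pid, 0) and still learn about the grandchild through read_fd,
which is the property the report relies on: the kernel closes the write end
only when the last process holding it is gone.
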
sobomax@freebsd.org writes:
> There is a problem with pg_ctl when it tries to start server with the
> "silent_mode=on" option enabled in postgresql.conf. Specifically, this
> option causes postgres to fork once more after start. There are two problems
> caused by that:
> 1. The pm_pid recorded by the pg_ctl when doing fork+execve no longer
> matches the PID in the postmaster.pid file. This causes pg_ctl bail out
> immediately.
> 2. Method that pg_ctl uses to poll if postgres exited prematurely no longer
> works.

> For some reason it was not a problem until recently.

After reviewing recent commits, I realize that this was caused by
https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=c869a7d5b
While generally that was a good thing, in hindsight it's obvious that
it doesn't work with silent_mode.  That's a non-issue in 9.2 and later
since silent_mode is gone anyway, but it is an issue for 9.1.

> Attached patch fixes that issue by changing the way pg_ctl polls on the
> child status.

There is no need of this complication in >= 9.2.  We could maybe apply it
in the 9.1 branch only, but I am quite loath to accept such a nontrivial
and portability-sensitive change that way.  The main reason being that
9.1 is almost EOL: its next minor release might well be its last.  If
there's anything wrong with this approach, we may not find out about it
until after 9.1 is out of support and won't get patched anymore.

What seems like a more conservative answer to me is to revert c869a7d5b
in 9.1 only, and address the buildfarm stability issue it sought to
resolve by increasing the fixed timeout from 5 seconds to, say, 10.

Thoughts?
        regards, tom lane
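
The fixed-timeout alternative suggested above can be sketched as follows.
The thread does not say exactly what the timeout guards, so this example
makes a loudly labeled assumption: that it bounds how long the launcher
waits for the server's PID file to appear. The function name
wait_for_pid_file and its parameters are hypothetical, and the code is an
illustration in Python, not pg_ctl's actual logic.

```python
import time


def wait_for_pid_file(path, timeout=10.0, interval=0.1):
    """Poll for a PID file; return the PID on its first line, or None on timeout.

    Assumption: the first line of the file holds the server's PID, as it
    does in postmaster.pid.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(path) as f:
                first_line = f.readline().strip()
            if first_line.isdigit():
                return int(first_line)
        except FileNotFoundError:
            pass  # not created yet; keep polling
        if time.monotonic() >= deadline:
            return None  # give up: assume startup failed
        time.sleep(interval)
```

A check of this kind never calls waitpid() and never compares against the
PID returned by fork(), so an extra fork in the server does not confuse it.
The cost is that a failed start is noticed only after the full timeout.
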



On Sun, Jun 19, 2016 at 12:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What seems like a more conservative answer to me is to revert c869a7d5b
> in 9.1 only, and address the buildfarm stability issue it sought to
> resolve by increasing the fixed timeout from 5 seconds to, say, 10.

+1 for doing that. Knowing that silent_mode will be out of community
support scope in a couple of months, that's the right answer to this
bug call.
-- 
Michael



Michael Paquier <michael.paquier@gmail.com> writes:
> On Sun, Jun 19, 2016 at 12:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What seems like a more conservative answer to me is to revert c869a7d5b
>> in 9.1 only, and address the buildfarm stability issue it sought to
>> resolve by increasing the fixed timeout from 5 seconds to, say, 10.

> +1 for doing that. Knowing that silent_mode will be out of community
> support scope in a couple of months, that's the right answer to this
> bug call.

Done that way.
        regards, tom lane