Re: [BUGS] Crash observed during the start of the Postgres process - Mailing list pgsql-bugs
From | K S, Sandhya (Nokia - IN/Bangalore) |
---|---|
Subject | Re: [BUGS] Crash observed during the start of the Postgres process |
Date | |
Msg-id | AM5PR0701MB26425EA0B293B7AC0503E1E3D6C90@AM5PR0701MB2642.eurprd07.prod.outlook.com |
In response to | Re: [BUGS] Crash observed during the start of the Postgres process (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-bugs |
Hi Tom Lane,

After removing our patch to change FATAL to LOG, we are not observing the crash now.
Thank you for your support. We were stuck with this issue for a while.

Regards,
Sandhya

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Friday, May 12, 2017 10:08 PM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya.k_s@nokia.com>
Cc: 'Merlin Moncure' <mmoncure@gmail.com>; 'pgsql-hackers@postgresql.org' <pgsql-hackers@postgresql.org>; 'pgsql-bugs@postgresql.org' <pgsql-bugs@postgresql.org>; Itnal, Prakash (Nokia - IN/Bangalore) <prakash.itnal@nokia.com>; T, Rasna (Nokia - IN/Bangalore) <rasna.t@nokia.com>
Subject: Re: [BUGS] Crash observed during the start of the Postgres process

"K S, Sandhya (Nokia - IN/Bangalore)" <sandhya.k_s@nokia.com> writes:
> I have filtered the logs based on PID (19825) to see if this helps to
> debug the issue further.

Is this really a stock Postgres build?

The proximate cause of the PANIC is that the startup process is seeing other processes active even though it hasn't reachedConsistency. This is bad on any number of levels, quite aside from that particular PANIC, because those other processes are presumably seeing non-consistent database state.

Looking elsewhere in the log, we see that indeed there seem to be several backend processes happily executing commands.
For instance, here's the trace of one of them starting up:

[19810-58f473ff.4d62-187] 2017-04-17 07:51:28.783 GMT < > DEBUG: 00000: forked new backend, pid=19850 socket=10
[19810-58f473ff.4d62-188] 2017-04-17 07:51:28.783 GMT < > LOCATION: BackendStartup, postmaster.c:3884
[19850-58f47400.4d8a-1] 2017-04-17 07:51:28.783 GMT < > LOG: 57P03: the database system is starting up
[19850-58f47400.4d8a-2] 2017-04-17 07:51:28.783 GMT < > LOCATION: ProcessStartupPacket, postmaster.c:2143
[19850-58f47400.4d8a-3] 2017-04-17 07:51:28.784 GMT < authentication> DEBUG: 00000: postgres child[19850]: starting with(

Now, that LOG message proves that this backend has observed that the database is not ready to allow connections. So why did it only emit the message as LOG and keep going? The code for this in 9.3 looks like

    /*
     * If we're going to reject the connection due to database state, say so
     * now instead of wasting cycles on an authentication exchange. (This also
     * allows a pg_ping utility to be written.)
     */
    switch (port->canAcceptConnections)
    {
        case CAC_STARTUP:
            ereport(FATAL,
                    (errcode(ERRCODE_CANNOT_CONNECT_NOW),
                     errmsg("the database system is starting up")));
            break;
    ...

I can't draw any other conclusion but that you've hacked something to make FATAL act like LOG. Which is a fatal mistake. Errors that are marked FATAL are generally ones where allowing the process to keep going is not an acceptable outcome.

regards, tom lane
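The failure mode described above can be illustrated with a small standalone model. This is only a sketch and not PostgreSQL source: `Backend`, `report`, `start_backend`, and the `LVL_*` levels are hypothetical stand-ins invented for illustration, and only the `CAC_STARTUP` name and the message text are borrowed from the code quoted in the message. It assumes that a FATAL report ends the backend, while a LOG report returns to the caller, which then falls through to normal work.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified model of the rejection path; not PostgreSQL code. */
typedef enum { LVL_LOG, LVL_FATAL } Level;       /* stand-ins for report levels */
typedef enum { CAC_OK, CAC_STARTUP } CacState;   /* CAC_STARTUP as in the quoted code */

typedef struct {
    bool exited;       /* true once the simulated backend has terminated */
    bool ran_queries;  /* true if the backend went on to do normal work */
} Backend;

/* Emit a message. Returns true if the caller may continue, false if the
 * simulated backend has been terminated by a FATAL report. */
static bool report(Backend *b, Level lvl, const char *msg)
{
    fprintf(stderr, "%s: %s\n", lvl == LVL_FATAL ? "FATAL" : "LOG", msg);
    if (lvl == LVL_FATAL) {
        b->exited = true;
        return false;
    }
    return true;
}

/* Simulate backend startup. reject_level is the level used for the
 * "starting up" rejection: LVL_FATAL models the stock behaviour,
 * LVL_LOG models a local patch that downgrades it. */
static Backend start_backend(CacState state, Level reject_level)
{
    Backend b = { false, false };

    if (state == CAC_STARTUP) {
        if (!report(&b, reject_level, "the database system is starting up"))
            return b;  /* FATAL: the backend never reaches normal work */
    }

    /* Reached either when the database is ready, or when the rejection was
     * only logged and control fell through. */
    b.ran_queries = true;
    return b;
}
```

Calling `start_backend(CAC_STARTUP, LVL_FATAL)` leaves `ran_queries` false, whereas `start_backend(CAC_STARTUP, LVL_LOG)` sets it to true even though the database is still starting up. That second case corresponds to the backends seen executing commands before consistency was reached.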