Re: BUG #13643: Should a process dying bring postgresql down, or not? - Mailing list pgsql-bugs

From Amir Rohan
Subject Re: BUG #13643: Should a process dying bring postgresql down, or not?
Date
Msg-id 5609B428.6020006@mail.com
In response to Re: BUG #13643: Should a process dying bring postgresql down, or not?  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: BUG #13643: Should a process dying bring postgresql down, or not?  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-bugs
On 09/28/2015 12:06 AM, Alvaro Herrera wrote:
> Amir Rohan wrote:
>> On 09/27/2015 09:59 PM, Alvaro Herrera wrote:
>>> amir.rohan@mail.com wrote:
>>>
>>>> postgres     2181  0.0  0.1 134468  9504 pts/0    T    03:34   0:00 /usr/local/pgsql/bin/postgres -D /home/local/pg/s1
>>>> postgres     2183  0.0  0.0 134576  4168 ?        Ss   03:34   0:00 postgres: checkpointer process
>>>> postgres     2184  0.0  0.0 134604  2844 ?        Ss   03:34   0:00 postgres: writer process
>>>> postgres     2185  0.0  0.0 134468  2780 ?        Ss   03:34   0:00 postgres: wal writer process
>>>> postgres     2186  0.0  0.0      0     0 ?        Zs   03:34   0:00 [postgres] <defunct>         <<<<<<<<<<<<<<< dead process
>>>> postgres     2187  0.0  0.0 127300  2204 ?        Ss   03:34   0:00 postgres: stats collector process
>>>> postgres     2193  0.0  0.0 118164  2696 pts/0    T    03:34   0:00 pg_basebackup -D /home/local/pg/backup -p 57833 --format=t -x
>>>> postgres     2194  0.0  0.0 134916  6016 ?        Ss   03:34   0:00 postgres: wal sender process user1 [local] sending backup "pg_basebackup base backup"
>>>
>>> That postmaster is in STOPped mode is the issue here.  That doesn't
>>> happen unless you take specific action to do that.
>>
>> I hadn't noticed that.  That looks like I suspended pg_ctl during start,
>>  but with the backup in progress already, it's not clear how I managed
>> that state. There was no kill -SIGSTOP involved...
>
> Suspending a process *is* sending sigstop.  You may not have sent
> sigstop explicitly, but the shell would have done it if you suspended
> the process.
>
> Since pg_ctl is not normally long-lived, I'm not sure how you ended up
> suspending it.
>
>> After killing some subprocesses at random I do see postgres
>> restarting the whole group once one goes down, if/once it's
>> running/unsuspended.
>
> Well, doing things randomly is unlikely to teach you much ...
>

Pardon my earlier HTML response, I had to use the webmail interface at
the time. Sending again as text.

>
>
> Sent: Monday, September 28, 2015 at 12:06 AM
> From: "Alvaro Herrera" <alvherre@2ndquadrant.com>
> To: "Amir Rohan" <amir.rohan@mail.com>
> Cc: pgsql-bugs@postgresql.org
> Subject: Re: BUG #13643: Should a process dying bring postgresql down, or not?

> Amir Rohan wrote:
>> On 09/27/2015 09:59 PM, Alvaro Herrera wrote:
>> > amir.rohan@mail.com wrote:
>> >
>> >> postgres 2181 0.0 0.1 134468 9504 pts/0 T 03:34 0:00 /usr/local/pgsql/bin/postgres -D /home/local/pg/s1
>> >> postgres 2183 0.0 0.0 134576 4168 ? Ss 03:34 0:00 postgres: checkpointer process
>> >> postgres 2184 0.0 0.0 134604 2844 ? Ss 03:34 0:00 postgres: writer process
>> >> postgres 2185 0.0 0.0 134468 2780 ? Ss 03:34 0:00 postgres: wal writer process
>> >> postgres 2186 0.0 0.0 0 0 ? Zs 03:34 0:00 [postgres] <defunct> <<<<<<<<<<<<<<< dead process
>> >> postgres 2187 0.0 0.0 127300 2204 ? Ss 03:34 0:00 postgres: stats collector process
>> >> postgres 2193 0.0 0.0 118164 2696 pts/0 T 03:34 0:00 pg_basebackup -D /home/local/pg/backup -p 57833 --format=t -x
>> >> postgres 2194 0.0 0.0 134916 6016 ? Ss 03:34 0:00 postgres: wal sender process user1 [local] sending backup "pg_basebackup base backup"
>> >
>> > That postmaster is in STOPped mode is the issue here. That doesn't
>> > happen unless you take specific action to do that.
>>
>> I hadn't noticed that. That looks like I suspended pg_ctl during start,
>> but with the backup in progress already, it's not clear how I managed
>> that state. There was no kill -SIGSTOP involved...
>
> Suspending a process *is* sending sigstop. You may not have sent
> sigstop explicitly, but the shell would have done it if you suspended
> the process.
>

I *know*. But as you can see, the backup process was already underway.
That means pg_ctl had returned by then, and I had already issued the
pg_basebackup command. Since I didn't manually send a SIGSTOP,
and postgres was already detached by then, I don't know how it
could have gotten suspended.
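
For anyone trying to read the same kind of listing, here is a minimal sketch of decoding the STAT column mechanically. It assumes the `ps aux` column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND) and the Linux state letters (T = stopped, Z = zombie). The function names and the `SAMPLE_PS_LINES` list are made up for illustration; the sample lines are copied from the listing quoted above.

```python
# First letter of the STAT column, as documented for Linux ps.
STATE_NAMES = {
    "R": "running",
    "S": "sleeping",
    "D": "uninterruptible sleep",
    "T": "stopped",
    "Z": "zombie",
}

# Four lines taken from the ps listing quoted in this thread.
SAMPLE_PS_LINES = [
    "postgres 2181 0.0 0.1 134468 9504 pts/0 T 03:34 0:00 /usr/local/pgsql/bin/postgres -D /home/local/pg/s1",
    "postgres 2183 0.0 0.0 134576 4168 ? Ss 03:34 0:00 postgres: checkpointer process",
    "postgres 2186 0.0 0.0 0 0 ? Zs 03:34 0:00 [postgres] <defunct>",
    "postgres 2193 0.0 0.0 118164 2696 pts/0 T 03:34 0:00 pg_basebackup -D /home/local/pg/backup -p 57833 --format=t -x",
]


def parse_ps_line(line):
    """Split one `ps aux` line into (pid, tty, state_name, command)."""
    # Ten leading columns, then the command, which may contain spaces.
    fields = line.split(None, 10)
    pid, tty, stat, command = int(fields[1]), fields[6], fields[7], fields[10]
    return pid, tty, STATE_NAMES.get(stat[0], "other"), command


def find_suspect_processes(ps_lines):
    """Return (pid, tty, state_name) for every stopped or zombie process."""
    suspects = []
    for line in ps_lines:
        pid, tty, state, _command = parse_ps_line(line)
        if state in ("stopped", "zombie"):
            suspects.append((pid, tty, state))
    return suspects


if __name__ == "__main__":
    for pid, tty, state in find_suspect_processes(SAMPLE_PS_LINES):
        print(pid, tty, state)
```

On the sample lines this flags PIDs 2181 and 2193 as stopped, both attached to pts/0, and PID 2186 as a zombie. That matches the listing, although it says nothing about what stopped them.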

> Since pg_ctl is not normally long-lived, I'm not sure how you ended up
> suspending it.
>

Exactly.

>> After killing some subprocesses at random I do see postgres
>> restarting the whole group once one goes down, if/once it's
>> running/unsuspended.

>
> Well, doing things randomly is unlikely to teach you much ...
>

Well, it can teach you which electric socket will
electrocute you when poked with a fork. That's useful data.

Amir
