Thread: Multiple postmasters running from same directory

Multiple postmasters running from same directory

From: Vikas Sharma
Hi,

We are running PostgreSQL 9.4 with streaming replication and repmgr. The operating system is RHEL 6.8.

On the master I can see multiple postmaster processes running from the same data directory.

ps -ef |grep -i postgres|grep postm
postgres  81440      1  0 Jan31 ?        00:11:37 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
postgres  97072  81440  0 12:17 ?        00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
postgres  97074  81440  0 12:17 ?        00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data

The streaming replication with one standby looks fine.

I was expecting to see only one postmaster process instead of three. The time shown in the ps output for the two extra processes changes to the current time with every ps command I run. Also, my log file is full of "incomplete startup packet" messages.

I need help from you experts: is this the right behaviour for Postgres? What could have gone wrong in my case?

Best Regards
Vikas

Re: Multiple postmasters running from same directory

From: Laurenz Albe
Vikas Sharma wrote:
> We are running PostgreSQL 9.4 with streaming replication and repmgr. The operating system is RHEL 6.8.
> 
> On the master I can see multiple postmaster processes from the same data directory. 
> 
> ps -ef |grep -i postgres|grep postm
> postgres  81440      1  0 Jan31 ?        00:11:37 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
> postgres  97072  81440  0 12:17 ?        00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
> postgres  97074  81440  0 12:17 ?        00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
> 
> The streaming replication with one standby looks fine.
> 
> I was expecting to see only one postmaster process instead of three. The time shown in
> the ps output for the two extra processes changes to the current time with every ps command I run.
> Also, my log file is full of "incomplete startup packet" messages.
> 
> I need help from you experts: is this the right behaviour for Postgres? What could have gone wrong in my case?

That looks ok.

The two other processes are children of the postmaster.
It is strange that their process title did not get updated.

What do you see for the processes with "pid" 97072 and 97074 in pg_stat_activity?

The "incomplete startup packet" is caused by processes that connect to the
PostgreSQL TCP port, but don't complete a database connection.
Often these are monitoring or load balancing programs.

Yours,
Laurenz Albe


Re: Multiple postmasters running from same directory

From: Tom Lane
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> Vikas Sharma wrote:
>> On the master I can see multiple postmaster processes from the same data directory.
>> ps -ef |grep -i postgres|grep postm
>> postgres  81440      1  0 Jan31 ?        00:11:37 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
>> postgres  97072  81440  0 12:17 ?        00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
>> postgres  97074  81440  0 12:17 ?        00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data

> The two other processes are children of the postmaster.
> It is strange that their process title did not get updated.

Seeing that they're showing zero runtime, I bet that these are just-forked
children that have not had time to change their process title yet.
The thing that is strange is that you have a steady enough flow of new
connections that there are usually some children like that.

> The "incomplete startup packet" is caused by processes that connect to the
> PostgreSQL TCP port, but don't complete a database connection.
> Often these are monitoring or load balancing programs.

Putting two and two together, you have some monitoring program that is
hitting the postmaster with a constant stream of TCP connection requests
none of which get completed, resulting in a whole lot of useless fork
activity.  Dial down the monitoring.

            regards, tom lane


Re: Multiple postmasters running from same directory

From: Francisco Olarte
On Tue, Feb 13, 2018 at 4:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Laurenz Albe <laurenz.albe@cybertec.at> writes:
>> Vikas Sharma wrote:
>>> On the master I can see multiple postmaster processes from the same data directory.
>>> ps -ef |grep -i postgres|grep postm
>>> postgres  81440      1  0 Jan31 ?        00:11:37 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
>>> postgres  97072  81440  0 12:17 ?        00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
>>> postgres  97074  81440  0 12:17 ?        00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
>
>> The two other processes are children of the postmaster.
>> It is strange that their process title did not get updated.
>
> Seeing that they're showing zero runtime, I bet that these are just-forked
> children that have not had time to change their process title yet.
> The thing that is strange is that you have a steady enough flow of new
> connections that there are usually some children like that.

I assume the process title is changed only after startup completes, since it shows the database and user.

>> The "incomplete startup packet" is caused by processes that connect to the
>> PostgreSQL TCP port, but don't complete a database connection.
>> Often these are monitoring or load balancing programs.
>
> Putting two and two together, you have some monitoring program that is
> hitting the postmaster with a constant stream of TCP connection requests
> none of which get completed, resulting in a whole lot of useless fork
> activity.  Dial down the monitoring.

Adding the incomplete startup packets to the mix, it may be a misconfigured
monitoring program sending just a byte or two, or zero, and then
waiting for a response, which will give ps more time to catch the child
in that state. I haven't looked at the code, but given that messages start with
1 identifier byte plus a 4-byte length, many ways of reading them
would lead to a long wait for at least 5 bytes, or for the first
byte.

Francisco Olarte.
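
For reference, a minimal Python sketch of how a protocol 3.0 startup packet is laid out on the wire. Unlike regular messages, the startup packet has no leading identifier byte: it is a 4-byte length (which counts itself), a 4-byte protocol version, then NUL-terminated name/value pairs and a final NUL. The function name and the user/database values are made up for illustration.

```python
import struct

PROTOCOL_VERSION_3_0 = 3 << 16  # major version 3, minor version 0


def build_startup_packet(params):
    """Build a protocol 3.0 startup packet from a dict of parameters."""
    body = struct.pack("!I", PROTOCOL_VERSION_3_0)
    for name, value in params.items():
        body += name.encode("utf-8") + b"\x00" + value.encode("utf-8") + b"\x00"
    body += b"\x00"  # terminator after the last name/value pair
    # The length field counts its own 4 bytes as well as the body.
    return struct.pack("!I", len(body) + 4) + body


# Hypothetical connection parameters, for illustration only.
packet = build_startup_packet({"user": "monitor", "database": "postgres"})
```

A probe that connects and then closes the socket before sending all of these bytes would be one way to produce the "incomplete startup packet" log message.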


Re: Multiple postmasters running from same directory

From: Tom Lane
Francisco Olarte <folarte@peoplecall.com> writes:
> On Tue, Feb 13, 2018 at 4:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Putting two and two together, you have some monitoring program that is
>> hitting the postmaster with a constant stream of TCP connection requests
>> none of which get completed, resulting in a whole lot of useless fork
>> activity.  Dial down the monitoring.

> Adding the incomplete startup packets to the mix, it may be a misconfigured
> monitoring program sending just a byte or two, or zero, and then
> waiting for a response, which will give ps more time to catch the child
> in that state. I haven't looked at the code, but given that messages start with
> 1 identifier byte plus a 4-byte length, many ways of reading them
> would lead to a long wait for at least 5 bytes, or for the first
> byte.

Hm, yeah.  From memory, the child process will wait a maximum of 60
seconds to receive a startup packet.  If the hypothesized probing program
sends nothing, or just a small number of bytes, and then sits rather than
closing the connection, then this state would easily persist long enough
to be observable in ps.

If you're not sure where these probes are coming from, turning on
log_connections should help: the "connection received" message comes out
before waiting for the startup packet.

            regards, tom lane
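
Once log_connections is on, the probing host can be picked out by counting "connection received" messages per client address. A rough Python sketch, assuming the log lines have already been read into a list: the sample lines and addresses below are invented, and the regular expression matches only the message body because the line prefix depends on log_line_prefix.

```python
import re
from collections import Counter

# Matches the message body logged when log_connections is on, e.g.
#   LOG:  connection received: host=10.0.0.5 port=41234
CONNECTION_RE = re.compile(r"connection received: host=(\S+)")


def count_connections_by_host(log_lines):
    """Count "connection received" messages per client host."""
    counts = Counter()
    for line in log_lines:
        match = CONNECTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log excerpt; the addresses are invented for illustration.
sample = [
    "LOG:  connection received: host=10.0.0.5 port=41234",
    "LOG:  incomplete startup packet",
    "LOG:  connection received: host=10.0.0.5 port=41236",
    "LOG:  connection received: host=10.0.0.9 port=50122",
]
counts = count_connections_by_host(sample)
```

Calling counts.most_common() lists the busiest hosts first, which is probably where the monitoring probe lives.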


Re: Multiple postmasters running from same directory

From: Vikas Sharma
Thanks Tom,

So is it normal for Postgres to fork new postmaster processes from the same data directory? I haven't seen this before.

I will check where those connection requests are coming from.

Best Regards
Vikas


Re: Multiple postmasters running from same directory

From: Tom Lane
Vikas Sharma <shavikas@gmail.com> writes:
> So is it normal for postgres to fork out new postmaster processes from the
> same data directory? I haven't seen this earlier.

They're not postmasters, they're child processes, as you can easily tell
from the PID/PPID columns of your ps output.  But a process inherits its
title from the parent at fork(), and per this discussion, they haven't
changed it yet.

            regards, tom lane
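
The PID/PPID check described above can be done mechanically. A small Python sketch, assuming the standard `ps -ef` column order (UID PID PPID C STIME TTY TIME CMD) and using the three lines from the original report as input; the function names are made up for illustration.

```python
def parse_ps_line(line):
    """Split one `ps -ef` line into (pid, ppid, command)."""
    # Columns: UID PID PPID C STIME TTY TIME CMD (CMD may contain spaces).
    fields = line.split(None, 7)
    return int(fields[1]), int(fields[2]), fields[7]


def find_postmaster_and_children(ps_lines):
    """Separate the real postmaster from the children it forked."""
    rows = [parse_ps_line(line) for line in ps_lines]
    pids = {pid for pid, _, _ in rows}
    # The postmaster's parent is not one of the listed processes.
    postmasters = [pid for pid, ppid, _ in rows if ppid not in pids]
    # A child's PPID points at another listed process.
    children = {pid: ppid for pid, ppid, _ in rows if ppid in pids}
    return postmasters, children


ps_output = [
    "postgres  81440      1  0 Jan31 ?        00:11:37 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data",
    "postgres  97072  81440  0 12:17 ?        00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data",
    "postgres  97074  81440  0 12:17 ?        00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data",
]
postmasters, children = find_postmaster_and_children(ps_output)
```

With the reported output, only PID 81440 comes back as a postmaster; 97072 and 97074 are children whose PPID is 81440, even though all three show the same command line.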