Multiple postmasters running from same directory

Started by Vikas Sharma, about 8 years ago, 7 messages, general
#1 Vikas Sharma
shavikas@gmail.com

Hi,

We are running PostgreSQL 9.4 with streaming replication and repmgr.
The operating system is RHEL 6.8.

On the master I can see multiple postmaster processes from the same data
directory.

ps -ef |grep -i postgres|grep postm
postgres 81440 1 0 Jan31 ? 00:11:37 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
postgres 97072 81440 0 12:17 ? 00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
postgres 97074 81440 0 12:17 ? 00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data

The streaming replication with one standby looks fine.

I was expecting to see only one postmaster process instead of three, and the
time shown in the ps output for the two extra processes changes to the current
time with every ps command I enter. Secondly, the logfile is full of
"incomplete startup packet" messages.

I need help from you experts: is this the right behaviour for Postgres? What
could have gone wrong in my case?

Best Regards
Vikas

#2 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Vikas Sharma (#1)
Re: Multiple postmasters running from same directory

Vikas Sharma wrote:

We are running PostgreSQL 9.4 with streaming replication and repmgr. The operating system is RHEL 6.8.

On the master I can see multiple postmaster processes from the same data directory.

ps -ef |grep -i postgres|grep postm
postgres 81440 1 0 Jan31 ? 00:11:37 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
postgres 97072 81440 0 12:17 ? 00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
postgres 97074 81440 0 12:17 ? 00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data

The streaming replication with one standby looks fine.

I was expecting to see only one postmaster process instead of three, and the time shown in
the ps output for the two extra processes changes to the current time with every ps command I enter.
Secondly, the logfile is full of "incomplete startup packet" messages.

I need help from you experts: is this the right behaviour for Postgres? What could have gone wrong in my case?

That looks ok.

The two other processes are children of the postmaster.
It is strange that their process title did not get updated.

What do you see for the processes with "pid" 97072 and 97074 in pg_stat_activity?

The "incomplete startup packet" is caused by processes that connect to the
PostgreSQL TCP port, but don't complete a database connection.
Often these are monitoring or load balancing programs.

Yours,
Laurenz Albe

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Laurenz Albe (#2)
Re: Multiple postmasters running from same directory

Laurenz Albe <laurenz.albe@cybertec.at> writes:

Vikas Sharma wrote:

On the master I can see multiple postmaster processes from the same data directory.
ps -ef |grep -i postgres|grep postm
postgres 81440 1 0 Jan31 ? 00:11:37 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
postgres 97072 81440 0 12:17 ? 00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
postgres 97074 81440 0 12:17 ? 00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data

The two other processes are children of the postmaster.
It is strange that their process title did not get updated.

Seeing that they're showing zero runtime, I bet that these are just-forked
children that have not had time to change their process title yet.
The thing that is strange is that you have a steady enough flow of new
connections that there are usually some children like that.

The "incomplete startup packet" is caused by processes that connect to the
PostgreSQL TCP port, but don't complete a database connection.
Often these are monitoring or load balancing programs.

Putting two and two together, you have some monitoring program that is
hitting the postmaster with a constant stream of TCP connection requests
none of which get completed, resulting in a whole lot of useless fork
activity. Dial down the monitoring.

regards, tom lane
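
To make the diagnosis above concrete, here is a minimal sketch of the kind of check a monitoring or load-balancing tool might perform. The function name `bare_tcp_probe` is hypothetical and not taken from any real tool; the assumption is simply that the probe opens a TCP connection and closes it without sending a startup packet, which is the pattern that would lead to the log message discussed in this thread.

```python
import socket


def bare_tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    The socket is closed again without sending a single byte. Against a
    PostgreSQL server, each such probe still makes the postmaster fork a
    child, which then waits for a startup packet that never arrives.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, host unreachable, and so on.
        return False
```

A health check that completes a real client connection (for example, running a trivial query through a normal driver) avoids the half-open handshake, at the cost of a full backend startup per check, so the probe interval matters either way.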

#4 Francisco Olarte
folarte@peoplecall.com
In reply to: Tom Lane (#3)
Re: Multiple postmasters running from same directory

On Tue, Feb 13, 2018 at 4:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Laurenz Albe <laurenz.albe@cybertec.at> writes:

Vikas Sharma wrote:

On the master I can see multiple postmaster processes from the same data directory.
ps -ef |grep -i postgres|grep postm
postgres 81440 1 0 Jan31 ? 00:11:37 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
postgres 97072 81440 0 12:17 ? 00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data
postgres 97074 81440 0 12:17 ? 00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data

The two other processes are children of the postmaster.
It is strange that their process title did not get updated.

Seeing that they're showing zero runtime, I bet that these are just-forked
children that have not had time to change their process title yet.
The thing that is strange is that you have a steady enough flow of new
connections that there are usually some children like that.

I assume proc title is changed after full startup, as it shows db and user....

The "incomplete startup packet" is caused by processes that connect to the
PostgreSQL TCP port, but don't complete a database connection.
Often these are monitoring or load balancing programs.

Putting two and two together, you have some monitoring program that is
hitting the postmaster with a constant stream of TCP connection requests
none of which get completed, resulting in a whole lot of useless fork
activity. Dial down the monitoring.

Adding the incomplete startup to the mix, it may be a misconfigured
monitoring program sending just a byte or two, or zero, and then
waiting for a response, which will give ps more time to catch the child
in that state. I haven't looked at the code, but given that messages start
with 1 identifier byte plus a 4-byte length, many ways of reading them
would lead to a long wait for at least 5 bytes, or for the first
byte.

Francisco Olarte.
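
For reference, a rough sketch of what a complete startup packet looks like in version 3.0 of the frontend/backend protocol. As far as the protocol documentation describes it, the startup packet is a special case: it carries no identifier byte, only a 4-byte length (which counts itself), a 4-byte protocol version, and NUL-terminated parameter pairs. The helper names `build_startup_packet` and `is_complete` are hypothetical, and `is_complete` only illustrates the idea of waiting for the declared length; it is not the server's actual code.

```python
import struct

# Protocol 3.0: major version in the high 16 bits, minor in the low 16 bits.
PROTOCOL_VERSION_3_0 = 3 << 16  # 196608


def build_startup_packet(user: str, database: str) -> bytes:
    """Build a startup packet: length, protocol version, key/value pairs."""
    params = b""
    for key, value in (("user", user), ("database", database)):
        params += key.encode() + b"\x00" + value.encode() + b"\x00"
    params += b"\x00"  # a lone NUL terminates the parameter list
    body = struct.pack("!I", PROTOCOL_VERSION_3_0) + params
    # The length field includes its own 4 bytes.
    return struct.pack("!I", len(body) + 4) + body


def is_complete(received: bytes) -> bool:
    """Illustrative check: do we hold as many bytes as the header declares?"""
    if len(received) < 4:
        return False  # not even the length field has arrived yet
    (declared,) = struct.unpack("!I", received[:4])
    return len(received) >= declared
```

A probe that sends nothing, or only a couple of bytes, never satisfies a check like `is_complete`, which is consistent with the child sitting in its initial state long enough for ps to catch it.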

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Francisco Olarte (#4)
Re: Multiple postmasters running from same directory

Francisco Olarte <folarte@peoplecall.com> writes:

On Tue, Feb 13, 2018 at 4:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Putting two and two together, you have some monitoring program that is
hitting the postmaster with a constant stream of TCP connection requests
none of which get completed, resulting in a whole lot of useless fork
activity. Dial down the monitoring.

Adding the incomplete startup to the mix, it may be a misconfigured
monitoring program sending just a byte or two, or zero, and then
waiting for a response, which will give ps more time to catch the child
in that state. I haven't looked at the code, but given that messages start
with 1 identifier byte plus a 4-byte length, many ways of reading them
would lead to a long wait for at least 5 bytes, or for the first
byte.

Hm, yeah. From memory, the child process will wait a maximum of 60
seconds to receive a startup packet. If the hypothesized probing program
sends nothing, or just a small number of bytes, and then sits rather than
closing the connection, then this state would easily persist long enough
to be observable in ps.

If you're not sure where these probes are coming from, turning on
log_connections should help: the "connection received" message comes out
before waiting for the startup packet.

regards, tom lane
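
Once log_connections is on, the sources of the probes can be tallied from the server log. The following is a small sketch, assuming the log contains lines with the text `connection received: host=...`; the function name `count_connection_sources` and the sample lines in the usage note are hypothetical, and the prefix of each real log line depends on log_line_prefix.

```python
import re
from collections import Counter

# Matches the host part of a "connection received" log message.
CONNECTION_RE = re.compile(r"connection received: host=(\S+)")


def count_connection_sources(lines):
    """Count "connection received" messages per client host."""
    counts = Counter()
    for line in lines:
        match = CONNECTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

For example, with made-up log lines:

```python
sample = [
    "LOG:  connection received: host=10.0.0.5 port=51234",
    "LOG:  incomplete startup packet",
    "LOG:  connection received: host=10.0.0.5 port=51236",
    "LOG:  connection received: host=[local]",
]
print(count_connection_sources(sample).most_common(1))
```

A host that dominates the count and is followed mostly by "incomplete startup packet" entries is a likely candidate for the probing program.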

#6 Vikas Sharma
shavikas@gmail.com
In reply to: Tom Lane (#3)
Re: Multiple postmasters running from same directory

Thanks Tom,

So is it normal for postgres to fork out new postmaster processes from the
same data directory? I haven't seen this earlier.

I will check where those connection requests are coming from.

Best Regards
Vikas

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Vikas Sharma (#6)
Re: Multiple postmasters running from same directory

Vikas Sharma <shavikas@gmail.com> writes:

So is it normal for postgres to fork out new postmaster processes from the
same data directory? I haven't seen this earlier.

They're not postmasters, they're child processes, as you can easily tell
from the PID/PPID columns of your ps output. But a process inherits its
title from the parent at fork(), and per this discussion, they haven't
changed it yet.

regards, tom lane
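
The PID/PPID reading described above can be sketched in a few lines. This assumes the usual `ps -ef` column order (UID, PID, PPID, C, STIME, TTY, TIME, CMD); the function name `split_postmaster_and_children` is hypothetical, and the input lines are the ones posted at the start of this thread.

```python
def split_postmaster_and_children(ps_lines):
    """Separate top-level processes from their children in `ps -ef` output.

    A process counts as a child if its PPID is the PID of another listed
    process; everything else is treated as a top-level process.
    """
    rows = []
    for line in ps_lines:
        fields = line.split(None, 7)  # keep the command line as one field
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # skip the header line and anything malformed
        rows.append((int(fields[1]), int(fields[2])))

    listed_pids = {pid for pid, _ in rows}
    parents = [pid for pid, ppid in rows if ppid not in listed_pids]
    children = [pid for pid, ppid in rows if ppid in listed_pids]
    return parents, children


ps_output = [
    "postgres 81440 1 0 Jan31 ? 00:11:37 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data",
    "postgres 97072 81440 0 12:17 ? 00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data",
    "postgres 97074 81440 0 12:17 ? 00:00:00 /usr/pgsql-9.4/bin/postmaster -D /var/lib/pgsql/9.4/data",
]
parents, children = split_postmaster_and_children(ps_output)
print(parents)   # [81440]
print(children)  # [97072, 97074]
```

Only 81440 has a parent outside the list (PID 1), so it is the single postmaster; the other two are its freshly forked children that still carry the inherited process title.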