Occupied port warning

Started by Peter Eisentraut over 20 years ago, 17 messages
#1Peter Eisentraut
peter_e@gmx.net

During a recent training session I was reminded about a peculiar
misbehavior that recent PostgreSQL releases exhibit when the TCP port
they are trying to bind to is occupied:

LOG: could not bind IPv4 socket: Address already in use
HINT: Is another postmaster already running on port 5432? If not, wait
a few seconds and retry.
WARNING: could not create listen socket for "localhost"

The trainees found this behavior somewhat unuseful. Can someone remind
me why this is not an error? Does any other server software behave
this way?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#2Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#1)
Re: Occupied port warning

Peter Eisentraut said:

During a recent training session I was reminded about a peculiar
misbehavior that recent PostgreSQL releases exhibit when the TCP port
they are trying to bind to is occupied:

LOG: could not bind IPv4 socket: Address already in use
HINT: Is another postmaster already running on port 5432? If not, wait
a few seconds and retry.
WARNING: could not create listen socket for "localhost"

The trainees found this behavior somewhat unuseful. Can someone remind
me why this is not an error? Does any other server software behave
this way?

IIRC, in previous versions any bind failure was fatal, but in 8.0 we decided
to be slightly more forgiving and only bail out if we failed to bind at all.

cheers

andrew

#3Peter Eisentraut
peter_e@gmx.net
In reply to: Andrew Dunstan (#2)
Re: Occupied port warning

Andrew Dunstan wrote:

IIRC, in previous versions any bind failure was fatal, but in 8.0 we
decided to be slightly more forgiving and only bail out if we failed
to bind at all.

I realize that, but I would like to know where that bright idea came
from in violation of all other principles of this and any other
software. I recall that it had something to do with IPv6, but I'm not
sure.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

In reply to: Peter Eisentraut (#3)
Re: Occupied port warning

At 2005-06-28 15:14:29 +0200, peter_e@gmx.net wrote:

I recall that it had something to do with IPv6, but I'm not sure.

Under Linux, if you bind to AF_INET6/::0, a subsequent bind to AF_INET/0
will fail, but the IPv4 address is also bound by the first call, and the
program will accept IPv4 connections anyway (BSD behaves differently).

Maybe that had something to do with it? I remember I had to add code to
my program to allow that second bind to fail without complaint, and now
my code also exits only if it can't bind anything at all.

(For what it's worth, I don't think this behaviour is such a big deal.)
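
The bind conflict described above can be probed with a short script. This is a minimal sketch in Python, not PostgreSQL's own C code; the function name `probe_dual_stack`, the `v6only` parameter, and the use of a kernel-chosen port are assumptions made for illustration. It binds the IPv6 wildcard first, then tries the IPv4 wildcard on the same port and reports how the second bind fared. The outcome depends on the platform and on its `IPV6_V6ONLY` default.

```python
import errno
import socket


def probe_dual_stack(v6only=None):
    """Bind [::]:port, then 0.0.0.0:port, and report on the second bind.

    Returns None if an IPv6 socket cannot be created or bound on this
    host; otherwise a dict holding the port and the errno name of the
    IPv4 bind failure (None if the IPv4 bind succeeded).
    """
    if not socket.has_ipv6:
        return None
    try:
        s6 = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        return None  # e.g. userland knows about IPv6 but the kernel does not
    with s6, socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s4:
        try:
            if v6only is not None:
                # Hypothetical knob: force the socket to be IPv6-only or dual-stack.
                s6.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, int(v6only))
            s6.bind(("::", 0))  # port 0: let the kernel pick a free port
        except OSError:
            return None
        port = s6.getsockname()[1]
        try:
            s4.bind(("0.0.0.0", port))
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            return {"port": port, "ipv4_bind_error": name}
        return {"port": port, "ipv4_bind_error": None}


if __name__ == "__main__":
    print("platform default:", probe_dual_stack())
    print("IPV6_V6ONLY on:  ", probe_dual_stack(v6only=True))
```

A server that treats every bind failure as fatal would refuse to start whenever the first call reports an error for the IPv4 bind, even though IPv4 clients can already connect through the IPv6 socket.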

-- ams

#5Alvaro Herrera
alvherre@surnet.cl
In reply to: Peter Eisentraut (#3)
Re: Occupied port warning

On Tue, Jun 28, 2005 at 03:14:29PM +0200, Peter Eisentraut wrote:

Andrew Dunstan wrote:

IIRC, in previous versions any bind failure was fatal, but in 8.0 we
decided to be slightly more forgiving and only bail out if we failed
to bind at all.

I realize that, but I would like to know where that bright idea came
from in violation of all other principles of this and any other
software. I recall that it had something to do with IPv6, but I'm not
sure.

If the TCP socket is used we can still bind to the Unix-domain socket,
no?

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Vivir y dejar de vivir son soluciones imaginarias.
La existencia está en otra parte" (Andre Breton)

#6Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#3)
Re: Occupied port warning

Peter Eisentraut wrote:

Andrew Dunstan wrote:

IIRC, in previous versions any bind failure was fatal, but in 8.0 we
decided to be slightly more forgiving and only bail out if we failed
to bind at all.

I realize that, but I would like to know where that bright idea came
from in violation of all other principles of this and any other
software. I recall that it had something to do with IPv6, but I'm not
sure.

It came from the fertile brain of Tom Lane :-)

see http://archives.postgresql.org/pgsql-hackers/2004-03/msg00679.php

I think "violation of all other principles of this and any other
software" is far too strong.

cheers

andrew

#7Peter Eisentraut
peter_e@gmx.net
In reply to: Alvaro Herrera (#5)
Re: Occupied port warning

Alvaro Herrera wrote:

If the TCP socket is used we can still bind to the Unix-domain
socket, no?

If I configured a TCP/IP socket, what good does a Unix-domain socket do
me?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#8Peter Eisentraut
peter_e@gmx.net
In reply to: Andrew Dunstan (#6)
Re: Occupied port warning

Andrew Dunstan wrote:

see http://archives.postgresql.org/pgsql-hackers/2004-03/msg00679.php

Well, with one release of field experience behind me I'd like to
revisit this idea. Who would actually be hurt by generating an error
here like it used to do?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#8)
Re: Occupied port warning

I wrote:

Andrew Dunstan wrote:

see
http://archives.postgresql.org/pgsql-hackers/2004-03/msg00679.php

Well, with one release of field experience behind me I'd like to
revisit this idea. Who would actually be hurt by generating an error
here like it used to do?

It seems that the only concern was broken resolvers (namely, "localhost"
not being resolvable). Then you can easily replace that with
127.0.0.1, or * if you like. That sounds like the place for an error
message with a hint, not silent failure. Comments?
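
As a rough illustration of "an error message with a hint", here is a minimal Python sketch. The function name `check_listen_address` and the message wording are assumptions for illustration only and are not PostgreSQL's actual messages; the only real API used is `socket.getaddrinfo`.

```python
import socket


def check_listen_address(host, port=5432):
    """Return (ok, message) for one listen address.

    On resolver failure the message carries a hint to use a numeric
    address instead, rather than failing silently.
    """
    try:
        infos = socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM, flags=socket.AI_PASSIVE
        )
    except socket.gaierror as exc:
        return False, (
            f'could not resolve listen address "{host}": {exc}\n'
            "HINT: use a numeric address such as 127.0.0.1 instead."
        )
    # Each entry's sockaddr starts with the numeric address string.
    addrs = sorted({info[4][0] for info in infos})
    return True, f'"{host}" resolves to {", ".join(addrs)}'


if __name__ == "__main__":
    for name in ("localhost", "127.0.0.1"):
        ok, message = check_listen_address(name)
        print("OK   " if ok else "ERROR", message)
```

A numeric address never goes through the resolver, so it sidesteps the broken-"localhost" case entirely.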

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: Occupied port warning

Peter Eisentraut <peter_e@gmx.net> writes:

During a recent training session I was reminded about a peculiar
misbehavior that recent PostgreSQL releases exhibit when the TCP port
they are trying to bind to is occupied:

LOG: could not bind IPv4 socket: Address already in use
HINT: Is another postmaster already running on port 5432? If not, wait
a few seconds and retry.
WARNING: could not create listen socket for "localhost"

The trainees found this behavior somewhat unuseful.

What behavior are you proposing, exactly?

I don't think it's practical to make the server error out if it can't
bind to every socket it tries to bind to --- that will leave you dead
in the water in an uncomfortably large number of scenarios. I think
the cases that forced us to adopt this behavior originally were ones
where userland thinks IPv6 is supported but the kernel does not.
Thus, we can *not* treat the list returned by getaddrinfo as gospel.

It might be reasonable to treat some error conditions as fatal but
not others. But you'd have to engage in pretty close analysis to
make sure you weren't buying into any bad behaviors.
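
The tolerant behavior discussed here can be sketched as follows. This is an illustrative Python outline, not the postmaster's C implementation; `open_listen_sockets` and the warning texts are hypothetical. Every address returned by `getaddrinfo` is tried, failures are recorded as warnings, and the caller decides whether an empty result is fatal.

```python
import socket


def open_listen_sockets(host, port):
    """Try every address getaddrinfo returns and keep the ones that work.

    Returns (sockets, warnings). Nothing here is fatal; the caller
    applies the policy, for example by aborting when no socket opened.
    """
    opened, warnings = [], []
    try:
        candidates = socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM, flags=socket.AI_PASSIVE
        )
    except socket.gaierror as exc:
        return opened, [f"could not resolve {host!r}: {exc}"]

    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as exc:
            # The list from getaddrinfo is not gospel: the kernel may not
            # support an address family that userland reported.
            warnings.append(f"socket() failed for {sockaddr[0]}: {exc}")
            continue
        try:
            sock.bind(sockaddr)
            sock.listen(socket.SOMAXCONN)
        except OSError as exc:
            warnings.append(f"could not listen on {sockaddr[0]}: {exc}")
            sock.close()
            continue
        opened.append(sock)
    return opened, warnings


if __name__ == "__main__":
    socks, warns = open_listen_sockets("localhost", 0)
    for w in warns:
        print("WARNING:", w)
    if not socks:
        raise SystemExit("FATAL: could not create any listen socket")
    for s in socks:
        s.close()
```

Keeping the policy out of the loop makes it easy to compare "abort if nothing opened" with stricter rules without touching the socket code.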

regards, tom lane

#11Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#10)
Re: Occupied port warning

Tom Lane wrote:

What behavior are you proposing, exactly?

The least thing it should do is error out if *no* TCP/IP port could be
created while listen_addresses is set.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#11)
Re: Occupied port warning

Peter Eisentraut <peter_e@gmx.net> writes:

Tom Lane wrote:

What behavior are you proposing, exactly?

The least thing it should do is error out if *no* TCP/IP port could be
created while listen_addresses is set.

That might be reasonable --- I think right now we only die if we
couldn't create the Unix socket either.

regards, tom lane

#13Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#12)
Re: Occupied port warning

Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

Tom Lane wrote:

What behavior are you proposing, exactly?

The least thing it should do is error out if *no* TCP/IP port could be
created while listen_addresses is set.

That might be reasonable --- I think right now we only die if we
couldn't create the Unix socket either.

correct (in the cases where we try to create it, e.g. Unix but not Windows).

cheers

andrew

#14Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#11)
Re: Occupied port warning

I wrote:

The least thing it should do is error out if *no* TCP/IP port could
be created while listen_addresses is set.

It's doing that now, and that should guard against the most common
problem, namely the port already being occupied (since all TCP/IP
listen sockets use the same port).

Reading the comments in StreamServerPort, it seems the only reason we
can't raise a fatal error everywhere is that on some systems the IPv4 and
IPv6 sockets fight each other when bind() is called. For the other
failure modes, it seems that no such precautions are necessary. In
particular, I think we could error out in all of the following cases:

- Host or service name could not be resolved (just specify it
numerically instead). This would help against mistyped host names and
misconfigured name servers.

- MaxListen exceeded (don't configure so many sockets instead).

- socket() failed

- listen() failed

I think we could also error out if we cannot create at least one listen
socket for each entry in listen_addresses (instead of at least one
overall).
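
The difference between the two rules (at least one listen socket overall versus at least one per listen_addresses entry) can be made concrete with a small sketch. The function `should_abort` and its `strict` flag are hypothetical names used for illustration, and the input is assumed to be a mapping from each listen_addresses entry to the number of sockets successfully opened for it.

```python
def should_abort(sockets_per_entry, strict=False):
    """Decide whether startup should fail.

    sockets_per_entry maps each listen_addresses entry to the number of
    listen sockets opened for it. With strict=False, abort only if no
    entry produced a socket; with strict=True, abort if any entry
    produced none.
    """
    if not sockets_per_entry:
        return False  # no TCP/IP addresses were requested at all
    counts = sockets_per_entry.values()
    if strict:
        return any(n == 0 for n in counts)
    return all(n == 0 for n in counts)


if __name__ == "__main__":
    # "db.example" is a made-up host name standing in for a mistyped entry.
    outcome = {"localhost": 2, "db.example": 0}
    print("at least one overall:  abort =", should_abort(outcome))
    print("at least one per entry: abort =", should_abort(outcome, strict=True))
```

With the sample input, only the per-entry rule aborts: the mistyped entry is caught, at the cost of also aborting in cases where one entry fails for reasons the administrator cannot fix.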

Comments on that?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#14)
Re: Occupied port warning

Peter Eisentraut <peter_e@gmx.net> writes:

Reading the comments in StreamServerPort, it seems the only reason we
can't raise a fatal error everywhere is that on some systems the IPv4 and
IPv6 sockets fight each other when bind() is called. For the other
failure modes, it seems that no such precautions are necessary. In
particular, I think we could error out in all of the following cases:

I think you are putting *far* too much faith in the platforms that are
out there. We fought enough kernel and libc bugs (or at least
disagreements) while we were putting in IPv6 support to make me very
wary of proposals to treat socket problems as fatal. I would much
rather have the postmaster start and not connect to everything it
originally tried to connect to than have it refuse to play ball until
you get a new kernel version.

- socket() failed

Definitely wrong, see archives. EAFNOSUPPORT for example is an entirely
expected case.
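
One way to picture "some socket() failures are expected" is a small classifier over errno values. This is a hedged Python sketch: the set `EXPECTED_SOCKET_ERRNOS` and the function name are assumptions for illustration, and only EAFNOSUPPORT is taken from the discussion; which other codes belong in the set is exactly the open question.

```python
import errno

# Assumption: only EAFNOSUPPORT is listed, since it is the one example
# named in the thread. Other codes would need case-by-case analysis.
EXPECTED_SOCKET_ERRNOS = {errno.EAFNOSUPPORT}


def is_expected_socket_failure(exc):
    """Return True if an OSError from socket() should only be a warning."""
    return isinstance(exc, OSError) and exc.errno in EXPECTED_SOCKET_ERRNOS


if __name__ == "__main__":
    samples = [
        OSError(errno.EAFNOSUPPORT, "Address family not supported by protocol"),
        OSError(errno.EMFILE, "Too many open files"),
    ]
    for exc in samples:
        level = "WARNING" if is_expected_socket_failure(exc) else "FATAL"
        print(level, errno.errorcode[exc.errno])
```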

- listen() failed

Ditto, see archives.

I think we could also error out if we cannot create at least one listen
socket for each entry in listen_addresses (instead of at least one
overall).

No; that will break cases that don't need to break.

I was willing to hold still for the limited check you just applied,
but I do not see that making it less error-tolerant than that is a
good idea at all. It will just put obstacles in the path of newbies.

(In fact, I'm not even convinced that the limited check will survive
beta. I think we'll be taking it out again, or at least reducing it
to a WARNING, when the complaints start coming in. As of CVS tip,
a default postmaster configuration will refuse to start if there is
anything wrong with your "localhost" DNS setup, and we already learned
that there are way too many machines where that is true.)

regards, tom lane

#16Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#15)
Re: Occupied port warning

Tom Lane wrote:

I think we could also error out if we cannot create at least one
listen socket for each entry in listen_addresses (instead of at
least one overall).

No; that will break cases that don't need to break.

Which cases would that be? If you specify a host name and it doesn't
get used at all, what sense could that possibly make?

I was willing to hold still for the limited check you just applied,
but I do not see that making it less error-tolerant than that is a
good idea at all. It will just put obstacles in the path of newbies.

Not ignoring errors is one of the staples of PostgreSQL. What you are
proposing here sounds entirely like a MySQL design plan. Maybe that is
newbie-friendly in your mind, but I really doubt that. I agree that we
do not want to force people to change kernel or system libraries. But
it is not acceptable to ignore misconfigurations where a simple change
of a few configuration parameters would correct the situation, as in
this case:

(In fact, I'm not even convinced that the limited check will survive
beta. I think we'll be taking it out again, or at least reducing it
to a WARNING, when the complaints start coming in. As of CVS tip,
a default postmaster configuration will refuse to start if there is
anything wrong with your "localhost" DNS setup, and we already
learned that there are way too many machines where that is true.)

Here, you simply change the configuration to use numeric IP addresses.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#16)
Re: Occupied port warning

Peter Eisentraut <peter_e@gmx.net> writes:

Not ignoring errors is one of the staples of PostgreSQL. What you are
proposing here sounds entirely like a MySQL design plan. Maybe that is
newbie-friendly in your mind, but I really doubt that. I agree that we
do not want to force people to change kernel or system libraries. But
it is not acceptable to ignore misconfigurations where a simple change
of a few configuration parameters would correct the situation,

My fundamental objection here is that I think you will be making error
cases out of situations where a kernel update is the only solution;
in particular the ones stemming from kernel and libc not being on the
same page about whether IPv6 is supported. We must likewise not assume
that a would-be Postgres user is in a position to fix his DNS
infrastructure. Treating these problems as warnings instead of hard
errors is hardly equivalent to risking data loss --- all it says is that
you won't be able to connect from certain places until you fix it, which
is certainly not worse than being unable to connect from anyplace
because you cannot get the postmaster to start.

regards, tom lane