Syslogger tries to write to /dev/null on Windows

Started by Heikki Linnakangasabout 16 years ago6 messagesbugs
Jump to latest
#1Heikki Linnakangas
heikki.linnakangas@enterprisedb.com

A customer is facing a problem on PostgreSQL 8.3.10 on Windows where the
syslogger process dies mysteriously every few hours under load. I
haven't yet figured out why, but when postmaster tries to respawn
syslogger, it doesn't start up but dies immediately.

The reason the relaunch fails is that in SysLoggerMain we have this:

/*
* If we restarted, our stderr is already redirected into our own input
* pipe. This is of course pretty useless, not to mention that it
* interferes with detecting pipe EOF. Point stderr to /dev/null. This
* assumes that all interesting messages generated in the syslogger will
* come through elog.c and will be sent to write_syslogger_file.
*/
if (redirection_done)
{
int fd = open(NULL_DEV, O_WRONLY, 0);

/*
* The closes might look redundant, but they are not: we want to be
* darn sure the pipe gets closed even if the open failed. We can
* survive running with stderr pointing nowhere, but we can't afford
* to have extra pipe input descriptors hanging around.
*/
close(fileno(stdout));
close(fileno(stderr));
dup2(fd, fileno(stdout));
dup2(fd, fileno(stderr));
close(fd);
}

NULL_DEV is defined in c.h as "/dev/null", which doesn't work on
windows. We have a port-specific #define DEVNULL which does work, we
should be using that.

Peter Eisentraut inadvertently fixed this for 8.4:

http://archives.postgresql.org/pgsql-committers/2008-12/msg00095.php

by removing NULL_DEV and using always DEVNULL, but back-branches need
that too. I'll leave NULL_DEV as it is just in case it's used by 3rd
party modules, but change the two uses of it to use DEVNULL.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#2Magnus Hagander
magnus@hagander.net
In reply to: Heikki Linnakangas (#1)
Re: Syslogger tries to write to /dev/null on Windows

2010/4/1 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>:

NULL_DEV is defined in c.h as "/dev/null", which doesn't work on
windows. We have a port-specific #define DEVNULL which does work, we
should be using that.

Wow. Looking at the code around NULL_DEV, that actually looks like a
merge mistake from the old windows branch or something - it has a
comment stating that it should be different on NT, but it isn't...

Peter Eisentraut inadvertently fixed this for 8.4:

http://archives.postgresql.org/pgsql-committers/2008-12/msg00095.php

by removing NULL_DEV and using always DEVNULL, but back-branches need
that too. I'll leave NULL_DEV as it is just in case it's used by 3rd
party modules, but change the two uses of it to use DEVNULL.

Sounds good to me.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#1)
Re: Syslogger tries to write to /dev/null on Windows

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

A customer is facing a problem on PostgreSQL 8.3.10 on Windows where the
syslogger process dies mysteriously every few hours under load. I
haven't yet figured out why, but when postmaster tries to respawn
syslogger, it doesn't start up but dies immediately.

The reason the relaunch fails is that in SysLoggerMain we have this:

/*
* If we restarted, our stderr is already redirected into our own input
* pipe. This is of course pretty useless, not to mention that it
* interferes with detecting pipe EOF. Point stderr to /dev/null. This
* assumes that all interesting messages generated in the syslogger will
* come through elog.c and will be sent to write_syslogger_file.
*/
if (redirection_done)
{
int fd = open(NULL_DEV, O_WRONLY, 0);

/*
* The closes might look redundant, but they are not: we want to be
* darn sure the pipe gets closed even if the open failed. We can
* survive running with stderr pointing nowhere, but we can't afford
* to have extra pipe input descriptors hanging around.
*/
close(fileno(stdout));
close(fileno(stderr));
dup2(fd, fileno(stdout));
dup2(fd, fileno(stderr));
close(fd);
}

NULL_DEV is defined in c.h as "/dev/null", which doesn't work on
windows. We have a port-specific #define DEVNULL which does work, we
should be using that.

Hmm. I agree with your proposed change, but it seems to me that there
is still another mystery here: why does the mistaken open() argument
lead to a crash? Per the second comment, this code is supposed to keep
working even if the open() fails. If it fails because of that, we have
robustness problems everywhere not only on Windows --- consider ENFILE
or some such.

regards, tom lane

#4Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#3)
Re: Syslogger tries to write to /dev/null on Windows

Tom Lane wrote:

Hmm. I agree with your proposed change, but it seems to me that there
is still another mystery here: why does the mistaken open() argument
lead to a crash? Per the second comment, this code is supposed to keep
working even if the open() fails. If it fails because of that, we have
robustness problems everywhere not only on Windows --- consider ENFILE
or some such.

Hmm, good question.

The open() fails and returns a return code as you would expect. But the
dup2() call crashes when passed an invalid file descriptor, I just
tested that with a small test program on Windows.

Googling around, this seems to be because of a feature called Parameter
Validation: http://msdn.microsoft.com/en-us/library/ksazx244.aspx

Following the example there to set a custom Invalid Parameter Handler
Routine makes dup2() not crash on an invalid file handle. We could do
that in PostgreSQL, setting a handler that reports a warning in the log
and continues. Or we could just be more careful when passing parameters
to system functions; this is the first time we hear about this so it
doesn't seem to be a very widespread issue. In this case we should do this:

*** a/src/backend/postmaster/syslogger.c
--- b/src/backend/postmaster/syslogger.c
***************
*** 194,202 ****
  		 */
  		close(fileno(stdout));
  		close(fileno(stderr));
! 		dup2(fd, fileno(stdout));
! 		dup2(fd, fileno(stderr));
! 		close(fd);
  	}
  	/*
--- 194,205 ----
  		 */
  		close(fileno(stdout));
  		close(fileno(stderr));
! 		if (fd != -1)
! 		{
! 			dup2(fd, fileno(stdout));
! 			dup2(fd, fileno(stderr));
! 			close(fd);
! 		}
  	}

/*

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#4)
Re: Syslogger tries to write to /dev/null on Windows

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

The open() fails and returns a return code as you would expect. But the
dup2() call crashes when passed an invalid file descriptor, I just
tested that with a small test program on Windows.

Ah, thanks Windows :-(

! if (fd != -1)
! {
! dup2(fd, fileno(stdout));
! dup2(fd, fileno(stderr));
! close(fd);
! }

+1 for fixing it like this. It's cleaner anyway.

Is that actually the cause of the original bug report, or is there
another issue yet to solve?

regards, tom lane

#6Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#5)
Re: Syslogger tries to write to /dev/null on Windows

Tom Lane wrote:

Is that actually the cause of the original bug report, or is there
another issue yet to solve?

I still don't know what caused syslogger to die in the first place, this
bug only affects its respawning. It might not be a PostgreSQL issue at
all, but we'll see.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com