pgsql: Reserve the shared memory region during backend startup on

Started by Nonameover 16 years ago9 messages
#1Noname
mha@postgresql.org

Log Message:
-----------
Reserve the shared memory region during backend startup on Windows, so
that memory allocated by starting third party DLLs doesn't end up
conflicting with it.

Hopefully this solves the long-time issue with "could not reattach
to shared memory" errors on Win32.

Patch from Tsutomu Yamada and me, based on idea from Trevor Talbot.

Modified Files:
--------------
pgsql/src/backend/port:
win32_shmem.c (r1.11 -> r1.12)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/port/win32_shmem.c?r1=1.11&r2=1.12)
pgsql/src/backend/postmaster:
postmaster.c (r1.584 -> r1.585)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/postmaster/postmaster.c?r1=1.584&r2=1.585)
pgsql/src/include/port:
win32.h (r1.88 -> r1.89)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/port/win32.h?r1=1.88&r2=1.89)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#1)
Re: [COMMITTERS] pgsql: Reserve the shared memory region during backend startup on

mha@postgresql.org (Magnus Hagander) writes:

Log Message:
-----------
Reserve the shared memory region during backend startup on Windows, so
that memory allocated by starting third party DLLs doesn't end up
conflicting with it.

I am wondering why failure of the various TerminateProcess calls in
postmaster.c is elog(ERROR) and not elog(LOG). While that probably
shouldn't happen, aborting the postmaster isn't a good response if it
does. This patch introduces a new occurrence, but I see it is just
copying what was there already.

regards, tom lane

#3Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#2)
Re: [COMMITTERS] pgsql: Reserve the shared memory region during backend startup on

On Sat, Jul 25, 2009 at 19:50, Tom Lane<tgl@sss.pgh.pa.us> wrote:

mha@postgresql.org (Magnus Hagander) writes:

Log Message:
-----------
Reserve the shared memory region during backend startup on Windows, so
that memory allocated by starting third party DLLs doesn't end up
conflicting with it.

I am wondering why failure of the various TerminateProcess calls in
postmaster.c is elog(ERROR) and not elog(LOG).  While that probably
shouldn't happen, aborting the postmaster isn't a good response if it
does.  This patch introduces a new occurrence, but I see it is just
copying what was there already.

The case where it's doing it now is really a "can't happen" place, so
I don't think it's a big issue there. It could be argued that if we
can't terminate a process we just created (but never even started!),
something is very very badly broken on the system and we might as well
give up. Same for the part where we fail to ResumeThread() on the main
thread of a new process.

However, it seems that for example the one at line 3629 - where we're
just failing to save our backend state - shouldn't be such a FATAL
error. But if you actually look up into the function
save_backend_variables(), it's actually hardcoded to return true... In
case something goes wrong in there, there's an actual ereport(ERROR)
happening deep down already
(write_inheritable_socket/write_duplicated_handle).

To fix that we'd just have to turn those functions all into returning
boolean and log with LOG instead. AFAIK, we've had zero reports of
this actually happening, so I'm not sure it's worth redesigning.
Thoughts?

--
Magnus Hagander
Self: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#3)
Re: Re: [COMMITTERS] pgsql: Reserve the shared memory region during backend startup on

Magnus Hagander <magnus@hagander.net> writes:

To fix that we'd just have to turn those functions all into returning
boolean and log with LOG instead. AFAIK, we've had zero reports of
this actually happening, so I'm not sure it's worth redesigning.
Thoughts?

I'm not really insisting on a redesign. I'm talking about the places
where the code author appears not to have understood that ERROR means
FATAL, because the code keeps plugging on after it anyway. As far as
I can see, using ERROR at lines 3630, 3657, 3674, and 3693 is just
plain bogus, and changing to LOG there requires no other fixing.

regards, tom lane

#5Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#4)
Re: Re: [COMMITTERS] pgsql: Reserve the shared memory region during backend startup on

On Mon, Jul 27, 2009 at 16:14, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

To fix that we'd just have to turn those functions all into returning
boolean and log with LOG instead. AFAIK, we've had zero reports of
this actually happening, so I'm not sure it's worth redesigning.
Thoughts?

I'm not really insisting on a redesign.  I'm talking about the places
where the code author appears not to have understood that ERROR means
FATAL, because the code keeps plugging on after it anyway.  As far as
I can see, using ERROR at lines 3630, 3657, 3674, and 3693 is just
plain bogus, and changing to LOG there requires no other fixing.

3630: can't happen, because we already elog(ERROR) deep in the
function, which is what I tried to outline above. That's the one
requiring a redesign - because the errors *inside* the function it
calls can certainly happen.
3657: is one of those "should never happen" issues - which we haven't
had any reports of.
3674: see above
3693 is a comment, I assume you mean 3683, which is also one of those
can't happen.

But. I'll look into cleaning those up for HEAD anyway, but due to lack
of reports I think we should skip backpatch. Reasonable?

--
Magnus Hagander
Self: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#5)
Re: Re: [COMMITTERS] pgsql: Reserve the shared memory region during backend startup on

Magnus Hagander <magnus@hagander.net> writes:

On Mon, Jul 27, 2009 at 16:14, Tom Lane<tgl@sss.pgh.pa.us> wrote:

I'm not really insisting on a redesign. �I'm talking about the places
where the code author appears not to have understood that ERROR means
FATAL, because the code keeps plugging on after it anyway. �As far as
I can see, using ERROR at lines 3630, 3657, 3674, and 3693 is just
plain bogus, and changing to LOG there requires no other fixing.

But. I'll look into cleaning those up for HEAD anyway, but due to lack
of reports I think we should skip backpatch. Reasonable?

Fair enough.

regards, tom lane

#7Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#6)
1 attachment(s)
Re: Re: [COMMITTERS] pgsql: Reserve the shared memory region during backend startup on

On Tue, Jul 28, 2009 at 15:45, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Mon, Jul 27, 2009 at 16:14, Tom Lane<tgl@sss.pgh.pa.us> wrote:

I'm not really insisting on a redesign.  I'm talking about the places
where the code author appears not to have understood that ERROR means
FATAL, because the code keeps plugging on after it anyway.  As far as
I can see, using ERROR at lines 3630, 3657, 3674, and 3693 is just
plain bogus, and changing to LOG there requires no other fixing.

But. I'll look into cleaning those up for HEAD anyway, but due to lack
of reports I think we should skip backpatch. Reasonable?

Fair enough.

Here's what I came up with. Seems ok?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

Attachments:

exec_errors.patchtext/x-diff; charset=US-ASCII; name=exec_errors.patchDownload
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
***************
*** 3627,3633 **** internal_forkexec(int argc, char *argv[], Port *port)
  		 * mess with the half-started process
  		 */
  		if (!TerminateProcess(pi.hProcess, 255))
! 			ereport(ERROR,
  					(errmsg_internal("could not terminate unstarted process: error code %d",
  									 (int) GetLastError())));
  		CloseHandle(pi.hProcess);
--- 3627,3633 ----
  		 * mess with the half-started process
  		 */
  		if (!TerminateProcess(pi.hProcess, 255))
! 			ereport(LOG,
  					(errmsg_internal("could not terminate unstarted process: error code %d",
  									 (int) GetLastError())));
  		CloseHandle(pi.hProcess);
***************
*** 3654,3660 **** internal_forkexec(int argc, char *argv[], Port *port)
  		 * process and give up.
  		 */
  		if (!TerminateProcess(pi.hProcess, 255))
! 			ereport(ERROR,
  					(errmsg_internal("could not terminate process that failed to reserve memory: error code %d",
  									 (int) GetLastError())));
  		CloseHandle(pi.hProcess);
--- 3654,3660 ----
  		 * process and give up.
  		 */
  		if (!TerminateProcess(pi.hProcess, 255))
! 			ereport(LOG,
  					(errmsg_internal("could not terminate process that failed to reserve memory: error code %d",
  									 (int) GetLastError())));
  		CloseHandle(pi.hProcess);
***************
*** 3671,3677 **** internal_forkexec(int argc, char *argv[], Port *port)
  	{
  		if (!TerminateProcess(pi.hProcess, 255))
  		{
! 			ereport(ERROR,
  					(errmsg_internal("could not terminate unstartable process: error code %d",
  									 (int) GetLastError())));
  			CloseHandle(pi.hProcess);
--- 3671,3677 ----
  	{
  		if (!TerminateProcess(pi.hProcess, 255))
  		{
! 			ereport(LOG,
  					(errmsg_internal("could not terminate unstartable process: error code %d",
  									 (int) GetLastError())));
  			CloseHandle(pi.hProcess);
***************
*** 3680,3686 **** internal_forkexec(int argc, char *argv[], Port *port)
  		}
  		CloseHandle(pi.hProcess);
  		CloseHandle(pi.hThread);
! 		ereport(ERROR,
  				(errmsg_internal("could not resume thread of unstarted process: error code %d",
  								 (int) GetLastError())));
  		return -1;
--- 3680,3686 ----
  		}
  		CloseHandle(pi.hProcess);
  		CloseHandle(pi.hThread);
! 		ereport(LOG,
  				(errmsg_internal("could not resume thread of unstarted process: error code %d",
  								 (int) GetLastError())));
  		return -1;
***************
*** 4430,4437 **** extern int	pgStatSock;
  #define write_inheritable_socket(dest, src, childpid) (*(dest) = (src))
  #define read_inheritable_socket(dest, src) (*(dest) = *(src))
  #else
! static void write_duplicated_handle(HANDLE *dest, HANDLE src, HANDLE child);
! static void write_inheritable_socket(InheritableSocket *dest, SOCKET src,
  						 pid_t childPid);
  static void read_inheritable_socket(SOCKET *dest, InheritableSocket *src);
  #endif
--- 4430,4437 ----
  #define write_inheritable_socket(dest, src, childpid) (*(dest) = (src))
  #define read_inheritable_socket(dest, src) (*(dest) = *(src))
  #else
! static bool write_duplicated_handle(HANDLE *dest, HANDLE src, HANDLE child);
! static bool write_inheritable_socket(InheritableSocket *dest, SOCKET src,
  						 pid_t childPid);
  static void read_inheritable_socket(SOCKET *dest, InheritableSocket *src);
  #endif
***************
*** 4448,4454 **** save_backend_variables(BackendParameters *param, Port *port,
  #endif
  {
  	memcpy(&param->port, port, sizeof(Port));
! 	write_inheritable_socket(&param->portsocket, port->sock, childPid);
  
  	strlcpy(param->DataDir, DataDir, MAXPGPATH);
  
--- 4448,4455 ----
  #endif
  {
  	memcpy(&param->port, port, sizeof(Port));
! 	if (!write_inheritable_socket(&param->portsocket, port->sock, childPid))
! 		return false;
  
  	strlcpy(param->DataDir, DataDir, MAXPGPATH);
  
***************
*** 4469,4475 **** save_backend_variables(BackendParameters *param, Port *port,
  	param->ProcGlobal = ProcGlobal;
  	param->AuxiliaryProcs = AuxiliaryProcs;
  	param->PMSignalState = PMSignalState;
! 	write_inheritable_socket(&param->pgStatSock, pgStatSock, childPid);
  
  	param->PostmasterPid = PostmasterPid;
  	param->PgStartTime = PgStartTime;
--- 4470,4477 ----
  	param->ProcGlobal = ProcGlobal;
  	param->AuxiliaryProcs = AuxiliaryProcs;
  	param->PMSignalState = PMSignalState;
! 	if (!write_inheritable_socket(&param->pgStatSock, pgStatSock, childPid))
! 		return false;
  
  	param->PostmasterPid = PostmasterPid;
  	param->PgStartTime = PgStartTime;
***************
*** 4479,4487 **** save_backend_variables(BackendParameters *param, Port *port,
  
  #ifdef WIN32
  	param->PostmasterHandle = PostmasterHandle;
! 	write_duplicated_handle(&param->initial_signal_pipe,
  							pgwin32_create_signal_listener(childPid),
! 							childProcess);
  #endif
  
  	memcpy(&param->syslogPipe, &syslogPipe, sizeof(syslogPipe));
--- 4481,4490 ----
  
  #ifdef WIN32
  	param->PostmasterHandle = PostmasterHandle;
! 	if (!write_duplicated_handle(&param->initial_signal_pipe,
  							pgwin32_create_signal_listener(childPid),
! 							childProcess))
! 		return false;
  #endif
  
  	memcpy(&param->syslogPipe, &syslogPipe, sizeof(syslogPipe));
***************
*** 4501,4507 **** save_backend_variables(BackendParameters *param, Port *port,
   * Duplicate a handle for usage in a child process, and write the child
   * process instance of the handle to the parameter file.
   */
! static void
  write_duplicated_handle(HANDLE *dest, HANDLE src, HANDLE childProcess)
  {
  	HANDLE		hChild = INVALID_HANDLE_VALUE;
--- 4504,4510 ----
   * Duplicate a handle for usage in a child process, and write the child
   * process instance of the handle to the parameter file.
   */
! static bool
  write_duplicated_handle(HANDLE *dest, HANDLE src, HANDLE childProcess)
  {
  	HANDLE		hChild = INVALID_HANDLE_VALUE;
***************
*** 4513,4523 **** write_duplicated_handle(HANDLE *dest, HANDLE src, HANDLE childProcess)
  						 0,
  						 TRUE,
  						 DUPLICATE_CLOSE_SOURCE | DUPLICATE_SAME_ACCESS))
! 		ereport(ERROR,
  				(errmsg_internal("could not duplicate handle to be written to backend parameter file: error code %d",
  								 (int) GetLastError())));
  
  	*dest = hChild;
  }
  
  /*
--- 4516,4530 ----
  						 0,
  						 TRUE,
  						 DUPLICATE_CLOSE_SOURCE | DUPLICATE_SAME_ACCESS))
! 	{
! 		ereport(LOG,
  				(errmsg_internal("could not duplicate handle to be written to backend parameter file: error code %d",
  								 (int) GetLastError())));
+ 		return false;
+ 	}
  
  	*dest = hChild;
+ 	return true;
  }
  
  /*
***************
*** 4527,4533 **** write_duplicated_handle(HANDLE *dest, HANDLE src, HANDLE childProcess)
   * common on Windows (antivirus, firewalls, download managers etc) break
   * straight socket inheritance.
   */
! static void
  write_inheritable_socket(InheritableSocket *dest, SOCKET src, pid_t childpid)
  {
  	dest->origsocket = src;
--- 4534,4540 ----
   * common on Windows (antivirus, firewalls, download managers etc) break
   * straight socket inheritance.
   */
! static bool
  write_inheritable_socket(InheritableSocket *dest, SOCKET src, pid_t childpid)
  {
  	dest->origsocket = src;
***************
*** 4535,4544 **** write_inheritable_socket(InheritableSocket *dest, SOCKET src, pid_t childpid)
  	{
  		/* Actual socket */
  		if (WSADuplicateSocket(src, childpid, &dest->wsainfo) != 0)
! 			ereport(ERROR,
  					(errmsg("could not duplicate socket %d for use in backend: error code %d",
  							src, WSAGetLastError())));
  	}
  }
  
  /*
--- 4542,4555 ----
  	{
  		/* Actual socket */
  		if (WSADuplicateSocket(src, childpid, &dest->wsainfo) != 0)
! 		{
! 			ereport(LOG,
  					(errmsg("could not duplicate socket %d for use in backend: error code %d",
  							src, WSAGetLastError())));
+ 			return false;
+ 		}
  	}
+ 	return true;
  }
  
  /*
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#7)
Re: Re: [COMMITTERS] pgsql: Reserve the shared memory region during backend startup on

Magnus Hagander <magnus@hagander.net> writes:

But. I'll look into cleaning those up for HEAD anyway, but due to lack
of reports I think we should skip backpatch. Reasonable?

Fair enough.

Here's what I came up with. Seems ok?

Works for me.

regards, tom lane

#9Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#8)
Re: Re: [COMMITTERS] pgsql: Reserve the shared memory region during backend startup on

On Wed, Aug 5, 2009 at 16:53, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

But. I'll look into cleaning those up for HEAD anyway, but due to lack
of reports I think we should skip backpatch. Reasonable?

Fair enough.

Here's what I came up with. Seems ok?

Works for me.

Applied.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/