Re: [PATCH] "could not reattach to shared memory" on Windows

Started by Etienne Dubealmost 16 years ago5 messages
#1Etienne Dube
etdube@gmail.com

Hi,

We've come across this issue on 8.2.15 on a Windows Server 2008
instance. I noticed the patch hasn't been applied to the 8.2 branch yet.
Any chances that this will be part of an eventual 8.2.16 release? Do you
need more testing and feedback before commiting the patch?

Thanks,

Etienne Dube

Show quoted text

* *From*: Magnus Hagander <magnus@hagander.net>
* *To*: Tom Lane <tgl@sss.pgh.pa.us>
* *Cc*: Tsutomu Yamada <tsutomu@sraoss.co.jp>, Alvaro Herrera
<alvherre@commandprompt.com>, pgsql-hackers@postgresql.org, Dave
Page <dpage@pgadmin.org>
* *Subject*: Re: [PATCH] "could not reattach to shared memory" on
Windows
* *Date*: Tue, 11 Aug 2009 17:14:08 +0200
* *Message-id*:
<9837222c0908110814n414b2fcbxcaf7c0e1fcc05999@mail.gmail.com
<http://archives.postgresql.org/pgsql-hackers/2009-08/msg00894.php&gt;&gt;

------------------------------------------------------------------------
On Tue, Aug 11, 2009 at 16:30, Magnus Hagander<magnus@hagander.net> wrote:

On Mon, Aug 10, 2009 at 19:33, Magnus Hagander<magnus@hagander.net> wrote:

On Mon, Aug 10, 2009 at 16:58, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Mon, Aug 10, 2009 at 16:10, Tom Lane<tgl@sss.pgh.pa.us> wrote:

8.2 as well, no?

8.2 has a different shmem implementation - the one that emulates sysv
shmem. The patch will need to be changed around for that, and I
haven't looked at that. It may be worthwhile to do that, but it's a
separate patch, so let's get it out in 8.3 and 8.4 first.

If it's at all hard to do, I could see deprecating 8.2 for Windows
instead.

I haven't looked at how much work it would be at all yet. So let's do
that before we decide to deprecate anything. As mentioned downthread,
8.2 is a very widespread release, and we really want to avoid
deprecating it.

Here's an attempt at a backport to 8.2. I haven't examined it in
detail, but it passes "make check" on mingw.

Comments?

I've also built a binary that should be copy:able on top of an 8.2.13
installation made from the standard installer, to test this feature.
Anybody on 8.2 on Windows, please give it a shot and let us know how
it works.

http://www.hagander.net/pgsql/postgres_exe_virtualalloc_8_2.zip

--
Magnus Hagander

Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#2Magnus Hagander
magnus@hagander.net
In reply to: Etienne Dube (#1)
Re: [PATCH] "could not reattach to shared memory" on Windows

IIRC, we've had zero reports on whether the patch worked at all on 8.2
in an environment where the problem actually existed. So yes, some
testing and feedback would be much apprecaited.

//Magnus

2010/2/8 Etienne Dube <etdube@gmail.com>:

Hi,

We've come across this issue on 8.2.15 on a Windows Server 2008 instance. I noticed the patch hasn't been applied to the 8.2 branch yet. Any chances that this will be part of an eventual 8.2.16 release? Do you need more testing and feedback before commiting the patch?

Thanks,

Etienne Dube

   * *From*: Magnus Hagander <magnus@hagander.net>
   * *To*: Tom Lane <tgl@sss.pgh.pa.us>
   * *Cc*: Tsutomu Yamada <tsutomu@sraoss.co.jp>, Alvaro Herrera
     <alvherre@commandprompt.com>, pgsql-hackers@postgresql.org, Dave
     Page <dpage@pgadmin.org>
   * *Subject*: Re: [PATCH] "could not reattach to shared memory" on
     Windows
   * *Date*: Tue, 11 Aug 2009 17:14:08 +0200
   * *Message-id*:
     <9837222c0908110814n414b2fcbxcaf7c0e1fcc05999@mail.gmail.com
     <http://archives.postgresql.org/pgsql-hackers/2009-08/msg00894.php&gt;&gt;

------------------------------------------------------------------------
On Tue, Aug 11, 2009 at 16:30, Magnus Hagander<magnus@hagander.net> wrote:

On Mon, Aug 10, 2009 at 19:33, Magnus Hagander<magnus@hagander.net> wrote:

On Mon, Aug 10, 2009 at 16:58, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Mon, Aug 10, 2009 at 16:10, Tom Lane<tgl@sss.pgh.pa.us> wrote:

8.2 as well, no?

8.2 has a different shmem implementation - the one that emulates sysv
shmem. The patch will need to be changed around for that, and I
haven't looked at that. It may be worthwhile to do that, but it's a
separate patch, so let's get it out in 8.3 and 8.4 first.

If it's at all hard to do, I could see deprecating 8.2 for Windows
instead.

I haven't looked at how much work it would be at all yet. So let's do
that before we decide to deprecate anything. As mentioned downthread,
8.2 is a very widespread release, and we really want to avoid
deprecating it.

Here's an attempt at a backport to 8.2. I haven't examined it  in
detail, but it passes "make check" on mingw.

Comments?

I've also built a binary that should be copy:able on top of an 8.2.13
installation made from the standard installer, to test this feature.
Anybody on 8.2 on Windows, please give it a shot and let us know how
it works.

http://www.hagander.net/pgsql/postgres_exe_virtualalloc_8_2.zip

--
 Magnus Hagander

 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#3Etienne Dube
etdube@gmail.com
In reply to: Magnus Hagander (#2)

Magnus Hagander wrote:

IIRC, we've had zero reports on whether the patch worked at all on 8.2
in an environment where the problem actually existed. So yes, some
testing and feedback would be much apprecaited.

//Magnus

Thanks for your quick reply.
We upgraded to Service Pack 2 and it solved the problem. Nevertheless,
I'll try to reproduce the issue under a Win2008 SP1 VM to see whether
the patch makes a difference. 8.2.x win32 binaries are built using MinGW
right?

Etienne

Show quoted text

2010/2/8 Etienne Dube <etdube@gmail.com>:

Hi,

We've come across this issue on 8.2.15 on a Windows Server 2008 instance. I noticed the patch hasn't been applied to the 8.2 branch yet. Any chances that this will be part of an eventual 8.2.16 release? Do you need more testing and feedback before commiting the patch?

Thanks,

Etienne Dube

* *From*: Magnus Hagander <magnus@hagander.net>
* *To*: Tom Lane <tgl@sss.pgh.pa.us>
* *Cc*: Tsutomu Yamada <tsutomu@sraoss.co.jp>, Alvaro Herrera
<alvherre@commandprompt.com>, pgsql-hackers@postgresql.org, Dave
Page <dpage@pgadmin.org>
* *Subject*: Re: [PATCH] "could not reattach to shared memory" on
Windows
* *Date*: Tue, 11 Aug 2009 17:14:08 +0200
* *Message-id*:
<9837222c0908110814n414b2fcbxcaf7c0e1fcc05999@mail.gmail.com
<http://archives.postgresql.org/pgsql-hackers/2009-08/msg00894.php&gt;&gt;

------------------------------------------------------------------------
On Tue, Aug 11, 2009 at 16:30, Magnus Hagander<magnus@hagander.net> wrote:

On Mon, Aug 10, 2009 at 19:33, Magnus Hagander<magnus@hagander.net> wrote:

On Mon, Aug 10, 2009 at 16:58, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Mon, Aug 10, 2009 at 16:10, Tom Lane<tgl@sss.pgh.pa.us> wrote:

8.2 as well, no?

8.2 has a different shmem implementation - the one that emulates sysv
shmem. The patch will need to be changed around for that, and I
haven't looked at that. It may be worthwhile to do that, but it's a
separate patch, so let's get it out in 8.3 and 8.4 first.

If it's at all hard to do, I could see deprecating 8.2 for Windows
instead.

I haven't looked at how much work it would be at all yet. So let's do
that before we decide to deprecate anything. As mentioned downthread,
8.2 is a very widespread release, and we really want to avoid
deprecating it.

Here's an attempt at a backport to 8.2. I haven't examined it in
detail, but it passes "make check" on mingw.

Comments?

I've also built a binary that should be copy:able on top of an 8.2.13
installation made from the standard installer, to test this feature.
Anybody on 8.2 on Windows, please give it a shot and let us know how
it works.

http://www.hagander.net/pgsql/postgres_exe_virtualalloc_8_2.zip

--
Magnus Hagander

Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Etienne Dube
etdube@gmail.com
In reply to: Etienne Dube (#3)
1 attachment(s)

On 09/02/2010 4:09 PM, Etienne Dube wrote:

Magnus Hagander wrote:

IIRC, we've had zero reports on whether the patch worked at all on 8.2
in an environment where the problem actually existed. So yes, some
testing and feedback would be much apprecaited.

//Magnus

Thanks for your quick reply.
We upgraded to Service Pack 2 and it solved the problem. Nevertheless,
I'll try to reproduce the issue under a Win2008 SP1 VM to see whether
the patch makes a difference. 8.2.x win32 binaries are built using
MinGW right?

Etienne

The "could not reattach to shared memory" bug came back to bite us, this
time on a production machine running Windows Server 2008 R2 x64. I
manually applied the patch against the 8.2.17 sources and installed the
build on a test server. It has been running for two days without any
issue. We'll see after a while if the patch actually fixes the problem
(it seems to happen only after the postgres service has been up and
running for some time) but in case you want to include this fix in a
future 8.2.18 release, I've attached the new patch to apply against the
REL8_2_STABLE branch.

Etienne

Attachments:

pgsql-REL8_2_STABLE_virtualalloc.patchtext/plain; name=pgsql-REL8_2_STABLE_virtualalloc.patchDownload
Index: backend/port/sysv_shmem.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/port/sysv_shmem.c,v
retrieving revision 1.47.2.2
diff -u -r1.47.2.2 sysv_shmem.c
--- backend/port/sysv_shmem.c	1 May 2010 22:46:50 -0000	1.47.2.2
+++ backend/port/sysv_shmem.c	17 Jul 2010 19:19:51 -0000
@@ -49,6 +49,10 @@
 
 unsigned long UsedShmemSegID = 0;
 void	   *UsedShmemSegAddr = NULL;
+#ifdef WIN32
+Size		UsedShmemSegSize = 0;
+#endif
+
 
 static void *InternalIpcMemoryCreate(IpcMemoryKey memKey, Size size);
 static void IpcMemoryDetach(int status, Datum shmaddr);
@@ -445,6 +449,7 @@
 
 	/* Save info for possible future use */
 	UsedShmemSegAddr = memAddress;
+	UsedShmemSegSize = size;
 	UsedShmemSegID = (unsigned long) NextShmemSegID;
 
 	return hdr;
Index: backend/port/win32/shmem.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/port/win32/Attic/shmem.c,v
retrieving revision 1.13.2.1
diff -u -r1.13.2.1 shmem.c
--- backend/port/win32/shmem.c	4 May 2009 08:36:42 -0000	1.13.2.1
+++ backend/port/win32/shmem.c	17 Jul 2010 19:22:13 -0000
@@ -12,8 +12,11 @@
  */
 
 #include "postgres.h"
+#include "miscadmin.h"
 
 static DWORD s_segsize = 0;
+extern void *UsedShmemSegAddr;
+extern Size UsedShmemSegSize;
 
 /* Detach from a shared mem area based on its address */
 int
@@ -29,6 +32,13 @@
 void *
 shmat(int memId, void *shmaddr, int flag)
 {
+	/* Release the memory region reserved in the postmaster */
+	if (IsUnderPostmaster)
+	{
+		if (VirtualFree(shmaddr, 0, MEM_RELEASE) == 0)
+			elog(FATAL, "failed to release reserved memory region (addr=%p): %lu",
+				 shmaddr, GetLastError());
+	}
 	/* TODO -- shmat needs to count # attached to shared mem */
 	void	   *lpmem = MapViewOfFileEx((HANDLE) memId,
 										FILE_MAP_WRITE | FILE_MAP_READ,
@@ -128,3 +138,53 @@
 
 	return (int) hmap;
 }
+
+/*
+ * pgwin32_ReserveSharedMemoryRegion(hChild)
+ *
+ * Reserve the memory region that will be used for shared memory in a child
+ * process. It is called before the child process starts, to make sure the
+ * memory is available.
+ *
+ * Once the child starts, DLLs loading in different order or threads getting
+ * scheduled differently may allocate memory which can conflict with the
+ * address space we need for our shared memory. By reserving the shared
+ * memory region before the child starts, and freeing it only just before we
+ * attempt to get access to the shared memory forces these allocations to
+ * be given different address ranges that don't conflict.
+ *
+ * NOTE! This function executes in the postmaster, and should for this
+ * reason not use elog(FATAL) since that would take down the postmaster.
+ */
+int
+pgwin32_ReserveSharedMemoryRegion(HANDLE hChild)
+{
+	void *address;
+
+	Assert(UsedShmemSegAddr != NULL);
+	Assert(UsedShmemSegSize != 0);
+
+	address = VirtualAllocEx(hChild, UsedShmemSegAddr, UsedShmemSegSize,
+								MEM_RESERVE, PAGE_READWRITE);
+	if (address == NULL) {
+		/* Don't use FATAL since we're running in the postmaster */
+		elog(LOG, "could not reserve shared memory region (addr=%p) for child %lu: %lu",
+			 UsedShmemSegAddr, hChild, GetLastError());
+		return false;
+	}
+	if (address != UsedShmemSegAddr)
+	{
+		/*
+		 * Should never happen - in theory if allocation granularity causes strange
+		 * effects it could, so check just in case.
+		 *
+		 * Don't use FATAL since we're running in the postmaster.
+		 */
+	    elog(LOG, "reserved shared memory region got incorrect address %p, expected %p",
+			 address, UsedShmemSegAddr);
+		VirtualFreeEx(hChild, address, 0, MEM_RELEASE);
+		return false;
+	}
+
+	return true;
+}
Index: backend/postmaster/postmaster.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/postmaster.c,v
retrieving revision 1.505.2.8
diff -u -r1.505.2.8 postmaster.c
--- backend/postmaster/postmaster.c	1 Apr 2010 20:12:42 -0000	1.505.2.8
+++ backend/postmaster/postmaster.c	17 Jul 2010 19:24:43 -0000
@@ -3186,7 +3186,7 @@
 		return -1;				/* log made by save_backend_variables */
 	}
 
-	/* Drop the shared memory that is now inherited to the backend */
+	/* Drop the parameter shared memory that is now inherited to the backend */
 	if (!UnmapViewOfFile(param))
 		elog(LOG, "could not unmap view of backend parameter file: error code %d",
 			 (int) GetLastError());
@@ -3195,6 +3195,25 @@
 			 (int) GetLastError());
 
 	/*
+	 * Reserve the memory region used by our main shared memory segment before we
+	 * resume the child process.
+	 */
+	if (!pgwin32_ReserveSharedMemoryRegion(pi.hProcess))
+	{
+		/*
+		 * Failed to reserve the memory, so terminate the newly created
+		 * process and give up.
+		 */
+		if (!TerminateProcess(pi.hProcess, 255))
+			ereport(ERROR,
+					(errmsg_internal("could not terminate process that failed to reserve memory: error code %d",
+									 (int) GetLastError())));
+		CloseHandle(pi.hProcess);
+		CloseHandle(pi.hThread);
+		return -1;			/* logging done made by pgwin32_ReserveSharedMemoryRegion() */
+	}
+
+	/*
 	 * Now that the backend variables are written out, we start the child
 	 * thread so it can start initializing while we set up the rest of the
 	 * parent state.
Index: include/port/win32.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/port/win32.h,v
retrieving revision 1.63.2.3
diff -u -r1.63.2.3 win32.h
--- include/port/win32.h	29 Nov 2007 16:44:26 -0000	1.63.2.3
+++ include/port/win32.h	17 Jul 2010 19:26:18 -0000
@@ -263,6 +263,9 @@
 extern int	pgwin32_is_service(void);
 #endif
 
+/* in backend/port/win32/shmem.c */
+extern int	pgwin32_ReserveSharedMemoryRegion(HANDLE);
+
 /* in port/win32error.c */
 extern void _dosmaperr(unsigned long);
 

#5Magnus Hagander
magnus@hagander.net
In reply to: Etienne Dube (#4)
Re: [PATCH] "could not reattach to shared memory" on Windows

On Tue, Jul 20, 2010 at 17:31, Etienne Dube <etdube@gmail.com> wrote:

On 09/02/2010 4:09 PM, Etienne Dube wrote:

Magnus Hagander wrote:

IIRC, we've had zero reports on whether the patch worked at all on 8.2
in an environment where the problem actually existed. So yes, some
testing and feedback would be much apprecaited.

//Magnus

Thanks for your quick reply.
We upgraded to Service Pack 2 and it solved the problem. Nevertheless,
I'll try to reproduce the issue under a Win2008 SP1 VM to see whether the
patch makes a difference. 8.2.x win32 binaries are built using MinGW right?

Etienne

The "could not reattach to shared memory" bug came back to bite us, this
time on a production machine running Windows Server 2008 R2 x64. I manually
applied the patch against the 8.2.17 sources and installed the build on a
test server. It has been running for two days without any issue. We'll see
after a while if the patch actually fixes the problem (it seems to happen
only after the postgres service has been up and running for some time) but
in case you want to include this fix in a future 8.2.18 release, I've
attached the new patch to apply against the REL8_2_STABLE branch.

Yes, I think it's time to backpatch this to 8.2 - it has worked very
well on 8.3 and 8.4, and we've had a couple of good reports on 8.2 by
now. So I've done that, so it should be in the next 8.2 version.

In fact, there was a small bug in the patch that broke all non-win32
platforms, so I fixed that while at it :-)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/