on_exit_reset fails to clear DSM-related exit actions

Started by Tom Lanealmost 12 years ago21 messages
#1Tom Lane
tgl@sss.pgh.pa.us

I just noticed that the DSM patch has introduced a whole new class of
failures related to the bug #9464 issue: to wit, any on_detach
actions registered in a parent process will also be performed when a
child process exits, because nothing has been added to on_exit_reset
to prevent that. It seems likely that this is undesirable.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#1)
Re: on_exit_reset fails to clear DSM-related exit actions

On Fri, Mar 7, 2014 at 10:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I just noticed that the DSM patch has introduced a whole new class of
failures related to the bug #9464 issue: to wit, any on_detach
actions registered in a parent process will also be performed when a
child process exits, because nothing has been added to on_exit_reset
to prevent that. It seems likely that this is undesirable.

I don't think this can actually happen. There are quite a number of
things that would go belly-up if you tried to use dynamic shared
memory from the postmaster, which is why dsm_create() and dsm_attach()
both Assert(IsUnderPostmaster). Without calling those function,
there's no way for code running in the postmaster to call
on_dsm_detach(), because it's got nothing to pass for the first
argument.

In case you're wondering, the major reason I disallowed this is that
the control segment tracks the number of backends attached to each
segment, and there's no provision for adjusting that value each time
the postmaster forks. We could add such provision, but it seems like
there would be race conditions, and the postmaster might have to
participate in locking, so it might be pretty ugly, and a performance
suck for anyone who doesn't need this functionality. And it doesn't
seem very useful anyway. The postmaster really shouldn't be touching
any shared memory segment more than the absolutely minimal amount, so
the main possible benefit I can see is that you could have the mapping
already in place for each new backend rather than having to set it up
in every backend. But I'm prepared to hope that isn't actually
important enough to be worth worrying about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#2)
Re: on_exit_reset fails to clear DSM-related exit actions

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 7, 2014 at 10:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I just noticed that the DSM patch has introduced a whole new class of
failures related to the bug #9464 issue: to wit, any on_detach
actions registered in a parent process will also be performed when a
child process exits, because nothing has been added to on_exit_reset
to prevent that. It seems likely that this is undesirable.

I don't think this can actually happen. There are quite a number of
things that would go belly-up if you tried to use dynamic shared
memory from the postmaster, which is why dsm_create() and dsm_attach()
both Assert(IsUnderPostmaster).

Nonetheless it seems like a good idea to make on_exit_reset drop any
such queued actions.

The big picture here is that in the scenario being debated in the other
thread, exit() in a child process forked from a backend will execute that
backend's on_detach actions *even if the code had done on_exit_reset after
the fork*. So whether or not you buy Andres' argument that it's not
necessary for atexit_callback to defend against this scenario, there's
actually no other defense possible given the way things work in HEAD.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#3)
Re: on_exit_reset fails to clear DSM-related exit actions

Hi,

On 2014-03-07 13:54:42 -0500, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I don't think this can actually happen. There are quite a number of
things that would go belly-up if you tried to use dynamic shared
memory from the postmaster, which is why dsm_create() and dsm_attach()
both Assert(IsUnderPostmaster).

Nonetheless it seems like a good idea to make on_exit_reset drop any
such queued actions.

The big picture here is that in the scenario being debated in the other
thread, exit() in a child process forked from a backend will execute that
backend's on_detach actions *even if the code had done on_exit_reset after
the fork*.

Hm, aren't those actions called via shmem_exit() calling
dsm_backend_shutdown() et al? I think that should be cleared by
on_shmem_exit()?

So whether or not you buy Andres' argument that it's not
necessary for atexit_callback to defend against this scenario, there's
actually no other defense possible given the way things work in HEAD.

I think you're misunderstanding me. I am saying we *should* defend
against it. Our opinions just seem to differ on what to do when the
scenario is detected. I am saying we should scream bloody murder (which
admittedly is a hard in a child), you want to essentially declare it
supported.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#4)
Re: on_exit_reset fails to clear DSM-related exit actions

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-03-07 13:54:42 -0500, Tom Lane wrote:

The big picture here is that in the scenario being debated in the other
thread, exit() in a child process forked from a backend will execute that
backend's on_detach actions *even if the code had done on_exit_reset after
the fork*.

Hm, aren't those actions called via shmem_exit() calling
dsm_backend_shutdown() et al? I think that should be cleared by
on_shmem_exit()?

But dsm_backend_shutdown gets called whether or not any shmem_exit
actions are registered.

I think you're misunderstanding me. I am saying we *should* defend
against it. Our opinions just seem to differ on what to do when the
scenario is detected. I am saying we should scream bloody murder (which
admittedly is a hard in a child), you want to essentially declare it
supported.

And if we scream bloody murder, what will happen? Absolutely nothing
except we'll annoy our users. They won't have control over the
third-party libraries that are doing what you want to complain about.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#5)
Re: on_exit_reset fails to clear DSM-related exit actions

On 2014-03-07 14:14:17 -0500, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-03-07 13:54:42 -0500, Tom Lane wrote:

The big picture here is that in the scenario being debated in the other
thread, exit() in a child process forked from a backend will execute that
backend's on_detach actions *even if the code had done on_exit_reset after
the fork*.

Hm, aren't those actions called via shmem_exit() calling
dsm_backend_shutdown() et al? I think that should be cleared by
on_shmem_exit()?

But dsm_backend_shutdown gets called whether or not any shmem_exit
actions are registered.

Yikes. I thought on_exit_reset() disarmed the atexit callback, but
indeed, it does not...

I think you're misunderstanding me. I am saying we *should* defend
against it. Our opinions just seem to differ on what to do when the
scenario is detected. I am saying we should scream bloody murder (which
admittedly is a hard in a child), you want to essentially declare it
supported.

And if we scream bloody murder, what will happen? Absolutely nothing
except we'll annoy our users. They won't have control over the
third-party libraries that are doing what you want to complain about.

If the third party library is suitably careful it will only use fork()
and then exec() or _exit(), otherwise there are other things that'll be
broken (e.g. stdio). In that case everything was and is fine. If not,
the library's user can then fix or file a bug against the library.

Both perl and glibc seem to do just that in system() btw...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#3)
Re: on_exit_reset fails to clear DSM-related exit actions

On Fri, Mar 7, 2014 at 1:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 7, 2014 at 10:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I just noticed that the DSM patch has introduced a whole new class of
failures related to the bug #9464 issue: to wit, any on_detach
actions registered in a parent process will also be performed when a
child process exits, because nothing has been added to on_exit_reset
to prevent that. It seems likely that this is undesirable.

I don't think this can actually happen. There are quite a number of
things that would go belly-up if you tried to use dynamic shared
memory from the postmaster, which is why dsm_create() and dsm_attach()
both Assert(IsUnderPostmaster).

Nonetheless it seems like a good idea to make on_exit_reset drop any
such queued actions.

The big picture here is that in the scenario being debated in the other
thread, exit() in a child process forked from a backend will execute that
backend's on_detach actions *even if the code had done on_exit_reset after
the fork*.

Hmm. So the problematic sequence of events is where a postmaster
child forks, and then exits without exec-ing, perhaps because e.g.
exec fails?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Peter LaDow
petela@gocougs.wsu.edu
In reply to: Andres Freund (#6)
Re: on_exit_reset fails to clear DSM-related exit actions

On Friday, March 7, 2014, Andres Freund <andres@2ndquadrant.com> wrote:

If the third party library is suitably careful it will only use fork()
and then exec() or _exit(), otherwise there are other things that'll be

But that is not possible* in our case of trying to spawn an asynchronous
backgound process. The goal here is for the extension to spawn the process
and return immediately. If we exec in the first level child, and
immediately return, who reaps the child when done?

This is why we fork twice and exit in the first level child so that the
extension can reap the first.

Unless Postgres happily reaps all children perhaps through a SIGCHLD
handler?

(* I suppose we could exec a program that itself forks/execs a background
process. But that is essentially no different than what we are doing now,
other than trying to meet this fork/exec/_exit requirement. And that's fine
if that is to be the case. We just need to know what it is.)

Both perl and glibc seem to do just that in system() btw...

I don't know about Perl, but system is blocking. What if you don't want to
wait for the child to exit?

Pete

#9Andres Freund
andres@2ndquadrant.com
In reply to: Peter LaDow (#8)
Re: on_exit_reset fails to clear DSM-related exit actions

On 2014-03-07 12:09:45 -0800, Peter LaDow wrote:

On Friday, March 7, 2014, Andres Freund <andres@2ndquadrant.com> wrote:

If the third party library is suitably careful it will only use fork()
and then exec() or _exit(), otherwise there are other things that'll be

But that is not possible* in our case of trying to spawn an asynchronous
backgound process.

Forking twice is ok as well, as long as you just use _exit() after the
fork. The thing is that you shouldn't run any nontrivial code in the
fork, as long as you're connected to the original environment (fd's,
memory mappings and so forth).

Both perl and glibc seem to do just that in system() btw...

I don't know about Perl, but system is blocking. What if you don't want to
wait for the child to exit?

Man. The point is that that the library code is carefully written to not
use exit() but _exit() after a fork failure, that's it.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Peter LaDow
petela@gocougs.wsu.edu
In reply to: Andres Freund (#9)
Re: on_exit_reset fails to clear DSM-related exit actions

On Fri, Mar 7, 2014 at 12:17 PM, Andres Freund <andres@2ndquadrant.com> wrote:

Man. The point is that that the library code is carefully written to not
use exit() but _exit() after a fork failure, that's it.

I understand your point. I understand that in the case of the
postmaster we don't want to invoke behavior that can cause problems by
calling exit(). But how does this jive with Tom's point (on the bug
list) about other 3rd party libraries perhaps registering atexit
handlers?

If the convention is that only fork/exec/_exit is permissible after a
fork (what about on_exit_reset?), then third party libraries also need
to be aware of that convention and not depend on their atexit handlers
being called.

And I'm not advocating a certain solution. I'm only trying to clarify
what the solution is so that we "comply" with the convention. We
don't want to break posgres or any other "well behaved" third party
libraries. I don't know the internals of postgres (hence original bug
report and this thread), so I of course defer to you and the other
experts here. So, in our case we call on_exit_reset() after the fork
in the child, and then from there on only use fork, exec, or _exit, as
you mention above.

Pete

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Peter LaDow
petela@gocougs.wsu.edu
In reply to: Andres Freund (#9)
Re: on_exit_reset fails to clear DSM-related exit actions

On Fri, Mar 7, 2014 at 12:17 PM, Andres Freund <andres@2ndquadrant.com> wrote:

Forking twice is ok as well, as long as you just use _exit() after the
fork. The thing is that you shouldn't run any nontrivial code in the
fork, as long as you're connected to the original environment (fd's,
memory mappings and so forth).

Just to be clear, what do you mean by "nontrivial code"? Do you mean
C library calls, other than fork/exec/_exit?

Pete

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Peter LaDow
petela@gocougs.wsu.edu
In reply to: Peter LaDow (#11)
Re: on_exit_reset fails to clear DSM-related exit actions

On Fri, Mar 7, 2014 at 12:55 PM, Peter LaDow <petela@gocougs.wsu.edu> wrote:

Just to be clear, what do you mean by "nontrivial code"? Do you mean
C library calls, other than fork/exec/_exit?

I think I've answered my own question:

http://man7.org/linux/man-pages/man7/signal.7.html

The 'Async-signal-safe functions' are the nontrivial C calls that may be called.

Pete

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#7)
1 attachment(s)
Re: on_exit_reset fails to clear DSM-related exit actions

On Fri, Mar 7, 2014 at 2:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:

The big picture here is that in the scenario being debated in the other
thread, exit() in a child process forked from a backend will execute that
backend's on_detach actions *even if the code had done on_exit_reset after
the fork*.

Hmm. So the problematic sequence of events is where a postmaster
child forks, and then exits without exec-ing, perhaps because e.g.
exec fails?

I've attempted a fix for this case. The attached patch makes
test_shm_mq fork() a child process that calls on_exit_reset() and then
exits. Without the fix I just pushed, this causes the tests to fail;
with this fix, this does not cause the tests to fail.

I'm not entirely sure that this is exactly right, but I think it's an
improvement.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

forktest.patchtext/x-patch; charset=US-ASCII; name=forktest.patchDownload
diff --git a/contrib/test_shm_mq/test.c b/contrib/test_shm_mq/test.c
index 59f18ec..f6b9dd4 100644
--- a/contrib/test_shm_mq/test.c
+++ b/contrib/test_shm_mq/test.c
@@ -13,8 +13,11 @@
 
 #include "postgres.h"
 
+#include <unistd.h>
+
 #include "fmgr.h"
 #include "miscadmin.h"
+#include "storage/ipc.h"
 
 #include "test_shm_mq.h"
 
@@ -72,6 +75,15 @@ test_shm_mq(PG_FUNCTION_ARGS)
 	/* Set up dynamic shared memory segment and background workers. */
 	test_shm_mq_setup(queue_size, nworkers, &seg, &outqh, &inqh);
 
+	/* Try forking a child that immediately exits. */
+	if (fork() == 0)
+	{
+		elog(LOG, "child is PID %d", getpid());
+		on_exit_reset();
+		exit(0);
+	}
+	elog(LOG, "parent is PID %d", getpid());
+
 	/* Send the initial message. */
 	res = shm_mq_send(outqh, message_size, message_contents, false);
 	if (res != SHM_MQ_SUCCESS)
#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#13)
Re: on_exit_reset fails to clear DSM-related exit actions

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 7, 2014 at 2:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Hmm. So the problematic sequence of events is where a postmaster
child forks, and then exits without exec-ing, perhaps because e.g.
exec fails?

I've attempted a fix for this case. The attached patch makes
test_shm_mq fork() a child process that calls on_exit_reset() and then
exits. Without the fix I just pushed, this causes the tests to fail;
with this fix, this does not cause the tests to fail.

I'm not entirely sure that this is exactly right, but I think it's an
improvement.

This looks generally sane to me, but I wonder whether you should also
teach PGSharedMemoryDetach() to physically detach from DSM segments
not just the main shmem seg. Or maybe better, adjust most of the
places that call on_exit_reset and then PGSharedMemoryDetach to also
make a detach-from-DSM call. It looks like both sysv_shmem.c and
win32_shmem.c have internal callers of PGSharedMemoryDetach, which
probably shouldn't affect DSM.

You could argue that that's unnecessary on the grounds that the postmaster
will never have any DSM segments attached, but I think it would be
good coding practice anyway. People who copy-and-paste those bits of
code into other places are not likely to think of adding DSM calls.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#14)
Re: on_exit_reset fails to clear DSM-related exit actions

On Mon, Mar 10, 2014 at 11:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 7, 2014 at 2:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Hmm. So the problematic sequence of events is where a postmaster
child forks, and then exits without exec-ing, perhaps because e.g.
exec fails?

I've attempted a fix for this case. The attached patch makes
test_shm_mq fork() a child process that calls on_exit_reset() and then
exits. Without the fix I just pushed, this causes the tests to fail;
with this fix, this does not cause the tests to fail.

I'm not entirely sure that this is exactly right, but I think it's an
improvement.

This looks generally sane to me, but I wonder whether you should also
teach PGSharedMemoryDetach() to physically detach from DSM segments
not just the main shmem seg. Or maybe better, adjust most of the
places that call on_exit_reset and then PGSharedMemoryDetach to also
make a detach-from-DSM call.

Hmm, good catch. Maybe we should invent a new function that is
defined to mean "detach ALL shared memory segments".

A related point that's been bugging me for a while, and has just
struck me again, is that background workers for which
BGWORKER_SHMEM_ACCESS is not passed probably ought to be detaching
shared memory (and DSMs). Currently, and since Alvaro originally
added the facility, the death of a non-BGWORKER_SHMEM_ACCESS backend
is used in postmaster.c to decide whether to call HandleChildCrash().
But such workers could still clobber shared memory arbitrarily; they
haven't unmapped it. Oddly, the code in postmaster.c is only cares
about the flag when checking EXIT_STATUS_0()/EXIT_STATUS_1(), not when
checking ReleasePostmasterChildSlot()...

It looks like both sysv_shmem.c and
win32_shmem.c have internal callers of PGSharedMemoryDetach, which
probably shouldn't affect DSM.

You could argue that that's unnecessary on the grounds that the postmaster
will never have any DSM segments attached, but I think it would be
good coding practice anyway. People who copy-and-paste those bits of
code into other places are not likely to think of adding DSM calls.

Well, actually the postmaster WILL have the control segment attached
(not nothing else).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#15)
Re: on_exit_reset fails to clear DSM-related exit actions

Robert Haas escribi�:

A related point that's been bugging me for a while, and has just
struck me again, is that background workers for which
BGWORKER_SHMEM_ACCESS is not passed probably ought to be detaching
shared memory (and DSMs). Currently, and since Alvaro originally
added the facility, the death of a non-BGWORKER_SHMEM_ACCESS backend
is used in postmaster.c to decide whether to call HandleChildCrash().
But such workers could still clobber shared memory arbitrarily; they
haven't unmapped it. Oddly, the code in postmaster.c is only cares
about the flag when checking EXIT_STATUS_0()/EXIT_STATUS_1(), not when
checking ReleasePostmasterChildSlot()...

Clearly there's not a lot of consistency on that. I don't think I had
made up my mind completely about such details. I do remember that
unmapping/detaching the shared memory segment didn't cross my mind; the
flag, as I recall, only controls (controlled) whether to attach to it
explicitely.

IOW feel free to whack around.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#14)
Re: on_exit_reset fails to clear DSM-related exit actions

On Mon, Mar 10, 2014 at 11:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 7, 2014 at 2:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Hmm. So the problematic sequence of events is where a postmaster
child forks, and then exits without exec-ing, perhaps because e.g.
exec fails?

I've attempted a fix for this case. The attached patch makes
test_shm_mq fork() a child process that calls on_exit_reset() and then
exits. Without the fix I just pushed, this causes the tests to fail;
with this fix, this does not cause the tests to fail.

I'm not entirely sure that this is exactly right, but I think it's an
improvement.

This looks generally sane to me, but I wonder whether you should also
teach PGSharedMemoryDetach() to physically detach from DSM segments
not just the main shmem seg.

I looked at this a little more. It seems dsm_backend_shutdown()
already does almost the right thing; it fires all the on-detach
callbacks and unmaps the corresponding segments. It does NOT unmap
the control segment, though, and processes that are unmapping the main
shared memory segment for safety should really be getting rid of the
control segment too (even though the consequences of corrupting the
control segment are much less bad).

One option is to just change that function to also unmap the control
segment, and maybe rename it to dsm_detach_all(), and then use that
everywhere. The problem is that I'm not sure we really want to incur
the overhead of an extra munmap() during every backend exit, just to
get rid of a control segment which was going to be unmapped anyway by
process termination. For that matter, I'm not sure why we bother
arranging that for the main shared memory segment, either: surely
whatever function the shmdt() and munmap() calls in IpcMemoryDetach
may have will be equally well-served by the forthcoming exit()?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#17)
Re: on_exit_reset fails to clear DSM-related exit actions

Robert Haas <robertmhaas@gmail.com> writes:

One option is to just change that function to also unmap the control
segment, and maybe rename it to dsm_detach_all(), and then use that
everywhere. The problem is that I'm not sure we really want to incur
the overhead of an extra munmap() during every backend exit, just to
get rid of a control segment which was going to be unmapped anyway by
process termination. For that matter, I'm not sure why we bother
arranging that for the main shared memory segment, either: surely
whatever function the shmdt() and munmap() calls in IpcMemoryDetach
may have will be equally well-served by the forthcoming exit()?

Before you lobotomize that code too much, consider the postmaster
crash-recovery case. That path does need to detach from the old
shmem segment.

Also, I might be wrong, but I think IpcMemoryDetach is a *postmaster*
on_shmem_exit routine; it isn't called during backend exit.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#18)
1 attachment(s)
Re: on_exit_reset fails to clear DSM-related exit actions

On Mon, Mar 17, 2014 at 11:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

One option is to just change that function to also unmap the control
segment, and maybe rename it to dsm_detach_all(), and then use that
everywhere. The problem is that I'm not sure we really want to incur
the overhead of an extra munmap() during every backend exit, just to
get rid of a control segment which was going to be unmapped anyway by
process termination. For that matter, I'm not sure why we bother
arranging that for the main shared memory segment, either: surely
whatever function the shmdt() and munmap() calls in IpcMemoryDetach
may have will be equally well-served by the forthcoming exit()?

Before you lobotomize that code too much, consider the postmaster
crash-recovery case. That path does need to detach from the old
shmem segment.

Also, I might be wrong, but I think IpcMemoryDetach is a *postmaster*
on_shmem_exit routine; it isn't called during backend exit.

Ah, right. I verified using strace that, at least on Linux
2.6.32-279.el6.x86_64, the only system call made on disconnecting a
psql session is exit_group(0). I also tried setting breakpoints on
PGSharedMemoryDetach and IpcMemoryDetach and they do not fire.

After mulling over a few possible approaches, I came up with the
attached, which seems short and to the point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

dsm-detach-all.patchtext/x-patch; charset=US-ASCII; name=dsm-detach-all.patchDownload
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index cf2ce46..8153160 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -40,6 +40,7 @@
 #include "postmaster/fork_process.h"
 #include "postmaster/pgarch.h"
 #include "postmaster/postmaster.h"
+#include "storage/dsm.h"
 #include "storage/fd.h"
 #include "storage/ipc.h"
 #include "storage/latch.h"
@@ -163,6 +164,7 @@ pgarch_start(void)
 			on_exit_reset();
 
 			/* Drop our connection to postmaster's shared memory, as well */
+			dsm_detach_all();
 			PGSharedMemoryDetach();
 
 			PgArchiverMain(0, NULL);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 1ca5d13..3dc280a 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -50,6 +50,7 @@
 #include "postmaster/postmaster.h"
 #include "storage/proc.h"
 #include "storage/backendid.h"
+#include "storage/dsm.h"
 #include "storage/fd.h"
 #include "storage/ipc.h"
 #include "storage/latch.h"
@@ -709,6 +710,7 @@ pgstat_start(void)
 			on_exit_reset();
 
 			/* Drop our connection to postmaster's shared memory, as well */
+			dsm_detach_all();
 			PGSharedMemoryDetach();
 
 			PgstatCollectorMain(0, NULL);
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index e277a9a..4731ab7 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -39,6 +39,7 @@
 #include "postmaster/fork_process.h"
 #include "postmaster/postmaster.h"
 #include "postmaster/syslogger.h"
+#include "storage/dsm.h"
 #include "storage/ipc.h"
 #include "storage/latch.h"
 #include "storage/pg_shmem.h"
@@ -626,6 +627,7 @@ SysLogger_Start(void)
 			on_exit_reset();
 
 			/* Drop our connection to postmaster's shared memory, as well */
+			dsm_detach_all();
 			PGSharedMemoryDetach();
 
 			/* do the work */
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 232c099..c967177 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -722,6 +722,8 @@ dsm_attach(dsm_handle h)
 
 /*
  * At backend shutdown time, detach any segments that are still attached.
+ * (This is similar to dsm_detach_all, except that there's no reason to
+ * unmap the control segment before exiting, so we don't bother.)
  */
 void
 dsm_backend_shutdown(void)
@@ -736,6 +738,31 @@ dsm_backend_shutdown(void)
 }
 
 /*
+ * Detach all shared memory segments, including the control segments.  This
+ * should be called, along with PGSharedMemoryDetach, in processes that
+ * might inherit mappings but are not intended to be connected to dynamic
+ * shared memory.
+ */
+void
+dsm_detach_all(void)
+{
+	void   *control_address = dsm_control;
+
+	while (!dlist_is_empty(&dsm_segment_list))
+	{
+		dsm_segment	   *seg;
+
+		seg = dlist_head_element(dsm_segment, node, &dsm_segment_list);
+		dsm_detach(seg);
+	}
+
+	if (control_address != NULL)
+		dsm_impl_op(DSM_OP_DETACH, dsm_control_handle, 0,
+					&dsm_control_impl_private, &control_address,
+					&dsm_control_mapped_size, ERROR);
+}
+
+/*
  * Resize an existing shared memory segment.
  *
  * This may cause the shared memory segment to be remapped at a different
diff --git a/src/include/storage/dsm.h b/src/include/storage/dsm.h
index 03dd767..e4669a1 100644
--- a/src/include/storage/dsm.h
+++ b/src/include/storage/dsm.h
@@ -20,6 +20,7 @@ typedef struct dsm_segment dsm_segment;
 /* Startup and shutdown functions. */
 extern void dsm_postmaster_startup(void);
 extern void dsm_backend_shutdown(void);
+extern void dsm_detach_all(void);
 
 /* Functions that create, update, or remove mappings. */
 extern dsm_segment *dsm_create(Size size);
#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#19)
Re: on_exit_reset fails to clear DSM-related exit actions

Robert Haas <robertmhaas@gmail.com> writes:

After mulling over a few possible approaches, I came up with the
attached, which seems short and to the point.

Looks reasonable in principle. I didn't run through all the existing
PGSharedMemoryDetach calls to see if there are any other places to
call dsm_detach_all, but it's an easy fix if there are any.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#20)
Re: on_exit_reset fails to clear DSM-related exit actions

On Mon, Mar 17, 2014 at 1:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

After mulling over a few possible approaches, I came up with the
attached, which seems short and to the point.

Looks reasonable in principle. I didn't run through all the existing
PGSharedMemoryDetach calls to see if there are any other places to
call dsm_detach_all, but it's an easy fix if there are any.

OK, committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers