Fixing order of resowner cleanup in 12, for Windows

Started by Thomas Munro over 6 years ago · 17 messages
#1Thomas Munro
thomas.munro@gmail.com

Hi all,

A while back I posted a patch[1] to change the order of resowner
cleanup so that DSM handles are released last. That's useful for the
error cleanup path on Windows, when a SharedFileSet is cleaned up (a
mechanism that's used by parallel CREATE INDEX and parallel hash join,
for spilling files to disk under a temporary directory, with automatic
cleanup). Previously we believed that it was OK to unlink things that
other processes might have currently open as long as you use the
FILE_SHARE_DELETE flag, but that turned out not to be the full story:
you can unlink files that someone has open, but you can't unlink the
directory that contains them! Hence the desire to reverse the
clean-up order.
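
To see the Windows behaviour in isolation, here's a minimal standalone
Win32 sketch (the paths are invented for illustration):

#include <windows.h>
#include <stdio.h>

int
main(void)
{
	const char *dir = "C:\\tmp\\demo_dir";
	const char *file = "C:\\tmp\\demo_dir\\spill.tmp";
	HANDLE		h;

	CreateDirectoryA(dir, NULL);

	/* Open with FILE_SHARE_DELETE so others may unlink it while open. */
	h = CreateFileA(file, GENERIC_WRITE,
					FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
					NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
	if (h == INVALID_HANDLE_VALUE)
		return 1;

	/* Unlinking the open file succeeds... */
	if (!DeleteFileA(file))
		printf("DeleteFile failed: %lu\n", GetLastError());

	/* ...but the directory can't be removed while the handle is open. */
	if (!RemoveDirectoryA(dir))
		printf("RemoveDirectory failed: %lu\n", GetLastError());

	CloseHandle(h);

	/* Once the handle is closed, the directory can go. */
	if (RemoveDirectoryA(dir))
		printf("directory removed after handle closed\n");
	return 0;
}

Here the first RemoveDirectoryA() call fails with ERROR_DIR_NOT_EMPTY
(145) even though the file inside has already been unlinked; it only
succeeds after the handle is closed.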

It didn't seem worth the risk of back-patching the change, because the
only consequence is a confusing message that appears somewhere near
the real error:

LOG: could not rmdir directory
"base/pgsql_tmp/pgsql_tmp5088.0.sharedfileset": Directory not empty

I suppose we probably should make the change to 12 though: then owners
of extensions that use DSM detach hooks (if there are any such extensions)
will have a bit of time to get used to the new order during the beta
period. I'll need to find someone to test this with a fault injection
scenario on Windows before committing it, but wanted to sound out the
list for any objections to this late change?

[1]: /messages/by-id/CAEepm=2ikUtjmiJ18bTnwaeUBoiYN=wMDSdhU1jy=8WzNhET-Q@mail.gmail.com

--
Thomas Munro
https://enterprisedb.com

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#1)
Re: Fixing order of resowner cleanup in 12, for Windows

Thomas Munro <thomas.munro@gmail.com> writes:

A while back I posted a patch[1] to change the order of resowner
cleanup so that DSM handles are released last. That's useful for the
error cleanup path on Windows, when a SharedFileSet is cleaned up (a
mechanism that's used by parallel CREATE INDEX and parallel hash join,
for spilling files to disk under a temporary directory, with automatic
cleanup).

I guess what I'm wondering is if there are any potential negative
consequences, ie code that won't work if we change the order like this.
I'm finding it hard to visualize what that would be, but then again
this failure mode wasn't obvious either.

I suppose we probably should make the change to 12 though: then owners
of extensions that use DSM detach hooks (if there are any such extensions)
will have a bit of time to get used to the new order during the beta
period. I'll need to find someone to test this with a fault injection
scenario on Windows before committing it, but wanted to sound out the
list for any objections to this late change?

Since we haven't started beta yet, I don't see a reason not to change
it. Worst case is that it causes problems and we revert it.

I concur with not back-patching, in any case.

regards, tom lane

#3Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#2)
1 attachment(s)
Re: Fixing order of resowner cleanup in 12, for Windows

On Fri, May 3, 2019 at 2:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

A while back I posted a patch[1] to change the order of resowner
cleanup so that DSM handles are released last. That's useful for the
error cleanup path on Windows, when a SharedFileSet is cleaned up (a
mechanism that's used by parallel CREATE INDEX and parallel hash join,
for spilling files to disk under a temporary directory, with automatic
cleanup).

I guess what I'm wondering is if there are any potential negative
consequences, ie code that won't work if we change the order like this.
I'm finding it hard to visualize what that would be, but then again
this failure mode wasn't obvious either.

I can't think of anything in core. The trouble here is that we're
talking about hypothetical out-of-tree code that could want to plug in
detach hooks to do anything at all, so it's hard to say. One idea
that occurred to me is that if someone comes up with a genuine need to
run arbitrary callbacks before locks are released (for example), we
could provide a way to be called in all three phases and receive the
phase, though admittedly in this case FileClose() is in the same phase
as I'm proposing to put dsm_detach(), so there is an ordering
constraint that might require more fine-grained phases. I don't
know.
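
For reference, the existing add-on hook already receives the phase on
each of the three calls; here's a minimal sketch of a hypothetical
extension using it (the callback name is invented):

#include "postgres.h"
#include "fmgr.h"
#include "utils/resowner.h"

PG_MODULE_MAGIC;

/*
 * ResourceRelease_callbacks are run once per phase, so a callback can
 * pick out the phase it cares about.
 */
static void
my_release_callback(ResourceReleasePhase phase, bool isCommit,
					bool isTopLevel, void *arg)
{
	if (phase != RESOURCE_RELEASE_BEFORE_LOCKS)
		return;					/* act while locks are still held */

	/* ... release extension-managed resources here ... */
}

void
_PG_init(void)
{
	RegisterResourceReleaseCallback(my_release_callback, NULL);
}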

I suppose we probably should make the change to 12 though: then owners
of extensions that use DSM detach hooks (if there are any such extensions)
will have a bit of time to get used to the new order during the beta
period. I'll need to find someone to test this with a fault injection
scenario on Windows before committing it, but wanted to sound out the
list for any objections to this late change?

Since we haven't started beta yet, I don't see a reason not to change
it. Worst case is that it causes problems and we revert it.

I concur with not back-patching, in any case.

Here's a way to trigger an error that might produce the log message
on Windows. Does anyone want to try it?

postgres=# create table foo as select generate_series(1, 10000000)::int i;
SELECT 10000000
postgres=# set synchronize_seqscans = off;
SET
postgres=# create index on foo ((1 / (5000000 - i)));
psql: ERROR: division by zero
postgres=# create index on foo ((1 / (5000000 - i)));
psql: ERROR: division by zero
postgres=# create index on foo ((1 / (5000000 - i)));
psql: ERROR: division by zero
CONTEXT: parallel worker

(If you don't turn sync scan off, it starts scanning from where it
left off last time and then fails immediately, which may interfere
with the experiment if you run it more than once; I'm not sure.)

If it does produce the log message, then the attached patch should
make it go away.

--
Thomas Munro
https://enterprisedb.com

Attachments:

0001-Detach-from-DSM-segments-last-in-resowner.c.patch (application/octet-stream)
From 46e5a869fc3804fd5df34e2f1dc35daa06a7a907 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Thu, 29 Nov 2018 12:24:07 +1300
Subject: [PATCH] Detach from DSM segments last in resowner.c.

It turns out that Windows' FILE_SHARE_DELETE mode doesn't quite give
POSIX semantics for unlink().  While it lets you unlink a file or
a directory that has an open handle, it doesn't let you unlink a
directory that previously contained a file that has an open handle.

This has the consequence that an error raised while a SharedFileSet
exists can cause us to fail to unlink a temporary directory, even
though all the files within it have been unlinked.  We write a LOG
message and leak the empty directory until the next restart eventually
cleans it up.

Fix, by detaching from DSM segments only after we have closed all
file handles.  Since SharedFileSet uses a DSM detach hook to unlink
files only when the last backend detaches, it's only our own file
handles that we have to worry about.

This was a secondary issue discovered while investigating bug #15460.

Author: Thomas Munro, based on a suggestion from Robert Haas
Discussion: https://postgr.es/m/CA%2BhUKGKVWbz_iniqvFujPZLioFPxGwuVV6PJeeCrQ8SVcdg7FQ%40mail.gmail.com
Discussion: https://postgr.es/m/15460-b6db80de822fa0ad@postgresql.org
Discussion: https://postgr.es/m/CAEepm=1Ugp7mNFX-YpSfWr0F_6QA4jzjtavauBcoAAZGd7_tPA@mail.gmail.com
---
 src/backend/utils/resowner/resowner.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c
index 64aafef311..505508d405 100644
--- a/src/backend/utils/resowner/resowner.c
+++ b/src/backend/utils/resowner/resowner.c
@@ -536,16 +536,6 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
 			RelationClose(res);
 		}
 
-		/* Ditto for dynamic shared memory segments */
-		while (ResourceArrayGetAny(&(owner->dsmarr), &foundres))
-		{
-			dsm_segment *res = (dsm_segment *) DatumGetPointer(foundres);
-
-			if (isCommit)
-				PrintDSMLeakWarning(res);
-			dsm_detach(res);
-		}
-
 		/* Ditto for JIT contexts */
 		while (ResourceArrayGetAny(&(owner->jitarr), &foundres))
 		{
@@ -669,6 +659,16 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
 				PrintFileLeakWarning(res);
 			FileClose(res);
 		}
+
+		/* Ditto for dynamic shared memory segments */
+		while (ResourceArrayGetAny(&(owner->dsmarr), &foundres))
+		{
+			dsm_segment *res = (dsm_segment *) DatumGetPointer(foundres);
+
+			if (isCommit)
+				PrintDSMLeakWarning(res);
+			dsm_detach(res);
+		}
 	}
 
 	/* Let add-on modules get a chance too */
-- 
2.21.0

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#3)
Re: Fixing order of resowner cleanup in 12, for Windows

Thomas Munro <thomas.munro@gmail.com> writes:

If it does produce the log message, then the attached patch should
make it go away.

One thing I don't care for about this patch is that the original code
looked like it didn't matter what order we did the resource releases in,
and the patched code still looks like that. You're not doing future
hackers any service by failing to include a comment that explains that
DSM detach MUST BE LAST, and explaining why. Even with that, I'd only
rate it about a 75% chance that somebody won't try to add their new
resource type at the end --- but with no comment, the odds they'll
get it right are indistinguishable from zero.

regards, tom lane

#5Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#4)
1 attachment(s)
Re: Fixing order of resowner cleanup in 12, for Windows

On Mon, May 6, 2019 at 11:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

One thing I don't care for about this patch is that the original code
looked like it didn't matter what order we did the resource releases in,
and the patched code still looks like that. You're not doing future
hackers any service by failing to include a comment that explains that
DSM detach MUST BE LAST, and explaining why. Even with that, I'd only
rate it about a 75% chance that somebody won't try to add their new
resource type at the end --- but with no comment, the odds they'll
get it right are indistinguishable from zero.

Ok, here's a version that provides a specific reason (the Windows file
handle thing) and also a more general reasoning: we don't really want
extension (or core) authors writing callbacks that depend on eg pins
or locks or whatever else being still held when they run, because
that's fragile, so calling them last is the best and most conservative
choice. I think if someone does come up with legitimate reasons to want
that, we should discuss it then, and perhaps consider something a bit
like the ResourceRelease_callbacks list: its callbacks are invoked for
each phase.

--
Thomas Munro
https://enterprisedb.com

Attachments:

0001-Detach-from-DSM-segments-last-in-resowner.c-v2.patch (application/x-patch)
From dfb5b10435b52d07cc0f294ee9919d67170b9601 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Thu, 29 Nov 2018 12:24:07 +1300
Subject: [PATCH] Detach from DSM segments last in resowner.c.

It turns out that Windows' FILE_SHARE_DELETE mode doesn't quite give
POSIX semantics for unlink().  While it lets you unlink a file or
a directory that has an open handle, it doesn't let you unlink a
directory that previously contained a file that has an open handle.

This has the consequence that an error raised while a SharedFileSet
exists can cause us to fail to unlink a temporary directory, even
though all the files within it have been unlinked.  We write a LOG
message and leak the empty directory until the next restart eventually
cleans it up.

Fix, by detaching from DSM segments only after we have closed all
file handles.  Since SharedFileSet uses a DSM detach hook to unlink
files only when the last backend detaches, it's only our own file
handles that we have to worry about.

This was a secondary issue discovered while investigating bug #15460.

Author: Thomas Munro, based on a suggestion from Robert Haas
Discussion: https://postgr.es/m/CA%2BhUKGKVWbz_iniqvFujPZLioFPxGwuVV6PJeeCrQ8SVcdg7FQ%40mail.gmail.com
Discussion: https://postgr.es/m/15460-b6db80de822fa0ad@postgresql.org
Discussion: https://postgr.es/m/CAEepm=1Ugp7mNFX-YpSfWr0F_6QA4jzjtavauBcoAAZGd7_tPA@mail.gmail.com
---
 src/backend/utils/resowner/resowner.c | 32 ++++++++++++++++++---------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c
index 64aafef311..839df34112 100644
--- a/src/backend/utils/resowner/resowner.c
+++ b/src/backend/utils/resowner/resowner.c
@@ -536,16 +536,6 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
 			RelationClose(res);
 		}
 
-		/* Ditto for dynamic shared memory segments */
-		while (ResourceArrayGetAny(&(owner->dsmarr), &foundres))
-		{
-			dsm_segment *res = (dsm_segment *) DatumGetPointer(foundres);
-
-			if (isCommit)
-				PrintDSMLeakWarning(res);
-			dsm_detach(res);
-		}
-
 		/* Ditto for JIT contexts */
 		while (ResourceArrayGetAny(&(owner->jitarr), &foundres))
 		{
@@ -669,6 +659,28 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
 				PrintFileLeakWarning(res);
 			FileClose(res);
 		}
+
+		/*
+		 * Ditto for dynamic shared memory segments.
+		 *
+		 * Besides releasing shared memory, dsm_detach() also runs arbitrary
+		 * registered callbacks.  We do this after closing temporary files,
+		 * because SharedFileSetOnDetach tries to unlink shared temporary
+		 * directories when the last backend detaches.  On Windows, that fails
+		 * if file handles still exist for files inside that directory.
+		 *
+		 * More generally, doing this last prevents extension writers from
+		 * developing DSM detach hooks that depend on other resources and thus
+		 * on the exact release order.
+		 */
+		while (ResourceArrayGetAny(&(owner->dsmarr), &foundres))
+		{
+			dsm_segment *res = (dsm_segment *) DatumGetPointer(foundres);
+
+			if (isCommit)
+				PrintDSMLeakWarning(res);
+			dsm_detach(res);
+		}
 	}
 
 	/* Let add-on modules get a chance too */
-- 
2.21.0

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#5)
Re: Fixing order of resowner cleanup in 12, for Windows

Thomas Munro <thomas.munro@gmail.com> writes:

On Mon, May 6, 2019 at 11:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

... You're not doing future
hackers any service by failing to include a comment that explains that
DSM detach MUST BE LAST, and explaining why.

Ok, here's a version that provides a specific reason (the Windows file
handle thing) and also a more general reasoning: we don't really want
extension (or core) authors writing callbacks that depend on eg pins
or locks or whatever else being still held when they run, because
that's fragile, so calling them last is the best and most conservative
choice.

LGTM.

... I think if someone does come up with legitimate reasons to want
that, we should discuss it then, and perhaps consider something a bit
like the ResourceRelease_callbacks list: its callbacks are invoked for
each phase.

Hmm, now that you mention it: this bit at the very end

/* Let add-on modules get a chance too */
for (item = ResourceRelease_callbacks; item; item = item->next)
item->callback(phase, isCommit, isTopLevel, item->arg);

seems kind of misplaced given this discussion. Should we not run that
*first*, before we release core resources for the same phase? It's
a lot more plausible that extension resources depend on core resources
than vice versa.

regards, tom lane

#7Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#6)
Re: Fixing order of resowner cleanup in 12, for Windows

On Mon, May 6, 2019 at 3:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

On Mon, May 6, 2019 at 11:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

... You're not doing future
hackers any service by failing to include a comment that explains that
DSM detach MUST BE LAST, and explaining why.

Ok, here's a version that provides a specific reason (the Windows file
handle thing) and also a more general reasoning: we don't really want
extension (or core) authors writing callbacks that depend on eg pins
or locks or whatever else being still held when they run, because
that's fragile, so calling them last is the best and most conservative
choice.

LGTM.

Cool. I'll wait a bit to see if we can get confirmation from a
Windows hacker that it does what I claim. Or maybe I should try to
come up with a regression test that exercises it without having to
create a big table.

... I think if someone does come up with legitimate reasons to want
that, we should discuss it then, and perhaps consider something a bit
like the ResourceRelease_callbacks list: its callbacks are invoked for
each phase.

Hmm, now that you mention it: this bit at the very end

/* Let add-on modules get a chance too */
for (item = ResourceRelease_callbacks; item; item = item->next)
item->callback(phase, isCommit, isTopLevel, item->arg);

seems kind of misplaced given this discussion. Should we not run that
*first*, before we release core resources for the same phase? It's
a lot more plausible that extension resources depend on core resources
than vice versa.

Not sure. Changing the meaning of the existing callbacks from last
to first in each phase seems a bit unfriendly. If it's useful to be
able to run a callback before RESOURCE_RELEASE_BEFORE_LOCKS, perhaps
we need a new phase that comes before that?
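
Purely as a hypothetical sketch of that idea against resowner.h (the
new phase name is invented):

typedef enum
{
	RESOURCE_RELEASE_PRE_CLEANUP,	/* hypothetical: would run first */
	RESOURCE_RELEASE_BEFORE_LOCKS,
	RESOURCE_RELEASE_LOCKS,
	RESOURCE_RELEASE_AFTER_LOCKS
} ResourceReleasePhase;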

--
Thomas Munro
https://enterprisedb.com

#8Amit Kapila
amit.kapila16@gmail.com
In reply to: Thomas Munro (#3)
Re: Fixing order of resowner cleanup in 12, for Windows

On Mon, May 6, 2019 at 3:43 AM Thomas Munro <thomas.munro@gmail.com> wrote:

On Fri, May 3, 2019 at 2:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

A while back I posted a patch[1] to change the order of resowner
cleanup so that DSM handles are released last. That's useful for the
error cleanup path on Windows, when a SharedFileSet is cleaned up (a
mechanism that's used by parallel CREATE INDEX and parallel hash join,
for spilling files to disk under a temporary directory, with automatic
cleanup).

I guess what I'm wondering is if there are any potential negative
consequences, ie code that won't work if we change the order like this.
I'm finding it hard to visualize what that would be, but then again
this failure mode wasn't obvious either.

I can't think of anything in core. The trouble here is that we're
talking about hypothetical out-of-tree code that could want to plug in
detach hooks to do anything at all, so it's hard to say. One idea
that occurred to me is that if someone comes up with a genuine need to
run arbitrary callbacks before locks are released (for example), we
could provide a way to be called in all three phases and receive the
phase, though admittedly in this case FileClose() is in the same phase
as I'm proposing to put dsm_detach(), so there is an ordering
constraint that might require more fine-grained phases. I don't
know.

I suppose we probably should make the change to 12 though: then owners
of extensions that use DSM detach hooks (if there are any such extensions)
will have a bit of time to get used to the new order during the beta
period. I'll need to find someone to test this with a fault injection
scenario on Windows before committing it, but wanted to sound out the
list for any objections to this late change?

Since we haven't started beta yet, I don't see a reason not to change
it. Worst case is that it causes problems and we revert it.

I concur with not back-patching, in any case.

Here's a way to trigger an error that might produce the log message
on Windows. Does anyone want to try it?

I can give it a try.

postgres=# create table foo as select generate_series(1, 10000000)::int i;
SELECT 10000000
postgres=# set synchronize_seqscans = off;
SET
postgres=# create index on foo ((1 / (5000000 - i)));
psql: ERROR: division by zero
postgres=# create index on foo ((1 / (5000000 - i)));
psql: ERROR: division by zero
postgres=# create index on foo ((1 / (5000000 - i)));
psql: ERROR: division by zero
CONTEXT: parallel worker

(If you don't turn sync scan off, it starts scanning from where it
left off last time and then fails immediately, which may interfere
with the experiment if you run it more than once; I'm not sure.)

If it does produce the log message, then the attached patch should
make it go away.

Are you referring to the log message "LOG: could not rmdir directory
"base/pgsql_tmp/pgsql_tmp3692.0.sharedfileset": Directory not empty"?
If so, I am getting it both before and after your patch.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#9Thomas Munro
thomas.munro@gmail.com
In reply to: Amit Kapila (#8)
Re: Fixing order of resowner cleanup in 12, for Windows

On Mon, May 6, 2019 at 9:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, May 6, 2019 at 3:43 AM Thomas Munro <thomas.munro@gmail.com> wrote:

Here's a way to trigger an error that might produce the log message
on Windows. Does anyone want to try it?

I can give it a try.

Thanks!

If it does produce the log message, then the attached patch should
make it go away.

Are you referring to the log message "LOG: could not rmdir directory
"base/pgsql_tmp/pgsql_tmp3692.0.sharedfileset": Directory not empty"?

Yes.

If so, I am getting it both before and after your patch.

Huh. I thought the only problem here was the phenomenon demonstrated
by [1]. I'm a bit stumped... if we've closed all the handles in every
backend before detaching, and then the last to detach unlinks all the
files first and then the directory, how can we get that error?

[1]: /messages/by-id/CAEepm=2rH_V5by1kH1Q1HZWPFj=4ykjU4JcyoKMNVT6Jh8Q_Rw@mail.gmail.com

--
Thomas Munro
https://enterprisedb.com

#10Amit Kapila
amit.kapila16@gmail.com
In reply to: Thomas Munro (#9)
Re: Fixing order of resowner cleanup in 12, for Windows

On Mon, May 6, 2019 at 4:41 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Mon, May 6, 2019 at 9:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

If so, I am getting it both before and after your patch.

Huh. I thought the only problem here was the phenomenon demonstrated
by [1]. I'm a bit stumped... if we've closed all the handles in every
backend before detaching, and then the last to detach unlinks all the
files first and then the directory, how can we get that error?

Yeah, I am also not sure what caused that, and I have verified it
twice to ensure that I have not made any mistake. I can try once
again tomorrow after adding some debug messages, but it would be good
if someone else could also confirm the behavior.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#11Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#2)
Re: Fixing order of resowner cleanup in 12, for Windows

On Thu, May 2, 2019 at 10:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

A while back I posted a patch[1] to change the order of resowner
cleanup so that DSM handles are released last. That's useful for the
error cleanup path on Windows, when a SharedFileSet is cleaned up (a
mechanism that's used by parallel CREATE INDEX and parallel hash join,
for spilling files to disk under a temporary directory, with automatic
cleanup).

I guess what I'm wondering is if there are any potential negative
consequences, ie code that won't work if we change the order like this.
I'm finding it hard to visualize what that would be, but then again
this failure mode wasn't obvious either.

I have a thought about this. It seems to me that when it comes to
backend-private memory, we release it even later: aborting the
transaction does nothing, and we do it only later when we clean up the
transaction. So I wonder whether we're going to find that we actually
want to postpone reclaiming dynamic shared memory for even longer than
this change would do. But in general, I think we've already
established the principle that releasing memory needs to happen last,
because every other resource that you might be using is tracked using
data structures that are, uh, stored in memory. Therefore I suspect
that this change is going in the right direction.

To put that another way, the issue here is that the removal of the
files can't happen after the cleanup of the memory that tells us which
files to remove. If we had the corresponding problem for the
non-parallel case, it would mean that we were deleting the
transaction's memory context before we finished releasing all the
resources managed by the transaction's resowner, which would be
insane. I believe I put the call to release DSM segments where it is
on the theory that "we should release dynamic shared memory as early
as possible because freeing memory is good," completely failing to
take into account that this was not at all like what we do for
backend-private memory.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#11)
Re: Fixing order of resowner cleanup in 12, for Windows

Robert Haas <robertmhaas@gmail.com> writes:

I have a thought about this. It seems to me that when it comes to
backend-private memory, we release it even later: aborting the
transaction does nothing, and we do it only later when we clean up the
transaction. So I wonder whether we're going to find that we actually
want to postpone reclaiming dynamic shared memory for even longer than
this change would do. But in general, I think we've already
established the principle that releasing memory needs to happen last,
because every other resource that you might be using is tracked using
data structures that are, uh, stored in memory. Therefore I suspect
that this change is going in the right direction.

Hmm. That argument suggests that DSM cleanup shouldn't be part of
resowner cleanup at all, but should be handled as a bespoke, late
step in transaction cleanup, as memory-context release is.

Not sure if that's going too far or not. It would definitely be a big
change in environment for DSM-cleanup hooks.

regards, tom lane

#13Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#12)
Re: Fixing order of resowner cleanup in 12, for Windows

On Mon, May 6, 2019 at 1:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I have a thought about this. It seems to me that when it comes to
backend-private memory, we release it even later: aborting the
transaction does nothing, and we do it only later when we clean up the
transaction. So I wonder whether we're going to find that we actually
want to postpone reclaiming dynamic shared memory for even longer than
this change would do. But in general, I think we've already
established the principle that releasing memory needs to happen last,
because every other resource that you might be using is tracked using
data structures that are, uh, stored in memory. Therefore I suspect
that this change is going in the right direction.

Hmm. That argument suggests that DSM cleanup shouldn't be part of
resowner cleanup at all, but should be handled as a bespoke, late
step in transaction cleanup, as memory-context release is.

Not sure if that's going too far or not. It would definitely be a big
change in environment for DSM-cleanup hooks.

Right. That's why I favor applying the change to move DSM cleanup to
the end for now, and seeing how that goes. It could be that we'll
eventually discover that doing it before all of the AtEOXact_BLAH
functions have had a shot at doing their thing is still too early,
but the only concrete problem that we know about right now can be
solved by this much-less-invasive change.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#13)
Re: Fixing order of resowner cleanup in 12, for Windows

Robert Haas <robertmhaas@gmail.com> writes:

Right. That's why I favor applying the change to move DSM cleanup to
the end for now, and seeing how that goes. It could be that we'll
eventually discover that doing it before all of the AtEOXact_BLAH
functions have had a shot at doing their thing is still too early,
but the only concrete problem that we know about right now can be
solved by this much-less-invasive change.

But Amit's results say that this *doesn't* fix the problem that we know
about. I suspect the reason is exactly that we need to run AtEOXact_Files
or the like before closing DSM. But we should get some Windows developer
to trace through this and identify the cause for-sure before we go
designing an invasive fix.

regards, tom lane

#15Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#14)
Re: Fixing order of resowner cleanup in 12, for Windows

On Mon, May 6, 2019 at 1:58 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Right. That's why I favor applying the change to move DSM cleanup to
the end for now, and seeing how that goes. It could be that we'll
eventually discover that doing it before all of the AtEOXact_BLAH
functions have had a shot at doing their thing is still too early,
but the only concrete problem that we know about right now can be
solved by this much-less-invasive change.

But Amit's results say that this *doesn't* fix the problem that we know
about. I suspect the reason is exactly that we need to run AtEOXact_Files
or the like before closing DSM. But we should get some Windows developer
to trace through this and identify the cause for-sure before we go
designing an invasive fix.

Huh, OK.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#16Thomas Munro
thomas.munro@gmail.com
In reply to: Robert Haas (#15)
Re: Fixing order of resowner cleanup in 12, for Windows

On Tue, May 7, 2019 at 6:08 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, May 6, 2019 at 1:58 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Right. That's why I favor applying the change to move DSM cleanup to
the end for now, and seeing how that goes. It could be that we'll
eventually discover that doing it before all of the AtEOXact_BLAH
functions have had a shot at doing their thing is still too early,
but the only concrete problem that we know about right now can be
solved by this much-less-invasive change.

But Amit's results say that this *doesn't* fix the problem that we know
about. I suspect the reason is exactly that we need to run AtEOXact_Files
or the like before closing DSM. But we should get some Windows developer
to trace through this and identify the cause for-sure before we go
designing an invasive fix.

Huh, OK.

The reason the patch didn't solve the problem is that
AtEOXact_Parallel() calls DestroyParallelContext(). So DSM segments
that happen to belong to ParallelContext objects are already gone by
the time resowner.c gets involved.
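
Abbreviated from src/backend/access/transam/parallel.c, roughly the
code path responsible:

void
AtEOXact_Parallel(bool isCommit)
{
	while (!dlist_is_empty(&pcxt_list))
	{
		ParallelContext *pcxt;

		pcxt = dlist_head_element(ParallelContext, node, &pcxt_list);
		if (isCommit)
			elog(WARNING, "leaked parallel context");

		/*
		 * Detaches the context's DSM segment, running its on_dsm_detach
		 * callbacks, before resowner.c ever sees the segment.
		 */
		DestroyParallelContext(pcxt);
	}
}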

--
Thomas Munro
https://enterprisedb.com

#17Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#16)
Re: Fixing order of resowner cleanup in 12, for Windows

On Thu, May 9, 2019 at 10:23 PM Thomas Munro <thomas.munro@gmail.com> wrote:

The reason the patch didn't solve the problem is that
AtEOXact_Parallel() calls DestroyParallelContext(). So DSM segments
that happen to belong to ParallelContext objects are already gone by
the time resowner.c gets involved.

This was listed as an open item for PostgreSQL 12, but I'm going to
move it to "older bugs". I want to fix it, but now that I understand
what's wrong, it's a slightly bigger design issue than I'm game to try
to fix right now.

This means that 12, like 11, will be capable of leaking empty
temporary directories on Windows whenever an error is raised in the
middle of parallel CREATE INDEX or multi-batch Parallel Hash Join.
The directories are eventually unlinked at restart, and at least
the (potentially large) files inside the directory are unlinked on
abort. I think we can live with that for a bit longer.

--
Thomas Munro
https://enterprisedb.com