BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

Started by PG Bug reporting formover 5 years ago21 messages
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 16663
Logged by: Denis Patron
Email address: denis.patron@previnet.it
PostgreSQL version: 11.9
Operating system: CentOS 7
Description:

I have an index, which at the file system level, is made up of multiple
segments (file: <id>.1, <id>.2 ecc). When I DROP INDEX, the index is dropped
in Postgresql but at the file system level, the segments are marked as
"deleted". if I check with the lsof command, I see that the segments are in
use from an idle connection. This does not happen if the index is formed by
only one segment (in my case <1Gb). How can I prevent this?
thanks

#2Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

This is not a bug.

At Fri, 09 Oct 2020 13:24:15 +0000, PG Bug reporting form <noreply@postgresql.org> wrote in

The following bug has been logged on the website:

Bug reference: 16663
Logged by: Denis Patron
Email address: denis.patron@previnet.it
PostgreSQL version: 11.9
Operating system: CentOS 7
Description:

I have an index, which at the file system level, is made up of multiple
segments (file: <id>.1, <id>.2 ecc). When I DROP INDEX, the index is dropped
in Postgresql but at the file system level, the segments are marked as
"deleted". if I check with the lsof command, I see that the segments are in
use from an idle connection. This does not happen if the index is formed by
only one segment (in my case <1Gb). How can I prevent this?
thanks

That references to deleted files will dissapear at the beginning of
the next transaction.

At the time a relation including an index is dropped, the first
segment file (named as "<id>" without a suffix number) is left behind
so the file is not shown as "(deleted)" in lsof output.

The next checkpoint removes the first segment.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#3Andres Freund
andres@anarazel.de
In reply to: Kyotaro Horiguchi (#2)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

Hi,

On 2020-10-14 12:05:10 +0900, Kyotaro Horiguchi wrote:

This is not a bug.

At Fri, 09 Oct 2020 13:24:15 +0000, PG Bug reporting form <noreply@postgresql.org> wrote in

The following bug has been logged on the website:

Bug reference: 16663
Logged by: Denis Patron
Email address: denis.patron@previnet.it
PostgreSQL version: 11.9
Operating system: CentOS 7
Description:

I have an index, which at the file system level, is made up of multiple
segments (file: <id>.1, <id>.2 ecc). When I DROP INDEX, the index is dropped
in Postgresql but at the file system level, the segments are marked as
"deleted". if I check with the lsof command, I see that the segments are in
use from an idle connection. This does not happen if the index is formed by
only one segment (in my case <1Gb). How can I prevent this?
thanks

That references to deleted files will dissapear at the beginning of
the next transaction.

At the time a relation including an index is dropped, the first
segment file (named as "<id>" without a suffix number) is left behind
so the file is not shown as "(deleted)" in lsof output.

I think we should consider either occasionally sending a sinval catchup
interrupt to backends that have been idle for a while, or to use a timer
that we use to limit the maximum time until we process sinvals. Just
having to wait till all backends become busy and process sinval events
doesn't really seem like good approach to me.

Regards,

Andres

#4denis.patron
denis.patron@previnet.it
In reply to: Andres Freund (#3)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

Andres Freund wrote

Hi,

On 2020-10-14 12:05:10 +0900, Kyotaro Horiguchi wrote:

This is not a bug.

At Fri, 09 Oct 2020 13:24:15 +0000, PG Bug reporting form &lt;

noreply@

&gt; wrote in

The following bug has been logged on the website:

Bug reference: 16663
Logged by: Denis Patron
Email address:

denis.patron@

PostgreSQL version: 11.9
Operating system: CentOS 7
Description:

I have an index, which at the file system level, is made up of multiple
segments (file:

<id>
.1,
<id>
.2 ecc). When I DROP INDEX, the index is dropped

in Postgresql but at the file system level, the segments are marked as
"deleted". if I check with the lsof command, I see that the segments

are in

use from an idle connection. This does not happen if the index is

formed by

only one segment (in my case <1Gb). How can I prevent this?
thanks

That references to deleted files will dissapear at the beginning of
the next transaction.

At the time a relation including an index is dropped, the first
segment file (named as "

<id>
" without a suffix number) is left behind

so the file is not shown as "(deleted)" in lsof output.

I think we should consider either occasionally sending a sinval catchup
interrupt to backends that have been idle for a while, or to use a timer
that we use to limit the maximum time until we process sinvals. Just
having to wait till all backends become busy and process sinval events
doesn't really seem like good approach to me.

Regards,

Andres

thanks for replying.
the problem is that I have a very large database, with indexes of up to 70
Gb. while I redo the indexes in concurrently mode, if an idle transaction is
using the index in question, the segment file (<id> _1 <id> _2 etc) of the
index remains in the filesystem (marked as deleted) as long as the idle
connection that it is blocking it does not make another transaction. this
means that I can have hundreds of GB of space occupied by files marked
"deleted", and this for hours. the risk is to run out of free space

--
Sent from: https://www.postgresql-archive.org/PostgreSQL-bugs-f2117394.html

#5Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#3)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

On Wed, Oct 14, 2020 at 5:35 PM Andres Freund <andres@anarazel.de> wrote:

On 2020-10-14 12:05:10 +0900, Kyotaro Horiguchi wrote:

At the time a relation including an index is dropped, the first
segment file (named as "<id>" without a suffix number) is left behind
so the file is not shown as "(deleted)" in lsof output.

I think we should consider either occasionally sending a sinval catchup
interrupt to backends that have been idle for a while, or to use a timer
that we use to limit the maximum time until we process sinvals. Just
having to wait till all backends become busy and process sinval events
doesn't really seem like good approach to me.

Oops, I also replied to this but now I see that I accidentally replied
only to Horiguchi-san and not the list! I was thinking that we should
perhaps consider truncating the files to give back the disk space (as
we do for the first segment), so that it doesn't matter so much how
long other backends take to process SHAREDINVALSMGR_ID, close their
descriptors and release the inode.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#5)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

Thomas Munro <thomas.munro@gmail.com> writes:

On Wed, Oct 14, 2020 at 5:35 PM Andres Freund <andres@anarazel.de> wrote:

I think we should consider either occasionally sending a sinval catchup
interrupt to backends that have been idle for a while, or to use a timer
that we use to limit the maximum time until we process sinvals. Just
having to wait till all backends become busy and process sinval events
doesn't really seem like good approach to me.

Oops, I also replied to this but now I see that I accidentally replied
only to Horiguchi-san and not the list! I was thinking that we should
perhaps consider truncating the files to give back the disk space (as
we do for the first segment), so that it doesn't matter so much how
long other backends take to process SHAREDINVALSMGR_ID, close their
descriptors and release the inode.

+1, I was also thinking that. It'd be pretty easy to fit into the
existing system structure (I think, without having looked at the relevant
code lately), and it would not add any overhead to normal processing.
Installing a timeout to handle this per Andres' idea inevitably *would*
add overhead.

regards, tom lane

#7Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#6)
1 attachment(s)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

On Thu, Oct 15, 2020 at 8:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

On Wed, Oct 14, 2020 at 5:35 PM Andres Freund <andres@anarazel.de> wrote:

I think we should consider either occasionally sending a sinval catchup
interrupt to backends that have been idle for a while, or to use a timer
that we use to limit the maximum time until we process sinvals. Just
having to wait till all backends become busy and process sinval events
doesn't really seem like good approach to me.

Oops, I also replied to this but now I see that I accidentally replied
only to Horiguchi-san and not the list! I was thinking that we should
perhaps consider truncating the files to give back the disk space (as
we do for the first segment), so that it doesn't matter so much how
long other backends take to process SHAREDINVALSMGR_ID, close their
descriptors and release the inode.

+1, I was also thinking that. It'd be pretty easy to fit into the
existing system structure (I think, without having looked at the relevant
code lately), and it would not add any overhead to normal processing.
Installing a timeout to handle this per Andres' idea inevitably *would*
add overhead.

Alright, here is a first swing at making our behaviour more consistent
in two ways:

1. The first segment should be truncated even in recovery.
2. Later segments should be truncated on commit.

I don't know why the existing coding decides not to try to unlink the
later segments if the truncate of segment 0 failed. We already
committed, we should plough on.

Attachments:

0001-Free-disk-space-for-dropped-relations-on-commit.patchtext/x-patch; charset=US-ASCII; name=0001-Free-disk-space-for-dropped-relations-on-commit.patchDownload
From 2125d347fdfb3b4d373cdfe549e6c6c2471c6f17 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Thu, 15 Oct 2020 13:22:44 +1300
Subject: [PATCH] Free disk space for dropped relations on commit.

When committing a transaction that dropped a relation, we previously
truncated only the first segment file to free up disk space, because it
won't be unlinked until the next checkpoint.

Truncate later segments too, even though we unlink them on commit.  This
frees the disk space immediately, even if other backends have open file
descriptors and might take a long time to get around to handling shared
invalidation events and closing them.

Also extend the same behavior to the first segment in recovery.

Bug: #16663
Reported-by: Denis Patron <denis.patron@previnet.it>
Discussion: https://postgr.es/m/16663-fe97ccf9932fc800%40postgresql.org
---
 src/backend/storage/smgr/md.c | 60 +++++++++++++++++++++++------------
 1 file changed, 40 insertions(+), 20 deletions(-)

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 1d4aa482cc..14a1942f8a 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -286,11 +286,39 @@ mdunlink(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 		mdunlinkfork(rnode, forkNum, isRedo);
 }
 
+/*
+ * Reduce the size of a file to zero, to free up disk space.
+ */
+static void
+do_truncate(const char *path)
+{
+	int			ret;
+	int			fd;
+
+	/* truncate(2) would be easier here, but Windows hasn't got it */
+	fd = OpenTransientFile(path, O_RDWR | PG_BINARY);
+	if (fd >= 0)
+	{
+		int			save_errno;
+
+		ret = ftruncate(fd, 0);
+		save_errno = errno;
+		CloseTransientFile(fd);
+		errno = save_errno;
+	}
+	else
+		ret = -1;
+	if (ret < 0 && errno != ENOENT)
+		ereport(WARNING,
+				(errcode_for_file_access(),
+				 errmsg("could not truncate file \"%s\": %m", path)));
+}
+
 static void
 mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 {
 	char	   *path;
-	int			ret;
+	int			ret = 0;
 
 	path = relpath(rnode, forkNum);
 
@@ -303,6 +331,10 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 		if (!RelFileNodeBackendIsTemp(rnode))
 			register_forget_request(rnode, forkNum, 0 /* first seg */ );
 
+		/* Prevent other backends' fds from holding on to the disk space */
+		if (!RelFileNodeBackendIsTemp(rnode))
+			do_truncate(path);
+
 		/* Next unlink the file */
 		ret = unlink(path);
 		if (ret < 0 && errno != ENOENT)
@@ -312,25 +344,8 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 	}
 	else
 	{
-		/* truncate(2) would be easier here, but Windows hasn't got it */
-		int			fd;
-
-		fd = OpenTransientFile(path, O_RDWR | PG_BINARY);
-		if (fd >= 0)
-		{
-			int			save_errno;
-
-			ret = ftruncate(fd, 0);
-			save_errno = errno;
-			CloseTransientFile(fd);
-			errno = save_errno;
-		}
-		else
-			ret = -1;
-		if (ret < 0 && errno != ENOENT)
-			ereport(WARNING,
-					(errcode_for_file_access(),
-					 errmsg("could not truncate file \"%s\": %m", path)));
+		/* Prevent other backends' fds from holding on to the disk space */
+		do_truncate(path);
 
 		/* Register request to unlink first segment later */
 		register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
@@ -358,6 +373,11 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 				register_forget_request(rnode, forkNum, segno);
 
 			sprintf(segpath, "%s.%u", path, segno);
+
+			/* Prevent other backends' fds from holding on to the disk space */
+			if (!RelFileNodeBackendIsTemp(rnode))
+				do_truncate(segpath);
+
 			if (unlink(segpath) < 0)
 			{
 				/* ENOENT is expected after the last segment... */
-- 
2.20.1

#8Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Thomas Munro (#7)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

Ouch. You beat me to it.

At Thu, 15 Oct 2020 14:26:36 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in

On Thu, Oct 15, 2020 at 8:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

On Wed, Oct 14, 2020 at 5:35 PM Andres Freund <andres@anarazel.de> wrote:

I think we should consider either occasionally sending a sinval catchup
interrupt to backends that have been idle for a while, or to use a timer
that we use to limit the maximum time until we process sinvals. Just
having to wait till all backends become busy and process sinval events
doesn't really seem like good approach to me.

Oops, I also replied to this but now I see that I accidentally replied
only to Horiguchi-san and not the list! I was thinking that we should
perhaps consider truncating the files to give back the disk space (as
we do for the first segment), so that it doesn't matter so much how
long other backends take to process SHAREDINVALSMGR_ID, close their
descriptors and release the inode.

+1, I was also thinking that. It'd be pretty easy to fit into the
existing system structure (I think, without having looked at the relevant
code lately), and it would not add any overhead to normal processing.
Installing a timeout to handle this per Andres' idea inevitably *would*
add overhead.

Alright, here is a first swing at making our behaviour more consistent
in two ways:

1. The first segment should be truncated even in recovery.
2. Later segments should be truncated on commit.

I don't know why the existing coding decides not to try to unlink the
later segments if the truncate of segment 0 failed. We already
committed, we should plough on.

I was trying the almost the same thing except how to emit the error
message for truncation and not trying to unlink if truncation ends
with ENOENT for following segments.

regareds.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#9denni.pat
denni.pat@gmail.com
In reply to: Thomas Munro (#7)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

thanks for the patch.
Do you think it can be included in the next minor releases or the only
solution will be to recompile?
regards
Denis

--
Sent from: https://www.postgresql-archive.org/PostgreSQL-bugs-f2117394.html

#10Thomas Munro
thomas.munro@gmail.com
In reply to: denni.pat (#9)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

On Thu, Oct 15, 2020 at 8:20 PM denni.pat <denni.pat@gmail.com> wrote:

thanks for the patch.
Do you think it can be included in the next minor releases or the only
solution will be to recompile?

I would vote +1 for back-patching a fix for this problem (that is,
pushing it into the minor releases), because I agree that it's very
arguably a bug that we treat the segments differently, and looking
around I do see reports of people having to terminate processes to get
their disk space back. I'd definitely want a consensus on that plan
from some experienced reviewers and testers, though. For anyone
wanting to test this, you might want to set RELSEGSIZE to a smaller
number in src/include/pg_config.h.

#11Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Thomas Munro (#10)
1 attachment(s)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

Thomas,
I get into the patch and I think it's worth being committed and
backpatched.
BTW I noticed that sometimes the same comparisons are done twice, and I
made a very minor refactor of the code. PFA v2 of a patch if you don't mind.
As for the question on what to do with the additional segments if the first
one failed to be truncated, I don't consider myself experienced enough and
surely someone else's independent opinion is very much welcome.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com&gt;

Attachments:

v2-0001-Free-disk-space-for-dropped-relations-on-commit.patchapplication/octet-stream; name=v2-0001-Free-disk-space-for-dropped-relations-on-commit.patchDownload
From e9d611fb3f45cdd9641fe6e73bb4fa7f7f08aad2 Mon Sep 17 00:00:00 2001
From: Pavel Borisov <pashkin.elfe@gmail.com>
Date: Tue, 10 Nov 2020 13:39:00 +0400
Subject: [PATCH v2] Free disk space for dropped relations on commit.

When committing a transaction that dropped a relation, we previously
truncated only the first segment file to free up disk space, because it
won't be unlinked until the next checkpoint.

Truncate later segments too, even though we unlink them on commit.  This
frees the disk space immediately, even if other backends have open file
descriptors and might take a long time to get around to handling shared
invalidation events and closing them.

Also extend the same behavior to the first segment in recovery.

Author: Thomas Munro
Bug: #16663
Reported-by: Denis Patron <denis.patron@previnet.it>
Discussion: https://postgr.es/m/16663-fe97ccf9932fc800%40postgresql.org
---
 src/backend/storage/smgr/md.c | 74 +++++++++++++++++++++++------------
 1 file changed, 48 insertions(+), 26 deletions(-)

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 1d4aa482cc..785ee7ef1e 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -286,11 +286,39 @@ mdunlink(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 		mdunlinkfork(rnode, forkNum, isRedo);
 }
 
+/*
+ * Reduce the size of a file to zero, to free up disk space.
+ */
+static void
+do_truncate(const char *path)
+{
+	int			ret;
+	int			fd;
+
+	/* truncate(2) would be easier here, but Windows hasn't got it */
+	fd = OpenTransientFile(path, O_RDWR | PG_BINARY);
+	if (fd >= 0)
+	{
+		int			save_errno;
+
+		ret = ftruncate(fd, 0);
+		save_errno = errno;
+		CloseTransientFile(fd);
+		errno = save_errno;
+	}
+	else
+		ret = -1;
+	if (ret < 0 && errno != ENOENT)
+		ereport(WARNING,
+				(errcode_for_file_access(),
+				 errmsg("could not truncate file \"%s\": %m", path)));
+}
+
 static void
 mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 {
 	char	   *path;
-	int			ret;
+	int			ret = 0;
 
 	path = relpath(rnode, forkNum);
 
@@ -299,10 +327,15 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 	 */
 	if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
 	{
-		/* First, forget any pending sync requests for the first segment */
 		if (!RelFileNodeBackendIsTemp(rnode))
+		{
+			/* First, forget any pending sync requests for the first segment */
 			register_forget_request(rnode, forkNum, 0 /* first seg */ );
 
+			/* Prevent other backends' fds from holding on to the disk space */
+			do_truncate(path);
+		}
+
 		/* Next unlink the file */
 		ret = unlink(path);
 		if (ret < 0 && errno != ENOENT)
@@ -312,25 +345,8 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 	}
 	else
 	{
-		/* truncate(2) would be easier here, but Windows hasn't got it */
-		int			fd;
-
-		fd = OpenTransientFile(path, O_RDWR | PG_BINARY);
-		if (fd >= 0)
-		{
-			int			save_errno;
-
-			ret = ftruncate(fd, 0);
-			save_errno = errno;
-			CloseTransientFile(fd);
-			errno = save_errno;
-		}
-		else
-			ret = -1;
-		if (ret < 0 && errno != ENOENT)
-			ereport(WARNING,
-					(errcode_for_file_access(),
-					 errmsg("could not truncate file \"%s\": %m", path)));
+		/* Prevent other backends' fds from holding on to the disk space */
+		do_truncate(path);
 
 		/* Register request to unlink first segment later */
 		register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
@@ -350,14 +366,20 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 		 */
 		for (segno = 1;; segno++)
 		{
-			/*
-			 * Forget any pending sync requests for this segment before we try
-			 * to unlink.
-			 */
+			sprintf(segpath, "%s.%u", path, segno);
+
 			if (!RelFileNodeBackendIsTemp(rnode))
+			{
+				/*
+				 * Forget any pending sync requests for this segment before we try
+				 * to unlink.
+				 */
 				register_forget_request(rnode, forkNum, segno);
 
-			sprintf(segpath, "%s.%u", path, segno);
+				/* Prevent other backends' fds from holding on to the disk space */
+				do_truncate(segpath);
+			}
+
 			if (unlink(segpath) < 0)
 			{
 				/* ENOENT is expected after the last segment... */
-- 
2.28.0

#12Neil Chen
carpenter.nail.cz@gmail.com
In reply to: Pavel Borisov (#11)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

Hi, I have tested the feature and it worked well.
One thing that doesn't matter is that the modify here seems unnecessary, right?

Show quoted text

mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
{
char *path;
- int ret;
+ int ret = 0;
path = relpath(rnode, forkNum

#13Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Neil Chen (#12)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

One thing that doesn't matter is that the modify here seems unnecessary,
right?

mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
{
char *path;
- int ret;
+ int ret = 0;
path = relpath(rnode, forkNum

I suppose it is indeed necessary as otherwise the result of the comparison
is not defined in case of 'else' block in the mdunlinkfork() :
346 else
347 {
348 /* Prevent other backends' fds from holding on to the disk
space */
349 do_truncate(path);
.....
356 * Delete any additional segments.
357 */
358 if (ret >= 0)
----------^^^^^^^

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com&gt;

#14Nail Carpenter
carpenter.nail.cz@gmail.com
In reply to: Pavel Borisov (#13)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

Yes, It's my fault. You're right.

Pavel Borisov <pashkin.elfe@gmail.com> 于2020年11月19日周四 下午11:55写道:

One thing that doesn't matter is that the modify here seems unnecessary,

right?

mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
{
char *path;
- int ret;
+ int ret = 0;
path = relpath(rnode, forkNum

I suppose it is indeed necessary as otherwise the result of the comparison
is not defined in case of 'else' block in the mdunlinkfork() :
346 else
347 {
348 /* Prevent other backends' fds from holding on to the disk
space */
349 do_truncate(path);
.....
356 * Delete any additional segments.
357 */
358 if (ret >= 0)
----------^^^^^^^

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com&gt;

So in the present logic, *ret* is always 0 if it is not in recovery mode
(and other *if* conditions are not satisfied). But when the *if* condition
is satisfied, it is possible to skip the deletion of additional segments.
Considering that our goal is to always try to unlink additional segments,
*ret* seems unnecessary here. The code flow looks like:

if (isRedo || .....)
{
int ret; /* move to here */
....
}
else
{ }

/* Delete any additional segments. */
if (true)
...

Or is there any reason to allow us to skip the attempt to delete additional
segments in recovery mode?

#15David Zhang
david.zhang@highgo.ca
In reply to: Pavel Borisov (#11)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

I verified the patch "v2-0001-Free-disk-space-for-dropped-relations-on-commit.patch" on master branch "0cc99327888840f2bf572303b68438e4caf62de9". It works for me. Below is my test procedure and results.

=== Before the patch ===
#1 from psql console 1, create table and index then insert enough data
postgres=# CREATE TABLE test_tbl ( a int, b text);
postgres=# CREATE INDEX idx_test_tbl on test_tbl (a);
postgres=# INSERT INTO test_tbl SELECT generate_series(1,80000000),'Hello world!';
postgres=# INSERT INTO test_tbl SELECT generate_series(1,80000000),'Hello world!';

#2 check files size
david:12867$ du -h
12G .

#3 from psql console 2, drop the index
postgres=# drop index idx_test_tbl;

#4 check files size in different ways,
david:12867$ du -h
7.8G .
david:12867$ ls -l
...
-rw------- 1 david david 0 Nov 23 20:07 16402
...

$ lsof -nP | grep '(deleted)' |grep pgdata
...
postgres 25736 david 45u REG 259,2 0 12592758 /home/david/sandbox/postgres/pgdata/base/12867/16402 (deleted)
postgres 25736 david 49u REG 259,2 1073741824 12592798 /home/david/sandbox/postgres/pgdata/base/12867/16402.1 (deleted)
postgres 25736 david 53u REG 259,2 1073741824 12592739 /home/david/sandbox/postgres/pgdata/base/12867/16402.2 (deleted)
postgres 25736 david 59u REG 259,2 372604928 12592800 /home/david/sandbox/postgres/pgdata/base/12867/16402.3 (deleted)
...

The index relnode id "16402" displays size "0" from postgres database folder, but when using lsof to check, all 16402.x are still in used by a psql connection except 16402 is set to 0. Check it again after an hour, lsof shows the same results.

=== After the patch ===
Repeat step 1 ~ 4, lsof shows all the index relnode files (in this case, the index relnode id 16389) are removed within about 1 minute.
$ lsof -nP | grep '(deleted)' |grep pgdata
...
postgres 32707 david 66u REG 259,2 0 12592763 /home/david/sandbox/postgres/pgdata/base/12867/16389.1 (deleted)
postgres 32707 david 70u REG 259,2 0 12592823 /home/david/sandbox/postgres/pgdata/base/12867/16389.2 (deleted)
postgres 32707 david 74u REG 259,2 0 12592805 /home/david/sandbox/postgres/pgdata/base/12867/16389.3 (deleted)
...

One thing interesting for me is that, if the index is created after data records has been inserted, then lsof doesn't show this issue.

#16Pavel Borisov
pashkin.elfe@gmail.com
In reply to: David Zhang (#15)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

Given we got two other reviews from Neil and David, I think I can finalize my own review and mark the patch as ready for committer if nobody has objections.
Thank you!

Pavel Borisov

The new status of this patch is: Ready for Committer

#17Thomas Munro
thomas.munro@gmail.com
In reply to: Pavel Borisov (#16)
3 attachment(s)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

On Wed, Nov 25, 2020 at 8:00 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:

The new status of this patch is: Ready for Committer

Thanks! One small thing bothered me about the last version of the
patch. It tried to unlink when ENOENT had already been enountered by
open(2), so COMMIT of a DROP looks like:

openat(AT_FDCWD, "base/14208/16384", O_RDWR) = 54
ftruncate(54, 0) = 0
close(54) = 0
openat(AT_FDCWD, "base/14208/16384.1", O_RDWR) = -1 ENOENT
unlink("base/14208/16384.1") = -1 ENOENT
openat(AT_FDCWD, "base/14208/16384_fsm", O_RDWR) = -1 ENOENT
unlink("base/14208/16384_fsm") = -1 ENOENT
openat(AT_FDCWD, "base/14208/16384_vm", O_RDWR) = -1 ENOENT
unlink("base/14208/16384_vm") = -1 ENOENT
openat(AT_FDCWD, "base/14208/16384_init", O_RDWR) = -1 ENOENT
unlink("base/14208/16384_init") = -1 ENOENT

So I fixed that, by adding a return value to do_truncate() and
checking it. That's the version I plan to commit tomorrow, unless
there are further comments or objections. I've also attached a
version suitable for REL_11_STABLE and earlier branches (with a name
that cfbot should ignore), where things are slightly different. In
those branches, the register_forget_request() logic is elsewhere.

While looking at trace output, I figured we should just use
truncate(2) on non-Windows, on the master branch only. It's not like
it really makes much difference, but I don't see why we shouldn't
allow ourselves to use ancient standardised Unix syscalls when we can.
That'd get us down to just the following when committing a DROP:

truncate("base/14208/16396", 0) = 0
truncate("base/14208/16396.1", 0) = -1 ENOENT
truncate("base/14208/16396_fsm", 0) = -1 ENOENT
truncate("base/14208/16396_vm", 0) = -1 ENOENT
truncate("base/14208/16396_init", 0) = -1 ENOENT

Attachments:

v3-0001-Free-disk-space-for-dropped-relations-on-commit.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Free-disk-space-for-dropped-relations-on-commit.patchDownload
From 8c2c189ce0c2aa25d0c3f414034b939abe4eb4ee Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Mon, 30 Nov 2020 11:33:38 +1300
Subject: [PATCH v3] Free disk space for dropped relations on commit.

When committing a transaction that dropped a relation, we previously
truncated only the first segment file to free up disk space (the one
that won't be unlinked until the next checkpoint).

Truncate higher numbered segments too, even though we unlink them on
commit.  This frees the disk space immediately, even if other backends
have open file descriptors and might take a long time to get around to
handling shared invalidation events and closing them.  Also extend the
same behavior to the first segment, in recovery.

Back-patch to all supported releases.

Bug: #16663
Reported-by: Denis Patron <denis.patron@previnet.it>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Neil Chen <carpenter.nail.cz@gmail.com>
Reviewed-by: David Zhang <david.zhang@highgo.ca>
Discussion: https://postgr.es/m/16663-fe97ccf9932fc800%40postgresql.org
---
 src/backend/storage/smgr/md.c | 100 +++++++++++++++++++++++-----------
 1 file changed, 69 insertions(+), 31 deletions(-)

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 1d4aa482cc..c697af00d9 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -286,6 +286,41 @@ mdunlink(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 		mdunlinkfork(rnode, forkNum, isRedo);
 }
 
+/*
+ * Truncate a file to release disk space.
+ */
+static int
+do_truncate(const char *path)
+{
+	int			save_errno;
+	int			ret;
+	int			fd;
+
+	/* truncate(2) would be easier here, but Windows hasn't got it */
+	fd = OpenTransientFile(path, O_RDWR | PG_BINARY);
+	if (fd >= 0)
+	{
+		ret = ftruncate(fd, 0);
+		save_errno = errno;
+		CloseTransientFile(fd);
+		errno = save_errno;
+	}
+	else
+		ret = -1;
+
+	/* Log a warning here to avoid repetition in callers. */
+	if (ret < 0 && errno != ENOENT)
+	{
+		save_errno = errno;
+		ereport(WARNING,
+				(errcode_for_file_access(),
+				 errmsg("could not truncate file \"%s\": %m", path)));
+		errno = save_errno;
+	}
+
+	return ret;
+}
+
 static void
 mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 {
@@ -299,38 +334,31 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 	 */
 	if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
 	{
-		/* First, forget any pending sync requests for the first segment */
 		if (!RelFileNodeBackendIsTemp(rnode))
+		{
+			/* Prevent other backends' fds from holding on to the disk space */
+			ret = do_truncate(path);
+
+			/* Forget any pending sync requests for the first segment */
 			register_forget_request(rnode, forkNum, 0 /* first seg */ );
+		}
+		else
+			ret = 0;
 
-		/* Next unlink the file */
-		ret = unlink(path);
-		if (ret < 0 && errno != ENOENT)
-			ereport(WARNING,
-					(errcode_for_file_access(),
-					 errmsg("could not remove file \"%s\": %m", path)));
+		/* Next unlink the file, unless it was already found to be missing */
+		if (ret == 0 || errno != ENOENT)
+		{
+			ret = unlink(path);
+			if (ret < 0 && errno != ENOENT)
+				ereport(WARNING,
+						(errcode_for_file_access(),
+						 errmsg("could not remove file \"%s\": %m", path)));
+		}
 	}
 	else
 	{
-		/* truncate(2) would be easier here, but Windows hasn't got it */
-		int			fd;
-
-		fd = OpenTransientFile(path, O_RDWR | PG_BINARY);
-		if (fd >= 0)
-		{
-			int			save_errno;
-
-			ret = ftruncate(fd, 0);
-			save_errno = errno;
-			CloseTransientFile(fd);
-			errno = save_errno;
-		}
-		else
-			ret = -1;
-		if (ret < 0 && errno != ENOENT)
-			ereport(WARNING,
-					(errcode_for_file_access(),
-					 errmsg("could not truncate file \"%s\": %m", path)));
+		/* Prevent other backends' fds from holding on to the disk space */
+		ret = do_truncate(path);
 
 		/* Register request to unlink first segment later */
 		register_unlink_segment(rnode, forkNum, 0 /* first seg */ );
@@ -350,14 +378,24 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 		 */
 		for (segno = 1;; segno++)
 		{
-			/*
-			 * Forget any pending sync requests for this segment before we try
-			 * to unlink.
-			 */
+			sprintf(segpath, "%s.%u", path, segno);
+
 			if (!RelFileNodeBackendIsTemp(rnode))
+			{
+				/*
+				 * Prevent other backends' fds from holding on to the disk
+				 * space.
+				 */
+				if (do_truncate(segpath) < 0 && errno == ENOENT)
+					break;
+
+				/*
+				 * Forget any pending sync requests for this segment before we
+				 * try to unlink.
+				 */
 				register_forget_request(rnode, forkNum, segno);
+			}
 
-			sprintf(segpath, "%s.%u", path, segno);
 			if (unlink(segpath) < 0)
 			{
 				/* ENOENT is expected after the last segment... */
-- 
2.20.1

v3-0001-Free-disk-space-for-dropped-relations-on-commit.patch-REL_11_STABLEapplication/octet-stream; name=v3-0001-Free-disk-space-for-dropped-relations-on-commit.patch-REL_11_STABLEDownload
From 613950726dd7c672db9b7866e34ee2d63183d788 Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Mon, 30 Nov 2020 11:33:38 +1300
Subject: [PATCH] Free disk space for dropped relations on commit.

When committing a transaction that dropped a relation, we previously
truncated only the first segment file to free up disk space (the one
that won't be unlinked until the next checkpoint).

Truncate higher numbered segments too, even though we unlink them on
commit.  This frees the disk space immediately, even if other backends
have open file descriptors and might take a long time to get around to
handling shared invalidation events and closing them.  Also extend the
same behavior to the first segment, in recovery.

Back-patch to all supported releases.

Bug: #16663
Reported-by: Denis Patron <denis.patron@previnet.it>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Neil Chen <carpenter.nail.cz@gmail.com>
Reviewed-by: David Zhang <david.zhang@highgo.ca>
Discussion: https://postgr.es/m/16663-fe97ccf9932fc800%40postgresql.org
---
 src/backend/storage/smgr/md.c | 89 +++++++++++++++++++++++++----------
 1 file changed, 65 insertions(+), 24 deletions(-)

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 200cc7f657..424dfef6dd 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -406,6 +406,41 @@ mdunlink(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 		mdunlinkfork(rnode, forkNum, isRedo);
 }
 
+/*
+ * Truncate a file to release disk space.
+ */
+static int
+do_truncate(const char *path)
+{
+	int			save_errno;
+	int			ret;
+	int			fd;
+
+	/* truncate(2) would be easier here, but Windows hasn't got it */
+	fd = OpenTransientFile(path, O_RDWR | PG_BINARY);
+	if (fd >= 0)
+	{
+		ret = ftruncate(fd, 0);
+		save_errno = errno;
+		CloseTransientFile(fd);
+		errno = save_errno;
+	}
+	else
+		ret = -1;
+
+	/* Log a warning here to avoid repetition in callers. */
+	if (ret < 0 && errno != ENOENT)
+	{
+		save_errno = errno;
+		ereport(WARNING,
+				(errcode_for_file_access(),
+				 errmsg("could not truncate file \"%s\": %m", path)));
+		errno = save_errno;
+	}
+
+	return ret;
+}
+
 static void
 mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 {
@@ -419,33 +454,28 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 	 */
 	if (isRedo || forkNum != MAIN_FORKNUM || RelFileNodeBackendIsTemp(rnode))
 	{
-		ret = unlink(path);
-		if (ret < 0 && errno != ENOENT)
-			ereport(WARNING,
-					(errcode_for_file_access(),
-					 errmsg("could not remove file \"%s\": %m", path)));
-	}
-	else
-	{
-		/* truncate(2) would be easier here, but Windows hasn't got it */
-		int			fd;
-
-		fd = OpenTransientFile(path, O_RDWR | PG_BINARY);
-		if (fd >= 0)
+		if (!RelFileNodeBackendIsTemp(rnode))
 		{
-			int			save_errno;
-
-			ret = ftruncate(fd, 0);
-			save_errno = errno;
-			CloseTransientFile(fd);
-			errno = save_errno;
+			/* Prevent other backends' fds from holding on to the disk space */
+			ret = do_truncate(path);
 		}
 		else
-			ret = -1;
-		if (ret < 0 && errno != ENOENT)
-			ereport(WARNING,
-					(errcode_for_file_access(),
-					 errmsg("could not truncate file \"%s\": %m", path)));
+			ret = 0;
+
+		/* Next unlink the file, unless it was already found to be missing */
+		if (ret == 0 || errno != ENOENT)
+		{
+			ret = unlink(path);
+			if (ret < 0 && errno != ENOENT)
+				ereport(WARNING,
+						(errcode_for_file_access(),
+						 errmsg("could not remove file \"%s\": %m", path)));
+		}
+	}
+	else
+	{
+		/* Prevent other backends' fds from holding on to the disk space */
+		ret = do_truncate(path);
 
 		/* Register request to unlink first segment later */
 		register_unlink(rnode);
@@ -466,6 +496,17 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
 		for (segno = 1;; segno++)
 		{
 			sprintf(segpath, "%s.%u", path, segno);
+
+			if (!RelFileNodeBackendIsTemp(rnode))
+			{
+				/*
+				 * Prevent other backends' fds from holding on to the disk
+				 * space.
+				 */
+				if (do_truncate(segpath) < 0 && errno == ENOENT)
+					break;
+			}
+
 			if (unlink(segpath) < 0)
 			{
 				/* ENOENT is expected after the last segment... */
-- 
2.20.1

v3-0002-Use-truncate-2-instead-of-open-ftruncate-close.patchtext/x-patch; charset=US-ASCII; name=v3-0002-Use-truncate-2-instead-of-open-ftruncate-close.patchDownload
From 0857703de891ccd1ffb8c96ab35ae5e90929ddb4 Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Mon, 30 Nov 2020 18:32:32 +1300
Subject: [PATCH 2/2] Use truncate(2) instead of open+ftruncate+close.

When truncating files, use POSIX truncate(2).  Windows doesn't have
that, so provide a fallback.

Discussion: https://postgr.es/m/16663-fe97ccf9932fc800%40postgresql.org
---
 src/backend/storage/file/fd.c | 27 +++++++++++++++++++++++++++
 src/backend/storage/smgr/md.c | 13 +------------
 src/include/storage/fd.h      |  1 +
 3 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 05abcf72d6..046953022b 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -622,6 +622,33 @@ pg_flush_data(int fd, off_t offset, off_t nbytes)
 #endif
 }
 
+/*
+ * Truncate a file to a given length by name.
+ */
+int
+pg_truncate(const char *path, off_t length)
+{
+#ifdef WIN32
+	int		save_errno;
+	int		ret;
+	int		fd;
+
+	fd = OpenTransientFile(path, O_RDWR | PG_BINARY);
+	if (fd >= 0)
+	{
+		ret = ftruncate(fd, 0);
+		save_errno = errno;
+		CloseTransientFile(fd);
+		errno = save_errno;
+	}
+	else
+		ret = -1;
+
+	return ret;
+#else
+	return truncate(path, length);
+#endif
+}
 
 /*
  * fsync_fname -- fsync a file or directory, handling errors properly
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index c697af00d9..9889ad6ad8 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -294,19 +294,8 @@ do_truncate(const char *path)
 {
 	int			save_errno;
 	int			ret;
-	int			fd;
 
-	/* truncate(2) would be easier here, but Windows hasn't got it */
-	fd = OpenTransientFile(path, O_RDWR | PG_BINARY);
-	if (fd >= 0)
-	{
-		ret = ftruncate(fd, 0);
-		save_errno = errno;
-		CloseTransientFile(fd);
-		errno = save_errno;
-	}
-	else
-		ret = -1;
+	ret = pg_truncate(path, 0);
 
 	/* Log a warning here to avoid repetition in callers. */
 	if (ret < 0 && errno != ENOENT)
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index e209f047e8..4e1cc12e23 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -153,6 +153,7 @@ extern int	pg_fsync_no_writethrough(int fd);
 extern int	pg_fsync_writethrough(int fd);
 extern int	pg_fdatasync(int fd);
 extern void pg_flush_data(int fd, off_t offset, off_t amount);
+extern int	pg_truncate(const char *path, off_t length);
 extern void fsync_fname(const char *fname, bool isdir);
 extern int	fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel);
 extern int	durable_rename(const char *oldfile, const char *newfile, int loglevel);
-- 
2.20.1

#18Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#17)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

On Mon, Nov 30, 2020 at 6:59 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Wed, Nov 25, 2020 at 8:00 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:

The new status of this patch is: Ready for Committer

... That's the version I plan to commit tomorrow, unless
there are further comments or objections. ...

Done, and back-patched.

I thought a bit more about the fact that we fail to unlink
higher-numbered segments in certain error cases, potentially leaving
stray files behind. As far as I can see, nothing we do in this
code-path is going to be a bullet-proof solution to that problem. One
simple idea would be for the checkpointer to refuse to unlink segment
0 (thereby allowing the relfilenode to be recycled) until it has
scanned the parent directory for any related files that shouldn't be
there.

While looking at trace output, I figured we should just use
truncate(2) on non-Windows, on the master branch only. It's not like
it really makes much difference, but I don't see why we shouldn't
allow ourselves to use ancient standardised Unix syscalls when we can.

Also pushed, but only to master.

#19Michael Paquier
michael@paquier.xyz
In reply to: Thomas Munro (#17)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

On Mon, Nov 30, 2020 at 06:59:40PM +1300, Thomas Munro wrote:

So I fixed that, by adding a return value to do_truncate() and
checking it. That's the version I plan to commit tomorrow, unless
there are further comments or objections. I've also attached a
version suitable for REL_11_STABLE and earlier branches (with a name
that cfbot should ignore), where things are slightly different. In
those branches, the register_forget_request() logic is elsewhere.

Hmm. Sorry for arriving late at the party. But is that really
something suitable for a backpatch? Sure, it is not optimal to not
truncate all the segments when a transaction dropping a relation
commits, but this was not completely broken either.
--
Michael

#20Thomas Munro
thomas.munro@gmail.com
In reply to: Michael Paquier (#19)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

On Tue, Dec 1, 2020 at 3:55 PM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Nov 30, 2020 at 06:59:40PM +1300, Thomas Munro wrote:

So I fixed that, by adding a return value to do_truncate() and
checking it. That's the version I plan to commit tomorrow, unless
there are further comments or objections. I've also attached a
version suitable for REL_11_STABLE and earlier branches (with a name
that cfbot should ignore), where things are slightly different. In
those branches, the register_forget_request() logic is elsewhere.

Hmm. Sorry for arriving late at the party. But is that really
something suitable for a backpatch? Sure, it is not optimal to not
truncate all the segments when a transaction dropping a relation
commits, but this was not completely broken either.

I felt on balance it was a "bug", since it causes operational
difficulties for people and was clearly not our intended behaviour,
and I announced this intention 6 weeks ago. Of course I'll be happy
to revert it from the back-branches if that's the consensus. Any
other opinions?

#21Michael Paquier
michael@paquier.xyz
In reply to: Thomas Munro (#20)
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted

On Tue, Dec 01, 2020 at 04:06:48PM +1300, Thomas Munro wrote:

I felt on balance it was a "bug", since it causes operational
difficulties for people and was clearly not our intended behaviour,
and I announced this intention 6 weeks ago.

Oops, sorry for missing this discussion for such a long time :/

Of course I'll be happy to revert it from the back-branches if
that's the consensus. Any > other opinions?

If there are no other opinions, I am also fine to rely on your
judgment.
--
Michael