[PATCH] Free BufFile metadata in close and append paths

Started by CharSyamabout 2 months ago2 messageshackers
Jump to latest
#1CharSyam
charsyam@gmail.com

Hi, Hackers,

Continuing the storage/file audit, I found two small but consistent
metadata leaks in BufFile. BufFileClose() does not release the
pstrdup'd name on FileSet-based BufFiles, and BufFileAppend()
silently retains the source wrapper (including its BLCKSZ buffer)
even though the function semantics already say the source must not
be touched after the call.

Details and the parallel-sort motivation are in the commit message.

Thanks,
DaeMyung

---

FileSet-based BufFiles duplicate their logical name, but BufFileClose()
did not release it. Release that metadata along with the other
palloc'd state owned by the BufFile.

BufFileAppend() transfers only the underlying file handles to the
target while telling callers not to use the source again. Free the
consumed source wrapper and metadata after the transfer so repeated
appends do not retain a BLCKSZ-sized BufFile wrapper until context
reset.

This matters most for parallel sort workflows that issue many
BufFileAppend() calls in a single context: each retained source
wrapper holds a BLCKSZ-sized buffer until context reset, which can
run into hundreds of KB at high DOP.
---
src/backend/storage/file/buffile.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)

Attachments:

0001-Free-BufFile-metadata-in-close-and-append-paths.patchtext/x-patch; name=0001-Free-BufFile-metadata-in-close-and-append-paths.patchDownload+17-5
#2Michael Paquier
michael@paquier.xyz
In reply to: CharSyam (#1)
Re: [PATCH] Free BufFile metadata in close and append paths

On Thu, Apr 30, 2026 at 12:46:59AM +0900, DaeMyung Kang wrote:

diff --git a/src/backend/storage/file/buffile.c b/src/backend/storage/file/buffile.c
index c4afe4d368a..1d1efcd139b 100644
--- a/src/backend/storage/file/buffile.c
+++ b/src/backend/storage/file/buffile.c
@@ -419,8 +419,10 @@ BufFileClose(BufFile *file)
/* close and delete the underlying file(s) */
for (i = 0; i < file->numFiles; i++)
FileClose(file->files[i]);
-	/* release the buffer space */
+	/* release the buffer space and other metadata */
pfree(file->files);
+	if (file->name)
+		pfree((void *) file->name);
pfree(file);
}

Yeah, I can get behind that. We care about resources in this specific
call.

@@ -907,6 +909,7 @@ BufFileAppend(BufFile *target, BufFile *source)

Assert(source->readOnly);
Assert(!source->dirty);
+ Assert(target != source);

The addition of this assertion looks sensible.

+	/*
+	 * The underlying files now belong to target.  Free only source's wrapper
+	 * and metadata, leaving the transferred file handles open.
+	 */
+	if (source->name)
+		pfree((void *) source->name);
+	pfree(source->files);
+	pfree(source);
+
return startBlock;

However I cannot really get behind the pfree() calls you are adding
here. What if the caller cares about keeping a track of the source
data? Your assumptions are based on the sole caller of
BufFileAppend() in the tree. There could be callers outside the core
tree, in extension code.

None of this is material for v19. Could you add an entry in the
upcoming commit fest at [1]https://commitfest.postgresql.org/59/ -- Michael? You can add me as reviewer and/or
committer, so as I don't forget about it. I am writing a note down,
to not forget, but notebooks are not perfect.

[1]: https://commitfest.postgresql.org/59/ -- Michael
--
Michael