pgsql: Drop unnamed portal immediately after execution to completion
Drop unnamed portal immediately after execution to completion
Previously, unnamed portals were kept until the next Bind message or the
end of the transaction. This could cause temporary files to persist
longer than expected and make logging not reflect the actual SQL
responsible for the temporary file.
This patch changes exec_execute_message() to drop unnamed portals
immediately after execution to completion at the end of an Execute
message, making their removal more aggressive. This forces temporary
file cleanups to happen at the same time as the completion of the portal
execution, with statement logging correctly reflecting to which
statements these temporary files were attached to (see the diffs in the
TAP test updated by this commit for an idea).
The documentation is updated to describe the lifetime of unnamed
portals, and test cases are updated to verify temporary file removal and
proper statement logging after unnamed portal execution. This changes
how unnamed portals are handled in the protocol, hence no backpatch is
done.
Author: Frédéric Yhuel <frederic.yhuel@dalibo.com>
Co-Authored-by: Sami Imseih <samimseih@gmail.com>
Co-Authored-by: Mircea Cadariu <cadariu.mircea@gmail.com>
Discussion: /messages/by-id/CAA5RZ0tTrTUoEr3kDXCuKsvqYGq8OOHiBwoD-dyJocq95uEOTQ@mail.gmail.com
Branch
------
master
Details
-------
https://git.postgresql.org/pg/commitdiff/1fd981f05369340a8afa4d013a350b0b2ac6e33e
Modified Files
--------------
doc/src/sgml/protocol.sgml | 4 +--
src/backend/tcop/postgres.c | 10 +++++++
src/test/modules/test_misc/t/009_log_temp_files.pl | 33 +++++++++++-----------
3 files changed, 29 insertions(+), 18 deletions(-)
On Wed, Nov 5, 2025 at 12:43 AM Michael Paquier <michael@paquier.xyz> wrote:
Drop unnamed portal immediately after execution to completion
This patch doesn't look well-considered to me. One problem is that
it's a wire protocol change to fix a minor logging anomaly, which
seems disproportionate. Another problem is that the new portal-drop
behavior is conditional on whether XACT_FLAGS_NEEDIMMEDIATECOMMIT gets
set, which seems unprincipled. In addition to those points, I am not
entirely certain that the "here is no need for it beyond this point"
comment is correct. I mean, I think it will normally be true, but what
if the client wants to send a Describe message after-the-fact, or an
additional Execute message that will presumably return zero rows?
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Nov 5, 2025 at 12:43 AM Michael Paquier <michael@paquier.xyz> wrote:
Drop unnamed portal immediately after execution to completion
This patch doesn't look well-considered to me. One problem is that
it's a wire protocol change to fix a minor logging anomaly, which
seems disproportionate. Another problem is that the new portal-drop
behavior is conditional on whether XACT_FLAGS_NEEDIMMEDIATECOMMIT gets
set, which seems unprincipled. In addition to those points, I am not
entirely certain that the "here is no need for it beyond this point"
comment is correct. I mean, I think it will normally be true, but what
if the client wants to send a Describe message after-the-fact, or an
additional Execute message that will presumably return zero rows?
Yeah, the whole idea of changing the wire-level behavior to fix this
has been making me itch: I don't believe for a moment that it won't
cause compatibility problems. I do not have a better proposal for
fixing the problem though.
regards, tom lane
On Mon, Nov 10, 2025 at 04:28:02PM -0500, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
This patch doesn't look well-considered to me. One problem is that
it's a wire protocol change to fix a minor logging anomaly, which
seems disproportionate. Another problem is that the new portal-drop
behavior is conditional on whether XACT_FLAGS_NEEDIMMEDIATECOMMIT gets
set, which seems unprincipled. In addition to those points, I am not
entirely certain that the "here is no need for it beyond this point"
comment is correct. I mean, I think it will normally be true, but what
if the client wants to send a Describe message after-the-fact, or an
additional Execute message that will presumably return zero rows?Yeah, the whole idea of changing the wire-level behavior to fix this
has been making me itch: I don't believe for a moment that it won't
cause compatibility problems.I do not have a better proposal for fixing the problem though.
All the other solutions mentioned mean to work around the issue of the
unnamed portal still being present by maintaining the query string it
in a different context across multiple messages. And I doubt that
anybody would be thrilled by that.
If you think that we should continue to live with the protocol for
unnamed portals as-is and continue to live with the existing behavior,
meaning a revert of 1fd981f05369, that would not be the end of the
world here: we'd still have some tests that track the
past-and-currently-released behavior.
Even if strange, it works with the timing where the unnamed protocol
is dropped. Perhaps somebody would be able to come up with a approach
better than a more aggressive portal drop once completed, but I don't
quite see how much can be done as long as we manipulate the unnamed
portals this way (aka drop then with a follow-up bind, and expect the
logging to show the correct statements attached to a previous
message that we have no knowledge about). The original statement
logging did not really consider all these fancier cases with the
extended query protocol.
--
Michael
On Mon, Nov 10, 2025 at 6:31 PM Michael Paquier <michael@paquier.xyz> wrote:
All the other solutions mentioned mean to work around the issue of the
unnamed portal still being present by maintaining the query string it
in a different context across multiple messages. And I doubt that
anybody would be thrilled by that.
Why not?
I mean, I haven't studied this problem and I don't know how
complicated that would be, but it isn't obvious to me that it would be
wildly impractical. We would need to pass around pointers to the query
string, and make sure that the string itself isn't freed before all
the pointers are gone, but maybe there's a reasonable way to do that.
I would definitely rather do that than change the wire protocol. If
that isn't practical, another option might be to just suppress the
context in places where we know it can be misleading, rather than
displaying a misleading context. That would still require us to know
in which situations that the context might be misleading, which would
likely merit some careful thought, but it would at least avoid us
needing to know the query string upon which we wanted to place blame.
Of course, this approach would produce less useful output, but it
would still be better than showing wrong information. (I see that Tom
previously suggested exactly this approach.)
To me, this is a classic example of why programming with global
variables is often a bad idea. If we lived in a world where a backend
could only ever do one thing at a time, then consulting a global
variable to find out what it's currently doing would be fine, but we
have this whole system of portals and prepared statements precisely so
that a backend can do more than one thing at a time, so the right
thing to do is pass around pointers to the appropriate context object
-- a Portal, or whatever -- to all the code that needs that
information. If the existing context object, a Portal or whatever,
doesn't have a long enough lifetime to still be available at all
points where we need that information, then we should either make the
existing type of context object live longer or invent a separate kind
of context object that with a different lifespan that is suited to the
remaining problems. Given how old and crufty the Portal machinery
seems to be, the latter seems like the likely winner.
If you think that we should continue to live with the protocol for
unnamed portals as-is and continue to live with the existing behavior,
meaning a revert of 1fd981f05369, that would not be the end of the
world here: we'd still have some tests that track the
past-and-currently-released behavior.
Yes, I'd argue for a revert. I think the risk of subtle breakage is
high, and there is a good chance that it will be too late to pull this
back by the time we realize what the problem is. I also feel that it's
solving the problem in the wrong way. To me, this solution feels like
saying "doing proper bookkeeping is hard, so let's redefine the
contract with the user instead." But doing proper bookkeeping is a big
part of what good software engineering is all about. We've put an
enormous amount of energy into our error reporting and a lot of that
energy has gone into making sure that the details that would be useful
to the user are available in the parts of the code that need those
details. There's lots and lots of code already getting query strings
to be available at places where we might want to complain about them,
and very often we also pass through a parse location as well, so that
we can complain about a specific part of a query string. The fact that
the query string that shows up in this case is wrong or misleading
doesn't make that the wrong approach; it just means we haven't totally
solved the problem yet.
Even if strange, it works with the timing where the unnamed protocol
is dropped. Perhaps somebody would be able to come up with a approach
better than a more aggressive portal drop once completed, but I don't
quite see how much can be done as long as we manipulate the unnamed
portals this way (aka drop then with a follow-up bind, and expect the
logging to show the correct statements attached to a previous
message that we have no knowledge about). The original statement
logging did not really consider all these fancier cases with the
extended query protocol.
I have a really hard time with the idea that the problem here is with
when the unnamed portal is dropped. The reasoning behind the existing
wire protocol specification is a little unclear to me, but it seems
like a perfectly elegant concept: you keep the unnamed portal around
until something creates a new unnamed portal. I like that definition
because it's simply and devoid of ambiguity, so other software
interacting with PostgreSQL knows exactly what to expect. When you
start to make the rules more complicated, that makes it harder for
other software that speaks our wire protocol to be fully correct,
especially if it must deal with both an old rule and a new one
depending on the version of PostgreSQL to which it's connected. I
think you could make an argument here that the problem is with when
the temporary file cleanup is done (perhaps it should happen sooner,
when the previous query is still in context) or that the temporary
file cleanup should be passed some appropriate context so that it
knows who to blame for error or that the temporary file cleanup can't
really attribute any errors to a specific statement and thus shouldn't
mention one at all, but I don't think blaming the unnamed portal for
this is the right idea. Our portal management code seems to me to be
full of deficiencies, but the wire protocol specification itself seems
to be fine, so I think we should try to do a better job implementing
the protocol, rather than changing it.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi,
I understand the point that’s been raised. Would it be an option to
indeed revert the portal drop in postgres.c, but then append the right
query in the "temporary file: ..." log line instead? This would work at
least for me.
Attached is a POC patch (contains a layering violation, hoping it could
be avoided somehow).
Kind regards,
Mircea Cadariu
Attachments:
v1-0001-show-temp-file-creating-query.patchtext/plain; charset=UTF-8; name=v1-0001-show-temp-file-creating-query.patchDownload
From 6bd06e7cfaa4f4efb22be08ce72886324c69e566 Mon Sep 17 00:00:00 2001
From: Mircea Cadariu <cadariu.mircea@gmail.com>
Date: Wed, 12 Nov 2025 10:45:21 +0000
Subject: [PATCH v1] show temp file creating query
---
src/backend/storage/file/fd.c | 40 ++++++++++++++++---
src/backend/tcop/postgres.c | 10 -----
.../modules/test_misc/t/009_log_temp_files.pl | 16 ++++----
3 files changed, 42 insertions(+), 24 deletions(-)
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index a4ec7959f3..a5f4e97f92 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -97,6 +97,7 @@
#include "storage/aio.h"
#include "storage/fd.h"
#include "storage/ipc.h"
+#include "tcop/tcopprot.h"
#include "utils/guc.h"
#include "utils/guc_hooks.h"
#include "utils/resowner.h"
@@ -206,6 +207,7 @@ typedef struct vfd
/* NB: fileName is malloc'd, and must be free'd when closing the VFD */
int fileFlags; /* open(2) flags for (re)opening the file */
mode_t fileMode; /* mode to pass to open(2) */
+ char *temp_file_creator_query; /* creator query for this temp file, if any */
} Vfd;
/*
@@ -1482,6 +1484,11 @@ FreeVfd(File file)
free(vfdP->fileName);
vfdP->fileName = NULL;
}
+ if (vfdP->temp_file_creator_query != NULL)
+ {
+ free(vfdP->temp_file_creator_query);
+ vfdP->temp_file_creator_query = NULL;
+ }
vfdP->fdstate = 0x0;
vfdP->nextFree = VfdCache[0].nextFree;
@@ -1524,18 +1531,27 @@ FileAccess(File file)
/*
* Called whenever a temporary file is deleted to report its size.
+ * If temp_file_creator_query is non-NULL, it represents the query that created this
+ * temp file and will be logged.
*/
static void
-ReportTemporaryFileUsage(const char *path, off_t size)
+ReportTemporaryFileUsage(const char *path, off_t size, const char *temp_file_creator_query)
{
pgstat_report_tempfile(size);
if (log_temp_files >= 0)
{
if ((size / 1024) >= log_temp_files)
- ereport(LOG,
- (errmsg("temporary file: path \"%s\", size %lu",
- path, (unsigned long) size)));
+ {
+ if (temp_file_creator_query != NULL)
+ ereport(LOG,
+ (errmsg("temporary file: path \"%s\", size %lu, created due to: %s",
+ path, (unsigned long) size, temp_file_creator_query)));
+ else
+ ereport(LOG,
+ (errmsg("temporary file: path \"%s\", size %lu",
+ path, (unsigned long) size)));
+ }
}
}
@@ -1842,6 +1858,12 @@ OpenTemporaryFileInTablespace(Oid tblspcOid, bool rejectError)
tempfilepath);
}
+ /*
+ * Remember the creator query for this temp file.
+ */
+ if (file > 0 && debug_query_string != NULL)
+ VfdCache[file].temp_file_creator_query = strdup(debug_query_string);
+
return file;
}
@@ -1889,6 +1911,12 @@ PathNameCreateTemporaryFile(const char *path, bool error_on_failure)
/* Register it for automatic close. */
RegisterTemporaryFile(file);
+ /*
+ * Remember the creator query for this temp file.
+ */
+ if (debug_query_string != NULL)
+ VfdCache[file].temp_file_creator_query = strdup(debug_query_string);
+
return file;
}
@@ -1960,7 +1988,7 @@ PathNameDeleteTemporaryFile(const char *path, bool error_on_failure)
}
if (stat_errno == 0)
- ReportTemporaryFileUsage(path, filestats.st_size);
+ ReportTemporaryFileUsage(path, filestats.st_size, NULL);
else
{
errno = stat_errno;
@@ -2048,7 +2076,7 @@ FileClose(File file)
/* and last report the stat results */
if (stat_errno == 0)
- ReportTemporaryFileUsage(vfdP->fileName, filestats.st_size);
+ ReportTemporaryFileUsage(vfdP->fileName, filestats.st_size, vfdP->temp_file_creator_query);
else
{
errno = stat_errno;
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 2bd8910268..7dd75a490a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2327,16 +2327,6 @@ exec_execute_message(const char *portal_name, long max_rows)
* message. The next protocol message will start a fresh timeout.
*/
disable_statement_timeout();
-
- /*
- * We completed fetching from an unnamed portal. There is no need
- * for it beyond this point, so drop it now rather than wait for
- * the next Bind message to do this cleanup. This ensures that
- * the correct statement is logged when cleaning up temporary file
- * usage.
- */
- if (portal->name[0] == '\0')
- PortalDrop(portal, false);
}
/* Send appropriate CommandComplete to client */
diff --git a/src/test/modules/test_misc/t/009_log_temp_files.pl b/src/test/modules/test_misc/t/009_log_temp_files.pl
index 7ecd301ae2..4d88577e98 100644
--- a/src/test/modules/test_misc/t/009_log_temp_files.pl
+++ b/src/test/modules/test_misc/t/009_log_temp_files.pl
@@ -39,7 +39,7 @@ SELECT 'unnamed portal';
END;
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+SELECT a FROM foo ORDER BY a OFFSET \$1/s,
+ qr/LOG:\s+temporary file: path.*created due to: SELECT a FROM foo ORDER BY a OFFSET \$1/s,
$log_offset),
"unnamed portal");
@@ -51,7 +51,7 @@ $node->safe_psql(
SELECT a FROM foo ORDER BY a OFFSET \$1 \\bind 4991 \\g
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+SELECT a FROM foo ORDER BY a OFFSET \$1/s,
+ qr/LOG:\s+temporary file: path.*created due to: SELECT a FROM foo ORDER BY a OFFSET \$1/s,
$log_offset),
"bind and implicit transaction");
@@ -65,7 +65,7 @@ SELECT 'named portal';
END;
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+SELECT a FROM foo ORDER BY a OFFSET \$1/s,
+ qr/LOG:\s+temporary file: path.*created due to: SELECT a FROM foo ORDER BY a OFFSET \$1/s,
$log_offset),
"named portal");
@@ -79,7 +79,7 @@ SELECT 'pipelined query';
\\endpipeline
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+SELECT a FROM foo ORDER BY a OFFSET \$1/s,
+ qr/LOG:\s+temporary file: path.*created due to: SELECT a FROM foo ORDER BY a OFFSET \$1/s,
$log_offset),
"pipelined query");
@@ -91,7 +91,7 @@ SELECT a, a, a FROM foo ORDER BY a OFFSET \$1 \\parse p1
\\bind_named p1 4993 \\g
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+SELECT a, a, a FROM foo ORDER BY a OFFSET \$1/s,
+ qr/LOG:\s+temporary file: path.*created due to: SELECT a, a, a FROM foo ORDER BY a OFFSET \$1/s,
$log_offset),
"parse and bind");
@@ -104,7 +104,7 @@ SELECT a FROM foo ORDER BY a OFFSET 4994;
END;
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+SELECT a FROM foo ORDER BY a OFFSET 4994;/s,
+ qr/LOG:\s+temporary file: path.*created due to: SELECT a FROM foo ORDER BY a OFFSET 4994;/s,
$log_offset),
"simple query");
@@ -120,7 +120,7 @@ CLOSE mycur;
END;
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+CLOSE mycur;/s,
+ qr/LOG:\s+temporary file: path.*created due to: FETCH 10 FROM mycur;/s,
$log_offset),
"cursor");
@@ -135,7 +135,7 @@ DEALLOCATE p1;
END;
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+EXECUTE p1;/s,
+ qr/LOG:\s+temporary file: path.*created due to: EXECUTE p1;/s,
$log_offset),
"prepare/execute");
--
2.39.5 (Apple Git-154)
On Wed, Nov 12, 2025 at 11:14:26AM +0000, Mircea Cadariu wrote:
I understand the point that’s been raised. Would it be an option to indeed
revert the portal drop in postgres.c, but then append the right query in the
"temporary file: ..." log line instead? This would work at least for me.
This is new, attaching the information to a Vfd in fd.c. Not sure
that adding this information to this structure is a good concept.
This layer of the code has no idea of query strings currently, so that
feels a bit like a layer violation.
Thinking a bit outside the box.. I was wondering about a plan D (plan
A is what's on HEAD, plan B is copying around the query string, plan C
this Vfd approach) where we shut down the executor when we have
finished the execution of an unnamed portal, with something like the
attached based on a more aggressive PortalCleanup(). However, that
would not fly far if the client decides to send an extra execute
message post-portal-completion where we'd still want the executor to
be around, even if there are no rows to process. We presumably do not
need the temp files anymore at this stage, but I don't really like the
fact that we'd need to somehow take a shortcut if we want only to
clean up the temp files.
Perhaps the best answer for now is to revert and continue the
discussion for this cycle as there seem to be little love for the
current HEAD's solution with the protocol change.
If folks have more ideas or input, please feel free.
--
Michael
Attachments:
portal-cleanup.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 2bd89102686e..8d3739298afe 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -38,6 +38,7 @@
#include "commands/async.h"
#include "commands/event_trigger.h"
#include "commands/explain_state.h"
+#include "commands/portalcmds.h"
#include "commands/prepare.h"
#include "common/pg_prng.h"
#include "jit/jit.h"
@@ -2330,13 +2331,14 @@ exec_execute_message(const char *portal_name, long max_rows)
/*
* We completed fetching from an unnamed portal. There is no need
- * for it beyond this point, so drop it now rather than wait for
+ * for it beyond this point, so clean it now rather than wait for
* the next Bind message to do this cleanup. This ensures that
* the correct statement is logged when cleaning up temporary file
- * usage.
+ * usage with the executor shutdown.
*/
- if (portal->name[0] == '\0')
- PortalDrop(portal, false);
+ if (portal->name[0] == '\0' &&
+ portal->cleanup)
+ PortalCleanup(portal);
}
/* Send appropriate CommandComplete to client */
On 13/11/2025 06:27, Michael Paquier wrote:
This is new, attaching the information to a Vfd in fd.c. Not sure
that adding this information to this structure is a good concept.
This layer of the code has no idea of query strings currently, so that
feels a bit like a layer violation.
Thanks for having a look! FWIW I found a way for Plan C to work without
including tcoprot into fd, see attached.
There's a new field in that structure indeed, but maybe not that far
fetched, it's the query that triggered the creation of the file.
Kind regards,
Mircea Cadariu
Attachments:
v2-0001-show-temp-file-creating-query.patchtext/plain; charset=UTF-8; name=v2-0001-show-temp-file-creating-query.patchDownload
From 08e5e884a2d86e0510d29c94fdf17405a25d822c Mon Sep 17 00:00:00 2001
From: Mircea Cadariu <cadariu.mircea@gmail.com>
Date: Wed, 12 Nov 2025 10:45:21 +0000
Subject: [PATCH v1] show temp file creating query
---
src/backend/storage/file/fd.c | 39 ++++++++++++++++---
src/backend/tcop/postgres.c | 10 -----
src/include/miscadmin.h | 2 +
src/include/tcop/tcopprot.h | 1 -
.../modules/test_misc/t/009_log_temp_files.pl | 16 ++++----
5 files changed, 43 insertions(+), 25 deletions(-)
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index a4ec7959f3..f03d31fbd1 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -206,6 +206,7 @@ typedef struct vfd
/* NB: fileName is malloc'd, and must be free'd when closing the VFD */
int fileFlags; /* open(2) flags for (re)opening the file */
mode_t fileMode; /* mode to pass to open(2) */
+ char *temp_file_creator_query; /* creator query for this temp file, if any */
} Vfd;
/*
@@ -1482,6 +1483,11 @@ FreeVfd(File file)
free(vfdP->fileName);
vfdP->fileName = NULL;
}
+ if (vfdP->temp_file_creator_query != NULL)
+ {
+ free(vfdP->temp_file_creator_query);
+ vfdP->temp_file_creator_query = NULL;
+ }
vfdP->fdstate = 0x0;
vfdP->nextFree = VfdCache[0].nextFree;
@@ -1524,18 +1530,27 @@ FileAccess(File file)
/*
* Called whenever a temporary file is deleted to report its size.
+ * If temp_file_creator_query is non-NULL, it represents the query that created this
+ * temp file and will be logged.
*/
static void
-ReportTemporaryFileUsage(const char *path, off_t size)
+ReportTemporaryFileUsage(const char *path, off_t size, const char *temp_file_creator_query)
{
pgstat_report_tempfile(size);
if (log_temp_files >= 0)
{
if ((size / 1024) >= log_temp_files)
- ereport(LOG,
- (errmsg("temporary file: path \"%s\", size %lu",
- path, (unsigned long) size)));
+ {
+ if (temp_file_creator_query != NULL)
+ ereport(LOG,
+ (errmsg("temporary file: path \"%s\", size %lu, created due to: %s",
+ path, (unsigned long) size, temp_file_creator_query)));
+ else
+ ereport(LOG,
+ (errmsg("temporary file: path \"%s\", size %lu",
+ path, (unsigned long) size)));
+ }
}
}
@@ -1842,6 +1857,12 @@ OpenTemporaryFileInTablespace(Oid tblspcOid, bool rejectError)
tempfilepath);
}
+ /*
+ * Remember the creator query for this temp file.
+ */
+ if (file > 0 && debug_query_string != NULL)
+ VfdCache[file].temp_file_creator_query = strdup(debug_query_string);
+
return file;
}
@@ -1889,6 +1910,12 @@ PathNameCreateTemporaryFile(const char *path, bool error_on_failure)
/* Register it for automatic close. */
RegisterTemporaryFile(file);
+ /*
+ * Remember the creator query for this temp file.
+ */
+ if (debug_query_string != NULL)
+ VfdCache[file].temp_file_creator_query = strdup(debug_query_string);
+
return file;
}
@@ -1960,7 +1987,7 @@ PathNameDeleteTemporaryFile(const char *path, bool error_on_failure)
}
if (stat_errno == 0)
- ReportTemporaryFileUsage(path, filestats.st_size);
+ ReportTemporaryFileUsage(path, filestats.st_size, NULL);
else
{
errno = stat_errno;
@@ -2048,7 +2075,7 @@ FileClose(File file)
/* and last report the stat results */
if (stat_errno == 0)
- ReportTemporaryFileUsage(vfdP->fileName, filestats.st_size);
+ ReportTemporaryFileUsage(vfdP->fileName, filestats.st_size, vfdP->temp_file_creator_query);
else
{
errno = stat_errno;
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 2bd8910268..7dd75a490a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2327,16 +2327,6 @@ exec_execute_message(const char *portal_name, long max_rows)
* message. The next protocol message will start a fresh timeout.
*/
disable_statement_timeout();
-
- /*
- * We completed fetching from an unnamed portal. There is no need
- * for it beyond this point, so drop it now rather than wait for
- * the next Bind message to do this cleanup. This ensures that
- * the correct statement is logged when cleaning up temporary file
- * usage.
- */
- if (portal->name[0] == '\0')
- PortalDrop(portal, false);
}
/* Send appropriate CommandComplete to client */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9a7d733dde..030ad593a8 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -208,6 +208,8 @@ extern PGDLLIMPORT Oid MyDatabaseId;
extern PGDLLIMPORT Oid MyDatabaseTableSpace;
+extern PGDLLIMPORT const char *debug_query_string;
+
extern PGDLLIMPORT bool MyDatabaseHasLoginEventTriggers;
/*
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index c1bcfdec67..962eec7f9b 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -23,7 +23,6 @@
typedef struct ExplainState ExplainState; /* defined in explain_state.h */
extern PGDLLIMPORT CommandDest whereToSendOutput;
-extern PGDLLIMPORT const char *debug_query_string;
extern PGDLLIMPORT int PostAuthDelay;
extern PGDLLIMPORT int client_connection_check_interval;
diff --git a/src/test/modules/test_misc/t/009_log_temp_files.pl b/src/test/modules/test_misc/t/009_log_temp_files.pl
index 7ecd301ae2..4d88577e98 100644
--- a/src/test/modules/test_misc/t/009_log_temp_files.pl
+++ b/src/test/modules/test_misc/t/009_log_temp_files.pl
@@ -39,7 +39,7 @@ SELECT 'unnamed portal';
END;
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+SELECT a FROM foo ORDER BY a OFFSET \$1/s,
+ qr/LOG:\s+temporary file: path.*created due to: SELECT a FROM foo ORDER BY a OFFSET \$1/s,
$log_offset),
"unnamed portal");
@@ -51,7 +51,7 @@ $node->safe_psql(
SELECT a FROM foo ORDER BY a OFFSET \$1 \\bind 4991 \\g
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+SELECT a FROM foo ORDER BY a OFFSET \$1/s,
+ qr/LOG:\s+temporary file: path.*created due to: SELECT a FROM foo ORDER BY a OFFSET \$1/s,
$log_offset),
"bind and implicit transaction");
@@ -65,7 +65,7 @@ SELECT 'named portal';
END;
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+SELECT a FROM foo ORDER BY a OFFSET \$1/s,
+ qr/LOG:\s+temporary file: path.*created due to: SELECT a FROM foo ORDER BY a OFFSET \$1/s,
$log_offset),
"named portal");
@@ -79,7 +79,7 @@ SELECT 'pipelined query';
\\endpipeline
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+SELECT a FROM foo ORDER BY a OFFSET \$1/s,
+ qr/LOG:\s+temporary file: path.*created due to: SELECT a FROM foo ORDER BY a OFFSET \$1/s,
$log_offset),
"pipelined query");
@@ -91,7 +91,7 @@ SELECT a, a, a FROM foo ORDER BY a OFFSET \$1 \\parse p1
\\bind_named p1 4993 \\g
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+SELECT a, a, a FROM foo ORDER BY a OFFSET \$1/s,
+ qr/LOG:\s+temporary file: path.*created due to: SELECT a, a, a FROM foo ORDER BY a OFFSET \$1/s,
$log_offset),
"parse and bind");
@@ -104,7 +104,7 @@ SELECT a FROM foo ORDER BY a OFFSET 4994;
END;
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+SELECT a FROM foo ORDER BY a OFFSET 4994;/s,
+ qr/LOG:\s+temporary file: path.*created due to: SELECT a FROM foo ORDER BY a OFFSET 4994;/s,
$log_offset),
"simple query");
@@ -120,7 +120,7 @@ CLOSE mycur;
END;
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+CLOSE mycur;/s,
+ qr/LOG:\s+temporary file: path.*created due to: FETCH 10 FROM mycur;/s,
$log_offset),
"cursor");
@@ -135,7 +135,7 @@ DEALLOCATE p1;
END;
});
ok( $node->log_contains(
- qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+EXECUTE p1;/s,
+ qr/LOG:\s+temporary file: path.*created due to: EXECUTE p1;/s,
$log_offset),
"prepare/execute");
--
2.39.5 (Apple Git-154)
Thinking a bit outside the box.. I was wondering about a plan D (plan
A is what's on HEAD, plan B is copying around the query string, plan C
this Vfd approach) where we shut down the executor when we have
finished the execution of an unnamed portal, with something like the
attached based on a more aggressive PortalCleanup().
I am not sure I like this idea as-is, because besides that fact
that it's still a wire level change, it's not safe at all to
re-enter exec_execute_message after you just cleaned the
portal but did not drop it.
if (!PortalIsValid(portal)) will tell you the portal is still valid, but its
resources, like queryDesc and others are no longer available.
You can actually see what happens there with this handy
extended.py that communicates directly over the wire-protocol.
See the "back-to-back execute" print.
It results in
```
TRAP: failed Assert("queryDesc || portal->holdStore"), File:
"../src/backend/tcop/pquery.c", Line: 875, PID: 32358
0 postgres 0x00000001028ce52c
ExceptionalCondition + 108
1 postgres 0x00000001027a2470 PortalRunMulti + 0
2 postgres 0x00000001027a1fc0 PortalRun + 492
3 postgres 0x000000010279ff54 PostgresMain + 8896
4 postgres 0x0000000102799bbc BackendInitialize + 0
5 postgres 0x00000001026f1264
postmaster_child_launch + 364
6 postgres 0x00000001026f588c ServerLoop + 5840
7 postgres 0x00000001026f3800
InitProcessGlobals + 0
8 postgres 0x0000000102641564 help + 0
9 dyld 0x00000001930d6b98 start + 6076
```
This idea could perhaps work, but needs more thought.
Perhaps the best answer for now is to revert and continue the
discussion for this cycle as there seem to be little love for the
current HEAD's solution with the protocol change.If folks have more ideas or input, please feel free.
That is probably the best idea now.
This is new, attaching the information to a Vfd in fd.c. Not sure
that adding this information to this structure is a good concept.
This layer of the code has no idea of query strings currently, so that
feels a bit like a layer violation.
Thanks for having a look! FWIW I found a way for Plan C to work without
including tcoprot into fd, see attached.
There's a new field in that structure indeed, but maybe not that far
fetched, it's the query that triggered the creation of the file.
To Michael's point, this looks like a layering violation. Besides,
I think this will double log, although I did not test, because you log
the statement once when closing the temp and once when logging
the STATEMENT.
One thing I am still not so sure about is we currently say that things
like the query_string will out live the portal, so I am still not clear what
is the risk of copying the query_string to debug_query_string during
PortalDrop?
```
/*
* We don't have to copy anything into the portal, because everything
* we are passing here is in MessageContext or the
* per_parsetree_context, and so will outlive the portal anyway.
*/
PortalDefineQuery(portal,
NULL,
query_string,
commandTag,
plantree_list,
NULL);
```
--
Sami Imseih
Amazon Web Services (AWS)
Attachments:
On Thu, Nov 13, 2025 at 04:02:24PM -0600, Sami Imseih wrote:
I am not sure I like this idea as-is, because besides that fact
that it's still a wire level change, it's not safe at all to
re-enter exec_execute_message after you just cleaned the
portal but did not drop it.
This will come at the cost of tracking a new state in the backend
where the portal would still live with the executor state gone, I
don't like much my own plan D and the road where this leads at the
end.
That is probably the best idea now.
I have just done a revert of 1fd981f05369 now, let's continue the
discussions.
--
Michael
Hi Michael,
I think I found a gap in the tests we added previously for documenting
the current behaviour. See attached patch for your consideration.
What's interesting about the holdable cursors scenario is that as far as
I can tell the temp files are cleaned up during PersistHoldablePortal
instead of PortalDrop.
Kind regards,
Mircea Cadariu
Attachments:
v1-0001-add-holdable-cursors-test.patchtext/plain; charset=UTF-8; name=v1-0001-add-holdable-cursors-test.patchDownload
From a81f2312b895582475a733fae497594df2727dfa Mon Sep 17 00:00:00 2001
From: Mircea Cadariu <cadariu.mircea@gmail.com>
Date: Fri, 14 Nov 2025 14:46:43 +0000
Subject: [PATCH v1] add holdable cursors test
---
.../modules/test_misc/t/009_log_temp_files.pl | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/src/test/modules/test_misc/t/009_log_temp_files.pl b/src/test/modules/test_misc/t/009_log_temp_files.pl
index 462a949e41..5cac8f0122 100644
--- a/src/test/modules/test_misc/t/009_log_temp_files.pl
+++ b/src/test/modules/test_misc/t/009_log_temp_files.pl
@@ -123,12 +123,27 @@ ok( $node->log_contains(
$log_offset),
"cursor");
+note "holdable cursor: temporary file dropped during COMMIT";
+$log_offset = -s $node->logfile;
+$node->safe_psql(
+ "postgres", qq{
+BEGIN;
+DECLARE holdcur CURSOR WITH HOLD FOR SELECT a FROM foo ORDER BY a OFFSET 4996;
+FETCH 10 FROM holdcur;
+COMMIT;
+CLOSE holdcur;
+});
+ok( $node->log_contains(
+ qr/LOG:\s+temporary file: path.*\n.*\ STATEMENT:\s+COMMIT;/s,
+ $log_offset),
+ "holdable cursor");
+
note "prepare/execute: temporary file dropped under EXECUTE";
$log_offset = -s $node->logfile;
$node->safe_psql(
"postgres", qq{
BEGIN;
-PREPARE p1 AS SELECT a FROM foo ORDER BY a OFFSET 4996;
+PREPARE p1 AS SELECT a FROM foo ORDER BY a OFFSET 4997;
EXECUTE p1;
DEALLOCATE p1;
END;
--
2.39.5 (Apple Git-154)
On Fri, Nov 14, 2025 at 03:10:32PM +0000, Mircea Cadariu wrote:
What's interesting about the holdable cursors scenario is that as far as I
can tell the temp files are cleaned up during PersistHoldablePortal instead
of PortalDrop.
As part of the ExecutorEnd() done in this case. Right, it looks like
a good thing to track this behavior as well in the long-run. I'll
double-check that later.
--
Michael
On Fri, Nov 14, 2025 at 03:10:32PM +0000, Mircea Cadariu wrote:
What's interesting about the holdable cursors scenario is that as far as I
can tell the temp files are cleaned up during PersistHoldablePortal instead
of PortalDrop.
Done this one as e7cde9dad285.
--
Michael