pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed at end-of-transaction
While working on a patch, I noticed this pre-existing behavior, which seems to
be new since v11, maybe due to changes to SRF.
|postgres=# SELECT pg_ls_dir('.') LIMIT 1;
|WARNING: 1 temporary files and directories not closed at end-of-transaction
|pg_ls_dir | pg_dynshmem
|postgres=# SELECT pg_ls_waldir() LIMIT 1;
|WARNING: 1 temporary files and directories not closed at end-of-transaction
|-[ RECORD 1 ]+-------------------------------------------------------------
|pg_ls_waldir | (00000001000031920000007B,16777216,"2020-03-08 03:50:34-07")
Note, that doesn't happen with "SELECT * FROM".
I'm not sure what the solution is to that, but my patch was going to make it
worse rather than better for pg_ls_tmpdir.
--
Justin
Justin Pryzby <pryzby@telsasoft.com> writes:
While working on a patch, I noticed this pre-existing behavior, which seems to
be new since v11, maybe due to changes to SRF.
|postgres=# SELECT pg_ls_dir('.') LIMIT 1;
|WARNING: 1 temporary files and directories not closed at end-of-transaction
Hmm, actually it looks to me like pg_ls_dir has been broken forever.
The reason the warning didn't show up before v11 is that CleanupTempFiles
didn't bleat about leaked "allocated" directories before that
(cf 9cb7db3f0).
I guess we ought to change that function to use returns-a-tuplestore
protocol instead of thinking it can hold a directory open across calls.
It's not hard to think of use-cases where the existing behavior would
cause issues worse than a nanny-ish WARNING, especially on platforms
with tight "ulimit -n" limits.
regards, tom lane
On Sun, Mar 08, 2020 at 02:37:49PM -0400, Tom Lane wrote:
Justin Pryzby <pryzby@telsasoft.com> writes:
While working on a patch, I noticed this pre-existing behavior, which seems to
be new since v11, maybe due to changes to SRF.|postgres=# SELECT pg_ls_dir('.') LIMIT 1;
|WARNING: 1 temporary files and directories not closed at end-of-transactionHmm, actually it looks to me like pg_ls_dir has been broken forever.
The reason the warning didn't show up before v11 is that CleanupTempFiles
didn't bleat about leaked "allocated" directories before that
(cf 9cb7db3f0).I guess we ought to change that function to use returns-a-tuplestore
protocol instead of thinking it can hold a directory open across calls.
It's not hard to think of use-cases where the existing behavior would
cause issues worse than a nanny-ish WARNING, especially on platforms
with tight "ulimit -n" limits.
Thanks for the analysis.
Do you mean it should enumerate all files during the initial SRF call, or use
something other than the SRF_* macros ?
--
Justin
Justin Pryzby <pryzby@telsasoft.com> writes:
On Sun, Mar 08, 2020 at 02:37:49PM -0400, Tom Lane wrote:
I guess we ought to change that function to use returns-a-tuplestore
protocol instead of thinking it can hold a directory open across calls.
It's not hard to think of use-cases where the existing behavior would
cause issues worse than a nanny-ish WARNING, especially on platforms
with tight "ulimit -n" limits.
Thanks for the analysis.
Do you mean it should enumerate all files during the initial SRF call, or use
something other than the SRF_* macros ?
It has to enumerate all the files during the first call. I suppose it
could do that and then still hand back the results one-at-a-time, but
there seems little point compared to filling a tuplestore immediately.
So probably the SRF_ macros are useless here.
Another possible solution is to register an exprstate-shutdown hook to
ensure the resource is cleaned up, but I'm not very happy with that
because it does nothing to prevent the hazard of overrunning the
available resources if you have several of these active at once.
I've just finished scanning the source code and concluding that all
of these functions are similarly broken:
pg_ls_dir
pg_ls_dir_files
pg_tablespace_databases
pg_logdir_ls_internal
pg_timezone_names
pgrowlocks
The first five risk leaking an open directory, the last risks leaking
an active tablescan and open relation.
I don't see anything in the documentation (either funcapi.h or
xfunc.sgml) warning that the function might not get run to completion,
either ...
regards, tom lane
I wrote:
I've just finished scanning the source code and concluding that all
of these functions are similarly broken:
pg_ls_dir
pg_ls_dir_files
pg_tablespace_databases
pg_logdir_ls_internal
pg_timezone_names
pgrowlocks
BTW, another thing I noticed while looking around is that some of
the functions using SRF_RETURN_DONE() think they should clean up
memory beforehand. This is a waste of code/cycles, as long as the
memory was properly allocated in funcctx->multi_call_memory_ctx,
because funcapi.c takes care of deleting that context.
We should probably document that *any* manual cleanup before
SRF_RETURN_DONE() is an antipattern. If you have to have cleanup,
it needs to be done via RegisterExprContextCallback instead.
regards, tom lane
On Sun, Mar 08, 2020 at 03:40:09PM -0400, Tom Lane wrote:
Justin Pryzby <pryzby@telsasoft.com> writes:
On Sun, Mar 08, 2020 at 02:37:49PM -0400, Tom Lane wrote:
I guess we ought to change that function to use returns-a-tuplestore
protocol instead of thinking it can hold a directory open across calls.Thanks for the analysis.
Do you mean it should enumerate all files during the initial SRF call, or use
something other than the SRF_* macros ?It has to enumerate all the files during the first call. I suppose it
I've just finished scanning the source code and concluding that all
of these functions are similarly broken:
pg_ls_dir_files
I patched this one to see what it looks like and to allow /hopefully/ moving
forward one way or another with the pg_ls_tmpfile() patch set (or at least
avoid trying to do anything there which is too inconsistent with this fix).
I don't see anything in the documentation (either funcapi.h or
xfunc.sgml) warning that the function might not get run to completion,
either ...
Also, at first glance, these seem to be passing constant "randomAccess=true"
rather than (bool) (rsinfo->allowedModes&SFRM_Materialize_Random)
$ git grep -wl SFRM_Materialize |xargs grep -l 'tuplestore_begin_heap(true'
contrib/dblink/dblink.c
contrib/pageinspect/brinfuncs.c
contrib/pg_stat_statements/pg_stat_statements.c
src/backend/access/transam/xlogfuncs.c
src/backend/commands/event_trigger.c
src/backend/commands/extension.c
src/backend/foreign/foreign.c
src/backend/replication/logical/launcher.c
src/backend/replication/logical/logicalfuncs.c
src/backend/replication/logical/origin.c
src/backend/replication/slotfuncs.c
src/backend/replication/walsender.c
src/backend/storage/ipc/shmem.c
src/backend/utils/adt/pgstatfuncs.c
src/backend/utils/misc/guc.c
src/backend/utils/misc/pg_config.c
--
Justin
Attachments:
v1-0001-pg_ls_dir_files-avoid-leaking-DIR-if-not-run-to-c.patchtext/x-diff; charset=us-asciiDownload
From 1fd2918d65ed8cc64158e407e56ed61b44e951db Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue, 10 Mar 2020 21:01:47 -0500
Subject: [PATCH v1] pg_ls_dir_files: avoid leaking DIR if not run to
completion
Change to return a tuplestore rather than leaving directory open across calls.
Discussion: https://www.postgresql.org/message-id/20200308173103.GC1357%40telsasoft.com
See also: 9cb7db3f0
---
src/backend/utils/adt/genfile.c | 90 +++++++++++++++------------------
1 file changed, 42 insertions(+), 48 deletions(-)
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index 5bda2af87c..046d218ffa 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -527,67 +527,63 @@ pg_ls_dir_1arg(PG_FUNCTION_ARGS)
static Datum
pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
{
- FuncCallContext *funcctx;
struct dirent *de;
- directory_fctx *fctx;
-
- if (SRF_IS_FIRSTCALL())
- {
- MemoryContext oldcontext;
- TupleDesc tupdesc;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ bool randomAccess = (rsinfo->allowedModes&SFRM_Materialize_Random) != 0;
+ DIR *dirdesc;
+ Tuplestorestate *tupstore;
- funcctx = SRF_FIRSTCALL_INIT();
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ /* check to see if caller supports us returning a tuplestore */
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("materialize mode required, but it is not "
+ "allowed in this context")));
- fctx = palloc(sizeof(directory_fctx));
+ /* The Tuplestore and TupleDesc should be created in ecxt_per_query_memory */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
- tupdesc = CreateTemplateTupleDesc(3);
- TupleDescInitEntry(tupdesc, (AttrNumber) 1, "name",
- TEXTOID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 2, "size",
- INT8OID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 3, "modification",
- TIMESTAMPTZOID, -1, 0);
- funcctx->tuple_desc = BlessTupleDesc(tupdesc);
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
- fctx->location = pstrdup(dir);
- fctx->dirdesc = AllocateDir(fctx->location);
+ tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
- if (!fctx->dirdesc)
- {
- if (missing_ok && errno == ENOENT)
- {
- MemoryContextSwitchTo(oldcontext);
- SRF_RETURN_DONE(funcctx);
- }
- else
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not open directory \"%s\": %m",
- fctx->location)));
- }
+ MemoryContextSwitchTo(oldcontext);
- funcctx->user_fctx = fctx;
- MemoryContextSwitchTo(oldcontext);
+ dirdesc = AllocateDir(dir);
+ if (!dirdesc)
+ {
+ if (missing_ok && errno == ENOENT)
+ return NULL; // XXX
+ else
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open directory \"%s\": %m",
+ dir)));
}
- funcctx = SRF_PERCALL_SETUP();
- fctx = (directory_fctx *) funcctx->user_fctx;
-
- while ((de = ReadDir(fctx->dirdesc, fctx->location)) != NULL)
+ while ((de = ReadDir(dirdesc, dir)) != NULL)
{
Datum values[3];
- bool nulls[3];
+ bool nulls[3] = {0};
char path[MAXPGPATH * 2];
struct stat attrib;
- HeapTuple tuple;
/* Skip hidden files */
if (de->d_name[0] == '.')
continue;
/* Get the file info */
- snprintf(path, sizeof(path), "%s/%s", fctx->location, de->d_name);
+ snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
if (stat(path, &attrib) < 0)
ereport(ERROR,
(errcode_for_file_access(),
@@ -600,14 +596,12 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
values[0] = CStringGetTextDatum(de->d_name);
values[1] = Int64GetDatum((int64) attrib.st_size);
values[2] = TimestampTzGetDatum(time_t_to_timestamptz(attrib.st_mtime));
- memset(nulls, 0, sizeof(nulls));
-
- tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
- SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
}
- FreeDir(fctx->dirdesc);
- SRF_RETURN_DONE(funcctx);
+ tuplestore_donestoring(tupstore);
+ FreeDir(dirdesc);
+ return NULL;
}
/* Function to return the list of files in the log directory */
--
2.17.0
Justin Pryzby <pryzby@telsasoft.com> writes:
On Sun, Mar 08, 2020 at 02:37:49PM -0400, Tom Lane wrote:
I guess we ought to change that function to use returns-a-tuplestore
protocol instead of thinking it can hold a directory open across calls.
I patched this one to see what it looks like and to allow /hopefully/ moving
forward one way or another with the pg_ls_tmpfile() patch set (or at least
avoid trying to do anything there which is too inconsistent with this fix).
I reviewed this, added some test cases, and pushed it, so that we can see
if the buildfarm finds anything wrong. (I'm not expecting that, because
this should all be pretty portable, but you never know.) Assuming not,
we need to fix the other functions similarly, and then do something about
revising the documentation to warn against this coding style. Do you
want to have a go at that?
Also, at first glance, these seem to be passing constant "randomAccess=true"
rather than (bool) (rsinfo->allowedModes&SFRM_Materialize_Random)
Hm. Not a bug, but possibly a performance issue, if the tuplestore
gets big enough for that to matter. (I think it doesn't matter until
we start spilling to temp files.)
regards, tom lane
On Wed, Mar 11, 2020 at 03:32:38PM -0400, Tom Lane wrote:
I patched this one to see what it looks like and to allow /hopefully/ moving
forward one way or another with the pg_ls_tmpfile() patch set (or at least
avoid trying to do anything there which is too inconsistent with this fix).I reviewed this, added some test cases, and pushed it, so that we can see
Thanks, tests were on my periphery..
| In passing, fix bogus error report for stat() failure: it was
| whining about the directory when it should be fingering the
| individual file. Doubtless a copy-and-paste error.
Thanks again ; that was my 0001 patch on the other thread. No rebase conflict
even ;)
/messages/by-id/20191228101650.GG12890@telsasoft.com
Do you want to have a go at that?
First draft attached. Note that I handled pg_ls_dir, even though I'm proposing
on the other thread to collapse/merge/meld it with pg_ls_dir_files [0]/messages/by-id/20200310183037.GA29065@telsasoft.com v9-0008-generalize-pg_ls_dir_files-and-retire-pg_ls_dir.patch.
Possibly that's a bad idea with tuplestore, due to returning a scalar vs a row
and needing to conditionally call CreateTemplateTupleDesc vs
get_call_result_type. I'll rebase that patch later today.
I didn't write test cases yet. Also didn't look for functions not on your
list.
I noticed this doesn't actually do anything, but kept it for now...except in
pg_ls_dir error case:
src/include/utils/tuplestore.h:/* tuplestore_donestoring() used to be required, but is no longer used */
src/include/utils/tuplestore.h:#define tuplestore_donestoring(state) ((void) 0)
I found a few documentation bits that I think aren't relevant, but could
possibly be misread to encourage the bad coding practice. This is about *sql*
functions:
|37.5.8. SQL Functions Returning Sets
|When an SQL function is declared as returning SETOF sometype, the function's
|final query is executed TO COMPLETION, and each row it outputs is returned as
|an element of the result set.
|...
|Set-returning functions in the select list are always evaluated as though they
|are on the inside of a nested-loop join with the rest of the FROM clause, so
|that the function(s) are run TO COMPLETION before the next row from the FROM
|clause is considered.
--
Justin
[0]: /messages/by-id/20200310183037.GA29065@telsasoft.com v9-0008-generalize-pg_ls_dir_files-and-retire-pg_ls_dir.patch
v9-0008-generalize-pg_ls_dir_files-and-retire-pg_ls_dir.patch
Attachments:
0001-SRF-avoid-leaking-resources-if-not-run-to-completion.patchtext/x-diff; charset=us-asciiDownload
From 43e7e5a9b679a4172808b248df2bc3365b6336e4 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Wed, 11 Mar 2020 10:09:18 -0500
Subject: [PATCH] SRF: avoid leaking resources if not run to completion
Change to return a tuplestore populated immediately and returned in full.
Discussion: https://www.postgresql.org/message-id/20200308173103.GC1357%40telsasoft.com
See also: 9cb7db3f0 and 085b6b66
---
contrib/adminpack/adminpack.c | 83 +++++++++-------
contrib/pgrowlocks/pgrowlocks.c | 163 ++++++++++++++-----------------
doc/src/sgml/xfunc.sgml | 17 +++-
src/backend/utils/adt/datetime.c | 100 ++++++++-----------
src/backend/utils/adt/genfile.c | 112 +++++++++++----------
src/backend/utils/adt/misc.c | 117 ++++++++++++----------
src/include/funcapi.h | 7 +-
7 files changed, 306 insertions(+), 293 deletions(-)
diff --git a/contrib/adminpack/adminpack.c b/contrib/adminpack/adminpack.c
index bc45e79895..2afb999c6e 100644
--- a/contrib/adminpack/adminpack.c
+++ b/contrib/adminpack/adminpack.c
@@ -504,50 +504,57 @@ pg_logdir_ls_v1_1(PG_FUNCTION_ARGS)
static Datum
pg_logdir_ls_internal(FunctionCallInfo fcinfo)
{
- FuncCallContext *funcctx;
struct dirent *de;
- directory_fctx *fctx;
+
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ bool randomAccess;
+ DIR *dirdesc;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("materialize mode required, but it is not "
+ "allowed in this context")));
if (strcmp(Log_filename, "postgresql-%Y-%m-%d_%H%M%S.log") != 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("the log_filename parameter must equal 'postgresql-%%Y-%%m-%%d_%%H%%M%%S.log'")));
- if (SRF_IS_FIRSTCALL())
- {
- MemoryContext oldcontext;
- TupleDesc tupdesc;
-
- funcctx = SRF_FIRSTCALL_INIT();
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ /* The Tuplestore and TupleDesc should be created in ecxt_per_query_memory */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+ randomAccess = (rsinfo->allowedModes&SFRM_Materialize_Random) != 0;
+ tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
- fctx = palloc(sizeof(directory_fctx));
+ /* Is this right ?? */
+ tupdesc = CreateTemplateTupleDesc(2);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "starttime",
+ TIMESTAMPOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "filename",
+ TEXTOID, -1, 0);
- tupdesc = CreateTemplateTupleDesc(2);
- TupleDescInitEntry(tupdesc, (AttrNumber) 1, "starttime",
- TIMESTAMPOID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 2, "filename",
- TEXTOID, -1, 0);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
- funcctx->attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ MemoryContextSwitchTo(oldcontext);
- fctx->location = pstrdup(Log_directory);
- fctx->dirdesc = AllocateDir(fctx->location);
-
- if (!fctx->dirdesc)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not open directory \"%s\": %m",
- fctx->location)));
-
- funcctx->user_fctx = fctx;
- MemoryContextSwitchTo(oldcontext);
- }
-
- funcctx = SRF_PERCALL_SETUP();
- fctx = (directory_fctx *) funcctx->user_fctx;
+ dirdesc = AllocateDir(Log_directory);
+ if (!dirdesc)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open directory \"%s\": %m",
+ Log_directory)));
- while ((de = ReadDir(fctx->dirdesc, fctx->location)) != NULL)
+ while ((de = ReadDir(dirdesc, Log_directory)) != NULL)
{
char *values[2];
HeapTuple tuple;
@@ -584,13 +591,13 @@ pg_logdir_ls_internal(FunctionCallInfo fcinfo)
/* Seems the timestamp is OK; prepare and return tuple */
values[0] = timestampbuf;
- values[1] = psprintf("%s/%s", fctx->location, de->d_name);
-
- tuple = BuildTupleFromCStrings(funcctx->attinmeta, values);
+ values[1] = psprintf("%s/%s", Log_directory, de->d_name);
- SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
+ tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupdesc), values);
+ tuplestore_puttuple(tupstore, tuple);
}
- FreeDir(fctx->dirdesc);
- SRF_RETURN_DONE(funcctx);
+ tuplestore_donestoring(tupstore);
+ FreeDir(dirdesc);
+ return (Datum) 0;
}
diff --git a/contrib/pgrowlocks/pgrowlocks.c b/contrib/pgrowlocks/pgrowlocks.c
index a2c44a916c..908019e774 100644
--- a/contrib/pgrowlocks/pgrowlocks.c
+++ b/contrib/pgrowlocks/pgrowlocks.c
@@ -54,13 +54,6 @@ PG_FUNCTION_INFO_V1(pgrowlocks);
#define NCHARS 32
-typedef struct
-{
- Relation rel;
- TableScanDesc scan;
- int ncolumns;
-} MyData;
-
#define Atnum_tid 0
#define Atnum_xmax 1
#define Atnum_ismulti 2
@@ -71,82 +64,85 @@ typedef struct
Datum
pgrowlocks(PG_FUNCTION_ARGS)
{
- FuncCallContext *funcctx;
TableScanDesc scan;
HeapScanDesc hscan;
HeapTuple tuple;
TupleDesc tupdesc;
AttInMetadata *attinmeta;
- Datum result;
- MyData *mydata;
Relation rel;
- if (SRF_IS_FIRSTCALL())
- {
- text *relname;
- RangeVar *relrv;
- MemoryContext oldcontext;
- AclResult aclresult;
-
- funcctx = SRF_FIRSTCALL_INIT();
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
-
- /* Build a tuple descriptor for our result type */
- if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
- elog(ERROR, "return type must be a row type");
-
- attinmeta = TupleDescGetAttInMetadata(tupdesc);
- funcctx->attinmeta = attinmeta;
-
- relname = PG_GETARG_TEXT_PP(0);
- relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname));
- rel = relation_openrv(relrv, AccessShareLock);
-
- if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
- ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("only heap AM is supported")));
-
- if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("\"%s\" is a partitioned table",
- RelationGetRelationName(rel)),
- errdetail("Partitioned tables do not contain rows.")));
- else if (rel->rd_rel->relkind != RELKIND_RELATION)
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("\"%s\" is not a table",
- RelationGetRelationName(rel))));
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ Tuplestorestate *tupstore;
+ bool randomAccess;
+
+ text *relname;
+ RangeVar *relrv;
+ MemoryContext oldcontext;
+ AclResult aclresult;
+ char **values;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("materialize mode required, but it is not "
+ "allowed in this context")));
+
+ /* The Tuplestore and TupleDesc should be created in ecxt_per_query_memory */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+ randomAccess = (rsinfo->allowedModes&SFRM_Materialize_Random) != 0;
+ tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+
+ relname = PG_GETARG_TEXT_PP(0);
+ relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname));
+ rel = relation_openrv(relrv, AccessShareLock);
+
+ if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+ ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only heap AM is supported")));
+
+ if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("\"%s\" is a partitioned table",
+ RelationGetRelationName(rel)),
+ errdetail("Partitioned tables do not contain rows.")));
+ else if (rel->rd_rel->relkind != RELKIND_RELATION)
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("\"%s\" is not a table",
+ RelationGetRelationName(rel))));
+
+ /*
+ * check permissions: must have SELECT on table or be in
+ * pg_stat_scan_tables
+ */
+ aclresult = pg_class_aclcheck(RelationGetRelid(rel), GetUserId(),
+ ACL_SELECT);
+ if (aclresult != ACLCHECK_OK)
+ aclresult = is_member_of_role(GetUserId(), DEFAULT_ROLE_STAT_SCAN_TABLES) ? ACLCHECK_OK : ACLCHECK_NO_PRIV;
+
+ if (aclresult != ACLCHECK_OK)
+ aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind),
+ RelationGetRelationName(rel));
+
+ scan = table_beginscan(rel, GetActiveSnapshot(), 0, NULL);
+ hscan = (HeapScanDesc) scan;
- /*
- * check permissions: must have SELECT on table or be in
- * pg_stat_scan_tables
- */
- aclresult = pg_class_aclcheck(RelationGetRelid(rel), GetUserId(),
- ACL_SELECT);
- if (aclresult != ACLCHECK_OK)
- aclresult = is_member_of_role(GetUserId(), DEFAULT_ROLE_STAT_SCAN_TABLES) ? ACLCHECK_OK : ACLCHECK_NO_PRIV;
-
- if (aclresult != ACLCHECK_OK)
- aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
-
- scan = table_beginscan(rel, GetActiveSnapshot(), 0, NULL);
- hscan = (HeapScanDesc) scan;
- mydata = palloc(sizeof(*mydata));
- mydata->rel = rel;
- mydata->scan = scan;
- mydata->ncolumns = tupdesc->natts;
- funcctx->user_fctx = mydata;
-
- MemoryContextSwitchTo(oldcontext);
- }
+ MemoryContextSwitchTo(oldcontext);
- funcctx = SRF_PERCALL_SETUP();
- attinmeta = funcctx->attinmeta;
- mydata = (MyData *) funcctx->user_fctx;
- scan = mydata->scan;
- hscan = (HeapScanDesc) scan;
+ values = (char **) palloc(tupdesc->natts * sizeof(char *));
/* scan the relation */
while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
@@ -169,9 +165,6 @@ pgrowlocks(PG_FUNCTION_ARGS)
*/
if (htsu == TM_BeingModified)
{
- char **values;
-
- values = (char **) palloc(mydata->ncolumns * sizeof(char *));
values[Atnum_tid] = (char *) DirectFunctionCall1(tidout,
PointerGetDatum(&tuple->t_self));
@@ -297,16 +290,7 @@ pgrowlocks(PG_FUNCTION_ARGS)
/* build a tuple */
tuple = BuildTupleFromCStrings(attinmeta, values);
-
- /* make the tuple into a datum */
- result = HeapTupleGetDatum(tuple);
-
- /*
- * no need to pfree what we allocated; it's on a short-lived
- * memory context anyway
- */
-
- SRF_RETURN_NEXT(funcctx, result);
+ tuplestore_puttuple(tupstore, tuple);
}
else
{
@@ -314,8 +298,9 @@ pgrowlocks(PG_FUNCTION_ARGS)
}
}
+ tuplestore_donestoring(tupstore);
table_endscan(scan);
- table_close(mydata->rel, AccessShareLock);
+ table_close(rel, AccessShareLock);
- SRF_RETURN_DONE(funcctx);
+ return (Datum) 0;
}
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index ca5e6efd7e..11311905c7 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -2812,7 +2812,7 @@ HeapTupleGetDatum(HeapTuple tuple)
<title>Returning Sets</title>
<para>
- There is also a special API that provides support for returning
+ There is also a special, helper API that provides support for returning
sets (multiple rows) from a C-language function. A set-returning
function must follow the version-1 calling conventions. Also,
source files must include <filename>funcapi.h</filename>, as
@@ -2820,10 +2820,19 @@ HeapTupleGetDatum(HeapTuple tuple)
</para>
<para>
- A set-returning function (<acronym>SRF</acronym>) is called
- once for each item it returns. The <acronym>SRF</acronym> must
- therefore save enough state to remember what it was doing and
+ The helper API allows writing a set-returning function
+ (<acronym>SRF</acronym>) using the ValuePerCall mode, in which the SRF is
+ called once for each item it returns. The <acronym>SRF</acronym> must
+ then save enough state to remember what it was doing and
return the next item on each call.
+ Note, however, that an SRF will possibly not be run to completion, due to
+ a LIMIT or a cursor, and should avoid leaving resources like DIR*s or
+ tablescans opened across calls. To avoid leaking resources in those cases,
+ instead of using ValuePerCall mode, the SRF should populate and return a
+ tuplestore with SFRM_Materialize mode. See fmgr/README.
+ </para>
+
+ <para>
The structure <structname>FuncCallContext</structname> is provided to help
control this process. Within a function, <literal>fcinfo->flinfo->fn_extra</literal>
is used to hold a pointer to <structname>FuncCallContext</structname>
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 4f109111d1..3c5a428654 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -4756,11 +4756,8 @@ Datum
pg_timezone_names(PG_FUNCTION_ARGS)
{
MemoryContext oldcontext;
- FuncCallContext *funcctx;
pg_tzenum *tzenum;
pg_tz *tz;
- Datum result;
- HeapTuple tuple;
Datum values[4];
bool nulls[4];
int tzoff;
@@ -4770,58 +4767,45 @@ pg_timezone_names(PG_FUNCTION_ARGS)
Interval *resInterval;
struct pg_tm itm;
- /* stuff done only on the first call of the function */
- if (SRF_IS_FIRSTCALL())
- {
- TupleDesc tupdesc;
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
- /* create a function context for cross-call persistence */
- funcctx = SRF_FIRSTCALL_INIT();
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ bool randomAccess;
- /*
- * switch to memory context appropriate for multiple function calls
- */
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("materialize mode required, but it is not "
+ "allowed in this context")));
- /* initialize timezone scanning code */
- tzenum = pg_tzenumerate_start();
- funcctx->user_fctx = (void *) tzenum;
+ /* The Tuplestore and TupleDesc should be created in ecxt_per_query_memory */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+ randomAccess = (rsinfo->allowedModes&SFRM_Materialize_Random) != 0;
+ tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
- /*
- * build tupdesc for result tuples. This must match this function's
- * pg_proc entry!
- */
- tupdesc = CreateTemplateTupleDesc(4);
- TupleDescInitEntry(tupdesc, (AttrNumber) 1, "name",
- TEXTOID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 2, "abbrev",
- TEXTOID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 3, "utc_offset",
- INTERVALOID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_dst",
- BOOLOID, -1, 0);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
- funcctx->tuple_desc = BlessTupleDesc(tupdesc);
- MemoryContextSwitchTo(oldcontext);
- }
+ MemoryContextSwitchTo(oldcontext);
- /* stuff done on every call of the function */
- funcctx = SRF_PERCALL_SETUP();
- tzenum = (pg_tzenum *) funcctx->user_fctx;
+ /* initialize timezone scanning code */
+ tzenum = pg_tzenumerate_start();
/* search for another zone to display */
for (;;)
{
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
tz = pg_tzenumerate_next(tzenum);
- MemoryContextSwitchTo(oldcontext);
-
if (!tz)
- {
- pg_tzenumerate_end(tzenum);
- funcctx->user_fctx = NULL;
- SRF_RETURN_DONE(funcctx);
- }
+ break;
/* Convert now() to local time in this zone */
if (timestamp2tm(GetCurrentTransactionStartTimestamp(),
@@ -4840,25 +4824,23 @@ pg_timezone_names(PG_FUNCTION_ARGS)
if (tzn && strlen(tzn) > 31)
continue;
- /* Found a displayable zone */
- break;
- }
-
- MemSet(nulls, 0, sizeof(nulls));
- values[0] = CStringGetTextDatum(pg_get_timezone_name(tz));
- values[1] = CStringGetTextDatum(tzn ? tzn : "");
+ MemSet(nulls, 0, sizeof(nulls));
+ values[0] = CStringGetTextDatum(pg_get_timezone_name(tz));
+ values[1] = CStringGetTextDatum(tzn ? tzn : "");
- MemSet(&itm, 0, sizeof(struct pg_tm));
- itm.tm_sec = -tzoff;
- resInterval = (Interval *) palloc(sizeof(Interval));
- tm2interval(&itm, 0, resInterval);
- values[2] = IntervalPGetDatum(resInterval);
+ MemSet(&itm, 0, sizeof(struct pg_tm));
+ itm.tm_sec = -tzoff;
+ resInterval = (Interval *) palloc(sizeof(Interval));
+ tm2interval(&itm, 0, resInterval);
+ values[2] = IntervalPGetDatum(resInterval);
- values[3] = BoolGetDatum(tm.tm_isdst > 0);
+ values[3] = BoolGetDatum(tm.tm_isdst > 0);
- tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
- result = HeapTupleGetDatum(tuple);
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ }
- SRF_RETURN_NEXT(funcctx, result);
+ pg_tzenumerate_end(tzenum);
+ tuplestore_donestoring(tupstore);
+ return (Datum) 0;
}
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index bcf9bd1b97..e348089418 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -36,13 +36,6 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
-typedef struct
-{
- char *location;
- DIR *dirdesc;
- bool include_dot_dirs;
-} directory_fctx;
-
/*
* Convert a "text" filename argument to C string, and check it's allowable.
@@ -447,67 +440,82 @@ pg_stat_file_1arg(PG_FUNCTION_ARGS)
Datum
pg_ls_dir(PG_FUNCTION_ARGS)
{
- FuncCallContext *funcctx;
- struct dirent *de;
- directory_fctx *fctx;
- MemoryContext oldcontext;
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ bool randomAccess;
- if (SRF_IS_FIRSTCALL())
+ bool missing_ok = false;
+ bool include_dot_dirs = false;
+ char *location;
+ DIR *dirdesc;
+ struct dirent *de;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("materialize mode required, but it is not "
+ "allowed in this context")));
+
+ /* check the optional arguments */
+ if (PG_NARGS() == 3)
{
- bool missing_ok = false;
- bool include_dot_dirs = false;
+ if (!PG_ARGISNULL(1))
+ missing_ok = PG_GETARG_BOOL(1);
+ if (!PG_ARGISNULL(2))
+ include_dot_dirs = PG_GETARG_BOOL(2);
+ }
- /* check the optional arguments */
- if (PG_NARGS() == 3)
- {
- if (!PG_ARGISNULL(1))
- missing_ok = PG_GETARG_BOOL(1);
- if (!PG_ARGISNULL(2))
- include_dot_dirs = PG_GETARG_BOOL(2);
- }
+ /* The Tuplestore and TupleDesc should be created in ecxt_per_query_memory */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+ randomAccess = (rsinfo->allowedModes&SFRM_Materialize_Random) != 0;
+ tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
- funcctx = SRF_FIRSTCALL_INIT();
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ tupdesc = CreateTemplateTupleDesc(1);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "column", TEXTOID, -1, 0);
- fctx = palloc(sizeof(directory_fctx));
- fctx->location = convert_and_check_filename(PG_GETARG_TEXT_PP(0));
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+ MemoryContextSwitchTo(oldcontext);
- fctx->include_dot_dirs = include_dot_dirs;
- fctx->dirdesc = AllocateDir(fctx->location);
+ location = convert_and_check_filename(PG_GETARG_TEXT_PP(0));
+ dirdesc = AllocateDir(location);
- if (!fctx->dirdesc)
- {
- if (missing_ok && errno == ENOENT)
- {
- MemoryContextSwitchTo(oldcontext);
- SRF_RETURN_DONE(funcctx);
- }
- else
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not open directory \"%s\": %m",
- fctx->location)));
- }
- funcctx->user_fctx = fctx;
- MemoryContextSwitchTo(oldcontext);
+ if (!dirdesc)
+ {
+ if (missing_ok && errno == ENOENT)
+ return (Datum) 0;
+ else
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open directory \"%s\": %m",
+ location)));
}
- funcctx = SRF_PERCALL_SETUP();
- fctx = (directory_fctx *) funcctx->user_fctx;
-
- while ((de = ReadDir(fctx->dirdesc, fctx->location)) != NULL)
+ while ((de = ReadDir(dirdesc, location)) != NULL)
{
- if (!fctx->include_dot_dirs &&
+ Datum values[1];
+ bool nulls[1] = {false};
+
+ if (!include_dot_dirs &&
(strcmp(de->d_name, ".") == 0 ||
strcmp(de->d_name, "..") == 0))
continue;
- SRF_RETURN_NEXT(funcctx, CStringGetTextDatum(de->d_name));
+ values[0] = CStringGetTextDatum(de->d_name);
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
}
- FreeDir(fctx->dirdesc);
-
- SRF_RETURN_DONE(funcctx);
+ tuplestore_donestoring(tupstore);
+ FreeDir(dirdesc);
+ return (Datum) 0;
}
/*
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 323e36b81c..789febebe3 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -194,72 +194,87 @@ current_query(PG_FUNCTION_ARGS)
/* Function to find out which databases make use of a tablespace */
-typedef struct
-{
- char *location;
- DIR *dirdesc;
-} ts_db_fctx;
-
Datum
pg_tablespace_databases(PG_FUNCTION_ARGS)
{
- FuncCallContext *funcctx;
struct dirent *de;
- ts_db_fctx *fctx;
- if (SRF_IS_FIRSTCALL())
- {
- MemoryContext oldcontext;
- Oid tablespaceOid = PG_GETARG_OID(0);
+ MemoryContext oldcontext;
+ Oid tablespaceOid = PG_GETARG_OID(0);
- funcctx = SRF_FIRSTCALL_INIT();
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ bool randomAccess;
- fctx = palloc(sizeof(ts_db_fctx));
+ DIR *dirdesc;
+ char *location;
- if (tablespaceOid == GLOBALTABLESPACE_OID)
- {
- fctx->dirdesc = NULL;
- ereport(WARNING,
- (errmsg("global tablespace never has databases")));
- }
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("materialize mode required, but it is not "
+ "allowed in this context")));
+
+ if (tablespaceOid == GLOBALTABLESPACE_OID)
+ {
+ dirdesc = NULL;
+ ereport(WARNING,
+ (errmsg("global tablespace never has databases")));
+ }
+ else
+ {
+ if (tablespaceOid == DEFAULTTABLESPACE_OID)
+ location = psprintf("base");
else
- {
- if (tablespaceOid == DEFAULTTABLESPACE_OID)
- fctx->location = psprintf("base");
- else
- fctx->location = psprintf("pg_tblspc/%u/%s", tablespaceOid,
- TABLESPACE_VERSION_DIRECTORY);
+ location = psprintf("pg_tblspc/%u/%s", tablespaceOid,
+ TABLESPACE_VERSION_DIRECTORY);
- fctx->dirdesc = AllocateDir(fctx->location);
+ dirdesc = AllocateDir(location);
- if (!fctx->dirdesc)
- {
- /* the only expected error is ENOENT */
- if (errno != ENOENT)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not open directory \"%s\": %m",
- fctx->location)));
- ereport(WARNING,
- (errmsg("%u is not a tablespace OID", tablespaceOid)));
- }
+ if (!dirdesc)
+ {
+ /* the only expected error is ENOENT */
+ if (errno != ENOENT)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open directory \"%s\": %m",
+ location)));
+ ereport(WARNING,
+ (errmsg("%u is not a tablespace OID", tablespaceOid)));
}
- funcctx->user_fctx = fctx;
- MemoryContextSwitchTo(oldcontext);
}
- funcctx = SRF_PERCALL_SETUP();
- fctx = (ts_db_fctx *) funcctx->user_fctx;
+ /* The Tuplestore and TupleDesc should be created in ecxt_per_query_memory */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+
+ randomAccess = (rsinfo->allowedModes&SFRM_Materialize_Random) != 0;
+ tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
- if (!fctx->dirdesc) /* not a tablespace */
- SRF_RETURN_DONE(funcctx);
+ tupdesc = CreateTemplateTupleDesc(1);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "column", OIDOID, -1, 0);
- while ((de = ReadDir(fctx->dirdesc, fctx->location)) != NULL)
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ if (!dirdesc) /* not a tablespace */
+ return 0;
+
+ while ((de = ReadDir(dirdesc, location)) != NULL)
{
Oid datOid = atooid(de->d_name);
char *subdir;
bool isempty;
+ Datum values[1];
+ bool nulls[1] = {false};
/* this test skips . and .., but is awfully weak */
if (!datOid)
@@ -267,18 +282,20 @@ pg_tablespace_databases(PG_FUNCTION_ARGS)
/* if database subdir is empty, don't report tablespace as used */
- subdir = psprintf("%s/%s", fctx->location, de->d_name);
+ subdir = psprintf("%s/%s", location, de->d_name);
isempty = directory_is_empty(subdir);
pfree(subdir);
if (isempty)
continue; /* indeed, nothing in it */
- SRF_RETURN_NEXT(funcctx, ObjectIdGetDatum(datOid));
+ values[0] = ObjectIdGetDatum(datOid);
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
}
- FreeDir(fctx->dirdesc);
- SRF_RETURN_DONE(funcctx);
+ tuplestore_donestoring(tupstore);
+ FreeDir(dirdesc);
+ return (Datum) 0;
}
diff --git a/src/include/funcapi.h b/src/include/funcapi.h
index f9b75ae390..471100b7e5 100644
--- a/src/include/funcapi.h
+++ b/src/include/funcapi.h
@@ -234,7 +234,12 @@ extern Datum HeapTupleHeaderGetDatum(HeapTupleHeader tuple);
/*----------
* Support for Set Returning Functions (SRFs)
*
- * The basic API for SRFs looks something like:
+ * The basic API for SRFs using ValuePerCall mode looks something like this.
+ * Note that an SRF will possibly not be run to completion, due to a LIMIT or a
+ * cursor, and should avoid leaving resources like DIR*s or tablescans opened
+ * across calls. To avoid leaking resources in those cases, instead of using
+ * ValuePerCall mode, the SRF should populate and return a tuplestore with
+ * SFRM_Materialize mode. See fmgr/README.
*
* Datum
* my_Set_Returning_Function(PG_FUNCTION_ARGS)
--
2.17.0
On Sun, Mar 08, 2020 at 04:30:44PM -0400, Tom Lane wrote:
BTW, another thing I noticed while looking around is that some of
the functions using SRF_RETURN_DONE() think they should clean up
memory beforehand. This is a waste of code/cycles, as long as the
memory was properly allocated in funcctx->multi_call_memory_ctx,
because funcapi.c takes care of deleting that context.We should probably document that *any* manual cleanup before
SRF_RETURN_DONE() is an antipattern. If you have to have cleanup,
it needs to be done via RegisterExprContextCallback instead.
This part appears to be already in place since
e4186762ffaa4188e16702e8f4f299ea70988b96:
|The memory context that is current when the SRF is called is a transient
|context that will be cleared between calls. This means that you do not need to
|call pfree on everything you allocated using palloc; it will go away anyway.
|However, if you want to allocate any data structures to live across calls, you
|need to put them somewhere else. The memory context referenced by
|multi_call_memory_ctx is a suitable location for any data that needs to survive
|until the SRF is finished running. In most cases, this means that you should
|switch into multi_call_memory_ctx while doing the first-call setup.
--
Justin
Nitpick: please see c4dcd9144ba6.
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Wed, 11 Mar 2020 10:09:18 -0500
Subject: [PATCH] SRF: avoid leaking resources if not run to completionChange to return a tuplestore populated immediately and returned in full.
Discussion: /messages/by-id/20200308173103.GC1357@telsasoft.com
I wonder if this isn't saying that the whole value-per-call protocol is
bogus, in that it seems impossible to write a useful function with it.
Maybe we should add one final call with a special flag "function
shutdown" or something, so that these resources can be released if the
SRF isn't run to completion?
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
I wonder if this isn't saying that the whole value-per-call protocol is
bogus, in that it seems impossible to write a useful function with it.
Only if you have a *very* narrow definition of "useful function".
If you look through SRF_RETURN_DONE callers, only a small minority
are trying to do resource cleanup beforehand.
Maybe we should add one final call with a special flag "function
shutdown" or something, so that these resources can be released if the
SRF isn't run to completion?
We already have an appropriate mechanism for cleaning up resources,
ie RegisterExprContextCallback. I do not think what you're suggesting
could be made to work without an incompatible break in the API for
SRFs, so it's not an improvement over telling people more forcefully
about how to do resource cleanup correctly.
regards, tom lane
On Thu, Mar 12, 2020 at 07:11:56AM -0500, Justin Pryzby wrote:
Do you want to have a go at that?
First draft attached. Note that I handled pg_ls_dir, even though I'm proposing
on the other thread to collapse/merge/meld it with pg_ls_dir_files [0].
Possibly that's a bad idea with tuplestore, due to returning a scalar vs a row
and needing to conditionally call CreateTemplateTupleDesc vs
get_call_result_type. I'll rebase that patch later today.I didn't write test cases yet. Also didn't look for functions not on your
list.I noticed this doesn't actually do anything, but kept it for now...except in
pg_ls_dir error case:src/include/utils/tuplestore.h:/* tuplestore_donestoring() used to be required, but is no longer used */
src/include/utils/tuplestore.h:#define tuplestore_donestoring(state) ((void) 0)
v2 attached - I will add to next CF in case you want to defer it until later.
--
Justin
Attachments:
v2-0001-SRF-avoid-leaking-resources-if-not-run-to-complet.patchtext/x-diff; charset=us-asciiDownload
From 9ae75f693adf246ad551e258198606fd63190a20 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Wed, 11 Mar 2020 10:09:18 -0500
Subject: [PATCH v2 1/2] SRF: avoid leaking resources if not run to completion
Change to return a tuplestore populated immediately and returned in full.
Discussion: https://www.postgresql.org/message-id/20200308173103.GC1357%40telsasoft.com
See also: 9cb7db3f0 and 085b6b66
---
contrib/adminpack/adminpack.c | 82 +++++-----
contrib/pgrowlocks/pgrowlocks.c | 162 +++++++++----------
doc/src/sgml/xfunc.sgml | 17 +-
src/backend/utils/adt/datetime.c | 99 +++++-------
src/backend/utils/adt/genfile.c | 111 +++++++------
src/backend/utils/adt/misc.c | 116 +++++++------
src/include/funcapi.h | 7 +-
src/test/regress/expected/misc_functions.out | 18 +++
src/test/regress/sql/misc_functions.sql | 6 +
9 files changed, 325 insertions(+), 293 deletions(-)
diff --git a/contrib/adminpack/adminpack.c b/contrib/adminpack/adminpack.c
index bc45e79895..070025f5a8 100644
--- a/contrib/adminpack/adminpack.c
+++ b/contrib/adminpack/adminpack.c
@@ -504,50 +504,56 @@ pg_logdir_ls_v1_1(PG_FUNCTION_ARGS)
static Datum
pg_logdir_ls_internal(FunctionCallInfo fcinfo)
{
- FuncCallContext *funcctx;
struct dirent *de;
- directory_fctx *fctx;
+
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ bool randomAccess;
+ DIR *dirdesc;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("materialize mode required, but it is not allowed in this context")));
if (strcmp(Log_filename, "postgresql-%Y-%m-%d_%H%M%S.log") != 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("the log_filename parameter must equal 'postgresql-%%Y-%%m-%%d_%%H%%M%%S.log'")));
- if (SRF_IS_FIRSTCALL())
- {
- MemoryContext oldcontext;
- TupleDesc tupdesc;
-
- funcctx = SRF_FIRSTCALL_INIT();
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ /* The Tuplestore and TupleDesc should be created in ecxt_per_query_memory */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+ randomAccess = (rsinfo->allowedModes&SFRM_Materialize_Random) != 0;
+ tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
- fctx = palloc(sizeof(directory_fctx));
+ /* Is this right ?? */
+ tupdesc = CreateTemplateTupleDesc(2);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "starttime",
+ TIMESTAMPOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "filename",
+ TEXTOID, -1, 0);
- tupdesc = CreateTemplateTupleDesc(2);
- TupleDescInitEntry(tupdesc, (AttrNumber) 1, "starttime",
- TIMESTAMPOID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 2, "filename",
- TEXTOID, -1, 0);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
- funcctx->attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ MemoryContextSwitchTo(oldcontext);
- fctx->location = pstrdup(Log_directory);
- fctx->dirdesc = AllocateDir(fctx->location);
-
- if (!fctx->dirdesc)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not open directory \"%s\": %m",
- fctx->location)));
-
- funcctx->user_fctx = fctx;
- MemoryContextSwitchTo(oldcontext);
- }
-
- funcctx = SRF_PERCALL_SETUP();
- fctx = (directory_fctx *) funcctx->user_fctx;
+ dirdesc = AllocateDir(Log_directory);
+ if (!dirdesc)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open directory \"%s\": %m",
+ Log_directory)));
- while ((de = ReadDir(fctx->dirdesc, fctx->location)) != NULL)
+ while ((de = ReadDir(dirdesc, Log_directory)) != NULL)
{
char *values[2];
HeapTuple tuple;
@@ -584,13 +590,13 @@ pg_logdir_ls_internal(FunctionCallInfo fcinfo)
/* Seems the timestamp is OK; prepare and return tuple */
values[0] = timestampbuf;
- values[1] = psprintf("%s/%s", fctx->location, de->d_name);
-
- tuple = BuildTupleFromCStrings(funcctx->attinmeta, values);
+ values[1] = psprintf("%s/%s", Log_directory, de->d_name);
- SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
+ tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupdesc), values);
+ tuplestore_puttuple(tupstore, tuple);
}
- FreeDir(fctx->dirdesc);
- SRF_RETURN_DONE(funcctx);
+ tuplestore_donestoring(tupstore);
+ FreeDir(dirdesc);
+ return (Datum) 0;
}
diff --git a/contrib/pgrowlocks/pgrowlocks.c b/contrib/pgrowlocks/pgrowlocks.c
index a2c44a916c..baad83fcde 100644
--- a/contrib/pgrowlocks/pgrowlocks.c
+++ b/contrib/pgrowlocks/pgrowlocks.c
@@ -54,13 +54,6 @@ PG_FUNCTION_INFO_V1(pgrowlocks);
#define NCHARS 32
-typedef struct
-{
- Relation rel;
- TableScanDesc scan;
- int ncolumns;
-} MyData;
-
#define Atnum_tid 0
#define Atnum_xmax 1
#define Atnum_ismulti 2
@@ -71,82 +64,84 @@ typedef struct
Datum
pgrowlocks(PG_FUNCTION_ARGS)
{
- FuncCallContext *funcctx;
TableScanDesc scan;
HeapScanDesc hscan;
HeapTuple tuple;
TupleDesc tupdesc;
AttInMetadata *attinmeta;
- Datum result;
- MyData *mydata;
Relation rel;
- if (SRF_IS_FIRSTCALL())
- {
- text *relname;
- RangeVar *relrv;
- MemoryContext oldcontext;
- AclResult aclresult;
-
- funcctx = SRF_FIRSTCALL_INIT();
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
-
- /* Build a tuple descriptor for our result type */
- if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
- elog(ERROR, "return type must be a row type");
-
- attinmeta = TupleDescGetAttInMetadata(tupdesc);
- funcctx->attinmeta = attinmeta;
-
- relname = PG_GETARG_TEXT_PP(0);
- relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname));
- rel = relation_openrv(relrv, AccessShareLock);
-
- if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
- ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("only heap AM is supported")));
-
- if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("\"%s\" is a partitioned table",
- RelationGetRelationName(rel)),
- errdetail("Partitioned tables do not contain rows.")));
- else if (rel->rd_rel->relkind != RELKIND_RELATION)
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("\"%s\" is not a table",
- RelationGetRelationName(rel))));
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ Tuplestorestate *tupstore;
+ bool randomAccess;
+
+ text *relname;
+ RangeVar *relrv;
+ MemoryContext oldcontext;
+ AclResult aclresult;
+ char **values;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("materialize mode required, but it is not allowed in this context")));
+
+ /* The Tuplestore and TupleDesc should be created in ecxt_per_query_memory */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+ randomAccess = (rsinfo->allowedModes&SFRM_Materialize_Random) != 0;
+ tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+
+ relname = PG_GETARG_TEXT_PP(0);
+ relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname));
+ rel = relation_openrv(relrv, AccessShareLock);
+
+ if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+ ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only heap AM is supported")));
+
+ if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("\"%s\" is a partitioned table",
+ RelationGetRelationName(rel)),
+ errdetail("Partitioned tables do not contain rows.")));
+ else if (rel->rd_rel->relkind != RELKIND_RELATION)
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("\"%s\" is not a table",
+ RelationGetRelationName(rel))));
+
+ /*
+ * check permissions: must have SELECT on table or be in
+ * pg_stat_scan_tables
+ */
+ aclresult = pg_class_aclcheck(RelationGetRelid(rel), GetUserId(),
+ ACL_SELECT);
+ if (aclresult != ACLCHECK_OK)
+ aclresult = is_member_of_role(GetUserId(), DEFAULT_ROLE_STAT_SCAN_TABLES) ? ACLCHECK_OK : ACLCHECK_NO_PRIV;
+
+ if (aclresult != ACLCHECK_OK)
+ aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind),
+ RelationGetRelationName(rel));
+
+ scan = table_beginscan(rel, GetActiveSnapshot(), 0, NULL);
+ hscan = (HeapScanDesc) scan;
- /*
- * check permissions: must have SELECT on table or be in
- * pg_stat_scan_tables
- */
- aclresult = pg_class_aclcheck(RelationGetRelid(rel), GetUserId(),
- ACL_SELECT);
- if (aclresult != ACLCHECK_OK)
- aclresult = is_member_of_role(GetUserId(), DEFAULT_ROLE_STAT_SCAN_TABLES) ? ACLCHECK_OK : ACLCHECK_NO_PRIV;
-
- if (aclresult != ACLCHECK_OK)
- aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
-
- scan = table_beginscan(rel, GetActiveSnapshot(), 0, NULL);
- hscan = (HeapScanDesc) scan;
- mydata = palloc(sizeof(*mydata));
- mydata->rel = rel;
- mydata->scan = scan;
- mydata->ncolumns = tupdesc->natts;
- funcctx->user_fctx = mydata;
-
- MemoryContextSwitchTo(oldcontext);
- }
+ MemoryContextSwitchTo(oldcontext);
- funcctx = SRF_PERCALL_SETUP();
- attinmeta = funcctx->attinmeta;
- mydata = (MyData *) funcctx->user_fctx;
- scan = mydata->scan;
- hscan = (HeapScanDesc) scan;
+ values = (char **) palloc(tupdesc->natts * sizeof(char *));
/* scan the relation */
while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
@@ -169,9 +164,6 @@ pgrowlocks(PG_FUNCTION_ARGS)
*/
if (htsu == TM_BeingModified)
{
- char **values;
-
- values = (char **) palloc(mydata->ncolumns * sizeof(char *));
values[Atnum_tid] = (char *) DirectFunctionCall1(tidout,
PointerGetDatum(&tuple->t_self));
@@ -297,16 +289,7 @@ pgrowlocks(PG_FUNCTION_ARGS)
/* build a tuple */
tuple = BuildTupleFromCStrings(attinmeta, values);
-
- /* make the tuple into a datum */
- result = HeapTupleGetDatum(tuple);
-
- /*
- * no need to pfree what we allocated; it's on a short-lived
- * memory context anyway
- */
-
- SRF_RETURN_NEXT(funcctx, result);
+ tuplestore_puttuple(tupstore, tuple);
}
else
{
@@ -314,8 +297,9 @@ pgrowlocks(PG_FUNCTION_ARGS)
}
}
+ tuplestore_donestoring(tupstore);
table_endscan(scan);
- table_close(mydata->rel, AccessShareLock);
+ table_close(rel, AccessShareLock);
- SRF_RETURN_DONE(funcctx);
+ return (Datum) 0;
}
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index a91f887119..432758e75e 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -2812,7 +2812,7 @@ HeapTupleGetDatum(HeapTuple tuple)
<title>Returning Sets</title>
<para>
- There is also a special API that provides support for returning
+ There is also a special, helper API that provides support for returning
sets (multiple rows) from a C-language function. A set-returning
function must follow the version-1 calling conventions. Also,
source files must include <filename>funcapi.h</filename>, as
@@ -2820,10 +2820,19 @@ HeapTupleGetDatum(HeapTuple tuple)
</para>
<para>
- A set-returning function (<acronym>SRF</acronym>) is called
- once for each item it returns. The <acronym>SRF</acronym> must
- therefore save enough state to remember what it was doing and
+ The helper API allows writing a set-returning function
+ (<acronym>SRF</acronym>) using the ValuePerCall mode, in which the SRF is
+ called once for each item it returns. The <acronym>SRF</acronym> must
+ then save enough state to remember what it was doing and
return the next item on each call.
+ Note, however, that an SRF will possibly not be run to completion, due to
+ a LIMIT or a cursor, and should avoid leaving resources like DIR*s or
+ tablescans opened across calls. To avoid leaking resources in those cases,
+ instead of using ValuePerCall mode, the SRF should populate and return a
+ tuplestore with SFRM_Materialize mode. See fmgr/README.
+ </para>
+
+ <para>
The structure <structname>FuncCallContext</structname> is provided to help
control this process. Within a function, <literal>fcinfo->flinfo->fn_extra</literal>
is used to hold a pointer to <structname>FuncCallContext</structname>
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 4f109111d1..d0966c8305 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -4756,11 +4756,8 @@ Datum
pg_timezone_names(PG_FUNCTION_ARGS)
{
MemoryContext oldcontext;
- FuncCallContext *funcctx;
pg_tzenum *tzenum;
pg_tz *tz;
- Datum result;
- HeapTuple tuple;
Datum values[4];
bool nulls[4];
int tzoff;
@@ -4770,58 +4767,44 @@ pg_timezone_names(PG_FUNCTION_ARGS)
Interval *resInterval;
struct pg_tm itm;
- /* stuff done only on the first call of the function */
- if (SRF_IS_FIRSTCALL())
- {
- TupleDesc tupdesc;
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
- /* create a function context for cross-call persistence */
- funcctx = SRF_FIRSTCALL_INIT();
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ bool randomAccess;
- /*
- * switch to memory context appropriate for multiple function calls
- */
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("materialize mode required, but it is not allowed in this context")));
- /* initialize timezone scanning code */
- tzenum = pg_tzenumerate_start();
- funcctx->user_fctx = (void *) tzenum;
+ /* The Tuplestore and TupleDesc should be created in ecxt_per_query_memory */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+ randomAccess = (rsinfo->allowedModes&SFRM_Materialize_Random) != 0;
+ tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
- /*
- * build tupdesc for result tuples. This must match this function's
- * pg_proc entry!
- */
- tupdesc = CreateTemplateTupleDesc(4);
- TupleDescInitEntry(tupdesc, (AttrNumber) 1, "name",
- TEXTOID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 2, "abbrev",
- TEXTOID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 3, "utc_offset",
- INTERVALOID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_dst",
- BOOLOID, -1, 0);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
- funcctx->tuple_desc = BlessTupleDesc(tupdesc);
- MemoryContextSwitchTo(oldcontext);
- }
+ MemoryContextSwitchTo(oldcontext);
- /* stuff done on every call of the function */
- funcctx = SRF_PERCALL_SETUP();
- tzenum = (pg_tzenum *) funcctx->user_fctx;
+ /* initialize timezone scanning code */
+ tzenum = pg_tzenumerate_start();
/* search for another zone to display */
for (;;)
{
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
tz = pg_tzenumerate_next(tzenum);
- MemoryContextSwitchTo(oldcontext);
-
if (!tz)
- {
- pg_tzenumerate_end(tzenum);
- funcctx->user_fctx = NULL;
- SRF_RETURN_DONE(funcctx);
- }
+ break;
/* Convert now() to local time in this zone */
if (timestamp2tm(GetCurrentTransactionStartTimestamp(),
@@ -4840,25 +4823,23 @@ pg_timezone_names(PG_FUNCTION_ARGS)
if (tzn && strlen(tzn) > 31)
continue;
- /* Found a displayable zone */
- break;
- }
-
- MemSet(nulls, 0, sizeof(nulls));
- values[0] = CStringGetTextDatum(pg_get_timezone_name(tz));
- values[1] = CStringGetTextDatum(tzn ? tzn : "");
+ MemSet(nulls, 0, sizeof(nulls));
+ values[0] = CStringGetTextDatum(pg_get_timezone_name(tz));
+ values[1] = CStringGetTextDatum(tzn ? tzn : "");
- MemSet(&itm, 0, sizeof(struct pg_tm));
- itm.tm_sec = -tzoff;
- resInterval = (Interval *) palloc(sizeof(Interval));
- tm2interval(&itm, 0, resInterval);
- values[2] = IntervalPGetDatum(resInterval);
+ MemSet(&itm, 0, sizeof(struct pg_tm));
+ itm.tm_sec = -tzoff;
+ resInterval = (Interval *) palloc(sizeof(Interval));
+ tm2interval(&itm, 0, resInterval);
+ values[2] = IntervalPGetDatum(resInterval);
- values[3] = BoolGetDatum(tm.tm_isdst > 0);
+ values[3] = BoolGetDatum(tm.tm_isdst > 0);
- tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
- result = HeapTupleGetDatum(tuple);
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ }
- SRF_RETURN_NEXT(funcctx, result);
+ pg_tzenumerate_end(tzenum);
+ tuplestore_donestoring(tupstore);
+ return (Datum) 0;
}
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index bcf9bd1b97..39a54d532f 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -36,13 +36,6 @@
#include "utils/syscache.h"
#include "utils/timestamp.h"
-typedef struct
-{
- char *location;
- DIR *dirdesc;
- bool include_dot_dirs;
-} directory_fctx;
-
/*
* Convert a "text" filename argument to C string, and check it's allowable.
@@ -447,67 +440,81 @@ pg_stat_file_1arg(PG_FUNCTION_ARGS)
Datum
pg_ls_dir(PG_FUNCTION_ARGS)
{
- FuncCallContext *funcctx;
- struct dirent *de;
- directory_fctx *fctx;
- MemoryContext oldcontext;
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ bool randomAccess;
+
+ bool missing_ok = false;
+ bool include_dot_dirs = false;
+ char *location;
+ DIR *dirdesc;
+ struct dirent *de;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("materialize mode required, but it is not allowed in this context")));
- if (SRF_IS_FIRSTCALL())
+ /* check the optional arguments */
+ if (PG_NARGS() == 3)
{
- bool missing_ok = false;
- bool include_dot_dirs = false;
+ if (!PG_ARGISNULL(1))
+ missing_ok = PG_GETARG_BOOL(1);
+ if (!PG_ARGISNULL(2))
+ include_dot_dirs = PG_GETARG_BOOL(2);
+ }
- /* check the optional arguments */
- if (PG_NARGS() == 3)
- {
- if (!PG_ARGISNULL(1))
- missing_ok = PG_GETARG_BOOL(1);
- if (!PG_ARGISNULL(2))
- include_dot_dirs = PG_GETARG_BOOL(2);
- }
+ /* The Tuplestore and TupleDesc should be created in ecxt_per_query_memory */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+ randomAccess = (rsinfo->allowedModes&SFRM_Materialize_Random) != 0;
+ tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
- funcctx = SRF_FIRSTCALL_INIT();
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ tupdesc = CreateTemplateTupleDesc(1);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "column", TEXTOID, -1, 0);
- fctx = palloc(sizeof(directory_fctx));
- fctx->location = convert_and_check_filename(PG_GETARG_TEXT_PP(0));
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+ MemoryContextSwitchTo(oldcontext);
- fctx->include_dot_dirs = include_dot_dirs;
- fctx->dirdesc = AllocateDir(fctx->location);
+ location = convert_and_check_filename(PG_GETARG_TEXT_PP(0));
+ dirdesc = AllocateDir(location);
- if (!fctx->dirdesc)
- {
- if (missing_ok && errno == ENOENT)
- {
- MemoryContextSwitchTo(oldcontext);
- SRF_RETURN_DONE(funcctx);
- }
- else
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not open directory \"%s\": %m",
- fctx->location)));
- }
- funcctx->user_fctx = fctx;
- MemoryContextSwitchTo(oldcontext);
+ if (!dirdesc)
+ {
+ if (missing_ok && errno == ENOENT)
+ return (Datum) 0;
+ else
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open directory \"%s\": %m",
+ location)));
}
- funcctx = SRF_PERCALL_SETUP();
- fctx = (directory_fctx *) funcctx->user_fctx;
-
- while ((de = ReadDir(fctx->dirdesc, fctx->location)) != NULL)
+ while ((de = ReadDir(dirdesc, location)) != NULL)
{
- if (!fctx->include_dot_dirs &&
+ Datum values[1];
+ bool nulls[1] = {false};
+
+ if (!include_dot_dirs &&
(strcmp(de->d_name, ".") == 0 ||
strcmp(de->d_name, "..") == 0))
continue;
- SRF_RETURN_NEXT(funcctx, CStringGetTextDatum(de->d_name));
+ values[0] = CStringGetTextDatum(de->d_name);
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
}
- FreeDir(fctx->dirdesc);
-
- SRF_RETURN_DONE(funcctx);
+ tuplestore_donestoring(tupstore);
+ FreeDir(dirdesc);
+ return (Datum) 0;
}
/*
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 323e36b81c..9919087335 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -194,72 +194,86 @@ current_query(PG_FUNCTION_ARGS)
/* Function to find out which databases make use of a tablespace */
-typedef struct
-{
- char *location;
- DIR *dirdesc;
-} ts_db_fctx;
-
Datum
pg_tablespace_databases(PG_FUNCTION_ARGS)
{
- FuncCallContext *funcctx;
struct dirent *de;
- ts_db_fctx *fctx;
- if (SRF_IS_FIRSTCALL())
- {
- MemoryContext oldcontext;
- Oid tablespaceOid = PG_GETARG_OID(0);
+ MemoryContext oldcontext;
+ Oid tablespaceOid = PG_GETARG_OID(0);
- funcctx = SRF_FIRSTCALL_INIT();
- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ bool randomAccess;
- fctx = palloc(sizeof(ts_db_fctx));
+ DIR *dirdesc;
+ char *location;
- if (tablespaceOid == GLOBALTABLESPACE_OID)
- {
- fctx->dirdesc = NULL;
- ereport(WARNING,
- (errmsg("global tablespace never has databases")));
- }
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("materialize mode required, but it is not allowed in this context")));
+
+ if (tablespaceOid == GLOBALTABLESPACE_OID)
+ {
+ dirdesc = NULL;
+ ereport(WARNING,
+ (errmsg("global tablespace never has databases")));
+ }
+ else
+ {
+ if (tablespaceOid == DEFAULTTABLESPACE_OID)
+ location = psprintf("base");
else
- {
- if (tablespaceOid == DEFAULTTABLESPACE_OID)
- fctx->location = psprintf("base");
- else
- fctx->location = psprintf("pg_tblspc/%u/%s", tablespaceOid,
- TABLESPACE_VERSION_DIRECTORY);
+ location = psprintf("pg_tblspc/%u/%s", tablespaceOid,
+ TABLESPACE_VERSION_DIRECTORY);
- fctx->dirdesc = AllocateDir(fctx->location);
+ dirdesc = AllocateDir(location);
- if (!fctx->dirdesc)
- {
- /* the only expected error is ENOENT */
- if (errno != ENOENT)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not open directory \"%s\": %m",
- fctx->location)));
- ereport(WARNING,
- (errmsg("%u is not a tablespace OID", tablespaceOid)));
- }
+ if (!dirdesc)
+ {
+ /* the only expected error is ENOENT */
+ if (errno != ENOENT)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open directory \"%s\": %m",
+ location)));
+ ereport(WARNING,
+ (errmsg("%u is not a tablespace OID", tablespaceOid)));
}
- funcctx->user_fctx = fctx;
- MemoryContextSwitchTo(oldcontext);
}
- funcctx = SRF_PERCALL_SETUP();
- fctx = (ts_db_fctx *) funcctx->user_fctx;
+ /* The Tuplestore and TupleDesc should be created in ecxt_per_query_memory */
+ oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
+
+ randomAccess = (rsinfo->allowedModes&SFRM_Materialize_Random) != 0;
+ tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
- if (!fctx->dirdesc) /* not a tablespace */
- SRF_RETURN_DONE(funcctx);
+ tupdesc = CreateTemplateTupleDesc(1);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "column", OIDOID, -1, 0);
- while ((de = ReadDir(fctx->dirdesc, fctx->location)) != NULL)
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ if (!dirdesc) /* not a tablespace */
+ return 0;
+
+ while ((de = ReadDir(dirdesc, location)) != NULL)
{
Oid datOid = atooid(de->d_name);
char *subdir;
bool isempty;
+ Datum values[1];
+ bool nulls[1] = {false};
/* this test skips . and .., but is awfully weak */
if (!datOid)
@@ -267,18 +281,20 @@ pg_tablespace_databases(PG_FUNCTION_ARGS)
/* if database subdir is empty, don't report tablespace as used */
- subdir = psprintf("%s/%s", fctx->location, de->d_name);
+ subdir = psprintf("%s/%s", location, de->d_name);
isempty = directory_is_empty(subdir);
pfree(subdir);
if (isempty)
continue; /* indeed, nothing in it */
- SRF_RETURN_NEXT(funcctx, ObjectIdGetDatum(datOid));
+ values[0] = ObjectIdGetDatum(datOid);
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
}
- FreeDir(fctx->dirdesc);
- SRF_RETURN_DONE(funcctx);
+ tuplestore_donestoring(tupstore);
+ FreeDir(dirdesc);
+ return (Datum) 0;
}
diff --git a/src/include/funcapi.h b/src/include/funcapi.h
index f9b75ae390..471100b7e5 100644
--- a/src/include/funcapi.h
+++ b/src/include/funcapi.h
@@ -234,7 +234,12 @@ extern Datum HeapTupleHeaderGetDatum(HeapTupleHeader tuple);
/*----------
* Support for Set Returning Functions (SRFs)
*
- * The basic API for SRFs looks something like:
+ * The basic API for SRFs using ValuePerCall mode looks something like this.
+ * Note that an SRF will possibly not be run to completion, due to a LIMIT or a
+ * cursor, and should avoid leaving resources like DIR*s or tablescans opened
+ * across calls. To avoid leaking resources in those cases, instead of using
+ * ValuePerCall mode, the SRF should populate and return a tuplestore with
+ * SFRM_Materialize mode. See fmgr/README.
*
* Datum
* my_Set_Returning_Function(PG_FUNCTION_ARGS)
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index e217b678d7..b56ce74586 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -180,6 +180,24 @@ select count(*) >= 0 as ok from pg_ls_archive_statusdir();
t
(1 row)
+select * from (select pg_ls_dir('.')a)a where a='base' limit 1;
+ a
+------
+ base
+(1 row)
+
+select * from (select (pg_timezone_names()).name)ptn where name='UTC' limit 1;
+ name
+------
+ UTC
+(1 row)
+
+select datname from (select pg_tablespace_databases(b.oid) as pts from pg_tablespace b where b.spcname!='pg_global')pts join pg_database db on pts.pts=db.oid limit 1;
+ datname
+-----------
+ template1
+(1 row)
+
--
-- Test adding a support function to a subject function
--
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 1e11eb3554..5441b8b643 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -51,6 +51,12 @@ from (select pg_ls_waldir() w) ss where length((w).name) = 24 limit 1;
select count(*) >= 0 as ok from pg_ls_archive_statusdir();
+select * from (select pg_ls_dir('.')a)a where a='base' limit 1;
+
+select * from (select (pg_timezone_names()).name)ptn where name='UTC' limit 1;
+
+select datname from (select pg_tablespace_databases(b.oid) as pts from pg_tablespace b where b.spcname!='pg_global')pts join pg_database db on pts.pts=db.oid limit 1;
+
--
-- Test adding a support function to a subject function
--
--
2.17.0
v2-0002-Clean-up-existing-anti-pattern-of-freeing-SRF-res.patchtext/x-diff; charset=us-asciiDownload
From 7942df89eed700dfcf71294e9e0aed821b737281 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Fri, 13 Mar 2020 19:53:42 -0500
Subject: [PATCH v2 2/2] Clean up existing anti-pattern of freeing SRF
resources
See also: e4186762ffaa4188e16702e8f4f299ea70988b96
Discussion: https://www.postgresql.org/message-id/8034.1583699444%40sss.pgh.pa.us
---
contrib/pageinspect/btreefuncs.c | 17 ++++++-----------
contrib/pageinspect/ginfuncs.c | 5 +++--
contrib/pageinspect/hashfuncs.c | 8 +++-----
doc/src/sgml/xfunc.sgml | 12 +++++++++---
src/backend/access/transam/multixact.c | 5 +----
src/backend/tsearch/wparser.c | 13 ++++---------
src/backend/utils/adt/jsonfuncs.c | 16 ++--------------
src/backend/utils/adt/tsvector_op.c | 2 +-
src/include/funcapi.h | 7 ++++---
9 files changed, 33 insertions(+), 52 deletions(-)
diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index e6a2fc1e15..0f244e4f30 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -502,12 +502,9 @@ bt_page_items(PG_FUNCTION_ARGS)
uargs->offset++;
SRF_RETURN_NEXT(fctx, result);
}
- else
- {
- pfree(uargs->page);
- pfree(uargs);
- SRF_RETURN_DONE(fctx);
- }
+
+ /* allocations in multi_call_memory_ctx are released automatically */
+ SRF_RETURN_DONE(fctx);
}
/*-------------------------------------------------------
@@ -590,11 +587,9 @@ bt_page_items_bytea(PG_FUNCTION_ARGS)
uargs->offset++;
SRF_RETURN_NEXT(fctx, result);
}
- else
- {
- pfree(uargs);
- SRF_RETURN_DONE(fctx);
- }
+
+ /* allocations in multi_call_memory_ctx are released automatically */
+ SRF_RETURN_DONE(fctx);
}
/* Number of output arguments (columns) for bt_metap() */
diff --git a/contrib/pageinspect/ginfuncs.c b/contrib/pageinspect/ginfuncs.c
index 7e2cafab74..5109c4b133 100644
--- a/contrib/pageinspect/ginfuncs.c
+++ b/contrib/pageinspect/ginfuncs.c
@@ -260,6 +260,7 @@ gin_leafpage_items(PG_FUNCTION_ARGS)
SRF_RETURN_NEXT(fctx, result);
}
- else
- SRF_RETURN_DONE(fctx);
+
+ /* allocations in multi_call_memory_ctx are released automatically */
+ SRF_RETURN_DONE(fctx);
}
diff --git a/contrib/pageinspect/hashfuncs.c b/contrib/pageinspect/hashfuncs.c
index 984ac33186..d024f1ac7b 100644
--- a/contrib/pageinspect/hashfuncs.c
+++ b/contrib/pageinspect/hashfuncs.c
@@ -374,11 +374,9 @@ hash_page_items(PG_FUNCTION_ARGS)
SRF_RETURN_NEXT(fctx, result);
}
- else
- {
- pfree(uargs);
- SRF_RETURN_DONE(fctx);
- }
+
+ /* allocations in multi_call_memory_ctx are released automatically */
+ SRF_RETURN_DONE(fctx);
}
/* ------------------------------------------------
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 432758e75e..f3e96087fa 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -2861,7 +2861,8 @@ typedef struct FuncCallContext
* OPTIONAL pointer to miscellaneous user-provided context information
*
* user_fctx is for use as a pointer to your own data to retain
- * arbitrary context information between calls of your function.
+ * context information between calls of your function.
+ * It should probably not include resources other than allocated memory.
*/
void *user_fctx;
@@ -2881,7 +2882,8 @@ typedef struct FuncCallContext
* multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
* by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
* context for any memory that is to be reused across multiple calls
- * of the SRF.
+ * of the SRF. Memory palloc'd in this context will be pfree'd
+ * automatically.
*/
MemoryContext multi_call_memory_ctx;
@@ -2935,6 +2937,7 @@ SRF_RETURN_NEXT(funcctx, result)
SRF_RETURN_DONE(funcctx)
</programlisting>
to clean up and end the <acronym>SRF</acronym>.
+ Allocations within multi_call_memory_ctx are automatically pfree'd.
</para>
<para>
@@ -3004,7 +3007,10 @@ my_set_returning_function(PG_FUNCTION_ARGS)
}
else
{
- /* Here we are done returning items and just need to clean up: */
+ /*
+ * Here we are done returning items and just need to clean up, but no
+ * need to pfree allocations within multi_call_memory_ctx.
+ */
<replaceable>user code</replaceable>
SRF_RETURN_DONE(funcctx);
}
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 50e98caaeb..cb92e049aa 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3388,9 +3388,6 @@ pg_get_multixact_members(PG_FUNCTION_ARGS)
SRF_RETURN_NEXT(funccxt, HeapTupleGetDatum(tuple));
}
- if (multi->nmembers > 0)
- pfree(multi->members);
- pfree(multi);
-
+ /* allocations in multi_call_memory_ctx are released automatically */
SRF_RETURN_DONE(funccxt);
}
diff --git a/src/backend/tsearch/wparser.c b/src/backend/tsearch/wparser.c
index 88005c0519..85eaf041d6 100644
--- a/src/backend/tsearch/wparser.c
+++ b/src/backend/tsearch/wparser.c
@@ -104,9 +104,8 @@ tt_process_call(FuncCallContext *funcctx)
st->cur++;
return result;
}
- if (st->list)
- pfree(st->list);
- pfree(st);
+
+ /* allocations in multi_call_memory_ctx are released automatically */
return (Datum) 0;
}
@@ -245,12 +244,8 @@ prs_process_call(FuncCallContext *funcctx)
st->cur++;
return result;
}
- else
- {
- if (st->list)
- pfree(st->list);
- pfree(st);
- }
+
+ /* allocations in multi_call_memory_ctx are released automatically */
return (Datum) 0;
}
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index f92861d8d2..c5a286dd58 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -535,7 +535,6 @@ jsonb_object_keys(PG_FUNCTION_ARGS)
{
FuncCallContext *funcctx;
OkeysState *state;
- int i;
if (SRF_IS_FIRSTCALL())
{
@@ -598,12 +597,7 @@ jsonb_object_keys(PG_FUNCTION_ARGS)
SRF_RETURN_NEXT(funcctx, CStringGetTextDatum(nxt));
}
- /* cleanup to reduce or eliminate memory leaks */
- for (i = 0; i < state->result_count; i++)
- pfree(state->result[i]);
- pfree(state->result);
- pfree(state);
-
+ /* allocations in multi_call_memory_ctx are released automatically */
SRF_RETURN_DONE(funcctx);
}
@@ -706,7 +700,6 @@ json_object_keys(PG_FUNCTION_ARGS)
{
FuncCallContext *funcctx;
OkeysState *state;
- int i;
if (SRF_IS_FIRSTCALL())
{
@@ -755,12 +748,7 @@ json_object_keys(PG_FUNCTION_ARGS)
SRF_RETURN_NEXT(funcctx, CStringGetTextDatum(nxt));
}
- /* cleanup to reduce or eliminate memory leaks */
- for (i = 0; i < state->result_count; i++)
- pfree(state->result[i]);
- pfree(state->result);
- pfree(state);
-
+ /* allocations in multi_call_memory_ctx are released automatically */
SRF_RETURN_DONE(funcctx);
}
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 108dd998c7..e7b8fbc3ca 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -706,7 +706,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
}
else
{
- pfree(tsin);
+ /* allocations in multi_call_memory_ctx are released automatically */
SRF_RETURN_DONE(funcctx);
}
}
diff --git a/src/include/funcapi.h b/src/include/funcapi.h
index 471100b7e5..8f6db97fcf 100644
--- a/src/include/funcapi.h
+++ b/src/include/funcapi.h
@@ -77,7 +77,7 @@ typedef struct FuncCallContext
* OPTIONAL pointer to miscellaneous user-provided context information
*
* user_fctx is for use as a pointer to your own struct to retain
- * arbitrary context information between calls of your function.
+ * context information between calls of your function.
*/
void *user_fctx;
@@ -93,8 +93,9 @@ typedef struct FuncCallContext
/*
* memory context used for structures that must live for multiple calls
*
- * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
- * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
+ * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and
+ * automatically cleaned up by SRF_RETURN_DONE(). It is the most
+ * appropriate memory
* context for any memory that is to be reused across multiple calls of
* the SRF.
*/
--
2.17.0
Justin Pryzby <pryzby@telsasoft.com> writes:
v2 attached - I will add to next CF in case you want to defer it until later.
Thanks, reviewed and pushed. Since this is a bug fix (at least in part)
I didn't want to wait.
regards, tom lane
On Mon, Mar 16, 2020 at 09:38:50PM -0400, Tom Lane wrote:
Justin Pryzby <pryzby@telsasoft.com> writes:
v2 attached - I will add to next CF in case you want to defer it until later.
Thanks, reviewed and pushed. Since this is a bug fix (at least in part)
I didn't want to wait.
Thanks for fixing my test case and pushing.
--
Justin
Justin Pryzby <pryzby@telsasoft.com> writes:
Thanks for fixing my test case and pushing.
The buildfarm just showed up another instability in the test cases
we added:
=========================== regression.diffs ================
diff -U3 /home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/../pgsql/src/test/regress/expected/misc_functions.out /home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/src/bin/pg_upgrade/tmp_check/regress/results/misc_functions.out
--- /home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/../pgsql/src/test/regress/expected/misc_functions.out 2020-03-17 08:14:50.292037956 +0100
+++ /home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/src/bin/pg_upgrade/tmp_check/regress/results/misc_functions.out 2020-03-28 13:55:12.490024822 +0100
@@ -169,11 +169,7 @@
select (w).size = :segsize as ok
from (select pg_ls_waldir() w) ss where length((w).name) = 24 limit 1;
- ok
-----
- t
-(1 row)
-
+ERROR: could not stat file "pg_wal/000000010000000000000078": No such file or directory
select count(*) >= 0 as ok from pg_ls_archive_statusdir();
ok
----
It's pretty obvious what happened here: concurrent activity renamed or
removed the WAL segment between when we saw it in the directory and
when we tried to stat() it.
This seems like it would be just as much of a hazard for field usage
as it is for regression testing, so I propose that we fix these
directory-scanning functions to silently ignore ENOENT failures from
stat(). Are there any for which we should not do that?
regards, tom lane
On Sat, Mar 28, 2020 at 01:13:54PM -0400, Tom Lane wrote:
The buildfarm just showed up another instability in the test cases
we added:
Yea, as you said, this is an issue with the *testcase*. The function behavior
didn't change, we just weren't previously exercising it.
select (w).size = :segsize as ok from (select pg_ls_waldir() w) ss where length((w).name) = 24 limit 1; - ok ----- - t -(1 row) - +ERROR: could not stat file "pg_wal/000000010000000000000078": No such file or directory select count(*) >= 0 as ok from pg_ls_archive_statusdir(); ok ----It's pretty obvious what happened here: concurrent activity renamed or
removed the WAL segment between when we saw it in the directory and
when we tried to stat() it.This seems like it would be just as much of a hazard for field usage
as it is for regression testing,
That's clearly true for pg_ls_waldir(), which call pg_ls_dir_files, and
includes some metadata columns.
so I propose that we fix these directory-scanning functions to silently
ignore ENOENT failures from stat(). Are there any for which we should not do
that?
I think it's reasonable to ignore transient ENOENT for tmpdir, logdir, and
probably archive_statusdir. That doesn't currently affect pg_ls_dir(), which
lists file but not metadata for an arbitrary dir, so doesn't call stat().
Note that dangling links in the other functions currently cause (wrong [0]Which you fixed in 085b6b667 and I previously fixed at: /messages/by-id/attachment/106478/v2-0001-BUG-in-errmsg.patch |$ sudo ln -s foo /var/log/postgresql/bar |ts=# SELECT * FROM pg_ls_logdir() ORDER BY 3; |ERROR: could not stat directory "/var/log/postgresql": No such file or directory)
error. I guess it should be documented if broken link is will be ignored due
to ENOENT.
Maybe we should lstat() the file to determine if it's a dangling link; if
lstat() fails, then skip it. Currently, we use stat(), which shows metdata of
a link's *target*. Maybe we'd change that.
Note that I have a patch which generalizes pg_ls_dir_files and makes
pg_ls_dir() a simple wrapper, so if that's pursued, they would behave the same
unless I add another flag to do otherwise (but behaving the same has its
merits). It already uses lstat() to show links to dirs as isdir=no, which was
needed to avoid recursing into links-to-dirs in the new helper function
pg_ls_dir_recurse(). https://commitfest.postgresql.org/26/2377/
[0]: Which you fixed in 085b6b667 and I previously fixed at: /messages/by-id/attachment/106478/v2-0001-BUG-in-errmsg.patch |$ sudo ln -s foo /var/log/postgresql/bar |ts=# SELECT * FROM pg_ls_logdir() ORDER BY 3; |ERROR: could not stat directory "/var/log/postgresql": No such file or directory
/messages/by-id/attachment/106478/v2-0001-BUG-in-errmsg.patch
|$ sudo ln -s foo /var/log/postgresql/bar
|ts=# SELECT * FROM pg_ls_logdir() ORDER BY 3;
|ERROR: could not stat directory "/var/log/postgresql": No such file or directory
Justin Pryzby <pryzby@telsasoft.com> writes:
On Sat, Mar 28, 2020 at 01:13:54PM -0400, Tom Lane wrote:
so I propose that we fix these directory-scanning functions to silently
ignore ENOENT failures from stat(). Are there any for which we should not do
that?
Maybe we should lstat() the file to determine if it's a dangling link; if
lstat() fails, then skip it. Currently, we use stat(), which shows metdata of
a link's *target*. Maybe we'd change that.
Hm, good point that ENOENT could refer to a symlink's target. Still,
I'm not sure it's worth going out of our way to disambiguate that,
given that these directories aren't really supposed to contain symlinks.
(And on the third hand, if they aren't supposed to, then maybe these
functions needn't look through any symlinks? In which case just
substituting lstat for stat would resolve the ambiguity.)
Note that I have a patch which generalizes pg_ls_dir_files and makes
pg_ls_dir() a simple wrapper, so if that's pursued, they would behave the same
unless I add another flag to do otherwise (but behaving the same has its
merits). It already uses lstat() to show links to dirs as isdir=no, which was
needed to avoid recursing into links-to-dirs in the new helper function
pg_ls_dir_recurse(). https://commitfest.postgresql.org/26/2377/
I think we need a back-patchable fix for the ENOENT failure, seeing that
we back-patched the new regression test; intermittent buildfarm failures
are no fun in any branch. So new functions aren't too relevant here,
although it's fair to look ahead at whether the same behavior will be
appropriate for them.
regards, tom lane
I wrote:
Justin Pryzby <pryzby@telsasoft.com> writes:
Maybe we should lstat() the file to determine if it's a dangling link; if
lstat() fails, then skip it. Currently, we use stat(), which shows metdata of
a link's *target*. Maybe we'd change that.
Hm, good point that ENOENT could refer to a symlink's target. Still,
I'm not sure it's worth going out of our way to disambiguate that,
given that these directories aren't really supposed to contain symlinks.
(And on the third hand, if they aren't supposed to, then maybe these
functions needn't look through any symlinks? In which case just
substituting lstat for stat would resolve the ambiguity.)
After looking at the callers of pg_ls_dir_files, and noticing that
it's already defined to ignore anything that's not a regular file,
I think switching to lstat makes sense.
I also grepped the other uses of ReadDir[Extended], and didn't see
any other ones that seemed desperately in need of changing.
So the attached seems like a sufficient fix.
regards, tom lane
Attachments:
ignore-concurrent-removals-in-pg_ls_dir_files.patchtext/x-diff; charset=us-ascii; name=ignore-concurrent-removals-in-pg_ls_dir_files.patchDownload
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index 01185f2..8429a12 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -596,10 +596,15 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
/* Get the file info */
snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
- if (stat(path, &attrib) < 0)
+ if (lstat(path, &attrib) < 0)
+ {
+ /* Ignore concurrently-deleted files, else complain */
+ if (errno == ENOENT)
+ continue;
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not stat file \"%s\": %m", path)));
+ }
/* Ignore anything but regular files */
if (!S_ISREG(attrib.st_mode))
On Sun, Mar 29, 2020 at 12:37:05PM -0400, Tom Lane wrote:
I wrote:
Justin Pryzby <pryzby@telsasoft.com> writes:
Maybe we should lstat() the file to determine if it's a dangling link; if
lstat() fails, then skip it. Currently, we use stat(), which shows metdata of
a link's *target*. Maybe we'd change that.Hm, good point that ENOENT could refer to a symlink's target. Still,
I'm not sure it's worth going out of our way to disambiguate that,
given that these directories aren't really supposed to contain symlinks.
(And on the third hand, if they aren't supposed to, then maybe these
functions needn't look through any symlinks? In which case just
substituting lstat for stat would resolve the ambiguity.)After looking at the callers of pg_ls_dir_files, and noticing that
it's already defined to ignore anything that's not a regular file,
I think switching to lstat makes sense.
Yea, only pg_ls_dir() shows special file types (and currently the others even
hide dirs).
The essence of your patch is to ignore ENOENT, but you also changed to use
lstat(), which seems unrelated. That means we'll now hide (non-broken)
symlinks. Is that intentional/needed ? I guess maybe you're trying to fix the
bug (?) that symlinks aren't skipped? If so, I guess it should be a separate
commit, or the commit message should say so. I think the doc update is already
handled by: 8b6d94cf6c8319bfd6bebf8b863a5db586c19c3b (we didn't used to say we
skipped specials, and now we say we do, and we'll to follow through RSN and
actually do it, too).
Show quoted text
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c index 01185f2..8429a12 100644 --- a/src/backend/utils/adt/genfile.c +++ b/src/backend/utils/adt/genfile.c @@ -596,10 +596,15 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)/* Get the file info */ snprintf(path, sizeof(path), "%s/%s", dir, de->d_name); - if (stat(path, &attrib) < 0) + if (lstat(path, &attrib) < 0) + { + /* Ignore concurrently-deleted files, else complain */ + if (errno == ENOENT) + continue; ereport(ERROR, (errcode_for_file_access(), errmsg("could not stat file \"%s\": %m", path))); + }/* Ignore anything but regular files */
if (!S_ISREG(attrib.st_mode))
Justin Pryzby <pryzby@telsasoft.com> writes:
On Sun, Mar 29, 2020 at 12:37:05PM -0400, Tom Lane wrote:
After looking at the callers of pg_ls_dir_files, and noticing that
it's already defined to ignore anything that's not a regular file,
I think switching to lstat makes sense.
Yea, only pg_ls_dir() shows special file types (and currently the others even
hide dirs).
The essence of your patch is to ignore ENOENT, but you also changed to use
lstat(), which seems unrelated. That means we'll now hide (non-broken)
symlinks. Is that intentional/needed ?
Well, the following comment says "ignore anything but regular files",
so I'm supposing that that is the behavior that we actually want here
and failed to implement correctly. There might be scope for
additional directory-reading functions, but I'd think you'd want
more information (such as the file type) returned from anything
that doesn't act this way.
In practice, since these directories shouldn't contain symlinks,
it's likely moot. The only place in PG data directories where
we actually expect symlinks is pg_tablespace ... and that contains
symlinks to directories, so that this function would ignore them
anyway.
regards, tom lane
On Sun, Mar 29, 2020 at 01:22:04PM -0400, Tom Lane wrote:
Justin Pryzby <pryzby@telsasoft.com> writes:
On Sun, Mar 29, 2020 at 12:37:05PM -0400, Tom Lane wrote:
After looking at the callers of pg_ls_dir_files, and noticing that
it's already defined to ignore anything that's not a regular file,
I think switching to lstat makes sense.Yea, only pg_ls_dir() shows special file types (and currently the others even
hide dirs).The essence of your patch is to ignore ENOENT, but you also changed to use
lstat(), which seems unrelated. That means we'll now hide (non-broken)
symlinks. Is that intentional/needed ?Well, the following comment says "ignore anything but regular files",
so I'm supposing that that is the behavior that we actually want here
and failed to implement correctly. There might be scope for
additional directory-reading functions, but I'd think you'd want
more information (such as the file type) returned from anything
that doesn't act this way.
Maybe pg_stat_file() deserves similar attention ? Right now, it'll fail on a
broken link. If we changed it to lstat(), then it'd work, but it'd also show
metadata for the *link* rather than its target.
Patch proposed as v14-0001 patch here may be relevant:
https://www.postgresql.org/message-id/20200317190401.GY26184%40telsasoft.com
- indicating if it is a directory. Typical usages include:
+ indicating if it is a directory (or a symbolic link to a directory).
...
In practice, since these directories shouldn't contain symlinks,
it's likely moot. The only place in PG data directories where
we actually expect symlinks is pg_tablespace ... and that contains
symlinks to directories, so that this function would ignore them
anyway.
I wouldn't hesitate to make symlinks, at least in log. It's surprising when
files are hidden, but I won't argue about the best behavior here.
I'm thinking of distributions or local configurations that use
/var/log/postgresql. I didn't remember or didn't realize, but it looks like
debian's packages use logging_collector=off and then launch postmaster with
2> /var/log/postgres/... It seems reasonable to do something like:
log/huge-querylog.csv => /zfs/compressed/...
--
Justin
Hello Justin,
Well, the following comment says "ignore anything but regular files",
so I'm supposing that that is the behavior that we actually want here
and failed to implement correctly. There might be scope for
additional directory-reading functions, but I'd think you'd want
more information (such as the file type) returned from anything
that doesn't act this way.Maybe pg_stat_file() deserves similar attention ? Right now, it'll fail on a
broken link. If we changed it to lstat(), then it'd work, but it'd also show
metadata for the *link* rather than its target.
Yep. I think this traditional answer is the rational answer.
As I wrote about an earlier version of the patch, ISTM that instead of
reinventing, extending, adapting various ls variants (with/without
metadata, which show only files, which shows target of links, which shows
directory, etc.) we would just need *one* postgres "ls" implementation
which would be like "ls -la arg" (returns file type, dates), and then
everything else is a wrapper around that with appropriate filtering that
can be done at the SQL level, like you started with recurse.
It would reduce the amount of C code and I find the SQL-level approach
quite elegant.
--
Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes:
As I wrote about an earlier version of the patch, ISTM that instead of
reinventing, extending, adapting various ls variants (with/without
metadata, which show only files, which shows target of links, which shows
directory, etc.) we would just need *one* postgres "ls" implementation
which would be like "ls -la arg" (returns file type, dates), and then
everything else is a wrapper around that with appropriate filtering that
can be done at the SQL level, like you started with recurse.
Yeah, I agree that some new function that can represent symlinks
explicitly in its output is the place to deal with this, for
people who want to deal with it.
In the meantime, there's still the question of what pg_ls_dir_files
should do exactly. Are we content to have it ignore symlinks?
I remain inclined to think that's the right thing given its current
brief.
regards, tom lane
Hello,
As I wrote about an earlier version of the patch, ISTM that instead of
reinventing, extending, adapting various ls variants (with/without
metadata, which show only files, which shows target of links, which shows
directory, etc.) we would just need *one* postgres "ls" implementation
which would be like "ls -la arg" (returns file type, dates), and then
everything else is a wrapper around that with appropriate filtering that
can be done at the SQL level, like you started with recurse.Yeah, I agree that some new function that can represent symlinks
explicitly in its output is the place to deal with this, for
people who want to deal with it.In the meantime, there's still the question of what pg_ls_dir_files
should do exactly. Are we content to have it ignore symlinks?
I remain inclined to think that's the right thing given its current
brief.
My 0.02€:
I agree that it is enough to reproduce the current behavior of various
existing pg_ls* functions, but on the other hand outputing a column type
char like ls (-, d, l…) looks like really no big deal. I'd say that the
only reason not to do it may be to pass this before feature freeze.
--
Fabien.
On Tue, Mar 31, 2020 at 07:36:03AM +0200, Fabien COELHO wrote:
As I wrote about an earlier version of the patch, ISTM that instead of
reinventing, extending, adapting various ls variants (with/without
metadata, which show only files, which shows target of links, which shows
directory, etc.) we would just need *one* postgres "ls" implementation
which would be like "ls -la arg" (returns file type, dates), and then
everything else is a wrapper around that with appropriate filtering that
can be done at the SQL level, like you started with recurse.Yeah, I agree that some new function that can represent symlinks
explicitly in its output is the place to deal with this, for
people who want to deal with it.In the meantime, there's still the question of what pg_ls_dir_files
should do exactly. Are we content to have it ignore symlinks?
I remain inclined to think that's the right thing given its current
brief.My 0.02€:
I agree that it is enough to reproduce the current behavior of various
existing pg_ls* functions, but on the other hand outputing a column type
char like ls (-, d, l…) looks like really no big deal. I'd say that the only
reason not to do it may be to pass this before feature freeze.
Remember, there's two threads here, and this one is about the bug in stable
releases ($SUBJECT), and now the instability in the test that was added with
its fix.
I suggest to leave stat() alone in your patch for stable releases. I think
it's okay if we change behavior so that a broken symlink is skipped instead of
erroring (as a side effect of skipping ENOENT with stat()). But not okay if we
change pg_ls_logdir() to hide symlinks in back braches.
--
Justin
Justin Pryzby <pryzby@telsasoft.com> writes:
I suggest to leave stat() alone in your patch for stable releases. I think
it's okay if we change behavior so that a broken symlink is skipped instead of
erroring (as a side effect of skipping ENOENT with stat()). But not okay if we
change pg_ls_logdir() to hide symlinks in back braches.
Meh. I'm not really convinced, but in the absence of anyone expressing
support for my position, I'll do it that way. I don't think it's worth
doing both a stat and lstat to tell the difference between file-is-gone
and file-is-a-broken-symlink.
regards, tom lane
Hello hackers,
31.03.2020 19:41, Tom Lane wrote:
Justin Pryzby <pryzby@telsasoft.com> writes:
I suggest to leave stat() alone in your patch for stable releases. I think
it's okay if we change behavior so that a broken symlink is skipped instead of
erroring (as a side effect of skipping ENOENT with stat()). But not okay if we
change pg_ls_logdir() to hide symlinks in back braches.Meh. I'm not really convinced, but in the absence of anyone expressing
support for my position, I'll do it that way. I don't think it's worth
doing both a stat and lstat to tell the difference between file-is-gone
and file-is-a-broken-symlink.
As we've discovered in Bug #[16161]/messages/by-id/16161-7a985d2f1bbe8f71@postgresql.org, stat() for "concurrently-deleted
file" can also return ERROR_ACCESS_DENIED on Windows. It seems that
pg_upgradeCheck failures seen on
https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=fairywren&br=REL_13_STABLE
caused by the same issue.
Shouldn't pg_ls_dir_files() retry stat() on ERROR_ACCESS_DENIED just
like the pgwin32_open() does to ignore files in "delete pending" state?
[16161]: /messages/by-id/16161-7a985d2f1bbe8f71@postgresql.org
/messages/by-id/16161-7a985d2f1bbe8f71@postgresql.org
Best regards,
Alexander
Alexander Lakhin <exclusion@gmail.com> writes:
Shouldn't pg_ls_dir_files() retry stat() on ERROR_ACCESS_DENIED just
like the pgwin32_open() does to ignore files in "delete pending" state?
That would soon lead us to changing every stat() caller in the system
to have Windows-specific looping logic. No thanks. If we need to do
this, let's put in a Windows wrapper layer comparable to pgwin32_open()
for open().
regards, tom lane
13.11.2020 23:16, Tom Lane wrote:
Alexander Lakhin <exclusion@gmail.com> writes:
Shouldn't pg_ls_dir_files() retry stat() on ERROR_ACCESS_DENIED just
like the pgwin32_open() does to ignore files in "delete pending" state?That would soon lead us to changing every stat() caller in the system
to have Windows-specific looping logic. No thanks. If we need to do
this, let's put in a Windows wrapper layer comparable to pgwin32_open()
for open().
Maybe pgwin32_safestat() should perform this... For now it checks
GetLastError() for ERROR_DELETE_PENDING, but as we've found out
previously this error code in fact is not returned by the OS.
And if the fix is not going to be quick, probably it should be discussed
in another thread...
Best regards,
Alexander
Alexander Lakhin <exclusion@gmail.com> writes:
13.11.2020 23:16, Tom Lane wrote:
That would soon lead us to changing every stat() caller in the system
to have Windows-specific looping logic. No thanks. If we need to do
this, let's put in a Windows wrapper layer comparable to pgwin32_open()
for open().
Maybe pgwin32_safestat() should perform this...
Uh ... now that you mention it, that's gone since bed90759f.
There is code in win32stat.c that purports to cope with this case, though
I've not tested it personally.
regards, tom lane