make tuplestore helper function
Attached is a patch to create a helper function which creates a
tuplestore to be used by FUNCAPI-callable functions.
It was suggested in [1]/messages/by-id/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de that I start a separate thread for this patch.
A few notes:
There are a few places with very similar code to the new helper
(MakeTuplestore()) but which, for example, don't expect the return type
to be a row (pg_ls_dir(), pg_tablespace_databases(), text_to_table()). I
wasn't sure if it was worth modifying it for their benefit.
I wasn't sure if the elog() check for call result type should be changed
to an ereport().
All callers of MakeTuplestore() pass work_mem as the third parameter, so
I could just omit it and hard-code it in, but I wasn't sure that felt
right in a utility/helper function.
I inlined deflist_to_tuplestore() in foreign.c since it was billed as a
helper function but wasn't used elsewhere and it made it harder to use
MakeTuplestore() in this location.
-- melanie
[1]: /messages/by-id/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
Attachments:
v1-0001-Add-helper-to-make-tuplestore.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Add-helper-to-make-tuplestore.patchDownload+72-601
On 2021-Nov-02, Melanie Plageman wrote:
Attached is a patch to create a helper function which creates a
tuplestore to be used by FUNCAPI-callable functions.
Looks good, and given the amount of code being removed, it seems a
no-brainer that some helper(s) is/are needed.
I think the name is a bit too generic. How about
MakeFuncResultTuplestore()?
I think the function should not modify rsinfo -- having that as
(undocumented) side effect is weird. I would suggest to just return the
tuplestore, and have the caller stash it in rsinfo->setResult and modify
the other struct members alongside that. It's a couple more lines per
caller, but the code is clearer that way.
There are a few places with very similar code to the new helper
(MakeTuplestore()) but which, for example, don't expect the return type
to be a row (pg_ls_dir(), pg_tablespace_databases(), text_to_table()). I
wasn't sure if it was worth modifying it for their benefit.
Is this just because of the tupdesc? If that's the case, then maybe the
tupdesc can be optionally passed in by caller, and when it's not, the
helper does get_call_result_type() and the tupdesc is used as output
argument.
I wasn't sure if the elog() check for call result type should be changed
to an ereport().
I don't see a reason for that: it's still an internal (not user-facing) error.
All callers of MakeTuplestore() pass work_mem as the third parameter, so
I could just omit it and hard-code it in, but I wasn't sure that felt
right in a utility/helper function.
No opinion. I would have used the global in the function instead of
passing it in.
I inlined deflist_to_tuplestore() in foreign.c since it was billed as a
helper function but wasn't used elsewhere and it made it harder to use
MakeTuplestore() in this location.
This is a bit strange. Does it work to pass rsinfo->expectedDesc?
--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
"No renuncies a nada. No te aferres a nada."
Several places have a conditional value for the first argument (randomAccess),
but your patch changes the behavior to a constant "true". I didn't review the
patch beyond that.
@@ -740,18 +724,14 @@ pg_prepared_statement(PG_FUNCTION_ARGS) - tupstore = - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, - false, work_mem);
@@ -2701,42 +2701,13 @@ pg_hba_file_rules(PG_FUNCTION_ARGS) - tuple_store = - tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random, - false, work_mem);
@@ -4799,31 +4797,8 @@ pg_timezone_names(PG_FUNCTION_ARGS) - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
@@ -575,38 +575,12 @@ pg_ls_dir_1arg(PG_FUNCTION_ARGS) - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
@@ -1170,17 +1154,12 @@ pg_cursor(PG_FUNCTION_ARGS) - tupstore = - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, - false, work_mem);
+++ b/src/backend/utils/fmgr/funcapi.c + tupstore = tuplestore_begin_heap(true, false, maxKBytes);
--
Justin
The attached patch contains a helper now named
MakeFuncResultTuplestore() which aims to eliminate a few lines of
redundant code that all FUNCAPI functions making tuplestores repeated.
While investigating all of these call sites and making the changes
suggested by Álvaro, I noticed that the functions making tuplestores
broke out into seven groups. For some of them, using my helper changes
what memory context the tuple descriptor is made in.
Explanation of how/why memory context tuple descriptor is in has
changed:
An earlier version of this patch changed into the per query memory
context inside of the helper function. This version does not do so but
instead asserts that the caller has changed into the appropriate memory
context.
For callers which previously did not call get_call_result_type() and
instead used a function like CreateTupleDescCopy() to create the tuple
descriptor which was used for the tuples in the tuplestore, most of them
made this copy in the per query memory context. For these callers it
seemed error-prone to switch to the per query memory context to create
the copied tuple descriptor and then switch back before calling the
helper function. Instead, the helper function will assert that the
caller has already switched into the per query memory context so that
the tuplestore is made in the per query context.
This means that for the group of functions which, prior to this patch,
called get_call_result_type() in a shorter-lived memory context but now
rely on the helper function to call get_call_result_type(), the tuple
descriptor will now be in the per query memory context.
I think that this makes sense since most of these are set as the
ResultSetInfo->setDesc (the tuple descriptor for returned tuples).
Explanation of the different groups of functions using the
MakeFuncResultTuplestore() helper:
The first group are functions which called get_call_result_type(),
allocating memory for the tuple descriptor in the short-lived memory
context, then switched into the per query memory context before
creating the tuplestore, then switched back to the short-lived memory
context directly after making the tuplestore.
Now that they use the helper function and switch into the per query
memory context before calling it, the tuple descriptor is allocated in
the per query memory context.
The first group:
pg_stop_backup_v2()
pg_event_trigger_dropped_objects()
pg_event_trigger_ddl_commands()
pg_available_extensions()
pg_available_extension_versions()
pg_extension_update_paths()
pg_hba_file_rules()
pg_stat_get_subscription()
pg_logical_slot_get_changes_guts()
pg_show_replication_origin_status()
pg_get_replication_slots()
pg_stat_get_wal_senders()
pg_get_shmem_allocations()
pg_timezone_names()
pg_ls_dir_files()
pg_get_backend_memory_contexts()
pg_stat_get_progress_info()
pg_stat_get_activity()
pg_stat_get_slru()
The second group are functions which, instead of calling
get_call_result_type(), call CreateTemplateTupleDesc() and
TupleDescInitEntry() to allocate and initialize the tuple descriptor in
the per query memory context.
Their memory usage is unchanged by this patch.
The second group:
pg_prepared_statement()
pg_ls_dir()
pg_tablespace_databases()
show_all_file_settings()
pg_cursor()
The third and fourth groups are functions that use CreateTupleDescCopy()
to make a tuple descriptor copy from ResultSetInfo->expectedDesc.
These functions made a copy of ResultSetInfo->expectedDesc but did not
check if it was present before doing so (and do not seem to call a
function which would call internal_get_result_type() -- which does this
check), so I have added that check.
Their memory usage is unchanged by this patch.
The third group:
text_to_table()
The fourth group differs from the third in that the values for the
tuples are actually created in the per query memory context -- seemingly
unnecessarily since tuplestore_putvalues() will eventually end up
copying everything when it is forming tuples. This behavior is not
altered at all by the MakeFuncResultTuplestore() helper function patch,
however I wanted to mention that I noticed that perhaps more was being
allocated in the per query memory context than is necessary.
The fourth group:
pg_options_to_table()
pg_config()
The fifth group is a function that uses CreateTupleDescCoy() to make a
copy of a tuple descriptor (NOT ResultSetInfo->expectedDesc) in the
function info memory context (fcinfo->flinfo->fn_mcxt) so that it can be
accessed later.
Their memory usage is unchanged by this patch.
The fifth group:
populate_recordset_worker()
The sixth group of functions use CreateTupleDescCopy() to make a copy of
the tuple descriptor that they set up using get_call_result_type().
They called get_call_result_type() in a short-lived memory context, then
switched to the per query memory context and made a copy of it right
away. It is unclear why they couldn't just call get_call_result_type()
in the per query context and not make a copy. Perhaps they needed some
specific behavior of the copy (like that it doesn't copy constraints and
defaults?).
With the attached patch, the initial tuple descriptor is now made in the
per query memory context (as well as the copy). I wonder if the initial
tuple descriptor being made in the per query memory context negated the
need for a copy, however I left the copy intact since I wasn't sure.
These functions also checked for the presence of
ResultSetInfo->expectedDesc which is not applicable for all callers of
MakeFuncResultTuplestore(), so I separated it into a separate check.
get_call_result_type() will call internal_get_result_type() which will
check for expectedDesc if the result type is TYPEFUNC_RECORD, so I'm not
sure if the expectedDesc check is required or not in the body of these
functions.
The sixth group:
each_worker()
each_worker_jsonb()
The seventh group of functions use CreateTupleDescCopy() to make a copy of
ResultSetInfo->expectedDesc in the per query memory context.
This patch does not change any of that. The only difference is that now
the check for expectedDesc is separate.
Their memory usage is unchanged by this patch.
The seventh group:
elements_worker_jsonb()
elements_worker()
Note that for all of the functions in which I added or changed the check for
ResultSetInfo->expectedDesc, it is a little bit different because I've changed
the error code.
I've used error code: ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED where they
previously would have been part of an ereport with error code
ERRCODE_FEATURE_NOT_SUPPORTED.
I'm not sure if the new one is correct. In fact, I'm not sure if an error
related to expectedDesc being set should be user-facing at all (looking at how
expectedDesc is used and set, I wasn't sure). Perhaps it should be an elog() or
an assert?
Feedback is addressed inline below.
On Tue, Nov 2, 2021 at 2:17 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Nov-02, Melanie Plageman wrote:
Attached is a patch to create a helper function which creates a
tuplestore to be used by FUNCAPI-callable functions.Looks good, and given the amount of code being removed, it seems a
no-brainer that some helper(s) is/are needed.I think the name is a bit too generic. How about
MakeFuncResultTuplestore()?
I've done this rename.
I think the function should not modify rsinfo -- having that as
(undocumented) side effect is weird. I would suggest to just return the
tuplestore, and have the caller stash it in rsinfo->setResult and modify
the other struct members alongside that. It's a couple more lines per
caller, but the code is clearer that way.
I've changed the helper not to modify rsinfo.
There are a few places with very similar code to the new helper
(MakeTuplestore()) but which, for example, don't expect the return type
to be a row (pg_ls_dir(), pg_tablespace_databases(), text_to_table()). I
wasn't sure if it was worth modifying it for their benefit.Is this just because of the tupdesc? If that's the case, then maybe the
tupdesc can be optionally passed in by caller, and when it's not, the
helper does get_call_result_type() and the tupdesc is used as output
argument.
I've parameterized the helper in this way and now am able to use it for
all FUNCAPI-callable functions making tuplestores except fmgr_sql (for
SQL functions) in functions.c because it didn't really seem worth it
given the way the code is written.
All callers of MakeTuplestore() pass work_mem as the third parameter, so
I could just omit it and hard-code it in, but I wasn't sure that felt
right in a utility/helper function.No opinion. I would have used the global in the function instead of
passing it in.
I've changed this to use the global.
I inlined deflist_to_tuplestore() in foreign.c since it was billed as a
helper function but wasn't used elsewhere and it made it harder to use
MakeTuplestore() in this location.This is a bit strange. Does it work to pass rsinfo->expectedDesc?
This is addressed by changing MakeFuncResultTuplestore() call
get_call_result_type() optionally and having pg_options_to_table() call
it with NULL as tuple descriptor. Now the behavior is the same -- it
makes a copy of expectedDesc in the per query context and sets that as
setDesc in the ReturnSetInfo.
- Melanie
Attachments:
v2-0001-Add-helper-to-make-tuplestore.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Add-helper-to-make-tuplestore.patchDownload+118-459
On Tue, Nov 2, 2021 at 4:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
Several places have a conditional value for the first argument (randomAccess),
but your patch changes the behavior to a constant "true". I didn't review the
patch beyond that.@@ -740,18 +724,14 @@ pg_prepared_statement(PG_FUNCTION_ARGS) - tupstore = - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, - false, work_mem);@@ -2701,42 +2701,13 @@ pg_hba_file_rules(PG_FUNCTION_ARGS) - tuple_store = - tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random, - false, work_mem);@@ -4799,31 +4797,8 @@ pg_timezone_names(PG_FUNCTION_ARGS) - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);@@ -575,38 +575,12 @@ pg_ls_dir_1arg(PG_FUNCTION_ARGS) - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);@@ -1170,17 +1154,12 @@ pg_cursor(PG_FUNCTION_ARGS) - tupstore = - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, - false, work_mem);+++ b/src/backend/utils/fmgr/funcapi.c + tupstore = tuplestore_begin_heap(true, false, maxKBytes);
I believe the patch has preserved the same behavior. All of the callers
for which I replaced tuplestore_begin_heap() which passed a variable for
the randomAccess parameter had set that variable to something which was
effectively the same as passing true -- SFRM_Materialize_Random.
- Melanie
On Mon, Nov 08, 2021 at 02:52:28PM -0500, Melanie Plageman wrote:
On Tue, Nov 2, 2021 at 4:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
Several places have a conditional value for the first argument (randomAccess),
but your patch changes the behavior to a constant "true". I didn't review the
patch beyond that.@@ -740,18 +724,14 @@ pg_prepared_statement(PG_FUNCTION_ARGS) - tupstore = - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, - false, work_mem);@@ -2701,42 +2701,13 @@ pg_hba_file_rules(PG_FUNCTION_ARGS) - tuple_store = - tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random, - false, work_mem);@@ -4799,31 +4797,8 @@ pg_timezone_names(PG_FUNCTION_ARGS) - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);@@ -575,38 +575,12 @@ pg_ls_dir_1arg(PG_FUNCTION_ARGS) - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);@@ -1170,17 +1154,12 @@ pg_cursor(PG_FUNCTION_ARGS) - tupstore = - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, - false, work_mem);+++ b/src/backend/utils/fmgr/funcapi.c + tupstore = tuplestore_begin_heap(true, false, maxKBytes);I believe the patch has preserved the same behavior. All of the callers
for which I replaced tuplestore_begin_heap() which passed a variable for
the randomAccess parameter had set that variable to something which was
effectively the same as passing true -- SFRM_Materialize_Random.
I don't think so ?
They callers aren't passing SFRM_Materialize_Random, but rather
(allowedModes & SFRM_Materialize_Random) != 0
Where allowedModes is determined EXEC_FLAG_BACKWARD.
src/include/executor/executor.h:extern Tuplestorestate *ExecMakeTableFunctionResult(SetExprState *setexpr,
src/include/executor/executor.h- ExprContext *econtext,
src/include/executor/executor.h- MemoryContext argContext,
src/include/executor/executor.h- TupleDesc expectedDesc,
src/include/executor/executor.h- bool randomAccess);
src/backend/executor/nodeFunctionscan.c=FunctionNext(FunctionScanState *node)
src/backend/executor/nodeFunctionscan.c: ExecMakeTableFunctionResult(node->funcstates[0].setexpr,
src/backend/executor/nodeFunctionscan.c- node->ss.ps.ps_ExprContext,
src/backend/executor/nodeFunctionscan.c- node->argcontext,
src/backend/executor/nodeFunctionscan.c- node->funcstates[0].tupdesc,
src/backend/executor/nodeFunctionscan.c- node->eflags & EXEC_FLAG_BACKWARD);
--
Justin
On Mon, Nov 8, 2021 at 3:13 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Nov 08, 2021 at 02:52:28PM -0500, Melanie Plageman wrote:
On Tue, Nov 2, 2021 at 4:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
Several places have a conditional value for the first argument (randomAccess),
but your patch changes the behavior to a constant "true". I didn't review the
patch beyond that.@@ -740,18 +724,14 @@ pg_prepared_statement(PG_FUNCTION_ARGS) - tupstore = - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, - false, work_mem);@@ -2701,42 +2701,13 @@ pg_hba_file_rules(PG_FUNCTION_ARGS) - tuple_store = - tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random, - false, work_mem);@@ -4799,31 +4797,8 @@ pg_timezone_names(PG_FUNCTION_ARGS) - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);@@ -575,38 +575,12 @@ pg_ls_dir_1arg(PG_FUNCTION_ARGS) - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);@@ -1170,17 +1154,12 @@ pg_cursor(PG_FUNCTION_ARGS) - tupstore = - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, - false, work_mem);+++ b/src/backend/utils/fmgr/funcapi.c + tupstore = tuplestore_begin_heap(true, false, maxKBytes);I believe the patch has preserved the same behavior. All of the callers
for which I replaced tuplestore_begin_heap() which passed a variable for
the randomAccess parameter had set that variable to something which was
effectively the same as passing true -- SFRM_Materialize_Random.I don't think so ?
They callers aren't passing SFRM_Materialize_Random, but rather
(allowedModes & SFRM_Materialize_Random) != 0Where allowedModes is determined EXEC_FLAG_BACKWARD.
src/include/executor/executor.h:extern Tuplestorestate *ExecMakeTableFunctionResult(SetExprState *setexpr,
src/include/executor/executor.h- ExprContext *econtext,
src/include/executor/executor.h- MemoryContext argContext,
src/include/executor/executor.h- TupleDesc expectedDesc,
src/include/executor/executor.h- bool randomAccess);src/backend/executor/nodeFunctionscan.c=FunctionNext(FunctionScanState *node)
src/backend/executor/nodeFunctionscan.c: ExecMakeTableFunctionResult(node->funcstates[0].setexpr,
src/backend/executor/nodeFunctionscan.c- node->ss.ps.ps_ExprContext,
src/backend/executor/nodeFunctionscan.c- node->argcontext,
src/backend/executor/nodeFunctionscan.c- node->funcstates[0].tupdesc,
src/backend/executor/nodeFunctionscan.c- node->eflags & EXEC_FLAG_BACKWARD);
You are right. I misread it.
So, I've attached a patch where randomAccess is now an additional
parameter (and registered for the next fest).
I was thinking about how to add a test that would have broken when I
passed true for randomAccess to tuplestore_begin_heap() when false was
required. But, I don't fully understand the problem. If backward
accesses to a tuplestore are not allowed but randomAccess is mistakenly
passed as true, would the potential result be potentially wrong results
from accessing the tuplestore results backwards?
- Melanie
Attachments:
v3-0001-Add-helper-to-make-tuplestore.patchapplication/octet-stream; name=v3-0001-Add-helper-to-make-tuplestore.patchDownload+127-451
On Thu, Nov 18, 2021 at 12:59:03PM -0500, Melanie Plageman wrote:
On Mon, Nov 8, 2021 at 3:13 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Nov 08, 2021 at 02:52:28PM -0500, Melanie Plageman wrote:
On Tue, Nov 2, 2021 at 4:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
Several places have a conditional value for the first argument (randomAccess),
but your patch changes the behavior to a constant "true". I didn't review the
patch beyond that.
...
I believe the patch has preserved the same behavior. All of the callers
for which I replaced tuplestore_begin_heap() which passed a variable for
the randomAccess parameter had set that variable to something which was
effectively the same as passing true -- SFRM_Materialize_Random.I don't think so ?
They callers aren't passing SFRM_Materialize_Random, but rather
(allowedModes & SFRM_Materialize_Random) != 0Where allowedModes is determined EXEC_FLAG_BACKWARD.
...
You are right. I misread it.
So, I've attached a patch where randomAccess is now an additional
parameter (and registered for the next fest).I was thinking about how to add a test that would have broken when I
passed true for randomAccess to tuplestore_begin_heap() when false was
required. But, I don't fully understand the problem. If backward
accesses to a tuplestore are not allowed but randomAccess is mistakenly
passed as true, would the potential result be potentially wrong results
from accessing the tuplestore results backwards?
No, see here
src/backend/utils/fmgr/README-The Tuplestore must be created with randomAccess = true if
src/backend/utils/fmgr/README:SFRM_Materialize_Random is set in allowedModes, but it can (and preferably
src/backend/utils/fmgr/README-should) be created with randomAccess = false if not. Callers that can support
src/backend/utils/fmgr/README-both ValuePerCall and Materialize mode will set SFRM_Materialize_Preferred,
src/backend/utils/fmgr/README-or not, depending on which mode they prefer.
If you use "randomAccess=true" when it's not needed, the result might be less
efficient.
Some callers specify "true" when they don't need to. I ran into that here.
/messages/by-id/21724.1583955158@sss.pgh.pa.us
Maybe you'd want to add an 0002 patch which changes those to conditional ?
BTW, there should be a newline before MakeFuncResultTuplestore().
--
Justin
On Thu, Nov 18, 2021 at 1:24 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Thu, Nov 18, 2021 at 12:59:03PM -0500, Melanie Plageman wrote:
On Mon, Nov 8, 2021 at 3:13 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Nov 08, 2021 at 02:52:28PM -0500, Melanie Plageman wrote:
On Tue, Nov 2, 2021 at 4:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
Several places have a conditional value for the first argument (randomAccess),
but your patch changes the behavior to a constant "true". I didn't review the
patch beyond that....
I believe the patch has preserved the same behavior. All of the callers
for which I replaced tuplestore_begin_heap() which passed a variable for
the randomAccess parameter had set that variable to something which was
effectively the same as passing true -- SFRM_Materialize_Random.I don't think so ?
They callers aren't passing SFRM_Materialize_Random, but rather
(allowedModes & SFRM_Materialize_Random) != 0Where allowedModes is determined EXEC_FLAG_BACKWARD.
...
You are right. I misread it.
So, I've attached a patch where randomAccess is now an additional
parameter (and registered for the next fest).I was thinking about how to add a test that would have broken when I
passed true for randomAccess to tuplestore_begin_heap() when false was
required. But, I don't fully understand the problem. If backward
accesses to a tuplestore are not allowed but randomAccess is mistakenly
passed as true, would the potential result be potentially wrong results
from accessing the tuplestore results backwards?No, see here
src/backend/utils/fmgr/README-The Tuplestore must be created with randomAccess = true if
src/backend/utils/fmgr/README:SFRM_Materialize_Random is set in allowedModes, but it can (and preferably
src/backend/utils/fmgr/README-should) be created with randomAccess = false if not. Callers that can support
src/backend/utils/fmgr/README-both ValuePerCall and Materialize mode will set SFRM_Materialize_Preferred,
src/backend/utils/fmgr/README-or not, depending on which mode they prefer.If you use "randomAccess=true" when it's not needed, the result might be less
efficient.Some callers specify "true" when they don't need to. I ran into that here.
/messages/by-id/21724.1583955158@sss.pgh.pa.usMaybe you'd want to add an 0002 patch which changes those to conditional ?
Done in attached v4
BTW, there should be a newline before MakeFuncResultTuplestore().
Fixed
- Melanie
Attachments:
v4-0002-Check-allowedModes-when-creating-tuplestores.patchapplication/octet-stream; name=v4-0002-Check-allowedModes-when-creating-tuplestores.patchDownload+29-15
v4-0001-Add-helper-to-make-tuplestore.patchapplication/octet-stream; name=v4-0001-Add-helper-to-make-tuplestore.patchDownload+128-451
On Mon, Dec 13, 2021 at 05:27:52PM -0500, Melanie Plageman wrote:
Done in attached v4
Thanks.
I think these comments can be removed from the callers:
/* it's a simple type, so don't use get_call_result_type */
You removed one call to tuplestore_donestoring(), but not the others.
I guess you could remove them all (optionally as a separate patch).
The rest of these are questions more than comments, and I'm not necessarily
suggesting to change the patch:
This isn't new in your patch, but for me, it's more clear if the callers have a
separate boolean for randomAccess, rather than having the long expression in
the function call:
+ tupstore = MakeFuncResultTuplestore(fcinfo, NULL,
+ rsinfo->allowedModes & SFRM_Materialize_Random);
vs
randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0;
- tupstore = tuplestore_begin_heap(randomAccess, false, work_mem);
+ tupstore = MakeFuncResultTuplestore(fcinfo, NULL, randomAccess);
One could also argue for passing rsinfo->allowedModes, instead of
(rsinfo->allowedModes & SFRM_Materialize_Random).
There's a couples places that you're checking expectedDesc where it wasn't
being checked before. Is that some kind of live bug ?
pg_config() text_to_table()
It'd be nice if the "expected tuple format" error didn't need to be duplicated
for each SRFs. I suppose the helper function could just take a boolean
determining whether or not to throw an error (passing "expectedDesc==NULL").
You'd have to call the helper before CreateTupleDescCopy() - which you're
already doing in a couple places for similar reasons.
I noticed that tuplestore.h isn't included most places. I suppose it's
included via execnodes.h. Your patch doesn't change that, arguably it
should've always been included.
--
Justin
Thanks for the detailed review!
On Fri, Dec 17, 2021 at 3:04 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Dec 13, 2021 at 05:27:52PM -0500, Melanie Plageman wrote:
Done in attached v4
Thanks.
I think these comments can be removed from the callers:
/* it's a simple type, so don't use get_call_result_type */
Done in attached v5
You removed one call to tuplestore_donestoring(), but not the others.
I guess you could remove them all (optionally as a separate patch).
I removed it in that one location because I wanted to get rid of the
local variable it used. I am fine with removing the other occurrences,
but I thought there might be some reason why instead of removing it,
they made it into a no-op.
The rest of these are questions more than comments, and I'm not necessarily
suggesting to change the patch:This isn't new in your patch, but for me, it's more clear if the callers have a
separate boolean for randomAccess, rather than having the long expression in
the function call:+ tupstore = MakeFuncResultTuplestore(fcinfo, NULL, + rsinfo->allowedModes & SFRM_Materialize_Random);vs
randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); + tupstore = MakeFuncResultTuplestore(fcinfo, NULL, randomAccess);One could also argue for passing rsinfo->allowedModes, instead of
(rsinfo->allowedModes & SFRM_Materialize_Random).
I've changed the patch to have MakeFuncResultTuplestore() check
rsinfo->allowedModes and pass in the resulting boolean to
tuplestore_begin_heap(). Because I modified the
MakeFuncResultTuplestore() function signature to accommodate this, I had
to collapse the two patches into one, as I couldn't maintain the call
sites which passed in a constant.
There's a couples places that you're checking expectedDesc where it wasn't
being checked before. Is that some kind of live bug ?
pg_config() text_to_table()
Yes, expectedDesc was accessed in these locations without checking that
it is not NULL. Should that be a separate patch?
It'd be nice if the "expected tuple format" error didn't need to be duplicated
for each SRFs. I suppose the helper function could just take a boolean
determining whether or not to throw an error (passing "expectedDesc==NULL").
You'd have to call the helper before CreateTupleDescCopy() - which you're
already doing in a couple places for similar reasons.
So, I don't think it makes sense to move that error message into
MakeFuncResultTuplestore() for the cases in which NULL is passed in for
the tuple descriptor. expectedDesc is not accessed at all in these cases
inside the function (since get_call_result_type()) is not called. For
the cases when a tuple descriptor is passed in, only two callers
actually check for expectedDesc -- each_worker_jsonb and each_worker.
Therefore, I don't think it makes sense to add another parameter just to
move that check inside for two callers.
I noticed that tuplestore.h isn't included most places. I suppose it's
included via execnodes.h. Your patch doesn't change that, arguably it
should've always been included.
I've now included it in funcapi.c.
- Melanie
Attachments:
v5-0001-Add-helper-to-make-tuplestore.patchtext/x-patch; charset=US-ASCII; name=v5-0001-Add-helper-to-make-tuplestore.patchDownload+123-461
On Wed, Jan 05, 2022 at 12:09:16PM -0500, Melanie Plageman wrote:
On Fri, Dec 17, 2021 at 3:04 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
The rest of these are questions more than comments, and I'm not necessarily
suggesting to change the patch:This isn't new in your patch, but for me, it's more clear if the callers have a
separate boolean for randomAccess, rather than having the long expression in
the function call:+ tupstore = MakeFuncResultTuplestore(fcinfo, NULL, + rsinfo->allowedModes & SFRM_Materialize_Random);vs
randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); + tupstore = MakeFuncResultTuplestore(fcinfo, NULL, randomAccess);One could also argue for passing rsinfo->allowedModes, instead of
(rsinfo->allowedModes & SFRM_Materialize_Random).I've changed the patch to have MakeFuncResultTuplestore() check
rsinfo->allowedModes and pass in the resulting boolean to
tuplestore_begin_heap(). Because I modified the
MakeFuncResultTuplestore() function signature to accommodate this, I had
to collapse the two patches into one, as I couldn't maintain the call
sites which passed in a constant.
There's a couples places that you're checking expectedDesc where it wasn't
being checked before. Is that some kind of live bug ?
pg_config() text_to_table()Yes, expectedDesc was accessed in these locations without checking that
it is not NULL. Should that be a separate patch?
I don't know. The patch is already easy to review, since it's mostly limited
to removing code and fixing inconsistencies (NULL check) and possible
inefficiencies (randomAccess).
If the expectedDesc NULL check were an 0001 patch, then 0002 (the main patch)
would be even easier to review. Only foreign.c is different.
You removed one call to tuplestore_donestoring(), but not the others.
I guess you could remove them all (optionally as a separate patch).I removed it in that one location because I wanted to get rid of the
local variable it used.
What local variable ? I see now that logicalfuncs.c never had a local variable
called tupstore. I'm sure it was intended since 2014 (b89e15105) to say
tupestore_donestoring(p->tupstore). But donestoring(typo) causes no error,
since the define is a NOP.
src/include/utils/tuplestore.h:#define tuplestore_donestoring(state) ((void) 0)
I am fine with removing the other occurrences,
but I thought there might be some reason why instead of removing it,
they made it into a no-op.
I assume Tom left it (actually, added it back in dd04e958c) to avoid breaking
extensions for no reason. And maybe to preserve the possbility of at some
point in the future doing something again during the "done" step.
I'd leave it for input from a committer about those:
- remove tuplestore_donestoring() calls ?
- split expectedDesc NULL check to an 0001 patch ?
- anything other opened questions ?
I'm marking this as RFC, with the caveat that the newline before
MakeFuncResultTuplestore went missing again :)
--
Justin
On Wed, Jan 5, 2022 at 7:57 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Wed, Jan 05, 2022 at 12:09:16PM -0500, Melanie Plageman wrote:
On Fri, Dec 17, 2021 at 3:04 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
There's a couples places that you're checking expectedDesc where it wasn't
being checked before. Is that some kind of live bug ?
pg_config() text_to_table()Yes, expectedDesc was accessed in these locations without checking that
it is not NULL. Should that be a separate patch?I don't know. The patch is already easy to review, since it's mostly limited
to removing code and fixing inconsistencies (NULL check) and possible
inefficiencies (randomAccess).If the expectedDesc NULL check were an 0001 patch, then 0002 (the main patch)
would be even easier to review. Only foreign.c is different.
I'll wait to do that if preferred by committer.
Are you imagining that patch 0001 would only add the check for
expectedDesc that is missing from pg_config() and text_to_table()?
You removed one call to tuplestore_donestoring(), but not the others.
I guess you could remove them all (optionally as a separate patch).I removed it in that one location because I wanted to get rid of the
local variable it used.What local variable ? I see now that logicalfuncs.c never had a local variable
called tupstore. I'm sure it was intended since 2014 (b89e15105) to say
tupestore_donestoring(p->tupstore). But donestoring(typo) causes no error,
since the define is a NOP.src/include/utils/tuplestore.h:#define tuplestore_donestoring(state) ((void) 0)
Yes, I mean the local variable, tupstore.
I am fine with removing the other occurrences,
but I thought there might be some reason why instead of removing it,
they made it into a no-op.I assume Tom left it (actually, added it back in dd04e958c) to avoid breaking
extensions for no reason. And maybe to preserve the possbility of at some
point in the future doing something again during the "done" step.I'd leave it for input from a committer about those:
- remove tuplestore_donestoring() calls ?
- split expectedDesc NULL check to an 0001 patch ?
- anything other opened questions ?I'm marking this as RFC, with the caveat that the newline before
MakeFuncResultTuplestore went missing again :)
oops. I've attached v6 with the newline.
- Melanie
Attachments:
v6-0001-Add-helper-to-make-tuplestore.patchtext/x-patch; charset=US-ASCII; name=v6-0001-Add-helper-to-make-tuplestore.patchDownload+124-461
On Tue, Jan 11, 2022 at 10:19:33AM -0500, Melanie Plageman wrote:
On Wed, Jan 5, 2022 at 7:57 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
I'd leave it for input from a committer about those:
- remove tuplestore_donestoring() calls ?
This part has been raised on a different thread, as of:
/messages/by-id/CAP0=ZVJeeYfAeRfmzqAF2Lumdiv4S4FewyBnZd4DPTrsSQKJKw@mail.gmail.com
And it looks sensible from here to just remove any traces of this
stuff from the C code, while leaving the compatibility macro around
to avoid any breakages with any external modules.
- split expectedDesc NULL check to an 0001 patch ?
I am not sure that this is worth a split, as the same areas are
touched.
- anything other opened questions ?
I'm marking this as RFC, with the caveat that the newline before
MakeFuncResultTuplestore went missing again :)oops. I've attached v6 with the newline.
Saying that, the wrapper you are proposing here looks like a good
idea. I have wanted a centralized wrapper more than once, and this is
a nice cut overall.
--
Michael
On Tue, Jan 11, 2022 at 10:19:33AM -0500, Melanie Plageman wrote:
If the expectedDesc NULL check were an 0001 patch, then 0002 (the main patch)
would be even easier to review. Only foreign.c is different.I'll wait to do that if preferred by committer.
Are you imagining that patch 0001 would only add the check for
expectedDesc that is missing from pg_config() and text_to_table()?
Yes, that's what I meant, but I haven't been able to determine if a NULL check
should be added at all.
I found this commit message, which indicates that it shouldn't be used in every
case. I'm not clear when it should/not be used, though.
commit 60651e4cddbb77e8f1a0c7fc0be6a7e7bf626fe0
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat Oct 28 14:02:21 2017 -0400
Support domains over composite types in PL/Perl.
In passing, don't insist on rsi->expectedDesc being set unless we
actually need it; this allows succeeding in a couple of cases where
PL/Perl functions returning setof composite would have failed before,
and makes the error message more apropos in other cases.
On Tue, Feb 15, 2022 at 11:57:56PM -0600, Justin Pryzby wrote:
On Tue, Jan 11, 2022 at 10:19:33AM -0500, Melanie Plageman wrote:
I'll wait to do that if preferred by committer.
Are you imagining that patch 0001 would only add the check for
expectedDesc that is missing from pg_config() and text_to_table()?
It took me some time to follow up on this point, but we clearly don't
expect a NULL expectedDesc in the context of these two code paths or
we finish with a NULL pointer dereference in CreateTupleDescCopy()
with a crash, so they had better be added. It also makes sense to me
to do that first and independently of the rest of the patch: it is a
separate issue.
Yes, that's what I meant, but I haven't been able to determine if a NULL check
should be added at all.
+ (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
+ errmsg("expected tuple format not specified as required for "
+ "set-returning function.")))
errmsgs should be in a single line, to ease the grepping of similar
patterns.
There is something else that I think could be cleaned up, while we are
working on this area. Some of the callers of
MakeFuncResultTuplestore() (prepare.c, portalmem.c, one in genfile.c,
guc.c, misc.c) pass down a NULL tupdesc while building their TupleDesc
using CreateTemplateTupleDesc() and a set of TupleDescInitEntry() for
each attribute. This data exists already in pg_proc.dat, duplicating
the effort. Except if I am missing something, wouldn't it be better
to let get_call_result_type() do this work for us, passing a TupleDesc
pointer to the new MakeFuncResultTuplestore() rather than NULL?
Once you do that, there is something else that stands out: all the
callers of MakeFuncResultTuplestore() passing NULL for tupledesc *have
to* set expectedDesc, as far as I can see from the remaining callers.
This brings an interesting effect, couldn't we move the check on
ReturnSetInfo->expectedDesc being not NULL within
MakeFuncResultTuplestore(), checking after it only when tupdesc ==
NULL?
One would say that this changes the error ordering in code paths like
pg_config(), but, actually, in this case, we don't have any need for
the part "Check to make sure we have a reasonable tuple descriptor" as
of pg_proc.dat defining the tuple description this function uses, no?
And we could switch to a tuplestore in this case, as well, reducing by
one the numbers of callers with tupdesc=NULL for the new routine.
This could be a separate cleanup, separate from the rest, done first.
each_worker_jsonb() does not really need its check on expectedDesc
being NULL, does it? HEAD grabs the tuple descriptor with
get_call_result_type(), the patch uses MakeFuncResultTuplestore(), so
we make nowhere use of the expected TupleDesc saved by the SRF
executor.
Asserting that we are in the correct memory context in when calling
MakeFuncResultTuplestore() sounds rather sensible from here as per the
magics done in the various json functions. Still, it really feels
like we could do a more centralized consolidation of the whole.
--
Michael
On Thu, Feb 17, 2022 at 04:10:01PM +0900, Michael Paquier wrote:
Asserting that we are in the correct memory context in when calling
MakeFuncResultTuplestore() sounds rather sensible from here as per the
magics done in the various json functions. Still, it really feels
like we could do a more centralized consolidation of the whole.
So, I got my hands on this area, and found myself applying 07daca5 as
a first piece of the puzzle. Anyway, after more review today, I have
bumped into more pieces that could be consolidated, and finished with
the following patch set:
- 0001 changes a couple of SRFs that I found worth simplifying. These
refer mostly to the second and fourth group mentioned upthread by
Melanie, with two exceptions: pg_tablespace_databases() where it is
not worth changing to keep it backward-compatible and pg_ls_dir() as
per its one-arg version. That's a nice first cut in itself.
- 0002 changes a couple of places to unify some existing SRF checks,
that I bumped into on the way. The value here is in using the same
error messages everywhere, reducing the translation effort and
generating the same errors for all cases based on the same conditions.
This eases the review of the next patch.
- 0003 is the actual refactoring meat, where I have been able to move
the check on expectedDesc into MakeFuncResultTuplestore(), shaving
more code than previously. If you discard the cases of patch 0001, it
should actually be possible to set setResult, setDesc and returnMode
within the new function, which would feel natural as that's the place
where we create the function's tuplestore and we want to materialize
its contents. The cases with the JSON code are also a bit hairy and
need more thoughts, but we could also cut this part of the code from
the initial refactoring effort.
So, for now, 0001 and 0002 look like rather committable pieces. 0003
needs a bit more thoughts about all the corner cases we need to
consider, mostly what I am mentioning above. Perhaps the pieces of
0003 related to pg_options_to_table() should be moved to 0001 as a
matter of clarity.
--
Michael
Attachments:
v7-0003-Introduce-MakeFuncResultTuplestore.patchtext/x-diff; charset=us-asciiDownload+92-477
v7-0002-Simplify-set-of-SRF-related-checks.patchtext/x-diff; charset=us-asciiDownload+63-38
v7-0001-Clean-up-and-simplify-some-code-related-to-SRFs.patchtext/x-diff; charset=us-asciiDownload+35-109
On Mon, Feb 21, 2022 at 04:41:17PM +0900, Michael Paquier wrote:
So, I got my hands on this area, and found myself applying 07daca5 as
a first piece of the puzzle. Anyway, after more review today, I have
bumped into more pieces that could be consolidated, and finished with
the following patch set:
- 0001 changes a couple of SRFs that I found worth simplifying. These
refer mostly to the second and fourth group mentioned upthread by
Melanie, with two exceptions: pg_tablespace_databases() where it is
not worth changing to keep it backward-compatible and pg_ls_dir() as
per its one-arg version. That's a nice first cut in itself.
- 0002 changes a couple of places to unify some existing SRF checks,
that I bumped into on the way. The value here is in using the same
error messages everywhere, reducing the translation effort and
generating the same errors for all cases based on the same conditions.
This eases the review of the next patch.
These two have been now applied, with some comment improvements and
the cleanup of pg_options_to_table() done in 0001, and a slight change
in 0002 for pageinspect where a check was not necessary for a BRIN
code path.
- 0003 is the actual refactoring meat, where I have been able to move
the check on expectedDesc into MakeFuncResultTuplestore(), shaving
more code than previously. If you discard the cases of patch 0001, it
should actually be possible to set setResult, setDesc and returnMode
within the new function, which would feel natural as that's the place
where we create the function's tuplestore and we want to materialize
its contents. The cases with the JSON code are also a bit hairy and
need more thoughts, but we could also cut this part of the code from
the initial refactoring effort.
This is the remaining piece, as attached, that I have not been able to
poke much at yet.
--
Michael
Attachments:
v8-0001-Introduce-MakeFuncResultTuplestore.patchtext/x-diff; charset=us-asciiDownload+82-462
On Thu, Feb 24, 2022 at 08:25:06PM +0900, Michael Paquier wrote:
This is the remaining piece, as attached, that I have not been able to
poke much at yet.
So, I have finally poked at this last part of the patch set, and I
found that we can be more aggressive with the refactoring, by moving
into MakeFuncResultTuplestore() the parts where we save the tuplestore
and the tupledesc in the per-query memory context. There are two
pieces that matter once things are reshaped:
- The tuple descriptor may need some extra validation via
BlessTupleDesc() when it comes from a transient record datatype,
something that happens for most of the subroutines related to the JSON
functions.
- expectedDesc is sometimes required by the caller, though most of the
time just needs to be built with the more expensive
get_call_result_type().
In order to keep things pluggable at will, MakeFuncResultTuplestore()
has been changed to access a set of bits32 flags, able to control the
two options above. With this facility in place, I have been able to
cut much more code than the initial patch, roughly twice as of:
24 files changed, 157 insertions(+), 893 deletions(-)
This seems in rather good shape to me, the changes are
straight-forward and the code cut is really good, so I'd like to move
on with that. 0001 is the initial patch, and 0002 is the extra
refactoring I have been working on. The plan would be to merge both,
but I am sending a split to ease any checks on what I have changed.
Comments or objections?
--
Michael
On Mon, Feb 28, 2022 at 04:49:41PM +0900, Michael Paquier wrote:
In order to keep things pluggable at will, MakeFuncResultTuplestore()
has been changed to access a set of bits32 flags, able to control the
two options above. With this facility in place, I have been able to
cut much more code than the initial patch, roughly twice as of:
24 files changed, 157 insertions(+), 893 deletions(-)
So, I have been thinking about this patch once again, and after
pondering more on it, MakeFuncResultTuplestore() is actually a wrong
name now that it does much more than just creating a tuplestore, by
assigning the TupleDesc and the TupleStore into the function's
ReturnSetInfo.
This is actually setting up a function in the context of a single call
where we fill the tuplestore with all its values, so instead I have
settled down to name that SetSingleFuncCall(), to make a parallel with
the existing MultiFuncCall*(). funcapi.c is the right place for
that, and I have added more documentation in the fmgr's README as well
as funcapi.h.
--
Michael