Thread: make tuplestore helper function
Attached is a patch to create a helper function which creates a tuplestore to be used by FUNCAPI-callable functions. It was suggested in [1] that I start a separate thread for this patch. A few notes: There are a few places with very similar code to the new helper (MakeTuplestore()) but which, for example, don't expect the return type to be a row (pg_ls_dir(), pg_tablespace_databases(), text_to_table()). I wasn't sure if it was worth modifying it for their benefit. I wasn't sure if the elog() check for call result type should be changed to an ereport(). All callers of MakeTuplestore() pass work_mem as the third parameter, so I could just omit it and hard-code it in, but I wasn't sure that felt right in a utility/helper function. I inlined deflist_to_tuplestore() in foreign.c since it was billed as a helper function but wasn't used elsewhere and it made it harder to use MakeTuplestore() in this location. -- melanie [1] https://www.postgresql.org/message-id/flat/20200124195226.lth52iydq2n2uilq%40alap3.anarazel.de
Attachment
On 2021-Nov-02, Melanie Plageman wrote: > Attached is a patch to create a helper function which creates a > tuplestore to be used by FUNCAPI-callable functions. Looks good, and given the amount of code being removed, it seems a no-brainer that some helper(s) is/are needed. I think the name is a bit too generic. How about MakeFuncResultTuplestore()? I think the function should not modify rsinfo -- having that as (undocumented) side effect is weird. I would suggest to just return the tuplestore, and have the caller stash it in rsinfo->setResult and modify the other struct members alongside that. It's a couple more lines per caller, but the code is clearer that way. > There are a few places with very similar code to the new helper > (MakeTuplestore()) but which, for example, don't expect the return type > to be a row (pg_ls_dir(), pg_tablespace_databases(), text_to_table()). I > wasn't sure if it was worth modifying it for their benefit. Is this just because of the tupdesc? If that's the case, then maybe the tupdesc can be optionally passed in by caller, and when it's not, the helper does get_call_result_type() and the tupdesc is used as output argument. > I wasn't sure if the elog() check for call result type should be changed > to an ereport(). I don't see a reason for that: it's still an internal (not user-facing) error. > All callers of MakeTuplestore() pass work_mem as the third parameter, so > I could just omit it and hard-code it in, but I wasn't sure that felt > right in a utility/helper function. No opinion. I would have used the global in the function instead of passing it in. > I inlined deflist_to_tuplestore() in foreign.c since it was billed as a > helper function but wasn't used elsewhere and it made it harder to use > MakeTuplestore() in this location. This is a bit strange. Does it work to pass rsinfo->expectedDesc? -- Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/ "No renuncies a nada. No te aferres a nada."
Several places have a conditional value for the first argument (randomAccess), but your patch changes the behavior to a constant "true". I didn't review the patch beyond that. > @@ -740,18 +724,14 @@ pg_prepared_statement(PG_FUNCTION_ARGS) > - tupstore = > - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, > - false, work_mem); > @@ -2701,42 +2701,13 @@ pg_hba_file_rules(PG_FUNCTION_ARGS) > - tuple_store = > - tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random, > - false, work_mem); > @@ -4799,31 +4797,8 @@ pg_timezone_names(PG_FUNCTION_ARGS) > - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; > - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); > @@ -575,38 +575,12 @@ pg_ls_dir_1arg(PG_FUNCTION_ARGS) > - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; > - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); > @@ -1170,17 +1154,12 @@ pg_cursor(PG_FUNCTION_ARGS) > - tupstore = > - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, > - false, work_mem); > +++ b/src/backend/utils/fmgr/funcapi.c > + tupstore = tuplestore_begin_heap(true, false, maxKBytes); -- Justin
The attached patch contains a helper now named MakeFuncResultTuplestore() which aims to eliminate a few lines of redundant code that all FUNCAPI functions making tuplestores repeated. While investigating all of these call sites and making the changes suggested by Álvaro, I noticed that the functions making tuplestores broke out into seven groups. For some of them, using my helper changes what memory context the tuple descriptor is made in. Explanation of how/why memory context tuple descriptor is in has changed: An earlier version of this patch changed into the per query memory context inside of the helper function. This version does not do so but instead asserts that the caller has changed into the appropriate memory context. For callers which previously did not call get_call_result_type() and instead used a function like CreateTupleDescCopy() to create the tuple descriptor which was used for the tuples in the tuplestore, most of them made this copy in the per query memory context. For these callers it seemed error-prone to switch to the per query memory context to create the copied tuple descriptor and then switch back before calling the helper function. Instead, the helper function will assert that the caller has already switched into the per query memory context so that the tuplestore is made in the per query context. This means that for the group of functions which, prior to this patch, called get_call_result_type() in a shorter-lived memory context but now rely on the helper function to call get_call_result_type(), the tuple descriptor will now be in the per query memory context. I think that this makes sense since most of these are set as the ResultSetInfo->setDesc (the tuple descriptor for returned tuples). Explanation of the different groups of functions using the MakeFuncResultTuplestore() helper: The first group are functions which called get_call_result_type(), allocating memory for the tuple descriptor in the short-lived memory context, then switched into the per query memory context before creating the tuplestore, then switched back to the short-lived memory context directly after making the tuplestore. Now that they use the helper function and switch into the per query memory context before calling it, the tuple descriptor is allocated in the per query memory context. The first group: pg_stop_backup_v2() pg_event_trigger_dropped_objects() pg_event_trigger_ddl_commands() pg_available_extensions() pg_available_extension_versions() pg_extension_update_paths() pg_hba_file_rules() pg_stat_get_subscription() pg_logical_slot_get_changes_guts() pg_show_replication_origin_status() pg_get_replication_slots() pg_stat_get_wal_senders() pg_get_shmem_allocations() pg_timezone_names() pg_ls_dir_files() pg_get_backend_memory_contexts() pg_stat_get_progress_info() pg_stat_get_activity() pg_stat_get_slru() The second group are functions which, instead of calling get_call_result_type(), call CreateTemplateTupleDesc() and TupleDescInitEntry() to allocate and initialize the tuple descriptor in the per query memory context. Their memory usage is unchanged by this patch. The second group: pg_prepared_statement() pg_ls_dir() pg_tablespace_databases() show_all_file_settings() pg_cursor() The third and fourth groups are functions that use CreateTupleDescCopy() to make a tuple descriptor copy from ResultSetInfo->expectedDesc. These functions made a copy of ResultSetInfo->expectedDesc but did not check if it was present before doing so (and do not seem to call a function which would call internal_get_result_type() -- which does this check), so I have added that check. Their memory usage is unchanged by this patch. The third group: text_to_table() The fourth group differs from the third in that the values for the tuples are actually created in the per query memory context -- seemingly unnecessarily since tuplestore_putvalues() will eventually end up copying everything when it is forming tuples. This behavior is not altered at all by the MakeFuncResultTuplestore() helper function patch, however I wanted to mention that I noticed that perhaps more was being allocated in the per query memory context than is necessary. The fourth group: pg_options_to_table() pg_config() The fifth group is a function that uses CreateTupleDescCoy() to make a copy of a tuple descriptor (NOT ResultSetInfo->expectedDesc) in the function info memory context (fcinfo->flinfo->fn_mcxt) so that it can be accessed later. Their memory usage is unchanged by this patch. The fifth group: populate_recordset_worker() The sixth group of functions use CreateTupleDescCopy() to make a copy of the tuple descriptor that they set up using get_call_result_type(). They called get_call_result_type() in a short-lived memory context, then switched to the per query memory context and made a copy of it right away. It is unclear why they couldn't just call get_call_result_type() in the per query context and not make a copy. Perhaps they needed some specific behavior of the copy (like that it doesn't copy constraints and defaults?). With the attached patch, the initial tuple descriptor is now made in the per query memory context (as well as the copy). I wonder if the initial tuple descriptor being made in the per query memory context negated the need for a copy, however I left the copy intact since I wasn't sure. These functions also checked for the presence of ResultSetInfo->expectedDesc which is not applicable for all callers of MakeFuncResultTuplestore(), so I separated it into a separate check. get_call_result_type() will call internal_get_result_type() which will check for expectedDesc if the result type is TYPEFUNC_RECORD, so I'm not sure if the expectedDesc check is required or not in the body of these functions. The sixth group: each_worker() each_worker_jsonb() The seventh group of functions use CreateTupleDescCopy() to make a copy of ResultSetInfo->expectedDesc in the per query memory context. This patch does not change any of that. The only difference is that now the check for expectedDesc is separate. Their memory usage is unchanged by this patch. The seventh group: elements_worker_jsonb() elements_worker() Note that for all of the functions in which I added or changed the check for ResultSetInfo->expectedDesc, it is a little bit different because I've changed the error code. I've used error code: ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED where they previously would have been part of an ereport with error code ERRCODE_FEATURE_NOT_SUPPORTED. I'm not sure if the new one is correct. In fact, I'm not sure if an error related to expectedDesc being set should be user-facing at all (looking at how expectedDesc is used and set, I wasn't sure). Perhaps it should be an elog() or an assert? Feedback is addressed inline below. On Tue, Nov 2, 2021 at 2:17 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2021-Nov-02, Melanie Plageman wrote: > > > Attached is a patch to create a helper function which creates a > > tuplestore to be used by FUNCAPI-callable functions. > > Looks good, and given the amount of code being removed, it seems a > no-brainer that some helper(s) is/are needed. > > I think the name is a bit too generic. How about > MakeFuncResultTuplestore()? I've done this rename. > I think the function should not modify rsinfo -- having that as > (undocumented) side effect is weird. I would suggest to just return the > tuplestore, and have the caller stash it in rsinfo->setResult and modify > the other struct members alongside that. It's a couple more lines per > caller, but the code is clearer that way. I've changed the helper not to modify rsinfo. > > There are a few places with very similar code to the new helper > > (MakeTuplestore()) but which, for example, don't expect the return type > > to be a row (pg_ls_dir(), pg_tablespace_databases(), text_to_table()). I > > wasn't sure if it was worth modifying it for their benefit. > > Is this just because of the tupdesc? If that's the case, then maybe the > tupdesc can be optionally passed in by caller, and when it's not, the > helper does get_call_result_type() and the tupdesc is used as output > argument. I've parameterized the helper in this way and now am able to use it for all FUNCAPI-callable functions making tuplestores except fmgr_sql (for SQL functions) in functions.c because it didn't really seem worth it given the way the code is written. > > All callers of MakeTuplestore() pass work_mem as the third parameter, so > > I could just omit it and hard-code it in, but I wasn't sure that felt > > right in a utility/helper function. > > No opinion. I would have used the global in the function instead of > passing it in. I've changed this to use the global. > > I inlined deflist_to_tuplestore() in foreign.c since it was billed as a > > helper function but wasn't used elsewhere and it made it harder to use > > MakeTuplestore() in this location. > > This is a bit strange. Does it work to pass rsinfo->expectedDesc? This is addressed by changing MakeFuncResultTuplestore() call get_call_result_type() optionally and having pg_options_to_table() call it with NULL as tuple descriptor. Now the behavior is the same -- it makes a copy of expectedDesc in the per query context and sets that as setDesc in the ReturnSetInfo. - Melanie
Attachment
On Tue, Nov 2, 2021 at 4:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > Several places have a conditional value for the first argument (randomAccess), > but your patch changes the behavior to a constant "true". I didn't review the > patch beyond that. > > > @@ -740,18 +724,14 @@ pg_prepared_statement(PG_FUNCTION_ARGS) > > - tupstore = > > - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, > > - false, work_mem); > > > @@ -2701,42 +2701,13 @@ pg_hba_file_rules(PG_FUNCTION_ARGS) > > - tuple_store = > > - tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random, > > - false, work_mem); > > > @@ -4799,31 +4797,8 @@ pg_timezone_names(PG_FUNCTION_ARGS) > > - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; > > - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); > > > @@ -575,38 +575,12 @@ pg_ls_dir_1arg(PG_FUNCTION_ARGS) > > - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; > > - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); > > > @@ -1170,17 +1154,12 @@ pg_cursor(PG_FUNCTION_ARGS) > > - tupstore = > > - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, > > - false, work_mem); > > > +++ b/src/backend/utils/fmgr/funcapi.c > > + tupstore = tuplestore_begin_heap(true, false, maxKBytes); > I believe the patch has preserved the same behavior. All of the callers for which I replaced tuplestore_begin_heap() which passed a variable for the randomAccess parameter had set that variable to something which was effectively the same as passing true -- SFRM_Materialize_Random. - Melanie
On Mon, Nov 08, 2021 at 02:52:28PM -0500, Melanie Plageman wrote: > On Tue, Nov 2, 2021 at 4:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > > Several places have a conditional value for the first argument (randomAccess), > > but your patch changes the behavior to a constant "true". I didn't review the > > patch beyond that. > > > > > @@ -740,18 +724,14 @@ pg_prepared_statement(PG_FUNCTION_ARGS) > > > - tupstore = > > > - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, > > > - false, work_mem); > > > > > @@ -2701,42 +2701,13 @@ pg_hba_file_rules(PG_FUNCTION_ARGS) > > > - tuple_store = > > > - tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random, > > > - false, work_mem); > > > > > @@ -4799,31 +4797,8 @@ pg_timezone_names(PG_FUNCTION_ARGS) > > > - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; > > > - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); > > > > > @@ -575,38 +575,12 @@ pg_ls_dir_1arg(PG_FUNCTION_ARGS) > > > - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; > > > - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); > > > > > @@ -1170,17 +1154,12 @@ pg_cursor(PG_FUNCTION_ARGS) > > > - tupstore = > > > - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, > > > - false, work_mem); > > > > > +++ b/src/backend/utils/fmgr/funcapi.c > > > + tupstore = tuplestore_begin_heap(true, false, maxKBytes); > > I believe the patch has preserved the same behavior. All of the callers > for which I replaced tuplestore_begin_heap() which passed a variable for > the randomAccess parameter had set that variable to something which was > effectively the same as passing true -- SFRM_Materialize_Random. I don't think so ? They callers aren't passing SFRM_Materialize_Random, but rather (allowedModes & SFRM_Materialize_Random) != 0 Where allowedModes is determined EXEC_FLAG_BACKWARD. src/include/executor/executor.h:extern Tuplestorestate *ExecMakeTableFunctionResult(SetExprState *setexpr, src/include/executor/executor.h- ExprContext *econtext, src/include/executor/executor.h- MemoryContext argContext, src/include/executor/executor.h- TupleDesc expectedDesc, src/include/executor/executor.h- bool randomAccess); src/backend/executor/nodeFunctionscan.c=FunctionNext(FunctionScanState *node) src/backend/executor/nodeFunctionscan.c: ExecMakeTableFunctionResult(node->funcstates[0].setexpr, src/backend/executor/nodeFunctionscan.c- node->ss.ps.ps_ExprContext, src/backend/executor/nodeFunctionscan.c- node->argcontext, src/backend/executor/nodeFunctionscan.c- node->funcstates[0].tupdesc, src/backend/executor/nodeFunctionscan.c- node->eflags & EXEC_FLAG_BACKWARD); -- Justin
On Mon, Nov 8, 2021 at 3:13 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Mon, Nov 08, 2021 at 02:52:28PM -0500, Melanie Plageman wrote: > > On Tue, Nov 2, 2021 at 4:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > > > > Several places have a conditional value for the first argument (randomAccess), > > > but your patch changes the behavior to a constant "true". I didn't review the > > > patch beyond that. > > > > > > > @@ -740,18 +724,14 @@ pg_prepared_statement(PG_FUNCTION_ARGS) > > > > - tupstore = > > > > - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, > > > > - false, work_mem); > > > > > > > @@ -2701,42 +2701,13 @@ pg_hba_file_rules(PG_FUNCTION_ARGS) > > > > - tuple_store = > > > > - tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random, > > > > - false, work_mem); > > > > > > > @@ -4799,31 +4797,8 @@ pg_timezone_names(PG_FUNCTION_ARGS) > > > > - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; > > > > - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); > > > > > > > @@ -575,38 +575,12 @@ pg_ls_dir_1arg(PG_FUNCTION_ARGS) > > > > - randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; > > > > - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); > > > > > > > @@ -1170,17 +1154,12 @@ pg_cursor(PG_FUNCTION_ARGS) > > > > - tupstore = > > > > - tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random, > > > > - false, work_mem); > > > > > > > +++ b/src/backend/utils/fmgr/funcapi.c > > > > + tupstore = tuplestore_begin_heap(true, false, maxKBytes); > > > > I believe the patch has preserved the same behavior. All of the callers > > for which I replaced tuplestore_begin_heap() which passed a variable for > > the randomAccess parameter had set that variable to something which was > > effectively the same as passing true -- SFRM_Materialize_Random. > > I don't think so ? > > They callers aren't passing SFRM_Materialize_Random, but rather > (allowedModes & SFRM_Materialize_Random) != 0 > > Where allowedModes is determined EXEC_FLAG_BACKWARD. > > src/include/executor/executor.h:extern Tuplestorestate *ExecMakeTableFunctionResult(SetExprState *setexpr, > src/include/executor/executor.h- ExprContext *econtext, > src/include/executor/executor.h- MemoryContext argContext, > src/include/executor/executor.h- TupleDesc expectedDesc, > src/include/executor/executor.h- bool randomAccess); > > src/backend/executor/nodeFunctionscan.c=FunctionNext(FunctionScanState *node) > src/backend/executor/nodeFunctionscan.c: ExecMakeTableFunctionResult(node->funcstates[0].setexpr, > src/backend/executor/nodeFunctionscan.c- node->ss.ps.ps_ExprContext, > src/backend/executor/nodeFunctionscan.c- node->argcontext, > src/backend/executor/nodeFunctionscan.c- node->funcstates[0].tupdesc, > src/backend/executor/nodeFunctionscan.c- node->eflags & EXEC_FLAG_BACKWARD); > You are right. I misread it. So, I've attached a patch where randomAccess is now an additional parameter (and registered for the next fest). I was thinking about how to add a test that would have broken when I passed true for randomAccess to tuplestore_begin_heap() when false was required. But, I don't fully understand the problem. If backward accesses to a tuplestore are not allowed but randomAccess is mistakenly passed as true, would the potential result be potentially wrong results from accessing the tuplestore results backwards? - Melanie
Attachment
On Thu, Nov 18, 2021 at 12:59:03PM -0500, Melanie Plageman wrote: > On Mon, Nov 8, 2021 at 3:13 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Mon, Nov 08, 2021 at 02:52:28PM -0500, Melanie Plageman wrote: > > > On Tue, Nov 2, 2021 at 4:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > > > > > > Several places have a conditional value for the first argument (randomAccess), > > > > but your patch changes the behavior to a constant "true". I didn't review the > > > > patch beyond that. > > > > ... > > > I believe the patch has preserved the same behavior. All of the callers > > > for which I replaced tuplestore_begin_heap() which passed a variable for > > > the randomAccess parameter had set that variable to something which was > > > effectively the same as passing true -- SFRM_Materialize_Random. > > > > I don't think so ? > > > > They callers aren't passing SFRM_Materialize_Random, but rather > > (allowedModes & SFRM_Materialize_Random) != 0 > > > > Where allowedModes is determined EXEC_FLAG_BACKWARD. ... > You are right. I misread it. > > So, I've attached a patch where randomAccess is now an additional > parameter (and registered for the next fest). > > I was thinking about how to add a test that would have broken when I > passed true for randomAccess to tuplestore_begin_heap() when false was > required. But, I don't fully understand the problem. If backward > accesses to a tuplestore are not allowed but randomAccess is mistakenly > passed as true, would the potential result be potentially wrong results > from accessing the tuplestore results backwards? No, see here src/backend/utils/fmgr/README-The Tuplestore must be created with randomAccess = true if src/backend/utils/fmgr/README:SFRM_Materialize_Random is set in allowedModes, but it can (and preferably src/backend/utils/fmgr/README-should) be created with randomAccess = false if not. Callers that can support src/backend/utils/fmgr/README-both ValuePerCall and Materialize mode will set SFRM_Materialize_Preferred, src/backend/utils/fmgr/README-or not, depending on which mode they prefer. If you use "randomAccess=true" when it's not needed, the result might be less efficient. Some callers specify "true" when they don't need to. I ran into that here. https://www.postgresql.org/message-id/21724.1583955158@sss.pgh.pa.us Maybe you'd want to add an 0002 patch which changes those to conditional ? BTW, there should be a newline before MakeFuncResultTuplestore(). -- Justin
On Thu, Nov 18, 2021 at 1:24 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Thu, Nov 18, 2021 at 12:59:03PM -0500, Melanie Plageman wrote: > > On Mon, Nov 8, 2021 at 3:13 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > On Mon, Nov 08, 2021 at 02:52:28PM -0500, Melanie Plageman wrote: > > > > On Tue, Nov 2, 2021 at 4:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > > > > > > > > Several places have a conditional value for the first argument (randomAccess), > > > > > but your patch changes the behavior to a constant "true". I didn't review the > > > > > patch beyond that. > > > > > > ... > > > > I believe the patch has preserved the same behavior. All of the callers > > > > for which I replaced tuplestore_begin_heap() which passed a variable for > > > > the randomAccess parameter had set that variable to something which was > > > > effectively the same as passing true -- SFRM_Materialize_Random. > > > > > > I don't think so ? > > > > > > They callers aren't passing SFRM_Materialize_Random, but rather > > > (allowedModes & SFRM_Materialize_Random) != 0 > > > > > > Where allowedModes is determined EXEC_FLAG_BACKWARD. > ... > > You are right. I misread it. > > > > So, I've attached a patch where randomAccess is now an additional > > parameter (and registered for the next fest). > > > > I was thinking about how to add a test that would have broken when I > > passed true for randomAccess to tuplestore_begin_heap() when false was > > required. But, I don't fully understand the problem. If backward > > accesses to a tuplestore are not allowed but randomAccess is mistakenly > > passed as true, would the potential result be potentially wrong results > > from accessing the tuplestore results backwards? > > No, see here > > src/backend/utils/fmgr/README-The Tuplestore must be created with randomAccess = true if > src/backend/utils/fmgr/README:SFRM_Materialize_Random is set in allowedModes, but it can (and preferably > src/backend/utils/fmgr/README-should) be created with randomAccess = false if not. Callers that can support > src/backend/utils/fmgr/README-both ValuePerCall and Materialize mode will set SFRM_Materialize_Preferred, > src/backend/utils/fmgr/README-or not, depending on which mode they prefer. > > If you use "randomAccess=true" when it's not needed, the result might be less > efficient. > > Some callers specify "true" when they don't need to. I ran into that here. > https://www.postgresql.org/message-id/21724.1583955158@sss.pgh.pa.us > > Maybe you'd want to add an 0002 patch which changes those to conditional ? Done in attached v4 > BTW, there should be a newline before MakeFuncResultTuplestore(). Fixed - Melanie
Attachment
On Mon, Dec 13, 2021 at 05:27:52PM -0500, Melanie Plageman wrote: > Done in attached v4 Thanks. I think these comments can be removed from the callers: /* it's a simple type, so don't use get_call_result_type */ You removed one call to tuplestore_donestoring(), but not the others. I guess you could remove them all (optionally as a separate patch). The rest of these are questions more than comments, and I'm not necessarily suggesting to change the patch: This isn't new in your patch, but for me, it's more clear if the callers have a separate boolean for randomAccess, rather than having the long expression in the function call: + tupstore = MakeFuncResultTuplestore(fcinfo, NULL, + rsinfo->allowedModes & SFRM_Materialize_Random); vs randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); + tupstore = MakeFuncResultTuplestore(fcinfo, NULL, randomAccess); One could also argue for passing rsinfo->allowedModes, instead of (rsinfo->allowedModes & SFRM_Materialize_Random). There's a couples places that you're checking expectedDesc where it wasn't being checked before. Is that some kind of live bug ? pg_config() text_to_table() It'd be nice if the "expected tuple format" error didn't need to be duplicated for each SRFs. I suppose the helper function could just take a boolean determining whether or not to throw an error (passing "expectedDesc==NULL"). You'd have to call the helper before CreateTupleDescCopy() - which you're already doing in a couple places for similar reasons. I noticed that tuplestore.h isn't included most places. I suppose it's included via execnodes.h. Your patch doesn't change that, arguably it should've always been included. -- Justin
Thanks for the detailed review! On Fri, Dec 17, 2021 at 3:04 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Mon, Dec 13, 2021 at 05:27:52PM -0500, Melanie Plageman wrote: > > Done in attached v4 > > Thanks. > > I think these comments can be removed from the callers: > /* it's a simple type, so don't use get_call_result_type */ Done in attached v5 > You removed one call to tuplestore_donestoring(), but not the others. > I guess you could remove them all (optionally as a separate patch). I removed it in that one location because I wanted to get rid of the local variable it used. I am fine with removing the other occurrences, but I thought there might be some reason why instead of removing it, they made it into a no-op. > The rest of these are questions more than comments, and I'm not necessarily > suggesting to change the patch: > > This isn't new in your patch, but for me, it's more clear if the callers have a > separate boolean for randomAccess, rather than having the long expression in > the function call: > > + tupstore = MakeFuncResultTuplestore(fcinfo, NULL, > + rsinfo->allowedModes & SFRM_Materialize_Random); > > vs > > randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; > - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); > + tupstore = MakeFuncResultTuplestore(fcinfo, NULL, randomAccess); > > One could also argue for passing rsinfo->allowedModes, instead of > (rsinfo->allowedModes & SFRM_Materialize_Random). I've changed the patch to have MakeFuncResultTuplestore() check rsinfo->allowedModes and pass in the resulting boolean to tuplestore_begin_heap(). Because I modified the MakeFuncResultTuplestore() function signature to accommodate this, I had to collapse the two patches into one, as I couldn't maintain the call sites which passed in a constant. > There's a couples places that you're checking expectedDesc where it wasn't > being checked before. Is that some kind of live bug ? > pg_config() text_to_table() Yes, expectedDesc was accessed in these locations without checking that it is not NULL. Should that be a separate patch? > It'd be nice if the "expected tuple format" error didn't need to be duplicated > for each SRFs. I suppose the helper function could just take a boolean > determining whether or not to throw an error (passing "expectedDesc==NULL"). > You'd have to call the helper before CreateTupleDescCopy() - which you're > already doing in a couple places for similar reasons. So, I don't think it makes sense to move that error message into MakeFuncResultTuplestore() for the cases in which NULL is passed in for the tuple descriptor. expectedDesc is not accessed at all in these cases inside the function (since get_call_result_type()) is not called. For the cases when a tuple descriptor is passed in, only two callers actually check for expectedDesc -- each_worker_jsonb and each_worker. Therefore, I don't think it makes sense to add another parameter just to move that check inside for two callers. > I noticed that tuplestore.h isn't included most places. I suppose it's > included via execnodes.h. Your patch doesn't change that, arguably it > should've always been included. I've now included it in funcapi.c. - Melanie
Attachment
On Wed, Jan 05, 2022 at 12:09:16PM -0500, Melanie Plageman wrote: > On Fri, Dec 17, 2021 at 3:04 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > The rest of these are questions more than comments, and I'm not necessarily > > suggesting to change the patch: > > > > This isn't new in your patch, but for me, it's more clear if the callers have a > > separate boolean for randomAccess, rather than having the long expression in > > the function call: > > > > + tupstore = MakeFuncResultTuplestore(fcinfo, NULL, > > + rsinfo->allowedModes & SFRM_Materialize_Random); > > > > vs > > > > randomAccess = (rsinfo->allowedModes & SFRM_Materialize_Random) != 0; > > - tupstore = tuplestore_begin_heap(randomAccess, false, work_mem); > > + tupstore = MakeFuncResultTuplestore(fcinfo, NULL, randomAccess); > > > > One could also argue for passing rsinfo->allowedModes, instead of > > (rsinfo->allowedModes & SFRM_Materialize_Random). > > I've changed the patch to have MakeFuncResultTuplestore() check > rsinfo->allowedModes and pass in the resulting boolean to > tuplestore_begin_heap(). Because I modified the > MakeFuncResultTuplestore() function signature to accommodate this, I had > to collapse the two patches into one, as I couldn't maintain the call > sites which passed in a constant. > > There's a couples places that you're checking expectedDesc where it wasn't > > being checked before. Is that some kind of live bug ? > > pg_config() text_to_table() > > Yes, expectedDesc was accessed in these locations without checking that > it is not NULL. Should that be a separate patch? I don't know. The patch is already easy to review, since it's mostly limited to removing code and fixing inconsistencies (NULL check) and possible inefficiencies (randomAccess). If the expectedDesc NULL check were an 0001 patch, then 0002 (the main patch) would be even easier to review. Only foreign.c is different. > > You removed one call to tuplestore_donestoring(), but not the others. > > I guess you could remove them all (optionally as a separate patch). > > I removed it in that one location because I wanted to get rid of the > local variable it used. What local variable ? I see now that logicalfuncs.c never had a local variable called tupstore. I'm sure it was intended since 2014 (b89e15105) to say tupestore_donestoring(p->tupstore). But donestoring(typo) causes no error, since the define is a NOP. src/include/utils/tuplestore.h:#define tuplestore_donestoring(state) ((void) 0) > I am fine with removing the other occurrences, > but I thought there might be some reason why instead of removing it, > they made it into a no-op. I assume Tom left it (actually, added it back in dd04e958c) to avoid breaking extensions for no reason. And maybe to preserve the possbility of at some point in the future doing something again during the "done" step. I'd leave it for input from a committer about those: - remove tuplestore_donestoring() calls ? - split expectedDesc NULL check to an 0001 patch ? - anything other opened questions ? I'm marking this as RFC, with the caveat that the newline before MakeFuncResultTuplestore went missing again :) -- Justin
On Wed, Jan 5, 2022 at 7:57 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Wed, Jan 05, 2022 at 12:09:16PM -0500, Melanie Plageman wrote: > > On Fri, Dec 17, 2021 at 3:04 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > There's a couples places that you're checking expectedDesc where it wasn't > > > being checked before. Is that some kind of live bug ? > > > pg_config() text_to_table() > > > > Yes, expectedDesc was accessed in these locations without checking that > > it is not NULL. Should that be a separate patch? > > I don't know. The patch is already easy to review, since it's mostly limited > to removing code and fixing inconsistencies (NULL check) and possible > inefficiencies (randomAccess). > > If the expectedDesc NULL check were an 0001 patch, then 0002 (the main patch) > would be even easier to review. Only foreign.c is different. I'll wait to do that if preferred by committer. Are you imagining that patch 0001 would only add the check for expectedDesc that is missing from pg_config() and text_to_table()? > > > You removed one call to tuplestore_donestoring(), but not the others. > > > I guess you could remove them all (optionally as a separate patch). > > > > I removed it in that one location because I wanted to get rid of the > > local variable it used. > > What local variable ? I see now that logicalfuncs.c never had a local variable > called tupstore. I'm sure it was intended since 2014 (b89e15105) to say > tupestore_donestoring(p->tupstore). But donestoring(typo) causes no error, > since the define is a NOP. > > src/include/utils/tuplestore.h:#define tuplestore_donestoring(state) ((void) 0) Yes, I mean the local variable, tupstore. > > I am fine with removing the other occurrences, > > but I thought there might be some reason why instead of removing it, > > they made it into a no-op. > > I assume Tom left it (actually, added it back in dd04e958c) to avoid breaking > extensions for no reason. And maybe to preserve the possbility of at some > point in the future doing something again during the "done" step. > > I'd leave it for input from a committer about those: > > - remove tuplestore_donestoring() calls ? > - split expectedDesc NULL check to an 0001 patch ? > - anything other opened questions ? > > I'm marking this as RFC, with the caveat that the newline before > MakeFuncResultTuplestore went missing again :) oops. I've attached v6 with the newline. - Melanie
Attachment
On Tue, Jan 11, 2022 at 10:19:33AM -0500, Melanie Plageman wrote: > On Wed, Jan 5, 2022 at 7:57 PM Justin Pryzby <pryzby@telsasoft.com> wrote: >> I'd leave it for input from a committer about those: >> >> - remove tuplestore_donestoring() calls ? This part has been raised on a different thread, as of: https://www.postgresql.org/message-id/CAP0=ZVJeeYfAeRfmzqAF2Lumdiv4S4FewyBnZd4DPTrsSQKJKw@mail.gmail.com And it looks sensible from here to just remove any traces of this stuff from the C code, while leaving the compatibility macro around to avoid any breakages with any external modules. >> - split expectedDesc NULL check to an 0001 patch ? I am not sure that this is worth a split, as the same areas are touched. >> - anything other opened questions ? >> >> I'm marking this as RFC, with the caveat that the newline before >> MakeFuncResultTuplestore went missing again :) > > oops. I've attached v6 with the newline. Saying that, the wrapper you are proposing here looks like a good idea. I have wanted a centralized wrapper more than once, and this is a nice cut overall. -- Michael
Attachment
On Tue, Jan 11, 2022 at 10:19:33AM -0500, Melanie Plageman wrote: > > If the expectedDesc NULL check were an 0001 patch, then 0002 (the main patch) > > would be even easier to review. Only foreign.c is different. > > I'll wait to do that if preferred by committer. > Are you imagining that patch 0001 would only add the check for > expectedDesc that is missing from pg_config() and text_to_table()? Yes, that's what I meant, but I haven't been able to determine if a NULL check should be added at all. I found this commit message, which indicates that it shouldn't be used in every case. I'm not clear when it should/not be used, though. commit 60651e4cddbb77e8f1a0c7fc0be6a7e7bf626fe0 Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat Oct 28 14:02:21 2017 -0400 Support domains over composite types in PL/Perl. In passing, don't insist on rsi->expectedDesc being set unless we actually need it; this allows succeeding in a couple of cases where PL/Perl functions returning setof composite would have failed before, and makes the error message more apropos in other cases.
On Mon, Feb 21, 2022 at 04:41:17PM +0900, Michael Paquier wrote: > So, I got my hands on this area, and found myself applying 07daca5 as > a first piece of the puzzle. Anyway, after more review today, I have > bumped into more pieces that could be consolidated, and finished with > the following patch set: > - 0001 changes a couple of SRFs that I found worth simplifying. These > refer mostly to the second and fourth group mentioned upthread by > Melanie, with two exceptions: pg_tablespace_databases() where it is > not worth changing to keep it backward-compatible and pg_ls_dir() as > per its one-arg version. That's a nice first cut in itself. > - 0002 changes a couple of places to unify some existing SRF checks, > that I bumped into on the way. The value here is in using the same > error messages everywhere, reducing the translation effort and > generating the same errors for all cases based on the same conditions. > This eases the review of the next patch. These two have been now applied, with some comment improvements and the cleanup of pg_options_to_table() done in 0001, and a slight change in 0002 for pageinspect where a check was not necessary for a BRIN code path. > - 0003 is the actual refactoring meat, where I have been able to move > the check on expectedDesc into MakeFuncResultTuplestore(), shaving > more code than previously. If you discard the cases of patch 0001, it > should actually be possible to set setResult, setDesc and returnMode > within the new function, which would feel natural as that's the place > where we create the function's tuplestore and we want to materialize > its contents. The cases with the JSON code are also a bit hairy and > need more thoughts, but we could also cut this part of the code from > the initial refactoring effort. This is the remaining piece, as attached, that I have not been able to poke much at yet. -- Michael
Attachment
On Thu, Feb 24, 2022 at 08:25:06PM +0900, Michael Paquier wrote: > This is the remaining piece, as attached, that I have not been able to > poke much at yet. So, I have finally poked at this last part of the patch set, and I found that we can be more aggressive with the refactoring, by moving into MakeFuncResultTuplestore() the parts where we save the tuplestore and the tupledesc in the per-query memory context. There are two pieces that matter once things are reshaped: - The tuple descriptor may need some extra validation via BlessTupleDesc() when it comes from a transient record datatype, something that happens for most of the subroutines related to the JSON functions. - expectedDesc is sometimes required by the caller, though most of the time just needs to be built with the more expensive get_call_result_type(). In order to keep things pluggable at will, MakeFuncResultTuplestore() has been changed to access a set of bits32 flags, able to control the two options above. With this facility in place, I have been able to cut much more code than the initial patch, roughly twice as of: 24 files changed, 157 insertions(+), 893 deletions(-) This seems in rather good shape to me, the changes are straight-forward and the code cut is really good, so I'd like to move on with that. 0001 is the initial patch, and 0002 is the extra refactoring I have been working on. The plan would be to merge both, but I am sending a split to ease any checks on what I have changed. Comments or objections? -- Michael
Attachment
On Mon, Feb 28, 2022 at 04:49:41PM +0900, Michael Paquier wrote: > In order to keep things pluggable at will, MakeFuncResultTuplestore() > has been changed to access a set of bits32 flags, able to control the > two options above. With this facility in place, I have been able to > cut much more code than the initial patch, roughly twice as of: > 24 files changed, 157 insertions(+), 893 deletions(-) So, I have been thinking about this patch once again, and after pondering more on it, MakeFuncResultTuplestore() is actually a wrong name now that it does much more than just creating a tuplestore, by assigning the TupleDesc and the TupleStore into the function's ReturnSetInfo. This is actually setting up a function in the context of a single call where we fill the tuplestore with all its values, so instead I have settled down to name that SetSingleFuncCall(), to make a parallel with the existing MultiFuncCall*(). funcapi.c is the right place for that, and I have added more documentation in the fmgr's README as well as funcapi.h. -- Michael
Attachment
On Wed, Mar 02, 2022 at 03:43:17PM +0900, Michael Paquier wrote: > This is actually setting up a function in the context of a single call > where we fill the tuplestore with all its values, so instead I have > settled down to name that SetSingleFuncCall(), to make a parallel with > the existing MultiFuncCall*(). funcapi.c is the right place for > that, and I have added more documentation in the fmgr's README as well > as funcapi.h. I have tortured all those code paths for the last couple of days, and the new function name, as well as its options, still seemed fine to me, so I have applied the patch. The buildfarm is cool with it. It is worth noting that there are more code paths in contrib/ that could be simplified with this new routine. -- Michael
Attachment
On Sun, Mar 6, 2022 at 9:29 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Mar 02, 2022 at 03:43:17PM +0900, Michael Paquier wrote: > > This is actually setting up a function in the context of a single call > > where we fill the tuplestore with all its values, so instead I have > > settled down to name that SetSingleFuncCall(), to make a parallel with > > the existing MultiFuncCall*(). funcapi.c is the right place for > > that, and I have added more documentation in the fmgr's README as well > > as funcapi.h. > > I have tortured all those code paths for the last couple of days, and > the new function name, as well as its options, still seemed fine to > me, so I have applied the patch. The buildfarm is cool with it. It > is worth noting that there are more code paths in contrib/ that could > be simplified with this new routine. Wow! Thanks so much for taking the time to review, refine, and commit this work. I've attached a patch using the helper in most locations in contrib modules that seemed useful. The following I don't think we can use the helper or it is not worth it: - pg_logdir_ls() in contrib/adminpack has return type TYPEFUNC_RECORD and expectedDesc is not already created, so the helper can't be used. But basically, since it doesn't specify OUT argument names, it has to do TupleDescInitEntry() itself anyway, I think. - contrib/tablefunc.c was also not simple to refactor. the various parts of SetSingleFuncCall are spread throughout different functions in the file. - contrib/dblink has one function which returns a tuplestore that was simple to change (dblink_get_notify()) and I've done so. However, most of the other creation of tuplestore and tupledescriptors is in helper functions (like materializeResult()) which separate the tuplestore creation from the tuple descriptor initialization in a way that made it hard to do a drop-in replacement with the helper function. - Melanie
Attachment
On Mon, Mar 07, 2022 at 05:09:19PM -0500, Melanie Plageman wrote: > I've attached a patch using the helper in most locations in contrib > modules that seemed useful. Thanks for the patch. I was also looking at that yesterday, and this pretty much maps to what I have finished with, except for dblink and xml2 where it was not looking that obvious to me at quick glance. In xml2, my eyes hurt a bit on the meaning of doing the "Switch out of SPI context" in xpath_table() after doing the two SPI calls, but I agree that this extra switch back to the old context should not be necessary. For dblink, I did not notice that the change for dblink_get_notify() would be this straight-forward, good catch. It would be nice to see prepTuplestoreResult() gone at the end, though I am not sure how invasive/worth it would be for dblink/. > - pg_logdir_ls() in contrib/adminpack has return type TYPEFUNC_RECORD > and expectedDesc is not already created, so the helper can't be used. > But basically, since it doesn't specify OUT argument names, it has to > do TupleDescInitEntry() itself anyway, I think. Here, actually, I thought first that a simplification should be possible, noticing that the regression tests passed with the function called until I saw what it was testing :) I am fine to not poke at the beast, and it looks like we may run into trouble if we wish to maintain compatibility down to adminpack 1.0 at runtime. This function has a very limited usage, anyway. > - contrib/tablefunc.c was also not simple to refactor. the various parts > of SetSingleFuncCall are spread throughout different functions in the > file. > > - contrib/dblink has one function which returns a tuplestore that was > simple to change (dblink_get_notify()) and I've done so. This matches my impression, so applied. -- Michael