Thread: Re: Assert failure on running a completed portal again
Yugo Nagata <nagata@sraoss.co.jp> writes:
> I notice that the following Assert in PortalRun fails when the same portal is
> executed more than once by an Execute message whose "max number of rows"
> is specified to zero, that is, "no limit".

>     /* Set run_once flag.  Shouldn't be clear if previously set. */
>     Assert(!portal->run_once || run_once);
>     portal->run_once = run_once;

Hmm.  It's fairly hard to tell what was meant there, because
(a) there is exactly zero documentation of what run_once means
(b) that comment is opaque and doesn't appear to describe the code
behavior anyway.

> I believe the server should return CommandComplete normally in this case.
> This can be fixed by not setting the execute_is_fetch flag (and hence run_once)
> when the portal is already completed, since no rows will be fetched in this case.

Don't like this much.  I'm not sure I believe any part of this
approach to deciding whether the portal is "run once".  In any
case, a proper fix for this needs to include actually documenting
what that field means and does.

			regards, tom lane

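For context, the way run_once gets computed for an Execute message can be
sketched from exec_execute_message roughly as follows.  The expression passed
to PortalRun is the one visible in the patches below; the surrounding
walkthrough is a simplified illustration of the failing sequence, not the
exact code.

	/* in exec_execute_message(), roughly (simplified sketch) */
	execute_is_fetch = !portal->atStart;	/* re-Execute of an existing portal? */

	if (max_rows <= 0)
		max_rows = FETCH_ALL;				/* 0 on the wire means "no limit" */

	completed = PortalRun(portal, max_rows,
						  true,				/* always top level */
						  !execute_is_fetch && max_rows == FETCH_ALL,	/* run_once */
						  receiver, receiver, &qc);

	/*
	 * First Execute with max rows = 0: the portal is still at its start, so
	 * run_once is passed as true, the query runs to completion, and
	 * portal->run_once gets set.  A second Execute on the same, now-advanced
	 * portal computes run_once = false, and the Assert above fires.
	 */
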
I wrote: > Don't like this much. I'm not sure I believe any part of this > approach to deciding whether the portal is "run once". In any > case, a proper fix for this needs to include actually documenting > what that field means and does. After looking at this further, I think this whole "run_once" business is badly designed, worse documented, and redundant (as well as buggy). It can be replaced with three self-contained lines in ExecutePlan, as per the attached. (Obviously, the API changes in this wouldn't do for the back branches. But we could leave those arguments in place as unused dummies.) regards, tom lane diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c index 623a674f99..f2eaa8e494 100644 --- a/contrib/auto_explain/auto_explain.c +++ b/contrib/auto_explain/auto_explain.c @@ -79,7 +79,7 @@ static ExecutorEnd_hook_type prev_ExecutorEnd = NULL; static void explain_ExecutorStart(QueryDesc *queryDesc, int eflags); static void explain_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, - uint64 count, bool execute_once); + uint64 count); static void explain_ExecutorFinish(QueryDesc *queryDesc); static void explain_ExecutorEnd(QueryDesc *queryDesc); @@ -321,15 +321,15 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags) */ static void explain_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, - uint64 count, bool execute_once) + uint64 count) { nesting_level++; PG_TRY(); { if (prev_ExecutorRun) - prev_ExecutorRun(queryDesc, direction, count, execute_once); + prev_ExecutorRun(queryDesc, direction, count); else - standard_ExecutorRun(queryDesc, direction, count, execute_once); + standard_ExecutorRun(queryDesc, direction, count); } PG_FINALLY(); { diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index 49c657b3e0..602cae54ff 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -335,7 +335,7 @@ static PlannedStmt *pgss_planner(Query *parse, static void pgss_ExecutorStart(QueryDesc *queryDesc, int eflags); static void pgss_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, - uint64 count, bool execute_once); + uint64 count); static void pgss_ExecutorFinish(QueryDesc *queryDesc); static void pgss_ExecutorEnd(QueryDesc *queryDesc); static void pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, @@ -1021,16 +1021,15 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags) * ExecutorRun hook: all we need do is track nesting depth */ static void -pgss_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, uint64 count, - bool execute_once) +pgss_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, uint64 count) { nesting_level++; PG_TRY(); { if (prev_ExecutorRun) - prev_ExecutorRun(queryDesc, direction, count, execute_once); + prev_ExecutorRun(queryDesc, direction, count); else - standard_ExecutorRun(queryDesc, direction, count, execute_once); + standard_ExecutorRun(queryDesc, direction, count); } PG_FINALLY(); { diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c index f55e6d9675..161a0f8b0a 100644 --- a/src/backend/commands/copyto.c +++ b/src/backend/commands/copyto.c @@ -880,7 +880,7 @@ DoCopyTo(CopyToState cstate) else { /* run the plan --- the dest receiver will send tuples */ - ExecutorRun(cstate->queryDesc, ForwardScanDirection, 0, true); + ExecutorRun(cstate->queryDesc, ForwardScanDirection, 0); processed = ((DR_copy *) cstate->queryDesc->dest)->processed; 
} diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c index 5c92e48a56..466376af9b 100644 --- a/src/backend/commands/createas.c +++ b/src/backend/commands/createas.c @@ -340,7 +340,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt, ExecutorStart(queryDesc, GetIntoRelEFlags(into)); /* run the plan to completion */ - ExecutorRun(queryDesc, ForwardScanDirection, 0, true); + ExecutorRun(queryDesc, ForwardScanDirection, 0); /* save the rowcount if we're given a qc to fill */ if (qc) diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index a3f1d53d7a..3078f5c1a3 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -719,7 +719,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es, dir = ForwardScanDirection; /* run the plan */ - ExecutorRun(queryDesc, dir, 0, true); + ExecutorRun(queryDesc, dir, 0); /* run cleanup too */ ExecutorFinish(queryDesc); diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index af6bd8ff42..e683c520a8 100644 --- a/src/backend/commands/extension.c +++ b/src/backend/commands/extension.c @@ -912,7 +912,7 @@ execute_sql_string(const char *sql, const char *filename) dest, NULL, NULL, 0); ExecutorStart(qdesc, 0); - ExecutorRun(qdesc, ForwardScanDirection, 0, true); + ExecutorRun(qdesc, ForwardScanDirection, 0); ExecutorFinish(qdesc); ExecutorEnd(qdesc); diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c index 010097873d..694da8291e 100644 --- a/src/backend/commands/matview.c +++ b/src/backend/commands/matview.c @@ -446,7 +446,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query, ExecutorStart(queryDesc, 0); /* run the plan */ - ExecutorRun(queryDesc, ForwardScanDirection, 0, true); + ExecutorRun(queryDesc, ForwardScanDirection, 0); processed = queryDesc->estate->es_processed; diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c index ac52ca25e9..dc68264242 100644 --- a/src/backend/commands/portalcmds.c +++ b/src/backend/commands/portalcmds.c @@ -427,7 +427,7 @@ PersistHoldablePortal(Portal portal) NULL); /* Fetch the result set into the tuplestore */ - ExecutorRun(queryDesc, direction, 0, false); + ExecutorRun(queryDesc, direction, 0); queryDesc->dest->rDestroy(queryDesc->dest); queryDesc->dest = NULL; diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c index a93f970a29..39a71c1de2 100644 --- a/src/backend/commands/prepare.c +++ b/src/backend/commands/prepare.c @@ -252,7 +252,7 @@ ExecuteQuery(ParseState *pstate, */ PortalStart(portal, paramLI, eflags, GetActiveSnapshot()); - (void) PortalRun(portal, count, false, true, dest, dest, qc); + (void) PortalRun(portal, count, false, dest, dest, qc); PortalDrop(portal, false); diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index 5ca856fd27..312f622f32 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -77,14 +77,13 @@ static void InitPlan(QueryDesc *queryDesc, int eflags); static void CheckValidRowMarkRel(Relation rel, RowMarkType markType); static void ExecPostprocessPlan(EState *estate); static void ExecEndPlan(PlanState *planstate, EState *estate); -static void ExecutePlan(EState *estate, PlanState *planstate, +static void ExecutePlan(QueryDesc *queryDesc, bool use_parallel_mode, CmdType operation, bool sendTuples, uint64 numberTuples, ScanDirection direction, - DestReceiver *dest, - bool execute_once); + DestReceiver 
*dest); static bool ExecCheckOneRelPerms(RTEPermissionInfo *perminfo); static bool ExecCheckPermissionsModified(Oid relOid, Oid userid, Bitmapset *modifiedCols, @@ -294,18 +293,17 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags) */ void ExecutorRun(QueryDesc *queryDesc, - ScanDirection direction, uint64 count, - bool execute_once) + ScanDirection direction, uint64 count) { if (ExecutorRun_hook) - (*ExecutorRun_hook) (queryDesc, direction, count, execute_once); + (*ExecutorRun_hook) (queryDesc, direction, count); else - standard_ExecutorRun(queryDesc, direction, count, execute_once); + standard_ExecutorRun(queryDesc, direction, count); } void standard_ExecutorRun(QueryDesc *queryDesc, - ScanDirection direction, uint64 count, bool execute_once) + ScanDirection direction, uint64 count) { EState *estate; CmdType operation; @@ -354,21 +352,13 @@ standard_ExecutorRun(QueryDesc *queryDesc, * run plan */ if (!ScanDirectionIsNoMovement(direction)) - { - if (execute_once && queryDesc->already_executed) - elog(ERROR, "can't re-execute query flagged for single execution"); - queryDesc->already_executed = true; - - ExecutePlan(estate, - queryDesc->planstate, + ExecutePlan(queryDesc, queryDesc->plannedstmt->parallelModeNeeded, operation, sendTuples, count, direction, - dest, - execute_once); - } + dest); /* * Update es_total_processed to keep track of the number of tuples @@ -1601,22 +1591,19 @@ ExecCloseRangeTableRelations(EState *estate) * moving in the specified direction. * * Runs to completion if numberTuples is 0 - * - * Note: the ctid attribute is a 'junk' attribute that is removed before the - * user can see it * ---------------------------------------------------------------- */ static void -ExecutePlan(EState *estate, - PlanState *planstate, +ExecutePlan(QueryDesc *queryDesc, bool use_parallel_mode, CmdType operation, bool sendTuples, uint64 numberTuples, ScanDirection direction, - DestReceiver *dest, - bool execute_once) + DestReceiver *dest) { + EState *estate = queryDesc->estate; + PlanState *planstate = queryDesc->planstate; TupleTableSlot *slot; uint64 current_tuple_count; @@ -1631,11 +1618,13 @@ ExecutePlan(EState *estate, estate->es_direction = direction; /* - * If the plan might potentially be executed multiple times, we must force - * it to run without parallelism, because we might exit early. + * Parallel mode only supports complete execution of a plan. If we've + * already partially executed it, or if the caller asks us to exit early, + * we must force the plan to run without parallelism. */ - if (!execute_once) + if (queryDesc->already_executed || numberTuples != 0) use_parallel_mode = false; + queryDesc->already_executed = true; estate->es_use_parallel_mode = use_parallel_mode; if (use_parallel_mode) diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c index bfb3419efb..846ec727de 100644 --- a/src/backend/executor/execParallel.c +++ b/src/backend/executor/execParallel.c @@ -1471,8 +1471,7 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc) */ ExecutorRun(queryDesc, ForwardScanDirection, - fpes->tuples_needed < 0 ? (int64) 0 : fpes->tuples_needed, - true); + fpes->tuples_needed < 0 ? 
(int64) 0 : fpes->tuples_needed); /* Shut down the executor */ ExecutorFinish(queryDesc); diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c index 8d1fda2ddc..3b2f78b219 100644 --- a/src/backend/executor/functions.c +++ b/src/backend/executor/functions.c @@ -894,7 +894,7 @@ postquel_getnext(execution_state *es, SQLFunctionCachePtr fcache) /* Run regular commands to completion unless lazyEval */ uint64 count = (es->lazyEval) ? 1 : 0; - ExecutorRun(es->qd, ForwardScanDirection, count, !fcache->returnsSet || !es->lazyEval); + ExecutorRun(es->qd, ForwardScanDirection, count); /* * If we requested run to completion OR there was no tuple returned, diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 2fb2e73604..c1d8fd08c6 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -2929,7 +2929,7 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount) ExecutorStart(queryDesc, eflags); - ExecutorRun(queryDesc, ForwardScanDirection, tcount, true); + ExecutorRun(queryDesc, ForwardScanDirection, tcount); _SPI_current->processed = queryDesc->estate->es_processed; diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 42af768045..e0a603f42b 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -1283,7 +1283,6 @@ exec_simple_query(const char *query_string) (void) PortalRun(portal, FETCH_ALL, true, /* always top level */ - true, receiver, receiver, &qc); @@ -2260,7 +2259,6 @@ exec_execute_message(const char *portal_name, long max_rows) completed = PortalRun(portal, max_rows, true, /* always top level */ - !execute_is_fetch && max_rows == FETCH_ALL, receiver, receiver, &qc); diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c index 0c45fcf318..89d704df8d 100644 --- a/src/backend/tcop/pquery.c +++ b/src/backend/tcop/pquery.c @@ -157,7 +157,7 @@ ProcessQuery(PlannedStmt *plan, /* * Run the plan to completion. */ - ExecutorRun(queryDesc, ForwardScanDirection, 0, true); + ExecutorRun(queryDesc, ForwardScanDirection, 0); /* * Build command completion status data, if caller wants one. @@ -681,7 +681,7 @@ PortalSetResultFormat(Portal portal, int nFormats, int16 *formats) * suspended due to exhaustion of the count parameter. */ bool -PortalRun(Portal portal, long count, bool isTopLevel, bool run_once, +PortalRun(Portal portal, long count, bool isTopLevel, DestReceiver *dest, DestReceiver *altdest, QueryCompletion *qc) { @@ -714,10 +714,6 @@ PortalRun(Portal portal, long count, bool isTopLevel, bool run_once, */ MarkPortalActive(portal); - /* Set run_once flag. Shouldn't be clear if previously set. */ - Assert(!portal->run_once || run_once); - portal->run_once = run_once; - /* * Set up global portal context pointers. * @@ -921,8 +917,7 @@ PortalRunSelect(Portal portal, else { PushActiveSnapshot(queryDesc->snapshot); - ExecutorRun(queryDesc, direction, (uint64) count, - portal->run_once); + ExecutorRun(queryDesc, direction, (uint64) count); nprocessed = queryDesc->estate->es_processed; PopActiveSnapshot(); } @@ -961,8 +956,7 @@ PortalRunSelect(Portal portal, else { PushActiveSnapshot(queryDesc->snapshot); - ExecutorRun(queryDesc, direction, (uint64) count, - portal->run_once); + ExecutorRun(queryDesc, direction, (uint64) count); nprocessed = queryDesc->estate->es_processed; PopActiveSnapshot(); } @@ -1406,9 +1400,6 @@ PortalRunFetch(Portal portal, */ MarkPortalActive(portal); - /* If supporting FETCH, portal can't be run-once. 
*/ - Assert(!portal->run_once); - /* * Set up global portal context pointers. */ diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h index 0a7274e26c..79e8b63111 100644 --- a/src/include/executor/execdesc.h +++ b/src/include/executor/execdesc.h @@ -48,7 +48,7 @@ typedef struct QueryDesc EState *estate; /* executor's query-wide state */ PlanState *planstate; /* tree of per-plan-node state */ - /* This field is set by ExecutorRun */ + /* This field is set by ExecutePlan */ bool already_executed; /* true if previously executed */ /* This is always set NULL by the core system, but plugins can change it */ diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h index 69c3ebff00..892cf055cd 100644 --- a/src/include/executor/executor.h +++ b/src/include/executor/executor.h @@ -78,8 +78,7 @@ extern PGDLLIMPORT ExecutorStart_hook_type ExecutorStart_hook; /* Hook for plugins to get control in ExecutorRun() */ typedef void (*ExecutorRun_hook_type) (QueryDesc *queryDesc, ScanDirection direction, - uint64 count, - bool execute_once); + uint64 count); extern PGDLLIMPORT ExecutorRun_hook_type ExecutorRun_hook; /* Hook for plugins to get control in ExecutorFinish() */ @@ -200,9 +199,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull) extern void ExecutorStart(QueryDesc *queryDesc, int eflags); extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags); extern void ExecutorRun(QueryDesc *queryDesc, - ScanDirection direction, uint64 count, bool execute_once); + ScanDirection direction, uint64 count); extern void standard_ExecutorRun(QueryDesc *queryDesc, - ScanDirection direction, uint64 count, bool execute_once); + ScanDirection direction, uint64 count); extern void ExecutorFinish(QueryDesc *queryDesc); extern void standard_ExecutorFinish(QueryDesc *queryDesc); extern void ExecutorEnd(QueryDesc *queryDesc); diff --git a/src/include/tcop/pquery.h b/src/include/tcop/pquery.h index 073fb323bc..37ba3160e5 100644 --- a/src/include/tcop/pquery.h +++ b/src/include/tcop/pquery.h @@ -36,7 +36,7 @@ extern void PortalSetResultFormat(Portal portal, int nFormats, int16 *formats); extern bool PortalRun(Portal portal, long count, bool isTopLevel, - bool run_once, DestReceiver *dest, DestReceiver *altdest, + DestReceiver *dest, DestReceiver *altdest, QueryCompletion *qc); extern uint64 PortalRunFetch(Portal portal, diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h index 29f49829f2..afe605a8ed 100644 --- a/src/include/utils/portal.h +++ b/src/include/utils/portal.h @@ -145,7 +145,6 @@ typedef struct PortalData /* Features/options */ PortalStrategy strategy; /* see above */ int cursorOptions; /* DECLARE CURSOR option bits */ - bool run_once; /* portal will only be run once */ /* Status data */ PortalStatus status; /* see above */
I wrote: > After looking at this further, I think this whole "run_once" > business is badly designed, worse documented, and redundant > (as well as buggy). It can be replaced with three self-contained > lines in ExecutePlan, as per the attached. > (Obviously, the API changes in this wouldn't do for the back > branches. But we could leave those arguments in place as > unused dummies.) Here's a cut-down version that I think would be OK to use in the back branches. regards, tom lane diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index f8113d116d..3ccdbde112 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -75,14 +75,13 @@ static void InitPlan(QueryDesc *queryDesc, int eflags); static void CheckValidRowMarkRel(Relation rel, RowMarkType markType); static void ExecPostprocessPlan(EState *estate); static void ExecEndPlan(PlanState *planstate, EState *estate); -static void ExecutePlan(EState *estate, PlanState *planstate, +static void ExecutePlan(QueryDesc *queryDesc, bool use_parallel_mode, CmdType operation, bool sendTuples, uint64 numberTuples, ScanDirection direction, - DestReceiver *dest, - bool execute_once); + DestReceiver *dest); static bool ExecCheckOneRelPerms(RTEPermissionInfo *perminfo); static bool ExecCheckPermissionsModified(Oid relOid, Oid userid, Bitmapset *modifiedCols, @@ -283,6 +282,9 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags) * retrieved tuples, not for instance to those inserted/updated/deleted * by a ModifyTable plan node. * + * execute_once is ignored, and is present only to avoid an API break + * in stable branches. + * * There is no return value, but output tuples (if any) are sent to * the destination receiver specified in the QueryDesc; and the number * of tuples processed at the top level can be found in @@ -357,21 +359,13 @@ standard_ExecutorRun(QueryDesc *queryDesc, * run plan */ if (!ScanDirectionIsNoMovement(direction)) - { - if (execute_once && queryDesc->already_executed) - elog(ERROR, "can't re-execute query flagged for single execution"); - queryDesc->already_executed = true; - - ExecutePlan(estate, - queryDesc->planstate, + ExecutePlan(queryDesc, queryDesc->plannedstmt->parallelModeNeeded, operation, sendTuples, count, direction, - dest, - execute_once); - } + dest); /* * Update es_total_processed to keep track of the number of tuples @@ -1600,22 +1594,19 @@ ExecCloseRangeTableRelations(EState *estate) * moving in the specified direction. * * Runs to completion if numberTuples is 0 - * - * Note: the ctid attribute is a 'junk' attribute that is removed before the - * user can see it * ---------------------------------------------------------------- */ static void -ExecutePlan(EState *estate, - PlanState *planstate, +ExecutePlan(QueryDesc *queryDesc, bool use_parallel_mode, CmdType operation, bool sendTuples, uint64 numberTuples, ScanDirection direction, - DestReceiver *dest, - bool execute_once) + DestReceiver *dest) { + EState *estate = queryDesc->estate; + PlanState *planstate = queryDesc->planstate; TupleTableSlot *slot; uint64 current_tuple_count; @@ -1630,11 +1621,13 @@ ExecutePlan(EState *estate, estate->es_direction = direction; /* - * If the plan might potentially be executed multiple times, we must force - * it to run without parallelism, because we might exit early. + * Parallel mode only supports complete execution of a plan. If we've + * already partially executed it, or if the caller asks us to exit early, + * we must force the plan to run without parallelism. 
*/ - if (!execute_once) + if (queryDesc->already_executed || numberTuples != 0) use_parallel_mode = false; + queryDesc->already_executed = true; estate->es_use_parallel_mode = use_parallel_mode; if (use_parallel_mode) diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index a750dc800b..1887b11982 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -1278,7 +1278,7 @@ exec_simple_query(const char *query_string) (void) PortalRun(portal, FETCH_ALL, true, /* always top level */ - true, + true, /* ignored */ receiver, receiver, &qc); @@ -2255,7 +2255,7 @@ exec_execute_message(const char *portal_name, long max_rows) completed = PortalRun(portal, max_rows, true, /* always top level */ - !execute_is_fetch && max_rows == FETCH_ALL, + true, /* ignored */ receiver, receiver, &qc); diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c index 0c45fcf318..0a72006823 100644 --- a/src/backend/tcop/pquery.c +++ b/src/backend/tcop/pquery.c @@ -670,6 +670,8 @@ PortalSetResultFormat(Portal portal, int nFormats, int16 *formats) * isTopLevel: true if query is being executed at backend "top level" * (that is, directly from a client command message) * + * run_once: ignored, present only to avoid an API break in stable branches. + * * dest: where to send output of primary (canSetTag) query * * altdest: where to send output of non-primary queries @@ -714,10 +716,6 @@ PortalRun(Portal portal, long count, bool isTopLevel, bool run_once, */ MarkPortalActive(portal); - /* Set run_once flag. Shouldn't be clear if previously set. */ - Assert(!portal->run_once || run_once); - portal->run_once = run_once; - /* * Set up global portal context pointers. * @@ -922,7 +920,7 @@ PortalRunSelect(Portal portal, { PushActiveSnapshot(queryDesc->snapshot); ExecutorRun(queryDesc, direction, (uint64) count, - portal->run_once); + false); nprocessed = queryDesc->estate->es_processed; PopActiveSnapshot(); } @@ -962,7 +960,7 @@ PortalRunSelect(Portal portal, { PushActiveSnapshot(queryDesc->snapshot); ExecutorRun(queryDesc, direction, (uint64) count, - portal->run_once); + false); nprocessed = queryDesc->estate->es_processed; PopActiveSnapshot(); } @@ -1406,9 +1404,6 @@ PortalRunFetch(Portal portal, */ MarkPortalActive(portal); - /* If supporting FETCH, portal can't be run-once. */ - Assert(!portal->run_once); - /* * Set up global portal context pointers. */ diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h index 0a7274e26c..79e8b63111 100644 --- a/src/include/executor/execdesc.h +++ b/src/include/executor/execdesc.h @@ -48,7 +48,7 @@ typedef struct QueryDesc EState *estate; /* executor's query-wide state */ PlanState *planstate; /* tree of per-plan-node state */ - /* This field is set by ExecutorRun */ + /* This field is set by ExecutePlan */ bool already_executed; /* true if previously executed */ /* This is always set NULL by the core system, but plugins can change it */ diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h index 29f49829f2..6c63515074 100644 --- a/src/include/utils/portal.h +++ b/src/include/utils/portal.h @@ -145,7 +145,7 @@ typedef struct PortalData /* Features/options */ PortalStrategy strategy; /* see above */ int cursorOptions; /* DECLARE CURSOR option bits */ - bool run_once; /* portal will only be run once */ + bool run_once; /* unused */ /* Status data */ PortalStatus status; /* see above */
On Sun, 08 Dec 2024 14:25:53 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I wrote:
> > After looking at this further, I think this whole "run_once"
> > business is badly designed, worse documented, and redundant
> > (as well as buggy).  It can be replaced with three self-contained
> > lines in ExecutePlan, as per the attached.
>
> > (Obviously, the API changes in this wouldn't do for the back
> > branches.  But we could leave those arguments in place as
> > unused dummies.)
>
> Here's a cut-down version that I think would be OK to use in the
> back branches.

I confirmed that both versions of the patch fixed the issue, using pgproto.
Also, I feel your fix makes the code easier to understand, at least for me.

I have a few minor comments:

  * moving in the specified direction.
  *
  * Runs to completion if numberTuples is 0
 - *
 - * Note: the ctid attribute is a 'junk' attribute that is removed before the
 - * user can see it
  * ----------------------------------------------------------------
  */

This comment removal seems unrelated to the fix in ExecutePlan.  Should it
actually have been done by 8a5849b7ff24c?

 static void
-ExecutePlan(EState *estate,
-			PlanState *planstate,
+ExecutePlan(QueryDesc *queryDesc,
 			bool use_parallel_mode,
 			CmdType operation,
 			bool sendTuples,
 			uint64 numberTuples,
 			ScanDirection direction,
-			DestReceiver *dest,
-			bool execute_once)
+			DestReceiver *dest)
 {
+	EState	   *estate = queryDesc->estate;
+	PlanState  *planstate = queryDesc->planstate;
 	TupleTableSlot *slot;
 	uint64		current_tuple_count;

I guess planstate is removed because it is redundant with what is already in
queryDesc.  If so, I think we can also remove use_parallel_mode for the same
reason.

Regards,
Yugo Nagata

-- 
Yugo NAGATA <nagata@sraoss.co.jp>

Yugo NAGATA <nagata@sraoss.co.jp> writes:
> On Sun, 08 Dec 2024 14:25:53 -0500
> Tom Lane <tgl@sss.pgh.pa.us> wrote:

> - *
> - * Note: the ctid attribute is a 'junk' attribute that is removed before the
> - * user can see it
>   * ----------------------------------------------------------------
>   */

> This comment removal seems unrelated to the fix in ExecutePlan.  Should it
> actually have been done by 8a5849b7ff24c?

Yeah, I'm sure it should have been removed a long time ago.  Presumably
it's there because ctid was once an explicit part of ExecutePlan's API;
but that's surely not been so for a long time, and I don't see that this
sentence has any remaining relevance where it is.  You could say that
getting rid of that dead comment should be a separate commit, but I
thought that was a bit too pedantic.

>  static void
> -ExecutePlan(EState *estate,
> -			PlanState *planstate,
> +ExecutePlan(QueryDesc *queryDesc,
>  			bool use_parallel_mode,
>  			CmdType operation,
>  			bool sendTuples,
>  			uint64 numberTuples,
>  			ScanDirection direction,
> -			DestReceiver *dest,
> -			bool execute_once)
> +			DestReceiver *dest)
>  {
> +	EState	   *estate = queryDesc->estate;
> +	PlanState  *planstate = queryDesc->planstate;
>  	TupleTableSlot *slot;
>  	uint64		current_tuple_count;

> I guess planstate is removed because it is redundant with what is already in
> queryDesc.  If so, I think we can also remove use_parallel_mode for the same
> reason.

Certainly there's a bit of aesthetic judgment involved here.  In
principle, now that we're passing down the QueryDesc, we could remove
the operation and dest arguments too.  I chose not to do that on the
grounds that ExecutorRun is deciding what ExecutePlan should do, and at
least in principle it could decide to pass something else for these
arguments than what it passes today.  But it's hard to see how it would
pass a different EState or PlanState tree than what is in the QueryDesc,
so I thought it was okay to remove those arguments.  Somebody else
writing this patch might have done that differently, but I think that's
fine.

A different line of thought is that the reason we need this change is
that the already_executed flag is in the wrong place.  If we moved it
into EState, which feels like a more natural place, we could avoid
refactoring ExecutePlan's API.  But that seemed more invasive in the big
picture, and it'd involve more risk in the back branches, so I didn't go
that way.

Thanks for reviewing and testing the patch!

			regards, tom lane

I wrote:
> Yugo NAGATA <nagata@sraoss.co.jp> writes:
>> I guess planstate is removed because it is redundant with what is already in
>> queryDesc.  If so, I think we can also remove use_parallel_mode for the same
>> reason.

> Certainly there's a bit of aesthetic judgment involved here.

After thinking awhile longer, I'm coming around to the idea that you're
right and it'd be better to get rid of ExecutePlan's use_parallel_mode
argument.  We clearly have to pass down count and direction, and it's
sensible to pass down operation, sendTuples, and dest because ExecutorRun
is involved with those to perform dest-receiver startup and shutdown.
But ExecutorRun has no real concern with the parallelism decision, so it
seems better to unify that logic in one place.

I'll make that change before pushing.

			regards, tom lane

On Thu, Dec 5, 2024 at 8:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> After looking at this further, I think this whole "run_once"
> business is badly designed, worse documented, and redundant
> (as well as buggy).  It can be replaced with three self-contained
> lines in ExecutePlan, as per the attached.

Did you look at the commit message for
691b8d59281b5177f16fe80858df921f77a8e955?  IMHO, that's pretty clear
about what the goals of this were.

I didn't consider the scenario mentioned by Nagata-san in his original
post, namely, executing a portal with a row count of 0 and then doing
the same thing again.  So in that sense the assertion fulfilled its
intended purpose: alerting somebody that there is a case which is
reachable but which the original programmer believed to be unreachable.

I haven't tested, but I'm guessing that what happens in Nagata-san's
case with the committed fix is that we execute the plan once to
completion in parallel mode; and then the second Execute message runs
the same plan tree again in non-parallel mode, but this time it ends up
returning zero tuples because it's already been executed to completion.
If that is correct, then I think this is subtly changing the rules for
parallel plan trees: before, the rule as I understood it was "only
complete execution must be supported", but now it's "complete execution
must be supported and, in addition, a no-op re-execution after complete
execution must be supported".  Off the top of my head that seems like it
should be fine.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Dec 5, 2024 at 8:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> After looking at this further, I think this whole "run_once"
>> business is badly designed, worse documented, and redundant
>> (as well as buggy).  It can be replaced with three self-contained
>> lines in ExecutePlan, as per the attached.

> Did you look at the commit message for
> 691b8d59281b5177f16fe80858df921f77a8e955?  IMHO, that's pretty clear
> about what the goals of this were.

Well, it's not very clear about what implementation limitations we
are trying to protect.

> I haven't tested, but I'm guessing that what happens in Nagata-san's
> case with the committed fix is that we execute the plan once to
> completion in parallel mode; and then the second Execute message runs
> the same plan tree again in non-parallel mode, but this time it ends up
> returning zero tuples because it's already been executed to completion.
> If that is correct, then I think this is subtly changing the rules for
> parallel plan trees: before, the rule as I understood it was "only
> complete execution must be supported", but now it's "complete execution
> must be supported and, in addition, a no-op re-execution after complete
> execution must be supported".  Off the top of my head that seems like
> it should be fine.

Hmm.  As committed, the re-execution will happen with use_parallel_mode
disabled, which might make that safer, or maybe it's less safe?  Do you
think we need something closer to the upthread proposal to not run the
executor at all during the "re-execution"?  (I'd still be inclined to
put that short-circuit into ExecutePlan, not anything higher-level.)

My recollection is that even before parallelism we had some plan node
types that didn't cope well with being called again after they'd once
returned EOF (ie a null tuple).  So maybe a defense against trying to
do that would be wise.  It could probably be as simple as checking
estate->es_finished.

			regards, tom lane

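For illustration, a defense along those lines might look roughly like this
(a hypothetical sketch only, not part of any committed patch; es_finished is
the EState flag that ExecutorFinish sets):

	/* hypothetical guard, e.g. at the top of ExecutePlan() */
	if (estate->es_finished)
		elog(ERROR, "cannot re-execute a query that has already been finished");
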
On Tue, Dec 10, 2024 at 12:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Did you look at the commit message for
> > 691b8d59281b5177f16fe80858df921f77a8e955?  IMHO, that's pretty clear
> > about what the goals of this were.
>
> Well, it's not very clear about what implementation limitations we
> are trying to protect.

OK, I see that I was vague about that.  The core problem here is that
when we start parallel workers, we do a one-time state transfer from the
leader to the workers.  If the leader makes any changes to any of that
state, or the workers do, then they're out of sync, and very bad things
will happen.  To prevent that, we need a mechanism to keep any of that
state from changing while there are active workers, and that mechanism
is parallel mode.  The leader enters parallel mode (via
EnterParallelMode()) before launching any workers and must only exit
parallel mode after they're all gone; the workers run in parallel mode
the whole time.  Hence, we are (hopefully) guaranteed that the relevant
state can't change under us.

Hence, the most fundamental restriction here is that ExecutePlan() had
better not reach the call to ExitParallelMode() at the end of the
function while there are still workers active.  Since workers don't exit
until they've run their portion of the plan to completion (except in
case of error), we need to run the plan to completion in the leader too
before ExecutePlan() escapes from the loop; otherwise, some worker could
have filled up its output queue with tuples and then blocked waiting for
us to read them, which we never would.

In the case at issue, if the plan was already run to completion, then
there shouldn't be any workers active any more, and trying to re-execute
the completed tree should not launch any new ones, so in theory there's
no problem, but...

> Hmm.  As committed, the re-execution will happen with use_parallel_mode
> disabled, which might make that safer, or maybe it's less safe?  Do you
> think we need something closer to the upthread proposal to not run the
> executor at all during the "re-execution"?  (I'd still be inclined to
> put that short-circuit into ExecutePlan, not anything higher-level.)

...hypothetically, there could be code which doesn't enjoy being called
on the same plan tree first in parallel mode and then later in
non-parallel mode.  I can't really see any good reason for such code to
actually exist.  For example, ExecGather() bases its decision about
whether to launch workers on gather->num_workers > 0 &&
estate->es_use_parallel_mode, but it looks to me like all the decisions
about whether to read from workers or close down workers are based on
what workers actually got launched, without further reference to the
criteria that ExecGather() used to decide whether to launch them in the
first place.  That seems like the right way for things to be coded, and
I think that probably all the things are actually coded that way.  I
just wanted to point out that it wasn't theoretically impossible for
this change to run into some lower-level problem.

> My recollection is that even before parallelism we had some plan node
> types that didn't cope well with being called again after they'd once
> returned EOF (ie a null tuple).  So maybe a defense against trying to
> do that would be wise.  It could probably be as simple as checking
> estate->es_finished.

It's not a crazy idea, but it might be better to fix those node types.
It might just be me, but I find the comments in the executor to be
pretty lacking, and there's a good amount of boilerplate that seems to
have been copy-and-pasted around without the author totally
understanding what was happening.  That never gets better if we just
paper over the bugs.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

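For reference, the leader-side sequence described above (enter parallel mode,
transfer state, launch workers, drain them, and only then leave parallel mode)
follows roughly the pattern below.  This is a simplified sketch built from the
access/parallel.c entry points; real executor callers go through
execParallel.c, and the error handling and DSM key estimation/setup they
perform are omitted here.

	#include "postgres.h"
	#include "access/parallel.h"

	/* simplified sketch of the leader-side pattern, for illustration only */
	static void
	leader_side_sketch(int nworkers)
	{
		ParallelContext *pcxt;

		EnterParallelMode();	/* from here on, state copied to workers must stay fixed */

		pcxt = CreateParallelContext("postgres", "ParallelQueryMain", nworkers);
		InitializeParallelDSM(pcxt);	/* one-time state transfer into shared memory */
		LaunchParallelWorkers(pcxt);

		/* ... leader runs its own share of the plan, reading tuples from workers ... */

		WaitForParallelWorkersToFinish(pcxt);	/* all workers must be gone ... */
		DestroyParallelContext(pcxt);
		ExitParallelMode();						/* ... before we may leave parallel mode */
	}
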
I wrote:
> My recollection is that even before parallelism we had some plan node
> types that didn't cope well with being called again after they'd once
> returned EOF (ie a null tuple).  So maybe a defense against trying to
> do that would be wise.

I spent a little time trying to run that recollection to ground, and
after a while arrived at this comment in heapam.c:

 * Note: when we fall off the end of the scan in either direction, we
 * reset rs_inited.  This means that a further request with the same
 * scan direction will restart the scan, which is a bit odd, but a
 * request with the opposite scan direction will start a fresh scan
 * in the proper direction.  The latter is required behavior for cursors,
 * while the former case is generally undefined behavior in Postgres
 * so we don't care too much.

So indeed node types as basic as SeqScan will misbehave if that happens.

However, there is also logic in pquery.c that intends to provide
defenses against this, for instance in PortalRunSelect:

	 * Determine which direction to go in, and check to see if we're already
	 * at the end of the available tuples in that direction.  If so, set the
	 * direction to NoMovement to avoid trying to fetch any tuples.  (This
	 * check exists because not all plan node types are robust about being
	 * called again if they've already returned NULL once.)  Then call the
	 * executor (we must not skip this, because the destination needs to see a
	 * setup and shutdown even if no tuples are available).  Finally, update
	 * the portal position state depending on the number of tuples that were
	 * retrieved.
	 */
	if (forward)
	{
		if (portal->atEnd || count <= 0)
		{
			direction = NoMovementScanDirection;
			count = 0;			/* don't pass negative count to executor */
		}

And ExecutorRun skips ExecutePlan if direction is
NoMovementScanDirection.  So unless there are gaps in pquery.c's EOF
checks, I think we're OK: the "re-execution" call will not reach
ExecutePlan.  This could probably do with more commentary though.

			regards, tom lane

On Tue, Dec 10, 2024 at 1:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> And ExecutorRun skips ExecutePlan if direction is
> NoMovementScanDirection.  So unless there are gaps in pquery.c's EOF
> checks, I think we're OK: the "re-execution" call will not reach
> ExecutePlan.  This could probably do with more commentary though.

Thanks for the research, and +1 if you feel like adding more commentary.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Robert Haas <robertmhaas@gmail.com> writes:
> Thanks for the research, and +1 if you feel like adding more commentary.

I'm thinking about something like this:

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 13f3fcdaee..7ebb17fc2e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -348,7 +348,15 @@ standard_ExecutorRun(QueryDesc *queryDesc,
 		dest->rStartup(dest, operation, queryDesc->tupDesc);
 
 	/*
-	 * run plan
+	 * Run plan, unless direction is NoMovement.
+	 *
+	 * Note: pquery.c selects NoMovement if a prior call already reached
+	 * end-of-data in the user-specified fetch direction.  This is important
+	 * because various parts of the executor can misbehave if called again
+	 * after reporting EOF.  For example, heapam.c would actually restart a
+	 * heapscan and return all its data afresh.  There is also reason to be
+	 * concerned about whether parallel query mode would operate properly if a
+	 * fresh execution request arrives after completing the query.
 	 */
 	if (!ScanDirectionIsNoMovement(direction))
 		ExecutePlan(queryDesc,

It's slightly annoying that the executor is depending on the Portal
machinery to guarantee that it won't do something wrong.  However, that
was true in the previous formulation that relied on Portal.run_once,
too.  I don't see another way that wouldn't amount to duplicating the
Portal-level state in the executor, and I don't think it's worth doing
that.

			regards, tom lane

On Tue, Dec 10, 2024 at 2:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > Thanks for the research, and +1 if you feel like adding more commentary.
>
> I'm thinking about something like this:
>
> diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
> index 13f3fcdaee..7ebb17fc2e 100644
> --- a/src/backend/executor/execMain.c
> +++ b/src/backend/executor/execMain.c
> @@ -348,7 +348,15 @@ standard_ExecutorRun(QueryDesc *queryDesc,
>  		dest->rStartup(dest, operation, queryDesc->tupDesc);
>
>  	/*
> -	 * run plan
> +	 * Run plan, unless direction is NoMovement.
> +	 *
> +	 * Note: pquery.c selects NoMovement if a prior call already reached
> +	 * end-of-data in the user-specified fetch direction.  This is important
> +	 * because various parts of the executor can misbehave if called again
> +	 * after reporting EOF.  For example, heapam.c would actually restart a
> +	 * heapscan and return all its data afresh.  There is also reason to be
> +	 * concerned about whether parallel query mode would operate properly if a
> +	 * fresh execution request arrives after completing the query.
>  	 */
>  	if (!ScanDirectionIsNoMovement(direction))
>  		ExecutePlan(queryDesc,

That seems pretty good, although the last sentence seems like it might
be a little too alarming.  Maybe add "although we know of no specific
problem" or something like that.

> It's slightly annoying that the executor is depending on the Portal
> machinery to guarantee that it won't do something wrong.  However, that
> was true in the previous formulation that relied on Portal.run_once,
> too.  I don't see another way that wouldn't amount to duplicating the
> Portal-level state in the executor, and I don't think it's worth doing
> that.

The portal mechanism is a hot mess, IMHO, and needs some serious
redesign, or at least cleanup.  For example, the fact that
portal->cleanup is always either PortalCleanup or NULL is an
"interesting" design choice.  MarkPortalDone() and MarkPortalFailed()
look like they do the same amount of cleanup but, ha ha, no they don't,
because the one and only cleanup hook peeks behind the curtain to figure
out who is calling it.  This code looks like it was written by somebody
who didn't initially understand the requirements and then kept hitting
it with a sledgehammer until it stopped complaining, without ever
stopping to reconsider the basic design.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Dec 10, 2024 at 2:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm thinking about something like this:

> That seems pretty good, although the last sentence seems like it might
> be a little too alarming.  Maybe add "although we know of no specific
> problem" or something like that.

OK, I'll tone it down a bit.

> The portal mechanism is a hot mess, IMHO, and needs some serious
> redesign, or at least cleanup.  For example, the fact that
> portal->cleanup is always either PortalCleanup or NULL is an
> "interesting" design choice.

While I'm not here to defend that code particularly, that part doesn't
seem totally insane to me.  The intent clearly was to support more than
one form of cleanup.  Maybe after 25 years we can conclude that we'll
never need more than one form, but I'm not going to fault whoever wrote
it for not having an operational crystal ball.

> MarkPortalDone() and MarkPortalFailed() look like they do the same
> amount of cleanup but, ha ha, no they don't, because the one and only
> cleanup hook peeks behind the curtain to figure out who is calling it.

If you're talking about

	 * Shut down executor, if still running.  We skip this during error abort,
	 * since other mechanisms will take care of releasing executor resources,
	 * and we can't be sure that ExecutorEnd itself wouldn't fail.

it's hardly the fault of the Portal logic that ExecutorEnd is unsafe to
call during abort.  (ExecutorEnd shares that property with a boatload of
other code, too.)

Anyway, if you feel like rewriting that stuff, step right up.  My
feeling about it is that the law of conservation of cruft will prevent a
replacement from being all that much cleaner, but maybe I'm wrong.

			regards, tom lane

On Tue, Dec 10, 2024 at 4:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Anyway, if you feel like rewriting that stuff, step right up.  My
> feeling about it is that the law of conservation of cruft will prevent
> a replacement from being all that much cleaner, but maybe I'm wrong.

Thanks for the thoughts.  It's always good to get your perspective on
stuff like this, or at least I find it so.

-- 
Robert Haas
EDB: http://www.enterprisedb.com