Re: POC: converting Lists into arrays - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: POC: converting Lists into arrays |
Date | |
Msg-id | 21272.1563318411@sss.pgh.pa.us Whole thread Raw |
In response to | Re: POC: converting Lists into arrays (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: POC: converting Lists into arrays
Re: POC: converting Lists into arrays Re: POC: converting Lists into arrays |
List | pgsql-hackers |
I wrote: > * Look at places using lcons/list_delete_first to maintain FIFO lists. > The patch makes these O(N^2) for long lists. If we can reverse the list > order and use lappend/list_truncate instead, it'd be better. Possibly in > some places the list ordering is critical enough to make this impractical, > but I suspect it's an easy win in most. Attached are two patches that touch all the places where it seemed like an easy win to stop using lcons and/or list_delete_first. 0001 adds list_delete_last() as a mirror image to list_delete_first(), and changes all the places where it seemed 100% safe to do so (ie, there's no behavioral change because the list order is demonstrably immaterial). 0002 changes some additional places where it's maybe a bit less safe, ie there's a potential for user-visible behavioral change because processing will occur in a different order. In particular, the proposed change in execExpr.c causes aggregates and window functions that are in the same plan node to be executed in a different order than before --- but it seems to me that this order is saner. (Note the change in the expected regression results, in a test that's intentionally sensitive to the execution order.) And anyway when did we guarantee anything about that? I refrained from changing lcons to lappend in get_relation_info, because that demonstrably causes the planner to change its choices when two indexes look equally attractive, and probably people would complain about that. I think that the other changes proposed in 0002 are pretty harmless --- for example, in get_tables_to_cluster the order depends initially on the results of a seqscan of pg_index, so anybody who's expecting stability is in for rude surprises anyhow. Also, the proposed changes in plancat.c, parse_agg.c, selfuncs.c almost certainly have no user-visible effect, but maybe there could be changes at the roundoff-error level due to processing estimates in a different order? There are a bunch of places that are using list_delete_first to remove the next-to-process entry from a List used as a queue. In principle, we could invert the order of those queues and then use list_delete_last, but I thought this would probably be too confusing: it's natural to think of the front of the list as being the head of the queue. I doubt that any of those queues get long enough for it to be a serious performance problem to leave them as-is. (Actually, I doubt that any of these changes will really move the performance needle in the real world. It's more a case of wanting the code to present good examples not bad ones.) Thoughts? Anybody want to object to any of the changes in 0002? regards, tom lane diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c index dfb51f6..169bf6f 100644 --- a/src/backend/access/gist/gist.c +++ b/src/backend/access/gist/gist.c @@ -1323,8 +1323,6 @@ static void gistfinishsplit(GISTInsertState *state, GISTInsertStack *stack, GISTSTATE *giststate, List *splitinfo, bool unlockbuf) { - ListCell *lc; - List *reversed; GISTPageSplitInfo *right; GISTPageSplitInfo *left; IndexTuple tuples[2]; @@ -1339,14 +1337,6 @@ gistfinishsplit(GISTInsertState *state, GISTInsertStack *stack, * left. Finally insert the downlink for the last new page and update the * downlink for the original page as one operation. */ - - /* for convenience, create a copy of the list in reverse order */ - reversed = NIL; - foreach(lc, splitinfo) - { - reversed = lcons(lfirst(lc), reversed); - } - LockBuffer(stack->parent->buffer, GIST_EXCLUSIVE); gistFindCorrectParent(state->r, stack); @@ -1354,10 +1344,10 @@ gistfinishsplit(GISTInsertState *state, GISTInsertStack *stack, * insert downlinks for the siblings from right to left, until there are * only two siblings left. */ - while (list_length(reversed) > 2) + for (int pos = list_length(splitinfo) - 1; pos > 1; pos--) { - right = (GISTPageSplitInfo *) linitial(reversed); - left = (GISTPageSplitInfo *) lsecond(reversed); + right = (GISTPageSplitInfo *) list_nth(splitinfo, pos); + left = (GISTPageSplitInfo *) list_nth(splitinfo, pos - 1); if (gistinserttuples(state, stack->parent, giststate, &right->downlink, 1, @@ -1371,11 +1361,10 @@ gistfinishsplit(GISTInsertState *state, GISTInsertStack *stack, gistFindCorrectParent(state->r, stack); } /* gistinserttuples() released the lock on right->buf. */ - reversed = list_delete_first(reversed); } - right = (GISTPageSplitInfo *) linitial(reversed); - left = (GISTPageSplitInfo *) lsecond(reversed); + right = (GISTPageSplitInfo *) lsecond(splitinfo); + left = (GISTPageSplitInfo *) linitial(splitinfo); /* * Finally insert downlink for the remaining right page and update the diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index 6c3ff76..032fab9 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -633,7 +633,7 @@ CheckAttributeType(const char *attname, errmsg("composite type %s cannot be made a member of itself", format_type_be(atttypid)))); - containing_rowtypes = lcons_oid(atttypid, containing_rowtypes); + containing_rowtypes = lappend_oid(containing_rowtypes, atttypid); relation = relation_open(get_typ_typrelid(atttypid), AccessShareLock); @@ -653,7 +653,7 @@ CheckAttributeType(const char *attname, relation_close(relation, AccessShareLock); - containing_rowtypes = list_delete_first(containing_rowtypes); + containing_rowtypes = list_delete_last(containing_rowtypes); } else if (OidIsValid((att_typelem = get_element_type(atttypid)))) { diff --git a/src/backend/commands/lockcmds.c b/src/backend/commands/lockcmds.c index 417d595..bae3b38 100644 --- a/src/backend/commands/lockcmds.c +++ b/src/backend/commands/lockcmds.c @@ -281,11 +281,11 @@ LockViewRecurse(Oid reloid, LOCKMODE lockmode, bool nowait, List *ancestor_views context.nowait = nowait; context.viewowner = view->rd_rel->relowner; context.viewoid = reloid; - context.ancestor_views = lcons_oid(reloid, ancestor_views); + context.ancestor_views = lappend_oid(ancestor_views, reloid); LockViewRecurse_walker((Node *) viewquery, &context); - ancestor_views = list_delete_oid(ancestor_views, reloid); + (void) list_delete_last(context.ancestor_views); table_close(view, NoLock); } diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 0c0ddd5..fc1c4df 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -14480,6 +14480,11 @@ register_on_commit_action(Oid relid, OnCommitAction action) oc->creating_subid = GetCurrentSubTransactionId(); oc->deleting_subid = InvalidSubTransactionId; + /* + * We use lcons() here so that ON COMMIT actions are processed in reverse + * order of registration. That might not be essential but it seems + * reasonable. + */ on_commits = lcons(oc, on_commits); MemoryContextSwitchTo(oldcxt); diff --git a/src/backend/nodes/list.c b/src/backend/nodes/list.c index 5584fa8..9163464 100644 --- a/src/backend/nodes/list.c +++ b/src/backend/nodes/list.c @@ -827,6 +827,30 @@ list_delete_first(List *list) } /* + * Delete the last element of the list. + * + * This is the opposite of list_delete_first(), but is noticeably cheaper + * with a long list, since no data need be moved. + */ +List * +list_delete_last(List *list) +{ + check_list_invariants(list); + + if (list == NIL) + return NIL; /* would an error be better? */ + + /* list_truncate won't free list if it goes to empty, but this should */ + if (list_length(list) <= 1) + { + list_free(list); + return NIL; + } + + return list_truncate(list, list_length(list) - 1); +} + +/* * Generate the union of two lists. This is calculated by copying * list1 via list_copy(), then adding to it all the members of list2 * that aren't already in list1. diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index f0e789f..99dbf8d 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -881,11 +881,9 @@ is_parallel_safe(PlannerInfo *root, Node *node) foreach(l, proot->init_plans) { SubPlan *initsubplan = (SubPlan *) lfirst(l); - ListCell *l2; - foreach(l2, initsubplan->setParam) - context.safe_param_ids = lcons_int(lfirst_int(l2), - context.safe_param_ids); + context.safe_param_ids = list_concat(context.safe_param_ids, + initsubplan->setParam); } } @@ -1015,6 +1013,7 @@ max_parallel_hazard_walker(Node *node, max_parallel_hazard_context *context) context->safe_param_ids); if (max_parallel_hazard_walker(subplan->testexpr, context)) return true; /* no need to restore safe_param_ids */ + list_free(context->safe_param_ids); context->safe_param_ids = save_safe_param_ids; /* we must also check args, but no special Param treatment there */ if (max_parallel_hazard_walker((Node *) subplan->args, context)) @@ -4185,8 +4184,8 @@ add_function_defaults(List *args, HeapTuple func_tuple) ndelete = nargsprovided + list_length(defaults) - funcform->pronargs; if (ndelete < 0) elog(ERROR, "not enough default arguments"); - while (ndelete-- > 0) - defaults = list_delete_first(defaults); + if (ndelete > 0) + defaults = list_copy_tail(defaults, ndelete); /* And form the combined argument list, not modifying the input list */ return list_concat(list_copy(args), defaults); @@ -4701,9 +4700,9 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid, * Recursively try to simplify the modified expression. Here we must add * the current function to the context list of active functions. */ - context->active_fns = lcons_oid(funcid, context->active_fns); + context->active_fns = lappend_oid(context->active_fns, funcid); newexpr = eval_const_expressions_mutator(newexpr, context); - context->active_fns = list_delete_first(context->active_fns); + context->active_fns = list_delete_last(context->active_fns); error_context_stack = sqlerrcontext.previous; diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c index 5b047d1..93b6784 100644 --- a/src/backend/rewrite/rewriteHandler.c +++ b/src/backend/rewrite/rewriteHandler.c @@ -1973,7 +1973,7 @@ fireRIRrules(Query *parsetree, List *activeRIRs) (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), errmsg("infinite recursion detected in rules for relation \"%s\"", RelationGetRelationName(rel)))); - activeRIRs = lcons_oid(RelationGetRelid(rel), activeRIRs); + activeRIRs = lappend_oid(activeRIRs, RelationGetRelid(rel)); foreach(l, locks) { @@ -1986,7 +1986,7 @@ fireRIRrules(Query *parsetree, List *activeRIRs) activeRIRs); } - activeRIRs = list_delete_first(activeRIRs); + activeRIRs = list_delete_last(activeRIRs); } } @@ -2059,7 +2059,7 @@ fireRIRrules(Query *parsetree, List *activeRIRs) errmsg("infinite recursion detected in policy for relation \"%s\"", RelationGetRelationName(rel)))); - activeRIRs = lcons_oid(RelationGetRelid(rel), activeRIRs); + activeRIRs = lappend_oid(activeRIRs, RelationGetRelid(rel)); /* * get_row_security_policies just passed back securityQuals @@ -2084,7 +2084,7 @@ fireRIRrules(Query *parsetree, List *activeRIRs) expression_tree_walker((Node *) withCheckOptions, fireRIRonSubLink, (void *) activeRIRs); - activeRIRs = list_delete_first(activeRIRs); + activeRIRs = list_delete_last(activeRIRs); } /* @@ -3711,7 +3711,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events) rev = (rewrite_event *) palloc(sizeof(rewrite_event)); rev->relation = RelationGetRelid(rt_entry_relation); rev->event = event; - rewrite_events = lcons(rev, rewrite_events); + rewrite_events = lappend(rewrite_events, rev); foreach(n, product_queries) { @@ -3722,7 +3722,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events) rewritten = list_concat(rewritten, newstuff); } - rewrite_events = list_delete_first(rewrite_events); + rewrite_events = list_delete_last(rewrite_events); } /* diff --git a/src/include/nodes/pg_list.h b/src/include/nodes/pg_list.h index 71dc4dc..1463408 100644 --- a/src/include/nodes/pg_list.h +++ b/src/include/nodes/pg_list.h @@ -531,6 +531,7 @@ extern List *list_delete_ptr(List *list, void *datum); extern List *list_delete_int(List *list, int datum); extern List *list_delete_oid(List *list, Oid datum); extern List *list_delete_first(List *list); +extern List *list_delete_last(List *list); extern List *list_delete_nth_cell(List *list, int n); extern List *list_delete_cell(List *list, ListCell *cell); diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c index ebaec4f..cedb4ee 100644 --- a/src/backend/commands/cluster.c +++ b/src/backend/commands/cluster.c @@ -1566,7 +1566,7 @@ get_tables_to_cluster(MemoryContext cluster_context) rvtc = (RelToCluster *) palloc(sizeof(RelToCluster)); rvtc->tableOid = index->indrelid; rvtc->indexOid = index->indexrelid; - rvs = lcons(rvtc, rvs); + rvs = lappend(rvs, rvtc); MemoryContextSwitchTo(old_context); } diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c index e9c8873..89887b8 100644 --- a/src/backend/commands/typecmds.c +++ b/src/backend/commands/typecmds.c @@ -2999,7 +2999,7 @@ get_rels_with_domain(Oid domainOid, LOCKMODE lockmode) rtc->rel = rel; rtc->natts = 0; rtc->atts = (int *) palloc(sizeof(int) * RelationGetNumberOfAttributes(rel)); - result = lcons(rtc, result); + result = lappend(result, rtc); } /* diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c index e4e0575..6d09f2a 100644 --- a/src/backend/executor/execExpr.c +++ b/src/backend/executor/execExpr.c @@ -786,7 +786,7 @@ ExecInitExprRec(Expr *node, ExprState *state, { AggState *aggstate = (AggState *) state->parent; - aggstate->aggs = lcons(astate, aggstate->aggs); + aggstate->aggs = lappend(aggstate->aggs, astate); aggstate->numaggs++; } else @@ -834,7 +834,7 @@ ExecInitExprRec(Expr *node, ExprState *state, WindowAggState *winstate = (WindowAggState *) state->parent; int nfuncs; - winstate->funcs = lcons(wfstate, winstate->funcs); + winstate->funcs = lappend(winstate->funcs, wfstate); nfuncs = ++winstate->numfuncs; if (wfunc->winagg) winstate->numaggs++; diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index 6ea625a..98e9948 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -419,6 +419,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent, index_close(indexRelation, NoLock); + /* + * We've historically used lcons() here. It'd make more sense to + * use lappend(), but that causes the planner to change behavior + * in cases where two indexes seem equally attractive. For now, + * stick with lcons() --- few tables should have so many indexes + * that the O(N^2) behavior of lcons() is really a problem. + */ indexinfos = lcons(info, indexinfos); } @@ -1339,7 +1346,7 @@ get_relation_statistics(RelOptInfo *rel, Relation relation) info->kind = STATS_EXT_NDISTINCT; info->keys = bms_copy(keys); - stainfos = lcons(info, stainfos); + stainfos = lappend(stainfos, info); } if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES)) @@ -1351,7 +1358,7 @@ get_relation_statistics(RelOptInfo *rel, Relation relation) info->kind = STATS_EXT_DEPENDENCIES; info->keys = bms_copy(keys); - stainfos = lcons(info, stainfos); + stainfos = lappend(stainfos, info); } if (statext_is_kind_built(dtup, STATS_EXT_MCV)) @@ -1363,7 +1370,7 @@ get_relation_statistics(RelOptInfo *rel, Relation relation) info->kind = STATS_EXT_MCV; info->keys = bms_copy(keys); - stainfos = lcons(info, stainfos); + stainfos = lappend(stainfos, info); } ReleaseSysCache(htup); diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c index 19e3164..354030e 100644 --- a/src/backend/parser/parse_agg.c +++ b/src/backend/parser/parse_agg.c @@ -1132,7 +1132,7 @@ parseCheckAggregates(ParseState *pstate, Query *qry) if (expr == NULL) continue; /* probably cannot happen */ - groupClauses = lcons(expr, groupClauses); + groupClauses = lappend(groupClauses, expr); } /* diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index 66449b8..7eba59e 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -3201,7 +3201,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows, * Split the list of varinfos in two - one for the current rel, one * for remaining Vars on other rels. */ - relvarinfos = lcons(varinfo1, relvarinfos); + relvarinfos = lappend(relvarinfos, varinfo1); for_each_cell(l, varinfos, list_second_cell(varinfos)) { GroupVarInfo *varinfo2 = (GroupVarInfo *) lfirst(l); @@ -3209,12 +3209,12 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows, if (varinfo2->rel == varinfo1->rel) { /* varinfos on current rel */ - relvarinfos = lcons(varinfo2, relvarinfos); + relvarinfos = lappend(relvarinfos, varinfo2); } else { /* not time to process varinfo2 yet */ - newvarinfos = lcons(varinfo2, newvarinfos); + newvarinfos = lappend(newvarinfos, varinfo2); } } diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out index ef8eec3..be4ddf8 100644 --- a/src/test/regress/expected/aggregates.out +++ b/src/test/regress/expected/aggregates.out @@ -2030,10 +2030,10 @@ NOTICE: avg_transfn called with 3 -- this should not share the state due to different input columns. select my_avg(one),my_sum(two) from (values(1,2),(3,4)) t(one,two); -NOTICE: avg_transfn called with 2 NOTICE: avg_transfn called with 1 -NOTICE: avg_transfn called with 4 +NOTICE: avg_transfn called with 2 NOTICE: avg_transfn called with 3 +NOTICE: avg_transfn called with 4 my_avg | my_sum --------+-------- 2 | 6
pgsql-hackers by date: