Thread: Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
From
David Rowley
Date:
Over in [1] I complained to Tom that I thought he should use list_nth() instead of linitial() and lsecond(). My reasoning was that we already knew that the list contained 2 elements, and linitial() and lsecond() do some additional checks and return NULL if there are fewer than that amount of elements in the list, or so I thought. As it turns out, due to the lfirst() dereferencing the returned pointer, we'd just segfault if the list being too short and we got NULL. Since there appears to be no actual safety reason to use those macros in case your list is too small, it seems better just to use lfirst(list_nth_cell(..., ...)). This saves doing the checks to see if the list is NIL or too short. It seems nice to get rid of that additional branching. We can't just return list_nth() as some code still do things like: linitial(list) = something; and we obviously can't assign "something" to the return value of a function call. I've attached a patch which improves the macros. I see on gcc9.3 this reduces the size of the postgres binary by about 16KB: 9027296 to 9011320. I'm a bit unsure about llast()'s new double evaluation of the list. Perhaps I can add a new inline function named list_last_cell() to get around that... Or maybe it doesn't matter. I'm not quite sure what's best there. David [1] https://www.postgresql.org/message-id/CAApHDvqeh8JEqMjpCFTgHD_zu2S03nOVh2srejd+sNLza8M+mg@mail.gmail.com
Attachment
Re: Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes: > I'm a bit unsure about llast()'s new double evaluation of the list. > Perhaps I can add a new inline function named list_last_cell() to get > around that... Or maybe it doesn't matter. I'm not quite sure what's > best there. Double evaluation bad, especially in a macro that has not had such a hazard for the last twenty-plus years. It might not be worth mucking with llast, as it's not used very heavily I believe. But if it is, then let's add another inline function. regards, tom lane
Re: Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
From
David Rowley
Date:
On Mon, 28 Sep 2020 at 12:58, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <dgrowleyml@gmail.com> writes: > > I'm a bit unsure about llast()'s new double evaluation of the list. > > Perhaps I can add a new inline function named list_last_cell() to get > > around that... Or maybe it doesn't matter. I'm not quite sure what's > > best there. > > Double evaluation bad, especially in a macro that has not had such a > hazard for the last twenty-plus years. > > It might not be worth mucking with llast, as it's not used very heavily > I believe. But if it is, then let's add another inline function. Thanks for having a look at this. I changed things around to make llast() and the int and oid variant use a new inline function to get the last cell. I also pushed the resulting code to master. David
Re: Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes: > I changed things around to make llast() and the int and oid variant > use a new inline function to get the last cell. > I also pushed the resulting code to master. LGTM. regards, tom lane
Re: Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
From
Tom Lane
Date:
Poking around to count remaining uses of those inline functions, I found a few places that should be using the macros instead, and fixed them. After that, I notice that list_tail(), list_third_cell(), and list_fourth_cell() are entirely unreferenced. I'm hesitant to get rid of list_tail(), because it seems like it could well be used by extensions. But I'd bet quite a bit that list_third_cell() and list_fourth_cell() are not used anywhere anymore. Should we get rid of them to shave a few microseconds from compile times? regards, tom lane
Re: Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
From
David Rowley
Date:
Thanks for 9d299a492. On Mon, 28 Sep 2020 at 15:35, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Poking around to count remaining uses of those inline functions, > I found a few places that should be using the macros instead, > and fixed them. After that, I notice that list_tail(), > list_third_cell(), and list_fourth_cell() are entirely unreferenced. > I'm hesitant to get rid of list_tail(), because it seems like it > could well be used by extensions. But I'd bet quite a bit that > list_third_cell() and list_fourth_cell() are not used anywhere > anymore. Should we get rid of them to shave a few microseconds > from compile times? I wouldn't object to the removal of list_third_cell() and list_fourth_cell(). I agree to your reasoning with last_tail(). It does seem more likely that someone would use it. Although, if you'd proposed to remove it too, I wouldn't have objected. It's not like it's hard to reimplement within an extension for any extensions that use it. Though, perhaps it would maybe be a shame if that was the sole thing we broke for them when they try compiling their extension in a year's time on the newly release PG14. David
Re: Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes: > On Mon, 28 Sep 2020 at 15:35, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Poking around to count remaining uses of those inline functions, >> I found a few places that should be using the macros instead, >> and fixed them. After that, I notice that list_tail(), >> list_third_cell(), and list_fourth_cell() are entirely unreferenced. >> I'm hesitant to get rid of list_tail(), because it seems like it >> could well be used by extensions. But I'd bet quite a bit that >> list_third_cell() and list_fourth_cell() are not used anywhere >> anymore. Should we get rid of them to shave a few microseconds >> from compile times? > I wouldn't object to the removal of list_third_cell() and list_fourth_cell(). > I agree to your reasoning with last_tail(). It does seem more likely > that someone would use it. Although, if you'd proposed to remove it > too, I wouldn't have objected. It's not like it's hard to reimplement > within an extension for any extensions that use it. Though, perhaps it > would maybe be a shame if that was the sole thing we broke for them > when they try compiling their extension in a year's time on the newly > release PG14. Looking a bit harder, I notice that list_third_cell() and list_fourth_cell() were in fact introduced in v13, purely to support lthird() and lfourth(). So it hardly seems likely that any extensions would have grown direct dependencies on them already. list_tail() has been there a long time though. Just to be sure, I checked codesearch.debian.net, and it failed to find any outside uses either. So I'll go ahead and remove those two. list_second_cell() does have uses, although I observe that they are almost exclusively in locutions such as for_each_cell(lc, rollups, list_second_cell(rollups)) to iterate over all-but-the-first elements of a list. I wonder if we ought to come up with a better notation for that. I'm imagining something like for_each_from(lc, rollups, 1) to start from list index 1. It looks like this would be marginally more efficient, and perhaps more readable. regards, tom lane
Re: Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
From
Tom Lane
Date:
I wrote: > list_second_cell() does have uses, although I observe that they > are almost exclusively in locutions such as > for_each_cell(lc, rollups, list_second_cell(rollups)) > to iterate over all-but-the-first elements of a list. I wonder if > we ought to come up with a better notation for that. I'm imagining > something like > for_each_from(lc, rollups, 1) > to start from list index 1. It looks like this would be marginally > more efficient, and perhaps more readable. Concretely, I'm thinking about the attached. This does seem simpler. It reduces the size of the executable by about 560 bytes on my machine, or 46 bytes per usage, which isn't bad. (Note: this is in an assert-enabled build, might be different otherwise.) I didn't try to measure performance changes, but it should be for the better. Looking at the remaining instances of for_each_cell, I see several where it seems like it'd be simpler and clearer to use for_each_from. But for the moment I confined myself to changing just the instances following the pattern above. I noticed while messing with this that I'd neglected to const-ify the support functions for for_each_cell() and for_both_cell(), so this fixes that too. I'm somewhat inclined to back-patch this into v13. The missing const decoration seems arguably a bug, which we've missed noticing only because of our generally lamentable under-usage of const. And I think it'll be helpful for future back-patching if for_each_from is available in all versions with the new List API. regards, tom lane diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 16285ad09f..e0ac4e05e5 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -5732,7 +5732,7 @@ ATCheckPartitionsNotInUse(Relation rel, LOCKMODE lockmode) inh = find_all_inheritors(RelationGetRelid(rel), lockmode, NULL); /* first element is the parent rel; must ignore it */ - for_each_cell(cell, inh, list_second_cell(inh)) + for_each_from(cell, inh, 1) { Relation childrel; diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 9ce8f43385..1dc873ed25 100644 --- a/src/backend/nodes/nodeFuncs.c +++ b/src/backend/nodes/nodeFuncs.c @@ -441,7 +441,7 @@ exprTypmod(const Node *expr) typmod = exprTypmod((Node *) linitial(cexpr->args)); if (typmod < 0) return -1; /* no point in trying harder */ - for_each_cell(arg, cexpr->args, list_second_cell(cexpr->args)) + for_each_from(arg, cexpr->args, 1) { Node *e = (Node *) lfirst(arg); @@ -469,7 +469,7 @@ exprTypmod(const Node *expr) typmod = exprTypmod((Node *) linitial(mexpr->args)); if (typmod < 0) return -1; /* no point in trying harder */ - for_each_cell(arg, mexpr->args, list_second_cell(mexpr->args)) + for_each_from(arg, mexpr->args, 1) { Node *e = (Node *) lfirst(arg); diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index 99278eed93..3d7a4e373f 100644 --- a/src/backend/optimizer/plan/createplan.c +++ b/src/backend/optimizer/plan/createplan.c @@ -2261,7 +2261,7 @@ create_groupingsets_plan(PlannerInfo *root, GroupingSetsPath *best_path) { bool is_first_sort = ((RollupData *) linitial(rollups))->is_hashed; - for_each_cell(lc, rollups, list_second_cell(rollups)) + for_each_from(lc, rollups, 1) { RollupData *rollup = lfirst(lc); AttrNumber *new_grpColIdx; diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 3e2b4965c4..f331f82a6c 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -4430,7 +4430,7 @@ consider_groupingsets_paths(PlannerInfo *root, * below, must use the same condition. */ i = 0; - for_each_cell(lc, gd->rollups, list_second_cell(gd->rollups)) + for_each_from(lc, gd->rollups, 1) { RollupData *rollup = lfirst_node(RollupData, lc); @@ -4464,7 +4464,7 @@ consider_groupingsets_paths(PlannerInfo *root, rollups = list_make1(linitial(gd->rollups)); i = 0; - for_each_cell(lc, gd->rollups, list_second_cell(gd->rollups)) + for_each_from(lc, gd->rollups, 1) { RollupData *rollup = lfirst_node(RollupData, lc); diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c index f813b587f1..783f3fe8f2 100644 --- a/src/backend/parser/parse_agg.c +++ b/src/backend/parser/parse_agg.c @@ -1083,7 +1083,7 @@ parseCheckAggregates(ParseState *pstate, Query *qry) if (gset_common) { - for_each_cell(l, gsets, list_second_cell(gsets)) + for_each_from(l, gsets, 1) { gset_common = list_intersection_int(gset_common, lfirst(l)); if (!gset_common) @@ -1774,7 +1774,7 @@ expand_grouping_sets(List *groupingSets, int limit) result = lappend(result, list_union_int(NIL, (List *) lfirst(lc))); } - for_each_cell(lc, expanded_groups, list_second_cell(expanded_groups)) + for_each_from(lc, expanded_groups, 1) { List *p = lfirst(lc); List *new_result = NIL; diff --git a/src/backend/utils/adt/jsonpath_gram.y b/src/backend/utils/adt/jsonpath_gram.y index 88ef9550e9..53f422260c 100644 --- a/src/backend/utils/adt/jsonpath_gram.y +++ b/src/backend/utils/adt/jsonpath_gram.y @@ -441,7 +441,7 @@ makeItemList(List *list) while (end->next) end = end->next; - for_each_cell(cell, list, list_second_cell(list)) + for_each_from(cell, list, 1) { JsonPathParseItem *c = (JsonPathParseItem *) lfirst(cell); diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index 03cf241996..62023c20b2 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -8113,7 +8113,7 @@ get_rule_expr(Node *node, deparse_context *context, { BoolExpr *expr = (BoolExpr *) node; Node *first_arg = linitial(expr->args); - ListCell *arg = list_second_cell(expr->args); + ListCell *arg; switch (expr->boolop) { @@ -8122,12 +8122,11 @@ get_rule_expr(Node *node, deparse_context *context, appendStringInfoChar(buf, '('); get_rule_expr_paren(first_arg, context, false, node); - while (arg) + for_each_from(arg, expr->args, 1) { appendStringInfoString(buf, " AND "); get_rule_expr_paren((Node *) lfirst(arg), context, false, node); - arg = lnext(expr->args, arg); } if (!PRETTY_PAREN(context)) appendStringInfoChar(buf, ')'); @@ -8138,12 +8137,11 @@ get_rule_expr(Node *node, deparse_context *context, appendStringInfoChar(buf, '('); get_rule_expr_paren(first_arg, context, false, node); - while (arg) + for_each_from(arg, expr->args, 1) { appendStringInfoString(buf, " OR "); get_rule_expr_paren((Node *) lfirst(arg), context, false, node); - arg = lnext(expr->args, arg); } if (!PRETTY_PAREN(context)) appendStringInfoChar(buf, ')'); diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index 00c7afc66f..bec357fcef 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -3519,7 +3519,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows, * for remaining Vars on other rels. */ relvarinfos = lappend(relvarinfos, varinfo1); - for_each_cell(l, varinfos, list_second_cell(varinfos)) + for_each_from(l, varinfos, 1) { GroupVarInfo *varinfo2 = (GroupVarInfo *) lfirst(l); diff --git a/src/include/nodes/pg_list.h b/src/include/nodes/pg_list.h index 104df4174a..eefe705c39 100644 --- a/src/include/nodes/pg_list.h +++ b/src/include/nodes/pg_list.h @@ -389,6 +389,32 @@ lnext(const List *l, const ListCell *c) */ #define foreach_current_index(cell) (cell##__state.i) +/* + * for_each_from - + * Like foreach(), but start from the N'th (zero-based) list element, + * not necessarily the first one. + * + * It's okay for N to exceed the list length, but not for it to be negative. + * + * The caveats for foreach() apply equally here. + */ +#define for_each_from(cell, lst, N) \ + for (ForEachState cell##__state = for_each_from_setup(lst, N); \ + (cell##__state.l != NIL && \ + cell##__state.i < cell##__state.l->length) ? \ + (cell = &cell##__state.l->elements[cell##__state.i], true) : \ + (cell = NULL, false); \ + cell##__state.i++) + +static inline ForEachState +for_each_from_setup(const List *lst, int N) +{ + ForEachState r = {lst, N}; + + Assert(N >= 0); + return r; +} + /* * for_each_cell - * a convenience macro which loops through a list starting from a @@ -405,7 +431,7 @@ lnext(const List *l, const ListCell *c) cell##__state.i++) static inline ForEachState -for_each_cell_setup(List *lst, ListCell *initcell) +for_each_cell_setup(const List *lst, const ListCell *initcell) { ForEachState r = {lst, initcell ? list_cell_number(lst, initcell) : list_length(lst)}; @@ -456,8 +482,8 @@ for_each_cell_setup(List *lst, ListCell *initcell) cell1##__state.i1++, cell1##__state.i2++) static inline ForBothCellState -for_both_cell_setup(List *list1, ListCell *initcell1, - List *list2, ListCell *initcell2) +for_both_cell_setup(const List *list1, const ListCell *initcell1, + const List *list2, const ListCell *initcell2) { ForBothCellState r = {list1, list2, initcell1 ? list_cell_number(list1, initcell1) : list_length(list1),
Re: Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
From
David Rowley
Date:
On Tue, 29 Sep 2020 at 11:37, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I wrote: > > list_second_cell() does have uses, although I observe that they > > are almost exclusively in locutions such as > > for_each_cell(lc, rollups, list_second_cell(rollups)) > > to iterate over all-but-the-first elements of a list. I wonder if > > we ought to come up with a better notation for that. I'm imagining > > something like > > for_each_from(lc, rollups, 1) > > to start from list index 1. It looks like this would be marginally > > more efficient, and perhaps more readable. > > Concretely, I'm thinking about the attached. I had a look over this and I like it. It seems good as it allows consumers to choose N programmatically rather than be fixed into using list_second_cell() or list_fortysecond_cell(). > I'm somewhat inclined to back-patch this into v13. The missing > const decoration seems arguably a bug, which we've missed noticing > only because of our generally lamentable under-usage of const. > And I think it'll be helpful for future back-patching if > for_each_from is available in all versions with the new List API. It does seem fairly low risk and having personally experienced backpatching pain, I understand your motivation to backpatch. I certainly wouldn't object to backpacking but will defer to your better judgement on whether you choose to or not. David
Re: Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes: > On Tue, 29 Sep 2020 at 11:37, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Concretely, I'm thinking about the attached. > I had a look over this and I like it. It seems good as it allows > consumers to choose N programmatically rather than be fixed into using > list_second_cell() or list_fortysecond_cell(). Thanks for looking! Right now, if you want to start from a non-first point in the list, you have to use for_each_cell (or else write out a loop manually, but let's not). I'm not proposing to remove that alternative, but there are surely cases where it's simpler or clearer to use a list index instead of a ListCell pointer -- especially so if the list index is a constant. So I think this is a pretty clear win that simply failed to occur to me earlier. >> I'm somewhat inclined to back-patch this into v13. The missing >> const decoration seems arguably a bug, which we've missed noticing >> only because of our generally lamentable under-usage of const. >> And I think it'll be helpful for future back-patching if >> for_each_from is available in all versions with the new List API. > It does seem fairly low risk and having personally experienced > backpatching pain, I understand your motivation to backpatch. I > certainly wouldn't object to backpacking but will defer to your better > judgement on whether you choose to or not. A key point here is that everyplace I'm proposing to touch was already changed in v13 (a fortiori, because list_second_cell wasn't there in v12). So we can either have two different coding patterns in these areas, or three. Two's better from a backpatching standpoint. The fact that v13 is barely out the door also factors into this ... a year from now, my judgment would probably be different. regards, tom lane
Re: Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
From
David Rowley
Date:
On Tue, 29 Sep 2020 at 12:42, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <dgrowleyml@gmail.com> writes: > > It does seem fairly low risk and having personally experienced > > backpatching pain, I understand your motivation to backpatch. I > > certainly wouldn't object to backpacking but will defer to your better > > judgement on whether you choose to or not. > > A key point here is that everyplace I'm proposing to touch was already > changed in v13 (a fortiori, because list_second_cell wasn't there in v12). > So we can either have two different coding patterns in these areas, or > three. Two's better from a backpatching standpoint. The fact that > v13 is barely out the door also factors into this ... a year from now, > my judgment would probably be different. Yeah, I understand that part. The pain I was talking about was writing a patch for master, trying to apply it to the previous version only to find it does not apply. After fixing it up for that version trying to apply that patch to the version before that and getting more conflicts. Repeating that down to our earliest supported version is something I'd really love to never have to do again. It's an exhausting process. It's also risky having to custom write a version of the patch for each release. So I understand and agree with your reasoning to backpatch as it could reduce the number of versions of a patch that must be written to fix a bug. That could reduce the chances of someone messing up a backpatch at some later date so might be a safe option. aka. I'm not going to object to you backpatching this. :) David
Re: Small improvements to pg_list.h's linitial(), lsecond(), lthird() etc macros
From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes: > Yeah, I understand that part. The pain I was talking about was > writing a patch for master, trying to apply it to the previous version > only to find it does not apply. After fixing it up for that version > trying to apply that patch to the version before that and getting more > conflicts. Repeating that down to our earliest supported version is > something I'd really love to never have to do again. Tell me about it :-(. I've spent countless hours on that sort of activity. So I really dislike changing an API and then changing it again in the next release. regards, tom lane