Re: List of Bitmapset (was Re: ExecRTCheckPerms() and many prunable partitions) - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: List of Bitmapset (was Re: ExecRTCheckPerms() and many prunable partitions) |
Date | |
Msg-id | 1175042.1668536607@sss.pgh.pa.us Whole thread Raw |
In response to | Re: List of Bitmapset (was Re: ExecRTCheckPerms() and many prunable partitions) (Alvaro Herrera <alvherre@alvh.no-ip.org>) |
List | pgsql-hackers |
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > On 2022-Nov-14, Tom Lane wrote: >> For discussion's sake, here's my current version of that 0004 patch, >> rewritten to use list-of-bitmapset as the data structure. > I feel that there should be more commentary that explains what a > multi-bms is. Not sure where, maybe just put it near the function that > first appears in the file. Right. I split the new functions out to new files multibitmapset.h/.c, so that the file headers can carry the overall explanation. (I duplicated the text between .h and .c, which is also true of bitmapset.h/.c, but maybe that's overkill.) >> * The multi_bms prefix is a bit wordy, so I was thinking of shortening >> the function names to mbms_xxx. Maybe that's too brief. > I don't think the "ulti_" bytes add a lot, and short names are better. Yeah, after sleeping on it I like mbms. >> * This is a pretty short list of functions so far. I'm not eager >> to build out a bunch of dead code though. Is it OK to leave it >> with just this much functionality until someone needs more? > I agree with not adding dead code. I concluded that the only thing that makes this an odd set of functions to start out with is the lack of mbms_is_member; it seems asymmetric to have mbms_add_member but not mbms_is_member. So I added that. I'm content to let the rest grow out as needed. >> * I'm a little hesitant about whether the API actually should be >> List-of-Bitmapset, or some dedicated struct as I had in the previous >> version of 0004. This version is way less invasive in prepjointree.c >> than that was, but the reason is there's ambiguity about what the >> forced_null_vars Lists actually contain, which feels error-prone. > Hmm ... if somebody makes a mistake, does the functionality break in > obvious ways, or is it very hard to pinpoint what happened? Now that Bitmapset is a full-fledged Node type, we can make use of castNode checks to verify that the input Lists contain what we expect. That seems probably sufficient to catch coding errors. >> +multi_bms_add_member(List *mbms, int index1, int index2) > Maybe s/index1/listidx/ or bitmapidx and s/index2/bitnr/ ? Right. I used listidx and bitidx. regards, tom lane diff --git a/src/backend/nodes/Makefile b/src/backend/nodes/Makefile index d7b4261b47..4368c30fdb 100644 --- a/src/backend/nodes/Makefile +++ b/src/backend/nodes/Makefile @@ -21,6 +21,7 @@ OBJS = \ extensible.o \ list.o \ makefuncs.o \ + multibitmapset.o \ nodeFuncs.o \ nodes.o \ outfuncs.o \ diff --git a/src/backend/nodes/README b/src/backend/nodes/README index 64937fe254..489a67eb89 100644 --- a/src/backend/nodes/README +++ b/src/backend/nodes/README @@ -58,6 +58,7 @@ FILES IN THIS DIRECTORY (src/backend/nodes/) Specialized manipulation functions: bitmapset.c - Bitmapset support list.c - generic list support + multibitmapset.c - List-of-Bitmapset support params.c - Param support tidbitmap.c - TIDBitmap support value.c - support for value nodes diff --git a/src/backend/nodes/meson.build b/src/backend/nodes/meson.build index 8e0d4039f2..c4f3897ef2 100644 --- a/src/backend/nodes/meson.build +++ b/src/backend/nodes/meson.build @@ -3,6 +3,7 @@ backend_sources += files( 'extensible.c', 'list.c', 'makefuncs.c', + 'multibitmapset.c', 'nodeFuncs.c', 'nodes.c', 'params.c', diff --git a/src/backend/nodes/multibitmapset.c b/src/backend/nodes/multibitmapset.c new file mode 100644 index 0000000000..9a2ac8c81b --- /dev/null +++ b/src/backend/nodes/multibitmapset.c @@ -0,0 +1,158 @@ +/*------------------------------------------------------------------------- + * + * multibitmapset.c + * Lists of Bitmapsets + * + * A multibitmapset is useful in situations where members of a set can + * be identified by two small integers; for example, varno and varattno + * of a group of Vars within a query. The implementation is a List of + * Bitmapsets, so that the empty set can be represented by NIL. (But, + * as with Bitmapsets, that's not the only allowed representation.) + * The zero-based index of a List element is the first identifying value, + * and the (also zero-based) index of a bit within that Bitmapset is + * the second identifying value. There is no expectation that the + * Bitmapsets should all be the same size. + * + * The available operations on multibitmapsets are intended to parallel + * those on bitmapsets, for example union and intersection. So far only + * a small fraction of that has been built out; we'll add more as needed. + * + * + * Copyright (c) 2022, PostgreSQL Global Development Group + * + * IDENTIFICATION + * src/backend/nodes/multibitmapset.c + * + *------------------------------------------------------------------------- + */ +#include "postgres.h" + +#include "nodes/multibitmapset.h" + + +/* + * mbms_add_member + * Add a new member to a multibitmapset. + * + * The new member is identified by "listidx", the zero-based index of the + * List element it should go into, and "bitidx", which specifies the bit + * number to be set therein. + * + * This is like bms_add_member, but for multibitmapsets. + */ +List * +mbms_add_member(List *a, int listidx, int bitidx) +{ + Bitmapset *bms; + ListCell *lc; + + if (listidx < 0 || bitidx < 0) + elog(ERROR, "negative multibitmapset member index not allowed"); + /* Add empty elements as needed */ + while (list_length(a) <= listidx) + a = lappend(a, NULL); + /* Update the target element */ + lc = list_nth_cell(a, listidx); + bms = lfirst_node(Bitmapset, lc); + bms = bms_add_member(bms, bitidx); + lfirst(lc) = bms; + return a; +} + +/* + * mbms_add_members + * Add all members of set b to set a. + * + * This is like bms_add_members, but for multibitmapsets. + */ +List * +mbms_add_members(List *a, const List *b) +{ + ListCell *lca, + *lcb; + + /* Add empty elements to a, as needed */ + while (list_length(a) < list_length(b)) + a = lappend(a, NULL); + /* forboth will stop at the end of the shorter list, which is fine */ + forboth(lca, a, lcb, b) + { + Bitmapset *bmsa = lfirst_node(Bitmapset, lca); + const Bitmapset *bmsb = lfirst_node(Bitmapset, lcb); + + bmsa = bms_add_members(bmsa, bmsb); + lfirst(lca) = bmsa; + } + return a; +} + +/* + * mbms_int_members + * Reduce set a to its intersection with set b. + * + * This is like bms_int_members, but for multibitmapsets. + */ +List * +mbms_int_members(List *a, const List *b) +{ + ListCell *lca, + *lcb; + + /* Remove any elements of a that are no longer of use */ + a = list_truncate(a, list_length(b)); + /* forboth will stop at the end of the shorter list, which is fine */ + forboth(lca, a, lcb, b) + { + Bitmapset *bmsa = lfirst_node(Bitmapset, lca); + const Bitmapset *bmsb = lfirst_node(Bitmapset, lcb); + + bmsa = bms_int_members(bmsa, bmsb); + lfirst(lca) = bmsa; + } + return a; +} + +/* + * mbms_is_member + * Is listidx/bitidx a member of A? + * + * This is like bms_is_member, but for multibitmapsets. + */ +bool +mbms_is_member(int listidx, int bitidx, const List *a) +{ + const Bitmapset *bms; + + /* XXX better to just return false for negative indexes? */ + if (listidx < 0 || bitidx < 0) + elog(ERROR, "negative multibitmapset member index not allowed"); + if (listidx >= list_length(a)) + return false; + bms = list_nth_node(Bitmapset, a, listidx); + return bms_is_member(bitidx, bms); +} + +/* + * mbms_overlap_sets + * Identify the bitmapsets having common members in a and b. + * + * The result is a bitmapset of the list indexes of bitmapsets that overlap. + */ +Bitmapset * +mbms_overlap_sets(const List *a, const List *b) +{ + Bitmapset *result = NULL; + ListCell *lca, + *lcb; + + /* forboth will stop at the end of the shorter list, which is fine */ + forboth(lca, a, lcb, b) + { + const Bitmapset *bmsa = lfirst_node(Bitmapset, lca); + const Bitmapset *bmsb = lfirst_node(Bitmapset, lcb); + + if (bms_overlap(bmsa, bmsb)) + result = bms_add_member(result, foreach_current_index(lca)); + } + return result; +} diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index b4b9099eb6..f4cdb879c2 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -28,6 +28,7 @@ #include "catalog/pg_type.h" #include "funcapi.h" #include "nodes/makefuncs.h" +#include "nodes/multibitmapset.h" #include "nodes/nodeFuncs.h" #include "optimizer/clauses.h" #include "optimizer/optimizer.h" @@ -2769,7 +2770,7 @@ reduce_outer_joins_pass1(Node *jtnode) * state: state data collected by phase 1 for this node * root: toplevel planner state * nonnullable_rels: set of base relids forced non-null by upper quals - * forced_null_vars: list of Vars forced null by upper quals + * forced_null_vars: multibitmapset of Vars forced null by upper quals */ static void reduce_outer_joins_pass2(Node *jtnode, @@ -2799,8 +2800,8 @@ reduce_outer_joins_pass2(Node *jtnode, pass_nonnullable_rels = bms_add_members(pass_nonnullable_rels, nonnullable_rels); pass_forced_null_vars = find_forced_null_vars(f->quals); - pass_forced_null_vars = list_concat(pass_forced_null_vars, - forced_null_vars); + pass_forced_null_vars = mbms_add_members(pass_forced_null_vars, + forced_null_vars); /* And recurse --- but only into interesting subtrees */ Assert(list_length(f->fromlist) == list_length(state->sub_states)); forboth(l, f->fromlist, s, state->sub_states) @@ -2897,7 +2898,7 @@ reduce_outer_joins_pass2(Node *jtnode, if (jointype == JOIN_LEFT) { List *nonnullable_vars; - List *overlap; + Bitmapset *overlap; /* Find Vars in j->quals that must be non-null in joined rows */ nonnullable_vars = find_nonnullable_vars(j->quals); @@ -2907,11 +2908,8 @@ reduce_outer_joins_pass2(Node *jtnode, * forced_null_vars overlap: we need to know if the overlap * includes any RHS variables. */ - overlap = list_intersection(nonnullable_vars, - forced_null_vars); - if (overlap != NIL && - bms_overlap(pull_varnos(root, (Node *) overlap), - right_state->relids)) + overlap = mbms_overlap_sets(nonnullable_vars, forced_null_vars); + if (bms_overlap(overlap, right_state->relids)) jointype = JOIN_ANTI; } @@ -2964,8 +2962,8 @@ reduce_outer_joins_pass2(Node *jtnode, /* OK to merge upper and local constraints */ local_nonnullable_rels = bms_add_members(local_nonnullable_rels, nonnullable_rels); - local_forced_null_vars = list_concat(local_forced_null_vars, - forced_null_vars); + local_forced_null_vars = mbms_add_members(local_forced_null_vars, + forced_null_vars); } } else diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index 317c10c2b9..33790a4f46 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -31,6 +31,7 @@ #include "funcapi.h" #include "miscadmin.h" #include "nodes/makefuncs.h" +#include "nodes/multibitmapset.h" #include "nodes/nodeFuncs.h" #include "nodes/subscripting.h" #include "nodes/supportnodes.h" @@ -1566,7 +1567,7 @@ find_nonnullable_rels_walker(Node *node, bool top_level) * find_nonnullable_vars * Determine which Vars are forced nonnullable by given clause. * - * Returns a list of all level-zero Vars that are referenced in the clause in + * Returns the set of all level-zero Vars that are referenced in the clause in * such a way that the clause cannot possibly return TRUE if any of these Vars * is NULL. (It is OK to err on the side of conservatism; hence the analysis * here is simplistic.) @@ -1578,8 +1579,9 @@ find_nonnullable_rels_walker(Node *node, bool top_level) * the expression to have been AND/OR flattened and converted to implicit-AND * format. * - * The result is a palloc'd List, but we have not copied the member Var nodes. - * Also, we don't bother trying to eliminate duplicate entries. + * Attnos of the identified Vars are returned in a multibitmapset (a List of + * Bitmapsets). List indexes correspond to relids (varnos), while the per-rel + * Bitmapsets hold varattnos offset by FirstLowInvalidHeapAttributeNumber. * * top_level is true while scanning top-level AND/OR structure; here, showing * the result is either FALSE or NULL is good enough. top_level is false when @@ -1608,7 +1610,9 @@ find_nonnullable_vars_walker(Node *node, bool top_level) Var *var = (Var *) node; if (var->varlevelsup == 0) - result = list_make1(var); + result = mbms_add_member(result, + var->varno, + var->varattno - FirstLowInvalidHeapAttributeNumber); } else if (IsA(node, List)) { @@ -1623,9 +1627,9 @@ find_nonnullable_vars_walker(Node *node, bool top_level) */ foreach(l, (List *) node) { - result = list_concat(result, - find_nonnullable_vars_walker(lfirst(l), - top_level)); + result = mbms_add_members(result, + find_nonnullable_vars_walker(lfirst(l), + top_level)); } } else if (IsA(node, FuncExpr)) @@ -1657,7 +1661,12 @@ find_nonnullable_vars_walker(Node *node, bool top_level) switch (expr->boolop) { case AND_EXPR: - /* At top level we can just recurse (to the List case) */ + + /* + * At top level we can just recurse (to the List case), since + * the result should be the union of what we can prove in each + * arm. + */ if (top_level) { result = find_nonnullable_vars_walker((Node *) expr->args, @@ -1689,7 +1698,7 @@ find_nonnullable_vars_walker(Node *node, bool top_level) if (result == NIL) /* first subresult? */ result = subresult; else - result = list_intersection(result, subresult); + result = mbms_int_members(result, subresult); /* * If the intersection is empty, we can stop looking. This @@ -1788,8 +1797,8 @@ find_nonnullable_vars_walker(Node *node, bool top_level) * side of conservatism; hence the analysis here is simplistic. In fact, * we only detect simple "var IS NULL" tests at the top level.) * - * The result is a palloc'd List, but we have not copied the member Var nodes. - * Also, we don't bother trying to eliminate duplicate entries. + * As with find_nonnullable_vars, we return the varattnos of the identified + * Vars in a multibitmapset. */ List * find_forced_null_vars(Node *node) @@ -1804,7 +1813,9 @@ find_forced_null_vars(Node *node) var = find_forced_null_var(node); if (var) { - result = list_make1(var); + result = mbms_add_member(result, + var->varno, + var->varattno - FirstLowInvalidHeapAttributeNumber); } /* Otherwise, handle AND-conditions */ else if (IsA(node, List)) @@ -1815,8 +1826,8 @@ find_forced_null_vars(Node *node) */ foreach(l, (List *) node) { - result = list_concat(result, - find_forced_null_vars(lfirst(l))); + result = mbms_add_members(result, + find_forced_null_vars((Node *) lfirst(l))); } } else if (IsA(node, BoolExpr)) diff --git a/src/include/nodes/multibitmapset.h b/src/include/nodes/multibitmapset.h new file mode 100644 index 0000000000..8c6695aeda --- /dev/null +++ b/src/include/nodes/multibitmapset.h @@ -0,0 +1,39 @@ +/*------------------------------------------------------------------------- + * + * multibitmapset.h + * Lists of Bitmapsets + * + * A multibitmapset is useful in situations where members of a set can + * be identified by two small integers; for example, varno and varattno + * of a group of Vars within a query. The implementation is a List of + * Bitmapsets, so that the empty set can be represented by NIL. (But, + * as with Bitmapsets, that's not the only allowed representation.) + * The zero-based index of a List element is the first identifying value, + * and the (also zero-based) index of a bit within that Bitmapset is + * the second identifying value. There is no expectation that the + * Bitmapsets should all be the same size. + * + * The available operations on multibitmapsets are intended to parallel + * those on bitmapsets, for example union and intersection. So far only + * a small fraction of that has been built out; we'll add more as needed. + * + * + * Copyright (c) 2022, PostgreSQL Global Development Group + * + * src/include/nodes/multibitmapset.h + * + *------------------------------------------------------------------------- + */ +#ifndef MULTIBITMAPSET_H +#define MULTIBITMAPSET_H + +#include "nodes/bitmapset.h" +#include "nodes/pg_list.h" + +extern List *mbms_add_member(List *a, int listidx, int bitidx); +extern List *mbms_add_members(List *a, const List *b); +extern List *mbms_int_members(List *a, const List *b); +extern bool mbms_is_member(int listidx, int bitidx, const List *a); +extern Bitmapset *mbms_overlap_sets(const List *a, const List *b); + +#endif /* MULTIBITMAPSET_H */
pgsql-hackers by date: