Thread: Marginal performance improvement: replace bms_first_member loops
Another thing that came out of the discussion at http://www.postgresql.org/message-id/flat/CAOR=d=3j1U_q-zf8+jUx1hkx8ps+N8pm=EUTqyFdJ5ov=+fawg@mail.gmail.com was that there was a significant amount of palloc/pfree traffic blamable on the bms_first_member() loop in plpgsql's setup_param_list(). I've been experimenting with a variant of bms_first_member() called bms_next_member(), which doesn't modify the input bitmapset and therefore removes the need to make a working copy when iterating over the members of a set. In isolation, bms_next_member() is fractionally slower than bms_first_member() because it has to do a bit more shifting-and-masking, but of course we more than win that back from eliminating a palloc/pfree cycle. It's also worth noting that in principle, a bms_first_member() loop is O(N^2) for large sets because it scans from the start of the set each time; but I doubt this is much of an issue in practice, because the bitmapsets we work with just aren't very large. (I did some microbenchmarking and found that if one ignores the palloc overhead question, a bms_next_member loop is a tad slower up to about four words in the bitmapset, and faster beyond that because the rescans start to make a difference. But four words would be 128 bits and very very few bitmapsets in PG would have more members than that.) The attached proposed patch adds bms_next_member() and replaces bms_first_member() calls where it seemed to make sense. I've had a hard time measuring much speed difference for this patch in isolation, but in principle it should be less code and less cycles. It also seems safer and more natural to not use destructive looping techniques. regards, tom lane diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 76bda73..7dcedd1 100644 *** a/contrib/postgres_fdw/postgres_fdw.c --- b/contrib/postgres_fdw/postgres_fdw.c *************** postgresPlanForeignModify(PlannerInfo *r *** 1198,1212 **** } else if (operation == CMD_UPDATE) { - Bitmapset *tmpset = bms_copy(rte->modifiedCols); AttrNumber col; ! while ((col = bms_first_member(tmpset)) >= 0) { col += FirstLowInvalidHeapAttributeNumber; if (col <= InvalidAttrNumber) /* shouldn't happen */ elog(ERROR, "system-column update is not supported"); targetAttrs = lappend_int(targetAttrs, col); } } --- 1198,1213 ---- } else if (operation == CMD_UPDATE) { AttrNumber col; ! col = -1; ! while ((col = bms_next_member(rte->modifiedCols, col)) >= 0) { col += FirstLowInvalidHeapAttributeNumber; if (col <= InvalidAttrNumber) /* shouldn't happen */ elog(ERROR, "system-column update is not supported"); targetAttrs = lappend_int(targetAttrs, col); + col -= FirstLowInvalidHeapAttributeNumber; } } diff --git a/contrib/sepgsql/dml.c b/contrib/sepgsql/dml.c index bb82c0d..44c067d 100644 *** a/contrib/sepgsql/dml.c --- b/contrib/sepgsql/dml.c *************** static Bitmapset * *** 94,100 **** fixup_inherited_columns(Oid parentId, Oid childId, Bitmapset *columns) { AttrNumber attno; - Bitmapset *tmpset; Bitmapset *result = NULL; char *attname; int index; --- 94,99 ---- *************** fixup_inherited_columns(Oid parentId, Oi *** 105,112 **** if (parentId == childId) return columns; ! tmpset = bms_copy(columns); ! while ((index = bms_first_member(tmpset)) > 0) { attno = index + FirstLowInvalidHeapAttributeNumber; --- 104,111 ---- if (parentId == childId) return columns; ! index = -1; ! while ((index = bms_next_member(columns, index)) > 0) { attno = index + FirstLowInvalidHeapAttributeNumber; *************** fixup_inherited_columns(Oid parentId, Oi *** 128,139 **** elog(ERROR, "cache lookup failed for attribute %s of relation %u", attname, childId); ! index = attno - FirstLowInvalidHeapAttributeNumber; ! result = bms_add_member(result, index); pfree(attname); } - bms_free(tmpset); return result; } --- 127,137 ---- elog(ERROR, "cache lookup failed for attribute %s of relation %u", attname, childId); ! result = bms_add_member(result, ! attno - FirstLowInvalidHeapAttributeNumber); pfree(attname); } return result; } diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index c499486..436b0fa 100644 *** a/src/backend/executor/execMain.c --- b/src/backend/executor/execMain.c *************** ExecCheckRTEPerms(RangeTblEntry *rte) *** 547,553 **** AclMode remainingPerms; Oid relOid; Oid userid; - Bitmapset *tmpset; int col; /* --- 547,552 ---- *************** ExecCheckRTEPerms(RangeTblEntry *rte) *** 614,621 **** return false; } ! tmpset = bms_copy(rte->selectedCols); ! while ((col = bms_first_member(tmpset)) >= 0) { /* remove the column number offset */ col += FirstLowInvalidHeapAttributeNumber; --- 613,620 ---- return false; } ! col = -1; ! while ((col = bms_next_member(rte->selectedCols, col)) >= 0) { /* remove the column number offset */ col += FirstLowInvalidHeapAttributeNumber; *************** ExecCheckRTEPerms(RangeTblEntry *rte) *** 632,639 **** ACL_SELECT) != ACLCHECK_OK) return false; } } - bms_free(tmpset); } /* --- 631,638 ---- ACL_SELECT) != ACLCHECK_OK) return false; } + col -= FirstLowInvalidHeapAttributeNumber; } } /* *************** ExecCheckRTEPerms(RangeTblEntry *rte) *** 656,663 **** return false; } ! tmpset = bms_copy(rte->modifiedCols); ! while ((col = bms_first_member(tmpset)) >= 0) { /* remove the column number offset */ col += FirstLowInvalidHeapAttributeNumber; --- 655,662 ---- return false; } ! col = -1; ! while ((col = bms_next_member(rte->modifiedCols, col)) >= 0) { /* remove the column number offset */ col += FirstLowInvalidHeapAttributeNumber; *************** ExecCheckRTEPerms(RangeTblEntry *rte) *** 672,679 **** remainingPerms) != ACLCHECK_OK) return false; } } - bms_free(tmpset); } } return true; --- 671,678 ---- remainingPerms) != ACLCHECK_OK) return false; } + col -= FirstLowInvalidHeapAttributeNumber; } } } return true; diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c index c927b78..23eb1eb 100644 *** a/src/backend/nodes/bitmapset.c --- b/src/backend/nodes/bitmapset.c *************** bms_join(Bitmapset *a, Bitmapset *b) *** 793,799 **** return result; } ! /*---------- * bms_first_member - find and remove first member of a set * * Returns -1 if set is empty. NB: set is destructively modified! --- 793,799 ---- return result; } ! /* * bms_first_member - find and remove first member of a set * * Returns -1 if set is empty. NB: set is destructively modified! *************** bms_join(Bitmapset *a, Bitmapset *b) *** 801,811 **** * This is intended as support for iterating through the members of a set. * The typical pattern is * ! * tmpset = bms_copy(inputset); ! * while ((x = bms_first_member(tmpset)) >= 0) * process member x; ! * bms_free(tmpset); ! *---------- */ int bms_first_member(Bitmapset *a) --- 801,811 ---- * This is intended as support for iterating through the members of a set. * The typical pattern is * ! * while ((x = bms_first_member(inputset)) >= 0) * process member x; ! * ! * CAUTION: this destroys the content of "inputset". If the set must ! * not be modified, use bms_next_member instead. */ int bms_first_member(Bitmapset *a) *************** bms_first_member(Bitmapset *a) *** 841,846 **** --- 841,899 ---- } /* + * bms_next_member - find next member of a set + * + * Returns smallest member greater than "prevbit", or -1 if there is none. + * + * This is intended as support for iterating through the members of a set. + * The typical pattern is + * + * x = -1; + * while ((x = bms_next_member(inputset, x)) >= 0) + * process member x; + */ + int + bms_next_member(const Bitmapset *a, int prevbit) + { + int nwords; + int wordnum; + bitmapword mask; + + if (a == NULL) + return -1; + nwords = a->nwords; + prevbit++; + mask = (~(bitmapword) 0) << BITNUM(prevbit); + for (wordnum = WORDNUM(prevbit); wordnum < nwords; wordnum++) + { + bitmapword w = a->words[wordnum]; + + /* ignore bits before prevbit */ + w &= mask; + + if (w != 0) + { + int result; + + w = RIGHTMOST_ONE(w); + + result = wordnum * BITS_PER_BITMAPWORD; + while ((w & 255) == 0) + { + w >>= 8; + result += 8; + } + result += rightmost_one_pos[w & 255]; + return result; + } + + /* in subsequent words, consider all bits */ + mask = (~(bitmapword) 0); + } + return -1; + } + + /* * bms_hash_value - compute a hash key for a Bitmapset * * Note: we must ensure that any two bitmapsets that are bms_equal() will diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index ca9bd4f..edbd09f 100644 *** a/src/backend/nodes/outfuncs.c --- b/src/backend/nodes/outfuncs.c *************** _outList(StringInfo str, const List *nod *** 184,198 **** static void _outBitmapset(StringInfo str, const Bitmapset *bms) { - Bitmapset *tmpset; int x; appendStringInfoChar(str, '('); appendStringInfoChar(str, 'b'); ! tmpset = bms_copy(bms); ! while ((x = bms_first_member(tmpset)) >= 0) appendStringInfo(str, " %d", x); - bms_free(tmpset); appendStringInfoChar(str, ')'); } --- 184,196 ---- static void _outBitmapset(StringInfo str, const Bitmapset *bms) { int x; appendStringInfoChar(str, '('); appendStringInfoChar(str, 'b'); ! x = -1; ! while ((x = bms_next_member(bms, x)) >= 0) appendStringInfo(str, " %d", x); appendStringInfoChar(str, ')'); } diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 25f3067..449fdc3 100644 *** a/src/backend/optimizer/path/allpaths.c --- b/src/backend/optimizer/path/allpaths.c *************** remove_unused_subquery_outputs(Query *su *** 2272,2290 **** static void print_relids(Relids relids) { - Relids tmprelids; int x; bool first = true; ! tmprelids = bms_copy(relids); ! while ((x = bms_first_member(tmprelids)) >= 0) { if (!first) printf(" "); printf("%d", x); first = false; } - bms_free(tmprelids); } static void --- 2272,2288 ---- static void print_relids(Relids relids) { int x; bool first = true; ! x = -1; ! while ((x = bms_next_member(relids, x)) >= 0) { if (!first) printf(" "); printf("%d", x); first = false; } } static void diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c index 1ee3b93..7aab644 100644 *** a/src/backend/optimizer/path/indxpath.c --- b/src/backend/optimizer/path/indxpath.c *************** get_loop_count(PlannerInfo *root, Relids *** 1875,1883 **** { int relid; ! /* Need a working copy since bms_first_member is destructive */ ! outer_relids = bms_copy(outer_relids); ! while ((relid = bms_first_member(outer_relids)) >= 0) { RelOptInfo *outer_rel; --- 1875,1882 ---- { int relid; ! relid = -1; ! while ((relid = bms_next_member(outer_relids, relid)) >= 0) { RelOptInfo *outer_rel; *************** get_loop_count(PlannerInfo *root, Relids *** 1900,1906 **** if (result == 1.0 || result > outer_rel->rows) result = outer_rel->rows; } - bms_free(outer_relids); } return result; } --- 1899,1904 ---- diff --git a/src/backend/optimizer/util/joininfo.c b/src/backend/optimizer/util/joininfo.c index 0418946..40af38d 100644 *** a/src/backend/optimizer/util/joininfo.c --- b/src/backend/optimizer/util/joininfo.c *************** add_join_clause_to_rels(PlannerInfo *roo *** 96,112 **** RestrictInfo *restrictinfo, Relids join_relids) { - Relids tmprelids; int cur_relid; ! tmprelids = bms_copy(join_relids); ! while ((cur_relid = bms_first_member(tmprelids)) >= 0) { RelOptInfo *rel = find_base_rel(root, cur_relid); rel->joininfo = lappend(rel->joininfo, restrictinfo); } - bms_free(tmprelids); } /* --- 96,110 ---- RestrictInfo *restrictinfo, Relids join_relids) { int cur_relid; ! cur_relid = -1; ! while ((cur_relid = bms_next_member(join_relids, cur_relid)) >= 0) { RelOptInfo *rel = find_base_rel(root, cur_relid); rel->joininfo = lappend(rel->joininfo, restrictinfo); } } /* *************** remove_join_clause_from_rels(PlannerInfo *** 125,135 **** RestrictInfo *restrictinfo, Relids join_relids) { - Relids tmprelids; int cur_relid; ! tmprelids = bms_copy(join_relids); ! while ((cur_relid = bms_first_member(tmprelids)) >= 0) { RelOptInfo *rel = find_base_rel(root, cur_relid); --- 123,132 ---- RestrictInfo *restrictinfo, Relids join_relids) { int cur_relid; ! cur_relid = -1; ! while ((cur_relid = bms_next_member(join_relids, cur_relid)) >= 0) { RelOptInfo *rel = find_base_rel(root, cur_relid); *************** remove_join_clause_from_rels(PlannerInfo *** 140,144 **** Assert(list_member_ptr(rel->joininfo, restrictinfo)); rel->joininfo = list_delete_ptr(rel->joininfo, restrictinfo); } - bms_free(tmprelids); } --- 137,140 ---- diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c index d4f46b8..a64a8d7 100644 *** a/src/backend/optimizer/util/var.c --- b/src/backend/optimizer/util/var.c *************** static Relids *** 773,783 **** alias_relid_set(PlannerInfo *root, Relids relids) { Relids result = NULL; - Relids tmprelids; int rtindex; ! tmprelids = bms_copy(relids); ! while ((rtindex = bms_first_member(tmprelids)) >= 0) { RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable); --- 773,782 ---- alias_relid_set(PlannerInfo *root, Relids relids) { Relids result = NULL; int rtindex; ! rtindex = -1; ! while ((rtindex = bms_next_member(relids, rtindex)) >= 0) { RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable); *************** alias_relid_set(PlannerInfo *root, Relid *** 786,791 **** else result = bms_add_member(result, rtindex); } - bms_free(tmprelids); return result; } --- 785,789 ---- diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c index ad983c7..c7afea1 100644 *** a/src/backend/rewrite/rewriteHandler.c --- b/src/backend/rewrite/rewriteHandler.c *************** static Bitmapset * *** 2456,2466 **** adjust_view_column_set(Bitmapset *cols, List *targetlist) { Bitmapset *result = NULL; - Bitmapset *tmpcols; AttrNumber col; ! tmpcols = bms_copy(cols); ! while ((col = bms_first_member(tmpcols)) >= 0) { /* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */ AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber; --- 2456,2465 ---- adjust_view_column_set(Bitmapset *cols, List *targetlist) { Bitmapset *result = NULL; AttrNumber col; ! col = -1; ! while ((col = bms_next_member(cols, col)) >= 0) { /* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */ AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber; *************** adjust_view_column_set(Bitmapset *cols, *** 2510,2516 **** attno); } } - bms_free(tmpcols); return result; } --- 2509,2514 ---- diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c index fb20314..c9e4b68 100644 *** a/src/backend/rewrite/rewriteManip.c --- b/src/backend/rewrite/rewriteManip.c *************** static Relids *** 454,466 **** offset_relid_set(Relids relids, int offset) { Relids result = NULL; - Relids tmprelids; int rtindex; ! tmprelids = bms_copy(relids); ! while ((rtindex = bms_first_member(tmprelids)) >= 0) result = bms_add_member(result, rtindex + offset); - bms_free(tmprelids); return result; } --- 454,464 ---- offset_relid_set(Relids relids, int offset) { Relids result = NULL; int rtindex; ! rtindex = -1; ! while ((rtindex = bms_next_member(relids, rtindex)) >= 0) result = bms_add_member(result, rtindex + offset); return result; } diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h index f770608..a78ff48 100644 *** a/src/include/nodes/bitmapset.h --- b/src/include/nodes/bitmapset.h *************** extern Bitmapset *bms_join(Bitmapset *a, *** 89,94 **** --- 89,95 ---- /* support for iterating through the integer elements of a set: */ extern int bms_first_member(Bitmapset *a); + extern int bms_next_member(const Bitmapset *a, int prevbit); /* support for hashtables using Bitmapsets as keys: */ extern uint32 bms_hash_value(const Bitmapset *a); diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index 11cb47b..d1feef7 100644 *** a/src/pl/plpgsql/src/pl_exec.c --- b/src/pl/plpgsql/src/pl_exec.c *************** setup_param_list(PLpgSQL_execstate *esta *** 5263,5269 **** */ if (!bms_is_empty(expr->paramnos)) { - Bitmapset *tmpset; int dno; paramLI = (ParamListInfo) --- 5263,5268 ---- *************** setup_param_list(PLpgSQL_execstate *esta *** 5276,5283 **** paramLI->numParams = estate->ndatums; /* Instantiate values for "safe" parameters of the expression */ ! tmpset = bms_copy(expr->paramnos); ! while ((dno = bms_first_member(tmpset)) >= 0) { PLpgSQL_datum *datum = estate->datums[dno]; --- 5275,5282 ---- paramLI->numParams = estate->ndatums; /* Instantiate values for "safe" parameters of the expression */ ! dno = -1; ! while ((dno = bms_next_member(expr->paramnos, dno)) >= 0) { PLpgSQL_datum *datum = estate->datums[dno]; *************** setup_param_list(PLpgSQL_execstate *esta *** 5292,5298 **** prm->ptype = var->datatype->typoid; } } - bms_free(tmpset); /* * Set up link to active expr where the hook functions can find it. --- 5291,5296 ---- *************** format_expr_params(PLpgSQL_execstate *es *** 6528,6542 **** int paramno; int dno; StringInfoData paramstr; - Bitmapset *tmpset; if (!expr->paramnos) return NULL; initStringInfo(¶mstr); - tmpset = bms_copy(expr->paramnos); paramno = 0; ! while ((dno = bms_first_member(tmpset)) >= 0) { Datum paramdatum; Oid paramtypeid; --- 6526,6539 ---- int paramno; int dno; StringInfoData paramstr; if (!expr->paramnos) return NULL; initStringInfo(¶mstr); paramno = 0; ! dno = -1; ! while ((dno = bms_next_member(expr->paramnos, dno)) >= 0) { Datum paramdatum; Oid paramtypeid; *************** format_expr_params(PLpgSQL_execstate *es *** 6572,6578 **** paramno++; } - bms_free(tmpset); return paramstr.data; } --- 6569,6574 ----
On 27 November 2014 at 19:20, Tom Lane <tgl@sss.pgh.pa.us> wrote: > The attached proposed patch adds bms_next_member() and replaces > bms_first_member() calls where it seemed to make sense. I've had a > hard time measuring much speed difference for this patch in isolation, > but in principle it should be less code and less cycles. It also seems > safer and more natural to not use destructive looping techniques. > +1. I had a similar idea a while back but didn't have time to produce a complete patch. There is another micro-optimisation that you could make in bms_next_member() -- it isn't necessary to do w = RIGHTMOST_ONE(w) because unlike bms_next_member, w isn't being used to mask out the bit retrieved, so any higher bits don't matter and the later use of rightmost_one_pos[...] will pick out the required rightmost bit. Should this function protect against large negative inputs? As it stands, passing in a value of prevbit less than -1 would be problematic. Maybe it's sufficient to say "don't do that" in the docs, rather than waste more cycles checking. Regards, Dean
On Fri, Nov 28, 2014 at 8:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The attached proposed patch adds bms_next_member() and replaces
bms_first_member() calls where it seemed to make sense. I've had a
hard time measuring much speed difference for this patch in isolation,
but in principle it should be less code and less cycles. It also seems
safer and more natural to not use destructive looping techniques.
I've had a quick read of the patch and it seems like a good idea.
I have to say I don't really like the modifying of the loop iterator that's going on here:
col = -1;
while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
{
col += FirstLowInvalidHeapAttributeNumber;
/* do stuff */
col -= FirstLowInvalidHeapAttributeNumber;
}
Some other code is doing this:
col = -1;
while ((col = bms_next_member(cols, col)) >= 0)
{
/* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */
AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
Which seems less prone to future breakage and possibly slightly less cycles.
A while back when I was benchmarking the planner time during my trials with
anti/semi join removals, I wrote a patch to change the usage pattern for cases
such as:
if (bms_membership(a) != BMS_SINGLETON)
return; /* nothing to do */
singleton = bms_singleton_member(a);
...
Into:
if (!bms_get_singleton(a, &singleton))
return; /* nothing to do */
...
Which means 1 function call and loop over the bitmapset, rather than 2 function
calls and 2 loops over the set when the set is a singleton.
This knocked between 4 and 22% off of the time the planner spent in the join
removals path.
The patch to implement this and change all suitable calls sites is attached.
Regards
David Rowley
Attachment
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > On 27 November 2014 at 19:20, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The attached proposed patch adds bms_next_member() and replaces >> bms_first_member() calls where it seemed to make sense. > There is another micro-optimisation that you could make in > bms_next_member() -- it isn't necessary to do > w = RIGHTMOST_ONE(w) Excellent point! Thanks for noticing that. > Should this function protect against large negative inputs? As it > stands, passing in a value of prevbit less than -1 would be > problematic. Maybe it's sufficient to say "don't do that" in the docs, > rather than waste more cycles checking. Yeah, I had considered whether to do that; instead of just prevbit++ it would need to be something likeprevbit = (prevbit < 0) ? 0 : prevbit + 1; This would add one test-and-branch, and moreover one that would be hard to predict correctly (given that most of our bitmapsets don't have very many members). So it seems pretty expensive. Probably a more explicit warning in the header comment is good enough; or we could put in an Assert(). regards, tom lane
David Rowley <dgrowleyml@gmail.com> writes: > I have to say I don't really like the modifying of the loop iterator that's > going on here: > col = -1; > while ((col = bms_next_member(rte->modifiedCols, col)) >= 0) > { > col += FirstLowInvalidHeapAttributeNumber; > /* do stuff */ > col -= FirstLowInvalidHeapAttributeNumber; > } > Some other code is doing this: > col = -1; > while ((col = bms_next_member(cols, col)) >= 0) > { > /* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */ > AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber; > Which seems less prone to future breakage and possibly slightly less cycles. Yeah, I'd come to the same conclusion while thinking about it in the shower this morning ... > A while back when I was benchmarking the planner time during my trials with > anti/semi join removals, I wrote a patch to change the usage pattern for > cases such as: > ... > This knocked between 4 and 22% off of the time the planner spent in the join > removals path. Really!? I've never seen either of those functions show up all that high in profiles. Can you share the test case you were measuring? regards, tom lane
David Rowley <dgrowleyml@gmail.com> writes: > A while back when I was benchmarking the planner time during my trials with > anti/semi join removals, I wrote a patch to change the usage pattern for > cases such as: > if (bms_membership(a) != BMS_SINGLETON) > return; /* nothing to do */ > singleton = bms_singleton_member(a); > ... > Into: > if (!bms_get_singleton(a, &singleton)) > return; /* nothing to do */ > ... > Which means 1 function call and loop over the bitmapset, rather than 2 > function > calls and 2 loops over the set when the set is a singleton. I went ahead and committed this with some cosmetic adjustments. I'm not sure about there being any performance win in existing use-cases, but it seems worth doing on notational grounds anyway. regards, tom lane
On Sat, Nov 29, 2014 at 8:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
> A while back when I was benchmarking the planner time during my trials with
> anti/semi join removals, I wrote a patch to change the usage pattern for
> cases such as:
> if (bms_membership(a) != BMS_SINGLETON)
> return; /* nothing to do */
> singleton = bms_singleton_member(a);
> ...
> Into:
> if (!bms_get_singleton(a, &singleton))
> return; /* nothing to do */
> ...
> Which means 1 function call and loop over the bitmapset, rather than 2
> function
> calls and 2 loops over the set when the set is a singleton.
I went ahead and committed this with some cosmetic adjustments.
Thank you!
I'm not sure about there being any performance win in existing use-cases,
but it seems worth doing on notational grounds anyway.
My original benchmarks for this were based on the semi/anti join patch I was working on at the time
Benchmarks here:
Though the existing left join removal code should see similar speed ups, albeit the time spent in the join removal code path did only happen to be between 0.02 and 0.2% of total planning time with my test cases.
Regards
David Rowley