Marginal performance improvement: replace bms_first_member loops
Another thing that came out of the discussion at
/messages/by-id/CAOR=d=3j1U_q-zf8+jUx1hkx8ps+N8pm=EUTqyFdJ5ov=+fawg@mail.gmail.com
was that there was a significant amount of palloc/pfree traffic blamable
on the bms_first_member() loop in plpgsql's setup_param_list(). I've
been experimenting with a variant of bms_first_member() called
bms_next_member(), which doesn't modify the input bitmapset and therefore
removes the need to make a working copy when iterating over the members
of a set.
In isolation, bms_next_member() is fractionally slower than
bms_first_member() because it has to do a bit more shifting-and-masking,
but of course we more than win that back from eliminating a palloc/pfree
cycle. It's also worth noting that in principle, a bms_first_member()
loop is O(N^2) for large sets because it scans from the start of the
set each time; but I doubt this is much of an issue in practice, because
the bitmapsets we work with just aren't very large. (I did some
microbenchmarking and found that if one ignores the palloc overhead
question, a bms_next_member loop is a tad slower up to about four words
in the bitmapset, and faster beyond that because the rescans start to
make a difference. But four words would be 128 bits and very very few
bitmapsets in PG would have more members than that.)
The attached proposed patch adds bms_next_member() and replaces
bms_first_member() calls where it seemed to make sense. I've had a
hard time measuring much speed difference for this patch in isolation,
but in principle it should be less code and less cycles. It also seems
safer and more natural to not use destructive looping techniques.
regards, tom lane
Attachments:
bms_next_member.patchtext/x-diff; charset=us-ascii; name=bms_next_member.patchDownload
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 76bda73..7dcedd1 100644
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
*************** postgresPlanForeignModify(PlannerInfo *r
*** 1198,1212 ****
}
else if (operation == CMD_UPDATE)
{
- Bitmapset *tmpset = bms_copy(rte->modifiedCols);
AttrNumber col;
! while ((col = bms_first_member(tmpset)) >= 0)
{
col += FirstLowInvalidHeapAttributeNumber;
if (col <= InvalidAttrNumber) /* shouldn't happen */
elog(ERROR, "system-column update is not supported");
targetAttrs = lappend_int(targetAttrs, col);
}
}
--- 1198,1213 ----
}
else if (operation == CMD_UPDATE)
{
AttrNumber col;
! col = -1;
! while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
{
col += FirstLowInvalidHeapAttributeNumber;
if (col <= InvalidAttrNumber) /* shouldn't happen */
elog(ERROR, "system-column update is not supported");
targetAttrs = lappend_int(targetAttrs, col);
+ col -= FirstLowInvalidHeapAttributeNumber;
}
}
diff --git a/contrib/sepgsql/dml.c b/contrib/sepgsql/dml.c
index bb82c0d..44c067d 100644
*** a/contrib/sepgsql/dml.c
--- b/contrib/sepgsql/dml.c
*************** static Bitmapset *
*** 94,100 ****
fixup_inherited_columns(Oid parentId, Oid childId, Bitmapset *columns)
{
AttrNumber attno;
- Bitmapset *tmpset;
Bitmapset *result = NULL;
char *attname;
int index;
--- 94,99 ----
*************** fixup_inherited_columns(Oid parentId, Oi
*** 105,112 ****
if (parentId == childId)
return columns;
! tmpset = bms_copy(columns);
! while ((index = bms_first_member(tmpset)) > 0)
{
attno = index + FirstLowInvalidHeapAttributeNumber;
--- 104,111 ----
if (parentId == childId)
return columns;
! index = -1;
! while ((index = bms_next_member(columns, index)) > 0)
{
attno = index + FirstLowInvalidHeapAttributeNumber;
*************** fixup_inherited_columns(Oid parentId, Oi
*** 128,139 ****
elog(ERROR, "cache lookup failed for attribute %s of relation %u",
attname, childId);
! index = attno - FirstLowInvalidHeapAttributeNumber;
! result = bms_add_member(result, index);
pfree(attname);
}
- bms_free(tmpset);
return result;
}
--- 127,137 ----
elog(ERROR, "cache lookup failed for attribute %s of relation %u",
attname, childId);
! result = bms_add_member(result,
! attno - FirstLowInvalidHeapAttributeNumber);
pfree(attname);
}
return result;
}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index c499486..436b0fa 100644
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
*************** ExecCheckRTEPerms(RangeTblEntry *rte)
*** 547,553 ****
AclMode remainingPerms;
Oid relOid;
Oid userid;
- Bitmapset *tmpset;
int col;
/*
--- 547,552 ----
*************** ExecCheckRTEPerms(RangeTblEntry *rte)
*** 614,621 ****
return false;
}
! tmpset = bms_copy(rte->selectedCols);
! while ((col = bms_first_member(tmpset)) >= 0)
{
/* remove the column number offset */
col += FirstLowInvalidHeapAttributeNumber;
--- 613,620 ----
return false;
}
! col = -1;
! while ((col = bms_next_member(rte->selectedCols, col)) >= 0)
{
/* remove the column number offset */
col += FirstLowInvalidHeapAttributeNumber;
*************** ExecCheckRTEPerms(RangeTblEntry *rte)
*** 632,639 ****
ACL_SELECT) != ACLCHECK_OK)
return false;
}
}
- bms_free(tmpset);
}
/*
--- 631,638 ----
ACL_SELECT) != ACLCHECK_OK)
return false;
}
+ col -= FirstLowInvalidHeapAttributeNumber;
}
}
/*
*************** ExecCheckRTEPerms(RangeTblEntry *rte)
*** 656,663 ****
return false;
}
! tmpset = bms_copy(rte->modifiedCols);
! while ((col = bms_first_member(tmpset)) >= 0)
{
/* remove the column number offset */
col += FirstLowInvalidHeapAttributeNumber;
--- 655,662 ----
return false;
}
! col = -1;
! while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
{
/* remove the column number offset */
col += FirstLowInvalidHeapAttributeNumber;
*************** ExecCheckRTEPerms(RangeTblEntry *rte)
*** 672,679 ****
remainingPerms) != ACLCHECK_OK)
return false;
}
}
- bms_free(tmpset);
}
}
return true;
--- 671,678 ----
remainingPerms) != ACLCHECK_OK)
return false;
}
+ col -= FirstLowInvalidHeapAttributeNumber;
}
}
}
return true;
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index c927b78..23eb1eb 100644
*** a/src/backend/nodes/bitmapset.c
--- b/src/backend/nodes/bitmapset.c
*************** bms_join(Bitmapset *a, Bitmapset *b)
*** 793,799 ****
return result;
}
! /*----------
* bms_first_member - find and remove first member of a set
*
* Returns -1 if set is empty. NB: set is destructively modified!
--- 793,799 ----
return result;
}
! /*
* bms_first_member - find and remove first member of a set
*
* Returns -1 if set is empty. NB: set is destructively modified!
*************** bms_join(Bitmapset *a, Bitmapset *b)
*** 801,811 ****
* This is intended as support for iterating through the members of a set.
* The typical pattern is
*
! * tmpset = bms_copy(inputset);
! * while ((x = bms_first_member(tmpset)) >= 0)
* process member x;
! * bms_free(tmpset);
! *----------
*/
int
bms_first_member(Bitmapset *a)
--- 801,811 ----
* This is intended as support for iterating through the members of a set.
* The typical pattern is
*
! * while ((x = bms_first_member(inputset)) >= 0)
* process member x;
! *
! * CAUTION: this destroys the content of "inputset". If the set must
! * not be modified, use bms_next_member instead.
*/
int
bms_first_member(Bitmapset *a)
*************** bms_first_member(Bitmapset *a)
*** 841,846 ****
--- 841,899 ----
}
/*
+ * bms_next_member - find next member of a set
+ *
+ * Returns smallest member greater than "prevbit", or -1 if there is none.
+ *
+ * This is intended as support for iterating through the members of a set.
+ * The typical pattern is
+ *
+ * x = -1;
+ * while ((x = bms_next_member(inputset, x)) >= 0)
+ * process member x;
+ */
+ int
+ bms_next_member(const Bitmapset *a, int prevbit)
+ {
+ int nwords;
+ int wordnum;
+ bitmapword mask;
+
+ if (a == NULL)
+ return -1;
+ nwords = a->nwords;
+ prevbit++;
+ mask = (~(bitmapword) 0) << BITNUM(prevbit);
+ for (wordnum = WORDNUM(prevbit); wordnum < nwords; wordnum++)
+ {
+ bitmapword w = a->words[wordnum];
+
+ /* ignore bits before prevbit */
+ w &= mask;
+
+ if (w != 0)
+ {
+ int result;
+
+ w = RIGHTMOST_ONE(w);
+
+ result = wordnum * BITS_PER_BITMAPWORD;
+ while ((w & 255) == 0)
+ {
+ w >>= 8;
+ result += 8;
+ }
+ result += rightmost_one_pos[w & 255];
+ return result;
+ }
+
+ /* in subsequent words, consider all bits */
+ mask = (~(bitmapword) 0);
+ }
+ return -1;
+ }
+
+ /*
* bms_hash_value - compute a hash key for a Bitmapset
*
* Note: we must ensure that any two bitmapsets that are bms_equal() will
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index ca9bd4f..edbd09f 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*************** _outList(StringInfo str, const List *nod
*** 184,198 ****
static void
_outBitmapset(StringInfo str, const Bitmapset *bms)
{
- Bitmapset *tmpset;
int x;
appendStringInfoChar(str, '(');
appendStringInfoChar(str, 'b');
! tmpset = bms_copy(bms);
! while ((x = bms_first_member(tmpset)) >= 0)
appendStringInfo(str, " %d", x);
- bms_free(tmpset);
appendStringInfoChar(str, ')');
}
--- 184,196 ----
static void
_outBitmapset(StringInfo str, const Bitmapset *bms)
{
int x;
appendStringInfoChar(str, '(');
appendStringInfoChar(str, 'b');
! x = -1;
! while ((x = bms_next_member(bms, x)) >= 0)
appendStringInfo(str, " %d", x);
appendStringInfoChar(str, ')');
}
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 25f3067..449fdc3 100644
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
*************** remove_unused_subquery_outputs(Query *su
*** 2272,2290 ****
static void
print_relids(Relids relids)
{
- Relids tmprelids;
int x;
bool first = true;
! tmprelids = bms_copy(relids);
! while ((x = bms_first_member(tmprelids)) >= 0)
{
if (!first)
printf(" ");
printf("%d", x);
first = false;
}
- bms_free(tmprelids);
}
static void
--- 2272,2288 ----
static void
print_relids(Relids relids)
{
int x;
bool first = true;
! x = -1;
! while ((x = bms_next_member(relids, x)) >= 0)
{
if (!first)
printf(" ");
printf("%d", x);
first = false;
}
}
static void
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 1ee3b93..7aab644 100644
*** a/src/backend/optimizer/path/indxpath.c
--- b/src/backend/optimizer/path/indxpath.c
*************** get_loop_count(PlannerInfo *root, Relids
*** 1875,1883 ****
{
int relid;
! /* Need a working copy since bms_first_member is destructive */
! outer_relids = bms_copy(outer_relids);
! while ((relid = bms_first_member(outer_relids)) >= 0)
{
RelOptInfo *outer_rel;
--- 1875,1882 ----
{
int relid;
! relid = -1;
! while ((relid = bms_next_member(outer_relids, relid)) >= 0)
{
RelOptInfo *outer_rel;
*************** get_loop_count(PlannerInfo *root, Relids
*** 1900,1906 ****
if (result == 1.0 || result > outer_rel->rows)
result = outer_rel->rows;
}
- bms_free(outer_relids);
}
return result;
}
--- 1899,1904 ----
diff --git a/src/backend/optimizer/util/joininfo.c b/src/backend/optimizer/util/joininfo.c
index 0418946..40af38d 100644
*** a/src/backend/optimizer/util/joininfo.c
--- b/src/backend/optimizer/util/joininfo.c
*************** add_join_clause_to_rels(PlannerInfo *roo
*** 96,112 ****
RestrictInfo *restrictinfo,
Relids join_relids)
{
- Relids tmprelids;
int cur_relid;
! tmprelids = bms_copy(join_relids);
! while ((cur_relid = bms_first_member(tmprelids)) >= 0)
{
RelOptInfo *rel = find_base_rel(root, cur_relid);
rel->joininfo = lappend(rel->joininfo, restrictinfo);
}
- bms_free(tmprelids);
}
/*
--- 96,110 ----
RestrictInfo *restrictinfo,
Relids join_relids)
{
int cur_relid;
! cur_relid = -1;
! while ((cur_relid = bms_next_member(join_relids, cur_relid)) >= 0)
{
RelOptInfo *rel = find_base_rel(root, cur_relid);
rel->joininfo = lappend(rel->joininfo, restrictinfo);
}
}
/*
*************** remove_join_clause_from_rels(PlannerInfo
*** 125,135 ****
RestrictInfo *restrictinfo,
Relids join_relids)
{
- Relids tmprelids;
int cur_relid;
! tmprelids = bms_copy(join_relids);
! while ((cur_relid = bms_first_member(tmprelids)) >= 0)
{
RelOptInfo *rel = find_base_rel(root, cur_relid);
--- 123,132 ----
RestrictInfo *restrictinfo,
Relids join_relids)
{
int cur_relid;
! cur_relid = -1;
! while ((cur_relid = bms_next_member(join_relids, cur_relid)) >= 0)
{
RelOptInfo *rel = find_base_rel(root, cur_relid);
*************** remove_join_clause_from_rels(PlannerInfo
*** 140,144 ****
Assert(list_member_ptr(rel->joininfo, restrictinfo));
rel->joininfo = list_delete_ptr(rel->joininfo, restrictinfo);
}
- bms_free(tmprelids);
}
--- 137,140 ----
diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index d4f46b8..a64a8d7 100644
*** a/src/backend/optimizer/util/var.c
--- b/src/backend/optimizer/util/var.c
*************** static Relids
*** 773,783 ****
alias_relid_set(PlannerInfo *root, Relids relids)
{
Relids result = NULL;
- Relids tmprelids;
int rtindex;
! tmprelids = bms_copy(relids);
! while ((rtindex = bms_first_member(tmprelids)) >= 0)
{
RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable);
--- 773,782 ----
alias_relid_set(PlannerInfo *root, Relids relids)
{
Relids result = NULL;
int rtindex;
! rtindex = -1;
! while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
{
RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable);
*************** alias_relid_set(PlannerInfo *root, Relid
*** 786,791 ****
else
result = bms_add_member(result, rtindex);
}
- bms_free(tmprelids);
return result;
}
--- 785,789 ----
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index ad983c7..c7afea1 100644
*** a/src/backend/rewrite/rewriteHandler.c
--- b/src/backend/rewrite/rewriteHandler.c
*************** static Bitmapset *
*** 2456,2466 ****
adjust_view_column_set(Bitmapset *cols, List *targetlist)
{
Bitmapset *result = NULL;
- Bitmapset *tmpcols;
AttrNumber col;
! tmpcols = bms_copy(cols);
! while ((col = bms_first_member(tmpcols)) >= 0)
{
/* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */
AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
--- 2456,2465 ----
adjust_view_column_set(Bitmapset *cols, List *targetlist)
{
Bitmapset *result = NULL;
AttrNumber col;
! col = -1;
! while ((col = bms_next_member(cols, col)) >= 0)
{
/* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */
AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
*************** adjust_view_column_set(Bitmapset *cols,
*** 2510,2516 ****
attno);
}
}
- bms_free(tmpcols);
return result;
}
--- 2509,2514 ----
diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c
index fb20314..c9e4b68 100644
*** a/src/backend/rewrite/rewriteManip.c
--- b/src/backend/rewrite/rewriteManip.c
*************** static Relids
*** 454,466 ****
offset_relid_set(Relids relids, int offset)
{
Relids result = NULL;
- Relids tmprelids;
int rtindex;
! tmprelids = bms_copy(relids);
! while ((rtindex = bms_first_member(tmprelids)) >= 0)
result = bms_add_member(result, rtindex + offset);
- bms_free(tmprelids);
return result;
}
--- 454,464 ----
offset_relid_set(Relids relids, int offset)
{
Relids result = NULL;
int rtindex;
! rtindex = -1;
! while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
result = bms_add_member(result, rtindex + offset);
return result;
}
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index f770608..a78ff48 100644
*** a/src/include/nodes/bitmapset.h
--- b/src/include/nodes/bitmapset.h
*************** extern Bitmapset *bms_join(Bitmapset *a,
*** 89,94 ****
--- 89,95 ----
/* support for iterating through the integer elements of a set: */
extern int bms_first_member(Bitmapset *a);
+ extern int bms_next_member(const Bitmapset *a, int prevbit);
/* support for hashtables using Bitmapsets as keys: */
extern uint32 bms_hash_value(const Bitmapset *a);
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 11cb47b..d1feef7 100644
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
*************** setup_param_list(PLpgSQL_execstate *esta
*** 5263,5269 ****
*/
if (!bms_is_empty(expr->paramnos))
{
- Bitmapset *tmpset;
int dno;
paramLI = (ParamListInfo)
--- 5263,5268 ----
*************** setup_param_list(PLpgSQL_execstate *esta
*** 5276,5283 ****
paramLI->numParams = estate->ndatums;
/* Instantiate values for "safe" parameters of the expression */
! tmpset = bms_copy(expr->paramnos);
! while ((dno = bms_first_member(tmpset)) >= 0)
{
PLpgSQL_datum *datum = estate->datums[dno];
--- 5275,5282 ----
paramLI->numParams = estate->ndatums;
/* Instantiate values for "safe" parameters of the expression */
! dno = -1;
! while ((dno = bms_next_member(expr->paramnos, dno)) >= 0)
{
PLpgSQL_datum *datum = estate->datums[dno];
*************** setup_param_list(PLpgSQL_execstate *esta
*** 5292,5298 ****
prm->ptype = var->datatype->typoid;
}
}
- bms_free(tmpset);
/*
* Set up link to active expr where the hook functions can find it.
--- 5291,5296 ----
*************** format_expr_params(PLpgSQL_execstate *es
*** 6528,6542 ****
int paramno;
int dno;
StringInfoData paramstr;
- Bitmapset *tmpset;
if (!expr->paramnos)
return NULL;
initStringInfo(¶mstr);
- tmpset = bms_copy(expr->paramnos);
paramno = 0;
! while ((dno = bms_first_member(tmpset)) >= 0)
{
Datum paramdatum;
Oid paramtypeid;
--- 6526,6539 ----
int paramno;
int dno;
StringInfoData paramstr;
if (!expr->paramnos)
return NULL;
initStringInfo(¶mstr);
paramno = 0;
! dno = -1;
! while ((dno = bms_next_member(expr->paramnos, dno)) >= 0)
{
Datum paramdatum;
Oid paramtypeid;
*************** format_expr_params(PLpgSQL_execstate *es
*** 6572,6578 ****
paramno++;
}
- bms_free(tmpset);
return paramstr.data;
}
--- 6569,6574 ----
On 27 November 2014 at 19:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The attached proposed patch adds bms_next_member() and replaces
bms_first_member() calls where it seemed to make sense. I've had a
hard time measuring much speed difference for this patch in isolation,
but in principle it should be less code and less cycles. It also seems
safer and more natural to not use destructive looping techniques.
+1. I had a similar idea a while back but didn't have time to produce
a complete patch.
There is another micro-optimisation that you could make in
bms_next_member() -- it isn't necessary to do
w = RIGHTMOST_ONE(w)
because unlike bms_next_member, w isn't being used to mask out the bit
retrieved, so any higher bits don't matter and the later use of
rightmost_one_pos[...] will pick out the required rightmost bit.
Should this function protect against large negative inputs? As it
stands, passing in a value of prevbit less than -1 would be
problematic. Maybe it's sufficient to say "don't do that" in the docs,
rather than waste more cycles checking.
Regards,
Dean
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Nov 28, 2014 at 8:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The attached proposed patch adds bms_next_member() and replaces
bms_first_member() calls where it seemed to make sense. I've had a
hard time measuring much speed difference for this patch in isolation,
but in principle it should be less code and less cycles. It also seems
safer and more natural to not use destructive looping techniques.
I've had a quick read of the patch and it seems like a good idea.
I have to say I don't really like the modifying of the loop iterator that's
going on here:
col = -1;
while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
{
col += FirstLowInvalidHeapAttributeNumber;
/* do stuff */
col -= FirstLowInvalidHeapAttributeNumber;
}
Some other code is doing this:
col = -1;
while ((col = bms_next_member(cols, col)) >= 0)
{
/* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */
AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
Which seems less prone to future breakage and possibly slightly less cycles.
A while back when I was benchmarking the planner time during my trials with
anti/semi join removals, I wrote a patch to change the usage pattern for
cases
such as:
if (bms_membership(a) != BMS_SINGLETON)
return; /* nothing to do */
singleton = bms_singleton_member(a);
...
Into:
if (!bms_get_singleton(a, &singleton))
return; /* nothing to do */
...
Which means 1 function call and loop over the bitmapset, rather than 2
function
calls and 2 loops over the set when the set is a singleton.
This knocked between 4 and 22% off of the time the planner spent in the join
removals path.
The patch to implement this and change all suitable calls sites is attached.
Regards
David Rowley
Attachments:
bms_get_singleton_v1.patchapplication/octet-stream; name=bms_get_singleton_v1.patchDownload
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index c927b78..daec9e6 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -525,6 +525,61 @@ bms_singleton_member(const Bitmapset *a)
}
/*
+ * bms_get_singleton
+ *
+ * Returns True and sets singleton to the value of the singleton, if the
+ * bitmapset is a singleton, otherwise, if the bitmapset is NULL, empty or has
+ * multiple values, False is returned.
+ *
+ * This function can be useful if some processing only needs to take place
+ * when a Bitmapset is a singleton and that singleton value is required for
+ * that processing, for example, you could do:
+ *
+ * if (bms_membership(a) != BMS_SINGLETON)
+ * return; // nothing to do
+ * singleton = bms_singleton_member(a);
+ *
+ * But it would be more efficiently processed by doing:
+ *
+ * if (!bms_get_singleton(a, &singleton))
+ * return; // nothing to do
+ */
+bool
+bms_get_singleton(const Bitmapset *a, int *singleton)
+{
+ int result = -1;
+ int nwords;
+ int wordnum;
+
+ if (a == NULL)
+ return false;
+ nwords = a->nwords;
+ for (wordnum = 0; wordnum < nwords; wordnum++)
+ {
+ bitmapword w = a->words[wordnum];
+
+ if (w != 0)
+ {
+ if (result >= 0 || HAS_MULTIPLE_ONES(w))
+ return false;
+ result = wordnum * BITS_PER_BITMAPWORD;
+ while ((w & 255) == 0)
+ {
+ w >>= 8;
+ result += 8;
+ }
+ result += rightmost_one_pos[w & 255];
+ }
+ }
+ if (result < 0)
+ return false;
+
+ *singleton = result;
+ return true;
+}
+
+
+/*
* bms_num_members - count members of set
*/
int
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index e5dd58e..479fdba 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -867,9 +867,9 @@ generate_base_implied_equalities_no_const(PlannerInfo *root,
int relid;
Assert(!cur_em->em_is_child); /* no children yet */
- if (bms_membership(cur_em->em_relids) != BMS_SINGLETON)
+ if (!bms_get_singleton(cur_em->em_relids, &relid))
continue;
- relid = bms_singleton_member(cur_em->em_relids);
+
Assert(relid < root->simple_rel_array_size);
if (prev_ems[relid] != NULL)
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 773f8a4..6cd0758 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -163,10 +163,9 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
*/
if (sjinfo->jointype != JOIN_LEFT ||
sjinfo->delay_upper_joins ||
- bms_membership(sjinfo->min_righthand) != BMS_SINGLETON)
+ !bms_get_singleton(sjinfo->min_righthand, &innerrelid))
return false;
- innerrelid = bms_singleton_member(sjinfo->min_righthand);
innerrel = find_base_rel(root, innerrelid);
if (innerrel->reloptkind != RELOPT_BASEREL)
diff --git a/src/backend/optimizer/util/placeholder.c b/src/backend/optimizer/util/placeholder.c
index 8d7c4fe..d9d1c6a 100644
--- a/src/backend/optimizer/util/placeholder.c
+++ b/src/backend/optimizer/util/placeholder.c
@@ -383,10 +383,10 @@ add_placeholders_to_base_rels(PlannerInfo *root)
{
PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc);
Relids eval_at = phinfo->ph_eval_at;
+ int varno;
- if (bms_membership(eval_at) == BMS_SINGLETON)
+ if (bms_get_singleton(eval_at, &varno))
{
- int varno = bms_singleton_member(eval_at);
RelOptInfo *rel = find_base_rel(root, varno);
/* add it to reltargetlist if needed above the rel scan level */
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index f770608..6db7270 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -72,6 +72,8 @@ extern bool bms_is_member(int x, const Bitmapset *a);
extern bool bms_overlap(const Bitmapset *a, const Bitmapset *b);
extern bool bms_nonempty_difference(const Bitmapset *a, const Bitmapset *b);
extern int bms_singleton_member(const Bitmapset *a);
+extern bool bms_get_singleton(const Bitmapset *a, int *singleton);
+
extern int bms_num_members(const Bitmapset *a);
/* optimized tests when we don't need to know exact membership count: */
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
On 27 November 2014 at 19:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The attached proposed patch adds bms_next_member() and replaces
bms_first_member() calls where it seemed to make sense.
There is another micro-optimisation that you could make in
bms_next_member() -- it isn't necessary to do
w = RIGHTMOST_ONE(w)
Excellent point! Thanks for noticing that.
Should this function protect against large negative inputs? As it
stands, passing in a value of prevbit less than -1 would be
problematic. Maybe it's sufficient to say "don't do that" in the docs,
rather than waste more cycles checking.
Yeah, I had considered whether to do that; instead of just prevbit++
it would need to be something like
prevbit = (prevbit < 0) ? 0 : prevbit + 1;
This would add one test-and-branch, and moreover one that would be
hard to predict correctly (given that most of our bitmapsets don't
have very many members). So it seems pretty expensive. Probably
a more explicit warning in the header comment is good enough; or
we could put in an Assert().
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
David Rowley <dgrowleyml@gmail.com> writes:
I have to say I don't really like the modifying of the loop iterator that's
going on here:
col = -1;
while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
{
col += FirstLowInvalidHeapAttributeNumber;
/* do stuff */
col -= FirstLowInvalidHeapAttributeNumber;
}
Some other code is doing this:
col = -1;
while ((col = bms_next_member(cols, col)) >= 0)
{
/* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */
AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
Which seems less prone to future breakage and possibly slightly less cycles.
Yeah, I'd come to the same conclusion while thinking about it in the
shower this morning ...
A while back when I was benchmarking the planner time during my trials with
anti/semi join removals, I wrote a patch to change the usage pattern for
cases such as:
...
This knocked between 4 and 22% off of the time the planner spent in the join
removals path.
Really!? I've never seen either of those functions show up all that high
in profiles. Can you share the test case you were measuring?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
David Rowley <dgrowleyml@gmail.com> writes:
A while back when I was benchmarking the planner time during my trials with
anti/semi join removals, I wrote a patch to change the usage pattern for
cases such as:
if (bms_membership(a) != BMS_SINGLETON)
return; /* nothing to do */
singleton = bms_singleton_member(a);
...
Into:
if (!bms_get_singleton(a, &singleton))
return; /* nothing to do */
...
Which means 1 function call and loop over the bitmapset, rather than 2
function
calls and 2 loops over the set when the set is a singleton.
I went ahead and committed this with some cosmetic adjustments.
I'm not sure about there being any performance win in existing use-cases,
but it seems worth doing on notational grounds anyway.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Nov 29, 2014 at 8:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
A while back when I was benchmarking the planner time during my trials
with
anti/semi join removals, I wrote a patch to change the usage pattern for
cases such as:if (bms_membership(a) != BMS_SINGLETON)
return; /* nothing to do */
singleton = bms_singleton_member(a);
...Into:
if (!bms_get_singleton(a, &singleton))
return; /* nothing to do */
...Which means 1 function call and loop over the bitmapset, rather than 2
function
calls and 2 loops over the set when the set is a singleton.I went ahead and committed this with some cosmetic adjustments.
Thank you!
I'm not sure about there being any performance win in existing use-cases,
but it seems worth doing on notational grounds anyway.
My original benchmarks for this were based on the semi/anti join patch I
was working on at the time
Benchmarks here:
/messages/by-id/CAApHDvo21-b+PU=sC9B1QiEG3YL4T919i4WjZfnfP6UPXS9nZg@mail.gmail.com
Though the existing left join removal code should see similar speed ups,
albeit the time spent in the join removal code path did only happen to be
between 0.02 and 0.2% of total planning time with my test cases.
Regards
David Rowley