[Patch] RBTree iteration interface improvement
Hello
While working on some new feature for PostgreSQL (which is still in
development and is irrelevant in this context) I noticed that current
RBTree implementation interface is following:
```
extern void rb_begin_iterate(RBTree *rb, RBOrderControl ctrl);
extern RBNode *rb_iterate(RBTree *rb);
```
As you see it doesn't allow to do multiple iterations over a single
tree. I believe it's a major flaw for a general-purpose container.
Proposed patch introduces a new iteration interface that doesn't has
such a flaw. Naturally I wrote some unit tests, but I was not sure where
exactly to place them in PostgreSQL source code. For now I've just
uploaded them to GitHub:
https://github.com/afiskon/c-algorithms-examples/blob/master/rbtree_example.c
According to these tests new implementation works as fast as current
implementation and produces exactly same results.
I didn't dare to remove current interface since in theory some
extensions can use it. Still I believe such a move is worth considering.
--
Best regards,
Aleksander Alekseev
Attachments:
rbtree-iteration-improvement.patchtext/x-patchDownload
diff --git a/src/backend/access/gin/ginbulk.c b/src/backend/access/gin/ginbulk.c
index d6422ea..eac76c4 100644
--- a/src/backend/access/gin/ginbulk.c
+++ b/src/backend/access/gin/ginbulk.c
@@ -255,7 +255,7 @@ qsortCompareItemPointers(const void *a, const void *b)
void
ginBeginBAScan(BuildAccumulator *accum)
{
- rb_begin_iterate(accum->tree, LeftRightWalk);
+ rb_begin_left_right_walk(accum->tree, &accum->tree_walk);
}
/*
@@ -271,7 +271,7 @@ ginGetBAEntry(BuildAccumulator *accum,
GinEntryAccumulator *entry;
ItemPointerData *list;
- entry = (GinEntryAccumulator *) rb_iterate(accum->tree);
+ entry = (GinEntryAccumulator *) rb_left_right_walk(&accum->tree_walk);
if (entry == NULL)
return NULL; /* no more entries */
diff --git a/src/backend/lib/rbtree.c b/src/backend/lib/rbtree.c
index 4fa8a1d..cbe256f 100644
--- a/src/backend/lib/rbtree.c
+++ b/src/backend/lib/rbtree.c
@@ -871,3 +871,278 @@ rb_iterate(RBTree *rb)
return rb->iterate(rb);
}
+
+/*
+ * Begin left right walk.
+ */
+void
+rb_begin_left_right_walk(RBTree *rb, RBTreeLeftRightWalk * lrw)
+{
+ lrw->rb = rb;
+ lrw->last_visited = NULL;
+ lrw->is_over = (lrw->rb->root == RBNIL);
+}
+
+/*
+ * Left right walk: get next node. Returns NULL if there is none.
+ */
+RBNode *
+rb_left_right_walk(RBTreeLeftRightWalk * lrw)
+{
+ if (lrw->is_over)
+ return NULL;
+
+ if (lrw->last_visited == NULL)
+ {
+ lrw->last_visited = lrw->rb->root;
+ while (lrw->last_visited->left != RBNIL)
+ lrw->last_visited = lrw->last_visited->left;
+
+ return lrw->last_visited;
+ }
+
+ if (lrw->last_visited->right != RBNIL)
+ {
+ lrw->last_visited = lrw->last_visited->right;
+ while (lrw->last_visited->left != RBNIL)
+ lrw->last_visited = lrw->last_visited->left;
+
+ return lrw->last_visited;
+ }
+
+ for (;;)
+ {
+ RBNode *came_from = lrw->last_visited;
+
+ lrw->last_visited = lrw->last_visited->parent;
+ if (lrw->last_visited == NULL)
+ {
+ lrw->is_over = true;
+ break;
+ }
+
+ if (lrw->last_visited->left == came_from)
+ break; /* came from left sub-tree, return current
+ * node */
+
+ /* else - came from right sub-tree, continue to move up */
+ }
+
+ return lrw->last_visited;
+}
+
+/*
+ * Begin right left walk.
+ */
+void
+rb_begin_right_left_walk(RBTree *rb, RBTreeRightLeftWalk * rlw)
+{
+ rlw->rb = rb;
+ rlw->last_visited = NULL;
+ rlw->is_over = (rlw->rb->root == RBNIL);
+}
+
+/*
+ * Right left walk: get next node. Returns NULL if there is none.
+ */
+RBNode *
+rb_right_left_walk(RBTreeRightLeftWalk * rlw)
+{
+ if (rlw->is_over)
+ return NULL;
+
+ if (rlw->last_visited == NULL)
+ {
+ rlw->last_visited = rlw->rb->root;
+ while (rlw->last_visited->right != RBNIL)
+ rlw->last_visited = rlw->last_visited->right;
+
+ return rlw->last_visited;
+ }
+
+ if (rlw->last_visited->left != RBNIL)
+ {
+ rlw->last_visited = rlw->last_visited->left;
+ while (rlw->last_visited->right != RBNIL)
+ rlw->last_visited = rlw->last_visited->right;
+
+ return rlw->last_visited;
+ }
+
+ for (;;)
+ {
+ RBNode *came_from = rlw->last_visited;
+
+ rlw->last_visited = rlw->last_visited->parent;
+ if (rlw->last_visited == NULL)
+ {
+ rlw->is_over = true;
+ break;
+ }
+
+ if (rlw->last_visited->right == came_from)
+ break; /* came from right sub-tree, return current
+ * node */
+
+ /* else - came from left sub-tree, continue to move up */
+ }
+
+ return rlw->last_visited;
+}
+
+/*
+ * Begin direct walk (a.k.a pre-order traversal).
+ *
+ * Pre-order traversal while duplicating nodes and edges can make a complete
+ * duplicate of a binary tree. It can also be used to make a prefix expression
+ * (Polish notation) from expression trees: traverse the expression tree
+ * pre-orderly.
+ *
+ * https://en.wikipedia.org/wiki/Tree_traversal#Applications
+ */
+void
+rb_begin_direct_walk(RBTree *rb, RBTreeDirectWalk * dw)
+{
+ dw->rb = rb;
+ dw->last_visited = NULL;
+ dw->is_over = (dw->rb->root == RBNIL);
+}
+
+/*
+ * Direct walk: get next node. Returns NULL if there is none.
+ */
+RBNode *
+rb_direct_walk(RBTreeDirectWalk * dw)
+{
+ if (dw->is_over)
+ return NULL;
+
+ if (dw->last_visited == NULL)
+ {
+ dw->last_visited = dw->rb->root;
+ return dw->last_visited;
+ }
+
+ if (dw->last_visited->left != RBNIL)
+ {
+ dw->last_visited = dw->last_visited->left;
+ return dw->last_visited;
+ }
+
+ do
+ {
+ if (dw->last_visited->right != RBNIL)
+ {
+ dw->last_visited = dw->last_visited->right;
+ break;
+ }
+
+ /* go up and one step right */
+ for (;;)
+ {
+ RBNode *came_from = dw->last_visited;
+
+ dw->last_visited = dw->last_visited->parent;
+ if (dw->last_visited == NULL)
+ {
+ dw->is_over = true;
+ break;
+ }
+
+ if ((dw->last_visited->right != came_from) && (dw->last_visited->right != RBNIL))
+ {
+ dw->last_visited = dw->last_visited->right;
+ return dw->last_visited;
+ }
+ }
+ }
+ while (dw->last_visited != NULL);
+
+ return dw->last_visited;
+}
+
+/*
+ * Begin inverted walk (a.k.a post-order traversal).
+ */
+void
+rb_begin_inverted_walk(RBTree *rb, RBTreeInvertedWalk * iw)
+{
+ iw->rb = rb;
+ iw->last_visited = NULL;
+ iw->next_step = NextStepNone;
+ iw->is_over = (iw->rb->root == RBNIL);
+}
+
+/*
+ * Inverted walk: get next node. Returns NULL if there is none.
+ *
+ * Post-order traversal while deleting or freeing nodes and values can delete
+ * or free an entire binary tree. It can also generate a postfix
+ * representation of a binary tree.
+ *
+ * https://en.wikipedia.org/wiki/Tree_traversal#Applications
+ */
+RBNode *
+rb_inverted_walk(RBTreeInvertedWalk * iw)
+{
+ RBNode *came_from;
+
+ if (iw->is_over)
+ return NULL;
+
+ if (iw->last_visited == NULL)
+ {
+ iw->last_visited = iw->rb->root;
+ iw->next_step = NextStepLeft;
+ }
+
+loop:
+ switch (iw->next_step)
+ {
+ case NextStepLeft:
+ came_from = iw->last_visited;
+ while (iw->last_visited->left != RBNIL)
+ iw->last_visited = iw->last_visited->left;
+
+ iw->next_step = NextStepRight;
+ goto loop;
+
+ case NextStepRight:
+ if (iw->last_visited->right != RBNIL)
+ {
+ iw->last_visited = iw->last_visited->right;
+ iw->next_step = NextStepLeft;
+ goto loop;
+ }
+ else /* not moved - return current, then go up */
+ iw->next_step = NextStepUp;
+ break;
+ case NextStepUp:
+ for (;;)
+ {
+ came_from = iw->last_visited;
+ iw->last_visited = iw->last_visited->parent;
+ if (iw->last_visited == NULL)
+ {
+ iw->is_over = true;
+ break; /* end of iteration */
+ }
+
+ if (came_from == iw->last_visited->right)
+ {
+ /* return current, then continue to go up */
+ break;
+ }
+
+ /* otherwise we came from the left */
+ iw->next_step = NextStepRight;
+ goto loop;
+ }
+ break;
+ default:
+ fprintf(stderr, "Unexpected next step value during inverted walk: %d\n", iw->next_step);
+ exit(1);
+ }
+
+ return iw->last_visited;
+}
diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h
index 68cfe0c..3438fb8 100644
--- a/src/include/access/gin_private.h
+++ b/src/include/access/gin_private.h
@@ -919,6 +919,7 @@ typedef struct
GinEntryAccumulator *entryallocator;
uint32 eas_used;
RBTree *tree;
+ RBTreeLeftRightWalk tree_walk;
} BuildAccumulator;
extern void ginInitBA(BuildAccumulator *accum);
diff --git a/src/include/lib/rbtree.h b/src/include/lib/rbtree.h
index 73e83c3..541f978 100644
--- a/src/include/lib/rbtree.h
+++ b/src/include/lib/rbtree.h
@@ -63,4 +63,53 @@ extern void rb_delete(RBTree *rb, RBNode *node);
extern void rb_begin_iterate(RBTree *rb, RBOrderControl ctrl);
extern RBNode *rb_iterate(RBTree *rb);
+typedef enum RBTreeNextStep
+{
+ NextStepNone,
+ NextStepUp,
+ NextStepLeft,
+ NextStepRight
+} RBTreeNextStep;
+
+typedef struct
+{
+ RBTree *rb;
+ RBNode *last_visited;
+ bool is_over;
+} RBTreeLeftRightWalk;
+
+extern void rb_begin_left_right_walk(RBTree *rb, RBTreeLeftRightWalk * lrw);
+extern RBNode *rb_left_right_walk(RBTreeLeftRightWalk * lrw);
+
+typedef struct
+{
+ RBTree *rb;
+ RBNode *last_visited;
+ bool is_over;
+} RBTreeRightLeftWalk;
+
+extern void rb_begin_right_left_walk(RBTree *rb, RBTreeRightLeftWalk * rlw);
+extern RBNode *rb_right_left_walk(RBTreeRightLeftWalk * rlw);
+
+typedef struct
+{
+ RBTree *rb;
+ RBNode *last_visited;
+ bool is_over;
+} RBTreeDirectWalk;
+
+extern void rb_begin_direct_walk(RBTree *rb, RBTreeDirectWalk * dw);
+extern RBNode *rb_direct_walk(RBTreeDirectWalk * dw);
+
+typedef struct
+{
+ RBTree *rb;
+ RBNode *last_visited;
+ RBTreeNextStep next_step;
+ bool is_over;
+} RBTreeInvertedWalk;
+
+extern void rb_begin_inverted_walk(RBTree *rb, RBTreeInvertedWalk * dw);
+extern RBNode *rb_inverted_walk(RBTreeInvertedWalk * dw);
+
#endif /* RBTREE_H */
And as always, I didn't re-check a patch before sending it :) Here is a
corrected version without any fprintf's.
--
Best regards,
Aleksander Alekseev
Attachments:
rbtree-iteration-improvement-v2.patchtext/x-patchDownload
diff --git a/src/backend/access/gin/ginbulk.c b/src/backend/access/gin/ginbulk.c
index d6422ea..eac76c4 100644
--- a/src/backend/access/gin/ginbulk.c
+++ b/src/backend/access/gin/ginbulk.c
@@ -255,7 +255,7 @@ qsortCompareItemPointers(const void *a, const void *b)
void
ginBeginBAScan(BuildAccumulator *accum)
{
- rb_begin_iterate(accum->tree, LeftRightWalk);
+ rb_begin_left_right_walk(accum->tree, &accum->tree_walk);
}
/*
@@ -271,7 +271,7 @@ ginGetBAEntry(BuildAccumulator *accum,
GinEntryAccumulator *entry;
ItemPointerData *list;
- entry = (GinEntryAccumulator *) rb_iterate(accum->tree);
+ entry = (GinEntryAccumulator *) rb_left_right_walk(&accum->tree_walk);
if (entry == NULL)
return NULL; /* no more entries */
diff --git a/src/backend/lib/rbtree.c b/src/backend/lib/rbtree.c
index 4fa8a1d..eb645a2 100644
--- a/src/backend/lib/rbtree.c
+++ b/src/backend/lib/rbtree.c
@@ -871,3 +871,278 @@ rb_iterate(RBTree *rb)
return rb->iterate(rb);
}
+
+/*
+ * Begin left right walk.
+ */
+void
+rb_begin_left_right_walk(RBTree *rb, RBTreeLeftRightWalk * lrw)
+{
+ lrw->rb = rb;
+ lrw->last_visited = NULL;
+ lrw->is_over = (lrw->rb->root == RBNIL);
+}
+
+/*
+ * Left right walk: get next node. Returns NULL if there is none.
+ */
+RBNode *
+rb_left_right_walk(RBTreeLeftRightWalk * lrw)
+{
+ if (lrw->is_over)
+ return NULL;
+
+ if (lrw->last_visited == NULL)
+ {
+ lrw->last_visited = lrw->rb->root;
+ while (lrw->last_visited->left != RBNIL)
+ lrw->last_visited = lrw->last_visited->left;
+
+ return lrw->last_visited;
+ }
+
+ if (lrw->last_visited->right != RBNIL)
+ {
+ lrw->last_visited = lrw->last_visited->right;
+ while (lrw->last_visited->left != RBNIL)
+ lrw->last_visited = lrw->last_visited->left;
+
+ return lrw->last_visited;
+ }
+
+ for (;;)
+ {
+ RBNode *came_from = lrw->last_visited;
+
+ lrw->last_visited = lrw->last_visited->parent;
+ if (lrw->last_visited == NULL)
+ {
+ lrw->is_over = true;
+ break;
+ }
+
+ if (lrw->last_visited->left == came_from)
+ break; /* came from left sub-tree, return current
+ * node */
+
+ /* else - came from right sub-tree, continue to move up */
+ }
+
+ return lrw->last_visited;
+}
+
+/*
+ * Begin right left walk.
+ */
+void
+rb_begin_right_left_walk(RBTree *rb, RBTreeRightLeftWalk * rlw)
+{
+ rlw->rb = rb;
+ rlw->last_visited = NULL;
+ rlw->is_over = (rlw->rb->root == RBNIL);
+}
+
+/*
+ * Right left walk: get next node. Returns NULL if there is none.
+ */
+RBNode *
+rb_right_left_walk(RBTreeRightLeftWalk * rlw)
+{
+ if (rlw->is_over)
+ return NULL;
+
+ if (rlw->last_visited == NULL)
+ {
+ rlw->last_visited = rlw->rb->root;
+ while (rlw->last_visited->right != RBNIL)
+ rlw->last_visited = rlw->last_visited->right;
+
+ return rlw->last_visited;
+ }
+
+ if (rlw->last_visited->left != RBNIL)
+ {
+ rlw->last_visited = rlw->last_visited->left;
+ while (rlw->last_visited->right != RBNIL)
+ rlw->last_visited = rlw->last_visited->right;
+
+ return rlw->last_visited;
+ }
+
+ for (;;)
+ {
+ RBNode *came_from = rlw->last_visited;
+
+ rlw->last_visited = rlw->last_visited->parent;
+ if (rlw->last_visited == NULL)
+ {
+ rlw->is_over = true;
+ break;
+ }
+
+ if (rlw->last_visited->right == came_from)
+ break; /* came from right sub-tree, return current
+ * node */
+
+ /* else - came from left sub-tree, continue to move up */
+ }
+
+ return rlw->last_visited;
+}
+
+/*
+ * Begin direct walk (a.k.a pre-order traversal).
+ *
+ * Pre-order traversal while duplicating nodes and edges can make a complete
+ * duplicate of a binary tree. It can also be used to make a prefix expression
+ * (Polish notation) from expression trees: traverse the expression tree
+ * pre-orderly.
+ *
+ * https://en.wikipedia.org/wiki/Tree_traversal#Applications
+ */
+void
+rb_begin_direct_walk(RBTree *rb, RBTreeDirectWalk * dw)
+{
+ dw->rb = rb;
+ dw->last_visited = NULL;
+ dw->is_over = (dw->rb->root == RBNIL);
+}
+
+/*
+ * Direct walk: get next node. Returns NULL if there is none.
+ */
+RBNode *
+rb_direct_walk(RBTreeDirectWalk * dw)
+{
+ if (dw->is_over)
+ return NULL;
+
+ if (dw->last_visited == NULL)
+ {
+ dw->last_visited = dw->rb->root;
+ return dw->last_visited;
+ }
+
+ if (dw->last_visited->left != RBNIL)
+ {
+ dw->last_visited = dw->last_visited->left;
+ return dw->last_visited;
+ }
+
+ do
+ {
+ if (dw->last_visited->right != RBNIL)
+ {
+ dw->last_visited = dw->last_visited->right;
+ break;
+ }
+
+ /* go up and one step right */
+ for (;;)
+ {
+ RBNode *came_from = dw->last_visited;
+
+ dw->last_visited = dw->last_visited->parent;
+ if (dw->last_visited == NULL)
+ {
+ dw->is_over = true;
+ break;
+ }
+
+ if ((dw->last_visited->right != came_from) && (dw->last_visited->right != RBNIL))
+ {
+ dw->last_visited = dw->last_visited->right;
+ return dw->last_visited;
+ }
+ }
+ }
+ while (dw->last_visited != NULL);
+
+ return dw->last_visited;
+}
+
+/*
+ * Begin inverted walk (a.k.a post-order traversal).
+ */
+void
+rb_begin_inverted_walk(RBTree *rb, RBTreeInvertedWalk * iw)
+{
+ iw->rb = rb;
+ iw->last_visited = NULL;
+ iw->next_step = NextStepNone;
+ iw->is_over = (iw->rb->root == RBNIL);
+}
+
+/*
+ * Inverted walk: get next node. Returns NULL if there is none.
+ *
+ * Post-order traversal while deleting or freeing nodes and values can delete
+ * or free an entire binary tree. It can also generate a postfix
+ * representation of a binary tree.
+ *
+ * https://en.wikipedia.org/wiki/Tree_traversal#Applications
+ */
+RBNode *
+rb_inverted_walk(RBTreeInvertedWalk * iw)
+{
+ RBNode *came_from;
+
+ if (iw->is_over)
+ return NULL;
+
+ if (iw->last_visited == NULL)
+ {
+ iw->last_visited = iw->rb->root;
+ iw->next_step = NextStepLeft;
+ }
+
+loop:
+ switch (iw->next_step)
+ {
+ case NextStepLeft:
+ came_from = iw->last_visited;
+ while (iw->last_visited->left != RBNIL)
+ iw->last_visited = iw->last_visited->left;
+
+ iw->next_step = NextStepRight;
+ goto loop;
+
+ case NextStepRight:
+ if (iw->last_visited->right != RBNIL)
+ {
+ iw->last_visited = iw->last_visited->right;
+ iw->next_step = NextStepLeft;
+ goto loop;
+ }
+ else /* not moved - return current, then go up */
+ iw->next_step = NextStepUp;
+ break;
+ case NextStepUp:
+ for (;;)
+ {
+ came_from = iw->last_visited;
+ iw->last_visited = iw->last_visited->parent;
+ if (iw->last_visited == NULL)
+ {
+ iw->is_over = true;
+ break; /* end of iteration */
+ }
+
+ if (came_from == iw->last_visited->right)
+ {
+ /* return current, then continue to go up */
+ break;
+ }
+
+ /* otherwise we came from the left */
+ iw->next_step = NextStepRight;
+ goto loop;
+ }
+ break;
+ default: /* should never happen but is usefull during
+ * debugging */
+ elog(ERROR, "Unexpected next step value during inverted walk: %d\n", iw->next_step);
+ }
+
+ return iw->last_visited;
+}
diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h
index 68cfe0c..3438fb8 100644
--- a/src/include/access/gin_private.h
+++ b/src/include/access/gin_private.h
@@ -919,6 +919,7 @@ typedef struct
GinEntryAccumulator *entryallocator;
uint32 eas_used;
RBTree *tree;
+ RBTreeLeftRightWalk tree_walk;
} BuildAccumulator;
extern void ginInitBA(BuildAccumulator *accum);
diff --git a/src/include/lib/rbtree.h b/src/include/lib/rbtree.h
index 73e83c3..541f978 100644
--- a/src/include/lib/rbtree.h
+++ b/src/include/lib/rbtree.h
@@ -63,4 +63,53 @@ extern void rb_delete(RBTree *rb, RBNode *node);
extern void rb_begin_iterate(RBTree *rb, RBOrderControl ctrl);
extern RBNode *rb_iterate(RBTree *rb);
+typedef enum RBTreeNextStep
+{
+ NextStepNone,
+ NextStepUp,
+ NextStepLeft,
+ NextStepRight
+} RBTreeNextStep;
+
+typedef struct
+{
+ RBTree *rb;
+ RBNode *last_visited;
+ bool is_over;
+} RBTreeLeftRightWalk;
+
+extern void rb_begin_left_right_walk(RBTree *rb, RBTreeLeftRightWalk * lrw);
+extern RBNode *rb_left_right_walk(RBTreeLeftRightWalk * lrw);
+
+typedef struct
+{
+ RBTree *rb;
+ RBNode *last_visited;
+ bool is_over;
+} RBTreeRightLeftWalk;
+
+extern void rb_begin_right_left_walk(RBTree *rb, RBTreeRightLeftWalk * rlw);
+extern RBNode *rb_right_left_walk(RBTreeRightLeftWalk * rlw);
+
+typedef struct
+{
+ RBTree *rb;
+ RBNode *last_visited;
+ bool is_over;
+} RBTreeDirectWalk;
+
+extern void rb_begin_direct_walk(RBTree *rb, RBTreeDirectWalk * dw);
+extern RBNode *rb_direct_walk(RBTreeDirectWalk * dw);
+
+typedef struct
+{
+ RBTree *rb;
+ RBNode *last_visited;
+ RBTreeNextStep next_step;
+ bool is_over;
+} RBTreeInvertedWalk;
+
+extern void rb_begin_inverted_walk(RBTree *rb, RBTreeInvertedWalk * dw);
+extern RBNode *rb_inverted_walk(RBTreeInvertedWalk * dw);
+
#endif /* RBTREE_H */
On Wed, Jul 27, 2016 at 7:56 PM, Aleksander Alekseev
<a.alekseev@postgrespro.ru> wrote:
Hello
While working on some new feature for PostgreSQL (which is still in
development and is irrelevant in this context) I noticed that current
RBTree implementation interface is following:```
extern void rb_begin_iterate(RBTree *rb, RBOrderControl ctrl);
extern RBNode *rb_iterate(RBTree *rb);
```As you see it doesn't allow to do multiple iterations over a single
tree. I believe it's a major flaw for a general-purpose container.
Can you explain use case where you need it?
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Can you explain use case where you need it?
Sure. You can consider RBTree as a container that always keeps its
elements in sorted order. Now imagine you would like to write a code
like this:
```
/* iterate over items in sorted order */
while(item1 = left_right_walk(tree))
{
/* another iteration, probably even in different procedure */
while(item2 = left_right_walk(tree))
{
/* ... some logic ... */
}
}
```
Currently you can't do it.
Or maybe you have different objects, e.g. IndexScanDesc's, that should
iterate over some tree's independently somewhere in indexam.c
procedures. Exact order may depend on user's query so you don't even
control it.
--
Best regards,
Aleksander Alekseev
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Aleksander Alekseev <a.alekseev@postgrespro.ru> writes:
Can you explain use case where you need it?
... Or maybe you have different objects, e.g. IndexScanDesc's, that should
iterate over some tree's independently somewhere in indexam.c
procedures. Exact order may depend on user's query so you don't even
control it.
It seems clear to me that the existing arrangement is hazardous for any
RBTree that hasn't got exactly one consumer. I think Aleksander's plan to
decouple the iteration state is probably a good one (NB: I've not read the
patch, so this is not an endorsement of details). I'd go so far as to say
that we should remove the old API as being dangerous, rather than preserve
it on backwards-compatibility grounds. We make bigger changes than that
in internal APIs all the time.
Having said that, though: if the iteration state is not part of the
object, it's not very clear whether we can behave sanely if someone
changes the tree while an iteration is open. It will need careful
thought as to what sort of guarantees we can make about that. If it's
too weak, then a separated-state version would have enough hazards of
its own that it's not necessarily any safer.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jul 28, 2016 at 1:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Aleksander Alekseev <a.alekseev@postgrespro.ru> writes:
Can you explain use case where you need it?
... Or maybe you have different objects, e.g. IndexScanDesc's, that should
iterate over some tree's independently somewhere in indexam.c
procedures. Exact order may depend on user's query so you don't even
control it.It seems clear to me that the existing arrangement is hazardous for any
RBTree that hasn't got exactly one consumer. I think Aleksander's plan to
decouple the iteration state is probably a good one (NB: I've not read the
patch, so this is not an endorsement of details). I'd go so far as to say
that we should remove the old API as being dangerous, rather than preserve
it on backwards-compatibility grounds. We make bigger changes than that
in internal APIs all the time.Having said that, though: if the iteration state is not part of the
object, it's not very clear whether we can behave sanely if someone
changes the tree while an iteration is open. It will need careful
thought as to what sort of guarantees we can make about that. If it's
too weak, then a separated-state version would have enough hazards of
its own that it's not necessarily any safer.
+1 to all of that.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Having said that, though: if the iteration state is not part of the
object, it's not very clear whether we can behave sanely if someone
changes the tree while an iteration is open. It will need careful
thought as to what sort of guarantees we can make about that. If
it's too weak, then a separated-state version would have enough
hazards of its own that it's not necessarily any safer.+1 to all of that.
The old API assumes that tree doesn't change during iteration. And it
doesn't do any checks about this. The new API behaves the same way. In
this sense patch at least doesn't make anything worse.
It seems clear to me that the existing arrangement is hazardous for
any RBTree that hasn't got exactly one consumer. I think
Aleksander's plan to decouple the iteration state is probably a good
one (NB: I've not read the patch, so this is not an endorsement of
details). I'd go so far as to say that we should remove the old API
as being dangerous, rather than preserve it on
backwards-compatibility grounds. We make bigger changes than that in
internal APIs all the time.
Glad to hear it! I would like to propose a patch that removes the old
API. I have a few questions though.
Would you like it to be a separate patch, or all-in-one patch is fine
in this case? I also would like to include unit/property-based tests in
this (these) patch (patches). However as I understand there are
currently no unit tests in PostgreSQL at all, only integration/system
tests. Any advice how to do it better?
--
Best regards,
Aleksander Alekseev
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
It seems clear to me that the existing arrangement is hazardous for
any RBTree that hasn't got exactly one consumer. I think
Aleksander's plan to decouple the iteration state is probably a good
one (NB: I've not read the patch, so this is not an endorsement of
details). I'd go so far as to say that we should remove the old API
as being dangerous, rather than preserve it on
backwards-compatibility grounds. We make bigger changes than that in
internal APIs all the time.Glad to hear it! I would like to propose a patch that removes the old
API. I have a few questions though.Would you like it to be a separate patch, or all-in-one patch is fine
in this case? I also would like to include unit/property-based tests in
this (these) patch (patches). However as I understand there are
currently no unit tests in PostgreSQL at all, only integration/system
tests. Any advice how to do it better?
OK, here is a patch. I think we could call it a _replacement_ of an old API, so
there is probably no need in two separate patches. I still don't know how to
include unit test and whether they should be included at all. Thus for now
unit/property-based tests could be found only on GitHub [1]https://github.com/afiskon/c-algorithms/tree/master/test.
As usual, if you have any questions, suggestions or notes regarding this patch,
please let me know.
[1]: https://github.com/afiskon/c-algorithms/tree/master/test
--
Best regards,
Aleksander Alekseev
Attachments:
rbtree-iteration-improvement-v3.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/access/gin/ginbulk.c b/src/backend/access/gin/ginbulk.c
index d6422ea..eac76c4 100644
--- a/src/backend/access/gin/ginbulk.c
+++ b/src/backend/access/gin/ginbulk.c
@@ -255,7 +255,7 @@ qsortCompareItemPointers(const void *a, const void *b)
void
ginBeginBAScan(BuildAccumulator *accum)
{
- rb_begin_iterate(accum->tree, LeftRightWalk);
+ rb_begin_left_right_walk(accum->tree, &accum->tree_walk);
}
/*
@@ -271,7 +271,7 @@ ginGetBAEntry(BuildAccumulator *accum,
GinEntryAccumulator *entry;
ItemPointerData *list;
- entry = (GinEntryAccumulator *) rb_iterate(accum->tree);
+ entry = (GinEntryAccumulator *) rb_left_right_walk(&accum->tree_walk);
if (entry == NULL)
return NULL; /* no more entries */
diff --git a/src/backend/lib/rbtree.c b/src/backend/lib/rbtree.c
index 4fa8a1d..4368492 100644
--- a/src/backend/lib/rbtree.c
+++ b/src/backend/lib/rbtree.c
@@ -28,18 +28,6 @@
#include "lib/rbtree.h"
-
-/*
- * Values of RBNode.iteratorState
- *
- * Note that iteratorState has an undefined value except in nodes that are
- * currently being visited by an active iteration.
- */
-#define InitialState (0)
-#define FirstStepDone (1)
-#define SecondStepDone (2)
-#define ThirdStepDone (3)
-
/*
* Colors of nodes (values of RBNode.color)
*/
@@ -53,10 +41,6 @@ struct RBTree
{
RBNode *root; /* root node, or RBNIL if tree is empty */
- /* Iteration state */
- RBNode *cur; /* current iteration node */
- RBNode *(*iterate) (RBTree *rb);
-
/* Remaining fields are constant after rb_create */
Size node_size; /* actual size of tree nodes */
@@ -75,7 +59,7 @@ struct RBTree
*/
#define RBNIL (&sentinel)
-static RBNode sentinel = {InitialState, RBBLACK, RBNIL, RBNIL, NULL};
+static RBNode sentinel = {RBBLACK, RBNIL, RBNIL, NULL};
/*
@@ -123,8 +107,6 @@ rb_create(Size node_size,
Assert(node_size > sizeof(RBNode));
tree->root = RBNIL;
- tree->cur = RBNIL;
- tree->iterate = NULL;
tree->node_size = node_size;
tree->comparator = comparator;
tree->combiner = combiner;
@@ -201,6 +183,28 @@ rb_leftmost(RBTree *rb)
return NULL;
}
+/*
+ * rb_rightmost: fetch the rightmost (highest-valued) tree node.
+ * Returns NULL if tree is empty.
+ */
+RBNode *
+rb_rightmost(RBTree *rb)
+{
+ RBNode *node = rb->root;
+ RBNode *rightmost = rb->root;
+
+ while (node != RBNIL)
+ {
+ rightmost = node;
+ node = node->right;
+ }
+
+ if (rightmost != RBNIL)
+ return rightmost;
+
+ return NULL;
+}
+
/**********************************************************************
* Insertion *
**********************************************************************/
@@ -437,7 +441,6 @@ rb_insert(RBTree *rb, const RBNode *data, bool *isNew)
x = rb->allocfunc (rb->arg);
- x->iteratorState = InitialState;
x->color = RBRED;
x->left = RBNIL;
@@ -654,220 +657,276 @@ rb_delete(RBTree *rb, RBNode *node)
**********************************************************************/
/*
- * The iterator routines were originally coded in tail-recursion style,
- * which is nice to look at, but is trouble if your compiler isn't smart
- * enough to optimize it. Now we just use looping.
+ * Begin left right walk.
+ */
+void
+rb_begin_left_right_walk(RBTree *rb, RBTreeLeftRightWalk * lrw)
+{
+ lrw->rb = rb;
+ lrw->last_visited = NULL;
+ lrw->is_over = (lrw->rb->root == RBNIL);
+}
+
+/*
+ * Left right walk: get next node. Returns NULL if there is none.
*/
-#define descend(next_node) \
- do { \
- (next_node)->iteratorState = InitialState; \
- node = rb->cur = (next_node); \
- goto restart; \
- } while (0)
-
-#define ascend(next_node) \
- do { \
- node = rb->cur = (next_node); \
- goto restart; \
- } while (0)
-
-
-static RBNode *
-rb_left_right_iterator(RBTree *rb)
+RBNode *
+rb_left_right_walk(RBTreeLeftRightWalk * lrw)
{
- RBNode *node = rb->cur;
+ if (lrw->is_over)
+ return NULL;
-restart:
- switch (node->iteratorState)
+ if (lrw->last_visited == NULL)
{
- case InitialState:
- if (node->left != RBNIL)
- {
- node->iteratorState = FirstStepDone;
- descend(node->left);
- }
- /* FALL THROUGH */
- case FirstStepDone:
- node->iteratorState = SecondStepDone;
- return node;
- case SecondStepDone:
- if (node->right != RBNIL)
- {
- node->iteratorState = ThirdStepDone;
- descend(node->right);
- }
- /* FALL THROUGH */
- case ThirdStepDone:
- if (node->parent)
- ascend(node->parent);
- break;
- default:
- elog(ERROR, "unrecognized rbtree node state: %d",
- node->iteratorState);
+ lrw->last_visited = lrw->rb->root;
+ while (lrw->last_visited->left != RBNIL)
+ lrw->last_visited = lrw->last_visited->left;
+
+ return lrw->last_visited;
}
- return NULL;
-}
+ if (lrw->last_visited->right != RBNIL)
+ {
+ lrw->last_visited = lrw->last_visited->right;
+ while (lrw->last_visited->left != RBNIL)
+ lrw->last_visited = lrw->last_visited->left;
-static RBNode *
-rb_right_left_iterator(RBTree *rb)
-{
- RBNode *node = rb->cur;
+ return lrw->last_visited;
+ }
-restart:
- switch (node->iteratorState)
+ for (;;)
{
- case InitialState:
- if (node->right != RBNIL)
- {
- node->iteratorState = FirstStepDone;
- descend(node->right);
- }
- /* FALL THROUGH */
- case FirstStepDone:
- node->iteratorState = SecondStepDone;
- return node;
- case SecondStepDone:
- if (node->left != RBNIL)
- {
- node->iteratorState = ThirdStepDone;
- descend(node->left);
- }
- /* FALL THROUGH */
- case ThirdStepDone:
- if (node->parent)
- ascend(node->parent);
+ RBNode *came_from = lrw->last_visited;
+
+ lrw->last_visited = lrw->last_visited->parent;
+ if (lrw->last_visited == NULL)
+ {
+ lrw->is_over = true;
break;
- default:
- elog(ERROR, "unrecognized rbtree node state: %d",
- node->iteratorState);
+ }
+
+ if (lrw->last_visited->left == came_from)
+ break; /* came from left sub-tree, return current
+ * node */
+
+ /* else - came from right sub-tree, continue to move up */
}
- return NULL;
+ return lrw->last_visited;
+}
+
+/*
+ * Begin right left walk.
+ */
+void
+rb_begin_right_left_walk(RBTree *rb, RBTreeRightLeftWalk * rlw)
+{
+ rlw->rb = rb;
+ rlw->last_visited = NULL;
+ rlw->is_over = (rlw->rb->root == RBNIL);
}
-static RBNode *
-rb_direct_iterator(RBTree *rb)
+/*
+ * Right left walk: get next node. Returns NULL if there is none.
+ */
+RBNode *
+rb_right_left_walk(RBTreeRightLeftWalk * rlw)
{
- RBNode *node = rb->cur;
+ if (rlw->is_over)
+ return NULL;
-restart:
- switch (node->iteratorState)
+ if (rlw->last_visited == NULL)
{
- case InitialState:
- node->iteratorState = FirstStepDone;
- return node;
- case FirstStepDone:
- if (node->left != RBNIL)
- {
- node->iteratorState = SecondStepDone;
- descend(node->left);
- }
- /* FALL THROUGH */
- case SecondStepDone:
- if (node->right != RBNIL)
- {
- node->iteratorState = ThirdStepDone;
- descend(node->right);
- }
- /* FALL THROUGH */
- case ThirdStepDone:
- if (node->parent)
- ascend(node->parent);
- break;
- default:
- elog(ERROR, "unrecognized rbtree node state: %d",
- node->iteratorState);
+ rlw->last_visited = rlw->rb->root;
+ while (rlw->last_visited->right != RBNIL)
+ rlw->last_visited = rlw->last_visited->right;
+
+ return rlw->last_visited;
}
- return NULL;
-}
+ if (rlw->last_visited->left != RBNIL)
+ {
+ rlw->last_visited = rlw->last_visited->left;
+ while (rlw->last_visited->right != RBNIL)
+ rlw->last_visited = rlw->last_visited->right;
-static RBNode *
-rb_inverted_iterator(RBTree *rb)
-{
- RBNode *node = rb->cur;
+ return rlw->last_visited;
+ }
-restart:
- switch (node->iteratorState)
+ for (;;)
{
- case InitialState:
- if (node->left != RBNIL)
- {
- node->iteratorState = FirstStepDone;
- descend(node->left);
- }
- /* FALL THROUGH */
- case FirstStepDone:
- if (node->right != RBNIL)
- {
- node->iteratorState = SecondStepDone;
- descend(node->right);
- }
- /* FALL THROUGH */
- case SecondStepDone:
- node->iteratorState = ThirdStepDone;
- return node;
- case ThirdStepDone:
- if (node->parent)
- ascend(node->parent);
+ RBNode *came_from = rlw->last_visited;
+
+ rlw->last_visited = rlw->last_visited->parent;
+ if (rlw->last_visited == NULL)
+ {
+ rlw->is_over = true;
break;
- default:
- elog(ERROR, "unrecognized rbtree node state: %d",
- node->iteratorState);
+ }
+
+ if (rlw->last_visited->right == came_from)
+ break; /* came from right sub-tree, return current
+ * node */
+
+ /* else - came from left sub-tree, continue to move up */
}
- return NULL;
+ return rlw->last_visited;
}
/*
- * rb_begin_iterate: prepare to traverse the tree in any of several orders
- *
- * After calling rb_begin_iterate, call rb_iterate repeatedly until it
- * returns NULL or the traversal stops being of interest.
+ * Begin direct walk (a.k.a pre-order traversal).
*
- * If the tree is changed during traversal, results of further calls to
- * rb_iterate are unspecified.
+ * Pre-order traversal while duplicating nodes and edges can make a complete
+ * duplicate of a binary tree. It can also be used to make a prefix expression
+ * (Polish notation) from expression trees: traverse the expression tree
+ * pre-orderly.
*
- * Note: this used to return a separately palloc'd iterator control struct,
- * but that's a bit pointless since the data structure is incapable of
- * supporting multiple concurrent traversals. Now we just keep the state
- * in RBTree.
+ * https://en.wikipedia.org/wiki/Tree_traversal#Applications
*/
void
-rb_begin_iterate(RBTree *rb, RBOrderControl ctrl)
+rb_begin_direct_walk(RBTree *rb, RBTreeDirectWalk * dw)
+{
+ dw->rb = rb;
+ dw->last_visited = NULL;
+ dw->is_over = (dw->rb->root == RBNIL);
+}
+
+/*
+ * Direct walk: get next node. Returns NULL if there is none.
+ */
+RBNode *
+rb_direct_walk(RBTreeDirectWalk * dw)
{
- rb->cur = rb->root;
- if (rb->cur != RBNIL)
- rb->cur->iteratorState = InitialState;
+ if (dw->is_over)
+ return NULL;
- switch (ctrl)
+ if (dw->last_visited == NULL)
{
- case LeftRightWalk: /* visit left, then self, then right */
- rb->iterate = rb_left_right_iterator;
- break;
- case RightLeftWalk: /* visit right, then self, then left */
- rb->iterate = rb_right_left_iterator;
- break;
- case DirectWalk: /* visit self, then left, then right */
- rb->iterate = rb_direct_iterator;
- break;
- case InvertedWalk: /* visit left, then right, then self */
- rb->iterate = rb_inverted_iterator;
+ dw->last_visited = dw->rb->root;
+ return dw->last_visited;
+ }
+
+ if (dw->last_visited->left != RBNIL)
+ {
+ dw->last_visited = dw->last_visited->left;
+ return dw->last_visited;
+ }
+
+ do
+ {
+ if (dw->last_visited->right != RBNIL)
+ {
+ dw->last_visited = dw->last_visited->right;
break;
- default:
- elog(ERROR, "unrecognized rbtree iteration order: %d", ctrl);
+ }
+
+ /* go up and one step right */
+ for (;;)
+ {
+ RBNode *came_from = dw->last_visited;
+
+ dw->last_visited = dw->last_visited->parent;
+ if (dw->last_visited == NULL)
+ {
+ dw->is_over = true;
+ break;
+ }
+
+ if ((dw->last_visited->right != came_from) && (dw->last_visited->right != RBNIL))
+ {
+ dw->last_visited = dw->last_visited->right;
+ return dw->last_visited;
+ }
+ }
}
+ while (dw->last_visited != NULL);
+
+ return dw->last_visited;
+}
+
+/*
+ * Begin inverted walk (a.k.a post-order traversal).
+ */
+void
+rb_begin_inverted_walk(RBTree *rb, RBTreeInvertedWalk * iw)
+{
+ iw->rb = rb;
+ iw->last_visited = NULL;
+ iw->next_step = NextStepNone;
+ iw->is_over = (iw->rb->root == RBNIL);
}
/*
- * rb_iterate: return the next node in traversal order, or NULL if no more
+ * Inverted walk: get next node. Returns NULL if there is none.
+ *
+ * Post-order traversal while deleting or freeing nodes and values can delete
+ * or free an entire binary tree. It can also generate a postfix
+ * representation of a binary tree.
+ *
+ * https://en.wikipedia.org/wiki/Tree_traversal#Applications
*/
RBNode *
-rb_iterate(RBTree *rb)
+rb_inverted_walk(RBTreeInvertedWalk * iw)
{
- if (rb->cur == RBNIL)
+ RBNode *came_from;
+
+ if (iw->is_over)
return NULL;
- return rb->iterate(rb);
+ if (iw->last_visited == NULL)
+ {
+ iw->last_visited = iw->rb->root;
+ iw->next_step = NextStepLeft;
+ }
+
+loop:
+ switch (iw->next_step)
+ {
+ case NextStepLeft:
+ came_from = iw->last_visited;
+ while (iw->last_visited->left != RBNIL)
+ iw->last_visited = iw->last_visited->left;
+
+ iw->next_step = NextStepRight;
+ goto loop;
+
+ case NextStepRight:
+ if (iw->last_visited->right != RBNIL)
+ {
+ iw->last_visited = iw->last_visited->right;
+ iw->next_step = NextStepLeft;
+ goto loop;
+ }
+ else /* not moved - return current, then go up */
+ iw->next_step = NextStepUp;
+ break;
+ case NextStepUp:
+ for (;;)
+ {
+ came_from = iw->last_visited;
+ iw->last_visited = iw->last_visited->parent;
+ if (iw->last_visited == NULL)
+ {
+ iw->is_over = true;
+ break; /* end of iteration */
+ }
+
+ if (came_from == iw->last_visited->right)
+ {
+ /* return current, then continue to go up */
+ break;
+ }
+
+ /* otherwise we came from the left */
+ iw->next_step = NextStepRight;
+ goto loop;
+ }
+ break;
+ default: /* should never happen but is usefull during
+ * debugging */
+ elog(ERROR, "Unexpected next step value during inverted walk: %d\n", iw->next_step);
+ }
+
+ return iw->last_visited;
}
diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h
index 68cfe0c..3438fb8 100644
--- a/src/include/access/gin_private.h
+++ b/src/include/access/gin_private.h
@@ -919,6 +919,7 @@ typedef struct
GinEntryAccumulator *entryallocator;
uint32 eas_used;
RBTree *tree;
+ RBTreeLeftRightWalk tree_walk;
} BuildAccumulator;
extern void ginInitBA(BuildAccumulator *accum);
diff --git a/src/include/lib/rbtree.h b/src/include/lib/rbtree.h
index 73e83c3..efec1d2 100644
--- a/src/include/lib/rbtree.h
+++ b/src/include/lib/rbtree.h
@@ -22,7 +22,6 @@
*/
typedef struct RBNode
{
- char iteratorState; /* workspace for iterating through tree */
char color; /* node's current color, red or black */
struct RBNode *left; /* left child, or RBNIL if none */
struct RBNode *right; /* right child, or RBNIL if none */
@@ -32,15 +31,6 @@ typedef struct RBNode
/* Opaque struct representing a whole tree */
typedef struct RBTree RBTree;
-/* Available tree iteration orderings */
-typedef enum RBOrderControl
-{
- LeftRightWalk, /* inorder: left child, node, right child */
- RightLeftWalk, /* reverse inorder: right, node, left */
- DirectWalk, /* preorder: node, left child, right child */
- InvertedWalk /* postorder: left child, right child, node */
-} RBOrderControl;
-
/* Support functions to be provided by caller */
typedef int (*rb_comparator) (const RBNode *a, const RBNode *b, void *arg);
typedef void (*rb_combiner) (RBNode *existing, const RBNode *newdata, void *arg);
@@ -56,11 +46,58 @@ extern RBTree *rb_create(Size node_size,
extern RBNode *rb_find(RBTree *rb, const RBNode *data);
extern RBNode *rb_leftmost(RBTree *rb);
+extern RBNode *rb_rightmost(RBTree *rb);
extern RBNode *rb_insert(RBTree *rb, const RBNode *data, bool *isNew);
extern void rb_delete(RBTree *rb, RBNode *node);
-extern void rb_begin_iterate(RBTree *rb, RBOrderControl ctrl);
-extern RBNode *rb_iterate(RBTree *rb);
+typedef enum RBTreeNextStep
+{
+ NextStepNone,
+ NextStepUp,
+ NextStepLeft,
+ NextStepRight
+} RBTreeNextStep;
+
+typedef struct
+{
+ RBTree *rb;
+ RBNode *last_visited;
+ bool is_over;
+} RBTreeLeftRightWalk;
+
+extern void rb_begin_left_right_walk(RBTree *rb, RBTreeLeftRightWalk * lrw);
+extern RBNode *rb_left_right_walk(RBTreeLeftRightWalk * lrw);
+
+typedef struct
+{
+ RBTree *rb;
+ RBNode *last_visited;
+ bool is_over;
+} RBTreeRightLeftWalk;
+
+extern void rb_begin_right_left_walk(RBTree *rb, RBTreeRightLeftWalk * rlw);
+extern RBNode *rb_right_left_walk(RBTreeRightLeftWalk * rlw);
+
+typedef struct
+{
+ RBTree *rb;
+ RBNode *last_visited;
+ bool is_over;
+} RBTreeDirectWalk;
+
+extern void rb_begin_direct_walk(RBTree *rb, RBTreeDirectWalk * dw);
+extern RBNode *rb_direct_walk(RBTreeDirectWalk * dw);
+
+typedef struct
+{
+ RBTree *rb;
+ RBNode *last_visited;
+ RBTreeNextStep next_step;
+ bool is_over;
+} RBTreeInvertedWalk;
+
+extern void rb_begin_inverted_walk(RBTree *rb, RBTreeInvertedWalk * dw);
+extern RBNode *rb_inverted_walk(RBTreeInvertedWalk * dw);
#endif /* RBTREE_H */
On 08/22/2016 01:00 PM, Aleksander Alekseev wrote:
It seems clear to me that the existing arrangement is hazardous for
any RBTree that hasn't got exactly one consumer. I think
Aleksander's plan to decouple the iteration state is probably a good
one (NB: I've not read the patch, so this is not an endorsement of
details). I'd go so far as to say that we should remove the old API
as being dangerous, rather than preserve it on
backwards-compatibility grounds. We make bigger changes than that in
internal APIs all the time.Glad to hear it! I would like to propose a patch that removes the old
API. I have a few questions though.Would you like it to be a separate patch, or all-in-one patch is fine
in this case? I also would like to include unit/property-based tests in
this (these) patch (patches). However as I understand there are
currently no unit tests in PostgreSQL at all, only integration/system
tests. Any advice how to do it better?OK, here is a patch. I think we could call it a _replacement_ of an old API, so
there is probably no need in two separate patches. I still don't know how to
include unit test and whether they should be included at all. Thus for now
unit/property-based tests could be found only on GitHub [1].As usual, if you have any questions, suggestions or notes regarding this patch,
please let me know.
This also seems to change the API so that instead of a single
rb_begin_iterate()+rb_iterate() pair, there is a separate begin and
iterate function for each traversal order. That seems like an unrelated
change. Was there a particular reason for that? I think the old
rb_begin_iterate()+rb_iterate() functions were fine, we just need to
have a RBTreeIterator struct that's different from the tree itself.
Note that the API actually used to be like that. rb_begin_iterate() used
to return a palloc'd RBTreeIterator struct. It was changed in commit
0454f131, because the implementation didn't support more than one
iterator at a time, which made the separate struct pointless. But we're
fixing that problem now.
Another unrelated change in this patch is the addition of
rb_rightmost(). It's not used for anything, so I'm not sure what the
point is. Then again, there don't seem to be any callers of
rb_leftmost() either.
I think we should something like in the attached patch. It seems to pass
your test suite, but I haven't done any other testing on this. Does it
look OK to you?
- Heikki
Attachments:
rbtree-iteration-improvement-heikki-v1.patchapplication/x-patch; name=rbtree-iteration-improvement-heikki-v1.patchDownload
diff --git a/src/backend/access/gin/ginbulk.c b/src/backend/access/gin/ginbulk.c
index d6422ea..71c64e4 100644
--- a/src/backend/access/gin/ginbulk.c
+++ b/src/backend/access/gin/ginbulk.c
@@ -255,7 +255,7 @@ qsortCompareItemPointers(const void *a, const void *b)
void
ginBeginBAScan(BuildAccumulator *accum)
{
- rb_begin_iterate(accum->tree, LeftRightWalk);
+ rb_begin_iterate(accum->tree, LeftRightWalk, &accum->tree_walk);
}
/*
@@ -271,7 +271,7 @@ ginGetBAEntry(BuildAccumulator *accum,
GinEntryAccumulator *entry;
ItemPointerData *list;
- entry = (GinEntryAccumulator *) rb_iterate(accum->tree);
+ entry = (GinEntryAccumulator *) rb_iterate(&accum->tree_walk);
if (entry == NULL)
return NULL; /* no more entries */
diff --git a/src/backend/lib/rbtree.c b/src/backend/lib/rbtree.c
index 4fa8a1d..048366d 100644
--- a/src/backend/lib/rbtree.c
+++ b/src/backend/lib/rbtree.c
@@ -28,18 +28,6 @@
#include "lib/rbtree.h"
-
-/*
- * Values of RBNode.iteratorState
- *
- * Note that iteratorState has an undefined value except in nodes that are
- * currently being visited by an active iteration.
- */
-#define InitialState (0)
-#define FirstStepDone (1)
-#define SecondStepDone (2)
-#define ThirdStepDone (3)
-
/*
* Colors of nodes (values of RBNode.color)
*/
@@ -53,10 +41,6 @@ struct RBTree
{
RBNode *root; /* root node, or RBNIL if tree is empty */
- /* Iteration state */
- RBNode *cur; /* current iteration node */
- RBNode *(*iterate) (RBTree *rb);
-
/* Remaining fields are constant after rb_create */
Size node_size; /* actual size of tree nodes */
@@ -75,8 +59,19 @@ struct RBTree
*/
#define RBNIL (&sentinel)
-static RBNode sentinel = {InitialState, RBBLACK, RBNIL, RBNIL, NULL};
+static RBNode sentinel = {RBBLACK, RBNIL, RBNIL, NULL};
+/*
+ * Values used in the RBTreeIterator.next_state field, with an
+ * InvertedWalk iterator.
+ */
+typedef enum InvertedWalkNextStep
+{
+ NextStepBegin,
+ NextStepUp,
+ NextStepLeft,
+ NextStepRight
+} InvertedWalkNextStep;
/*
* rb_create: create an empty RBTree
@@ -123,8 +118,6 @@ rb_create(Size node_size,
Assert(node_size > sizeof(RBNode));
tree->root = RBNIL;
- tree->cur = RBNIL;
- tree->iterate = NULL;
tree->node_size = node_size;
tree->comparator = comparator;
tree->combiner = combiner;
@@ -437,7 +430,6 @@ rb_insert(RBTree *rb, const RBNode *data, bool *isNew)
x = rb->allocfunc (rb->arg);
- x->iteratorState = InitialState;
x->color = RBRED;
x->left = RBNIL;
@@ -653,171 +645,194 @@ rb_delete(RBTree *rb, RBNode *node)
* Traverse *
**********************************************************************/
-/*
- * The iterator routines were originally coded in tail-recursion style,
- * which is nice to look at, but is trouble if your compiler isn't smart
- * enough to optimize it. Now we just use looping.
- */
-#define descend(next_node) \
- do { \
- (next_node)->iteratorState = InitialState; \
- node = rb->cur = (next_node); \
- goto restart; \
- } while (0)
+static RBNode *
+rb_left_right_iterator(RBTreeIterator *iter)
+{
+ if (iter->last_visited == NULL)
+ {
+ iter->last_visited = iter->rb->root;
+ while (iter->last_visited->left != RBNIL)
+ iter->last_visited = iter->last_visited->left;
-#define ascend(next_node) \
- do { \
- node = rb->cur = (next_node); \
- goto restart; \
- } while (0)
+ return iter->last_visited;
+ }
+ if (iter->last_visited->right != RBNIL)
+ {
+ iter->last_visited = iter->last_visited->right;
+ while (iter->last_visited->left != RBNIL)
+ iter->last_visited = iter->last_visited->left;
-static RBNode *
-rb_left_right_iterator(RBTree *rb)
-{
- RBNode *node = rb->cur;
+ return iter->last_visited;
+ }
-restart:
- switch (node->iteratorState)
+ for (;;)
{
- case InitialState:
- if (node->left != RBNIL)
- {
- node->iteratorState = FirstStepDone;
- descend(node->left);
- }
- /* FALL THROUGH */
- case FirstStepDone:
- node->iteratorState = SecondStepDone;
- return node;
- case SecondStepDone:
- if (node->right != RBNIL)
- {
- node->iteratorState = ThirdStepDone;
- descend(node->right);
- }
- /* FALL THROUGH */
- case ThirdStepDone:
- if (node->parent)
- ascend(node->parent);
+ RBNode *came_from = iter->last_visited;
+
+ iter->last_visited = iter->last_visited->parent;
+ if (iter->last_visited == NULL)
+ {
+ iter->is_over = true;
break;
- default:
- elog(ERROR, "unrecognized rbtree node state: %d",
- node->iteratorState);
+ }
+
+ if (iter->last_visited->left == came_from)
+ break; /* came from left sub-tree, return current
+ * node */
+
+ /* else - came from right sub-tree, continue to move up */
}
- return NULL;
+ return iter->last_visited;
}
static RBNode *
-rb_right_left_iterator(RBTree *rb)
+rb_right_left_iterator(RBTreeIterator *iter)
{
- RBNode *node = rb->cur;
+ if (iter->last_visited == NULL)
+ {
+ iter->last_visited = iter->rb->root;
+ while (iter->last_visited->right != RBNIL)
+ iter->last_visited = iter->last_visited->right;
+
+ return iter->last_visited;
+ }
-restart:
- switch (node->iteratorState)
+ if (iter->last_visited->left != RBNIL)
{
- case InitialState:
- if (node->right != RBNIL)
- {
- node->iteratorState = FirstStepDone;
- descend(node->right);
- }
- /* FALL THROUGH */
- case FirstStepDone:
- node->iteratorState = SecondStepDone;
- return node;
- case SecondStepDone:
- if (node->left != RBNIL)
- {
- node->iteratorState = ThirdStepDone;
- descend(node->left);
- }
- /* FALL THROUGH */
- case ThirdStepDone:
- if (node->parent)
- ascend(node->parent);
+ iter->last_visited = iter->last_visited->left;
+ while (iter->last_visited->right != RBNIL)
+ iter->last_visited = iter->last_visited->right;
+
+ return iter->last_visited;
+ }
+
+ for (;;)
+ {
+ RBNode *came_from = iter->last_visited;
+
+ iter->last_visited = iter->last_visited->parent;
+ if (iter->last_visited == NULL)
+ {
+ iter->is_over = true;
break;
- default:
- elog(ERROR, "unrecognized rbtree node state: %d",
- node->iteratorState);
+ }
+
+ if (iter->last_visited->right == came_from)
+ break; /* came from right sub-tree, return current
+ * node */
+
+ /* else - came from left sub-tree, continue to move up */
}
- return NULL;
+ return iter->last_visited;
}
static RBNode *
-rb_direct_iterator(RBTree *rb)
+rb_direct_iterator(RBTreeIterator *iter)
{
- RBNode *node = rb->cur;
+ if (iter->last_visited == NULL)
+ {
+ iter->last_visited = iter->rb->root;
+ return iter->last_visited;
+ }
-restart:
- switch (node->iteratorState)
+ if (iter->last_visited->left != RBNIL)
{
- case InitialState:
- node->iteratorState = FirstStepDone;
- return node;
- case FirstStepDone:
- if (node->left != RBNIL)
+ iter->last_visited = iter->last_visited->left;
+ return iter->last_visited;
+ }
+
+ do
+ {
+ if (iter->last_visited->right != RBNIL)
+ {
+ iter->last_visited = iter->last_visited->right;
+ break;
+ }
+
+ /* go up and one step right */
+ for (;;)
+ {
+ RBNode *came_from = iter->last_visited;
+
+ iter->last_visited = iter->last_visited->parent;
+ if (iter->last_visited == NULL)
{
- node->iteratorState = SecondStepDone;
- descend(node->left);
+ iter->is_over = true;
+ break;
}
- /* FALL THROUGH */
- case SecondStepDone:
- if (node->right != RBNIL)
+
+ if ((iter->last_visited->right != came_from) && (iter->last_visited->right != RBNIL))
{
- node->iteratorState = ThirdStepDone;
- descend(node->right);
+ iter->last_visited = iter->last_visited->right;
+ return iter->last_visited;
}
- /* FALL THROUGH */
- case ThirdStepDone:
- if (node->parent)
- ascend(node->parent);
- break;
- default:
- elog(ERROR, "unrecognized rbtree node state: %d",
- node->iteratorState);
+ }
}
+ while (iter->last_visited != NULL);
- return NULL;
+ return iter->last_visited;
}
static RBNode *
-rb_inverted_iterator(RBTree *rb)
+rb_inverted_iterator(RBTreeIterator *iter)
{
- RBNode *node = rb->cur;
+ RBNode *came_from;
-restart:
- switch (node->iteratorState)
+loop:
+ switch ((InvertedWalkNextStep) iter->next_step)
{
- case InitialState:
- if (node->left != RBNIL)
+ /* First call, begin from root */
+ case NextStepBegin:
+ iter->last_visited = iter->rb->root;
+ iter->next_step = NextStepLeft;
+ goto loop;
+
+ case NextStepLeft:
+ while (iter->last_visited->left != RBNIL)
+ iter->last_visited = iter->last_visited->left;
+
+ iter->next_step = NextStepRight;
+ goto loop;
+
+ case NextStepRight:
+ if (iter->last_visited->right != RBNIL)
{
- node->iteratorState = FirstStepDone;
- descend(node->left);
+ iter->last_visited = iter->last_visited->right;
+ iter->next_step = NextStepLeft;
+ goto loop;
}
- /* FALL THROUGH */
- case FirstStepDone:
- if (node->right != RBNIL)
+ else /* not moved - return current, then go up */
+ iter->next_step = NextStepUp;
+ break;
+
+ case NextStepUp:
+ for (;;)
{
- node->iteratorState = SecondStepDone;
- descend(node->right);
+ came_from = iter->last_visited;
+ iter->last_visited = iter->last_visited->parent;
+ if (iter->last_visited == NULL)
+ {
+ iter->is_over = true;
+ break; /* end of iteration */
+ }
+
+ if (came_from == iter->last_visited->right)
+ {
+ /* return current, then continue to go up */
+ break;
+ }
+
+ /* otherwise we came from the left */
+ iter->next_step = NextStepRight;
+ goto loop;
}
- /* FALL THROUGH */
- case SecondStepDone:
- node->iteratorState = ThirdStepDone;
- return node;
- case ThirdStepDone:
- if (node->parent)
- ascend(node->parent);
break;
- default:
- elog(ERROR, "unrecognized rbtree node state: %d",
- node->iteratorState);
}
- return NULL;
+ return iter->last_visited;
}
/*
@@ -827,33 +842,34 @@ restart:
* returns NULL or the traversal stops being of interest.
*
* If the tree is changed during traversal, results of further calls to
- * rb_iterate are unspecified.
+ * rb_iterate are unspecified. Multiple concurrent iterators on the same
+ * tree are allowed.
*
- * Note: this used to return a separately palloc'd iterator control struct,
- * but that's a bit pointless since the data structure is incapable of
- * supporting multiple concurrent traversals. Now we just keep the state
- * in RBTree.
+ * The iterator state is stored in the 'iter' struct. The caller should
+ * treat it as opaque struct.
*/
void
-rb_begin_iterate(RBTree *rb, RBOrderControl ctrl)
+rb_begin_iterate(RBTree *rb, RBOrderControl ctrl, RBTreeIterator *iter)
{
- rb->cur = rb->root;
- if (rb->cur != RBNIL)
- rb->cur->iteratorState = InitialState;
+ /* Common initialization for all traversal orders */
+ iter->rb = rb;
+ iter->last_visited = NULL;
+ iter->is_over = (rb->root == RBNIL);
switch (ctrl)
{
case LeftRightWalk: /* visit left, then self, then right */
- rb->iterate = rb_left_right_iterator;
+ iter->iterate = rb_left_right_iterator;
break;
case RightLeftWalk: /* visit right, then self, then left */
- rb->iterate = rb_right_left_iterator;
+ iter->iterate = rb_right_left_iterator;
break;
case DirectWalk: /* visit self, then left, then right */
- rb->iterate = rb_direct_iterator;
+ iter->iterate = rb_direct_iterator;
break;
case InvertedWalk: /* visit left, then right, then self */
- rb->iterate = rb_inverted_iterator;
+ iter->iterate = rb_inverted_iterator;
+ iter->next_step = NextStepBegin;
break;
default:
elog(ERROR, "unrecognized rbtree iteration order: %d", ctrl);
@@ -864,10 +880,10 @@ rb_begin_iterate(RBTree *rb, RBOrderControl ctrl)
* rb_iterate: return the next node in traversal order, or NULL if no more
*/
RBNode *
-rb_iterate(RBTree *rb)
+rb_iterate(RBTreeIterator *iter)
{
- if (rb->cur == RBNIL)
+ if (iter->is_over)
return NULL;
- return rb->iterate(rb);
+ return iter->iterate(iter);
}
diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h
index 68cfe0c..bf589ab 100644
--- a/src/include/access/gin_private.h
+++ b/src/include/access/gin_private.h
@@ -919,6 +919,7 @@ typedef struct
GinEntryAccumulator *entryallocator;
uint32 eas_used;
RBTree *tree;
+ RBTreeIterator tree_walk;
} BuildAccumulator;
extern void ginInitBA(BuildAccumulator *accum);
diff --git a/src/include/lib/rbtree.h b/src/include/lib/rbtree.h
index 73e83c3..4f2870e 100644
--- a/src/include/lib/rbtree.h
+++ b/src/include/lib/rbtree.h
@@ -22,7 +22,6 @@
*/
typedef struct RBNode
{
- char iteratorState; /* workspace for iterating through tree */
char color; /* node's current color, red or black */
struct RBNode *left; /* left child, or RBNIL if none */
struct RBNode *right; /* right child, or RBNIL if none */
@@ -41,6 +40,22 @@ typedef enum RBOrderControl
InvertedWalk /* postorder: left child, right child, node */
} RBOrderControl;
+/*
+ * RBTreeIterator holds state while traversing a tree. This is declared here
+ * so that callers can stack-allocate this, but must otherwise be treated
+ * as an opaque struct.
+ */
+typedef struct RBTreeIterator RBTreeIterator;
+
+struct RBTreeIterator
+{
+ RBTree *rb;
+ RBNode *(*iterate) (RBTreeIterator *iter);
+ RBNode *last_visited;
+ char next_step;
+ bool is_over;
+};
+
/* Support functions to be provided by caller */
typedef int (*rb_comparator) (const RBNode *a, const RBNode *b, void *arg);
typedef void (*rb_combiner) (RBNode *existing, const RBNode *newdata, void *arg);
@@ -60,7 +75,7 @@ extern RBNode *rb_leftmost(RBTree *rb);
extern RBNode *rb_insert(RBTree *rb, const RBNode *data, bool *isNew);
extern void rb_delete(RBTree *rb, RBNode *node);
-extern void rb_begin_iterate(RBTree *rb, RBOrderControl ctrl);
-extern RBNode *rb_iterate(RBTree *rb);
+extern void rb_begin_iterate(RBTree *rb, RBOrderControl ctrl, RBTreeIterator *iter);
+extern RBNode *rb_iterate(RBTreeIterator *iter);
#endif /* RBTREE_H */
Hello, Heikki.
Thank you for your attention to this patch!
This also seems to change the API so that instead of a single
rb_begin_iterate()+rb_iterate() pair, there is a separate begin and
iterate function for each traversal order. That seems like an unrelated
change. Was there a particular reason for that? I think the old
rb_begin_iterate()+rb_iterate() functions were fine, we just need to
have a RBTreeIterator struct that's different from the tree itself.
I'm trying to avoid calling procedures by a pointer, an old habit. When
I started to work on this patch I just needed a RB-tree implementation
for a pet project. I took one from PostgreSQL code. Then I found this
flaw of iteration interface and decided to fix it. The idea to merge
this fix back to PostgreSQL code appeared later so I just wrote code the
way I like.
These days code performance depends on many factors like whether code
fits into cache, i.e not only on whether you call a procedure directly
or using a pointer. Until someone finds a real bottleneck here I think
current rb_begin_iterate()+rb_iterate() interface should do just fine.
Another unrelated change in this patch is the addition of
rb_rightmost(). It's not used for anything, so I'm not sure what the
point is. Then again, there don't seem to be any callers of
rb_leftmost() either.
It's just something I needed in tests to reach a good percent of code
coverage. Implementation of rb_rightmost is trivial so we probably can do
without it.
I think we should something like in the attached patch. It seems to pass
your test suite, but I haven't done any other testing on this. Does it
look OK to you?
Looks good to me.
--
Best regards,
Aleksander Alekseev
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 08/26/2016 04:07 PM, Aleksander Alekseev wrote:
Another unrelated change in this patch is the addition of
rb_rightmost(). It's not used for anything, so I'm not sure what the
point is. Then again, there don't seem to be any callers of
rb_leftmost() either.It's just something I needed in tests to reach a good percent of code
coverage. Implementation of rb_rightmost is trivial so we probably can do
without it.
Looking closer, we don't currently use any of the iterators besides the
left-right iterator either. Nor rb_delete().
I think we should something like in the attached patch. It seems to pass
your test suite, but I haven't done any other testing on this. Does it
look OK to you?Looks good to me.
Ok, committed.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Ok, committed.
- Heikki
Thanks a lot for committing this patch and also for your contribution to
it!
--
Best regards,
Aleksander Alekseev
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers