Red-black tree for GIN

Started by Teodor Sigaevabout 16 years ago14 messages
#1Teodor Sigaev
teodor@sigaev.ru
1 attachment(s)

Hi there,

attached is a patch, which contains implementation of a red-black
tree, a self-balanced implementation of binary tree. The main goal of
this patch is to improve creation time of GIN index in the corner cases.
While creation, GIN collects data in memory in binary tree until it
reach some limit and then it flush tree to disk. Some data could
produces unbalanced binary tree, for example, sorted data, so the tree
degenerates to the list with O(N^2) processing time (see thread
http://archives.postgresql.org/pgsql-performance/2009-03/msg00340.php)
), which cause very slow index creation. Tom has fixed that by limiting
depth of tree (committed to 8.3 and 8.4), but we found it's not enough
and propose to use red-black tree, which is very good for skewed data and has
almost the same performance for unsorted data, see
http://www.sai.msu.su/~megera/wiki/2009-07-27 and
http://www.sai.msu.su/~megera/wiki/2009-04-03 for more information.

Implementation of red-black tree has several currently unused methods,
but they will be used in next patches.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

rbtree-0.5.gzapplication/x-tar; name=rbtree-0.5.gzDownload
#2Robert Haas
robertmhaas@gmail.com
In reply to: Teodor Sigaev (#1)
Re: Red-black tree for GIN

2009/11/23 Teodor Sigaev <teodor@sigaev.ru>:

Hi there,

attached is a patch, which contains implementation of a  red-black
tree,  a self-balanced implementation of binary tree.  The main goal of
this patch is to improve creation time of GIN index in the corner cases.
While creation, GIN collects data in memory in binary tree until it
reach some limit and then it flush tree to disk. Some data could
produces unbalanced binary tree,  for example, sorted data, so the tree
degenerates to the list with O(N^2) processing time (see thread
http://archives.postgresql.org/pgsql-performance/2009-03/msg00340.php)
), which cause very slow index creation.  Tom has fixed that by limiting
depth of tree  (committed to 8.3 and 8.4),  but we found it's not enough
and propose to use red-black tree, which is very good for skewed data and
has almost the same performance for unsorted data, see
http://www.sai.msu.su/~megera/wiki/2009-07-27 and
http://www.sai.msu.su/~megera/wiki/2009-04-03 for more information.

Implementation of red-black tree has several currently unused  methods,
but they will be used in next patches.

I did a quick read-through of this, and one question that immediately
occurred to me is that rbtree.c says that it is adopted from
http://algolist.manual.ru/ds/rbtree.php. But I'm not sure what
license that code is under, so I'm not sure whether it's OK for us to
use it.

My other question is as related to performance. Can you provide a
test case that shows the performance improvement with this patch?

...Robert

#3Alvaro Herrera
alvherre@commandprompt.com
In reply to: Robert Haas (#2)
Re: Red-black tree for GIN

Robert Haas escribi�:

I did a quick read-through of this, and one question that immediately
occurred to me is that rbtree.c says that it is adopted from
http://algolist.manual.ru/ds/rbtree.php. But I'm not sure what
license that code is under, so I'm not sure whether it's OK for us to
use it.

This code comes from Thomas Niemann's "Sorting and Searching Algorithms:
A Cookbook",
http://www.cs.auckland.ac.nz/software/AlgAnim/niemann/s_man.htm

specifically
http://www.cs.auckland.ac.nz/software/AlgAnim/niemann/s_rbt.htm

The code is in the public domain; that web page says
"Source code, when part of a software project, may be used
freely without reference to the author."

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#4Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#3)
Re: Red-black tree for GIN

On Mon, Jan 4, 2010 at 8:12 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Robert Haas escribió:

I did a quick read-through of this, and one question that immediately
occurred to me is that rbtree.c says that it is adopted from
http://algolist.manual.ru/ds/rbtree.php.  But I'm not sure what
license that code is under, so I'm not sure whether it's OK for us to
use it.

This code comes from Thomas Niemann's "Sorting and Searching Algorithms:
A Cookbook",
http://www.cs.auckland.ac.nz/software/AlgAnim/niemann/s_man.htm

specifically
http://www.cs.auckland.ac.nz/software/AlgAnim/niemann/s_rbt.htm

The code is in the public domain; that web page says
       "Source code, when part of a software project, may be used
       freely without reference to the author."

That is excellent. Perhaps we should document that information in the
code comments where the URL is currently mentioned.

...Robert

#5Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#2)
Re: Red-black tree for GIN

On Thu, Dec 31, 2009 at 4:19 PM, Robert Haas <robertmhaas@gmail.com> wrote:

My other question is as related to performance.  Can you provide a
test case that shows the performance improvement with this patch?

So, we still don't have a test case for this patch. During the
November CommitFest, Greg Smith griped a bit about the lack of a
reproducible performance benchmark for the XLogInsert patch:

http://archives.postgresql.org/pgsql-hackers/2009-12/msg00816.php

...and I would say the same logic applies to this patch, maybe even
moreso. Tom has already applied a partial workaround for this
problem, and I'm feeling like it won't be trivial to figure out what
to measure to see the remaining issue and measure how much this new
implementation helps.

The coding pattern that this patch uses also merits some discussion.
Basically, rbtree.c is a generic implementation of red-black trees -
from a textbook - which ginbulk.c then uses for GIN. One possible
advantage of this implementation is that it might make it possible for
us to use the rbtree.c logic in other places, if we have other data
structures that need similar treatment. But I'm not sure if that's
the way we want to go. The other alternative is to drop the
generalized implementation and incorporate the logic directly into
ginbulk.c. I really don't know which is better, but I'd like to hear
some other opinions...

...Robert

#6Greg Stark
gsstark@mit.edu
In reply to: Robert Haas (#5)
Re: Red-black tree for GIN

On Mon, Jan 11, 2010 at 2:42 AM, Robert Haas <robertmhaas@gmail.com> wrote:

The coding pattern that this patch uses also merits some discussion.
Basically, rbtree.c is a generic implementation of red-black trees -
from a textbook - which ginbulk.c then uses for GIN.  One possible
advantage of this implementation is that it might make it possible for
us to use the rbtree.c logic in other places, if we have other data
structures that need similar treatment.  But I'm not sure if that's
the way we want to go.  The other alternative is to drop the
generalized implementation and incorporate the logic directly into
ginbulk.c.  I really don't know which is better, but I'd like to hear
some other opinions...

So, code reuse is not the only advantage of abstraction. It's also
just plain easier to understand and test code written with clear
abstract interfaces. The way you describe it someone with no knowledge
could look at rbtree.c and see if it's done correctly and maybe
improve it. And someone reading ginbulk only has to understand the
operations red-black trees support and no understand how they're
implemented to follow the code.

Is there any advantage of integrating the code with ginbulk.c? Does it
let us get away with finer grained locks or do any tricks doing
gin-specific changes while we're in the middle of rebalancing or
anything like that?

--
greg

#7Oleg Bartunov
oleg@sai.msu.su
In reply to: Robert Haas (#5)
Re: Red-black tree for GIN

Robert,

we have benchmark for rbtree
http://www.sai.msu.su/~megera/wiki/2009-07-27
rbtree, actually, fix corner cases with ordered input, with little
overhead.

As you may see from knngist patch, rbtree used in gist code, so, please,
leave rbtree code as is.

Oleg
On Sun, 10 Jan 2010, Robert Haas wrote:

On Thu, Dec 31, 2009 at 4:19 PM, Robert Haas <robertmhaas@gmail.com> wrote:

My other question is as related to performance. =A0Can you provide a
test case that shows the performance improvement with this patch?

So, we still don't have a test case for this patch. During the
November CommitFest, Greg Smith griped a bit about the lack of a
reproducible performance benchmark for the XLogInsert patch:

http://archives.postgresql.org/pgsql-hackers/2009-12/msg00816.php

...and I would say the same logic applies to this patch, maybe even
moreso. Tom has already applied a partial workaround for this
problem, and I'm feeling like it won't be trivial to figure out what
to measure to see the remaining issue and measure how much this new
implementation helps.

The coding pattern that this patch uses also merits some discussion.
Basically, rbtree.c is a generic implementation of red-black trees -
from a textbook - which ginbulk.c then uses for GIN. One possible
advantage of this implementation is that it might make it possible for
us to use the rbtree.c logic in other places, if we have other data
structures that need similar treatment. But I'm not sure if that's
the way we want to go. The other alternative is to drop the
generalized implementation and incorporate the logic directly into
ginbulk.c. I really don't know which is better, but I'd like to hear
some other opinions...

...Robert

--=20
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

#8Teodor Sigaev
teodor@sigaev.ru
In reply to: Robert Haas (#5)
Re: Red-black tree for GIN

...and I would say the same logic applies to this patch, maybe even
moreso. Tom has already applied a partial workaround for this
problem, and I'm feeling like it won't be trivial to figure out what

That partial workaround is not work for some corner cases:
http://www.sai.msu.su/~megera/wiki/2009-07-27 (8.4 already has that hask!)

The coding pattern that this patch uses also merits some discussion.
Basically, rbtree.c is a generic implementation of red-black trees -
from a textbook - which ginbulk.c then uses for GIN. One possible
advantage of this implementation is that it might make it possible for
us to use the rbtree.c logic in other places, if we have other data
structures that need similar treatment. But I'm not sure if that's
the way we want to go. The other alternative is to drop the
generalized implementation and incorporate the logic directly into
ginbulk.c. I really don't know which is better, but I'd like to hear
some other opinions...

knngist uses that implementation of rb-tree. One more candidate is a ts_stat
which now uses unbalanced binary tree.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#9Robert Haas
robertmhaas@gmail.com
In reply to: Teodor Sigaev (#8)
Re: Red-black tree for GIN

2010/1/11 Teodor Sigaev <teodor@sigaev.ru>:

knngist uses that implementation of rb-tree. One more candidate is a ts_stat
which now uses unbalanced binary tree.

Ah, OK. That's great if we can reuse that code in 2 or 3 places.

...Robert

#10Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#9)
Re: Red-black tree for GIN

On Mon, Jan 11, 2010 at 1:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:

2010/1/11 Teodor Sigaev <teodor@sigaev.ru>:

knngist uses that implementation of rb-tree. One more candidate is a ts_stat
which now uses unbalanced binary tree.

Ah, OK.  That's great if we can reuse that code in 2 or 3 places.

Some preliminary thoughts on this patch:

1. I think rb_free_recursive is missing a pfree().

2. We already have a definition of NIL in the PG source base. I think
this one needs to be named something else. RBNIL, maybe.

3. This code could really use some more comments, and maybe some of
the variable names could be better chosen, too. It's far from obvious
what is going on here. I studied rbtrees in college and I still
remember more or less how they work, but, boy, this is hard to follow.
The names of the iterator states are truly horrible, at least IMO.

4. It would be nice if you could do a better job conforming to project
indentation style.

...Robert

#11Teodor Sigaev
teodor@sigaev.ru
In reply to: Robert Haas (#10)
1 attachment(s)
Re: Red-black tree for GIN

1. I think rb_free_recursive is missing a pfree().

Fixed, thank you

2. We already have a definition of NIL in the PG source base. I think
this one needs to be named something else. RBNIL, maybe.

fixed

3. This code could really use some more comments, and maybe some of
the variable names could be better chosen, too. It's far from obvious
what is going on here. I studied rbtrees in college and I still
remember more or less how they work, but, boy, this is hard to follow.
The names of the iterator states are truly horrible, at least IMO.

I changed names and slightly improve comments.

4. It would be nice if you could do a better job conforming to project
indentation style.

Did pgindent.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

rbtree-0.6.gzapplication/x-tar; name=rbtree-0.6.gzDownload
#12Robert Haas
robertmhaas@gmail.com
In reply to: Teodor Sigaev (#11)
Re: Red-black tree for GIN

2010/1/24 Teodor Sigaev <teodor@sigaev.ru>:

3. This code could really use some more comments, and maybe some of
the variable names could be better chosen, too.  It's far from obvious
what is going on here.  I studied rbtrees in college and I still
remember more or less how they work, but, boy, this is hard to follow.
 The names of the iterator states are truly horrible, at least IMO.

I changed names and slightly improve comments.

Would it be OK if I went through here and hacked on the comments and
sent this back to you?

...Robert

#13Teodor Sigaev
teodor@sigaev.ru
In reply to: Robert Haas (#12)
Re: Red-black tree for GIN

Would it be OK if I went through here and hacked on the comments and
sent this back to you?

Feel free to edit the patch.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#14Robert Haas
robertmhaas@gmail.com
In reply to: Teodor Sigaev (#13)
1 attachment(s)
Re: Red-black tree for GIN

2010/1/24 Teodor Sigaev <teodor@sigaev.ru>:

Would it be OK if I went through here and hacked on the comments and
sent this back to you?

Feel free to edit the patch.

Here's an edited version, which I've now reviewed more fully. Some
more substantive review comments:

1. I'm pretty satisfied that the rbtree code is generally sane,
although I wonder if we should think about putting it in access/common
rather than utils/misc. I'm not sure that I have a sufficiently
clearly defined notion of what each subdirectory is for to draw a
definitive conclusion on this point; hopefully someone else will weigh
in.

2. I'm a little concerned about the performance implications of this
patch. Looking at http://www.sai.msu.su/~megera/wiki/2009-04-03, it's
clear that the patch is a huge win in some cases. But I'm also
surprised by how much performance is lost in some of the other cases
that you tested. I suspect, on balance, that it's probably still a
good idea to put this in, but I wonder if you've profiled this at all
to see where the extra time is going and/or explored possible ways of
squashing that overhead down a little more.

3. In ginInsertEntry(), the "else" branch of the if statement really
looks like magic when you first read it. I wonder if it would make
sense to pull the contents of EAAllocate() directly into this
function, since there's only one call site anyhow.

There are a lot of whitespace changes in the version I'm attaching
compared to your last one - your pg_indent run didn't use a typedef
list that included the new typedefs, so some of the spacing came out
kind of wacky. I also fleshed out the comments a bit, deleted one
comment that was no longer true, and changed some of the function
names in rbtree.c to make them more consistent, but the changes are
all pretty trivial.

I still have not done any performance testing of my own on this code,
and it probably needs that. If anyone else has time to step up and
help with that, it would be much appreciated. It would be useful to
have some plain old benchmarks as well as some profiling data as
mentioned above.

...Robert

Attachments:

rbtree-0.6-rmhapplication/octet-stream; name=rbtree-0.6-rmhDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5de44d6..e343635 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -2620,7 +2620,7 @@
        <entry>Return Type</entry>
        <entry>Description</entry>
        <entry>Example</entry>
-       <entry>Result</entry>  
+       <entry>Result</entry>
       </row>
      </thead>
 
@@ -2641,32 +2641,6 @@
       </row>
 
       <row>
-       <entry><function>get_bit</function>(<parameter>string</parameter>, <parameter>offset</parameter>)</entry>
-       <entry><type>int</type></entry>
-       <entry>
-        Extract bit from string
-        <indexterm>
-         <primary>get_bit</primary>
-        </indexterm>
-       </entry>
-       <entry><literal>get_bit(E'Th\\000omas'::bytea, 45)</literal></entry>
-       <entry><literal>1</literal></entry>
-      </row>
-
-      <row>
-       <entry><function>get_byte</function>(<parameter>string</parameter>, <parameter>offset</parameter>)</entry>
-       <entry><type>int</type></entry>
-       <entry>
-        Extract byte from string
-        <indexterm>
-         <primary>get_byte</primary>
-        </indexterm>
-       </entry>
-       <entry><literal>get_byte(E'Th\\000omas'::bytea, 4)</literal></entry>
-       <entry><literal>109</literal></entry>
-      </row>
-
-      <row>
        <entry><literal><function>octet_length</function>(<parameter>string</parameter>)</literal></entry>
        <entry><type>int</type></entry>
        <entry>Number of bytes in binary string</entry>
@@ -2675,39 +2649,21 @@
       </row>
 
       <row>
-       <entry><literal><function>position</function>(<parameter>substring</parameter> in <parameter>string</parameter>)</literal></entry>
-       <entry><type>int</type></entry>
-       <entry>Location of specified substring</entry>
-      <entry><literal>position(E'\\000om'::bytea in E'Th\\000omas'::bytea)</literal></entry>
-       <entry><literal>3</literal></entry>
-      </row>
-
-      <row>
-       <entry><function>set_bit</function>(<parameter>string</parameter>,
-       <parameter>offset</parameter>, <parameter>newvalue</>)</entry>
+       <entry><literal><function>overlay</function>(<parameter>string</parameter> placing <parameter>string</parameter> from <type>int</type> <optional>for <type>int</type></optional>)</literal></entry>
        <entry><type>bytea</type></entry>
        <entry>
-        Set bit in string
-        <indexterm>
-         <primary>set_bit</primary>
-        </indexterm>
+        Replace substring
        </entry>
-       <entry><literal>set_bit(E'Th\\000omas'::bytea, 45, 0)</literal></entry>
-       <entry><literal>Th\000omAs</literal></entry>
+       <entry><literal>overlay(E'Th\\000omas'::bytea placing E'\\002\\003'::bytea from 2 for 3)</literal></entry>
+       <entry><literal>T\\002\\003mas</literal></entry>
       </row>
 
       <row>
-       <entry><function>set_byte</function>(<parameter>string</parameter>,
-       <parameter>offset</parameter>, <parameter>newvalue</>)</entry>
-       <entry><type>bytea</type></entry>
-       <entry>
-        Set byte in string
-        <indexterm>
-         <primary>set_byte</primary>
-        </indexterm>
-       </entry>
-       <entry><literal>set_byte(E'Th\\000omas'::bytea, 4, 64)</literal></entry>
-       <entry><literal>Th\000o@as</literal></entry>
+       <entry><literal><function>position</function>(<parameter>substring</parameter> in <parameter>string</parameter>)</literal></entry>
+       <entry><type>int</type></entry>
+       <entry>Location of specified substring</entry>
+      <entry><literal>position(E'\\000om'::bytea in E'Th\\000omas'::bytea)</literal></entry>
+       <entry><literal>3</literal></entry>
       </row>
 
       <row>
@@ -2784,7 +2740,7 @@
       </entry>
       <entry><type>bytea</type></entry>
       <entry>
-       Decode binary string from <parameter>string</parameter> previously 
+       Decode binary string from <parameter>string</parameter> previously
        encoded with <function>encode</>.  Parameter type is same as in <function>encode</>.
       </entry>
       <entry><literal>decode(E'123\\000456', 'escape')</literal></entry>
@@ -2805,6 +2761,36 @@
       <entry><literal>123\000456</literal></entry>
      </row>
 
+      <row>
+       <entry>
+        <literal><function>get_bit</function>(<parameter>string</parameter>, <parameter>offset</parameter>)</literal>
+       </entry>
+       <entry><type>int</type></entry>
+       <entry>
+        Extract bit from string
+        <indexterm>
+         <primary>get_bit</primary>
+        </indexterm>
+       </entry>
+       <entry><literal>get_bit(E'Th\\000omas'::bytea, 45)</literal></entry>
+       <entry><literal>1</literal></entry>
+      </row>
+
+      <row>
+       <entry>
+        <literal><function>get_byte</function>(<parameter>string</parameter>, <parameter>offset</parameter>)</literal>
+       </entry>
+       <entry><type>int</type></entry>
+       <entry>
+        Extract byte from string
+        <indexterm>
+         <primary>get_byte</primary>
+        </indexterm>
+       </entry>
+       <entry><literal>get_byte(E'Th\\000omas'::bytea, 4)</literal></entry>
+       <entry><literal>109</literal></entry>
+      </row>
+
      <row>
       <entry><literal><function>length</function>(<parameter>string</parameter>)</literal></entry>
       <entry><type>int</type></entry>
@@ -2834,6 +2820,38 @@
       <entry><literal>md5(E'Th\\000omas'::bytea)</literal></entry>
       <entry><literal>8ab2d3c9689aaf18 b4958c334c82d8b1</literal></entry>
      </row>
+
+      <row>
+       <entry>
+        <literal><function>set_bit</function>(<parameter>string</parameter>,
+        <parameter>offset</parameter>, <parameter>newvalue</>)</literal>
+       </entry>
+       <entry><type>bytea</type></entry>
+       <entry>
+        Set bit in string
+        <indexterm>
+         <primary>set_bit</primary>
+        </indexterm>
+       </entry>
+       <entry><literal>set_bit(E'Th\\000omas'::bytea, 45, 0)</literal></entry>
+       <entry><literal>Th\000omAs</literal></entry>
+      </row>
+
+      <row>
+       <entry>
+        <literal><function>set_byte</function>(<parameter>string</parameter>,
+        <parameter>offset</parameter>, <parameter>newvalue</>)</literal>
+       </entry>
+       <entry><type>bytea</type></entry>
+       <entry>
+        Set byte in string
+        <indexterm>
+         <primary>set_byte</primary>
+        </indexterm>
+       </entry>
+       <entry><literal>set_byte(E'Th\\000omas'::bytea, 4, 64)</literal></entry>
+       <entry><literal>Th\000o@as</literal></entry>
+      </row>
     </tbody>
    </tgroup>
   </table>
@@ -2934,7 +2952,15 @@
     <literal><function>bit_length</function></literal>,
     <literal><function>octet_length</function></literal>,
     <literal><function>position</function></literal>,
-    <literal><function>substring</function></literal>.
+    <literal><function>substring</function></literal>,
+    <literal><function>overlay</function></literal>.
+   </para>
+
+   <para>
+    The following functions work on bit strings as well as binary
+    strings:
+    <literal><function>get_bit</function></literal>,
+    <literal><function>set_bit</function></literal>.
    </para>
 
    <para>
diff --git a/doc/src/sgml/ref/prepare_transaction.sgml b/doc/src/sgml/ref/prepare_transaction.sgml
index 0a28092..8f95e29 100644
--- a/doc/src/sgml/ref/prepare_transaction.sgml
+++ b/doc/src/sgml/ref/prepare_transaction.sgml
@@ -83,6 +83,15 @@ PREPARE TRANSACTION <replaceable class="PARAMETER">transaction_id</replaceable>
   <title>Notes</title>
 
   <para>
+   <command>PREPARE TRANSACTION</> is not intended for use in applications
+   or interactive sessions. It's purpose is to allow an external
+   transaction manager to perform atomic global transactions across multiple
+   databases or other transactional resources. Unless you're writing a
+   transaction manager, you probably shouldn't be using <command>PREPARE
+   TRANSACTION</>.
+  </para>
+
+  <para>
    This command must be used inside a transaction block. Use <xref
    linkend="sql-begin" endterm="sql-begin-title"> to start one.
   </para>
diff --git a/src/backend/access/gin/ginbulk.c b/src/backend/access/gin/ginbulk.c
index 954884d..e043713 100644
--- a/src/backend/access/gin/ginbulk.c
+++ b/src/backend/access/gin/ginbulk.c
@@ -22,15 +22,60 @@
 #define DEF_NENTRY	2048
 #define DEF_NPTR	4
 
+static void*
+ginAppendData(void *old, void *new, void *arg)
+{
+	EntryAccumulator	*eo = (EntryAccumulator*)old,
+						*en = (EntryAccumulator*)new;
+
+	BuildAccumulator	*accum = (BuildAccumulator*)arg;
+
+	if (eo->number >= eo->length)
+	{
+		accum->allocatedMemory -= GetMemoryChunkSpace(eo->list);
+		eo->length *= 2;
+		eo->list = (ItemPointerData *) repalloc(eo->list,
+									sizeof(ItemPointerData) * eo->length);
+		accum->allocatedMemory += GetMemoryChunkSpace(eo->list);
+	}
+
+	/* If item pointers are not ordered, they will need to be sorted. */
+	if (eo->shouldSort == FALSE)
+	{
+		int			res;
+
+		res = compareItemPointers(eo->list + eo->number - 1, en->list);
+		Assert(res != 0);
+
+		if (res > 0)
+			eo->shouldSort = TRUE;
+	}
+
+	eo->list[eo->number] = en->list[0];
+	eo->number++;
+
+	return old;
+}
+
+static int
+cmpEntryAccumulator(const void *a, const void *b, void *arg)
+{
+	EntryAccumulator	*ea = (EntryAccumulator*)a;
+	EntryAccumulator	*eb = (EntryAccumulator*)b;
+	BuildAccumulator	*accum = (BuildAccumulator*)arg;
+
+	return compareAttEntries(accum->ginstate, ea->attnum, ea->value,
+							 eb->attnum, eb->value);
+}
+
 void
 ginInitBA(BuildAccumulator *accum)
 {
-	accum->maxdepth = 1;
-	accum->stackpos = 0;
-	accum->entries = NULL;
-	accum->stack = NULL;
 	accum->allocatedMemory = 0;
 	accum->entryallocator = NULL;
+	accum->tree = rb_create(cmpEntryAccumulator, ginAppendData, NULL, accum);
+	accum->iterator = NULL;
+	accum->tmpList = NULL;
 }
 
 static EntryAccumulator *
@@ -48,36 +93,6 @@ EAAllocate(BuildAccumulator *accum)
 }
 
 /*
- * Stores heap item pointer. For robust, it checks that
- * item pointer are ordered
- */
-static void
-ginInsertData(BuildAccumulator *accum, EntryAccumulator *entry, ItemPointer heapptr)
-{
-	if (entry->number >= entry->length)
-	{
-		accum->allocatedMemory -= GetMemoryChunkSpace(entry->list);
-		entry->length *= 2;
-		entry->list = (ItemPointerData *) repalloc(entry->list,
-									sizeof(ItemPointerData) * entry->length);
-		accum->allocatedMemory += GetMemoryChunkSpace(entry->list);
-	}
-
-	if (entry->shouldSort == FALSE)
-	{
-		int			res = compareItemPointers(entry->list + entry->number - 1, heapptr);
-
-		Assert(res != 0);
-
-		if (res > 0)
-			entry->shouldSort = TRUE;
-	}
-
-	entry->list[entry->number] = *heapptr;
-	entry->number++;
-}
-
-/*
  * This is basically the same as datumCopy(), but modified to count
  * palloc'd space in accum.
  */
@@ -103,57 +118,38 @@ getDatumCopy(BuildAccumulator *accum, OffsetNumber attnum, Datum value)
 static void
 ginInsertEntry(BuildAccumulator *accum, ItemPointer heapptr, OffsetNumber attnum, Datum entry)
 {
-	EntryAccumulator *ea = accum->entries,
-			   *pea = NULL;
-	int			res = 0;
-	uint32		depth = 1;
+	EntryAccumulator 	*key = EAAllocate(accum),
+						*ea;
 
-	while (ea)
-	{
-		res = compareAttEntries(accum->ginstate, attnum, entry, ea->attnum, ea->value);
-		if (res == 0)
-			break;				/* found */
-		else
-		{
-			pea = ea;
-			if (res < 0)
-				ea = ea->left;
-			else
-				ea = ea->right;
-		}
-		depth++;
-	}
+	key->attnum = attnum;
+	key->value = entry;
+	if (accum->tmpList == NULL)
+		accum->tmpList =
+			(ItemPointerData *) palloc(sizeof(ItemPointerData) * DEF_NPTR);
+	key->list = accum->tmpList;
+	key->list[0] = *heapptr;
 
-	if (depth > accum->maxdepth)
-		accum->maxdepth = depth;
+	ea = rb_insert(accum->tree, key);
 
 	if (ea == NULL)
 	{
-		ea = EAAllocate(accum);
-
-		ea->left = ea->right = NULL;
-		ea->attnum = attnum;
-		ea->value = getDatumCopy(accum, attnum, entry);
-		ea->length = DEF_NPTR;
-		ea->number = 1;
-		ea->shouldSort = FALSE;
-		ea->list = (ItemPointerData *) palloc(sizeof(ItemPointerData) * DEF_NPTR);
-		accum->allocatedMemory += GetMemoryChunkSpace(ea->list);
-		ea->list[0] = *heapptr;
-
-		if (pea == NULL)
-			accum->entries = ea;
-		else
-		{
-			Assert(res != 0);
-			if (res < 0)
-				pea->left = ea;
-			else
-				pea->right = ea;
-		}
+		/*
+		 * The key has been inserted, so continue initialization.
+		 */
+		key->value = getDatumCopy(accum, attnum, entry);
+		key->length = DEF_NPTR;
+		key->number = 1;
+		key->shouldSort = FALSE;
+		accum->allocatedMemory += GetMemoryChunkSpace(key->list);
+		accum->tmpList = NULL;
 	}
 	else
-		ginInsertData(accum, ea, heapptr);
+	{
+		/*
+		 * The key has been appended, so reset
+		 */
+		accum->length--;
+	}
 }
 
 /*
@@ -219,86 +215,16 @@ qsortCompareItemPointers(const void *a, const void *b)
 	return res;
 }
 
-/*
- * walk on binary tree and returns ordered nodes
- */
-static EntryAccumulator *
-walkTree(BuildAccumulator *accum)
-{
-	EntryAccumulator *entry = accum->stack[accum->stackpos];
-
-	if (entry->list != NULL)
-	{
-		/* return entry itself: we already was at left sublink */
-		return entry;
-	}
-	else if (entry->right && entry->right != accum->stack[accum->stackpos + 1])
-	{
-		/* go on right sublink */
-		accum->stackpos++;
-		entry = entry->right;
-
-		/* find most-left value */
-		for (;;)
-		{
-			accum->stack[accum->stackpos] = entry;
-			if (entry->left)
-			{
-				accum->stackpos++;
-				entry = entry->left;
-			}
-			else
-				break;
-		}
-	}
-	else
-	{
-		/* we already return all left subtree, itself and  right subtree */
-		if (accum->stackpos == 0)
-			return 0;
-		accum->stackpos--;
-		return walkTree(accum);
-	}
-
-	return entry;
-}
-
 ItemPointerData *
 ginGetEntry(BuildAccumulator *accum, OffsetNumber *attnum, Datum *value, uint32 *n)
 {
 	EntryAccumulator *entry;
 	ItemPointerData *list;
 
-	if (accum->stack == NULL)
-	{
-		/* first call */
-		accum->stack = palloc0(sizeof(EntryAccumulator *) * (accum->maxdepth + 1));
-		accum->allocatedMemory += GetMemoryChunkSpace(accum->stack);
-		entry = accum->entries;
-
-		if (entry == NULL)
-			return NULL;
-
-		/* find most-left value */
-		for (;;)
-		{
-			accum->stack[accum->stackpos] = entry;
-			if (entry->left)
-			{
-				accum->stackpos++;
-				entry = entry->left;
-			}
-			else
-				break;
-		}
-	}
-	else
-	{
-		accum->allocatedMemory -= GetMemoryChunkSpace(accum->stack[accum->stackpos]->list);
-		pfree(accum->stack[accum->stackpos]->list);
-		accum->stack[accum->stackpos]->list = NULL;
-		entry = walkTree(accum);
-	}
+	if (accum->iterator == NULL)
+		accum->iterator = rb_begin_iterate(accum->tree, LeftRightWalk);
+
+	entry = rb_iterate(accum->iterator);
 
 	if (entry == NULL)
 		return NULL;
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 8d48cdf..3fb4441 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -765,8 +765,7 @@ ginInsertCleanup(Relation index, GinState *ginstate,
 		 */
 		if (GinPageGetOpaque(page)->rightlink == InvalidBlockNumber ||
 			(GinPageHasFullRow(page) &&
-			 (accum.allocatedMemory >= maintenance_work_mem * 1024L ||
-			  accum.maxdepth > GIN_MAX_TREE_DEPTH)))
+			 (accum.allocatedMemory >= maintenance_work_mem * 1024L)))
 		{
 			ItemPointerData *list;
 			uint32		nlist;
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 902e361..97fc417 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -247,9 +247,7 @@ ginBuildCallback(Relation index, HeapTuple htup, Datum *values,
 															&htup->t_self);
 
 	/* If we've maxed out our available memory, dump everything to the index */
-	/* Also dump if the tree seems to be getting too unbalanced */
-	if (buildstate->accum.allocatedMemory >= maintenance_work_mem * 1024L ||
-		buildstate->accum.maxdepth > GIN_MAX_TREE_DEPTH)
+	if (buildstate->accum.allocatedMemory >= maintenance_work_mem * 1024L)
 	{
 		ItemPointerData *list;
 		Datum		entry;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0b61a59..f911c86 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4142,6 +4142,10 @@ readTimeLineHistory(TimeLineID targetTLI)
 	char		fline[MAXPGPATH];
 	FILE	   *fd;
 
+	/* Timeline 1 does not have a history file, so no need to check */
+	if (targetTLI == 1)
+		return list_make1_int((int) targetTLI);
+
 	if (InArchiveRecovery)
 	{
 		TLHistoryFileName(histfname, targetTLI);
@@ -4227,6 +4231,10 @@ existsTimeLineHistory(TimeLineID probeTLI)
 	char		histfname[MAXFNAMELEN];
 	FILE	   *fd;
 
+	/* Timeline 1 does not have a history file, so no need to check */
+	if (probeTLI == 1)
+		return false;
+
 	if (InArchiveRecovery)
 	{
 		TLHistoryFileName(histfname, probeTLI);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3d3ae79..146fa56 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9586,9 +9586,9 @@ func_expr:	func_name '(' ')' over_clause
 			| OVERLAY '(' overlay_list ')'
 				{
 					/* overlay(A PLACING B FROM C FOR D) is converted to
-					 * substring(A, 1, C-1) || B || substring(A, C+1, C+D)
+					 * overlay(A, B, C, D)
 					 * overlay(A PLACING B FROM C) is converted to
-					 * substring(A, 1, C-1) || B || substring(A, C+1, C+char_length(B))
+					 * overlay(A, B, C)
 					 */
 					FuncCall *n = makeNode(FuncCall);
 					n->funcname = SystemFuncName("overlay");
@@ -10150,6 +10150,7 @@ extract_arg:
  * SQL99 defines the OVERLAY() function:
  * o overlay(text placing text from int for int)
  * o overlay(text placing text from int)
+ * and similarly for binary strings
  */
 overlay_list:
 			a_expr overlay_placing substr_from substr_for
diff --git a/src/backend/utils/adt/varbit.c b/src/backend/utils/adt/varbit.c
index 576b872..03aff75 100644
--- a/src/backend/utils/adt/varbit.c
+++ b/src/backend/utils/adt/varbit.c
@@ -23,8 +23,10 @@
 
 #define HEXDIG(z)	 ((z)<10 ? ((z)+'0') : ((z)-10+'A'))
 
+static VarBit *bit_catenate(VarBit *arg1, VarBit *arg2);
 static VarBit *bitsubstring(VarBit *arg, int32 s, int32 l,
 							bool length_not_specified);
+static VarBit *bit_overlay(VarBit *t1, VarBit *t2, int sp, int sl);
 
 
 /* common code for bittypmodin and varbittypmodin */
@@ -877,6 +879,13 @@ bitcat(PG_FUNCTION_ARGS)
 {
 	VarBit	   *arg1 = PG_GETARG_VARBIT_P(0);
 	VarBit	   *arg2 = PG_GETARG_VARBIT_P(1);
+
+	PG_RETURN_VARBIT_P(bit_catenate(arg1, arg2));
+}
+
+static VarBit *
+bit_catenate(VarBit *arg1, VarBit *arg2)
+{
 	VarBit	   *result;
 	int			bitlen1,
 				bitlen2,
@@ -919,7 +928,7 @@ bitcat(PG_FUNCTION_ARGS)
 		}
 	}
 
-	PG_RETURN_VARBIT_P(result);
+	return result;
 }
 
 /* bitsubstr
@@ -1034,6 +1043,67 @@ bitsubstring(VarBit *arg, int32 s, int32 l, bool length_not_specified)
 	return result;
 }
 
+/*
+ * bitoverlay
+ *	Replace specified substring of first string with second
+ *
+ * The SQL standard defines OVERLAY() in terms of substring and concatenation.
+ * This code is a direct implementation of what the standard says.
+ */
+Datum
+bitoverlay(PG_FUNCTION_ARGS)
+{
+	VarBit	   *t1 = PG_GETARG_VARBIT_P(0);
+	VarBit	   *t2 = PG_GETARG_VARBIT_P(1);
+	int			sp = PG_GETARG_INT32(2); /* substring start position */
+	int			sl = PG_GETARG_INT32(3); /* substring length */
+
+	PG_RETURN_VARBIT_P(bit_overlay(t1, t2, sp, sl));
+}
+
+Datum
+bitoverlay_no_len(PG_FUNCTION_ARGS)
+{
+	VarBit	   *t1 = PG_GETARG_VARBIT_P(0);
+	VarBit	   *t2 = PG_GETARG_VARBIT_P(1);
+	int			sp = PG_GETARG_INT32(2); /* substring start position */
+	int			sl;
+
+	sl = VARBITLEN(t2);				/* defaults to length(t2) */
+	PG_RETURN_VARBIT_P(bit_overlay(t1, t2, sp, sl));
+}
+
+static VarBit *
+bit_overlay(VarBit *t1, VarBit *t2, int sp, int sl)
+{
+	VarBit	   *result;
+	VarBit	   *s1;
+	VarBit	   *s2;
+	int			sp_pl_sl;
+
+	/*
+	 * Check for possible integer-overflow cases.  For negative sp,
+	 * throw a "substring length" error because that's what should be
+	 * expected according to the spec's definition of OVERLAY().
+	 */
+	if (sp <= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_SUBSTRING_ERROR),
+				 errmsg("negative substring length not allowed")));
+	sp_pl_sl = sp + sl;
+	if (sp_pl_sl <= sl)
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("integer out of range")));
+
+	s1 = bitsubstring(t1, 1, sp-1, false);
+	s2 = bitsubstring(t1, sp_pl_sl, -1, true);
+	result = bit_catenate(s1, t2);
+	result = bit_catenate(result, s2);
+
+	return result;
+}
+
 /* bitlength, bitoctetlength
  * Return the length of a bit string
  */
@@ -1606,3 +1676,103 @@ bitposition(PG_FUNCTION_ARGS)
 	}
 	PG_RETURN_INT32(0);
 }
+
+
+/*
+ * bitsetbit
+ *
+ * Given an instance of type 'bit' creates a new one with
+ * the Nth bit set to the given value.
+ *
+ * The bit location is specified left-to-right in a zero-based fashion
+ * consistent with the other get_bit and set_bit functions, but
+ * inconsistent with the standard substring, position, overlay functions
+ */
+Datum
+bitsetbit(PG_FUNCTION_ARGS)
+{
+	VarBit	   *arg1 = PG_GETARG_VARBIT_P(0);
+	int32		n = PG_GETARG_INT32(1);
+	int32		newBit = PG_GETARG_INT32(2);
+	VarBit	   *result;
+	int			len,
+				bitlen;
+	bits8	   *r,
+			   *p;
+	int			byteNo,
+				bitNo;
+
+	bitlen = VARBITLEN(arg1);
+	if (n < 0 || n >= bitlen)
+		ereport(ERROR,
+				(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+				 errmsg("bit index %d out of valid range (0..%d)",
+						n, bitlen - 1)));
+	/*
+	 * sanity check!
+	 */
+	if (newBit != 0 && newBit != 1)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("new bit must be 0 or 1")));
+
+	len = VARSIZE(arg1);
+	result = (VarBit *) palloc(len);
+	SET_VARSIZE(result, len);
+	VARBITLEN(result) = bitlen;
+
+	p = VARBITS(arg1);
+	r = VARBITS(result);
+
+	memcpy(r, p, VARBITBYTES(arg1));
+
+	byteNo = n / BITS_PER_BYTE;
+	bitNo = BITS_PER_BYTE - 1 - (n % BITS_PER_BYTE);
+
+	/*
+	 * Update the byte.
+	 */
+	if (newBit == 0)
+		r[byteNo] &= (~(1 << bitNo));
+	else
+		r[byteNo] |= (1 << bitNo);
+
+	PG_RETURN_VARBIT_P(result);
+}
+
+/*
+ * bitgetbit
+ *
+ * returns the value of the Nth bit of a bit array (0 or 1).
+ *
+ * The bit location is specified left-to-right in a zero-based fashion
+ * consistent with the other get_bit and set_bit functions, but
+ * inconsistent with the standard substring, position, overlay functions
+ */
+Datum
+bitgetbit(PG_FUNCTION_ARGS)
+{
+	VarBit	   *arg1 = PG_GETARG_VARBIT_P(0);
+	int32		n = PG_GETARG_INT32(1);
+	int			bitlen;
+	bits8	   *p;
+	int			byteNo,
+				bitNo;
+
+	bitlen = VARBITLEN(arg1);
+	if (n < 0 || n >= bitlen)
+		ereport(ERROR,
+				(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+				 errmsg("bit index %d out of valid range (0..%d)",
+						n, bitlen - 1)));
+
+	p = VARBITS(arg1);
+
+	byteNo = n / BITS_PER_BYTE;
+	bitNo = BITS_PER_BYTE - 1 - (n % BITS_PER_BYTE);
+
+	if (p[byteNo] & (1 << bitNo))
+		PG_RETURN_INT32(1);
+	else
+		PG_RETURN_INT32(0);
+}
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 6ec0fe5..d4174c9 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -60,11 +60,19 @@ static int	text_position(text *t1, text *t2);
 static void text_position_setup(text *t1, text *t2, TextPositionState *state);
 static int	text_position_next(int start_pos, TextPositionState *state);
 static void text_position_cleanup(TextPositionState *state);
+static text *text_catenate(text *t1, text *t2);
 static text *text_substring(Datum str,
 			   int32 start,
 			   int32 length,
 			   bool length_not_specified);
+static text *text_overlay(text *t1, text *t2, int sp, int sl);
 static void appendStringInfoText(StringInfo str, const text *t);
+static bytea *bytea_catenate(bytea *t1, bytea *t2);
+static bytea *bytea_substring(Datum str,
+				int S,
+				int L,
+				bool length_not_specified);
+static bytea *bytea_overlay(bytea *t1, bytea *t2, int sp, int sl);
 
 
 /*****************************************************************************
@@ -559,17 +567,31 @@ textcat(PG_FUNCTION_ARGS)
 {
 	text	   *t1 = PG_GETARG_TEXT_PP(0);
 	text	   *t2 = PG_GETARG_TEXT_PP(1);
+
+	PG_RETURN_TEXT_P(text_catenate(t1, t2));
+}
+
+/*
+ * text_catenate
+ *	Guts of textcat(), broken out so it can be used by other functions
+ *
+ * Arguments can be in short-header form, but not compressed or out-of-line
+ */
+static text *
+text_catenate(text *t1, text *t2)
+{
+	text	   *result;
 	int			len1,
 				len2,
 				len;
-	text	   *result;
 	char	   *ptr;
 
 	len1 = VARSIZE_ANY_EXHDR(t1);
+	len2 = VARSIZE_ANY_EXHDR(t2);
+
+	/* paranoia ... probably should throw error instead? */
 	if (len1 < 0)
 		len1 = 0;
-
-	len2 = VARSIZE_ANY_EXHDR(t2);
 	if (len2 < 0)
 		len2 = 0;
 
@@ -586,7 +608,7 @@ textcat(PG_FUNCTION_ARGS)
 	if (len2 > 0)
 		memcpy(ptr + len1, VARDATA_ANY(t2), len2);
 
-	PG_RETURN_TEXT_P(result);
+	return result;
 }
 
 /*
@@ -866,6 +888,67 @@ text_substring(Datum str, int32 start, int32 length, bool length_not_specified)
 }
 
 /*
+ * textoverlay
+ *	Replace specified substring of first string with second
+ *
+ * The SQL standard defines OVERLAY() in terms of substring and concatenation.
+ * This code is a direct implementation of what the standard says.
+ */
+Datum
+textoverlay(PG_FUNCTION_ARGS)
+{
+	text	   *t1 = PG_GETARG_TEXT_PP(0);
+	text	   *t2 = PG_GETARG_TEXT_PP(1);
+	int			sp = PG_GETARG_INT32(2); /* substring start position */
+	int			sl = PG_GETARG_INT32(3); /* substring length */
+
+	PG_RETURN_TEXT_P(text_overlay(t1, t2, sp, sl));
+}
+
+Datum
+textoverlay_no_len(PG_FUNCTION_ARGS)
+{
+	text	   *t1 = PG_GETARG_TEXT_PP(0);
+	text	   *t2 = PG_GETARG_TEXT_PP(1);
+	int			sp = PG_GETARG_INT32(2); /* substring start position */
+	int			sl;
+
+	sl = text_length(PointerGetDatum(t2)); /* defaults to length(t2) */
+	PG_RETURN_TEXT_P(text_overlay(t1, t2, sp, sl));
+}
+
+static text *
+text_overlay(text *t1, text *t2, int sp, int sl)
+{
+	text	   *result;
+	text	   *s1;
+	text	   *s2;
+	int			sp_pl_sl;
+
+	/*
+	 * Check for possible integer-overflow cases.  For negative sp,
+	 * throw a "substring length" error because that's what should be
+	 * expected according to the spec's definition of OVERLAY().
+	 */
+	if (sp <= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_SUBSTRING_ERROR),
+				 errmsg("negative substring length not allowed")));
+	sp_pl_sl = sp + sl;
+	if (sp_pl_sl <= sl)
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("integer out of range")));
+
+	s1 = text_substring(PointerGetDatum(t1), 1, sp-1, false);
+	s2 = text_substring(PointerGetDatum(t1), sp_pl_sl, -1, true);
+	result = text_catenate(s1, t2);
+	result = text_catenate(result, s2);
+
+	return result;
+}
+
+/*
  * textpos -
  *	  Return the position of the specified substring.
  *	  Implements the SQL92 POSITION() function.
@@ -1640,17 +1723,31 @@ byteacat(PG_FUNCTION_ARGS)
 {
 	bytea	   *t1 = PG_GETARG_BYTEA_PP(0);
 	bytea	   *t2 = PG_GETARG_BYTEA_PP(1);
+
+	PG_RETURN_BYTEA_P(bytea_catenate(t1, t2));
+}
+
+/*
+ * bytea_catenate
+ *	Guts of byteacat(), broken out so it can be used by other functions
+ *
+ * Arguments can be in short-header form, but not compressed or out-of-line
+ */
+static bytea *
+bytea_catenate(bytea *t1, bytea *t2)
+{
+	bytea	   *result;
 	int			len1,
 				len2,
 				len;
-	bytea	   *result;
 	char	   *ptr;
 
 	len1 = VARSIZE_ANY_EXHDR(t1);
+	len2 = VARSIZE_ANY_EXHDR(t2);
+
+	/* paranoia ... probably should throw error instead? */
 	if (len1 < 0)
 		len1 = 0;
-
-	len2 = VARSIZE_ANY_EXHDR(t2);
 	if (len2 < 0)
 		len2 = 0;
 
@@ -1667,7 +1764,7 @@ byteacat(PG_FUNCTION_ARGS)
 	if (len2 > 0)
 		memcpy(ptr + len1, VARDATA_ANY(t2), len2);
 
-	PG_RETURN_BYTEA_P(result);
+	return result;
 }
 
 #define PG_STR_GET_BYTEA(str_) \
@@ -1691,16 +1788,41 @@ byteacat(PG_FUNCTION_ARGS)
 Datum
 bytea_substr(PG_FUNCTION_ARGS)
 {
-	int			S = PG_GETARG_INT32(1); /* start position */
+	PG_RETURN_BYTEA_P(bytea_substring(PG_GETARG_DATUM(0),
+									  PG_GETARG_INT32(1),
+									  PG_GETARG_INT32(2),
+									  false));
+}
+
+/*
+ * bytea_substr_no_len -
+ *	  Wrapper to avoid opr_sanity failure due to
+ *	  one function accepting a different number of args.
+ */
+Datum
+bytea_substr_no_len(PG_FUNCTION_ARGS)
+{
+	PG_RETURN_BYTEA_P(bytea_substring(PG_GETARG_DATUM(0),
+									  PG_GETARG_INT32(1),
+									  -1,
+									  true));
+}
+
+static bytea *
+bytea_substring(Datum str,
+				int S,
+				int L,
+				bool length_not_specified)
+{
 	int			S1;				/* adjusted start position */
 	int			L1;				/* adjusted substring length */
 
 	S1 = Max(S, 1);
 
-	if (fcinfo->nargs == 2)
+	if (length_not_specified)
 	{
 		/*
-		 * Not passed a length - PG_GETARG_BYTEA_P_SLICE() grabs everything to
+		 * Not passed a length - DatumGetByteaPSlice() grabs everything to
 		 * the end of the string if we pass it a negative value for length.
 		 */
 		L1 = -1;
@@ -1708,7 +1830,7 @@ bytea_substr(PG_FUNCTION_ARGS)
 	else
 	{
 		/* end position */
-		int			E = S + PG_GETARG_INT32(2);
+		int			E = S + L;
 
 		/*
 		 * A negative value for L is the only way for the end position to be
@@ -1725,28 +1847,78 @@ bytea_substr(PG_FUNCTION_ARGS)
 		 * string.
 		 */
 		if (E < 1)
-			PG_RETURN_BYTEA_P(PG_STR_GET_BYTEA(""));
+			return PG_STR_GET_BYTEA("");
 
 		L1 = E - S1;
 	}
 
 	/*
 	 * If the start position is past the end of the string, SQL99 says to
-	 * return a zero-length string -- PG_GETARG_TEXT_P_SLICE() will do that
+	 * return a zero-length string -- DatumGetByteaPSlice() will do that
 	 * for us. Convert to zero-based starting position
 	 */
-	PG_RETURN_BYTEA_P(PG_GETARG_BYTEA_P_SLICE(0, S1 - 1, L1));
+	return DatumGetByteaPSlice(str, S1 - 1, L1);
 }
 
 /*
- * bytea_substr_no_len -
- *	  Wrapper to avoid opr_sanity failure due to
- *	  one function accepting a different number of args.
+ * byteaoverlay
+ *	Replace specified substring of first string with second
+ *
+ * The SQL standard defines OVERLAY() in terms of substring and concatenation.
+ * This code is a direct implementation of what the standard says.
  */
 Datum
-bytea_substr_no_len(PG_FUNCTION_ARGS)
+byteaoverlay(PG_FUNCTION_ARGS)
 {
-	return bytea_substr(fcinfo);
+	bytea	   *t1 = PG_GETARG_BYTEA_PP(0);
+	bytea	   *t2 = PG_GETARG_BYTEA_PP(1);
+	int			sp = PG_GETARG_INT32(2); /* substring start position */
+	int			sl = PG_GETARG_INT32(3); /* substring length */
+
+	PG_RETURN_BYTEA_P(bytea_overlay(t1, t2, sp, sl));
+}
+
+Datum
+byteaoverlay_no_len(PG_FUNCTION_ARGS)
+{
+	bytea	   *t1 = PG_GETARG_BYTEA_PP(0);
+	bytea	   *t2 = PG_GETARG_BYTEA_PP(1);
+	int			sp = PG_GETARG_INT32(2); /* substring start position */
+	int			sl;
+
+	sl = VARSIZE_ANY_EXHDR(t2);			/* defaults to length(t2) */
+	PG_RETURN_BYTEA_P(bytea_overlay(t1, t2, sp, sl));
+}
+
+static bytea *
+bytea_overlay(bytea *t1, bytea *t2, int sp, int sl)
+{
+	bytea	   *result;
+	bytea	   *s1;
+	bytea	   *s2;
+	int			sp_pl_sl;
+
+	/*
+	 * Check for possible integer-overflow cases.  For negative sp,
+	 * throw a "substring length" error because that's what should be
+	 * expected according to the spec's definition of OVERLAY().
+	 */
+	if (sp <= 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_SUBSTRING_ERROR),
+				 errmsg("negative substring length not allowed")));
+	sp_pl_sl = sp + sl;
+	if (sp_pl_sl <= sl)
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("integer out of range")));
+
+	s1 = bytea_substring(PointerGetDatum(t1), 1, sp-1, false);
+	s2 = bytea_substring(PointerGetDatum(t1), sp_pl_sl, -1, true);
+	result = bytea_catenate(s1, t2);
+	result = bytea_catenate(result, s2);
+
+	return result;
 }
 
 /*
diff --git a/src/backend/utils/misc/Makefile b/src/backend/utils/misc/Makefile
index 03a155c..e8866df 100644
--- a/src/backend/utils/misc/Makefile
+++ b/src/backend/utils/misc/Makefile
@@ -14,7 +14,8 @@ include $(top_builddir)/src/Makefile.global
 
 override CPPFLAGS := -I. -I$(srcdir) $(CPPFLAGS)
 
-OBJS = guc.o help_config.o pg_rusage.o ps_status.o superuser.o tzparser.o
+OBJS = guc.o help_config.o pg_rusage.o ps_status.o superuser.o tzparser.o \
+       rbtree.o
 
 # This location might depend on the installation directories. Therefore
 # we can't subsitute it into pg_config.h.
diff --git a/src/backend/utils/misc/rbtree.c b/src/backend/utils/misc/rbtree.c
new file mode 100644
index 0000000..b10c669
--- /dev/null
+++ b/src/backend/utils/misc/rbtree.c
@@ -0,0 +1,814 @@
+/*-------------------------------------------------------------------------
+ *
+ * rbtree.c
+ *	  implementation for PostgreSQL generic Red-Black binary tree package
+ *	  Adopted from http://algolist.manual.ru/ds/rbtree.php
+ *
+ * This code comes from Thomas Niemann's "Sorting and Searching Algorithms:
+ * a Cookbook".
+ *
+ * See http://www.cs.auckland.ac.nz/software/AlgAnim/niemann/s_man.htm for
+ * license terms: "Source code, when part of a software project, may be used
+ * freely without reference to the author."
+ *
+ * Red-black trees are a type of balanced binary tree wherein (1) any child of
+ * a red node is always black, and (2) every path from root to leaf traverses
+ * an equal number of black nodes.  From these properties, it follows that the
+ * longest path from root to leaf is only about twice as long as the shortest,
+ * so lookups are guaranteed to run in O(lg n) time.
+ *
+ * Copyright (c) 1996-2009, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  $PostgreSQL: pgsql/src/backend/nodes/rbtree.c,v 1.69 2008/01/01 19:45:50 momjian Exp $
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "utils/rbtree.h"
+
+/**********************************************************************
+ *						 Declarations								  *
+ **********************************************************************/
+
+/*
+ * Values for RBNode->iteratorState
+ */
+#define InitialState 	(0)
+#define FirstStepDone	(1)
+#define SecondStepDone	(2)
+#define ThirdStepDone	(3)
+
+/*
+ * Colors of node
+ */
+#define BLACK		(0)
+#define RED			(1)
+
+typedef struct RBNode
+{
+	uint32		iteratorState:2,
+				color:	1 ,
+				unused: 29;
+	struct RBNode *left;
+	struct RBNode *right;
+	struct RBNode *parent;
+	void	   *data;
+}	RBNode;
+
+struct RBTree
+{
+	RBNode	   *root;
+	rb_comparator comparator;
+	rb_appendator appendator;
+	rb_freefunc freefunc;
+	void	   *arg;
+};
+
+struct RBTreeIterator
+{
+	RBNode	   *node;
+	void	   *(*iterate) (RBTreeIterator *iterator);
+};
+
+/*
+ * all leafs are sentinels, use castimized NIL name to prevent
+ * collision with sytem-wide NIL which is actually NULL
+ */
+#define RBNIL &sentinel
+
+RBNode		sentinel = {InitialState, BLACK, 0, RBNIL, RBNIL, NULL, NULL};
+
+/**********************************************************************
+ *						  Create/free								  *
+ **********************************************************************/
+
+RBTree *
+rb_create(rb_comparator comparator, rb_appendator appendator,
+				  rb_freefunc freefunc, void *arg)
+{
+	RBTree	   *tree = palloc(sizeof(RBTree));
+
+	tree->root = RBNIL;
+	tree->comparator = comparator;
+	tree->appendator = appendator;
+	tree->freefunc = freefunc;
+	tree->arg = arg;
+
+	return tree;
+}
+
+static void
+rb_free_recursive(RBNode *node, rb_freefunc freefunc)
+{
+	if (node->left != RBNIL)
+		rb_free_recursive(node->left, freefunc);
+	if (node->right != RBNIL)
+		rb_free_recursive(node->right, freefunc);
+	if (freefunc && node->data)
+		freefunc(node->data);
+	pfree(node);
+}
+
+void
+rb_free(RBTree *rb)
+{
+	if (!rb)
+		return;
+
+	if (rb->root != RBNIL)
+		rb_free_recursive(rb->root, rb->freefunc);
+
+	pfree(rb);
+}
+
+/**********************************************************************
+ *						  Search									  *
+ **********************************************************************/
+
+void *
+rb_find(RBTree *rb, void *data)
+{
+	RBNode	   *node = rb->root;
+	int			cmp;
+
+	while (node != RBNIL)
+	{
+		cmp = rb->comparator(data, node->data, rb->arg);
+
+		if (cmp == 0)
+			return node->data;
+		else if (cmp < 0)
+			node = node->left;
+		else
+			node = node->right;
+	}
+
+	return NULL;
+}
+
+/**********************************************************************
+ *							  Insertion								  *
+ **********************************************************************/
+
+/*
+ * Rotate node x to left.
+ *
+ * x's right child takes its place in the tree, and x becomes the left
+ * child of that node.
+ */
+static void
+rb_rotate_left(RBTree *rb, RBNode *x)
+{
+	RBNode	   *y = x->right;
+
+	/* establish x->right link */
+	x->right = y->left;
+	if (y->left != RBNIL)
+		y->left->parent = x;
+
+	/* establish y->parent link */
+	if (y != RBNIL)
+		y->parent = x->parent;
+	if (x->parent)
+	{
+		if (x == x->parent->left)
+			x->parent->left = y;
+		else
+			x->parent->right = y;
+	}
+	else
+	{
+		rb->root = y;
+	}
+
+	/* link x and y */
+	y->left = x;
+	if (x != RBNIL)
+		x->parent = y;
+}
+
+/*
+ * Rotate node x to right.
+ *
+ * x's left right child takes its place in the tree, and x becomes the right
+ * child of that node.
+ */
+static void
+rb_rotate_right(RBTree *rb, RBNode *x)
+{
+	RBNode	   *y = x->left;
+
+	/* establish x->left link */
+	x->left = y->right;
+	if (y->right != RBNIL)
+		y->right->parent = x;
+
+	/* establish y->parent link */
+	if (y != RBNIL)
+		y->parent = x->parent;
+	if (x->parent)
+	{
+		if (x == x->parent->right)
+			x->parent->right = y;
+		else
+			x->parent->left = y;
+	}
+	else
+	{
+		rb->root = y;
+	}
+
+	/* link x and y */
+	y->right = x;
+	if (x != RBNIL)
+		x->parent = y;
+}
+
+/*
+ * Maintain Red-Black tree balance after inserting node x.
+ *
+ * The newly inserted node is always initially marked red.  That may lead to
+ * a situation where a red node has a red child, which is prohibited.  We can
+ * always fix the problem by a series of color changes and/or "rotations",
+ * which move the problem progressively higher up in the tree.  If one of the
+ * two red nodes is the root, we can always fix the problem by changing the
+ * root from red to black.
+ *
+ * (This does not work lower down in the tree because we must also maintain
+ * the invariant that every leaf has equal black-height.)
+ */
+static void
+rb_insert_fixup(RBTree *rb, RBNode *x)
+{
+	/*
+	 * x is always a red node.  Initially, it is the newly inserted node.
+	 * Each iteration of this loop moves it higher up in the tree.
+	 */
+	while (x != rb->root && x->parent->color == RED)
+	{
+		/*
+		 * x and x->parent are both red.  Fix depends on whether x->parent is
+		 * a left or right child.  In either case, we define y to be the
+		 * "uncle" of x, that is, the other child of x's grandparent.
+		 *
+		 * If the uncle is red, we flip the grandparent to red and its two
+		 * children to black.  Then we loop around again to check whether the
+		 * grandparent still has a problem.
+		 *
+		 * If the uncle is black, we will perform one or two "rotations" to
+		 * balance the tree.  Either x or x->parent will take the grandparent's
+		 * position in the tree and recolored black, and the original
+		 * grandparent will be recolored red and become a child of that node.
+		 * This always leaves us with a valid red-black tree, so the loop
+		 * will terminate.
+		 */
+		if (x->parent == x->parent->parent->left)
+		{
+			RBNode	   *y = x->parent->parent->right;
+
+			if (y->color == RED)
+			{
+				/* uncle is RED */
+				x->parent->color = BLACK;
+				y->color = BLACK;
+				x->parent->parent->color = RED;
+				x = x->parent->parent;
+			}
+			else
+			{
+				/* uncle is BLACK */
+				if (x == x->parent->right)
+				{
+					/* make x a left child */
+					x = x->parent;
+					rb_rotate_left(rb, x);
+				}
+
+				/* recolor and rotate */
+				x->parent->color = BLACK;
+				x->parent->parent->color = RED;
+				rb_rotate_right(rb, x->parent->parent);
+			}
+		}
+		else
+		{
+			/* mirror image of above code */
+			RBNode	   *y = x->parent->parent->left;
+
+			if (y->color == RED)
+			{
+				/* uncle is RED */
+				x->parent->color = BLACK;
+				y->color = BLACK;
+				x->parent->parent->color = RED;
+				x = x->parent->parent;
+			}
+			else
+			{
+				/* uncle is BLACK */
+				if (x == x->parent->left)
+				{
+					x = x->parent;
+					rb_rotate_right(rb, x);
+				}
+				x->parent->color = BLACK;
+				x->parent->parent->color = RED;
+				rb_rotate_left(rb, x->parent->parent);
+			}
+		}
+	}
+
+	/*
+	 * The root may already have been black; if not, the black-height of every
+	 * node in the tree increases by one.
+	 */
+	rb->root->color = BLACK;
+}
+
+/*
+ * Allocate node for data and insert in tree.
+ *
+ * Return old data (or result of appendator method) if it exists and NULL
+ * otherwise.
+ */
+void *
+rb_insert(RBTree *rb, void *data)
+{
+	RBNode	   *current,
+			   *parent,
+			   *x;
+	int			cmp;
+
+	/* find where node belongs */
+	current = rb->root;
+	parent = NULL;
+	while (current != RBNIL)
+	{
+		cmp = rb->comparator(data, current->data, rb->arg);
+		if (cmp == 0)
+		{
+			/*
+			 * Found node with given key.  If appendator method is provided,
+			 * call it to join old and new data; else, new data replaces old
+			 * data.
+			 */
+			if (rb->appendator)
+			{
+				current->data = rb->appendator(current->data, data, rb->arg);
+				return current->data;
+			}
+			else
+			{
+				void	   *old = current->data;
+
+				current->data = data;
+				return old;
+			}
+		}
+		parent = current;
+		current = (cmp < 0) ? current->left : current->right;
+	}
+
+	/* setup new node in tree */
+	x = palloc(sizeof(RBNode));
+	x->data = data;
+	x->parent = parent;
+	x->left = RBNIL;
+	x->right = RBNIL;
+	x->color = RED;
+	x->iteratorState = InitialState;
+
+	/* insert node in tree */
+	if (parent)
+	{
+		if (cmp < 0)
+			parent->left = x;
+		else
+			parent->right = x;
+	}
+	else
+	{
+		rb->root = x;
+	}
+
+	rb_insert_fixup(rb, x);
+	return NULL;
+}
+
+/**********************************************************************
+ *							Deletion								  *
+ **********************************************************************/
+
+/*
+ * Maintain Red-Black tree balance after deleting a black node.
+ */
+static void
+rb_delete_fixup(RBTree *rb, RBNode *x)
+{
+	/*
+	 * x is always a black node.  Initially, it is the former child of the
+	 * deleted node.  Each iteration of this loop moves it higher up in the
+	 * tree.
+	 */
+	while (x != rb->root && x->color == BLACK)
+	{
+		/*
+		 * Left and right cases are symmetric.  Any nodes that are children
+		 * of x have a black-height one less than the remainder of the nodes
+		 * in the tree.  We rotate and recolor nodes to move the problem up
+		 * the tree: at some stage we'll either fix the problem, or reach the
+		 * root (where the black-height is allowed to decrease).
+		 */
+		if (x == x->parent->left)
+		{
+			RBNode	   *w = x->parent->right;
+
+			if (w->color == RED)
+			{
+				w->color = BLACK;
+				x->parent->color = RED;
+				rb_rotate_left(rb, x->parent);
+				w = x->parent->right;
+			}
+
+			if (w->left->color == BLACK && w->right->color == BLACK)
+			{
+				w->color = RED;
+				x = x->parent;
+			}
+			else
+			{
+				if (w->right->color == BLACK)
+				{
+					w->left->color = BLACK;
+					w->color = RED;
+					rb_rotate_right(rb, w);
+					w = x->parent->right;
+				}
+				w->color = x->parent->color;
+				x->parent->color = BLACK;
+				w->right->color = BLACK;
+				rb_rotate_left(rb, x->parent);
+				x = rb->root;		/* Arrange for loop to terminate. */
+			}
+		}
+		else
+		{
+			RBNode	   *w = x->parent->left;
+
+			if (w->color == RED)
+			{
+				w->color = BLACK;
+				x->parent->color = RED;
+				rb_rotate_right(rb, x->parent);
+				w = x->parent->left;
+			}
+
+			if (w->right->color == BLACK && w->left->color == BLACK)
+			{
+				w->color = RED;
+				x = x->parent;
+			}
+			else
+			{
+				if (w->left->color == BLACK)
+				{
+					w->right->color = BLACK;
+					w->color = RED;
+					rb_rotate_left(rb, w);
+					w = x->parent->left;
+				}
+				w->color = x->parent->color;
+				x->parent->color = BLACK;
+				w->left->color = BLACK;
+				rb_rotate_right(rb, x->parent);
+				x = rb->root;		/* Arrange for loop to terminate. */
+			}
+		}
+	}
+	x->color = BLACK;
+}
+
+/*
+ * Delete node z from tree.
+ */
+static void
+rb_delete_node(RBTree *rb, RBNode *z)
+{
+	RBNode	   *x,
+			   *y;
+
+	if (!z || z == RBNIL)
+		return;
+
+	/*
+	 * y is the node that will actually be removed from the tree.  This will
+	 * be z if z has fewer than two children, or the tree successor of z
+	 * otherwise.
+	 */
+	if (z->left == RBNIL || z->right == RBNIL)
+	{
+		/* y has a RBNIL node as a child */
+		y = z;
+	}
+	else
+	{
+		/* find tree successor */
+		y = z->right;
+		while (y->left != RBNIL)
+			y = y->left;
+	}
+
+	/* x is y's only child */
+	if (y->left != RBNIL)
+		x = y->left;
+	else
+		x = y->right;
+
+	/* Remove y from the tree. */
+	x->parent = y->parent;
+	if (y->parent)
+	{
+		if (y == y->parent->left)
+			y->parent->left = x;
+		else
+			y->parent->right = x;
+	}
+	else
+	{
+		rb->root = x;
+	}
+
+	/*
+	 * If we removed the tree successor of z rather than z itself, then
+	 * attach the data for the removed node to the one we were supposed to
+	 * remove.
+	 */
+	if (y != z)
+		z->data = y->data;
+
+	/*
+	 * Removing a black node might make some paths from root to leaf contain
+	 * fewer black nodes than others, or it might make two red nodes adjacent.
+	 */
+	if (y->color == BLACK)
+		rb_delete_fixup(rb, x);
+
+	pfree(y);
+}
+
+extern void
+rb_delete(RBTree *rb, void *data)
+{
+	RBNode	   *node = rb->root;
+	int			cmp;
+
+	while (node != RBNIL)
+	{
+		cmp = rb->comparator(data, node->data, rb->arg);
+
+		if (cmp == 0)
+		{
+			/* found node to delete */
+			if (rb->freefunc)
+				rb->freefunc(node->data);
+			node->data = NULL;
+			rb_delete_node(rb, node);
+			return;
+		}
+		else if (cmp < 0)
+			node = node->left;
+		else
+			node = node->right;
+	}
+}
+
+/*
+ * Return data on left most node and delete
+ * that node
+ */
+extern void *
+rb_leftmost(RBTree *rb)
+{
+	RBNode	   *node = rb->root;
+	RBNode	   *leftmost = rb->root;
+	void	   *res = NULL;
+
+	while (node != RBNIL)
+	{
+		leftmost = node;
+		node = node->left;
+	}
+
+	if (leftmost != RBNIL)
+	{
+		res = leftmost->data;
+		leftmost->data = NULL;
+		rb_delete_node(rb, leftmost);
+	}
+
+	return res;
+}
+
+/**********************************************************************
+ *						  Traverse									  *
+ **********************************************************************/
+
+static void *
+rb_next_node(RBTreeIterator *iterator, RBNode *node)
+{
+	node->iteratorState = InitialState;
+	iterator->node = node;
+	return iterator->iterate(iterator);
+}
+
+static void *
+rb_left_right_iterator(RBTreeIterator *iterator)
+{
+	RBNode	   *node = iterator->node;
+
+	switch (node->iteratorState)
+	{
+		case InitialState:
+			if (node->left != RBNIL)
+			{
+				node->iteratorState = FirstStepDone;
+				return rb_next_node(iterator, node->left);
+			}
+		case FirstStepDone:
+			node->iteratorState = SecondStepDone;
+			return node->data;
+		case SecondStepDone:
+			if (node->right != RBNIL)
+			{
+				node->iteratorState = ThirdStepDone;
+				return rb_next_node(iterator, node->right);
+			}
+		case ThirdStepDone:
+			if (node->parent)
+			{
+				iterator->node = node->parent;
+				return iterator->iterate(iterator);
+			}
+			break;
+		default:
+			elog(ERROR, "Unknow node state: %d", node->iteratorState);
+	}
+
+	return NULL;
+}
+
+static void *
+rb_right_left_iterator(RBTreeIterator *iterator)
+{
+	RBNode	   *node = iterator->node;
+
+	switch (node->iteratorState)
+	{
+		case InitialState:
+			if (node->right != RBNIL)
+			{
+				node->iteratorState = FirstStepDone;
+				return rb_next_node(iterator, node->right);
+			}
+		case FirstStepDone:
+			node->iteratorState = SecondStepDone;
+			return node->data;
+		case SecondStepDone:
+			if (node->left != RBNIL)
+			{
+				node->iteratorState = ThirdStepDone;
+				return rb_next_node(iterator, node->left);
+			}
+		case ThirdStepDone:
+			if (node->parent)
+			{
+				iterator->node = node->parent;
+				return iterator->iterate(iterator);
+			}
+			break;
+		default:
+			elog(ERROR, "Unknow node state: %d", node->iteratorState);
+	}
+
+	return NULL;
+}
+
+static void *
+rb_direct_iterator(RBTreeIterator *iterator)
+{
+	RBNode	   *node = iterator->node;
+
+	switch (node->iteratorState)
+	{
+		case InitialState:
+			node->iteratorState = FirstStepDone;
+			return node->data;
+		case FirstStepDone:
+			if (node->left != RBNIL)
+			{
+				node->iteratorState = SecondStepDone;
+				return rb_next_node(iterator, node->left);
+			}
+		case SecondStepDone:
+			if (node->right != RBNIL)
+			{
+				node->iteratorState = ThirdStepDone;
+				return rb_next_node(iterator, node->right);
+			}
+		case ThirdStepDone:
+			if (node->parent)
+			{
+				iterator->node = node->parent;
+				return iterator->iterate(iterator);
+			}
+			break;
+		default:
+			elog(ERROR, "Unknow node state: %d", node->iteratorState);
+	}
+
+	return NULL;
+}
+
+static void *
+rb_inverted_iterator(RBTreeIterator *iterator)
+{
+	RBNode	   *node = iterator->node;
+
+	switch (node->iteratorState)
+	{
+		case InitialState:
+			if (node->left != RBNIL)
+			{
+				node->iteratorState = FirstStepDone;
+				return rb_next_node(iterator, node->left);
+			}
+		case FirstStepDone:
+			if (node->right != RBNIL)
+			{
+				node->iteratorState = SecondStepDone;
+				return rb_next_node(iterator, node->right);
+			}
+		case SecondStepDone:
+			node->iteratorState = ThirdStepDone;
+			return node->data;
+		case ThirdStepDone:
+			if (node->parent)
+			{
+				iterator->node = node->parent;
+				return iterator->iterate(iterator);
+			}
+			break;
+		default:
+			elog(ERROR, "Unknow node state: %d", node->iteratorState);
+	}
+
+	return NULL;
+}
+
+RBTreeIterator *
+rb_begin_iterate(RBTree *rb, RBOrderControl ctrl)
+{
+	RBTreeIterator *iterator = palloc(sizeof(RBTreeIterator));
+
+	iterator->node = rb->root;
+	if (iterator->node != RBNIL)
+		iterator->node->iteratorState = InitialState;
+
+	switch (ctrl)
+	{
+		case LeftRightWalk:			/* visit left, then self, then right */
+			iterator->iterate = rb_left_right_iterator;
+			break;
+		case RightLeftWalk:			/* visit right, then self, then left */
+			iterator->iterate = rb_right_left_iterator;
+			break;
+		case DirectWalk:			/* visit self, then left, then right */
+			iterator->iterate = rb_direct_iterator;
+			break;
+		case InvertedWalk:			/* visit left, then right, then self */
+			iterator->iterate = rb_inverted_iterator;
+			break;
+		default:
+			elog(ERROR, "Unknown iterator order: %d", ctrl);
+	}
+
+	return iterator;
+}
+
+void *
+rb_iterate(RBTreeIterator *iterator)
+{
+	if (iterator->node == RBNIL)
+		return NULL;
+
+	return iterator->iterate(iterator);
+}
+
+void
+rb_free_iterator(RBTreeIterator *iterator)
+{
+	pfree(iterator);
+}
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index cb2ae9a..2a84895 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1882,6 +1882,11 @@ psql_completion(char *text, int start, int end)
 		COMPLETE_WITH_LIST(list_PREPARE);
 	}
 
+/*
+ * PREPARE TRANSACTION is missing on purpose. It's intended for transaction
+ * managers, not for manual use in interactive sessions.
+ */
+
 /* REASSIGN OWNED BY xxx TO yyy */
 	else if (pg_strcasecmp(prev_wd, "REASSIGN") == 0)
 		COMPLETE_WITH_CONST("OWNED");
diff --git a/src/include/access/gin.h b/src/include/access/gin.h
index b96ff95..fcc5371 100644
--- a/src/include/access/gin.h
+++ b/src/include/access/gin.h
@@ -13,6 +13,7 @@
 #include "access/genam.h"
 #include "access/itup.h"
 #include "access/xlog.h"
+#include "utils/rbtree.h"
 #include "fmgr.h"
 
 
@@ -27,14 +28,6 @@
 #define GINNProcs					   5
 
 /*
- * Max depth allowed in search tree during bulk inserts.  This is to keep from
- * degenerating to O(N^2) behavior when the tree is unbalanced due to sorted
- * or nearly-sorted input.	(Perhaps it would be better to use a balanced-tree
- * algorithm, but in common cases that would only add useless overhead.)
- */
-#define GIN_MAX_TREE_DEPTH 100
-
-/*
  * Page opaque data in a inverted index page.
  *
  * Note: GIN does not include a page ID word as do the other index types.
@@ -571,26 +564,22 @@ extern Datum ginarrayconsistent(PG_FUNCTION_ARGS);
 typedef struct EntryAccumulator
 {
 	OffsetNumber attnum;
+	bool		shouldSort;
 	Datum		value;
 	uint32		length;
 	uint32		number;
 	ItemPointerData *list;
-	bool		shouldSort;
-	struct EntryAccumulator *left;
-	struct EntryAccumulator *right;
 } EntryAccumulator;
 
 typedef struct
 {
 	GinState   *ginstate;
-	EntryAccumulator *entries;
-	uint32		maxdepth;
-	EntryAccumulator **stack;
-	uint32		stackpos;
 	long		allocatedMemory;
-
 	uint32		length;
-	EntryAccumulator *entryallocator;
+	EntryAccumulator   *entryallocator;
+	ItemPointerData	   *tmpList;
+	RBTree	   *tree;
+	RBTreeIterator *iterator;
 } BuildAccumulator;
 
 extern void ginInitBA(BuildAccumulator *accum);
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index 30a5033..70965b8 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -53,6 +53,6 @@
  */
 
 /*							yyyymmddN */
-#define CATALOG_VERSION_NO	201001222
+#define CATALOG_VERSION_NO	201001251
 
 #endif
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index c4320e5..5ee7443 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -957,6 +957,10 @@ DATA(insert OID = 723 (  get_bit		   PGNSP PGUID 12 1 0 0 f f f t f i 2 0 23 "17
 DESCR("get bit");
 DATA(insert OID = 724 (  set_bit		   PGNSP PGUID 12 1 0 0 f f f t f i 3 0 17 "17 23 23" _null_ _null_ _null_ _null_	byteaSetBit _null_ _null_ _null_ ));
 DESCR("set bit");
+DATA(insert OID = 749 (  overlay		   PGNSP PGUID 12 1 0 0 f f f t f i 4 0 17 "17 17 23 23" _null_ _null_ _null_ _null_	byteaoverlay _null_ _null_ _null_ ));
+DESCR("substitute portion of string");
+DATA(insert OID = 752 (  overlay		   PGNSP PGUID 12 1 0 0 f f f t f i 3 0 17 "17 17 23" _null_ _null_ _null_ _null_	byteaoverlay_no_len _null_ _null_ _null_ ));
+DESCR("substitute portion of string");
 
 DATA(insert OID = 725 (  dist_pl		   PGNSP PGUID 12 1 0 0 f f f t f i 2 0 701 "600 628" _null_ _null_ _null_ _null_	dist_pl _null_ _null_ _null_ ));
 DESCR("distance between point and line");
@@ -1832,9 +1836,9 @@ DESCR("current schema name");
 DATA(insert OID = 1403 (  current_schemas	PGNSP PGUID 12 1 0 0 f f f t f s 1 0 1003 "16" _null_ _null_ _null_ _null_	current_schemas _null_ _null_ _null_ ));
 DESCR("current schema search list");
 
-DATA(insert OID = 1404 (  overlay			PGNSP PGUID 14 1 0 0 f f f t f i 4 0 25 "25 25 23 23" _null_ _null_ _null_ _null_	"select pg_catalog.substring($1, 1, ($3 - 1)) || $2 || pg_catalog.substring($1, ($3 + $4))" _null_ _null_ _null_ ));
+DATA(insert OID = 1404 (  overlay			PGNSP PGUID 12 1 0 0 f f f t f i 4 0 25 "25 25 23 23" _null_ _null_ _null_ _null_	textoverlay _null_ _null_ _null_ ));
 DESCR("substitute portion of string");
-DATA(insert OID = 1405 (  overlay			PGNSP PGUID 14 1 0 0 f f f t f i 3 0 25 "25 25 23" _null_ _null_ _null_ _null_	"select pg_catalog.substring($1, 1, ($3 - 1)) || $2 || pg_catalog.substring($1, ($3 + pg_catalog.char_length($2)))" _null_ _null_ _null_ ));
+DATA(insert OID = 1405 (  overlay			PGNSP PGUID 12 1 0 0 f f f t f i 3 0 25 "25 25 23" _null_ _null_ _null_ _null_	textoverlay_no_len _null_ _null_ _null_ ));
 DESCR("substitute portion of string");
 
 DATA(insert OID = 1406 (  isvertical		PGNSP PGUID 12 1 0 0 f f f t f i 2 0 16 "600 600" _null_ _null_ _null_ _null_  point_vert _null_ _null_ _null_ ));
@@ -2402,9 +2406,17 @@ DESCR("adjust varbit() to typmod length");
 
 DATA(insert OID = 1698 (  position		   PGNSP PGUID 12 1 0 0 f f f t f i 2 0 23 "1560 1560" _null_ _null_ _null_ _null_ bitposition _null_ _null_ _null_ ));
 DESCR("return position of sub-bitstring");
-DATA(insert OID = 1699 (  substring			PGNSP PGUID 12 1 0 0 f f f t f i 2 0 1560 "1560 23" _null_ _null_ _null_ _null_ bitsubstr_no_len _null_ _null_ _null_ ));
+DATA(insert OID = 1699 (  substring		   PGNSP PGUID 12 1 0 0 f f f t f i 2 0 1560 "1560 23" _null_ _null_ _null_ _null_ bitsubstr_no_len _null_ _null_ _null_ ));
 DESCR("return portion of bitstring");
 
+DATA(insert OID = 3030 (  overlay		   PGNSP PGUID 12 1 0 0 f f f t f i 4 0 1560 "1560 1560 23 23" _null_ _null_ _null_ _null_	bitoverlay _null_ _null_ _null_ ));
+DESCR("substitute portion of bitstring");
+DATA(insert OID = 3031 (  overlay		   PGNSP PGUID 12 1 0 0 f f f t f i 3 0 1560 "1560 1560 23" _null_ _null_ _null_ _null_	bitoverlay_no_len _null_ _null_ _null_ ));
+DESCR("substitute portion of bitstring");
+DATA(insert OID = 3032 (  get_bit		   PGNSP PGUID 12 1 0 0 f f f t f i 2 0 23 "1560 23" _null_ _null_ _null_ _null_ bitgetbit _null_ _null_ _null_ ));
+DESCR("get bit");
+DATA(insert OID = 3033 (  set_bit		   PGNSP PGUID 12 1 0 0 f f f t f i 3 0 1560 "1560 23 23" _null_ _null_ _null_ _null_ bitsetbit _null_ _null_ _null_ ));
+DESCR("set bit");
 
 /* for mac type support */
 DATA(insert OID = 436 (  macaddr_in			PGNSP PGUID 12 1 0 0 f f f t f i 1 0 829 "2275" _null_ _null_ _null_ _null_ macaddr_in _null_ _null_ _null_ ));
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 72e9ed0..9ad9573 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -698,6 +698,8 @@ extern Datum textoctetlen(PG_FUNCTION_ARGS);
 extern Datum textpos(PG_FUNCTION_ARGS);
 extern Datum text_substr(PG_FUNCTION_ARGS);
 extern Datum text_substr_no_len(PG_FUNCTION_ARGS);
+extern Datum textoverlay(PG_FUNCTION_ARGS);
+extern Datum textoverlay_no_len(PG_FUNCTION_ARGS);
 extern Datum name_text(PG_FUNCTION_ARGS);
 extern Datum text_name(PG_FUNCTION_ARGS);
 extern int	varstr_cmp(char *arg1, int len1, char *arg2, int len2);
diff --git a/src/include/utils/bytea.h b/src/include/utils/bytea.h
index 6ad3558..b51ea4c 100644
--- a/src/include/utils/bytea.h
+++ b/src/include/utils/bytea.h
@@ -46,5 +46,7 @@ extern Datum byteacat(PG_FUNCTION_ARGS);
 extern Datum byteapos(PG_FUNCTION_ARGS);
 extern Datum bytea_substr(PG_FUNCTION_ARGS);
 extern Datum bytea_substr_no_len(PG_FUNCTION_ARGS);
+extern Datum byteaoverlay(PG_FUNCTION_ARGS);
+extern Datum byteaoverlay_no_len(PG_FUNCTION_ARGS);
 
 #endif   /* BYTEA_H */
diff --git a/src/include/utils/rbtree.h b/src/include/utils/rbtree.h
new file mode 100644
index 0000000..6ce74b0
--- /dev/null
+++ b/src/include/utils/rbtree.h
@@ -0,0 +1,47 @@
+/*-------------------------------------------------------------------------
+ *
+ * rbtree.h
+ *    interface for PostgreSQL generic Red-Black binary tree package
+ *
+ * Copyright (c) 1996-2009, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * 		$PostgreSQL: pgsql/src/backend/nodes/list.c,v 1.69 2008/01/01 19:45:50 momjian Exp $
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef RBTREE_H
+#define RBTREE_H
+
+typedef struct RBTree RBTree;
+typedef struct RBTreeIterator RBTreeIterator;
+
+typedef int (*rb_comparator) (const void *a, const void *b, void *arg);
+typedef void* (*rb_appendator) (void *current, void *new, void *arg);
+typedef void (*rb_freefunc) (void *a);
+
+extern RBTree *rb_create(rb_comparator comparator,
+							rb_appendator appendator,
+							rb_freefunc freefunc,
+							void *arg);
+extern void	rb_free(RBTree *rb);
+
+extern void *rb_find(RBTree *rb, void *data);
+extern void *rb_insert(RBTree *rb, void *data);
+extern void rb_delete(RBTree *rb, void *data);
+extern void *rb_leftmost(RBTree *rb);
+
+typedef enum RBOrderControl
+{
+	LeftRightWalk,
+	RightLeftWalk,
+	DirectWalk,
+	InvertedWalk
+} RBOrderControl;
+
+extern RBTreeIterator* rb_begin_iterate(RBTree *rb, RBOrderControl ctrl);
+extern void *rb_iterate(RBTreeIterator *iterator);
+extern void rb_free_iterator(RBTreeIterator *iterator);
+
+#endif
diff --git a/src/include/utils/varbit.h b/src/include/utils/varbit.h
index 57efc34..c2f4005 100644
--- a/src/include/utils/varbit.h
+++ b/src/include/utils/varbit.h
@@ -89,6 +89,8 @@ extern Datum bitshiftright(PG_FUNCTION_ARGS);
 extern Datum bitcat(PG_FUNCTION_ARGS);
 extern Datum bitsubstr(PG_FUNCTION_ARGS);
 extern Datum bitsubstr_no_len(PG_FUNCTION_ARGS);
+extern Datum bitoverlay(PG_FUNCTION_ARGS);
+extern Datum bitoverlay_no_len(PG_FUNCTION_ARGS);
 extern Datum bitlength(PG_FUNCTION_ARGS);
 extern Datum bitoctetlength(PG_FUNCTION_ARGS);
 extern Datum bitfromint4(PG_FUNCTION_ARGS);
@@ -96,5 +98,7 @@ extern Datum bittoint4(PG_FUNCTION_ARGS);
 extern Datum bitfromint8(PG_FUNCTION_ARGS);
 extern Datum bittoint8(PG_FUNCTION_ARGS);
 extern Datum bitposition(PG_FUNCTION_ARGS);
+extern Datum bitsetbit(PG_FUNCTION_ARGS);
+extern Datum bitgetbit(PG_FUNCTION_ARGS);
 
 #endif
diff --git a/src/pl/tcl/expected/pltcl_setup.out b/src/pl/tcl/expected/pltcl_setup.out
index 496cf22..e46c1c3 100644
--- a/src/pl/tcl/expected/pltcl_setup.out
+++ b/src/pl/tcl/expected/pltcl_setup.out
@@ -495,3 +495,23 @@ CREATE OPERATOR CLASS tcl_int4_ops
 	OPERATOR 4  @>=,
 	OPERATOR 5  @>,
 	FUNCTION 1  tcl_int4cmp(int4,int4) ;
+--
+-- Test usage of Tcl's "clock" command.  In recent Tcl versions this
+-- command fails without working "unknown" support, so it's a good canary
+-- for initialization problems.
+--
+create function tcl_date_week(int4,int4,int4) returns text as $$
+    return [clock format [clock scan "$2/$3/$1"] -format "%U"]
+$$ language pltcl immutable;
+select tcl_date_week(2010,1,24);
+ tcl_date_week 
+---------------
+ 04
+(1 row)
+
+select tcl_date_week(2001,10,24);
+ tcl_date_week 
+---------------
+ 42
+(1 row)
+
diff --git a/src/pl/tcl/pltcl.c b/src/pl/tcl/pltcl.c
index 2603028..350d471 100644
--- a/src/pl/tcl/pltcl.c
+++ b/src/pl/tcl/pltcl.c
@@ -304,9 +304,12 @@ _PG_init(void)
 	 ************************************************************/
 	if ((pltcl_hold_interp = Tcl_CreateInterp()) == NULL)
 		elog(ERROR, "could not create \"hold\" interpreter");
+	if (Tcl_Init(pltcl_hold_interp) == TCL_ERROR)
+		elog(ERROR, "could not initialize \"hold\" interpreter");
 
 	/************************************************************
-	 * Create the two interpreters
+	 * Create the two slave interpreters.  Note: Tcl automatically does
+	 * Tcl_Init on the normal slave, and it's not wanted for the safe slave.
 	 ************************************************************/
 	if ((pltcl_norm_interp =
 		 Tcl_CreateSlave(pltcl_hold_interp, "norm", 0)) == NULL)
diff --git a/src/pl/tcl/sql/pltcl_setup.sql b/src/pl/tcl/sql/pltcl_setup.sql
index 55ac7e2..4a581ed 100644
--- a/src/pl/tcl/sql/pltcl_setup.sql
+++ b/src/pl/tcl/sql/pltcl_setup.sql
@@ -542,3 +542,15 @@ CREATE OPERATOR CLASS tcl_int4_ops
 	OPERATOR 4  @>=,
 	OPERATOR 5  @>,
 	FUNCTION 1  tcl_int4cmp(int4,int4) ;
+
+--
+-- Test usage of Tcl's "clock" command.  In recent Tcl versions this
+-- command fails without working "unknown" support, so it's a good canary
+-- for initialization problems.
+--
+create function tcl_date_week(int4,int4,int4) returns text as $$
+    return [clock format [clock scan "$2/$3/$1"] -format "%U"]
+$$ language pltcl immutable;
+
+select tcl_date_week(2010,1,24);
+select tcl_date_week(2001,10,24);
diff --git a/src/test/regress/expected/bit.out b/src/test/regress/expected/bit.out
index 3563d23..40082ca 100644
--- a/src/test/regress/expected/bit.out
+++ b/src/test/regress/expected/bit.out
@@ -509,3 +509,43 @@ SELECT POSITION(B'1101' IN v),
 
 DROP TABLE BIT_SHIFT_TABLE;
 DROP TABLE VARBIT_SHIFT_TABLE;
+-- Get/Set bit
+SELECT get_bit(B'0101011000100', 10);
+ get_bit 
+---------
+       1
+(1 row)
+
+SELECT set_bit(B'0101011000100100', 15, 1);
+     set_bit      
+------------------
+ 0101011000100101
+(1 row)
+
+SELECT set_bit(B'0101011000100100', 16, 1);	-- fail
+ERROR:  bit index 16 out of valid range (0..15)
+-- Overlay
+SELECT overlay(B'0101011100' placing '001' from 2 for 3);
+  overlay   
+------------
+ 0001011100
+(1 row)
+
+SELECT overlay(B'0101011100' placing '101' from 6);
+  overlay   
+------------
+ 0101010100
+(1 row)
+
+SELECT overlay(B'0101011100' placing '001' from 11);
+    overlay    
+---------------
+ 0101011100001
+(1 row)
+
+SELECT overlay(B'0101011100' placing '001' from 20);
+    overlay    
+---------------
+ 0101011100001
+(1 row)
+
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index 392f48e..42c34a5 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -1579,3 +1579,21 @@ SELECT btrim(E'\\000trim\\000'::bytea, ''::bytea);
  \000trim\000
 (1 row)
 
+SELECT encode(overlay(E'Th\\000omas'::bytea placing E'Th\\001omas'::bytea from 2),'escape');
+   encode    
+-------------
+ TTh\x01omas
+(1 row)
+
+SELECT encode(overlay(E'Th\\000omas'::bytea placing E'\\002\\003'::bytea from 8),'escape');
+       encode       
+--------------------
+ Th\000omas\x02\x03
+(1 row)
+
+SELECT encode(overlay(E'Th\\000omas'::bytea placing E'\\002\\003'::bytea from 5 for 3),'escape');
+     encode      
+-----------------
+ Th\000o\x02\x03
+(1 row)
+
diff --git a/src/test/regress/sql/bit.sql b/src/test/regress/sql/bit.sql
index f178991..73ddd37 100644
--- a/src/test/regress/sql/bit.sql
+++ b/src/test/regress/sql/bit.sql
@@ -184,3 +184,14 @@ SELECT POSITION(B'1101' IN v),
 
 DROP TABLE BIT_SHIFT_TABLE;
 DROP TABLE VARBIT_SHIFT_TABLE;
+
+-- Get/Set bit
+SELECT get_bit(B'0101011000100', 10);
+SELECT set_bit(B'0101011000100100', 15, 1);
+SELECT set_bit(B'0101011000100100', 16, 1);	-- fail
+
+-- Overlay
+SELECT overlay(B'0101011100' placing '001' from 2 for 3);
+SELECT overlay(B'0101011100' placing '101' from 6);
+SELECT overlay(B'0101011100' placing '001' from 11);
+SELECT overlay(B'0101011100' placing '001' from 20);
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index 63df940..8c35be8 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -547,3 +547,6 @@ SELECT trim(E'\\000'::bytea from E'\\000Tom\\000'::bytea);
 SELECT btrim(E'\\000trim\\000'::bytea, E'\\000'::bytea);
 SELECT btrim(''::bytea, E'\\000'::bytea);
 SELECT btrim(E'\\000trim\\000'::bytea, ''::bytea);
+SELECT encode(overlay(E'Th\\000omas'::bytea placing E'Th\\001omas'::bytea from 2),'escape');
+SELECT encode(overlay(E'Th\\000omas'::bytea placing E'\\002\\003'::bytea from 8),'escape');
+SELECT encode(overlay(E'Th\\000omas'::bytea placing E'\\002\\003'::bytea from 5 for 3),'escape');