REINDEX CONCURRENTLY 2.0

Started by Michael Paquier about 12 years ago · 170 messages

#1 Michael Paquier <michael.paquier@gmail.com>
3 attachment(s)

Hi all,

Please find attached updated patches for the support of REINDEX
CONCURRENTLY, renamed 2.0 for the occasion:
- 20131114_1_index_drop_comments.patch, which updates a couple of
comments in index_drop. It has not been committed yet; it should
be, IMO...
- 20131114_2_WaitForOldsnapshots_refactor.patch, a refactoring patch
providing a single API that can be used to wait for old snapshots
- 20131114_3_reindex_concurrently.patch, providing the core feature.
Patch 3 needs to have patch 2 applied first. Regression tests,
isolation tests and documentation are included with the patch.
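For reference, once patch 3 is applied the feature is meant to be used like this (a usage sketch; the table and index names are hypothetical):

```sql
-- Rebuild all indexes of a table, holding only a SHARE UPDATE EXCLUSIVE
-- lock so that concurrent reads and writes can continue:
REINDEX TABLE CONCURRENTLY my_table;

-- A single index can be rebuilt concurrently as well:
REINDEX INDEX CONCURRENTLY my_index;
```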

This is the continuation of the previous thread that finished here:
/messages/by-id/CAB7nPqS+WYN021oQHd9GPe_5dSVcVXMvEBW_E2AV9OOEwggMHw@mail.gmail.com

This patch has been added for this commit fest.
Regards,
--
Michael

Attachments:

20131114_1_index_drop_comments.patch (text/x-patch; charset=US-ASCII)
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 826e504..41b7866 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1444,9 +1444,11 @@ index_drop(Oid indexId, bool concurrent)
 
 		/*
 		 * Now we must wait until no running transaction could be using the
-		 * index for a query. Note we do not need to worry about xacts that
-		 * open the table for reading after this point; they will see the
-		 * index as invalid when they open the relation.
+		 * index for a query. This is done with AccessExclusiveLock to check
+		 * which running transactions have a lock of any kind on the table.
+		 * Note we do not need to worry about xacts that open the table for
+		 * reading after this point; they will see the index as invalid when
+		 * they open the relation.
 		 *
 		 * Note: the reason we use actual lock acquisition here, rather than
 		 * just checking the ProcArray and sleeping, is that deadlock is
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 2155252..c952bc3 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -651,9 +651,10 @@ DefineIndex(IndexStmt *stmt,
 	 * for an overview of how this works)
 	 *
 	 * Now we must wait until no running transaction could have the table open
-	 * with the old list of indexes. Note we do not need to worry about xacts
-	 * that open the table for writing after this point; they will see the new
-	 * index when they open it.
+	 * with the old list of indexes. This is done with ShareLock to check
+	 * which running transactions hold a lock that permits writing to the table.
+	 * Note we do not need to worry about xacts that open the table for
+	 * writing after this point; they will see the new index when they open it.
 	 *
 	 * Note: the reason we use actual lock acquisition here, rather than just
 	 * checking the ProcArray and sleeping, is that deadlock is possible if
20131114_2_WaitForOldsnapshots_refactor.patch (text/x-patch; charset=US-ASCII)
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 2155252..fe72613 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -277,6 +277,86 @@ CheckIndexCompatible(Oid oldId,
 }
 
 /*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given
+ * xmin limit, because the index might not contain tuples deleted just
+ * before that snapshot was taken. Obtain a list of VXIDs of such
+ * transactions, and wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int i, n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue; /* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int n_newer_snapshots, j, k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue; /* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
+/*
  * DefineIndex
  *		Creates a new index.
  *
@@ -321,12 +401,9 @@ DefineIndex(IndexStmt *stmt,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	Snapshot	snapshot;
-	int			i;
 
 	/*
 	 * count attributes in index
@@ -766,74 +843,9 @@ DefineIndex(IndexStmt *stmt,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.	But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.	(Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-										 PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots)		/* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
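The recheck loop factored out into WaitForOlderSnapshots can be modeled outside the backend roughly as follows. This is plain Python; `get_current_vxids` and `wait_for_vxid` are hypothetical stand-ins for GetCurrentVirtualXIDs and VirtualXactLock, and `None` plays the role of an invalidated VirtualTransactionId:

```python
def wait_for_older_snapshots(limit_xmin, get_current_vxids, wait_for_vxid):
    """Model of WaitForOlderSnapshots(): wait out every transaction whose
    snapshot might be older than limit_xmin, rechecking the list of live
    vxids before each wait so that we never sleep on a transaction that
    has already become uninteresting (e.g. gone idle with xmin zero)."""
    old = list(get_current_vxids(limit_xmin))
    for i in range(len(old)):
        if old[i] is None:
            continue                      # invalidated in a previous cycle
        if i > 0:
            # See if anything's changed: refetch the conflicting vxids and
            # forget every old entry that no longer shows up.
            newer = set(get_current_vxids(limit_xmin))
            for j in range(i, len(old)):
                if old[j] is not None and old[j] not in newer:
                    old[j] = None         # not there anymore
        if old[i] is not None:
            wait_for_vxid(old[i])         # blocks until that xact finishes
```

The point of the inner recheck is that a vxid that has disappeared between iterations is dropped without ever waiting on it; only the first entry is waited on unconditionally.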
20131114_3_reindex_concurrently.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index f56eb55..cd8cce0 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -863,8 +863,9 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
-         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
-         some forms of <command>ALTER TABLE</command>.
+         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
+         <command>REINDEX CONCURRENTLY</> and some forms of
+         <command>ALTER TABLE</command>.
         </para>
        </listitem>
       </varlistentry>
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 7222665..5f42c4f 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -68,9 +68,22 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
       An index build with the <literal>CONCURRENTLY</> option failed, leaving
       an <quote>invalid</> index. Such indexes are useless but it can be
       convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      <command>REINDEX</> will perform a concurrent build if <literal>
+      CONCURRENTLY</> is specified. To build the index without interfering
+      with production you should drop the index and reissue either the
+      <command>CREATE INDEX CONCURRENTLY</> or <command>REINDEX CONCURRENTLY</>
+      command. Indexes of toast relations can be rebuilt with <command>REINDEX
+      CONCURRENTLY</>.
+     </para>
+    </listitem>
+
+    <listitem>
+     <para>
+      Concurrent indexes based on a <literal>PRIMARY KEY</> or an <literal>
+      EXCLUDE</> constraint need to be dropped with <literal>ALTER TABLE
+      DROP CONSTRAINT</>. This is also the case for <literal>UNIQUE</> indexes
+      defined using constraints. Other indexes can be dropped using
+      <literal>DROP INDEX</>, including invalid toast indexes.
      </para>
     </listitem>
 
@@ -139,6 +152,21 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    </varlistentry>
 
    <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>FORCE</literal></term>
     <listitem>
      <para>
@@ -231,6 +259,115 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    to be reindexed by separate commands.  This is still possible, but
    redundant.
   </para>
+
+
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database.  Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes without locking
+    out writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</> option of <command>REINDEX</>.
+    When this option is used, <productname>PostgreSQL</> must perform two
+    scans of the table for each index that needs to be rebuilt and in
+    addition it must wait for all existing transactions that could potentially
+    use the index to terminate. This method requires more total work than a
+    standard index rebuild and takes significantly longer to complete as it
+    needs to wait for unfinished transactions that might modify the index.
+    However, since it allows normal operations to continue while the index
+    is rebuilt, this method is useful for rebuilding indexes in a production
+    environment.  Of course, the extra CPU, memory and I/O load imposed by
+    the index rebuild might slow other operations.
+   </para>
+
+   <para>
+    In a concurrent index build, a new index whose storage will replace the one
+    to be rebuilt is actually entered into the system catalogs in one transaction,
+    then two table scans occur in two more transactions.  Once this is done,
+    the old and new indexes are swapped. Finally, two additional transactions
+    are used to mark the concurrent index as not ready and then drop it.
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the concurrent
+    index and try again to perform <command>REINDEX CONCURRENTLY</>.
+    The concurrent index created during the processing has a name ending with
+    the suffix <literal>cct</>. This works as well for indexes of toast relations.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the
+    same table to occur in parallel, but only one concurrent index build
+    can occur on a table at a time.  In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot. <command>REINDEX DATABASE</> is
+    by default not allowed to run inside a transaction block, so in this case
+    <command>CONCURRENTLY</> is not supported.
+   </para>
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>. Valid indexes, being unique
+    for a given toast relation, cannot be dropped.
+   </para>
+
+   <para>
+    <command>REINDEX DATABASE</command> used with <command>CONCURRENTLY
+    </command> rebuilds concurrently only the non-system relations. System
+    relations are rebuilt with a non-concurrent context. Toast indexes are
+    rebuilt concurrently if the relation they depend on is a non-system
+    relation.
+   </para>
+
+   <para>
+    <command>REINDEX</command> takes an <literal>ACCESS EXCLUSIVE</literal> lock
+    on all the relations involved during the operation. When <command>CONCURRENTLY</command>
+    is specified, a <literal>SHARE UPDATE EXCLUSIVE</literal> lock is taken instead.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support <command>CONCURRENTLY
+    </command>.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -262,7 +399,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild the indexes of a table, allowing read and write operations on the
+   involved relations while the rebuild is in progress:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 0275240..22fd7a6 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -43,9 +43,11 @@
 #include "catalog/pg_trigger.h"
 #include "catalog/pg_type.h"
 #include "catalog/storage.h"
+#include "commands/defrem.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -673,6 +675,10 @@ UpdateIndexRelation(Oid indexoid,
  *		will be marked "invalid" and the caller must take additional steps
  *		to fix it up.
  * is_internal: if true, post creation hook for new index
+ * is_reindex: if true, create a duplicate of an existing index, for use
+ *		during a concurrent REINDEX operation. This index can also be on a
+ *		toast relation. Sufficient locks on the related relations are assumed
+ *		to be already held when this is called during a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -696,7 +702,8 @@ index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal)
+			 bool is_internal,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -739,19 +746,22 @@ index_create(Relation heapRelation,
 
 	/*
 	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * release locks before committing in catalogs. If the index is created during
+	 * a REINDEX CONCURRENTLY operation, sufficient locks are already taken.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemRelation(heapRelation) &&
+		!is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
 
 	/*
-	 * This case is currently not supported, but there's no way to ask for it
-	 * in the grammar anyway, so it can't happen.
+	 * This case is currently only supported during a concurrent index
+	 * rebuild, but there is no way to ask for it in the grammar otherwise
+	 * anyway.
 	 */
-	if (concurrent && is_exclusion)
+	if (concurrent && is_exclusion && !is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -1091,6 +1101,410 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+
+/*
+ * index_concurrent_create
+ *
+ * Create an index based on the given one that will be used for concurrent
+ * operations. The index is inserted into catalogs and needs to be built later
+ * on. This is called during concurrent index processing. The heap relation
+ * on which the index is based needs to be closed by the caller.
+ */
+Oid
+index_concurrent_create(Relation heapRelation, Oid indOid, char *concurrentName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	List	   *columnNames = NIL;
+	List	   *indexprs = NIL;
+	ListCell   *indexpr_item;
+	int			i;
+	HeapTuple	indexTuple, classTuple;
+	Datum		indclassDatum, colOptionDatum, optionDatum;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	bool		initdeferred = false;
+	Oid			constraintOid = get_index_constraint(indOid);
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* Concurrent index uses the same index information as the former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/*
+	 * Determine if index is initdeferred, this depends on its dependent
+	 * constraint.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		/* Look for the correct value */
+		HeapTuple			constraintTuple;
+		Form_pg_constraint	constraintForm;
+
+		constraintTuple = SearchSysCache1(CONSTROID,
+									 ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "cache lookup failed for constraint %u",
+				 constraintOid);
+		constraintForm = (Form_pg_constraint) GETSTRUCT(constraintTuple);
+		initdeferred = constraintForm->condeferred;
+
+		ReleaseSysCache(constraintTuple);
+	}
+
+	/* Get expressions associated with this index, needed to build column names */
+	indexprs = RelationGetIndexExpressions(indexRelation);
+	indexpr_item = list_head(indexprs);
+
+	/* Build the list of column names, necessary for index_create */
+	for (i = 0; i < indexInfo->ii_NumIndexAttrs; i++)
+	{
+		char	   *origname, *curname;
+		char		buf[NAMEDATALEN];
+		AttrNumber	attnum = indexInfo->ii_KeyAttrNumbers[i];
+		int			j;
+
+		/* Pick up column name depending on attribute type */
+		if (attnum > 0)
+		{
+			/*
+			 * This is a column attribute, so simply pick column name from
+			 * relation.
+			 */
+			Form_pg_attribute attform = heapRelation->rd_att->attrs[attnum - 1];
+			origname = pstrdup(NameStr(attform->attname));
+		}
+		else if (attnum < 0)
+		{
+			/* Case of a system attribute */
+			Form_pg_attribute attform = SystemAttributeDefinition(attnum,
+										  heapRelation->rd_rel->relhasoids);
+			origname = pstrdup(NameStr(attform->attname));
+		}
+		else
+		{
+			Node *indnode;
+			/*
+			 * This is the case of an expression, so pick up the expression
+			 * name.
+			 */
+			Assert(indexpr_item != NULL);
+			indnode = (Node *) lfirst(indexpr_item);
+			indexpr_item = lnext(indexpr_item);
+			origname = deparse_expression(indnode,
+							deparse_context_for(RelationGetRelationName(heapRelation),
+												RelationGetRelid(heapRelation)),
+							false, false);
+		}
+
+		/*
+		 * Check if the name picked has any conflict with existing names and
+		 * change it.
+		 */
+		curname = origname;
+		for (j = 1;; j++)
+		{
+			ListCell   *lc2;
+			char		nbuf[32];
+			int			nlen;
+
+			foreach(lc2, columnNames)
+			{
+				if (strcmp(curname, (char *) lfirst(lc2)) == 0)
+					break;
+			}
+			if (lc2 == NULL)
+				break; /* found nonconflicting name */
+
+			sprintf(nbuf, "%d", j);
+
+			/* Ensure generated names are shorter than NAMEDATALEN */
+			nlen = pg_mbcliplen(origname, strlen(origname),
+								NAMEDATALEN - 1 - strlen(nbuf));
+			memcpy(buf, origname, nlen);
+			strcpy(buf + nlen, nbuf);
+			curname = buf;
+		}
+
+		/* Append name to existing list */
+		columnNames = lappend(columnNames, pstrdup(curname));
+	}
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, indOid);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 (const char *) concurrentName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 columnNames,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexRelation->rd_index->indisprimary,
+								 OidIsValid(constraintOid),	/* is constraint? */
+								 !indexRelation->rd_index->indimmediate,	/* is deferrable? */
+								 initdeferred,	/* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false,	/* is_internal */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
+
+
+/*
+ * index_concurrent_build
+ *
+ * Build an index for a concurrent operation. The low-level locks taken here
+ * prevent only schema changes, but they need to be kept until the end of
+ * the transaction performing this operation.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	rel, indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	rel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in
+	 * commit of transaction where this concurrent index was created
+	 * at the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(rel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both the relations, but keep the locks */
+	heap_close(rel, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap the old and new indexes in a concurrent context. For the time being,
+ * what is done here is switching the relfilenode of the two indexes. If
+ * extra operations are necessary during a concurrent swap, processing
+ * should be added here. The relations do not require an exclusive lock
+ * thanks to MVCC catalog access for the relcache.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid)
+{
+	Relation		oldIndexRel, newIndexRel, pg_class;
+	HeapTuple		oldIndexTuple, newIndexTuple;
+	Form_pg_class	oldIndexForm, newIndexForm;
+	Oid				tmpnode;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldIndexRel = relation_open(oldIndexOid, ShareUpdateExclusiveLock);
+	newIndexRel = relation_open(newIndexOid, ShareUpdateExclusiveLock);
+
+	/* Now swap relfilenode of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+	oldIndexForm = (Form_pg_class) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_class) GETSTRUCT(newIndexTuple);
+
+	/* Here is where the actual swap happens */
+	tmpnode = oldIndexForm->relfilenode;
+	oldIndexForm->relfilenode = newIndexForm->relfilenode;
+	newIndexForm->relfilenode = tmpnode;
+
+	/* Then update the tuples for each relation */
+	simple_heap_update(pg_class, &oldIndexTuple->t_self, oldIndexTuple);
+	simple_heap_update(pg_class, &newIndexTuple->t_self, newIndexTuple);
+	CatalogUpdateIndexes(pg_class, oldIndexTuple);
+	CatalogUpdateIndexes(pg_class, newIndexTuple);
+
+	/* Close relations and clean up */
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+	heap_close(pg_class, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldIndexRel, NoLock);
+	relation_close(newIndexRel, NoLock);
+}
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY before
+ * actually dropping the index. After calling this function the index is
+ * seen by all the backends as dead. Low-level locks taken here are kept
+ * until the end of the transaction calling this function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid, LOCKTAG locktag)
+{
+	Relation	heapRelation, indexRelation;
+
+	/*
+	 * Now we must wait until no running transaction could be using the
+	 * index for a query.
+	 *
+	 * Note: the reason we use actual lock acquisition here, rather than
+	 * just checking the ProcArray and sleeping, is that deadlock is
+	 * possible if one of the transactions in question is blocked trying
+	 * to acquire an exclusive lock on our table. The lock code will
+	 * detect deadlock and error out properly.
+	 */
+	WaitForLockers(locktag, AccessExclusiveLock);
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're
+	 * about to stop doing inserts into the index which could show
+	 * conflicts with existing predicate locks, so now is the time to move
+	 * them to the heap relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just
+	 * might have it open for updating it.  So now we can unset indisready
+	 * and indislive, then wait till nobody could be using it at all
+	 * anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit
+	 * all sessions will refresh the table's index list.  Forgetting just
+	 * the index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index concurrently as the last step of an index concurrent
+ * process. Deletion is done through performDeletion, otherwise the
+ * dependencies of the index would not get dropped. At this point all the
+ * indexes are already considered invalid and dead, so they can be dropped
+ * without using any concurrent options, as it is certain that they will not
+ * interact with other server sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid				constraintOid = get_index_constraint(indexOid);
+	ObjectAddress	object;
+	Form_pg_index	indexForm;
+	Relation		pg_index;
+	HeapTuple		indexTuple;
+
+	/*
+	 * Check that the index being dropped is not alive; if it were, it might
+	 * still be used by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process.
+	 * Register constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object,
+					DROP_RESTRICT,
+					0);
+}
+
+
 /*
  * index_constraint_create
  *
@@ -1443,50 +1857,8 @@ index_drop(Oid indexId, bool concurrent)
 		CommitTransactionCommand();
 		StartTransactionCommand();
 
-		/*
-		 * Now we must wait until no running transaction could be using the
-		 * index for a query. Note we do not need to worry about xacts that
-		 * open the table for reading after this point; they will see the
-		 * index as invalid when they open the relation.
-		 *
-		 * Note: the reason we use actual lock acquisition here, rather than
-		 * just checking the ProcArray and sleeping, is that deadlock is
-		 * possible if one of the transactions in question is blocked trying
-		 * to acquire an exclusive lock on our table.  The lock code will
-		 * detect deadlock and error out properly.
-		 */
-		WaitForLockers(heaplocktag, AccessExclusiveLock);
-
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.	So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId, heaplocktag);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 385d64d..0c2971b 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -281,7 +281,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid, Datum reloptio
 				 rel->rd_rel->reltablespace,
 				 collationObjectId, classObjectId, coloptions, (Datum) 0,
 				 true, false, false, false,
-				 true, false, false, true);
+				 true, false, false, false, false);
 
 	heap_close(toast_rel, NoLock);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index fe72613..c1166c0 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -68,8 +68,9 @@ static void ComputeIndexAttrs(IndexInfo *indexInfo,
 static Oid GetIndexOpClass(List *opclass, Oid attrType,
 				char *accessMethodName, Oid accessMethodId);
 static char *ChooseIndexName(const char *tabname, Oid namespaceId,
-				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint);
+							 List *colnames, List *exclusionOpNames,
+							 bool primary, bool isconstraint,
+							 bool concurrent);
 static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
@@ -391,7 +392,6 @@ DefineIndex(IndexStmt *stmt,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	bool		amcanorder;
@@ -530,7 +530,8 @@ DefineIndex(IndexStmt *stmt,
 											indexColNames,
 											stmt->excludeOpNames,
 											stmt->primary,
-											stmt->isconstraint);
+											stmt->isconstraint,
+											false);
 
 	/*
 	 * look up the access method, verify it can handle the requested features
@@ -677,7 +678,7 @@ DefineIndex(IndexStmt *stmt,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
-					 stmt->concurrent, !check_rights);
+					 stmt->concurrent, !check_rights, false);
 
 	/* Add any requested comment */
 	if (stmt->idxcomment != NULL)
@@ -758,27 +759,15 @@ DefineIndex(IndexStmt *stmt,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/*
 	 * Update the pg_index row to mark the index as ready for inserts. Once we
@@ -872,6 +861,542 @@ DefineIndex(IndexStmt *stmt,
 
 
 /*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for the given relation Oid. The relation can
+ * be either an index or a table. If a table is specified, each reindexing
+ * step is performed for all the table's indexes at once, including the
+ * indexes of its dependent toast table.
+ */
+bool
+ReindexRelationConcurrently(Oid relationOid)
+{
+	List	   *concurrentIndexIds = NIL,
+			   *indexIds = NIL,
+			   *parentRelationIds = NIL,
+			   *lockTags = NIL,
+			   *relationLocks = NIL;
+	ListCell   *lc, *lc2;
+	Snapshot	snapshot;
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given relation
+	 * Oid is a table, all its valid indexes will be rebuilt, including its
+	 * associated toast table indexes. If the relkind is an index, this index
+	 * itself will be rebuilt. The locks taken on the parent relations and the
+	 * involved indexes are kept until this transaction is committed, to
+	 * protect against schema changes that might occur before the session
+	 * lock is taken on each relation.
+	 */
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes
+				 * including toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc2, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc2);
+					Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+						indexIds = lappend_oid(indexIds, cellOid);
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+												ShareUpdateExclusiveLock);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+							indexIds = lappend_oid(indexIds, cellOid);
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(IndexGetRelation(relationOid, false));
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+					indexIds = list_make1_oid(relationOid);
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process of rebuilding the indexes concurrently. We
+	 * first need to create, for each index, a new index based on the same
+	 * definition as the former one; at this stage it is only registered in
+	 * the catalogs and will be built afterwards. It is possible to perform
+	 * all these operations at the same time for all the indexes of a parent
+	 * relation, including the indexes of its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId  *lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index parent relation, might be a toast or plain relation */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a relation name for concurrent index */
+		concurrentName = ChooseIndexName(get_rel_name(indOid),
+										 get_rel_namespace(indexRel->rd_index->indrelid),
+										 NIL,
+										 NIL,
+										 false,
+										 false,
+										 true);
+
+		/* Create concurrent index based on given index */
+		concurrentOid = index_concurrent_create(indexParentRel,
+												indOid,
+												concurrentName);
+
+		/*
+		 * Now open the relation of the concurrent index; a lock is also
+		 * needed on it.
+		 */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the concurrent index Oid */
+		concurrentIndexIds = lappend_oid(concurrentIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelid to protect each concurrent relation from being
+		 * dropped, then close the relations. The lockrelid of the parent
+		 * relation is not taken here to avoid taking multiple locks on the
+		 * same relation; instead we rely on parentRelationIds built earlier.
+		 * Each entry is palloc'd: appending the address of a loop-local
+		 * variable would leave all the list cells pointing at the same,
+		 * reused stack slot.
+		 */
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the following visibility checks, as other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId  *lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		LOCKTAG	   *heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/*
+		 * Add the lockrelid of the parent relation to the list of locked
+		 * relations; a palloc'd copy is stored so that the entry outlives
+		 * this loop iteration.
+		 */
+		*lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid->dbId, lockrelid->relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transactions will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on the relation, the
+	 * concurrent index and its copy to ensure that none of them are dropped
+	 * until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = * (LockRelId *) lfirst(lc);
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the concurrent indexes in a separate transaction for each index
+	 * to avoid having open transactions for an unnecessarily long time. A
+	 * concurrent build is done for each concurrent index that will replace
+	 * an old index. Before doing that, we need to wait until no running
+	 * transaction could still have the parent table of an index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+		bool		primary;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * The index relation has been closed by the previous commit, so
+		 * reopen it and save what is needed before closing it again;
+		 * dereferencing indexRel after index_close() would be unsafe.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		primary = indexRel->rd_index->indisprimary;
+		relOid = indexRel->rd_index->indrelid;
+		index_close(indexRel, ShareUpdateExclusiveLock);
+
+		/* Perform concurrent build of new index */
+		index_concurrent_build(relOid,
+							   concurrentOid,
+							   primary);
+
+		/*
+		 * Update the pg_index row of the concurrent index as ready for inserts.
+		 * Once we commit this transaction, any new transactions that open the
+		 * table must insert new entries into the index for insertions and
+		 * non-HOT updates.
+		 */
+		index_set_state_flags(concurrentOid, INDEX_CREATE_SET_READY);
+
+		/* we can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update visible for
+		 * concurrent index.
+		 */
+		CommitTransactionCommand();
+	}
+
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the concurrent indexes catch up with any INSERTs
+	 * that might have occurred in the parent table in the meantime.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to avoid keeping a transaction open
+	 * for an unnecessarily long time.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Perform a scan of each concurrent index with the heap, then insert
+	 * any missing index entries.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid				indOid = lfirst_oid(lc);
+		Oid				relOid;
+		TransactionId	limitXmin;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for validating this
+		 * concurrent index.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate the index, which might be a toast index */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * This concurrent index is now valid as it contains all the
+		 * necessary tuples. However, it might not have taken into account
+		 * tuples deleted before the reference snapshot was taken, so we need
+		 * to wait for the transactions that might have older snapshots than
+		 * ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction to make the concurrent index valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated and could be used,
+	 * we need to swap each concurrent index with its corresponding old index.
+	 * Note that the concurrent index used for swapping is not marked as
+	 * valid, because we need to keep the former index and the concurrent
+	 * index with different valid statuses to avoid an explosion in the
+	 * number of indexes a parent relation could have if this operation fails
+	 * multiple times in a row for one reason or another.
+	 */
+
+	/* Swap the indexes and mark the indexes that have the old data as invalid */
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/*
+		 * Each index needs to be swapped in a separate transaction, so start
+		 * a new one.
+		 */
+		StartTransactionCommand();
+
+		/* Swap old index and its concurrent */
+		index_concurrent_swap(concurrentOid, indOid);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		relOid = IndexGetRelation(indOid, false);
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/* Commit this transaction and make old index invalidation visible */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * The concurrent indexes now hold the old relfilenodes of the original
+	 * indexes, so mark them as dead after waiting for the transactions that
+	 * might still use them. Each operation is performed in a separate
+	 * transaction.
+	 */
+
+	/* Now mark the concurrent indexes as not ready */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		LOCKTAG	   *heapLockTag = NULL;
+		ListCell   *cell;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Find the locktag of the parent table for this index; we need to
+		 * wait for locks on it.
+		 */
+		foreach(cell, lockTags)
+		{
+			LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+			if (relOid == localTag->locktag_field2)
+				heapLockTag = localTag;
+		}
+		Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+		/*
+		 * Finish the index invalidation and set it as dead. Note that it is
+		 * necessary to wait for virtual locks on the parent relation before
+		 * setting the index as dead.
+		 */
+		index_concurrent_set_dead(relOid, indOid, *heapLockTag);
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the concurrent indexes. This needs to be done through
+	 * performDeletion, otherwise the dependencies of the old indexes would
+	 * not be dropped. The internal mechanism of DROP INDEX CONCURRENTLY is
+	 * not used, as here the indexes are already considered dead and invalid,
+	 * so they will not be used by other backends.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid indexOid = lfirst_oid(lc);
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start transaction to drop this index */
+		StartTransactionCommand();
+
+		/* Get fresh snapshot for next step */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/* Perform the drop; the transaction for this step was started above */
+		index_concurrent_drop(indexOid);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Last thing to do is to release the session-level locks on the parent
+	 * tables and on the indexes of the tables.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = * (LockRelId *) lfirst(lc);
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Start a new transaction to finish the process properly */
+	StartTransactionCommand();
+
+	/* Get fresh snapshot for the end of process */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	return true;
+}
+
+
+/*
  * CheckMutability
  *		Test whether given expression is mutable
  */
@@ -1534,7 +2059,8 @@ ChooseRelationName(const char *name1, const char *name2,
 static char *
 ChooseIndexName(const char *tabname, Oid namespaceId,
 				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint)
+				bool primary, bool isconstraint,
+				bool concurrent)
 {
 	char	   *indexname;
 
@@ -1560,6 +2086,13 @@ ChooseIndexName(const char *tabname, Oid namespaceId,
 									   "key",
 									   namespaceId);
 	}
+	else if (concurrent)
+	{
+		indexname = ChooseRelationName(tabname,
+									   NULL,
+									   "cct",
+									   namespaceId);
+	}
 	else
 	{
 		indexname = ChooseRelationName(tabname,
@@ -1672,18 +2205,22 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation)
+ReindexIndex(RangeVar *indexRelation, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
 
-	/* lock level used here should match index lock reindex_index() */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
-									  false, false,
-									  RangeVarCallbackForReindexIndex,
-									  (void *) &heapOid);
+	indOid = RangeVarGetRelidExtended(indexRelation,
+				concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+				concurrent, concurrent,
+				RangeVarCallbackForReindexIndex,
+				(void *) &heapOid);
 
-	reindex_index(indOid, false);
+	/* Continue process for concurrent or non-concurrent case */
+	if (!concurrent)
+		reindex_index(indOid, false);
+	else
+		ReindexRelationConcurrently(indOid);
 
 	return indOid;
 }
@@ -1752,13 +2289,27 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation)
+ReindexTable(RangeVar *relation, bool concurrent)
 {
 	Oid			heapOid;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
-									   RangeVarCallbackOwnsTable, NULL);
+	heapOid = RangeVarGetRelidExtended(relation,
+		concurrent ? ShareUpdateExclusiveLock : ShareLock,
+		concurrent, concurrent,
+		RangeVarCallbackOwnsTable, NULL);
+
+	/* Run through the concurrent process if necessary */
+	if (concurrent)
+	{
+		if (!ReindexRelationConcurrently(heapOid))
+		{
+			ereport(NOTICE,
+					(errmsg("table \"%s\" has no indexes",
+							relation->relname)));
+		}
+		return heapOid;
+	}
 
 	if (!reindex_relation(heapOid,
 						  REINDEX_REL_PROCESS_TOAST |
@@ -1779,7 +2330,10 @@ ReindexTable(RangeVar *relation)
  * That means this must not be called within a user transaction block!
  */
 Oid
-ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
+ReindexDatabase(const char *databaseName,
+				bool do_system,
+				bool do_user,
+				bool concurrent)
 {
 	Relation	relationRelation;
 	HeapScanDesc scan;
@@ -1791,6 +2345,15 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 
 	AssertArg(databaseName);
 
+	/*
+	 * A CONCURRENTLY operation is not allowed for REINDEX SYSTEM, but it is
+	 * allowed for REINDEX DATABASE.
+	 */
+	if (concurrent && !do_user)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot reindex system catalogs concurrently")));
+
 	if (strcmp(databaseName, get_database_name(MyDatabaseId)) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1874,17 +2437,42 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result = false;
+		bool		process_concurrent;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS))
+
+		/* Determine if relation needs to be processed concurrently */
+		process_concurrent = concurrent &&
+			!IsSystemNamespace(get_rel_namespace(relid));
+
+		/*
+		 * Reindex the relation with a concurrent or non-concurrent process.
+		 * System relations cannot be reindexed concurrently, but they still
+		 * need to be reindexed, including pg_class, with the normal process,
+		 * as they could be corrupted and the concurrent process might also
+		 * use them. This does not include toast relations, which are
+		 * reindexed when their parent relation is processed.
+		 */
+		if (process_concurrent)
+		{
+			old = MemoryContextSwitchTo(private_context);
+			result = ReindexRelationConcurrently(relid);
+			MemoryContextSwitchTo(old);
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS);
+
+		if (result)
 			ereport(NOTICE,
-					(errmsg("table \"%s.%s\" was reindexed",
+					(errmsg("table \"%s.%s\" was reindexed%s",
 							get_namespace_name(get_rel_namespace(relid)),
-							get_rel_name(relid))));
+							get_rel_name(relid),
+							process_concurrent ? " concurrently" : "")));
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 	}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 0b31f55..309a716 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -875,6 +875,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	char		relkind;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -910,7 +911,37 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(classform))
+	/*
+	 * Allow dropping a system index that has been invalidated by a failed
+	 * concurrent operation. For the time being, this only concerns indexes
+	 * of toast relations that became invalid during a REINDEX CONCURRENTLY
+	 * process.
+	 */
+	if (IsSystemClass(classform) &&
+		relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 39e3b2e..5495f22 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1201,6 +1201,20 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
 	}
 
 	/*
+	 * An invalid index only exists when it was created in a concurrent
+	 * context, and this code path cannot be reached by CREATE INDEX
+	 * CONCURRENTLY, which does not support exclusion constraints; hence
+	 * this code path can only be reached by REINDEX CONCURRENTLY. In that
+	 * case a valid twin of this index exists in parallel, so we can bypass
+	 * this check: it has already been performed on the other index. If
+	 * exclusion constraints are ever supported by CREATE INDEX
+	 * CONCURRENTLY, this shortcut should be revisited or removed
+	 * accordingly.
+	 */
+	if (!index->rd_index->indisvalid)
+		return true;
+
+	/*
 	 * Search the tuples that are in the index for any violations, including
 	 * tuples that aren't visible yet.
 	 */
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 1733da6..af7549f 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3651,6 +3651,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(do_system);
 	COPY_SCALAR_FIELD(do_user);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 7b29812..a699b46 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1858,6 +1858,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(do_system);
 	COMPARE_SCALAR_FIELD(do_user);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 11f6291..a8258f5 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -6833,29 +6833,32 @@ opt_if_exists: IF_P EXISTS						{ $$ = TRUE; }
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_type qualified_name opt_force
+			REINDEX reindex_type opt_concurrently qualified_name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					$$ = (Node *)n;
 				}
-			| REINDEX SYSTEM_P name opt_force
+			| REINDEX SYSTEM_P opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = false;
 					$$ = (Node *)n;
 				}
-			| REINDEX DATABASE name opt_force
+			| REINDEX DATABASE opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = true;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 6a7bf0d..e66c415 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -779,16 +779,20 @@ standard_ProcessUtility(Node *parsetree,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				switch (stmt->kind)
 				{
 					case OBJECT_INDEX:
-						ReindexIndex(stmt->relation);
+						ReindexIndex(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_TABLE:
 					case OBJECT_MATVIEW:
-						ReindexTable(stmt->relation);
+						ReindexTable(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_DATABASE:
 
@@ -800,8 +804,8 @@ standard_ProcessUtility(Node *parsetree,
 						 */
 						PreventTransactionChain(isTopLevel,
 												"REINDEX DATABASE");
-						ReindexDatabase(stmt->name,
-										stmt->do_system, stmt->do_user);
+						ReindexDatabase(stmt->name, stmt->do_system,
+										stmt->do_user, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index e697275..ab45c67 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -60,7 +60,24 @@ extern Oid index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal);
+			 bool is_internal,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create(Relation heapRelation,
+								   Oid indOid,
+								   char *concurrentName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid,
+									  LOCKTAG locktag);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern void index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 836c99e..d78a63e 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -27,10 +27,11 @@ extern Oid DefineIndex(IndexStmt *stmt,
 			bool check_rights,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation);
-extern Oid	ReindexTable(RangeVar *relation);
+extern Oid	ReindexIndex(RangeVar *indexRelation, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, bool concurrent);
 extern Oid ReindexDatabase(const char *databaseName,
-				bool do_system, bool do_user);
+							bool do_system, bool do_user, bool concurrent);
+extern bool ReindexRelationConcurrently(Oid relOid);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 55524b4..dee176d 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2596,6 +2596,7 @@ typedef struct ReindexStmt
 	const char *name;			/* name of database to reindex */
 	bool		do_system;		/* include system tables in database case */
 	bool		do_user;		/* include user tables in database case */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000..9e04169
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 329dbf1..114035b 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -21,4 +21,5 @@ test: delete-abort-savept-2
 test: aborted-keyrevoke
 test: multixact-no-deadlock
 test: drop-index-concurrently-1
+test: reindex-concurrently
 test: timeouts
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000..eb59fe0
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index b7b9203..5a59010 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2757,3 +2757,60 @@ ORDER BY thousand;
         1 |     1001
 (2 rows)
 
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  cannot reindex system concurrently
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+Table "public.concur_reindex_tab"
+ Column |  Type   | Modifiers 
+--------+---------+-----------
+ c1     | integer | not null
+ c2     | text    | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 54f9161..1e1a560 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -917,3 +917,44 @@ ORDER BY thousand;
 SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)
 ORDER BY thousand;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
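As a rough illustration of the life cycle implied by the new index_concurrent_* functions declared in the patch (create an invalid twin, build it, swap storage with the old index, mark the twin dead, drop it), here is a minimal, hypothetical Python simulation. It models only the catalog-state transitions, not the actual locking, snapshots, or waits; the `_cct` suffix and the dict layout are invented for illustration:

```python
# Hypothetical state-machine sketch of the concurrent reindex phases; the
# comments name the index_concurrent_* functions from the patch that each
# step loosely corresponds to.
def reindex_concurrently(catalog, old_index):
    new_index = old_index + "_cct"  # hypothetical twin-name suffix
    # index_concurrent_create: register an invalid twin of the index.
    catalog[new_index] = {"storage": "new", "valid": False, "live": True}
    # index_concurrent_build: fill the twin's storage; the old index
    # stays valid and usable the whole time (not modeled here).
    # index_concurrent_swap: exchange the storage (relfilenode) of the two
    # entries; the twin stays invalid, so a failure at this point never
    # leaves two valid copies of the same index.
    catalog[old_index]["storage"], catalog[new_index]["storage"] = \
        catalog[new_index]["storage"], catalog[old_index]["storage"]
    # index_concurrent_set_dead + index_concurrent_drop: retire and remove
    # the twin, which now owns the old storage.
    catalog[new_index]["live"] = False
    del catalog[new_index]
    return catalog
```

The visible outcome is the same as a plain REINDEX: the original index entry survives, now pointing at the freshly built storage.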
#2Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#1)
Re: REINDEX CONCURRENTLY 2.0

On 11/14/13, 9:40 PM, Michael Paquier wrote:

Hi all,

Please find attached updated patches for the support of REINDEX
CONCURRENTLY, renamed 2.0 for the occasion:
- 20131114_1_index_drop_comments.patch, patch that updates some
comments in index_drop. This updates only a couple of comments in
index_drop but has not been committed yet. It should be IMO...
- 20131114_2_WaitForOldsnapshots_refactor.patch, a refactoring patch
providing a single API that can be used to wait for old snapshots
- 20131114_3_reindex_concurrently.patch, providing the core feature.
Patch 3 needs to have patch 2 applied first. Regression tests,
isolation tests and documentation are included with the patch.

The third patch needs to be rebased, because of a conflict in index.c.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Andres Freund
andres@2ndquadrant.com
In reply to: Michael Paquier (#1)
Re: REINDEX CONCURRENTLY 2.0

Hi,

On 2013-11-15 11:40:17 +0900, Michael Paquier wrote:

- 20131114_3_reindex_concurrently.patch, providing the core feature.
Patch 3 needs to have patch 2 applied first. Regression tests,
isolation tests and documentation are included with the patch.

Have you addressed my concurrency concerns from the last version?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#4Michael Paquier
michael.paquier@gmail.com
In reply to: Andres Freund (#3)
3 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On Sat, Nov 16, 2013 at 5:09 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2013-11-15 11:40:17 +0900, Michael Paquier wrote:

- 20131114_3_reindex_concurrently.patch, providing the core feature.
Patch 3 needs to have patch 2 applied first. Regression tests,
isolation tests and documentation are included with the patch.

Have you addressed my concurrency concerns from the last version?

I have added documentation in the patch that better explains why those
implementation choices were made.
Thanks,
--
Michael

Attachments:

20131119_0_reindex_base.patchtext/x-patch; charset=US-ASCII; name=20131119_0_reindex_base.patchDownload
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 2155252..fe72613 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -277,6 +277,86 @@ CheckIndexCompatible(Oid oldId,
 }
 
 /*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given
+ * xmin limit, because an index built with the snapshot that set this limit
+ * might not contain tuples deleted just before that snapshot was taken.
+ * Obtain a list of VXIDs of such transactions, and wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int i, n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue; /* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int n_newer_snapshots, j, k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue; /* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
+/*
  * DefineIndex
  *		Creates a new index.
  *
@@ -321,12 +401,9 @@ DefineIndex(IndexStmt *stmt,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	Snapshot	snapshot;
-	int			i;
 
 	/*
 	 * count attributes in index
@@ -766,74 +843,9 @@ DefineIndex(IndexStmt *stmt,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.	But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.	(Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-										 PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots)		/* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
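The re-check-and-forget loop that the refactored WaitForOlderSnapshots comment describes can be sketched in a few lines of Python. This is not PostgreSQL code, just a simplified model of the control flow: `get_current_vxids` and `wait_on` stand in for GetCurrentVirtualXIDs and VirtualXactLock, and vxids are plain strings:

```python
# Simplified model of the WaitForOlderSnapshots loop: before blocking on
# each old vxid, re-fetch the list of transactions that still hold old
# snapshots and forget any vxid that no longer shows up (it has gone idle
# with xmin zero), so we never wait on an uninteresting transaction.
def wait_for_older_snapshots(get_current_vxids, wait_on):
    old = list(get_current_vxids())  # vxids holding older snapshots now
    waited = []
    for i, vxid in enumerate(old):
        if vxid is None:
            continue  # found uninteresting in a previous cycle
        if i > 0:
            # See if anything's changed since the last fetch.
            newer = set(get_current_vxids())
            for j in range(i, len(old)):
                if old[j] is not None and old[j] not in newer:
                    old[j] = None  # not there anymore, forget it
        if old[i] is not None:
            wait_on(old[i])  # block until that transaction finishes
            waited.append(old[i])
    return waited
```

With snapshots reported as `['a', 'b', 'c']` initially and only `'c'` remaining on later fetches, the loop waits on `'a'` and `'c'` but skips `'b'`, mirroring the SetInvalidVirtualTransactionId path in the C code.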
20131119_1_reindex_conc.patchtext/x-patch; charset=US-ASCII; name=20131119_1_reindex_conc.patchDownload
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index f56eb55..cd8cce0 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -863,8 +863,9 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
-         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
-         some forms of <command>ALTER TABLE</command>.
+         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
+         <command>REINDEX CONCURRENTLY</> and some forms of
+         <command>ALTER TABLE</command>.
         </para>
        </listitem>
       </varlistentry>
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 7222665..f4f6333 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -68,9 +68,22 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
       An index build with the <literal>CONCURRENTLY</> option failed, leaving
       an <quote>invalid</> index. Such indexes are useless but it can be
       convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      <command>REINDEX</> will perform a concurrent build if
+      <literal>CONCURRENTLY</> is specified. To build the index without
+      interfering with production you should drop the index and reissue
+      either the <command>CREATE INDEX CONCURRENTLY</> or
+      <command>REINDEX CONCURRENTLY</> command. Indexes of toast relations
+      can be rebuilt with <command>REINDEX CONCURRENTLY</>.
+     </para>
+    </listitem>
+
+    <listitem>
+     <para>
+      Concurrent indexes based on a <literal>PRIMARY KEY</> or an
+      <literal>EXCLUDE</> constraint need to be dropped with
+      <literal>ALTER TABLE DROP CONSTRAINT</>. This is also the case for
+      <literal>UNIQUE</> indexes using constraints. Other indexes can be
+      dropped using <literal>DROP INDEX</>, including invalid toast indexes.
      </para>
     </listitem>
 
@@ -139,6 +152,21 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    </varlistentry>
 
    <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>FORCE</literal></term>
     <listitem>
      <para>
@@ -231,6 +259,127 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    to be reindexed by separate commands.  This is still possible, but
    redundant.
   </para>
+
+
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database.  Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes without locking
+    out writes.  This method is invoked by specifying the
+    option <literal>CONCURRENTLY</> of <command>REINDEX</>.
+    When this option is used, <productname>PostgreSQL</> must perform two
+    scans of the table for each index that needs to be rebuilt and, in
+    addition, it must wait for all existing transactions that could potentially
+    use the index to terminate. This method requires more total work than a
+    standard index rebuild and takes significantly longer to complete as it
+    needs to wait for unfinished transactions that might modify the index.
+    However, since it allows normal operations to continue while the index
+    is rebuilt, this method is useful for rebuilding indexes in a production
+    environment.  Of course, the extra CPU, memory and I/O load imposed by
+    the index rebuild might slow other operations.
+   </para>
+
+   <para>
+    In a concurrent index rebuild, a new index whose storage will replace
+    that of the index to be rebuilt is first entered into the system
+    catalogs in one transaction, then two table scans occur in two more
+    transactions.  Once this is done, the old and new indexes are swapped
+    in a third transaction by exchanging their values of
+    <structname>pg_class</>.<structfield>relfilenode</>.  Note that at the
+    swap phase the new index is still marked invalid, so the swap involves
+    the former index, which is valid, and its new counterpart, which remains
+    invalid.  This prevents the number of valid indexes from doubling if
+    <command>REINDEX CONCURRENTLY</> fails, as this operation cannot be
+    performed on invalid indexes.  Once the swap phase is done, a fourth
+    transaction marks the new index (now carrying the old value of
+    <structname>pg_class</>.<structfield>relfilenode</>) as not ready, for
+    each index rebuilt.  Finally, a fifth transaction drops the index
+    created concurrently, in a way similar to
+    <command>DROP INDEX CONCURRENTLY</>.
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid
+    index and perform <command>REINDEX CONCURRENTLY</> again.  The
+    concurrent index created during the processing has a name ending in the
+    suffix <literal>cct</>.  This also applies to indexes of toast relations.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the
+    same table to occur in parallel, but only one concurrent index build
+    can occur on a table at a time.  In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot. <command>REINDEX DATABASE</> is
+    by default not allowed to run inside a transaction block, so in this case
+    <command>CONCURRENTLY</> is not supported.
+   </para>
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>.  Valid toast indexes cannot be
+    dropped, as each toast relation has exactly one.
+   </para>
+
+   <para>
+    <command>REINDEX DATABASE</command> used with <command>CONCURRENTLY
+    </command> rebuilds only non-system relations concurrently; system
+    relations are rebuilt in a non-concurrent way.  Toast indexes are
+    rebuilt concurrently if the relation they depend on is a non-system
+    relation.
+   </para>
+
+   <para>
+    <command>REINDEX</command> takes an <literal>ACCESS EXCLUSIVE</literal> lock
+    on all the relations involved in the operation. When <command>CONCURRENTLY</command>
+    is specified, a <literal>SHARE UPDATE EXCLUSIVE</literal> lock is used instead.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support <command>CONCURRENTLY
+    </command>.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -262,7 +411,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild the indexes of a table, allowing read and write operations on the
+   involved relations while the rebuild is in progress:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 0275240..22fd7a6 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -43,9 +43,11 @@
 #include "catalog/pg_trigger.h"
 #include "catalog/pg_type.h"
 #include "catalog/storage.h"
+#include "commands/defrem.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -673,6 +675,10 @@ UpdateIndexRelation(Oid indexoid,
  *		will be marked "invalid" and the caller must take additional steps
  *		to fix it up.
  * is_internal: if true, post creation hook for new index
+ * is_reindex: if true, create an index as a duplicate of an existing index
+ *		during a concurrent operation. Such an index can also belong to a
+ *		toast relation. Sufficient locks have already been taken on the
+ *		related relations when this is called during a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -696,7 +702,8 @@ index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal)
+			 bool is_internal,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -739,19 +746,22 @@ index_create(Relation heapRelation,
 
 	/*
 	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * release locks before committing in catalogs. If the index is created during
+	 * a REINDEX CONCURRENTLY operation, sufficient locks are already taken.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemRelation(heapRelation) &&
+		!is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
 
 	/*
-	 * This case is currently not supported, but there's no way to ask for it
-	 * in the grammar anyway, so it can't happen.
+	 * This case is currently only supported during a concurrent index
+	 * rebuild, but there is no way to ask for it in the grammar otherwise
+	 * anyway.
 	 */
-	if (concurrent && is_exclusion)
+	if (concurrent && is_exclusion && !is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -1091,6 +1101,410 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+
+/*
+ * index_concurrent_create
+ *
+ * Create an index based on the given one that will be used for concurrent
+ * operations. The index is inserted into the catalogs and needs to be built
+ * later on. This is called during concurrent index processing. The heap
+ * relation on which the index is based needs to be closed by the caller.
+ */
+Oid
+index_concurrent_create(Relation heapRelation, Oid indOid, char *concurrentName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	List	   *columnNames = NIL;
+	List	   *indexprs = NIL;
+	ListCell   *indexpr_item;
+	int			i;
+	HeapTuple	indexTuple, classTuple;
+	Datum		indclassDatum, colOptionDatum, optionDatum;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	bool		initdeferred = false;
+	Oid			constraintOid = get_index_constraint(indOid);
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* The concurrent index uses the same index information as the former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/*
+	 * Determine whether the index is initially deferred; this depends on
+	 * its owning constraint.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		/* Look for the correct value */
+		HeapTuple			constraintTuple;
+		Form_pg_constraint	constraintForm;
+
+		constraintTuple = SearchSysCache1(CONSTROID,
+									 ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "cache lookup failed for constraint %u",
+				 constraintOid);
+		constraintForm = (Form_pg_constraint) GETSTRUCT(constraintTuple);
+		initdeferred = constraintForm->condeferred;
+
+		ReleaseSysCache(constraintTuple);
+	}
+
+	/* Get the expressions associated with this index, needed to build column names */
+	indexprs = RelationGetIndexExpressions(indexRelation);
+	indexpr_item = list_head(indexprs);
+
+	/* Build the list of column names, necessary for index_create */
+	for (i = 0; i < indexInfo->ii_NumIndexAttrs; i++)
+	{
+		char	   *origname, *curname;
+		char		buf[NAMEDATALEN];
+		AttrNumber	attnum = indexInfo->ii_KeyAttrNumbers[i];
+		int			j;
+
+		/* Pick up column name depending on attribute type */
+		if (attnum > 0)
+		{
+			/*
+			 * This is a column attribute, so simply pick column name from
+			 * relation.
+			 */
+			Form_pg_attribute attform = heapRelation->rd_att->attrs[attnum - 1];
+			origname = pstrdup(NameStr(attform->attname));
+		}
+		else if (attnum < 0)
+		{
+			/* Case of a system attribute */
+			Form_pg_attribute attform = SystemAttributeDefinition(attnum,
+										  heapRelation->rd_rel->relhasoids);
+			origname = pstrdup(NameStr(attform->attname));
+		}
+		else
+		{
+			Node *indnode;
+			/*
+			 * This is the case of an expression, so pick up the expression
+			 * name.
+			 */
+			Assert(indexpr_item != NULL);
+			indnode = (Node *) lfirst(indexpr_item);
+			indexpr_item = lnext(indexpr_item);
+			origname = deparse_expression(indnode,
+							deparse_context_for(RelationGetRelationName(heapRelation),
+												RelationGetRelid(heapRelation)),
+							false, false);
+		}
+
+		/*
+		 * Check if the name picked has any conflict with existing names and
+		 * change it.
+		 */
+		curname = origname;
+		for (j = 1;; j++)
+		{
+			ListCell   *lc2;
+			char		nbuf[32];
+			int			nlen;
+
+			foreach(lc2, columnNames)
+			{
+				if (strcmp(curname, (char *) lfirst(lc2)) == 0)
+					break;
+			}
+			if (lc2 == NULL)
+				break; /* found nonconflicting name */
+
+			sprintf(nbuf, "%d", j);
+
+			/* Ensure generated names are shorter than NAMEDATALEN */
+			nlen = pg_mbcliplen(origname, strlen(origname),
+								NAMEDATALEN - 1 - strlen(nbuf));
+			memcpy(buf, origname, nlen);
+			strcpy(buf + nlen, nbuf);
+			curname = buf;
+		}
+
+		/* Append name to existing list */
+		columnNames = lappend(columnNames, pstrdup(curname));
+	}
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, indOid);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 (const char *) concurrentName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 columnNames,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexRelation->rd_index->indisprimary,
+								 OidIsValid(constraintOid),	/* is constraint? */
+								 !indexRelation->rd_index->indimmediate,	/* is deferrable? */
+								 initdeferred,	/* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false,	/* is_internal */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
+
+
+/*
+ * index_concurrent_build
+ *
+ * Build an index for a concurrent operation. The low-level locks taken here
+ * prevent only schema changes, and they need to be kept until the end of
+ * the transaction performing this operation.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	rel, indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	rel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in the
+	 * commit of the transaction where this concurrent index was created
+	 * at the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(rel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both the relations, but keep the locks */
+	heap_close(rel, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap the old and new indexes in a concurrent context. For the time being
+ * this only switches the relfilenode of the two indexes. If extra operations
+ * become necessary during a concurrent swap, they should be added here. The
+ * relations do not require an exclusive lock thanks to MVCC catalog access
+ * in the relcache.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid)
+{
+	Relation		oldIndexRel, newIndexRel, pg_class;
+	HeapTuple		oldIndexTuple, newIndexTuple;
+	Form_pg_class	oldIndexForm, newIndexForm;
+	Oid				tmpnode;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldIndexRel = relation_open(oldIndexOid, ShareUpdateExclusiveLock);
+	newIndexRel = relation_open(newIndexOid, ShareUpdateExclusiveLock);
+
+	/* Now swap relfilenode of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+	oldIndexForm = (Form_pg_class) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_class) GETSTRUCT(newIndexTuple);
+
+	/* Here is where the actual swap happens */
+	tmpnode = oldIndexForm->relfilenode;
+	oldIndexForm->relfilenode = newIndexForm->relfilenode;
+	newIndexForm->relfilenode = tmpnode;
+
+	/* Then update the tuples for each relation */
+	simple_heap_update(pg_class, &oldIndexTuple->t_self, oldIndexTuple);
+	simple_heap_update(pg_class, &newIndexTuple->t_self, newIndexTuple);
+	CatalogUpdateIndexes(pg_class, oldIndexTuple);
+	CatalogUpdateIndexes(pg_class, newIndexTuple);
+
+	/* Close relations and clean up */
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+	heap_close(pg_class, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldIndexRel, NoLock);
+	relation_close(newIndexRel, NoLock);
+}
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY before
+ * actually dropping the index. After calling this function, the index is
+ * seen as dead by all backends. The low-level locks taken here are kept
+ * until the end of the transaction calling this function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid, LOCKTAG locktag)
+{
+	Relation	heapRelation, indexRelation;
+
+	/*
+	 * Now we must wait until no running transaction could be using the
+	 * index for a query.
+	 *
+	 * Note: the reason we use actual lock acquisition here, rather than
+	 * just checking the ProcArray and sleeping, is that deadlock is
+	 * possible if one of the transactions in question is blocked trying
+	 * to acquire an exclusive lock on our table. The lock code will
+	 * detect deadlock and error out properly.
+	 */
+	WaitForLockers(locktag, AccessExclusiveLock);
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're
+	 * about to stop doing inserts into the index which could show
+	 * conflicts with existing predicate locks, so now is the time to move
+	 * them to the heap relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just
+	 * might have it open for updating it.  So now we can unset indisready
+	 * and indislive, then wait till nobody could be using it at all
+	 * anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit
+	 * all sessions will refresh the table's index list.  Forgetting just
+	 * the index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index as the last step of a concurrent index process.
+ * Deletion has to go through performDeletion, or the dependencies of the
+ * index would not get dropped. At this point the index is already considered
+ * invalid and dead, so it can be dropped without any concurrent option, as
+ * it is certain not to interact with other server sessions.
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid				constraintOid = get_index_constraint(indexOid);
+	ObjectAddress	object;
+	Form_pg_index	indexForm;
+	Relation		pg_index;
+	HeapTuple		indexTuple;
+
+	/*
+	 * Check that the index being dropped is not alive; if it were, it
+	 * might still be in use by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process.
+	 * Register constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object,
+					DROP_RESTRICT,
+					0);
+}
+
+
 /*
  * index_constraint_create
  *
@@ -1443,50 +1857,8 @@ index_drop(Oid indexId, bool concurrent)
 		CommitTransactionCommand();
 		StartTransactionCommand();
 
-		/*
-		 * Now we must wait until no running transaction could be using the
-		 * index for a query. Note we do not need to worry about xacts that
-		 * open the table for reading after this point; they will see the
-		 * index as invalid when they open the relation.
-		 *
-		 * Note: the reason we use actual lock acquisition here, rather than
-		 * just checking the ProcArray and sleeping, is that deadlock is
-		 * possible if one of the transactions in question is blocked trying
-		 * to acquire an exclusive lock on our table.  The lock code will
-		 * detect deadlock and error out properly.
-		 */
-		WaitForLockers(heaplocktag, AccessExclusiveLock);
-
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.	So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId, heaplocktag);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 385d64d..0c2971b 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -281,7 +281,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid, Datum reloptio
 				 rel->rd_rel->reltablespace,
 				 collationObjectId, classObjectId, coloptions, (Datum) 0,
 				 true, false, false, false,
-				 true, false, false, true);
+				 true, false, false, false, false);
 
 	heap_close(toast_rel, NoLock);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index fe72613..c1166c0 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -68,8 +68,9 @@ static void ComputeIndexAttrs(IndexInfo *indexInfo,
 static Oid GetIndexOpClass(List *opclass, Oid attrType,
 				char *accessMethodName, Oid accessMethodId);
 static char *ChooseIndexName(const char *tabname, Oid namespaceId,
-				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint);
+							 List *colnames, List *exclusionOpNames,
+							 bool primary, bool isconstraint,
+							 bool concurrent);
 static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
@@ -391,7 +392,6 @@ DefineIndex(IndexStmt *stmt,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	bool		amcanorder;
@@ -530,7 +530,8 @@ DefineIndex(IndexStmt *stmt,
 											indexColNames,
 											stmt->excludeOpNames,
 											stmt->primary,
-											stmt->isconstraint);
+											stmt->isconstraint,
+											false);
 
 	/*
 	 * look up the access method, verify it can handle the requested features
@@ -677,7 +678,7 @@ DefineIndex(IndexStmt *stmt,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
-					 stmt->concurrent, !check_rights);
+					 stmt->concurrent, !check_rights, false);
 
 	/* Add any requested comment */
 	if (stmt->idxcomment != NULL)
@@ -758,27 +759,15 @@ DefineIndex(IndexStmt *stmt,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/*
 	 * Update the pg_index row to mark the index as ready for inserts. Once we
@@ -872,6 +861,542 @@ DefineIndex(IndexStmt *stmt,
 
 
 /*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for the given relation Oid. The relation can
+ * be either an index or a table. If a table is specified, each step of the
+ * reindexing is performed at once on all of the table's indexes, as well as
+ * on its dependent toast indexes.
+ */
+bool
+ReindexRelationConcurrently(Oid relationOid)
+{
+	List	   *concurrentIndexIds = NIL,
+			   *indexIds = NIL,
+			   *parentRelationIds = NIL,
+			   *lockTags = NIL,
+			   *relationLocks = NIL;
+	ListCell   *lc, *lc2;
+	Snapshot	snapshot;
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given relation
+	 * Oid is a table, all its valid indexes will be rebuilt, including its
+	 * associated toast table indexes. If the relkind is an index, this index
+	 * itself will be rebuilt. The locks taken on the parent relations and
+	 * the involved indexes are kept until this transaction is committed, to
+	 * protect against schema changes that might occur before a session lock
+	 * is taken on each relation.
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes
+				 * including toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc2, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc2);
+					Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+						indexIds = lappend_oid(indexIds, cellOid);
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+												ShareUpdateExclusiveLock);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+							indexIds = lappend_oid(indexIds, cellOid);
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(IndexGetRelation(relationOid, false));
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+					indexIds = list_make1_oid(relationOid);
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process of rebuilding the indexes concurrently. We
+	 * first need to create, for each index, a new index based on the same
+	 * data as the former one; at this stage it is only registered in the
+	 * catalogs and will be built afterwards. All the operations can be
+	 * performed at the same time on all the indexes of a parent relation,
+	 * including the indexes of its toast relation.
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId  *lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent, which might be a toast or regular relation */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a relation name for concurrent index */
+		concurrentName = ChooseIndexName(get_rel_name(indOid),
+										 get_rel_namespace(indexRel->rd_index->indrelid),
+										 NIL,
+										 NIL,
+										 false,
+										 false,
+										 true);
+
+		/* Create concurrent index based on given index */
+		concurrentOid = index_concurrent_create(indexParentRel,
+												indOid,
+												concurrentName);
+
+		/*
+		 * Now open the relation of the concurrent index; a lock is also
+		 * needed on it.
+		 */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the concurrent index Oid */
+		concurrentIndexIds = lappend_oid(concurrentIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelid of each concurrent relation to protect it from
+		 * being dropped, then close the relations. Palloc'd copies are saved
+		 * as the entries outlive this loop iteration. The lockrelid of the
+		 * parent relation is not taken here to avoid taking multiple locks
+		 * on the same relation; instead we rely on parentRelationIds built
+		 * earlier.
+		 */
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the following visibility checks, as other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId  *lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		LOCKTAG	   *heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Add lockrelid of parent relation to the list of locked relations */
+		*lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid->dbId, lockrelid->relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transaction will try
+	 * to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session level lock on the relation, the
+	 * concurrent index and its copy to ensure that none of them is dropped
+	 * until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = * (LockRelId *) lfirst(lc);
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build concurrent indexes in a separate transaction for each index to
+	 * avoid keeping transactions open for an unnecessarily long time. A
+	 * concurrent build is done for each concurrent index that will replace
+	 * the old indexes. Before doing that, we need to wait until no running
+	 * transaction could have the parent table of an index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		bool		primary;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/* Index relation has been closed by previous commit, so reopen it */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		primary = indexRel->rd_index->indisprimary;
+
+		/* Perform concurrent build of new index */
+		index_concurrent_build(indexRel->rd_index->indrelid,
+							   concurrentOid,
+							   primary);
+
+		/* Keep the session lock; release the transaction-level lock */
+		index_close(indexRel, ShareUpdateExclusiveLock);
+
+		/*
+		 * Update the pg_index row of the concurrent index as ready for inserts.
+		 * Once we commit this transaction, any new transactions that open the
+		 * table must insert new entries into the index for insertions and
+		 * non-HOT updates.
+		 */
+		index_set_state_flags(concurrentOid, INDEX_CREATE_SET_READY);
+
+		/* we can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update visible for
+		 * the concurrent index.
+		 */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the concurrent indexes catch up with any INSERTs that
+	 * might have occurred in the parent table.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to avoid keeping a transaction open for
+	 * an unnecessarily long time.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Perform a scan of each concurrent index with the heap, then insert
+	 * any missing index entries.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid				indOid = lfirst_oid(lc);
+		Oid				relOid;
+		TransactionId	limitXmin;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used to validate the
+		 * concurrent indexes.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate index, which might be a toast */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * This concurrent index is now valid as it contains all the necessary
+		 * tuples. However, it might not have taken into account tuples deleted
+		 * before the reference snapshot was taken, so we need to wait for the
+		 * transactions that might have older snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction to make the concurrent index valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated and could be used,
+	 * we need to swap each concurrent index with its corresponding old index.
+	 * Note that the concurrent index used for swapping is not marked as valid
+	 * because we need to keep the former index and the concurrent index with
+	 * different valid statuses to avoid an explosion in the number of indexes
+	 * a parent relation could have if this operation fails multiple times in
+	 * a row for one reason or another.
+	 */
+
+	/* Swap the indexes and mark the indexes that have the old data as invalid */
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/*
+		 * Each index needs to be swapped in a separate transaction, so start
+		 * a new one.
+		 */
+		StartTransactionCommand();
+
+		/* Swap old index and its concurrent */
+		index_concurrent_swap(concurrentOid, indOid);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		relOid = IndexGetRelation(indOid, false);
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/* Commit this transaction and make old index invalidation visible */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * The concurrent indexes now hold the old relfilenodes of the former
+	 * indexes, and need to be marked as dead so that no transactions will
+	 * try to use them. Each operation is performed in a separate transaction.
+	 */
+
+	/* Now mark the concurrent indexes as not ready */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		LOCKTAG	   *heapLockTag = NULL;
+		ListCell   *cell;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Find the locktag of the parent table for this index; we need to
+		 * wait for locks on it.
+		 */
+		foreach(cell, lockTags)
+		{
+			LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+			if (relOid == localTag->locktag_field2)
+				heapLockTag = localTag;
+		}
+		Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+		/*
+		 * Finish the index invalidation and set it as dead. Note that it is
+		 * necessary to wait for virtual locks on the parent relation
+		 * before setting the index as dead.
+		 */
+		index_concurrent_set_dead(relOid, indOid, *heapLockTag);
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the concurrent indexes. This needs to be done through
+	 * performDeletion or related dependencies will not be dropped for the old
+	 * indexes. The internal mechanism of DROP INDEX CONCURRENTLY is not used
+	 * here, as the indexes are already considered dead and invalid, so they
+	 * will not be used by other backends.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid indexOid = lfirst_oid(lc);
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start transaction to drop this index */
+		StartTransactionCommand();
+
+		/* Get fresh snapshot for next step */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/* Drop this concurrent index, which now holds the old data */
+		index_concurrent_drop(indexOid);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * The last thing to do is release the session-level locks on the parent
+	 * tables and their indexes.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = * (LockRelId *) lfirst(lc);
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Start a new transaction to finish process properly */
+	StartTransactionCommand();
+
+	/* Get fresh snapshot for the end of process */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	return true;
+}
+
+
+/*
  * CheckMutability
  *		Test whether given expression is mutable
  */
@@ -1534,7 +2059,8 @@ ChooseRelationName(const char *name1, const char *name2,
 static char *
 ChooseIndexName(const char *tabname, Oid namespaceId,
 				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint)
+				bool primary, bool isconstraint,
+				bool concurrent)
 {
 	char	   *indexname;
 
@@ -1560,6 +2086,13 @@ ChooseIndexName(const char *tabname, Oid namespaceId,
 									   "key",
 									   namespaceId);
 	}
+	else if (concurrent)
+	{
+		indexname = ChooseRelationName(tabname,
+									   NULL,
+									   "cct",
+									   namespaceId);
+	}
 	else
 	{
 		indexname = ChooseRelationName(tabname,
@@ -1672,18 +2205,22 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation)
+ReindexIndex(RangeVar *indexRelation, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
 
-	/* lock level used here should match index lock reindex_index() */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
-									  false, false,
-									  RangeVarCallbackForReindexIndex,
-									  (void *) &heapOid);
+	indOid = RangeVarGetRelidExtended(indexRelation,
+				concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+				concurrent, concurrent,
+				RangeVarCallbackForReindexIndex,
+				(void *) &heapOid);
 
-	reindex_index(indOid, false);
+	/* Continue process for concurrent or non-concurrent case */
+	if (!concurrent)
+		reindex_index(indOid, false);
+	else
+		ReindexRelationConcurrently(indOid);
 
 	return indOid;
 }
@@ -1752,13 +2289,27 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation)
+ReindexTable(RangeVar *relation, bool concurrent)
 {
 	Oid			heapOid;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
-									   RangeVarCallbackOwnsTable, NULL);
+	heapOid = RangeVarGetRelidExtended(relation,
+		concurrent ? ShareUpdateExclusiveLock : ShareLock,
+		concurrent, concurrent,
+		RangeVarCallbackOwnsTable, NULL);
+
+	/* Run through the concurrent process if necessary */
+	if (concurrent)
+	{
+		if (!ReindexRelationConcurrently(heapOid))
+		{
+			ereport(NOTICE,
+					(errmsg("table \"%s\" has no indexes",
+							relation->relname)));
+		}
+		return heapOid;
+	}
 
 	if (!reindex_relation(heapOid,
 						  REINDEX_REL_PROCESS_TOAST |
@@ -1779,7 +2330,10 @@ ReindexTable(RangeVar *relation)
  * That means this must not be called within a user transaction block!
  */
 Oid
-ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
+ReindexDatabase(const char *databaseName,
+				bool do_system,
+				bool do_user,
+				bool concurrent)
 {
 	Relation	relationRelation;
 	HeapScanDesc scan;
@@ -1791,6 +2345,15 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 
 	AssertArg(databaseName);
 
+	/*
+	 * A CONCURRENTLY operation is not allowed on system catalogs, but it is
+	 * allowed on a database.
+	 */
+	if (concurrent && !do_user)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot reindex system concurrently")));
+
 	if (strcmp(databaseName, get_database_name(MyDatabaseId)) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1874,17 +2437,42 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result = false;
+		bool		process_concurrent;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS))
+
+		/* Determine if relation needs to be processed concurrently */
+		process_concurrent = concurrent &&
+			!IsSystemNamespace(get_rel_namespace(relid));
+
+		/*
+		 * Reindex the relation with a concurrent or non-concurrent process.
+		 * System relations, including pg_class, cannot be reindexed
+		 * concurrently, but they still need to be reindexed with a normal
+		 * process, as they could be corrupted and the concurrent process
+		 * might also use them. This does not include toast relations, which
+		 * are reindexed when their parent relation is processed.
+		 */
+		if (process_concurrent)
+		{
+			old = MemoryContextSwitchTo(private_context);
+			result = ReindexRelationConcurrently(relid);
+			MemoryContextSwitchTo(old);
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS);
+
+		if (result)
 			ereport(NOTICE,
-					(errmsg("table \"%s.%s\" was reindexed",
+					(errmsg("table \"%s.%s\" was reindexed%s",
 							get_namespace_name(get_rel_namespace(relid)),
-							get_rel_name(relid))));
+							get_rel_name(relid),
+							process_concurrent ? " concurrently" : "")));
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 	}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 0b31f55..309a716 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -875,6 +875,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	char		relkind;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -910,7 +911,37 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(classform))
+	/*
+	 * Check the case of a system index that might have been invalidated by a
+	 * Check for the case of a system index that might have been invalidated
+	 * by a failed concurrent operation, and allow it to be dropped. For the
+	 * time being, this only concerns indexes of toast relations that became
+	 * invalid during a REINDEX CONCURRENTLY process.
+	if (IsSystemClass(classform) &&
+		relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 39e3b2e..5495f22 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1201,6 +1201,20 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
 	}
 
 	/*
+	 * An invalid index only exists when created in a concurrent context, and
+	 * this code path cannot be taken by CREATE INDEX CONCURRENTLY, as that
+	 * feature is not available for exclusion constraints. Hence this code
+	 * path can only be taken by REINDEX CONCURRENTLY. In this case the same
+	 * index exists in parallel to this one, so we can bypass this check as
+	 * it has already been done on the other index. If exclusion constraints
+	 * are ever supported by CREATE INDEX CONCURRENTLY, this should be
+	 * removed or completed for that purpose.
+	 */
+	if (!index->rd_index->indisvalid)
+		return true;
+
+	/*
 	 * Search the tuples that are in the index for any violations, including
 	 * tuples that aren't visible yet.
 	 */
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 1733da6..af7549f 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3651,6 +3651,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(do_system);
 	COPY_SCALAR_FIELD(do_user);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 7b29812..a699b46 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1858,6 +1858,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(do_system);
 	COMPARE_SCALAR_FIELD(do_user);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 11f6291..a8258f5 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -6833,29 +6833,32 @@ opt_if_exists: IF_P EXISTS						{ $$ = TRUE; }
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_type qualified_name opt_force
+			REINDEX reindex_type opt_concurrently qualified_name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					$$ = (Node *)n;
 				}
-			| REINDEX SYSTEM_P name opt_force
+			| REINDEX SYSTEM_P opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = false;
 					$$ = (Node *)n;
 				}
-			| REINDEX DATABASE name opt_force
+			| REINDEX DATABASE opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = true;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 6a7bf0d..e66c415 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -779,16 +779,20 @@ standard_ProcessUtility(Node *parsetree,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				switch (stmt->kind)
 				{
 					case OBJECT_INDEX:
-						ReindexIndex(stmt->relation);
+						ReindexIndex(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_TABLE:
 					case OBJECT_MATVIEW:
-						ReindexTable(stmt->relation);
+						ReindexTable(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_DATABASE:
 
@@ -800,8 +804,8 @@ standard_ProcessUtility(Node *parsetree,
 						 */
 						PreventTransactionChain(isTopLevel,
 												"REINDEX DATABASE");
-						ReindexDatabase(stmt->name,
-										stmt->do_system, stmt->do_user);
+						ReindexDatabase(stmt->name, stmt->do_system,
+										stmt->do_user, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index e697275..ab45c67 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -60,7 +60,24 @@ extern Oid index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal);
+			 bool is_internal,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create(Relation heapRelation,
+								   Oid indOid,
+								   char *concurrentName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid,
+									  LOCKTAG locktag);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern void index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index f8ceb5d..11d45bb 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -27,10 +27,11 @@ extern Oid DefineIndex(IndexStmt *stmt,
 			bool check_rights,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation);
-extern Oid	ReindexTable(RangeVar *relation);
+extern Oid	ReindexIndex(RangeVar *indexRelation, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, bool concurrent);
 extern Oid ReindexDatabase(const char *databaseName,
-				bool do_system, bool do_user);
+							bool do_system, bool do_user, bool concurrent);
+extern bool ReindexRelationConcurrently(Oid relOid);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 55524b4..dee176d 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2596,6 +2596,7 @@ typedef struct ReindexStmt
 	const char *name;			/* name of database to reindex */
 	bool		do_system;		/* include system tables in database case */
 	bool		do_user;		/* include user tables in database case */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000..9e04169
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 329dbf1..114035b 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -21,4 +21,5 @@ test: delete-abort-savept-2
 test: aborted-keyrevoke
 test: multixact-no-deadlock
 test: drop-index-concurrently-1
+test: reindex-concurrently
 test: timeouts
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000..eb59fe0
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index b7b9203..5a59010 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2757,3 +2757,60 @@ ORDER BY thousand;
         1 |     1001
 (2 rows)
 
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  cannot reindex system concurrently
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+Table "public.concur_reindex_tab"
+ Column |  Type   | Modifiers 
+--------+---------+-----------
+ c1     | integer | not null
+ c2     | text    | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 54f9161..1e1a560 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -917,3 +917,44 @@ ORDER BY thousand;
 SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)
 ORDER BY thousand;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
20131119_index_drop_comments.patchtext/x-patch; charset=US-ASCII; name=20131119_index_drop_comments.patchDownload
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 826e504..41b7866 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1444,9 +1444,11 @@ index_drop(Oid indexId, bool concurrent)
 
 		/*
 		 * Now we must wait until no running transaction could be using the
-		 * index for a query. Note we do not need to worry about xacts that
-		 * open the table for reading after this point; they will see the
-		 * index as invalid when they open the relation.
+		 * index for a query. This is done with AccessExclusiveLock to check
+		 * which running transaction has a lock of any kind on the table.
+		 * Note we do not need to worry about xacts that open the table for
+		 * reading after this point; they will see the index as invalid when
+		 * they open the relation.
 		 *
 		 * Note: the reason we use actual lock acquisition here, rather than
 		 * just checking the ProcArray and sleeping, is that deadlock is
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 2155252..c952bc3 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -651,9 +651,10 @@ DefineIndex(IndexStmt *stmt,
 	 * for an overview of how this works)
 	 *
 	 * Now we must wait until no running transaction could have the table open
-	 * with the old list of indexes. Note we do not need to worry about xacts
-	 * that open the table for writing after this point; they will see the new
-	 * index when they open it.
+	 * with the old list of indexes. This is done with ShareLock to check
+	 * which running transaction holds a lock that permits writing the table.
+	 * Note we do not need to worry about xacts that open the table for
+	 * writing after this point; they will see the new index when they open it.
 	 *
 	 * Note: the reason we use actual lock acquisition here, rather than just
 	 * checking the ProcArray and sleeping, is that deadlock is possible if
#5Andres Freund
andres@2ndquadrant.com
In reply to: Michael Paquier (#4)
Re: REINDEX CONCURRENTLY 2.0

On 2013-11-18 19:52:29 +0900, Michael Paquier wrote:

On Sat, Nov 16, 2013 at 5:09 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2013-11-15 11:40:17 +0900, Michael Paquier wrote:

- 20131114_3_reindex_concurrently.patch, providing the core feature.
Patch 3 needs to have patch 2 applied first. Regression tests,
isolation tests and documentation are included with the patch.

Have you addressed my concurrency concerns from the last version?

I have added documentation in the patch with a better explanation
about why those choices of implementation are made.

The dropping still isn't safe:
After phase 4 we are in the state:
old index: valid, live, !isdead
new index: !valid, live, !isdead
Then you do a index_concurrent_set_dead() from that state on in phase 5.
There you do WaitForLockers(locktag, AccessExclusiveLock) before
index_set_state_flags(INDEX_DROP_SET_DEAD).
That's not sufficient.

Consider what happens with the following sequence:
1) WaitForLockers(locktag, AccessExclusiveLock)
-> GetLockConflicts() => virtualxact 1
-> VirtualXactLock(1)
2) virtualxact 2 starts, opens the *old* index since it's currently the
only valid one.
3) virtualxact 1 finishes
4) index_concurrent_set_dead() does index_set_state_flags(DROP_SET_DEAD)
5) another transaction (vxid 3) starts inserting data into the relation, updates
only the new index, the old index is dead
6) vxid 2 inserts data, updates only the old index. Since it had the
index already open the cache invalidations won't be processed.

Now the indexes are out of sync. There's entries only in the old index
and there's entries only in the new index. Not good.

I hate to repeat myself, but you really need to follow the current
protocol for concurrently dropping indexes. Which involves *first*
marking the index as invalid so it won't be used for querying anymore,
then wait for everyone possibly still seeing that entry to finish, and
only *after* that mark the index as dead. You cannot argue away
correctness concerns with potential deadlocks.

c.f. /messages/by-id/20130926103400.GA2471420@alap2.anarazel.de

I am also still unconvinced that the logic in index_concurrent_swap() is
correct. It very much needs to explain why no backend can see values
that are inconsistent. E.g. what prevents a backend thinking the old and
new indexes have the same relfilenode? MVCC snapshots don't seem to
protect you against that.
I am not sure there's a problem, but there certainly needs to more
comments explaining why there are none.

Something like the following might be possible:

Backend 1: start reindex concurrently, till phase 4
Backend 2: ExecOpenIndices()
-> RelationGetIndexList (that list is consistent due to mvcc snapshots!)
Backend 2: -> index_open(old_index) (old relfilenode)
Backend 1: index_concurrent_swap()
-> CommitTransaction()
-> ProcArrayEndTransaction() (changes visible to others henceforth!)
Backend 2: -> index_open(new_index)
=> no cache invalidations yet, gets the old relfilenode
Backend 2: ExecInsertIndexTuples()
=> updates the same relation twice, corrupt
Backend 1: still in CommitTransaction()
-> AtEOXact_Inval() sends out invalidations

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#1)
Re: REINDEX CONCURRENTLY 2.0

Michael Paquier wrote:

Hi all,

Please find attached updated patches for the support of REINDEX
CONCURRENTLY, renamed 2.0 for the occasion:
- 20131114_1_index_drop_comments.patch, patch that updates some
comments in index_drop. This updates only a couple of comments in
index_drop but has not been committed yet. It should be IMO...

Pushed this one, thanks.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#7Jim Nasby
jim@nasby.net
In reply to: Michael Paquier (#1)
Re: REINDEX CONCURRENTLY 2.0

Sorry for the lateness of this...

On 11/14/13, 8:40 PM, Michael Paquier wrote:

+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated could be used,
+	 * we need to swap each concurrent index with its corresponding old index.
+	 * Note that the concurrent index used for swaping is not marked as valid
+	 * because we need to keep the former index and the concurrent index with
+	 * a different valid status to avoid an implosion in the number of indexes
+	 * a parent relation could have if this operation fails multiple times in
+	 * a row due to a reason or another. Note that we already know thanks to
+	 * validation step that
+	 */
+

Was there supposed to be more to that comment?

In the loop right below it...

+	/* Swap the indexes and mark the indexes that have the old data as invalid */
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
...
+		CacheInvalidateRelcacheByRelid(relOid);

Do we actually need to invalidate the cache on each case? Is it because we're grabbing a new transaction each time through?
--
Jim C. Nasby, Data Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net


#8Michael Paquier
michael.paquier@gmail.com
In reply to: Jim Nasby (#7)
2 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

Hi,

Thanks for your comments.

On Fri, Jan 10, 2014 at 9:59 AM, Jim Nasby <jim@nasby.net> wrote:

Sorry for the lateness of this...

On 11/14/13, 8:40 PM, Michael Paquier wrote:

+       /*
+        * Phase 4 of REINDEX CONCURRENTLY
+        *
+        * Now that the concurrent indexes have been validated could be
used,
+        * we need to swap each concurrent index with its corresponding
old index.
+        * Note that the concurrent index used for swaping is not marked
as valid
+        * because we need to keep the former index and the concurrent
index with
+        * a different valid status to avoid an implosion in the number of
indexes
+        * a parent relation could have if this operation fails multiple
times in
+        * a row due to a reason or another. Note that we already know
thanks to
+        * validation step that
+        */
+

Was there supposed to be more to that comment?

Not really, it seems that this chunk was left over from writing multiple
successive versions of this patch.

In the loop right below it...

+       /* Swap the indexes and mark the indexes that have the old data as
invalid */
+       forboth(lc, indexIds, lc2, concurrentIndexIds)
...
+               CacheInvalidateRelcacheByRelid(relOid);

Do we actually need to invalidate the cache on each case? Is it because
we're grabbing a new transaction each time through?

This is to force a refresh of the cached plans that were using
the old index before the transaction of step 4 began.

I have realigned this patch with latest head (d2458e3)... In case
someone is interested at some point...

Regards,
--
Michael

Attachments:

20140121_0_old_snapshots.patchtext/x-diff; charset=US-ASCII; name=20140121_0_old_snapshots.patchDownload
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 4259c47..ab01484 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -277,6 +277,86 @@ CheckIndexCompatible(Oid oldId,
 }
 
 /*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given xmin
+ * limit, because the index built might not contain tuples deleted just before
+ * the reference snapshot was taken. Obtain a list of VXIDs of such
+ * transactions, and wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int i, n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue; /* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int n_newer_snapshots, j, k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue; /* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
+/*
  * DefineIndex
  *		Creates a new index.
  *
@@ -321,12 +401,9 @@ DefineIndex(IndexStmt *stmt,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	Snapshot	snapshot;
-	int			i;
 
 	/*
 	 * count attributes in index
@@ -767,74 +844,9 @@ DefineIndex(IndexStmt *stmt,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.	But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.	(Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-										 PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots)		/* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
20140121_1_reindex_conc_core.patchtext/x-diff; charset=US-ASCII; name=20140121_1_reindex_conc_core.patchDownload
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 2ca423c..5be15ea 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -864,8 +864,9 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
-         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
-         some forms of <command>ALTER TABLE</command>.
+         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
+         <command>REINDEX CONCURRENTLY</> and some forms of
+         <command>ALTER TABLE</command>.
         </para>
        </listitem>
       </varlistentry>
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 7222665..f4f6333 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -68,9 +68,22 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
       An index build with the <literal>CONCURRENTLY</> option failed, leaving
       an <quote>invalid</> index. Such indexes are useless but it can be
       convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      <command>REINDEX</> will perform a concurrent build if <literal>
+      CONCURRENTLY</> is specified. To rebuild the index without interfering
+      with production you can either drop the index and reissue the
+      <command>CREATE INDEX CONCURRENTLY</> command, or run
+      <command>REINDEX CONCURRENTLY</> directly. Indexes of toast relations
+      can be rebuilt with <command>REINDEX CONCURRENTLY</>.
+     </para>
+    </listitem>
+
+    <listitem>
+     <para>
+      Concurrent indexes based on a <literal>PRIMARY KEY</> or an <literal>
+      EXCLUDE</> constraint need to be dropped with <literal>ALTER TABLE
+      DROP CONSTRAINT</>. This is also the case for <literal>UNIQUE</> indexes
+      backed by constraints. Other indexes can be dropped using <literal>DROP INDEX</>,
+      including invalid toast indexes.
      </para>
     </listitem>
 
@@ -139,6 +152,21 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    </varlistentry>
 
    <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>FORCE</literal></term>
     <listitem>
      <para>
@@ -231,6 +259,127 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    to be reindexed by separate commands.  This is still possible, but
    redundant.
   </para>
+
+
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database.  Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes without locking
+    out writes.  This method is invoked by specifying the
+    option <literal>CONCURRENTLY</> of <command>REINDEX</>.
+    When this option is used, <productname>PostgreSQL</> must perform two
+    scans of the table for each index that needs to be rebuilt, and in
+    addition it must wait for all existing transactions that could potentially
+    use the index to terminate. This method requires more total work than a
+    standard index rebuild and takes significantly longer to complete as it
+    needs to wait for unfinished transactions that might modify the index.
+    However, since it allows normal operations to continue while the index
+    is rebuilt, this method is useful for rebuilding indexes in a production
+    environment.  Of course, the extra CPU, memory and I/O load imposed by
+    the index rebuild might slow other operations.
+   </para>
+
+   <para>
+    In a concurrent index build, a new index whose storage will replace the one
+    to be rebuilt is actually entered into the system catalogs in one
+    transaction, then two table scans occur in two more transactions.  Once
+    this is performed, the old and fresh indexes are swapped in a third
+    transaction by exchanging their values of
+    <structname>pg_class</>.<structfield>relfilenode</>.  Note that at swap
+    phase, the concurrent index is kept as invalid so swap is done using the
+    former index, which is valid, and its concurrent index, remaining in
+    invalid state. This processing prevents cases where the number of valid
+    indexes would double in the case of a failure of
+    <command>REINDEX CONCURRENTLY</> as this operation cannot be performed
+    on invalid indexes. Once the swap phase is done, process begins a fourth
+    transaction that is used to mark the concurrent index (now having the
+    old value of <structname>pg_class</>.<structfield>relfilenode</>) as
+    not ready for each index rebuilt. Finally a fifth transaction is done
+    to drop the index that has been concurrently created in a way similar to
+    <command>DROP INDEX CONCURRENTLY</>.
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the concurrent
+    index and try <command>REINDEX CONCURRENTLY</> again.
+    The concurrent index created during the processing has a name ending with
+    the suffix cct. This also works for indexes of toast relations.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the
+    same table to occur in parallel, but only one concurrent index build
+    can occur on a table at a time.  In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot. <command>REINDEX DATABASE</> is
+    by default not allowed to run inside a transaction block, so in this case
+    <command>CONCURRENTLY</> is not supported.
+   </para>
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>. Valid indexes, being unique
+    for a given toast relation, cannot be dropped.
+   </para>
+
+   <para>
+    <command>REINDEX DATABASE</command> used with <command>CONCURRENTLY
+    </command> rebuilds concurrently only the non-system relations. System
+    relations are rebuilt with a non-concurrent context. Toast indexes are
+    rebuilt concurrently if the relation they depend on is a non-system
+    relation.
+   </para>
+
+   <para>
+    <command>REINDEX</command> takes an <literal>ACCESS EXCLUSIVE</literal> lock
+    on all the relations involved during the operation. When <command>CONCURRENTLY</command>
+    is specified, the operation uses <literal>SHARE UPDATE EXCLUSIVE</literal> instead.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support <command>CONCURRENTLY
+    </command>.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -262,7 +411,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild the indexes of a table while allowing read and write operations
+   on the involved relations:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8eae43d..97639b8 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -44,9 +44,11 @@
 #include "catalog/pg_trigger.h"
 #include "catalog/pg_type.h"
 #include "catalog/storage.h"
+#include "commands/defrem.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -674,6 +676,10 @@ UpdateIndexRelation(Oid indexoid,
  *		will be marked "invalid" and the caller must take additional steps
  *		to fix it up.
  * is_internal: if true, post creation hook for new index
+ * is_reindex: if true, create an index that is used as a duplicate of an
+ *		existing index created during a concurrent operation. This index can
+ *		also be a toast relation. Sufficient locks are normally taken on
+ *		the related relations once this is called during a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -697,7 +703,8 @@ index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal)
+			 bool is_internal,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -740,19 +747,22 @@ index_create(Relation heapRelation,
 
 	/*
 	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * release locks before committing in catalogs. If the index is created during
+	 * a REINDEX CONCURRENTLY operation, sufficient locks are already taken.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemRelation(heapRelation) &&
+		!is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
 
 	/*
-	 * This case is currently not supported, but there's no way to ask for it
-	 * in the grammar anyway, so it can't happen.
+	 * This case is currently supported only during a concurrent index
+	 * rebuild, and there is no way to ask for it in the grammar otherwise
+	 * anyway.
 	 */
-	if (concurrent && is_exclusion)
+	if (concurrent && is_exclusion && !is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -1092,6 +1102,414 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+
+/*
+ * index_concurrent_create
+ *
+ * Create an index based on the given one that will be used for concurrent
+ * operations. The index is only inserted into the catalogs and needs to be
+ * built later on. This is called during concurrent index processing. The
+ * heap relation on which the index is based needs to be closed by the
+ * caller.
+ */
+Oid
+index_concurrent_create(Relation heapRelation, Oid indOid, char *concurrentName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	List	   *columnNames = NIL;
+	List	   *indexprs = NIL;
+	ListCell   *indexpr_item;
+	int			i;
+	HeapTuple	indexTuple, classTuple;
+	Datum		indclassDatum, colOptionDatum, optionDatum;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	bool		initdeferred = false;
+	Oid			constraintOid = get_index_constraint(indOid);
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* Concurrent index uses the same index information as former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/*
+	 * Determine whether the index is initially deferred; this depends on
+	 * its owning constraint.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		/* Look for the correct value */
+		HeapTuple			constraintTuple;
+		Form_pg_constraint	constraintForm;
+
+		constraintTuple = SearchSysCache1(CONSTROID,
+									 ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "cache lookup failed for constraint %u",
+				 constraintOid);
+		constraintForm = (Form_pg_constraint) GETSTRUCT(constraintTuple);
+		initdeferred = constraintForm->condeferred;
+
+		ReleaseSysCache(constraintTuple);
+	}
+
+	/* Get the expressions associated with this index to build the column names */
+	indexprs = RelationGetIndexExpressions(indexRelation);
+	indexpr_item = list_head(indexprs);
+
+	/* Build the list of column names, necessary for index_create */
+	for (i = 0; i < indexInfo->ii_NumIndexAttrs; i++)
+	{
+		char	   *origname, *curname;
+		char		buf[NAMEDATALEN];
+		AttrNumber	attnum = indexInfo->ii_KeyAttrNumbers[i];
+		int			j;
+
+		/* Pick up column name depending on attribute type */
+		if (attnum > 0)
+		{
+			/*
+			 * This is a column attribute, so simply pick column name from
+			 * relation.
+			 */
+			Form_pg_attribute attform = heapRelation->rd_att->attrs[attnum - 1];
+			origname = pstrdup(NameStr(attform->attname));
+		}
+		else if (attnum < 0)
+		{
+			/* Case of a system attribute */
+			Form_pg_attribute attform = SystemAttributeDefinition(attnum,
+										  heapRelation->rd_rel->relhasoids);
+			origname = pstrdup(NameStr(attform->attname));
+		}
+		else
+		{
+			Node *indnode;
+			/*
+			 * This is the case of an expression, so pick up the expression
+			 * name.
+			 */
+			Assert(indexpr_item != NULL);
+			indnode = (Node *) lfirst(indexpr_item);
+			indexpr_item = lnext(indexpr_item);
+			origname = deparse_expression(indnode,
+							deparse_context_for(RelationGetRelationName(heapRelation),
+												RelationGetRelid(heapRelation)),
+							false, false);
+		}
+
+		/*
+		 * Check if the name picked has any conflict with existing names and
+		 * change it.
+		 */
+		curname = origname;
+		for (j = 1;; j++)
+		{
+			ListCell   *lc2;
+			char		nbuf[32];
+			int			nlen;
+
+			foreach(lc2, columnNames)
+			{
+				if (strcmp(curname, (char *) lfirst(lc2)) == 0)
+					break;
+			}
+			if (lc2 == NULL)
+				break; /* found nonconflicting name */
+
+			sprintf(nbuf, "%d", j);
+
+			/* Ensure generated names are shorter than NAMEDATALEN */
+			nlen = pg_mbcliplen(origname, strlen(origname),
+								NAMEDATALEN - 1 - strlen(nbuf));
+			memcpy(buf, origname, nlen);
+			strcpy(buf + nlen, nbuf);
+			curname = buf;
+		}
+
+		/* Append name to existing list */
+		columnNames = lappend(columnNames, pstrdup(curname));
+	}
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 (const char *) concurrentName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 columnNames,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexRelation->rd_index->indisprimary,
+								 OidIsValid(constraintOid),	/* is constraint? */
+								 !indexRelation->rd_index->indimmediate,	/* is deferrable? */
+								 initdeferred,	/* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false,	/* is_internal */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
+
+
+/*
+ * index_concurrent_build
+ *
+ * Build an index for a concurrent operation. The low-level locks taken here
+ * prevent only schema changes, and they need to be kept until the end of
+ * the transaction performing this operation.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	rel, indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	rel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in
+	 * commit of transaction where this concurrent index was created
+	 * at the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(rel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both the relations, but keep the locks */
+	heap_close(rel, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap an old index and its new version in a concurrent context. For the
+ * time being this simply switches the relfilenode of the two indexes. If
+ * extra operations become necessary during a concurrent swap, they should
+ * be added here. Thanks to MVCC catalog access, the relations do not
+ * require an exclusive lock.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid)
+{
+	Relation		oldIndexRel, newIndexRel, pg_class;
+	HeapTuple		oldIndexTuple, newIndexTuple;
+	Form_pg_class	oldIndexForm, newIndexForm;
+	Oid				tmpnode;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldIndexRel = relation_open(oldIndexOid, ShareUpdateExclusiveLock);
+	newIndexRel = relation_open(newIndexOid, ShareUpdateExclusiveLock);
+
+	/* Now swap relfilenode of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+	oldIndexForm = (Form_pg_class) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_class) GETSTRUCT(newIndexTuple);
+
+	/* Here is where the actual swap happens */
+	tmpnode = oldIndexForm->relfilenode;
+	oldIndexForm->relfilenode = newIndexForm->relfilenode;
+	newIndexForm->relfilenode = tmpnode;
+
+	/* Then update the tuples for each relation */
+	simple_heap_update(pg_class, &oldIndexTuple->t_self, oldIndexTuple);
+	simple_heap_update(pg_class, &newIndexTuple->t_self, newIndexTuple);
+	CatalogUpdateIndexes(pg_class, oldIndexTuple);
+	CatalogUpdateIndexes(pg_class, newIndexTuple);
+
+	/* Close relations and clean up */
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+	heap_close(pg_class, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldIndexRel, NoLock);
+	relation_close(newIndexRel, NoLock);
+}
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY before
+ * actually dropping the index. After calling this function the index is
+ * seen by all the backends as dead. The low-level locks taken here are
+ * kept until the end of the transaction calling this function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid, LOCKTAG locktag)
+{
+	Relation	heapRelation, indexRelation;
+
+	/*
+	 * Now we must wait until no running transaction could be using the
+	 * index for a query.  Use AccessExclusiveLock here to check for
+	 * running transactions that hold locks of any kind on the table.
+	 * Note we do not need to worry about xacts that open the table for
+	 * reading after this point; they will see the index as invalid when
+	 * they open the relation.
+	 *
+	 * Note: the reason we use actual lock acquisition here, rather than
+	 * just checking the ProcArray and sleeping, is that deadlock is
+	 * possible if one of the transactions in question is blocked trying
+	 * to acquire an exclusive lock on our table. The lock code will
+	 * detect deadlock and error out properly.
+	 */
+	WaitForLockers(locktag, AccessExclusiveLock);
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're
+	 * about to stop doing inserts into the index which could show
+	 * conflicts with existing predicate locks, so now is the time to move
+	 * them to the heap relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just
+	 * might have it open for updating it.  So now we can unset indisready
+	 * and indislive, then wait till nobody could be using it at all
+	 * anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit
+	 * all sessions will refresh the table's index list.  Forgetting just
+	 * the index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index concurrently as the last step of concurrent index
+ * processing. Deletion is done through performDeletion, as otherwise the
+ * dependencies of the index would not get dropped. At this point all the
+ * indexes are already considered invalid and dead, so they can be dropped
+ * without any concurrent options because it is certain that they will not
+ * interact with other server sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid				constraintOid = get_index_constraint(indexOid);
+	ObjectAddress	object;
+	Form_pg_index	indexForm;
+	Relation		pg_index;
+	HeapTuple		indexTuple;
+
+	/*
+	 * Check that the index being dropped is not alive; if it were, it
+	 * might still be in use by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process.
+	 * Register constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object,
+					DROP_RESTRICT,
+					0);
+}
+
+
 /*
  * index_constraint_create
  *
@@ -1444,52 +1862,8 @@ index_drop(Oid indexId, bool concurrent)
 		CommitTransactionCommand();
 		StartTransactionCommand();
 
-		/*
-		 * Now we must wait until no running transaction could be using the
-		 * index for a query.  Use AccessExclusiveLock here to check for
-		 * running transactions that hold locks of any kind on the table.
-		 * Note we do not need to worry about xacts that open the table for
-		 * reading after this point; they will see the index as invalid when
-		 * they open the relation.
-		 *
-		 * Note: the reason we use actual lock acquisition here, rather than
-		 * just checking the ProcArray and sleeping, is that deadlock is
-		 * possible if one of the transactions in question is blocked trying
-		 * to acquire an exclusive lock on our table.  The lock code will
-		 * detect deadlock and error out properly.
-		 */
-		WaitForLockers(heaplocktag, AccessExclusiveLock);
-
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.	So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId, heaplocktag);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index f044280..a00b474 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -280,7 +280,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid, Datum reloptio
 				 rel->rd_rel->reltablespace,
 				 collationObjectId, classObjectId, coloptions, (Datum) 0,
 				 true, false, false, false,
-				 true, false, false, true);
+				 true, false, false, true, false);
 
 	heap_close(toast_rel, NoLock);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index ab01484..0299c34 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -68,8 +68,9 @@ static void ComputeIndexAttrs(IndexInfo *indexInfo,
 static Oid GetIndexOpClass(List *opclass, Oid attrType,
 				char *accessMethodName, Oid accessMethodId);
 static char *ChooseIndexName(const char *tabname, Oid namespaceId,
-				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint);
+							 List *colnames, List *exclusionOpNames,
+							 bool primary, bool isconstraint,
+							 bool concurrent);
 static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
@@ -391,7 +392,6 @@ DefineIndex(IndexStmt *stmt,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	bool		amcanorder;
@@ -530,7 +530,8 @@ DefineIndex(IndexStmt *stmt,
 											indexColNames,
 											stmt->excludeOpNames,
 											stmt->primary,
-											stmt->isconstraint);
+											stmt->isconstraint,
+											false);
 
 	/*
 	 * look up the access method, verify it can handle the requested features
@@ -677,7 +678,7 @@ DefineIndex(IndexStmt *stmt,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
-					 stmt->concurrent, !check_rights);
+					 stmt->concurrent, !check_rights, false);
 
 	/* Add any requested comment */
 	if (stmt->idxcomment != NULL)
@@ -759,27 +760,15 @@ DefineIndex(IndexStmt *stmt,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/*
 	 * Update the pg_index row to mark the index as ready for inserts. Once we
@@ -873,6 +862,541 @@ DefineIndex(IndexStmt *stmt,
 
 
 /*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each reindexing step
+ * is done in parallel with all the table's indexes as well as its dependent
+ * toast indexes.
+ */
+bool
+ReindexRelationConcurrently(Oid relationOid)
+{
+	List	   *concurrentIndexIds = NIL,
+			   *indexIds = NIL,
+			   *parentRelationIds = NIL,
+			   *lockTags = NIL,
+			   *relationLocks = NIL;
+	ListCell   *lc, *lc2;
+	Snapshot	snapshot;
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given Oid is
+	 * a table, all its valid indexes will be rebuilt, including the indexes
+	 * of its associated toast table. If the relkind is an index, the index
+	 * itself will be rebuilt. The locks taken on the parent relations and
+	 * the involved indexes are kept until this transaction is committed, to
+	 * protect against schema changes that might occur before the session
+	 * lock is taken on each relation.
+	 */
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes
+				 * including toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc2, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc2);
+					Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+						indexIds = lappend_oid(indexIds, cellOid);
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+												ShareUpdateExclusiveLock);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+							indexIds = lappend_oid(indexIds, cellOid);
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(IndexGetRelation(relationOid, false));
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+					indexIds = list_make1_oid(relationOid);
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process of rebuilding the indexes concurrently. We
+	 * first need to create, for each index, a new index based on the same
+	 * data; it is only registered in the catalogs and will be built later.
+	 * It is possible to perform all these operations at the same time for
+	 * all the indexes of a parent relation, including the indexes of its
+	 * toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId	lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent relation, which might be a plain or toast relation */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a relation name for concurrent index */
+		concurrentName = ChooseIndexName(get_rel_name(indOid),
+										 get_rel_namespace(indexRel->rd_index->indrelid),
+										 NULL,
+										 false,
+										 false,
+										 false,
+										 true);
+
+		/* Create concurrent index based on given index */
+		concurrentOid = index_concurrent_create(indexParentRel,
+												indOid,
+												concurrentName);
+
+		/*
+		 * Now open the relation of the concurrent index; a lock is also
+		 * needed on it.
+		 */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the concurrent index Oid */
+		concurrentIndexIds = lappend_oid(concurrentIndexIds, concurrentOid);
+
+		/*
+		 * Save copies of the lockrelid entries to protect each concurrent
+		 * relation from being dropped, then close the relations. The
+		 * lockrelid of the parent relation is not saved here to avoid
+		 * taking multiple locks on the same relation; instead we rely on
+		 * parentRelationIds built earlier. The copies are palloc'd, as
+		 * storing the address of the local variable would leave dangling
+		 * pointers in the list once this loop iteration ends.
+		 */
+		lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+		lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap lock of each parent relation; the lock tags are used
+	 * later to wait for backends that might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId	lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		LOCKTAG		*heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Add a palloc'd copy of the parent relation's lockrelid to the list */
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid.dbId, lockrelid.relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transactions will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on the relation, the
+	 * concurrent index and its copy to ensure that none of them are dropped
+	 * until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = * (LockRelId *) lfirst(lc);
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the concurrent indexes in a separate transaction for each index
+	 * to avoid having transactions open for an unnecessarily long time. A
+	 * concurrent build is done for each concurrent index that will replace
+	 * an old index. Before doing that, we need to wait until no running
+	 * transaction could have the parent table of an index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			heapOid;
+		bool		primary;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Index relation has been closed by previous commit, so reopen it
+		 * and save what we need from it before closing it again.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		heapOid = indexRel->rd_index->indrelid;
+		primary = indexRel->rd_index->indisprimary;
+		index_close(indexRel, ShareUpdateExclusiveLock);
+
+		/* Perform concurrent build of new index */
+		index_concurrent_build(heapOid, concurrentOid, primary);
+
+		/*
+		 * Update the pg_index row of the concurrent index as ready for inserts.
+		 * Once we commit this transaction, any new transactions that open the
+		 * table must insert new entries into the index for insertions and
+		 * non-HOT updates.
+		 */
+		index_set_state_flags(concurrentOid, INDEX_CREATE_SET_READY);
+
+		/* we can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update of the
+		 * concurrent index visible.
+		 */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the concurrent indexes catch up with any INSERTs
+	 * that might have occurred in the parent table in the meantime.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to avoid keeping a transaction open
+	 * for an unnecessarily long time.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Perform a scan of each concurrent index with the heap, then insert
+	 * any missing index entries.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid				indOid = lfirst_oid(lc);
+		Oid				relOid;
+		TransactionId	limitXmin;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for validation of
+		 * the concurrent index.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate the index, which might be a toast index */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save its xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * The concurrent index is now valid in the sense that it contains
+		 * all currently interesting tuples. However, it might not contain
+		 * tuples deleted just before the reference snapshot was taken, so
+		 * we have to wait out any transactions that might have older
+		 * snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction to make the concurrent index valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated and can be used,
+	 * we need to swap each concurrent index with its corresponding old index.
+	 * Note that the concurrent index used for swapping is not marked as
+	 * valid because we need to keep the former index and the concurrent
+	 * index with different valid statuses to avoid an explosion in the
+	 * number of indexes a parent relation could have if this operation
+	 * fails multiple times in a row for one reason or another.
+	 */
+
+	/* Swap the indexes and mark the indexes that have the old data as invalid */
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/*
+		 * Each index needs to be swapped in a separate transaction, so start
+		 * a new one.
+		 */
+		StartTransactionCommand();
+
+		/* Swap old index and its concurrent */
+		index_concurrent_swap(concurrentOid, indOid);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		relOid = IndexGetRelation(indOid, false);
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/* Commit this transaction and make old index invalidation visible */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * The concurrent indexes now hold the old relfilenodes of the former
+	 * indexes, so mark them as dead and wait out any transactions that
+	 * might still be using them. Each operation is performed in a separate
+	 * transaction.
+	 */
+
+	/* Now mark the concurrent indexes as not ready */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		LOCKTAG	   *heapLockTag = NULL;
+		ListCell   *cell;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Find the locktag of the parent table of this index, as we need
+		 * to wait for locks on it.
+		 */
+		foreach(cell, lockTags)
+		{
+			LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+			if (relOid == localTag->locktag_field2)
+				heapLockTag = localTag;
+		}
+		Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+		/*
+		 * Finish the index invalidation and set it as dead. Note that it is
+		 * necessary to wait for virtual locks on the parent relation before
+		 * setting the index as dead.
+		 */
+		index_concurrent_set_dead(relOid, indOid, *heapLockTag);
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the concurrent indexes. This needs to be done through
+	 * performDeletion or related dependencies will not be dropped for the old
+	 * indexes. The internal mechanism of DROP INDEX CONCURRENTLY is not used
+	 * as here the indexes are already considered as dead and invalid, so they
+	 * will not be used by other backends.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid indexOid = lfirst_oid(lc);
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start transaction to drop this index */
+		StartTransactionCommand();
+
+		/* Get fresh snapshot for next step */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Drop the concurrent index. performDeletion takes care of dropping
+		 * the dependencies attached to it as well.
+		 */
+		index_concurrent_drop(indexOid);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Last thing to do is release the session-level locks on the parent
+	 * table and on its indexes.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = * (LockRelId *) lfirst(lc);
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Start a new transaction to finish the process properly */
+	StartTransactionCommand();
+
+	/* Get a fresh snapshot for the end of the process */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	return true;
+}
+
+
+/*
  * CheckMutability
  *		Test whether given expression is mutable
  */
@@ -1535,7 +2059,8 @@ ChooseRelationName(const char *name1, const char *name2,
 static char *
 ChooseIndexName(const char *tabname, Oid namespaceId,
 				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint)
+				bool primary, bool isconstraint,
+				bool concurrent)
 {
 	char	   *indexname;
 
@@ -1561,6 +2086,13 @@ ChooseIndexName(const char *tabname, Oid namespaceId,
 									   "key",
 									   namespaceId);
 	}
+	else if (concurrent)
+	{
+		indexname = ChooseRelationName(tabname,
+									   NULL,
+									   "cct",
+									   namespaceId);
+	}
 	else
 	{
 		indexname = ChooseRelationName(tabname,
@@ -1673,18 +2205,22 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation)
+ReindexIndex(RangeVar *indexRelation, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
 
-	/* lock level used here should match index lock reindex_index() */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
-									  false, false,
-									  RangeVarCallbackForReindexIndex,
-									  (void *) &heapOid);
+	indOid = RangeVarGetRelidExtended(indexRelation,
+				concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+				concurrent, concurrent,
+				RangeVarCallbackForReindexIndex,
+				(void *) &heapOid);
 
-	reindex_index(indOid, false);
+	/* Continue process for concurrent or non-concurrent case */
+	if (!concurrent)
+		reindex_index(indOid, false);
+	else
+		ReindexRelationConcurrently(indOid);
 
 	return indOid;
 }
@@ -1753,13 +2289,27 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation)
+ReindexTable(RangeVar *relation, bool concurrent)
 {
 	Oid			heapOid;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
-									   RangeVarCallbackOwnsTable, NULL);
+	heapOid = RangeVarGetRelidExtended(relation,
+		concurrent ? ShareUpdateExclusiveLock : ShareLock,
+		concurrent, concurrent,
+		RangeVarCallbackOwnsTable, NULL);
+
+	/* Run through the concurrent process if necessary */
+	if (concurrent)
+	{
+		if (!ReindexRelationConcurrently(heapOid))
+		{
+			ereport(NOTICE,
+					(errmsg("table \"%s\" has no indexes",
+							relation->relname)));
+		}
+		return heapOid;
+	}
 
 	if (!reindex_relation(heapOid,
 						  REINDEX_REL_PROCESS_TOAST |
@@ -1780,7 +2330,10 @@ ReindexTable(RangeVar *relation)
  * That means this must not be called within a user transaction block!
  */
 Oid
-ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
+ReindexDatabase(const char *databaseName,
+				bool do_system,
+				bool do_user,
+				bool concurrent)
 {
 	Relation	relationRelation;
 	HeapScanDesc scan;
@@ -1792,6 +2345,15 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 
 	AssertArg(databaseName);
 
+	/*
+	 * A CONCURRENTLY operation is not allowed on system catalogs, but it
+	 * is on a database.
+	 */
+	if (concurrent && !do_user)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot reindex system concurrently")));
+
 	if (strcmp(databaseName, get_database_name(MyDatabaseId)) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1876,17 +2438,42 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result = false;
+		bool		process_concurrent;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS))
+
+		/* Determine if relation needs to be processed concurrently */
+		process_concurrent = concurrent &&
+			!IsSystemNamespace(get_rel_namespace(relid));
+
+		/*
+		 * Reindex relation with a concurrent or non-concurrent process.
+		 * System relations cannot be reindexed concurrently, but they still
+		 * need to be reindexed, including pg_class, with the normal process
+		 * as they could be corrupted and the concurrent process itself might
+		 * use them. This does not include toast relations, which are
+		 * reindexed when their parent relation is processed.
+		 */
+		if (process_concurrent)
+		{
+			old = MemoryContextSwitchTo(private_context);
+			result = ReindexRelationConcurrently(relid);
+			MemoryContextSwitchTo(old);
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS);
+
+		if (result)
 			ereport(NOTICE,
-					(errmsg("table \"%s.%s\" was reindexed",
+					(errmsg("table \"%s.%s\" was reindexed%s",
 							get_namespace_name(get_rel_namespace(relid)),
-							get_rel_name(relid))));
+							get_rel_name(relid),
+							process_concurrent ? " concurrently" : "")));
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 	}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 26a4613..85d18a1 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -875,6 +875,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	char		relkind;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -910,7 +911,39 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check for a system index that might have been invalidated by a failed
+	 * concurrent operation, and allow it to be dropped. For the time being,
+	 * this only concerns indexes of toast relations that became invalid
+	 * during a REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) &&
+		relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index &&
+		!allowSystemTableMods &&
+		IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 46895b2..04ddecf 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1201,6 +1201,20 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
 	}
 
 	/*
+	 * An invalid index can only exist if it was created in a concurrent
+	 * context, and this code path cannot be taken by CREATE INDEX
+	 * CONCURRENTLY as that feature is not available for exclusion
+	 * constraints, so only REINDEX CONCURRENTLY can reach this point. In
+	 * that case the same index exists in parallel to this one, so we can
+	 * bypass this check as it has already been done on the other index.
+	 * If exclusion constraints become supported by CREATE INDEX
+	 * CONCURRENTLY in the future, this will need to be revisited.
+	 */
+	if (!index->rd_index->indisvalid)
+		return true;
+
+	/*
 	 * Search the tuples that are in the index for any violations, including
 	 * tuples that aren't visible yet.
 	 */
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index bb356d0..5b0fd08 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3688,6 +3688,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(do_system);
 	COPY_SCALAR_FIELD(do_user);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 5908d9a..a2bb38c 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1881,6 +1881,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(do_system);
 	COMPARE_SCALAR_FIELD(do_user);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 12a6beb..26b4121 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -6892,29 +6892,32 @@ opt_if_exists: IF_P EXISTS						{ $$ = TRUE; }
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_type qualified_name opt_force
+			REINDEX reindex_type opt_concurrently qualified_name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					$$ = (Node *)n;
 				}
-			| REINDEX SYSTEM_P name opt_force
+			| REINDEX SYSTEM_P opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = false;
 					$$ = (Node *)n;
 				}
-			| REINDEX DATABASE name opt_force
+			| REINDEX DATABASE opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = true;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index f4d25bd..9d68ed0 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -790,16 +790,20 @@ standard_ProcessUtility(Node *parsetree,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				switch (stmt->kind)
 				{
 					case OBJECT_INDEX:
-						ReindexIndex(stmt->relation);
+						ReindexIndex(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_TABLE:
 					case OBJECT_MATVIEW:
-						ReindexTable(stmt->relation);
+						ReindexTable(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_DATABASE:
 
@@ -811,8 +815,8 @@ standard_ProcessUtility(Node *parsetree,
 						 */
 						PreventTransactionChain(isTopLevel,
 												"REINDEX DATABASE");
-						ReindexDatabase(stmt->name,
-										stmt->do_system, stmt->do_user);
+						ReindexDatabase(stmt->name, stmt->do_system,
+										stmt->do_user, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 006b180..b5a528f 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -60,7 +60,24 @@ extern Oid index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal);
+			 bool is_internal,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create(Relation heapRelation,
+								   Oid indOid,
+								   char *concurrentName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid,
+									  LOCKTAG locktag);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern void index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 0b1b2b7..bc5f804 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -27,10 +27,11 @@ extern Oid DefineIndex(IndexStmt *stmt,
 			bool check_rights,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation);
-extern Oid	ReindexTable(RangeVar *relation);
+extern Oid	ReindexIndex(RangeVar *indexRelation, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, bool concurrent);
 extern Oid ReindexDatabase(const char *databaseName,
-				bool do_system, bool do_user);
+							bool do_system, bool do_user, bool concurrent);
+extern bool ReindexRelationConcurrently(Oid relOid);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 846c31a..808ca41 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2648,6 +2648,7 @@ typedef struct ReindexStmt
 	const char *name;			/* name of database to reindex */
 	bool		do_system;		/* include system tables in database case */
 	bool		do_user;		/* include user tables in database case */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000..9e04169
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 1e73b4a..479a0ca 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -23,4 +23,5 @@ test: multixact-no-deadlock
 test: multixact-no-forget
 test: propagate-lock-delete
 test: drop-index-concurrently-1
+test: reindex-concurrently
 test: timeouts
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000..eb59fe0
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index d10253b..0a58073 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2757,3 +2757,60 @@ ORDER BY thousand;
         1 |     1001
 (2 rows)
 
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  cannot reindex system concurrently
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+Table "public.concur_reindex_tab"
+ Column |  Type   | Modifiers 
+--------+---------+-----------
+ c1     | integer | not null
+ c2     | text    | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 8ac1d1d..738388c 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -917,3 +917,44 @@ ORDER BY thousand;
 SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)
 ORDER BY thousand;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
#9Michael Paquier
michael.paquier@gmail.com
In reply to: Michael Paquier (#8)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On Tue, Jan 21, 2014 at 10:12 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

I have realigned this patch with latest head (d2458e3)... In case
someone is interested at some point...

Attached is a patch for REINDEX CONCURRENTLY rebased on HEAD
(d7938a4), as some people are showing interest in it by reading recent
discussions. The patch compiles and passes the regression tests as well
as the isolation tests.
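
For those who want to give it a quick spin, a minimal session mirroring
the regression tests included in the patch looks like this (table and
index names are only illustrative):

```sql
-- Assumes a server built with this patch applied.
CREATE TABLE concur_reindex_tab (c1 int, c2 text);
CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab (c1);

-- Rebuild a single index, then all indexes of the table, without
-- taking an AccessExclusiveLock on concur_reindex_tab.
REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
REINDEX TABLE CONCURRENTLY concur_reindex_tab;

-- These are rejected: no transaction blocks, no catalog relations.
BEGIN;
REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- ERROR
COMMIT;
REINDEX TABLE CONCURRENTLY pg_class; -- ERROR
```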
--
Michael

Attachments:

20140827_Support-for-REINDEX-CONCURRENTLY.patchtext/x-patch; charset=US-ASCII; name=20140827_Support-for-REINDEX-CONCURRENTLY.patchDownload
From 944bc6cc2998ebac1cab11a91f55120a3452f3ed Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Wed, 27 Aug 2014 10:56:02 +0900
Subject: [PATCH] Support for REINDEX CONCURRENTLY

Fully supported patch, with regression and isolation tests, including
documentation. This version uses a low-level lock when swapping relations
and still suffers from deadlock problems when checking for old snapshots.
---
 doc/src/sgml/mvcc.sgml                             |   4 +-
 doc/src/sgml/ref/reindex.sgml                      | 169 ++++-
 src/backend/catalog/index.c                        | 478 ++++++++++--
 src/backend/catalog/toasting.c                     |   2 +-
 src/backend/commands/indexcmds.c                   | 820 ++++++++++++++++++---
 src/backend/commands/tablecmds.c                   |  33 +-
 src/backend/executor/execUtils.c                   |  14 +
 src/backend/nodes/copyfuncs.c                      |   1 +
 src/backend/nodes/equalfuncs.c                     |   1 +
 src/backend/parser/gram.y                          |  15 +-
 src/backend/tcop/utility.c                         |  12 +-
 src/include/catalog/index.h                        |  19 +-
 src/include/commands/defrem.h                      |   7 +-
 src/include/nodes/parsenodes.h                     |   1 +
 .../isolation/expected/reindex-concurrently.out    |  78 ++
 src/test/isolation/isolation_schedule              |   1 +
 src/test/isolation/specs/reindex-concurrently.spec |  40 +
 src/test/regress/expected/create_index.out         |  57 ++
 src/test/regress/sql/create_index.sql              |  42 ++
 19 files changed, 1610 insertions(+), 184 deletions(-)
 create mode 100644 src/test/isolation/expected/reindex-concurrently.out
 create mode 100644 src/test/isolation/specs/reindex-concurrently.spec

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 0bbbc71..365a6dd 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -864,7 +864,9 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
-         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
+         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
+         <command>REINDEX CONCURRENTLY</>,
+         and
          <command>ALTER TABLE VALIDATE</command> and other
          <command>ALTER TABLE</command> variants (for full details see
          <xref linkend="SQL-ALTERTABLE">).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index cabae19..ea3410f 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -68,9 +68,22 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
       An index build with the <literal>CONCURRENTLY</> option failed, leaving
       an <quote>invalid</> index. Such indexes are useless but it can be
       convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      <command>REINDEX</> will perform a concurrent build if <literal>
+      CONCURRENTLY</> is specified. To build the index without interfering
+      with production, you should drop the index and reissue either the
+      <command>CREATE INDEX CONCURRENTLY</> or <command>REINDEX CONCURRENTLY</>
+      command. Indexes of toast relations can be rebuilt with <command>REINDEX
+      CONCURRENTLY</>.
+     </para>
+    </listitem>
+
+    <listitem>
+     <para>
+      Concurrent indexes based on a <literal>PRIMARY KEY</> or an <literal>
+      EXCLUDE</> constraint need to be dropped with <literal>ALTER TABLE
+      DROP CONSTRAINT</>. This is also the case for <literal>UNIQUE</> indexes
+      backed by constraints. Other indexes can be dropped using <literal>DROP INDEX</>,
+      including invalid toast indexes.
      </para>
     </listitem>
 
@@ -139,6 +152,21 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    </varlistentry>
 
    <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>FORCE</literal></term>
     <listitem>
      <para>
@@ -218,6 +246,126 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    reindex anything.
   </para>
 
+  <para>
+   Prior to <productname>PostgreSQL</productname> 8.1, <command>REINDEX
+   DATABASE</> processed only system indexes, not all indexes as one would
+   expect from the name.  This has been changed to reduce the surprise
+   factor.  The old behavior is available as <command>REINDEX SYSTEM</>.
+  </para>
+
+  <para>
+   Prior to <productname>PostgreSQL</productname> 7.4, <command>REINDEX
+   TABLE</> did not automatically process TOAST tables, and so those had
+   to be reindexed by separate commands.  This is still possible, but
+   redundant.
+  </para>
+
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database.  Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes without locking
+    out writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</> option of <command>REINDEX</>.
+    When this option is used, <productname>PostgreSQL</> must perform two
+    scans of the table for each index that needs to be rebuilt, and in
+    addition it must wait for all existing transactions that could potentially
+    use the index to terminate. This method requires more total work than a
+    standard index rebuild and takes significantly longer to complete as it
+    needs to wait for unfinished transactions that might modify the index.
+    However, since it allows normal operations to continue while the index
+    is rebuilt, this method is useful for rebuilding indexes in a production
+    environment.  Of course, the extra CPU, memory and I/O load imposed by
+    the index rebuild might slow other operations.
+   </para>
+
+   <para>
+    In a concurrent index build, a new index whose storage will replace the one
+    to be rebuilt is actually entered into the system catalogs in one transaction,
+    then two table scans occur in two more transactions.  Once this is performed,
+    the old and new indexes are swapped. Finally, two additional transactions
+    are used to mark the concurrent index as not ready and then drop it.
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the concurrent
+    index and try again to perform <command>REINDEX CONCURRENTLY</>.
+    The concurrent index created during the processing has a name ending with
+    the suffix cct. This also works for indexes of toast relations.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the
+    same table to occur in parallel, but only one concurrent index build
+    can occur on a table at a time.  In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot. <command>REINDEX DATABASE</> is
+    by default not allowed to run inside a transaction block, so in this case
+    <command>CONCURRENTLY</> is not supported.
+   </para>
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>. Valid indexes, being unique
+    for a given toast relation, cannot be dropped.
+   </para>
+
+   <para>
+    <command>REINDEX DATABASE</command> used with <command>CONCURRENTLY
+    </command> rebuilds concurrently only the non-system relations. System
+    relations are rebuilt with a non-concurrent context. Toast indexes are
+    rebuilt concurrently if the relation they depend on is a non-system
+    relation.
+   </para>
+
+   <para>
+    <command>REINDEX</command> takes an <literal>ACCESS EXCLUSIVE</literal> lock
+    on all the relations involved in the operation. When <command>CONCURRENTLY</command>
+    is specified, the operation is done with <literal>SHARE UPDATE EXCLUSIVE</literal>.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support <command>CONCURRENTLY</command>.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -249,7 +397,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild the indexes of a table while allowing read and write operations
+   on the relations involved:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
+
  </refsect1>
 
  <refsect1>
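The documentation hunk above describes a multi-transaction flow: enter the new index into the catalogs, build and validate it with two table scans, swap relfilenodes, then invalidate and drop the old index. A minimal Python sketch of that sequence, purely illustrative and not taken from the patch (phase names are paraphrased), models how a failure at any phase leaves an invalid index behind:

```python
# Illustrative phase model of REINDEX CONCURRENTLY; phase names are
# paraphrased from the documentation, not identifiers from the patch.
PHASES = [
    "insert catalog entry for the new index",      # transaction 1
    "build the new index (first table scan)",      # transaction 2
    "validate the new index (second table scan)",  # transaction 3
    "swap relfilenodes of old and new index",
    "mark the old-storage index as dead",
    "drop the old-storage index",
]

def run_concurrent_reindex(execute):
    """Run each phase in its own 'transaction'; if one fails, stop and
    report that an invalid index is left behind, as the docs describe."""
    done = []
    for phase in PHASES:
        try:
            execute(phase)
        except Exception:
            return done, "invalid index left behind"
        done.append(phase)
    return done, "ok"

# A run where validation fails (e.g. a uniqueness violation) stops early:
def fail_on_validate(phase):
    if "validate" in phase:
        raise RuntimeError("uniqueness violation")

done, status = run_concurrent_reindex(fail_on_validate)
# status == "invalid index left behind"; only the first two phases completed
```

This mirrors the recovery advice in the docs: when a phase fails, the leftover invalid index must be dropped before retrying.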
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index a5a204e..ae649e0 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -44,9 +44,11 @@
 #include "catalog/pg_trigger.h"
 #include "catalog/pg_type.h"
 #include "catalog/storage.h"
+#include "commands/defrem.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -674,6 +676,10 @@ UpdateIndexRelation(Oid indexoid,
  *		will be marked "invalid" and the caller must take additional steps
  *		to fix it up.
  * is_internal: if true, post creation hook for new index
+ * is_reindex: if true, create an index that is used as a duplicate of an
+ *		existing index during a concurrent operation. This index can also
+ *		be on a toast relation. Sufficient locks on the related relations
+ *		are assumed to be already taken during a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -697,7 +703,8 @@ index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal)
+			 bool is_internal,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -740,19 +747,22 @@ index_create(Relation heapRelation,
 
 	/*
 	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * release locks before committing in catalogs. If the index is created during
+	 * a REINDEX CONCURRENTLY operation, sufficient locks are already taken.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemRelation(heapRelation) &&
+		!is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
 
 	/*
-	 * This case is currently not supported, but there's no way to ask for it
-	 * in the grammar anyway, so it can't happen.
+	 * This case is currently only supported during a concurrent index
+	 * rebuild; there is no other way to ask for it in the grammar
+	 * anyway.
 	 */
-	if (concurrent && is_exclusion)
+	if (concurrent && is_exclusion && !is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -1092,6 +1102,414 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+
+/*
+ * index_concurrent_create
+ *
+ * Create an index based on the given one that will be used for concurrent
+ * operations. The index is inserted into catalogs and needs to be built later
+ * on. This is called during concurrent index processing. The heap relation
+ * on which the index is based needs to be closed by the caller.
+ */
+Oid
+index_concurrent_create(Relation heapRelation, Oid indOid, char *concurrentName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	List	   *columnNames = NIL;
+	List	   *indexprs = NIL;
+	ListCell   *indexpr_item;
+	int			i;
+	HeapTuple	indexTuple, classTuple;
+	Datum		indclassDatum, colOptionDatum, optionDatum;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	bool		initdeferred = false;
+	Oid			constraintOid = get_index_constraint(indOid);
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* The concurrent index uses the same index information as the former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/*
+	 * Determine if index is initdeferred, this depends on its dependent
+	 * Determine whether the index is initdeferred; this depends on its
+	 * dependent constraint.
+	if (OidIsValid(constraintOid))
+	{
+		/* Look for the correct value */
+		HeapTuple			constraintTuple;
+		Form_pg_constraint	constraintForm;
+
+		constraintTuple = SearchSysCache1(CONSTROID,
+									 ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "cache lookup failed for constraint %u",
+				 constraintOid);
+		constraintForm = (Form_pg_constraint) GETSTRUCT(constraintTuple);
+		initdeferred = constraintForm->condeferred;
+
+		ReleaseSysCache(constraintTuple);
+	}
+
+	/* Get expressions associated with this index to build the column names */
+	indexprs = RelationGetIndexExpressions(indexRelation);
+	indexpr_item = list_head(indexprs);
+
+	/* Build the list of column names, necessary for index_create */
+	for (i = 0; i < indexInfo->ii_NumIndexAttrs; i++)
+	{
+		char	   *origname, *curname;
+		char		buf[NAMEDATALEN];
+		AttrNumber	attnum = indexInfo->ii_KeyAttrNumbers[i];
+		int			j;
+
+		/* Pick up column name depending on attribute type */
+		if (attnum > 0)
+		{
+			/*
+			 * This is a column attribute, so simply pick column name from
+			 * relation.
+			 */
+			Form_pg_attribute attform = heapRelation->rd_att->attrs[attnum - 1];
+			origname = pstrdup(NameStr(attform->attname));
+		}
+		else if (attnum < 0)
+		{
+			/* Case of a system attribute */
+			Form_pg_attribute attform = SystemAttributeDefinition(attnum,
+										  heapRelation->rd_rel->relhasoids);
+			origname = pstrdup(NameStr(attform->attname));
+		}
+		else
+		{
+			Node *indnode;
+			/*
+			 * This is the case of an expression, so pick up the expression
+			 * name.
+			 */
+			Assert(indexpr_item != NULL);
+			indnode = (Node *) lfirst(indexpr_item);
+			indexpr_item = lnext(indexpr_item);
+			origname = deparse_expression(indnode,
+							deparse_context_for(RelationGetRelationName(heapRelation),
+												RelationGetRelid(heapRelation)),
+							false, false);
+		}
+
+		/*
+		 * Check if the picked name conflicts with any existing name, and
+		 * adjust it if so.
+		 */
+		curname = origname;
+		for (j = 1;; j++)
+		{
+			ListCell   *lc2;
+			char		nbuf[32];
+			int			nlen;
+
+			foreach(lc2, columnNames)
+			{
+				if (strcmp(curname, (char *) lfirst(lc2)) == 0)
+					break;
+			}
+			if (lc2 == NULL)
+				break; /* found nonconflicting name */
+
+			sprintf(nbuf, "%d", j);
+
+			/* Ensure generated names are shorter than NAMEDATALEN */
+			nlen = pg_mbcliplen(origname, strlen(origname),
+								NAMEDATALEN - 1 - strlen(nbuf));
+			memcpy(buf, origname, nlen);
+			strcpy(buf + nlen, nbuf);
+			curname = buf;
+		}
+
+		/* Append name to existing list */
+		columnNames = lappend(columnNames, pstrdup(curname));
+	}
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 (const char *) concurrentName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 columnNames,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexRelation->rd_index->indisprimary,
+								 OidIsValid(constraintOid),	/* is constraint? */
+								 !indexRelation->rd_index->indimmediate,	/* is deferrable? */
+								 initdeferred,	/* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false,	/* is_internal */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
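The conflict-resolution loop in index_concurrent_create above (try the original column name, then append an increasing integer suffix, clipping the base name so the result fits in NAMEDATALEN - 1 bytes) can be sketched in Python. This is an illustrative model under assumptions: the function name is hypothetical, and simple string slicing stands in for the multibyte-aware clipping that pg_mbcliplen performs:

```python
NAMEDATALEN = 64  # PostgreSQL's identifier length limit, in bytes

def choose_nonconflicting(origname, existing):
    """Return origname if unused, else origname plus the smallest integer
    suffix not yet in `existing`, clipped to fit NAMEDATALEN - 1 bytes."""
    curname = origname
    j = 1
    while curname in existing:
        suffix = str(j)
        nlen = NAMEDATALEN - 1 - len(suffix)  # room left for the suffix
        curname = origname[:nlen] + suffix
        j += 1
    return curname

# Duplicate attribute names get numbered, as in the patch's loop:
names = []
for col in ["col", "col", "col"]:
    names.append(choose_nonconflicting(col, names))
# names == ["col", "col1", "col2"]
```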
+
+
+/*
+ * index_concurrent_build
+ *
+ * Build an index for a concurrent operation. Low-level locks are taken when
+ * this operation is performed, enough to prevent only schema changes, but
+ * they need to be kept until the end of the transaction performing it.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	rel, indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	rel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in
+	 * commit of transaction where this concurrent index was created
+	 * at the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(rel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both the relations, but keep the locks */
+	heap_close(rel, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap the old and new indexes in a concurrent context. For the time being,
+ * what is done here is swapping the relfilenode of both indexes. If extra
+ * operations are necessary during a concurrent swap, processing should be
+ * added here. The relations do not require an exclusive lock thanks to
+ * MVCC catalog access.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid)
+{
+	Relation		oldIndexRel, newIndexRel, pg_class;
+	HeapTuple		oldIndexTuple, newIndexTuple;
+	Form_pg_class	oldIndexForm, newIndexForm;
+	Oid				tmpnode;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldIndexRel = relation_open(oldIndexOid, ShareUpdateExclusiveLock);
+	newIndexRel = relation_open(newIndexOid, ShareUpdateExclusiveLock);
+
+	/* Now swap relfilenode of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+	oldIndexForm = (Form_pg_class) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_class) GETSTRUCT(newIndexTuple);
+
+	/* Here is where the actual swap happens */
+	tmpnode = oldIndexForm->relfilenode;
+	oldIndexForm->relfilenode = newIndexForm->relfilenode;
+	newIndexForm->relfilenode = tmpnode;
+
+	/* Then update the tuples for each relation */
+	simple_heap_update(pg_class, &oldIndexTuple->t_self, oldIndexTuple);
+	simple_heap_update(pg_class, &newIndexTuple->t_self, newIndexTuple);
+	CatalogUpdateIndexes(pg_class, oldIndexTuple);
+	CatalogUpdateIndexes(pg_class, newIndexTuple);
+
+	/* Close relations and clean up */
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+	heap_close(pg_class, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldIndexRel, NoLock);
+	relation_close(newIndexRel, NoLock);
+}
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY before
+ * actually dropping the index. After calling this function the index is
+ * seen by all the backends as dead. Low-level locks taken here are kept
+ * until the end of the transaction calling this function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid, LOCKTAG locktag)
+{
+	Relation	heapRelation, indexRelation;
+
+	/*
+	 * Now we must wait until no running transaction could be using the
+	 * index for a query.  Use AccessExclusiveLock here to check for
+	 * running transactions that hold locks of any kind on the table. Note
+	 * we do not need to worry about xacts that open the table for reading
+	 * after this point; they will see the index as invalid when they open
+	 * the relation.
+	 *
+	 * Note: the reason we use actual lock acquisition here, rather than
+	 * just checking the ProcArray and sleeping, is that deadlock is
+	 * possible if one of the transactions in question is blocked trying
+	 * to acquire an exclusive lock on our table. The lock code will
+	 * detect deadlock and error out properly.
+	 */
+	WaitForLockers(locktag, AccessExclusiveLock);
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're
+	 * about to stop doing inserts into the index which could show
+	 * conflicts with existing predicate locks, so now is the time to move
+	 * them to the heap relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just
+	 * might have it open for updating it.  So now we can unset indisready
+	 * and indislive, then wait till nobody could be using it at all
+	 * anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit
+	 * all sessions will refresh the table's index list.  Forgetting just
+	 * the index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index as the last step of a concurrent index process.
+ * Deletion is done through performDeletion, or dependencies of the index
+ * would not get dropped. At this point the indexes are already considered
+ * invalid and dead, so they can be dropped without using any concurrent
+ * options, as it is certain that they will not interact with other
+ * server sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid				constraintOid = get_index_constraint(indexOid);
+	ObjectAddress	object;
+	Form_pg_index	indexForm;
+	Relation		pg_index;
+	HeapTuple		indexTuple;
+
+	/*
+	 * Check that the index being dropped here is not live; a live index
+	 * might still be used by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * The index is certainly dead at this point, so begin the drop process.
+	 * Register the constraint or the index for deletion.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object,
+					DROP_RESTRICT,
+					0);
+}
+
+
 /*
  * index_constraint_create
  *
@@ -1440,52 +1858,8 @@ index_drop(Oid indexId, bool concurrent)
 		CommitTransactionCommand();
 		StartTransactionCommand();
 
-		/*
-		 * Now we must wait until no running transaction could be using the
-		 * index for a query.  Use AccessExclusiveLock here to check for
-		 * running transactions that hold locks of any kind on the table. Note
-		 * we do not need to worry about xacts that open the table for reading
-		 * after this point; they will see the index as invalid when they open
-		 * the relation.
-		 *
-		 * Note: the reason we use actual lock acquisition here, rather than
-		 * just checking the ProcArray and sleeping, is that deadlock is
-		 * possible if one of the transactions in question is blocked trying
-		 * to acquire an exclusive lock on our table.  The lock code will
-		 * detect deadlock and error out properly.
-		 */
-		WaitForLockers(heaplocktag, AccessExclusiveLock);
-
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId, heaplocktag);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 94543e1..0c05669 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -338,7 +338,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 rel->rd_rel->reltablespace,
 				 collationObjectId, classObjectId, coloptions, (Datum) 0,
 				 true, false, false, false,
-				 true, false, false, true);
+				 true, false, false, false, false);
 
 	heap_close(toast_rel, NoLock);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index fdfa6ca..dace8f0 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -68,8 +68,9 @@ static void ComputeIndexAttrs(IndexInfo *indexInfo,
 static Oid GetIndexOpClass(List *opclass, Oid attrType,
 				char *accessMethodName, Oid accessMethodId);
 static char *ChooseIndexName(const char *tabname, Oid namespaceId,
-				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint);
+							 List *colnames, List *exclusionOpNames,
+							 bool primary, bool isconstraint,
+							 bool concurrent);
 static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
@@ -276,6 +277,86 @@ CheckIndexCompatible(Oid oldId,
 }
 
 /*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given
+ * xmin limit, because such a snapshot might not contain tuples deleted
+ * just before the limit was taken. Obtain a list of VXIDs of such
+ * transactions, and wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int i, n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue; /* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int n_newer_snapshots, j, k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue; /* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
+/*
  * DefineIndex
  *		Creates a new index.
  *
@@ -312,7 +393,6 @@ DefineIndex(Oid relationId,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	bool		amcanorder;
@@ -322,13 +402,10 @@ DefineIndex(Oid relationId,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
 	Snapshot	snapshot;
-	int			i;
 
 	/*
 	 * count attributes in index
@@ -459,7 +536,8 @@ DefineIndex(Oid relationId,
 											indexColNames,
 											stmt->excludeOpNames,
 											stmt->primary,
-											stmt->isconstraint);
+											stmt->isconstraint,
+											false);
 
 	/*
 	 * look up the access method, verify it can handle the requested features
@@ -606,7 +684,7 @@ DefineIndex(Oid relationId,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
-					 stmt->concurrent, !check_rights);
+					 stmt->concurrent, !check_rights, false);
 
 	/* Add any requested comment */
 	if (stmt->idxcomment != NULL)
@@ -688,27 +766,15 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/*
 	 * Update the pg_index row to mark the index as ready for inserts. Once we
@@ -773,74 +839,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-										 PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots)		/* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -867,6 +868,542 @@ DefineIndex(Oid relationId,
 
 
 /*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each reindexing step
+ * is performed on all of the table's indexes at once, as well as on its
+ * dependent toast indexes.
+ */
+bool
+ReindexRelationConcurrently(Oid relationOid)
+{
+	List	   *concurrentIndexIds = NIL,
+			   *indexIds = NIL,
+			   *parentRelationIds = NIL,
+			   *lockTags = NIL,
+			   *relationLocks = NIL;
+	ListCell   *lc, *lc2;
+	Snapshot	snapshot;
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given relation
+	 * is a table, all its valid indexes will be rebuilt, including its
+	 * associated toast table indexes. If the relkind is an index, this index
+	 * itself will be rebuilt. The locks taken on parent relations and
+	 * involved indexes are kept until this transaction is committed, to
+	 * protect against schema changes that might occur before the session
+	 * lock is taken on each relation.
+	 */
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes
+				 * including toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc2, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc2);
+					Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+						indexIds = lappend_oid(indexIds, cellOid);
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+												ShareUpdateExclusiveLock);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+							indexIds = lappend_oid(indexIds, cellOid);
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index simply add its Oid to list. Invalid indexes
+				 * cannot be included in list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(IndexGetRelation(relationOid, false));
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+					indexIds = list_make1_oid(relationOid);
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the indexes. We
+	 * must first create a new index based on the same definition as the
+	 * former index, except that it will only be registered in the catalogs
+	 * and will be built later. It is possible to perform all these
+	 * operations at the same time for all the indexes of a parent relation,
+	 * including the indexes of its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId  *lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent relation; may be a plain or toast relation */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a relation name for concurrent index */
+		concurrentName = ChooseIndexName(get_rel_name(indOid),
+										 get_rel_namespace(indexRel->rd_index->indrelid),
+										 NULL,
+										 false,
+										 false,
+										 false,
+										 true);
+
+		/* Create concurrent index based on given index */
+		concurrentOid = index_concurrent_create(indexParentRel,
+												indOid,
+												concurrentName);
+
+		/*
+		 * Now open the relation of the concurrent index; a lock is also
+		 * needed on it.
+		 */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the concurrent index Oid */
+		concurrentIndexIds = lappend_oid(concurrentIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelid to protect each concurrent relation from drop,
+		 * then close the relations. Each entry is palloc'd so that it
+		 * remains valid after this loop iteration; a pointer to a stack
+		 * variable would not. The lockrelid of the parent relation is not
+		 * taken here to avoid multiple locks taken on the same relation;
+		 * instead we rely on parentRelationIds built earlier.
+		 */
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap lock for the following visibility checks, as other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId  *lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		LOCKTAG	   *heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/*
+		 * Add lockrelid of parent relation to the list of locked relations.
+		 * The entry is palloc'd so that it remains valid after this loop
+		 * iteration; a pointer to a stack variable would not.
+		 */
+		*lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid->dbId, lockrelid->relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transactions will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on the relation, the
+	 * concurrent index and its copy to ensure that none of them are dropped
+	 * until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = * (LockRelId *) lfirst(lc);
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build each concurrent index in a separate transaction to avoid having
+	 * open transactions for an unnecessarily long time. A concurrent build
+	 * is done for each concurrent index that will replace an old index.
+	 * Before doing that, we need to wait until no running transaction could
+	 * have the parent table of an index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			heapOid;
+		bool		primary;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * The index relation has been closed by the previous commit, so
+		 * reopen it and fetch what we need before closing it again; the
+		 * relcache entry must not be dereferenced after index_close().
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		heapOid = indexRel->rd_index->indrelid;
+		primary = indexRel->rd_index->indisprimary;
+		index_close(indexRel, ShareUpdateExclusiveLock);
+
+		/* Perform concurrent build of new index */
+		index_concurrent_build(heapOid, concurrentOid, primary);
+
+		/*
+		 * Update the pg_index row of the concurrent index as ready for inserts.
+		 * Once we commit this transaction, any new transactions that open the
+		 * table must insert new entries into the index for insertions and
+		 * non-HOT updates.
+		 */
+		index_set_state_flags(concurrentOid, INDEX_CREATE_SET_READY);
+
+		/* we can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update visible for
+		 * concurrent index.
+		 */
+		CommitTransactionCommand();
+	}
+
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the concurrent indexes catch up with any insertions
+	 * that might have occurred in the parent table while they were built.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to avoid keeping a transaction open
+	 * for an unnecessarily long time.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Perform a scan of each concurrent index with the heap, then insert
+	 * any missing index entries.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid				indOid = lfirst_oid(lc);
+		Oid				relOid;
+		TransactionId	limitXmin;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for the validation
+		 * of this concurrent index.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate index, which might be a toast */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * This concurrent index is now valid in the sense that it contains
+		 * all the necessary tuples. However, it might not contain tuples
+		 * deleted just before the reference snapshot was taken, so we need
+		 * to wait out the transactions that might have older snapshots than
+		 * ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction to make the concurrent index valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated and could be used,
+	 * we need to swap each concurrent index with its corresponding old index.
+	 * Note that the concurrent index used for swapping is not marked as
+	 * valid, because we need to keep the former index and the concurrent
+	 * index with different valid statuses to avoid an explosion in the
+	 * number of indexes a parent relation could have if this operation fails
+	 * multiple times in a row for one reason or another.
+	 */
+
+	/* Swap the indexes and mark the indexes that have the old data as invalid */
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/*
+		 * Each index needs to be swapped in a separate transaction, so start
+		 * a new one.
+		 */
+		StartTransactionCommand();
+
+		/* Swap old index and its concurrent */
+		index_concurrent_swap(concurrentOid, indOid);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		relOid = IndexGetRelation(indOid, false);
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/* Commit this transaction and make old index invalidation visible */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * The concurrent indexes now hold the old relfilenodes of the former
+	 * indexes. They must be marked as dead, waiting out any transactions
+	 * that might still use them. Each operation is performed in a separate
+	 * transaction.
+	 */
+
+	/* Now mark the concurrent indexes as not ready */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		LOCKTAG	   *heapLockTag = NULL;
+		ListCell   *cell;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Find the locktag of the parent table for this index; we need to
+		 * wait for locks on it.
+		 */
+		foreach(cell, lockTags)
+		{
+			LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+			if (relOid == localTag->locktag_field2)
+				heapLockTag = localTag;
+		}
+		Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+		/*
+		 * Finish the index invalidation and set it as dead. Note that it is
+		 * necessary to wait for virtual locks on the parent relation
+		 * before setting the index as dead.
+		 */
+		index_concurrent_set_dead(relOid, indOid, *heapLockTag);
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the concurrent indexes. This needs to be done through
+	 * performDeletion, or the dependencies of the old indexes will not be
+	 * dropped. The internal mechanism of DROP INDEX CONCURRENTLY is not used
+	 * here, as the indexes are already considered dead and invalid, so they
+	 * will not be used by other backends.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid indexOid = lfirst_oid(lc);
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start transaction to drop this index */
+		StartTransactionCommand();
+
+		/* Get fresh snapshot for next step */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/* Perform the drop of the concurrent index */
+		index_concurrent_drop(indexOid);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * The last thing to do is release the session-level locks on the parent
+	 * tables and their indexes.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = * (LockRelId *) lfirst(lc);
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Start a new transaction to finish process properly */
+	StartTransactionCommand();
+
+	/* Get fresh snapshot for the end of process */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	return true;
+}
+
+
+/*
  * CheckMutability
  *		Test whether given expression is mutable
  */
@@ -1529,7 +2066,8 @@ ChooseRelationName(const char *name1, const char *name2,
 static char *
 ChooseIndexName(const char *tabname, Oid namespaceId,
 				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint)
+				bool primary, bool isconstraint,
+				bool concurrent)
 {
 	char	   *indexname;
 
@@ -1555,6 +2093,13 @@ ChooseIndexName(const char *tabname, Oid namespaceId,
 									   "key",
 									   namespaceId);
 	}
+	else if (concurrent)
+	{
+		indexname = ChooseRelationName(tabname,
+									   NULL,
+									   "cct",
+									   namespaceId);
+	}
 	else
 	{
 		indexname = ChooseRelationName(tabname,
@@ -1667,18 +2212,22 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation)
+ReindexIndex(RangeVar *indexRelation, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
 
-	/* lock level used here should match index lock reindex_index() */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
-									  false, false,
-									  RangeVarCallbackForReindexIndex,
-									  (void *) &heapOid);
+	indOid = RangeVarGetRelidExtended(indexRelation,
+				concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+				concurrent, concurrent,
+				RangeVarCallbackForReindexIndex,
+				(void *) &heapOid);
 
-	reindex_index(indOid, false);
+	/* Continue process for concurrent or non-concurrent case */
+	if (!concurrent)
+		reindex_index(indOid, false);
+	else
+		ReindexRelationConcurrently(indOid);
 
 	return indOid;
 }
@@ -1747,13 +2296,27 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation)
+ReindexTable(RangeVar *relation, bool concurrent)
 {
 	Oid			heapOid;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
-									   RangeVarCallbackOwnsTable, NULL);
+	heapOid = RangeVarGetRelidExtended(relation,
+		concurrent ? ShareUpdateExclusiveLock : ShareLock,
+		concurrent, concurrent,
+		RangeVarCallbackOwnsTable, NULL);
+
+	/* Run through the concurrent process if necessary */
+	if (concurrent)
+	{
+		if (!ReindexRelationConcurrently(heapOid))
+		{
+			ereport(NOTICE,
+					(errmsg("table \"%s\" has no indexes",
+							relation->relname)));
+		}
+		return heapOid;
+	}
 
 	if (!reindex_relation(heapOid,
 						  REINDEX_REL_PROCESS_TOAST |
@@ -1774,7 +2337,10 @@ ReindexTable(RangeVar *relation)
  * That means this must not be called within a user transaction block!
  */
 Oid
-ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
+ReindexDatabase(const char *databaseName,
+				bool do_system,
+				bool do_user,
+				bool concurrent)
 {
 	Relation	relationRelation;
 	HeapScanDesc scan;
@@ -1786,6 +2352,15 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 
 	AssertArg(databaseName);
 
+	/*
+	 * A CONCURRENTLY operation is not allowed for system catalogs, but it
+	 * is allowed for a database.
+	 */
+	if (concurrent && !do_user)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot reindex system concurrently")));
+
 	if (strcmp(databaseName, get_database_name(MyDatabaseId)) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1870,17 +2445,42 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result = false;
+		bool		process_concurrent;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS))
+
+		/* Determine if relation needs to be processed concurrently */
+		process_concurrent = concurrent &&
+			!IsSystemNamespace(get_rel_namespace(relid));
+
+		/*
+		 * Reindex relation with a concurrent or non-concurrent process.
+		 * System relations cannot be reindexed concurrently, but they
+		 * still need to be reindexed (including pg_class) with the normal
+		 * process, as they could be corrupted and the concurrent process
+		 * might also use them. This does not include toast relations, which are
+		 * reindexed when their parent relation is processed.
+		 */
+		if (process_concurrent)
+		{
+			old = MemoryContextSwitchTo(private_context);
+			result = ReindexRelationConcurrently(relid);
+			MemoryContextSwitchTo(old);
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS);
+
+		if (result)
 			ereport(NOTICE,
-					(errmsg("table \"%s.%s\" was reindexed",
+					(errmsg("table \"%s.%s\" was reindexed%s",
 							get_namespace_name(get_rel_namespace(relid)),
-							get_rel_name(relid))));
+							get_rel_name(relid),
+							process_concurrent ? " concurrently" : "")));
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 	}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d37534e..be41f4b 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -902,6 +902,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	char		relkind;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -937,7 +938,37 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check the case of a system index that might have been invalidated by a
+	 * failed concurrent process and allow its drop. For the time being, this
+	 * only concerns indexes of toast relations that became invalid during a
+	 * REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) &&
+		relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index d5e1273..8690eeb 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1201,6 +1201,20 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
 	}
 
 	/*
+	 * An invalid index only exists when it was created in a concurrent
+	 * context, and this code path cannot be taken by CREATE INDEX
+	 * CONCURRENTLY, as that feature is not available for exclusion
+	 * constraints; hence this code path can only be taken by REINDEX
+	 * CONCURRENTLY. In this case the same index exists in parallel to this
+	 * one, so we can bypass this check, as it has already been done on the
+	 * other index existing in parallel. If exclusion constraints become
+	 * supported by CREATE INDEX CONCURRENTLY in the future, this shortcut
+	 * should be removed or revisited.
+	 */
+	if (!index->rd_index->indisvalid)
+		return true;
+
+	/*
 	 * Search the tuples that are in the index for any violations, including
 	 * tuples that aren't visible yet.
 	 */
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index f5ddc1c..12a7b92 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3722,6 +3722,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(do_system);
 	COPY_SCALAR_FIELD(do_user);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index ccd6064..e9cbbf7 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1899,6 +1899,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(do_system);
 	COMPARE_SCALAR_FIELD(do_user);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 327f2d2..032591b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7036,29 +7036,32 @@ opt_if_exists: IF_P EXISTS						{ $$ = TRUE; }
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_type qualified_name opt_force
+			REINDEX reindex_type opt_concurrently qualified_name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					$$ = (Node *)n;
 				}
-			| REINDEX SYSTEM_P name opt_force
+			| REINDEX SYSTEM_P opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = false;
 					$$ = (Node *)n;
 				}
-			| REINDEX DATABASE name opt_force
+			| REINDEX DATABASE opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = true;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index f648945..030f027 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -747,16 +747,20 @@ standard_ProcessUtility(Node *parsetree,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				switch (stmt->kind)
 				{
 					case OBJECT_INDEX:
-						ReindexIndex(stmt->relation);
+						ReindexIndex(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_TABLE:
 					case OBJECT_MATVIEW:
-						ReindexTable(stmt->relation);
+						ReindexTable(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_DATABASE:
 
@@ -768,8 +772,8 @@ standard_ProcessUtility(Node *parsetree,
 						 */
 						PreventTransactionChain(isTopLevel,
 												"REINDEX DATABASE");
-						ReindexDatabase(stmt->name,
-										stmt->do_system, stmt->do_user);
+						ReindexDatabase(stmt->name, stmt->do_system,
+										stmt->do_user, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 006b180..b5a528f 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -60,7 +60,24 @@ extern Oid index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal);
+			 bool is_internal,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create(Relation heapRelation,
+								   Oid indOid,
+								   char *concurrentName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid,
+									  LOCKTAG locktag);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern void index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 0ebdbc1..b988555 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -28,10 +28,11 @@ extern Oid DefineIndex(Oid relationId,
 			bool check_rights,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation);
-extern Oid	ReindexTable(RangeVar *relation);
+extern Oid	ReindexIndex(RangeVar *indexRelation, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, bool concurrent);
 extern Oid ReindexDatabase(const char *databaseName,
-				bool do_system, bool do_user);
+							bool do_system, bool do_user, bool concurrent);
+extern bool ReindexRelationConcurrently(Oid relOid);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 28029fe..71d22d9 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2693,6 +2693,7 @@ typedef struct ReindexStmt
 	const char *name;			/* name of database to reindex */
 	bool		do_system;		/* include system tables in database case */
 	bool		do_user;		/* include user tables in database case */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000..9e04169
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 36acec1..1753733 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -24,4 +24,5 @@ test: multixact-no-forget
 test: propagate-lock-delete
 test: drop-index-concurrently-1
 test: alter-table-1
+test: reindex-concurrently
 test: timeouts
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000..eb59fe0
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index f6f5516..a7768f7 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2782,3 +2782,60 @@ explain (costs off)
    Index Cond: ((thousand = 1) AND (tenthous = 1001))
 (2 rows)
 
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  cannot reindex system concurrently
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+Table "public.concur_reindex_tab"
+ Column |  Type   | Modifiers 
+--------+---------+-----------
+ c1     | integer | not null
+ c2     | text    | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index d4d24ef..93321c0 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -938,3 +938,45 @@ ORDER BY thousand;
 
 explain (costs off)
   select * from tenk1 where (thousand, tenthous) in ((1,1001), (null,null));
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
-- 
2.1.0

#10Andres Freund
andres@2ndquadrant.com
In reply to: Michael Paquier (#9)
Re: REINDEX CONCURRENTLY 2.0

On 2014-08-27 11:00:56 +0900, Michael Paquier wrote:

On Tue, Jan 21, 2014 at 10:12 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

I have realigned this patch with latest head (d2458e3)... In case
someone is interested at some point...

Attached is a patch for REINDEX CONCURRENTLY rebased on HEAD
(d7938a4), as some people have shown interest in it in recent
discussions. The patch compiles and passes regression as well as
isolation tests.

Can you add it to the next CF? I'll try to look earlier, but can't
promise anything.

I very much would like this to get committed in some form or another.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Michael Paquier
michael.paquier@gmail.com
In reply to: Andres Freund (#10)
Re: REINDEX CONCURRENTLY 2.0

On Wed, Aug 27, 2014 at 3:41 PM, Andres Freund <andres@2ndquadrant.com> wrote:

Can you add it to the next CF? I'll try to look earlier, but can't
promise anything.

I very much would like this to get committed in some form or another.

Added it here to keep track of it:
https://commitfest.postgresql.org/action/patch_view?id=1563
Regards,
--
Michael


#12Michael Paquier
michael.paquier@gmail.com
In reply to: Michael Paquier (#11)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On Wed, Aug 27, 2014 at 3:53 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:

On Wed, Aug 27, 2014 at 3:41 PM, Andres Freund <andres@2ndquadrant.com>
wrote:

Can you add it to the next CF? I'll try to look earlier, but can't
promise anything.

I very much would like this to get committed in some form or another.

Added it here to keep track of it:
https://commitfest.postgresql.org/action/patch_view?id=1563

Attached is a fairly-refreshed patch that should be used as a base for the
next commit fest. The following changes should be noticed:
- Use of AccessExclusiveLock when swapping relfilenodes of an index and its
concurrent entry, instead of ShareUpdateExclusiveLock, for safety. As far as
I understand, that is the consensus reached so far.
- Cleanup of many comments and a refresh of the documentation, which was
badly worded or formatted in places
- Addition of support for autocommit off in psql for REINDEX [ TABLE |
INDEX ] CONCURRENTLY
- Some more code cleanup.
I haven't been through the tab completion support for psql, but looking at
tab-complete.c this seems a bit tricky with the stuff related to CREATE
INDEX CONCURRENTLY already present. Nothing huge though.
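For anyone who wants to try the patch quickly, a minimal sketch of the
behavior described above (table and index names are made up; this of course
requires a build with the patch applied):

```sql
-- A scratch table with a couple of indexes, then rebuilt concurrently.
CREATE TABLE reindex_demo (id serial PRIMARY KEY, val text);
CREATE INDEX reindex_demo_val_idx ON reindex_demo (val);
INSERT INTO reindex_demo (val)
    SELECT md5(i::text) FROM generate_series(1, 1000) i;

-- Rebuild a single index, then all indexes of the table, without
-- blocking concurrent reads and writes: only SHARE UPDATE EXCLUSIVE
-- is held, except during the short relfilenode swap phase.
REINDEX INDEX CONCURRENTLY reindex_demo_val_idx;
REINDEX TABLE CONCURRENTLY reindex_demo;

-- Not allowed inside a transaction block:
BEGIN;
REINDEX TABLE CONCURRENTLY reindex_demo;  -- ERROR
ROLLBACK;
```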
Regards,
--
Michael

Attachments:

20141001_reindex_concurrently.patchtext/x-diff; charset=US-ASCII; name=20141001_reindex_concurrently.patchDownload
From 5e9125da5cbeb2f6265b68ff14cc70e4cb10d502 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Wed, 27 Aug 2014 10:56:02 +0900
Subject: [PATCH] Support for REINDEX CONCURRENTLY

Fully supported patch, with regression and isolation tests, including
documentation and support for autocommit 'off' in psql. This version
uses an exclusive lock when swapping relations for safety. psql tab
completion has not been added yet in this version as this is rather
independent from the core version and actually psql makes things a
bit tricky with CREATE INDEX CONCURRENTLY using the same keywords.
---
 doc/src/sgml/mvcc.sgml                             |   5 +-
 doc/src/sgml/ref/reindex.sgml                      | 160 +++-
 src/backend/catalog/index.c                        | 476 ++++++++++--
 src/backend/catalog/toasting.c                     |   2 +-
 src/backend/commands/indexcmds.c                   | 821 ++++++++++++++++++---
 src/backend/commands/tablecmds.c                   |  33 +-
 src/backend/executor/execUtils.c                   |  14 +
 src/backend/nodes/copyfuncs.c                      |   1 +
 src/backend/nodes/equalfuncs.c                     |   1 +
 src/backend/parser/gram.y                          |  17 +-
 src/backend/tcop/utility.c                         |  12 +-
 src/bin/psql/common.c                              |  17 +
 src/include/catalog/index.h                        |  19 +-
 src/include/commands/defrem.h                      |   7 +-
 src/include/nodes/parsenodes.h                     |   1 +
 .../isolation/expected/reindex-concurrently.out    |  78 ++
 src/test/isolation/isolation_schedule              |   1 +
 src/test/isolation/specs/reindex-concurrently.spec |  40 +
 src/test/regress/expected/create_index.out         |  57 ++
 src/test/regress/sql/create_index.sql              |  42 ++
 20 files changed, 1615 insertions(+), 189 deletions(-)
 create mode 100644 src/test/isolation/expected/reindex-concurrently.out
 create mode 100644 src/test/isolation/specs/reindex-concurrently.spec

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index cd55be8..653b120 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -864,7 +864,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
-         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
+         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
+         <command>REINDEX CONCURRENTLY</>,
          <command>ALTER TABLE VALIDATE</command> and other
          <command>ALTER TABLE</command> variants (for full details see
          <xref linkend="SQL-ALTERTABLE">).
@@ -1143,7 +1144,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
    <sect2 id="locking-pages">
     <title>Page-level Locks</title>
-  
+
     <para>
      In addition to table and row locks, page-level share/exclusive locks are
      used to control read/write access to table pages in the shared buffer
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index cabae19..0b7a93c 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -68,9 +68,22 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
       An index build with the <literal>CONCURRENTLY</> option failed, leaving
       an <quote>invalid</> index. Such indexes are useless but it can be
       convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      <command>REINDEX</> will perform a concurrent build if <literal>
+      CONCURRENTLY</> is specified. To build the index without interfering
+      with production you should either drop the index and reissue the
+      <command>CREATE INDEX CONCURRENTLY</> command, or issue
+      <command>REINDEX CONCURRENTLY</>. Indexes of toast relations can be
+      rebuilt with <command>REINDEX CONCURRENTLY</>.
+     </para>
+    </listitem>
+
+    <listitem>
+     <para>
+      Concurrent indexes based on a <literal>PRIMARY KEY</> or an exclusion
+      constraint need to be dropped with
+      <literal>ALTER TABLE DROP CONSTRAINT</>. This is also the case for
+      <literal>UNIQUE</> indexes backed by constraints. Other indexes can be
+      dropped using <literal>DROP INDEX</>, including invalid toast indexes.
      </para>
     </listitem>
 
@@ -139,6 +152,21 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    </varlistentry>
 
    <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>FORCE</literal></term>
     <listitem>
      <para>
@@ -218,6 +246,117 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    reindex anything.
   </para>
 
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</> option of <command>REINDEX</>. When this option
+    is used, <productname>PostgreSQL</> must perform two scans of the table
+    for each index that needs to be rebuilt and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    In a concurrent index build, a new index whose storage will replace the
+    one to be rebuilt is actually entered into the system catalogs in one
+    transaction, then two table scans occur in two more transactions. Once this
+    is performed, the old and fresh indexes are swapped by taking an
+    <literal>ACCESS EXCLUSIVE</> lock. Finally two additional transactions
+    are used to mark the concurrent index as not ready and then drop it.
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the concurrent
+    index and run <command>REINDEX CONCURRENTLY</> again.
+    The concurrent index created during the processing has a name ending with
+    the suffix cct. This also works for indexes of toast relations.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot. <command>REINDEX DATABASE</> is
+    by default not allowed to run inside a transaction block, so in this case
+    <command>CONCURRENTLY</> is not supported.
+   </para>
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>. Valid indexes, being unique
+    for a given toast relation, cannot be dropped.
+   </para>
+
+   <para>
+    <command>REINDEX DATABASE</command> used with <command>CONCURRENTLY
+    </command> rebuilds concurrently only the non-system relations. System
+    relations are rebuilt with a non-concurrent context. Toast indexes are
+    rebuilt concurrently if the relation they depend on is a non-system
+    relation.
+   </para>
+
+   <para>
+    <command>REINDEX</command> takes an <literal>ACCESS EXCLUSIVE</literal>
+    lock on all the relations involved during the operation. When
+    <command>CONCURRENTLY</command> is specified, the operation is done with
+    <literal>SHARE UPDATE EXCLUSIVE</literal>, except when an index and its
+    concurrent entry are swapped, at which point an <literal>ACCESS
+    EXCLUSIVE</literal> lock is taken on the parent relation.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command>.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -249,7 +388,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild the indexes of a table while allowing read and write operations
+   on the relations involved:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index ee10594..d55803f 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -44,9 +44,11 @@
 #include "catalog/pg_trigger.h"
 #include "catalog/pg_type.h"
 #include "catalog/storage.h"
+#include "commands/defrem.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -674,6 +676,10 @@ UpdateIndexRelation(Oid indexoid,
  *		will be marked "invalid" and the caller must take additional steps
  *		to fix it up.
  * is_internal: if true, post creation hook for new index
+ * is_reindex: if true, create an index that is used as a duplicate of an
+ *		existing index created during a concurrent operation. This index can
+ *		also be a toast relation. Sufficient locks are normally taken on
+ *		the related relations once this is called during a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -697,7 +703,8 @@ index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal)
+			 bool is_internal,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -740,19 +747,22 @@ index_create(Relation heapRelation,
 
 	/*
 	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * release locks before committing in catalogs. If the index is created during
+	 * a REINDEX CONCURRENTLY operation, sufficient locks are already taken.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemRelation(heapRelation) &&
+		!is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
 
 	/*
-	 * This case is currently not supported, but there's no way to ask for it
-	 * in the grammar anyway, so it can't happen.
+	 * This case is currently supported only during a concurrent index
+	 * rebuild; there is no way to ask for it in the grammar otherwise.
 	 */
-	if (concurrent && is_exclusion)
+	if (concurrent && is_exclusion && !is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -1093,6 +1103,412 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+
+/*
+ * index_concurrent_create
+ *
+ * Create an index based on the given one that will be used for concurrent
+ * operations. The index is inserted into catalogs and needs to be built later
+ * on. This is called during concurrent index processing. The heap relation
+ * on which the index is based needs to be closed by the caller.
+ */
+Oid
+index_concurrent_create(Relation heapRelation, Oid indOid, char *concurrentName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	List	   *columnNames = NIL;
+	List	   *indexprs = NIL;
+	ListCell   *indexpr_item;
+	int			i;
+	HeapTuple	indexTuple, classTuple;
+	Datum		indclassDatum, colOptionDatum, optionDatum;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	bool		initdeferred = false;
+	Oid			constraintOid = get_index_constraint(indOid);
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* The concurrent index uses the same index information as the former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/*
+	 * Determine whether the index is initially deferred; this depends on
+	 * its dependent constraint, if any.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		/* Look for the correct value */
+		HeapTuple			constraintTuple;
+		Form_pg_constraint	constraintForm;
+
+		constraintTuple = SearchSysCache1(CONSTROID,
+									 ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "cache lookup failed for constraint %u",
+				 constraintOid);
+		constraintForm = (Form_pg_constraint) GETSTRUCT(constraintTuple);
+		initdeferred = constraintForm->condeferred;
+
+		ReleaseSysCache(constraintTuple);
+	}
+
+	/* Get expressions associated with this index to build the column names */
+	indexprs = RelationGetIndexExpressions(indexRelation);
+	indexpr_item = list_head(indexprs);
+
+	/* Build the list of column names, necessary for index_create */
+	for (i = 0; i < indexInfo->ii_NumIndexAttrs; i++)
+	{
+		char	   *origname, *curname;
+		char		buf[NAMEDATALEN];
+		AttrNumber	attnum = indexInfo->ii_KeyAttrNumbers[i];
+		int			j;
+
+		/* Pick up column name depending on attribute type */
+		if (attnum > 0)
+		{
+			/*
+			 * This is a column attribute, so simply pick column name from
+			 * relation.
+			 */
+			Form_pg_attribute attform = heapRelation->rd_att->attrs[attnum - 1];
+			origname = pstrdup(NameStr(attform->attname));
+		}
+		else if (attnum < 0)
+		{
+			/* Case of a system attribute */
+			Form_pg_attribute attform = SystemAttributeDefinition(attnum,
+										  heapRelation->rd_rel->relhasoids);
+			origname = pstrdup(NameStr(attform->attname));
+		}
+		else
+		{
+			Node *indnode;
+			/*
+			 * This is the case of an expression, so pick up the expression
+			 * name.
+			 */
+			Assert(indexpr_item != NULL);
+			indnode = (Node *) lfirst(indexpr_item);
+			indexpr_item = lnext(indexpr_item);
+			origname = deparse_expression(indnode,
+							deparse_context_for(RelationGetRelationName(heapRelation),
+												RelationGetRelid(heapRelation)),
+							false, false);
+		}
+
+		/*
+		 * Check if the name picked has any conflict with existing names and
+		 * change it.
+		 */
+		curname = origname;
+		for (j = 1;; j++)
+		{
+			ListCell   *lc2;
+			char		nbuf[32];
+			int			nlen;
+
+			foreach(lc2, columnNames)
+			{
+				if (strcmp(curname, (char *) lfirst(lc2)) == 0)
+					break;
+			}
+			if (lc2 == NULL)
+				break; /* found nonconflicting name */
+
+			sprintf(nbuf, "%d", j);
+
+			/* Ensure generated names are shorter than NAMEDATALEN */
+			nlen = pg_mbcliplen(origname, strlen(origname),
+								NAMEDATALEN - 1 - strlen(nbuf));
+			memcpy(buf, origname, nlen);
+			strcpy(buf + nlen, nbuf);
+			curname = buf;
+		}
+
+		/* Append name to existing list */
+		columnNames = lappend(columnNames, pstrdup(curname));
+	}
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 (const char *) concurrentName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 columnNames,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexRelation->rd_index->indisprimary,
+								 OidIsValid(constraintOid),	/* is constraint? */
+								 !indexRelation->rd_index->indimmediate,	/* is deferrable? */
+								 initdeferred,	/* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false,	/* is_internal */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
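
The column-name deduplication loop above follows the usual pattern of appending a numeric suffix while clipping the base so the result stays under NAMEDATALEN. A standalone sketch of that scheme (not part of the patch; byte-based clipping stands in for the multibyte-aware pg_mbcliplen(), and all names here are hypothetical):

```c
#include <stdio.h>
#include <string.h>

#define NAMEDATALEN 64			/* default PostgreSQL identifier limit */

/*
 * Pick a name based on origname that conflicts with none of the nnames
 * entries in names[]; on conflict, append 1, 2, 3, ... while clipping the
 * base so the result stays under NAMEDATALEN bytes.
 */
static void
choose_unique_name(const char *origname, char *const *names, int nnames,
				   char buf[NAMEDATALEN])
{
	int			j;

	snprintf(buf, NAMEDATALEN, "%s", origname);

	for (j = 1;; j++)
	{
		int			i;
		int			conflict = 0;
		char		nbuf[32];
		size_t		nlen;

		for (i = 0; i < nnames; i++)
			if (strcmp(buf, names[i]) == 0)
				conflict = 1;
		if (!conflict)
			return;				/* found a nonconflicting name */

		snprintf(nbuf, sizeof(nbuf), "%d", j);
		/* clip the base name so base + suffix fits in NAMEDATALEN - 1 */
		nlen = strlen(origname);
		if (nlen > NAMEDATALEN - 1 - strlen(nbuf))
			nlen = NAMEDATALEN - 1 - strlen(nbuf);
		memcpy(buf, origname, nlen);
		strcpy(buf + nlen, nbuf);
	}
}
```

With existing names {"a", "a1"}, base name "a" resolves to "a2"; a base name with no conflict is kept as-is.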
+
+
+/*
+ * index_concurrent_build
+ *
+ * Build an index for a concurrent operation. Low-level locks are taken when
+ * this operation is performed, sufficient only to prevent schema changes,
+ * and they need to be kept until the end of the transaction performing this
+ * operation.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	heapRel, indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost at the
+	 * commit of the transaction in which this concurrent index was created
+	 * at the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both relations, and keep the locks */
+	heap_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap the old and new indexes in a concurrent context. An exclusive lock
+ * is taken on both relations while their relfilenodes are swapped.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid)
+{
+	Relation		oldIndexRel, newIndexRel, pg_class;
+	HeapTuple		oldIndexTuple, newIndexTuple;
+	Form_pg_class	oldIndexForm, newIndexForm;
+	Oid				tmpnode;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldIndexRel = relation_open(oldIndexOid, AccessExclusiveLock);
+	newIndexRel = relation_open(newIndexOid, AccessExclusiveLock);
+
+	/* Now swap relfilenode of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+	oldIndexForm = (Form_pg_class) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_class) GETSTRUCT(newIndexTuple);
+
+	/* Here is where the actual swap happens */
+	tmpnode = oldIndexForm->relfilenode;
+	oldIndexForm->relfilenode = newIndexForm->relfilenode;
+	newIndexForm->relfilenode = tmpnode;
+
+	/* Then update the tuples for each relation */
+	simple_heap_update(pg_class, &oldIndexTuple->t_self, oldIndexTuple);
+	simple_heap_update(pg_class, &newIndexTuple->t_self, newIndexTuple);
+	CatalogUpdateIndexes(pg_class, oldIndexTuple);
+	CatalogUpdateIndexes(pg_class, newIndexTuple);
+
+	/* Close relations and clean up */
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+	heap_close(pg_class, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldIndexRel, NoLock);
+	relation_close(newIndexRel, NoLock);
+}
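
Stripped of the catalog machinery, the swap above is a plain exchange of the two relfilenode values while both relations are exclusively locked: the index OIDs (and hence every dependency that references them by OID) stay put, only the storage assignment moves, which is why the old index can later be dropped without disturbing anything. A minimal sketch with a hypothetical struct standing in for the relevant part of a pg_class row:

```c
/* Hypothetical stand-in for the relfilenode field of a pg_class row */
typedef struct FakePgClassForm
{
	unsigned int relfilenode;
} FakePgClassForm;

/*
 * Exchange the on-disk file assignments of two index entries; the entries
 * themselves (and their OIDs) are untouched.
 */
static void
swap_relfilenodes(FakePgClassForm *oldidx, FakePgClassForm *newidx)
{
	unsigned int tmpnode = oldidx->relfilenode;

	oldidx->relfilenode = newidx->relfilenode;
	newidx->relfilenode = tmpnode;
}
```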
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all the backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction calling this
+ * function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid, LOCKTAG locktag)
+{
+	Relation	heapRelation, indexRelation;
+
+	/*
+	 * Now we must wait until no running transaction could be using the
+	 * index for a query.  Use AccessExclusiveLock here to check for
+	 * running transactions that hold locks of any kind on the table. Note
+	 * we do not need to worry about xacts that open the table for reading
+	 * after this point; they will see the index as invalid when they open
+	 * the relation.
+	 *
+	 * Note: the reason we use actual lock acquisition here, rather than
+	 * just checking the ProcArray and sleeping, is that deadlock is
+	 * possible if one of the transactions in question is blocked trying
+	 * to acquire an exclusive lock on our table. The lock code will
+	 * detect deadlock and error out properly.
+	 */
+	WaitForLockers(locktag, AccessExclusiveLock);
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're
+	 * about to stop doing inserts into the index which could show
+	 * conflicts with existing predicate locks, so now is the time to move
+	 * them to the heap relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just
+	 * might have it open for updating it.  So now we can unset indisready
+	 * and indislive, then wait till nobody could be using it at all
+	 * anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit
+	 * all sessions will refresh the table's index list.  Forgetting just
+	 * the index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index concurrently as the last step of concurrent index
+ * processing. Deletion is done through performDeletion; otherwise the
+ * dependencies of the index would not get dropped. At this point all the
+ * indexes are already considered invalid and dead, so they can be dropped
+ * without using any concurrent options, as it is certain that they will
+ * not interact with other server sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid				constraintOid = get_index_constraint(indexOid);
+	ObjectAddress	object;
+	Form_pg_index	indexForm;
+	Relation		pg_index;
+	HeapTuple		indexTuple;
+
+	/*
+	 * Check that the index being dropped here is not alive; if it were, it
+	 * might still be in use by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, just to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process.
+	 * Register constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object,
+					DROP_RESTRICT,
+					0);
+}
+
+
 /*
  * index_constraint_create
  *
@@ -1441,52 +1857,8 @@ index_drop(Oid indexId, bool concurrent)
 		CommitTransactionCommand();
 		StartTransactionCommand();
 
-		/*
-		 * Now we must wait until no running transaction could be using the
-		 * index for a query.  Use AccessExclusiveLock here to check for
-		 * running transactions that hold locks of any kind on the table. Note
-		 * we do not need to worry about xacts that open the table for reading
-		 * after this point; they will see the index as invalid when they open
-		 * the relation.
-		 *
-		 * Note: the reason we use actual lock acquisition here, rather than
-		 * just checking the ProcArray and sleeping, is that deadlock is
-		 * possible if one of the transactions in question is blocked trying
-		 * to acquire an exclusive lock on our table.  The lock code will
-		 * detect deadlock and error out properly.
-		 */
-		WaitForLockers(heaplocktag, AccessExclusiveLock);
-
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId, heaplocktag);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 160f006..73520e5 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -342,7 +342,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 rel->rd_rel->reltablespace,
 				 collationObjectId, classObjectId, coloptions, (Datum) 0,
 				 true, false, false, false,
-				 true, false, false, true);
+				 true, false, false, true, false);
 
 	heap_close(toast_rel, NoLock);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 8a1cb4b..9c307a1 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -68,8 +68,9 @@ static void ComputeIndexAttrs(IndexInfo *indexInfo,
 static Oid GetIndexOpClass(List *opclass, Oid attrType,
 				char *accessMethodName, Oid accessMethodId);
 static char *ChooseIndexName(const char *tabname, Oid namespaceId,
-				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint);
+							 List *colnames, List *exclusionOpNames,
+							 bool primary, bool isconstraint,
+							 bool concurrent);
 static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
@@ -276,6 +277,86 @@ CheckIndexCompatible(Oid oldId,
 }
 
 /*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given
+ * xmin limit, because the index might not contain tuples deleted just
+ * before that snapshot was taken. Obtain a list of VXIDs of such
+ * transactions, and wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int i, n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue; /* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int n_newer_snapshots, j, k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue; /* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
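
The recheck step inside WaitForOlderSnapshots() reduces to a generic pattern: before sleeping on each member of the original snapshot list, re-fetch the set of still-interesting transactions and forget any member that has disappeared, so we never wait on a transaction that already went idle. A standalone sketch of just that pruning step (not part of the patch; integer IDs and the stub predicate stand in for VirtualTransactionIds and GetCurrentVirtualXIDs()):

```c
#include <stdbool.h>

#define INVALID_VXID (-1)		/* stand-in for SetInvalidVirtualTransactionId */

/* Stub for "does this vxid still show up in the re-fetched list?" */
static bool
still_running(int vxid, const int *running, int nrunning)
{
	int			i;

	for (i = 0; i < nrunning; i++)
		if (running[i] == vxid)
			return true;
	return false;
}

/*
 * Invalidate every waitlist entry that no longer appears among the running
 * vxids, so the caller never sleeps on a transaction that already went
 * idle; returns the number of entries still to be waited for.
 */
static int
prune_waitlist(int *waitlist, int nwait, const int *running, int nrunning)
{
	int			i;
	int			remaining = 0;

	for (i = 0; i < nwait; i++)
	{
		if (waitlist[i] != INVALID_VXID &&
			!still_running(waitlist[i], running, nrunning))
			waitlist[i] = INVALID_VXID;
		if (waitlist[i] != INVALID_VXID)
			remaining++;
	}
	return remaining;
}
```

In the real code the pruning and the VirtualXactLock() waits are interleaved; this sketch isolates only the forget-if-gone logic.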
+
+
+/*
  * DefineIndex
  *		Creates a new index.
  *
@@ -312,7 +393,6 @@ DefineIndex(Oid relationId,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	bool		amcanorder;
@@ -322,13 +402,10 @@ DefineIndex(Oid relationId,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
 	Snapshot	snapshot;
-	int			i;
 
 	/*
 	 * count attributes in index
@@ -459,7 +536,8 @@ DefineIndex(Oid relationId,
 											indexColNames,
 											stmt->excludeOpNames,
 											stmt->primary,
-											stmt->isconstraint);
+											stmt->isconstraint,
+											false);
 
 	/*
 	 * look up the access method, verify it can handle the requested features
@@ -610,7 +688,7 @@ DefineIndex(Oid relationId,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
-					 stmt->concurrent, !check_rights);
+					 stmt->concurrent, !check_rights, false);
 
 	/* Add any requested comment */
 	if (stmt->idxcomment != NULL)
@@ -692,27 +770,15 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/*
 	 * Update the pg_index row to mark the index as ready for inserts. Once we
@@ -777,74 +843,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-										 PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots)		/* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -871,6 +872,541 @@ DefineIndex(Oid relationId,
 
 
 /*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for a given relation OID. The relation can be
+ * either an index or a table. If a table is specified, each step of REINDEX
+ * CONCURRENTLY is performed on all the table's indexes at the same time, as
+ * well as on its dependent toast indexes.
+ */
+bool
+ReindexRelationConcurrently(Oid relationOid)
+{
+	List	   *concurrentIndexIds = NIL,
+			   *indexIds = NIL,
+			   *parentRelationIds = NIL,
+			   *lockTags = NIL,
+			   *relationLocks = NIL;
+	ListCell   *lc, *lc2;
+	Snapshot	snapshot;
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation OID given by the caller. If the relkind of the given relation
+	 * OID is a table, all its valid indexes will be rebuilt, including its
+	 * associated toast table indexes. If the relkind is an index, this index
+	 * itself will be rebuilt. The locks taken on the parent relations and
+	 * the involved indexes are kept until this transaction is committed, to
+	 * protect against schema changes that might occur before the session
+	 * lock is taken on each relation.
+	 */
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes
+				 * including toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc2, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc2);
+					Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+						indexIds = lappend_oid(indexIds, cellOid);
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+												ShareUpdateExclusiveLock);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+							indexIds = lappend_oid(indexIds, cellOid);
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its OID to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(IndexGetRelation(relationOid, false));
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+					indexIds = list_make1_oid(relationOid);
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process of rebuilding the index entries concurrently.
+	 * First we need to create an index based on the same data as the former
+	 * index, except that it will only be registered in the catalogs and
+	 * built afterwards. All the operations can be performed on all the
+	 * indexes of a parent relation at the same time, including the indexes
+	 * of its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId  *lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the parent of the index, might be a toast or plain relation */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a relation name for the concurrent index */
+		concurrentName = ChooseIndexName(get_rel_name(indOid),
+										 get_rel_namespace(indexRel->rd_index->indrelid),
+										 NULL,
+										 NULL,
+										 false,
+										 false,
+										 true);
+
+		/* Create the concurrent index based on the given index */
+		concurrentOid = index_concurrent_create(indexParentRel,
+												indOid,
+												concurrentName);
+
+		/*
+		 * Now open the relation of the concurrent index; a lock is needed on
+		 * it as well.
+		 */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the concurrent index Oid */
+		concurrentIndexIds = lappend_oid(concurrentIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelid of each concurrent relation to protect it from
+		 * being dropped, then close the relations. Each entry must be
+		 * palloc'd: appending the address of a local variable would leave
+		 * dangling pointers in the list. The lockrelid of the parent relation
+		 * is not taken here to avoid acquiring multiple locks on the same
+		 * relation; instead we rely on parentRelationIds built earlier.
+		 */
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the following visibility checks; other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId  *lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		LOCKTAG	   *heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/*
+		 * Add the lockrelid of the parent relation to the list of locked
+		 * relations. It must be palloc'd, not the address of a local
+		 * variable, since the list outlives this loop iteration.
+		 */
+		*lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid->dbId, lockrelid->relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so as no other transactions will try
+	 * to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session level lock on the relation, the
+	 * concurrent index and its copy to insure that none of them are dropped
+	 * until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the concurrent indexes in a separate transaction for each index
+	 * to avoid having transactions open for an unnecessarily long time. A
+	 * concurrent build is done for each concurrent index that will replace
+	 * an old one. Before doing that, we need to wait on the parent relations
+	 * until no running transaction could still have the parent table of an
+	 * index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+		bool		primary;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Index relation has been closed by the previous commit, so reopen
+		 * it to determine if it is used as a primary key, and save its
+		 * parent relation Oid before closing it again.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		primary = indexRel->rd_index->indisprimary;
+		relOid = indexRel->rd_index->indrelid;
+		index_close(indexRel, NoLock);
+
+		/* Perform concurrent build of new index */
+		index_concurrent_build(relOid, concurrentOid, primary);
+
+		/*
+		 * Update the pg_index row of the concurrent index as ready for inserts.
+		 * Once we commit this transaction, any new transactions that open the
+		 * table must insert new entries into the index for insertions and
+		 * non-HOT updates.
+		 */
+		index_set_state_flags(concurrentOid, INDEX_CREATE_SET_READY);
+
+		/* we can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update of the
+		 * concurrent index visible.
+		 */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the concurrent indexes catch up with any INSERTs
+	 * that might have occurred in the parent table in the meantime.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to avoid keeping transactions open for
+	 * an unnecessarily long time.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Perform a scan of each concurrent index with the heap, then insert
+	 * any missing index entries.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid				indOid = lfirst_oid(lc);
+		Oid				relOid;
+		TransactionId	limitXmin;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for the validation of
+		 * this concurrent index.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate index, which might be a toast */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save its xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * The concurrent index is now valid, as it contains all the
+		 * necessary tuples. However, it might not contain tuples deleted
+		 * just before the reference snapshot was taken, so we need to wait
+		 * out any transactions that might have snapshots older than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction to make the concurrent index valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated, it is necessary
+	 * to swap each concurrent index with its corresponding old index. Note
+	 * that the concurrent index used for the swap is not marked as valid,
+	 * because the former index and the concurrent index need to keep
+	 * distinct valid statuses; this avoids an explosion in the number of
+	 * indexes a parent relation could accumulate if this step fails
+	 * multiple times in a row for one reason or another.
+	 */
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/*
+		 * Each index needs to be swapped in a separate transaction, so start
+		 * a new one.
+		 */
+		StartTransactionCommand();
+
+		/* Swap old index and its concurrent entry */
+		index_concurrent_swap(concurrentOid, indOid);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		relOid = IndexGetRelation(indOid, false);
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/* Commit this transaction and make old index invalidation visible */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * The indexes now hold the fresh relfilenodes of their respective
+	 * concurrent entries. It is time to mark the now-useless concurrent
+	 * entries as not ready so that they can be safely discarded from write
+	 * operations that may occur on them. A separate transaction is used for
+	 * each index entry.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		LOCKTAG	   *heapLockTag = NULL;
+		ListCell   *cell;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Find the locktag of parent table for this index, we need to wait for
+		 * locks on it.
+		 */
+		foreach(cell, lockTags)
+		{
+			LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+			if (relOid == localTag->locktag_field2)
+				heapLockTag = localTag;
+		}
+		Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+		/*
+		 * Finish the index invalidation and set it as dead. Note that it is
+		 * necessary to wait for virtual locks on the parent relation before
+		 * setting the index as dead.
+		 */
+		index_concurrent_set_dead(relOid, indOid, *heapLockTag);
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the concurrent indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe because all the concurrent entries are
+	 * already considered invalid and not ready, so no other backend will
+	 * use them for any read or write operation.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid indexOid = lfirst_oid(lc);
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start transaction to drop this index */
+		StartTransactionCommand();
+
+		/* Get fresh snapshot for next step */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/* Drop this concurrent index entry */
+		index_concurrent_drop(indexOid);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * The last thing to do is release the session-level locks on the parent
+	 * tables and their indexes.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Start a new transaction to finish the process properly */
+	StartTransactionCommand();
+
+	/* Get fresh snapshot for the end of process */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	return true;
+}
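As a reviewer aid, the six phases implemented above can be summarized at the SQL level as follows. This is only an illustrative sketch of the internal steps (the `_cct` suffix comes from the ChooseIndexName change below); these are not commands a user would run.

```sql
-- Sketch of what REINDEX INDEX CONCURRENTLY ind does internally;
-- the real work is done by the C functions above, not by user-visible SQL.

-- Phase 1: register an empty twin index "ind_cct", invalid and not ready
-- Phase 2: build ind_cct, then mark it indisready so new writes maintain it
-- Phase 3: validate_index() inserts any missing entries, then wait out
--          snapshots older than the reference snapshot
-- Phase 4: swap ind and ind_cct, keeping ind_cct marked as invalid
-- Phase 5: mark ind_cct (now carrying the old data) as dead
-- Phase 6: drop ind_cct, using the same path as DROP INDEX CONCURRENTLY
```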
+
+
+/*
  * CheckMutability
  *		Test whether given expression is mutable
  */
@@ -1533,7 +2069,8 @@ ChooseRelationName(const char *name1, const char *name2,
 static char *
 ChooseIndexName(const char *tabname, Oid namespaceId,
 				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint)
+				bool primary, bool isconstraint,
+				bool concurrent)
 {
 	char	   *indexname;
 
@@ -1559,6 +2096,13 @@ ChooseIndexName(const char *tabname, Oid namespaceId,
 									   "key",
 									   namespaceId);
 	}
+	else if (concurrent)
+	{
+		indexname = ChooseRelationName(tabname,
+									   NULL,
+									   "cct",
+									   namespaceId);
+	}
 	else
 	{
 		indexname = ChooseRelationName(tabname,
@@ -1671,18 +2215,22 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation)
+ReindexIndex(RangeVar *indexRelation, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
 
-	/* lock level used here should match index lock reindex_index() */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
-									  false, false,
-									  RangeVarCallbackForReindexIndex,
-									  (void *) &heapOid);
+	indOid = RangeVarGetRelidExtended(indexRelation,
+				concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+				concurrent, concurrent,
+				RangeVarCallbackForReindexIndex,
+				(void *) &heapOid);
 
-	reindex_index(indOid, false);
+	/* Continue process for concurrent or non-concurrent case */
+	if (!concurrent)
+		reindex_index(indOid, false);
+	else
+		ReindexRelationConcurrently(indOid);
 
 	return indOid;
 }
@@ -1751,17 +2299,27 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation)
+ReindexTable(RangeVar *relation, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
-									   RangeVarCallbackOwnsTable, NULL);
-
-	if (!reindex_relation(heapOid,
+	heapOid = RangeVarGetRelidExtended(relation,
+		concurrent ? ShareUpdateExclusiveLock : ShareLock,
+		concurrent, concurrent,
+		RangeVarCallbackOwnsTable, NULL);
+
+	/* Run the concurrent process if necessary */
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid);
+	else
+		result = reindex_relation(heapOid,
 						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS))
+							 REINDEX_REL_CHECK_CONSTRAINTS);
+
+	/* Let the user know if the operation did nothing */
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -1778,7 +2336,10 @@ ReindexTable(RangeVar *relation)
  * That means this must not be called within a user transaction block!
  */
 Oid
-ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
+ReindexDatabase(const char *databaseName,
+				bool do_system,
+				bool do_user,
+				bool concurrent)
 {
 	Relation	relationRelation;
 	HeapScanDesc scan;
@@ -1790,6 +2351,15 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 
 	AssertArg(databaseName);
 
+	/*
+	 * A CONCURRENTLY operation is not allowed for system catalogs, but it
+	 * is for the user tables of a database.
+	 */
+	if (concurrent && !do_user)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot reindex system catalogs concurrently")));
+
 	if (strcmp(databaseName, get_database_name(MyDatabaseId)) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1874,17 +2444,42 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result = false;
+		bool		process_concurrent;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS))
+
+		/* Determine if relation needs to be processed concurrently */
+		process_concurrent = concurrent &&
+			!IsSystemNamespace(get_rel_namespace(relid));
+
+		/*
+		 * Reindex the relation with a concurrent or non-concurrent process.
+		 * System relations cannot be reindexed concurrently, but they still
+		 * need to be reindexed with the normal process, pg_class included,
+		 * as they could be corrupted and the concurrent process itself
+		 * relies on them. This does not include toast relations, which are
+		 * reindexed when their parent relation is processed.
+		 */
+		if (process_concurrent)
+		{
+			old = MemoryContextSwitchTo(private_context);
+			result = ReindexRelationConcurrently(relid);
+			MemoryContextSwitchTo(old);
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS);
+
+		if (result)
 			ereport(NOTICE,
-					(errmsg("table \"%s.%s\" was reindexed",
+					(errmsg("table \"%s.%s\" was reindexed%s",
 							get_namespace_name(get_rel_namespace(relid)),
-							get_rel_name(relid))));
+							get_rel_name(relid),
+							process_concurrent ? " concurrently" : "")));
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 	}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index cb16c53..723037a 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -906,6 +906,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	char		relkind;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -941,7 +942,37 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check the case of a system index that might have been invalidated by a
+	 * failed concurrent process and allow its drop. For the time being, this
+	 * only concerns indexes of toast relations that became invalid during a
+	 * REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) &&
+		relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index d5e1273..8690eeb 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1201,6 +1201,20 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
 	}
 
 	/*
+	 * An invalid index can only exist when created in a concurrent context,
+	 * and since CREATE INDEX CONCURRENTLY does not support exclusion
+	 * constraints, this code path can only be taken by REINDEX
+	 * CONCURRENTLY. In that case the same index exists in parallel to this
+	 * one, so we can bypass this check, as it has already been done on the
+	 * other index. If exclusion constraints are supported by CREATE INDEX
+	 * CONCURRENTLY in the future, this will need to be removed or adapted
+	 * accordingly.
+	 */
+	if (!index->rd_index->indisvalid)
+		return true;
+
+	/*
 	 * Search the tuples that are in the index for any violations, including
 	 * tuples that aren't visible yet.
 	 */
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 225756c..a19b6bc 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3724,6 +3724,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(do_system);
 	COPY_SCALAR_FIELD(do_user);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 905468e..474d5ed 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1901,6 +1901,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(do_system);
 	COMPARE_SCALAR_FIELD(do_user);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 77d2f29..9c4f6db 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7172,35 +7172,38 @@ opt_if_exists: IF_P EXISTS						{ $$ = TRUE; }
  *
  *		QUERY:
  *
- *		REINDEX type <name> [FORCE]
+ *		REINDEX type [CONCURRENTLY] <name> [FORCE]
  *
  * FORCE no longer does anything, but we accept it for backwards compatibility
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_type qualified_name opt_force
+			REINDEX reindex_type opt_concurrently qualified_name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					$$ = (Node *)n;
 				}
-			| REINDEX SYSTEM_P name opt_force
+			| REINDEX SYSTEM_P opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = false;
 					$$ = (Node *)n;
 				}
-			| REINDEX DATABASE name opt_force
+			| REINDEX DATABASE opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = true;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 4a2a339..5339676 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -744,16 +744,20 @@ standard_ProcessUtility(Node *parsetree,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				switch (stmt->kind)
 				{
 					case OBJECT_INDEX:
-						ReindexIndex(stmt->relation);
+						ReindexIndex(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_TABLE:
 					case OBJECT_MATVIEW:
-						ReindexTable(stmt->relation);
+						ReindexTable(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_DATABASE:
 
@@ -765,8 +769,8 @@ standard_ProcessUtility(Node *parsetree,
 						 */
 						PreventTransactionChain(isTopLevel,
 												"REINDEX DATABASE");
-						ReindexDatabase(stmt->name,
-										stmt->do_system, stmt->do_user);
+						ReindexDatabase(stmt->name, stmt->do_system,
+										stmt->do_user, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 0f83799..49dad77 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1662,6 +1662,23 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY is not allowed inside
+			 * a transaction block.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
+
 		return false;
 	}
 
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 006b180..b5a528f 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -60,7 +60,24 @@ extern Oid index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal);
+			 bool is_internal,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create(Relation heapRelation,
+								   Oid indOid,
+								   char *concurrentName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid,
+									  LOCKTAG locktag);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern void index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 0ebdbc1..b988555 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -28,10 +28,11 @@ extern Oid DefineIndex(Oid relationId,
 			bool check_rights,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation);
-extern Oid	ReindexTable(RangeVar *relation);
+extern Oid	ReindexIndex(RangeVar *indexRelation, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, bool concurrent);
 extern Oid ReindexDatabase(const char *databaseName,
-				bool do_system, bool do_user);
+							bool do_system, bool do_user, bool concurrent);
+extern bool ReindexRelationConcurrently(Oid relOid);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index f3aa69e..87391e2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2727,6 +2727,7 @@ typedef struct ReindexStmt
 	const char *name;			/* name of database to reindex */
 	bool		do_system;		/* include system tables in database case */
 	bool		do_user;		/* include user tables in database case */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000..9e04169
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 3241a91..3e146d3 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -30,3 +30,4 @@ test: nowait-5
 test: drop-index-concurrently-1
 test: alter-table-1
 test: timeouts
+test: reindex-concurrently
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000..eb59fe0
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index a2bef7a..d1aae3c 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2786,3 +2786,60 @@ explain (costs off)
    Index Cond: ((thousand = 1) AND (tenthous = 1001))
 (2 rows)
 
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  cannot reindex system concurrently
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+Table "public.concur_reindex_tab"
+ Column |  Type   | Modifiers 
+--------+---------+-----------
+ c1     | integer | not null
+ c2     | text    | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index d4d24ef..93321c0 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -938,3 +938,45 @@ ORDER BY thousand;
 
 explain (costs off)
   select * from tenk1 where (thousand, tenthous) in ((1,1001), (null,null));
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
-- 
2.1.1

#13Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Michael Paquier (#12)
2 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On 10/1/14, 2:00 AM, Michael Paquier wrote:

On Wed, Aug 27, 2014 at 3:53 PM, Michael Paquier <michael.paquier@gmail.com <mailto:michael.paquier@gmail.com>> wrote:

On Wed, Aug 27, 2014 at 3:41 PM, Andres Freund <andres@2ndquadrant.com <mailto:andres@2ndquadrant.com>> wrote:

Can you add it to the next CF? I'll try to look earlier, but can't
promise anything.

I very much would like this to get committed in some form or another.

Added it here to keep track of it:
https://commitfest.postgresql.org/action/patch_view?id=1563

Attached is a fairly-refreshed patch that should be used as a base for the next commit fest. The following changes should be noticed:
- Use of AccessExclusiveLock when swapping relfilenodes of an index and its concurrent entry instead of ShareUpdateExclusiveLock for safety. At the limit of my understanding, that's the consensus reached until now.
- Cleanup of many comments and a refresh of the documentation that was poorly formulated or shaped in some places
- Addition of support for autocommit off in psql for REINDEX [ TABLE | INDEX ] CONCURRENTLY
- Some more code cleanup..
I haven't been through the tab completion support for psql, but looking at tab-complete.c this seems a bit tricky given the stuff related to CREATE INDEX CONCURRENTLY already present. Nothing huge though.

Patch applies against current HEAD and builds, but I'm getting 37 failed tests (mostly parallel, but also misc and WITH; results attached). Is that expected?

+   <para>
+    In a concurrent index build, a new index whose storage will replace the one
+    to be rebuild is actually entered into the system catalogs in one
+    transaction, then two table scans occur in two more transactions. Once this
+    is performed, the old and fresh indexes are swapped by taking a lock
+    <literal>ACCESS EXCLUSIVE</>. Finally two additional transactions
+    are used to mark the concurrent index as not ready and then drop it.
+   </para>

The "mark the concurrent index" bit is rather confusing; it sounds like it's referring to the new index instead of the old. Now that I've read the code I understand what's going on here between the concurrent index *entry* and the filenode swap, but I don't think the docs make this sufficiently clear to users.

How about something like this:

The following steps occur in a concurrent index build, each in a separate transaction. Note that if there are multiple indexes to be rebuilt then each step loops through all the indexes we're rebuilding, using a separate transaction for each one.

1. A new "temporary" index definition is entered into the catalog. This definition is only used to build the new index, and will be removed at the completion of the process.
2. A first pass index build is done.
3. A second pass is performed to add tuples that were added while the first pass build was running.
4. pg_class.relfilenode for the existing index definition and the "temporary" definition are swapped. This means that the existing index definition now uses the index data that we stored during the build, and the "temporary" definition is using the old index data.
5. The "temporary" index definition is marked as dead.
6. The "temporary" index definition and its data (which is now the data for the old index) are dropped.
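
To make step 4 concrete: because only pg_class.relfilenode is swapped, the index keeps its name and OID across the rebuild. A sketch of how one could observe this from SQL, assuming the patched server and the concur_reindex_ind1 index created in the regression tests above:

```sql
-- Sketch only: the index OID stays stable across the rebuild;
-- only pg_class.relfilenode (the on-disk storage) changes.
SELECT oid, relfilenode FROM pg_class
    WHERE relname = 'concur_reindex_ind1';

REINDEX INDEX CONCURRENTLY concur_reindex_ind1;

-- Same oid as before, different relfilenode.
SELECT oid, relfilenode FROM pg_class
    WHERE relname = 'concur_reindex_ind1';
```

This is also why dependent objects (like the foreign key on concur_reindex_tab2) survive the rebuild untouched: they reference the index by OID, which never changes.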

+ * index_concurrent_create
+ *
+ * Create an index based on the given one that will be used for concurrent
+ * operations. The index is inserted into catalogs and needs to be built later
+ * on. This is called during concurrent index processing. The heap relation
+ * on which is based the index needs to be closed by the caller.

Last bit presumably should be "on which the index is based".

+ /* Build the list of column names, necessary for index_create */
Instead of all this work, wouldn't it be easier to create a version of index_create/ConstructTupleDescriptor that will use the IndexInfo for the old index? ISTM index_concurrent_create() is doing a heck of a lot of work to marshal data into one form just to have it get marshaled yet again. Worst case, if we do have to play this game, there should be a stand-alone function to get the columns/expressions for an existing index; you're duplicating a lot of code from pg_get_indexdef_worker().
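
For reference, the SQL-callable wrapper around pg_get_indexdef_worker() already reconstructs a complete definition from the catalogs, which is roughly what the marshaling code rebuilds by hand (index name taken from the tests above):

```sql
-- pg_get_indexdef() is the SQL-level wrapper over pg_get_indexdef_worker(),
-- reconstructing columns, expressions, and options from the catalogs.
SELECT pg_get_indexdef('concur_reindex_ind3'::regclass);
-- e.g. CREATE UNIQUE INDEX concur_reindex_ind3
--      ON concur_reindex_tab USING btree (abs(c1))
```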

index_concurrent_swap(): Perhaps it'd be better to create index_concurrent_swap_setup() and index_concurrent_swap_cleanup() and refactor the duplicated code out... the actual function would then become:

ReindexRelationConcurrently()

+ * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each step of REINDEX
+ * CONCURRENTLY is done in parallel with all the table's indexes as well as
+ * its dependent toast indexes.
This comment is a bit misleading; we're not actually doing anything in parallel, right? AFAICT index_concurrent_build is going to block while each index is built the first time.

+	 * relkind is an index, this index itself will be rebuilt. The locks taken
+	 * parent relations and involved indexes are kept until this transaction
+	 * is committed to protect against schema changes that might occur until
+	 * the session lock is taken on each relation.

This comment is a bit unclear to me... at minimum I think it should be "* on parent relations" instead of "* parent relations", but I think it needs to elaborate on why/when we're also taking session level locks.

I also wordsmithed this comment a bit...
* Here begins the process for concurrently rebuilding the index entries.
* We need first to create an index which is based on the same data
* as the former index except that it will be only registered in catalogs
* and will be built later. It is possible to perform all the operations
* on all the indexes at the same time for a parent relation including
* indexes for its toast relation.

and this one...
* During this phase the concurrent indexes catch up with any new tuples that
* were created during the previous phase.
*
* We once again wait until no transaction can have the table open with
* the index marked as read-only for updates. Each index validation is done
* in a separate transaction to minimize how long we hold an open transaction.

+	 * a different valid status to avoid an implosion in the number of indexes
+	 * a parent relation could have if this operation step fails multiple times
+	 * in a row due to a reason or another.

I'd change that to "explosion in the number of indexes a parent relation could have if this operation fails."

Phases 4, 5 and 6 are rather confusing unless you understand that each "concurrent index" entry is meant to be thrown away. I think the Phase 4 comment should elaborate on that.

The comment in check_exclusion_constraint() is good; shouldn't the related comment on this line in index_create() mention that check_exclusion_constraint() needs to be changed if we ever support concurrent builds of exclusion indexes?
if (concurrent && is_exclusion && !is_reindex)

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Attachments:

regression.diffstext/plain; charset=UTF-8; name=regression.diffs; x-mac-creator=0; x-mac-type=0Download
*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/create_index.out	Tue Oct 28 15:52:02 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/create_index.out	Tue Oct 28 15:53:04 2014
***************
*** 2835,2867 ****
  CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
  REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
  REINDEX TABLE CONCURRENTLY concur_reindex_tab;
! REINDEX TABLE CONCURRENTLY concur_reindex_matview;
! -- Check errors
! -- Cannot run inside a transaction block
! BEGIN;
! REINDEX TABLE CONCURRENTLY concur_reindex_tab;
! ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
! COMMIT;
! REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
! ERROR:  concurrent reindex is not supported for shared relations
! REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
! ERROR:  concurrent reindex is not supported for catalog relations
! REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
! ERROR:  cannot reindex system concurrently
! -- Check the relation status, there should not be invalid indexes
! \d concur_reindex_tab
! Table "public.concur_reindex_tab"
!  Column |  Type   | Modifiers 
! --------+---------+-----------
!  c1     | integer | not null
!  c2     | text    | 
! Indexes:
!     "concur_reindex_ind1" PRIMARY KEY, btree (c1)
!     "concur_reindex_ind3" UNIQUE, btree (abs(c1))
!     "concur_reindex_ind2" btree (c2)
!     "concur_reindex_ind4" btree (c1, c1, c2)
! Referenced by:
!     TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
! 
! DROP MATERIALIZED VIEW concur_reindex_matview;
! DROP TABLE concur_reindex_tab, concur_reindex_tab2;
--- 2835,2841 ----
  CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
  REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
  REINDEX TABLE CONCURRENTLY concur_reindex_tab;
! server closed the connection unexpectedly
! 	This probably means the server terminated abnormally
! 	before or while processing the request.
! connection to server was lost

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/create_aggregate.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/create_aggregate.out	Tue Oct 28 15:53:05 2014
***************
*** 1,129 ****
! --
! -- CREATE_AGGREGATE
! --
! -- all functions CREATEd
! CREATE AGGREGATE newavg (
!    sfunc = int4_avg_accum, basetype = int4, stype = _int8,
!    finalfunc = int8_avg,
!    initcond1 = '{0,0}'
! );
! -- test comments
! COMMENT ON AGGREGATE newavg_wrong (int4) IS 'an agg comment';
! ERROR:  aggregate newavg_wrong(integer) does not exist
! COMMENT ON AGGREGATE newavg (int4) IS 'an agg comment';
! COMMENT ON AGGREGATE newavg (int4) IS NULL;
! -- without finalfunc; test obsolete spellings 'sfunc1' etc
! CREATE AGGREGATE newsum (
!    sfunc1 = int4pl, basetype = int4, stype1 = int4,
!    initcond1 = '0'
! );
! -- zero-argument aggregate
! CREATE AGGREGATE newcnt (*) (
!    sfunc = int8inc, stype = int8,
!    initcond = '0'
! );
! -- old-style spelling of same
! CREATE AGGREGATE oldcnt (
!    sfunc = int8inc, basetype = 'ANY', stype = int8,
!    initcond = '0'
! );
! -- aggregate that only cares about null/nonnull input
! CREATE AGGREGATE newcnt ("any") (
!    sfunc = int8inc_any, stype = int8,
!    initcond = '0'
! );
! COMMENT ON AGGREGATE nosuchagg (*) IS 'should fail';
! ERROR:  aggregate nosuchagg(*) does not exist
! COMMENT ON AGGREGATE newcnt (*) IS 'an agg(*) comment';
! COMMENT ON AGGREGATE newcnt ("any") IS 'an agg(any) comment';
! -- multi-argument aggregate
! create function sum3(int8,int8,int8) returns int8 as
! 'select $1 + $2 + $3' language sql strict immutable;
! create aggregate sum2(int8,int8) (
!    sfunc = sum3, stype = int8,
!    initcond = '0'
! );
! -- multi-argument aggregates sensitive to distinct/order, strict/nonstrict
! create type aggtype as (a integer, b integer, c text);
! create function aggf_trans(aggtype[],integer,integer,text) returns aggtype[]
! as 'select array_append($1,ROW($2,$3,$4)::aggtype)'
! language sql strict immutable;
! create function aggfns_trans(aggtype[],integer,integer,text) returns aggtype[]
! as 'select array_append($1,ROW($2,$3,$4)::aggtype)'
! language sql immutable;
! create aggregate aggfstr(integer,integer,text) (
!    sfunc = aggf_trans, stype = aggtype[],
!    initcond = '{}'
! );
! create aggregate aggfns(integer,integer,text) (
!    sfunc = aggfns_trans, stype = aggtype[], sspace = 10000,
!    initcond = '{}'
! );
! -- variadic aggregate
! create function least_accum(anyelement, variadic anyarray)
! returns anyelement language sql as
!   'select least($1, min($2[i])) from generate_subscripts($2,1) g(i)';
! create aggregate least_agg(variadic items anyarray) (
!   stype = anyelement, sfunc = least_accum
! );
! -- test ordered-set aggs using built-in support functions
! create aggregate my_percentile_disc(float8 ORDER BY anyelement) (
!   stype = internal,
!   sfunc = ordered_set_transition,
!   finalfunc = percentile_disc_final,
!   finalfunc_extra = true
! );
! create aggregate my_rank(VARIADIC "any" ORDER BY VARIADIC "any") (
!   stype = internal,
!   sfunc = ordered_set_transition_multi,
!   finalfunc = rank_final,
!   finalfunc_extra = true,
!   hypothetical
! );
! alter aggregate my_percentile_disc(float8 ORDER BY anyelement)
!   rename to test_percentile_disc;
! alter aggregate my_rank(VARIADIC "any" ORDER BY VARIADIC "any")
!   rename to test_rank;
! \da test_*
!                                        List of aggregate functions
!  Schema |         Name         | Result data type |          Argument data types           | Description 
! --------+----------------------+------------------+----------------------------------------+-------------
!  public | test_percentile_disc | anyelement       | double precision ORDER BY anyelement   | 
!  public | test_rank            | bigint           | VARIADIC "any" ORDER BY VARIADIC "any" | 
! (2 rows)
! 
! -- moving-aggregate options
! CREATE AGGREGATE sumdouble (float8)
! (
!     stype = float8,
!     sfunc = float8pl,
!     mstype = float8,
!     msfunc = float8pl,
!     minvfunc = float8mi
! );
! -- invalid: nonstrict inverse with strict forward function
! CREATE FUNCTION float8mi_n(float8, float8) RETURNS float8 AS
! $$ SELECT $1 - $2; $$
! LANGUAGE SQL;
! CREATE AGGREGATE invalidsumdouble (float8)
! (
!     stype = float8,
!     sfunc = float8pl,
!     mstype = float8,
!     msfunc = float8pl,
!     minvfunc = float8mi_n
! );
! ERROR:  strictness of aggregate's forward and inverse transition functions must match
! -- invalid: non-matching result types
! CREATE FUNCTION float8mi_int(float8, float8) RETURNS int AS
! $$ SELECT CAST($1 - $2 AS INT); $$
! LANGUAGE SQL;
! CREATE AGGREGATE wrongreturntype (float8)
! (
!     stype = float8,
!     sfunc = float8pl,
!     mstype = float8,
!     msfunc = float8pl,
!     minvfunc = float8mi_int
! );
! ERROR:  return type of inverse transition function float8mi_int is not double precision
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/create_function_3.out	Thu Oct 16 14:31:37 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/create_function_3.out	Tue Oct 28 15:53:05 2014
***************
*** 1,243 ****
! --
! -- CREATE FUNCTION
! --
! -- sanity check of pg_proc catalog to the given parameters
! --
! CREATE USER regtest_unpriv_user;
! CREATE SCHEMA temp_func_test;
! GRANT ALL ON SCHEMA temp_func_test TO public;
! SET search_path TO temp_func_test, public;
! --
! -- ARGUMENT and RETURN TYPES
! --
! CREATE FUNCTION functest_A_1(text, date) RETURNS bool LANGUAGE 'sql'
!        AS 'SELECT $1 = ''abcd'' AND $2 > ''2001-01-01''';
! CREATE FUNCTION functest_A_2(text[]) RETURNS int LANGUAGE 'sql'
!        AS 'SELECT $1[0]::int';
! CREATE FUNCTION functest_A_3() RETURNS bool LANGUAGE 'sql'
!        AS 'SELECT false';
! SELECT proname, prorettype::regtype, proargtypes::regtype[] FROM pg_proc
!        WHERE oid in ('functest_A_1'::regproc,
!                      'functest_A_2'::regproc,
!                      'functest_A_3'::regproc) ORDER BY proname;
!    proname    | prorettype |    proargtypes    
! --------------+------------+-------------------
!  functest_a_1 | boolean    | [0:1]={text,date}
!  functest_a_2 | integer    | [0:0]={text[]}
!  functest_a_3 | boolean    | {}
! (3 rows)
! 
! --
! -- IMMUTABLE | STABLE | VOLATILE
! --
! CREATE FUNCTION functest_B_1(int) RETURNS bool LANGUAGE 'sql'
!        AS 'SELECT $1 > 0';
! CREATE FUNCTION functest_B_2(int) RETURNS bool LANGUAGE 'sql'
!        IMMUTABLE AS 'SELECT $1 > 0';
! CREATE FUNCTION functest_B_3(int) RETURNS bool LANGUAGE 'sql'
!        STABLE AS 'SELECT $1 = 0';
! CREATE FUNCTION functest_B_4(int) RETURNS bool LANGUAGE 'sql'
!        VOLATILE AS 'SELECT $1 < 0';
! SELECT proname, provolatile FROM pg_proc
!        WHERE oid in ('functest_B_1'::regproc,
!                      'functest_B_2'::regproc,
!                      'functest_B_3'::regproc,
! 		     'functest_B_4'::regproc) ORDER BY proname;
!    proname    | provolatile 
! --------------+-------------
!  functest_b_1 | v
!  functest_b_2 | i
!  functest_b_3 | s
!  functest_b_4 | v
! (4 rows)
! 
! ALTER FUNCTION functest_B_2(int) VOLATILE;
! ALTER FUNCTION functest_B_3(int) COST 100;	-- unrelated change, no effect
! SELECT proname, provolatile FROM pg_proc
!        WHERE oid in ('functest_B_1'::regproc,
!                      'functest_B_2'::regproc,
!                      'functest_B_3'::regproc,
! 		     'functest_B_4'::regproc) ORDER BY proname;
!    proname    | provolatile 
! --------------+-------------
!  functest_b_1 | v
!  functest_b_2 | v
!  functest_b_3 | s
!  functest_b_4 | v
! (4 rows)
! 
! --
! -- SECURITY DEFINER | INVOKER
! --
! CREATE FUNCTION functext_C_1(int) RETURNS bool LANGUAGE 'sql'
!        AS 'SELECT $1 > 0';
! CREATE FUNCTION functext_C_2(int) RETURNS bool LANGUAGE 'sql'
!        SECURITY DEFINER AS 'SELECT $1 = 0';
! CREATE FUNCTION functext_C_3(int) RETURNS bool LANGUAGE 'sql'
!        SECURITY INVOKER AS 'SELECT $1 < 0';
! SELECT proname, prosecdef FROM pg_proc
!        WHERE oid in ('functext_C_1'::regproc,
!                      'functext_C_2'::regproc,
!                      'functext_C_3'::regproc) ORDER BY proname;
!    proname    | prosecdef 
! --------------+-----------
!  functext_c_1 | f
!  functext_c_2 | t
!  functext_c_3 | f
! (3 rows)
! 
! ALTER FUNCTION functext_C_1(int) IMMUTABLE;	-- unrelated change, no effect
! ALTER FUNCTION functext_C_2(int) SECURITY INVOKER;
! ALTER FUNCTION functext_C_3(int) SECURITY DEFINER;
! SELECT proname, prosecdef FROM pg_proc
!        WHERE oid in ('functext_C_1'::regproc,
!                      'functext_C_2'::regproc,
!                      'functext_C_3'::regproc) ORDER BY proname;
!    proname    | prosecdef 
! --------------+-----------
!  functext_c_1 | f
!  functext_c_2 | f
!  functext_c_3 | t
! (3 rows)
! 
! --
! -- LEAKPROOF
! --
! CREATE FUNCTION functext_E_1(int) RETURNS bool LANGUAGE 'sql'
!        AS 'SELECT $1 > 100';
! CREATE FUNCTION functext_E_2(int) RETURNS bool LANGUAGE 'sql'
!        LEAKPROOF AS 'SELECT $1 > 100';
! SELECT proname, proleakproof FROM pg_proc
!        WHERE oid in ('functext_E_1'::regproc,
!                      'functext_E_2'::regproc) ORDER BY proname;
!    proname    | proleakproof 
! --------------+--------------
!  functext_e_1 | f
!  functext_e_2 | t
! (2 rows)
! 
! ALTER FUNCTION functext_E_1(int) LEAKPROOF;
! ALTER FUNCTION functext_E_2(int) STABLE;	-- unrelated change, no effect
! SELECT proname, proleakproof FROM pg_proc
!        WHERE oid in ('functext_E_1'::regproc,
!                      'functext_E_2'::regproc) ORDER BY proname;
!    proname    | proleakproof 
! --------------+--------------
!  functext_e_1 | t
!  functext_e_2 | t
! (2 rows)
! 
! ALTER FUNCTION functext_E_2(int) NOT LEAKPROOF;	-- remove leakproog attribute
! SELECT proname, proleakproof FROM pg_proc
!        WHERE oid in ('functext_E_1'::regproc,
!                      'functext_E_2'::regproc) ORDER BY proname;
!    proname    | proleakproof 
! --------------+--------------
!  functext_e_1 | t
!  functext_e_2 | f
! (2 rows)
! 
! -- it takes superuser privilege to turn on leakproof, but not for turn off
! ALTER FUNCTION functext_E_1(int) OWNER TO regtest_unpriv_user;
! ALTER FUNCTION functext_E_2(int) OWNER TO regtest_unpriv_user;
! SET SESSION AUTHORIZATION regtest_unpriv_user;
! SET search_path TO temp_func_test, public;
! ALTER FUNCTION functext_E_1(int) NOT LEAKPROOF;
! ALTER FUNCTION functext_E_2(int) LEAKPROOF;
! ERROR:  only superuser can define a leakproof function
! CREATE FUNCTION functext_E_3(int) RETURNS bool LANGUAGE 'sql'
!        LEAKPROOF AS 'SELECT $1 < 200';	-- failed
! ERROR:  only superuser can define a leakproof function
! RESET SESSION AUTHORIZATION;
! --
! -- CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT | STRICT
! --
! CREATE FUNCTION functext_F_1(int) RETURNS bool LANGUAGE 'sql'
!        AS 'SELECT $1 > 50';
! CREATE FUNCTION functext_F_2(int) RETURNS bool LANGUAGE 'sql'
!        CALLED ON NULL INPUT AS 'SELECT $1 = 50';
! CREATE FUNCTION functext_F_3(int) RETURNS bool LANGUAGE 'sql'
!        RETURNS NULL ON NULL INPUT AS 'SELECT $1 < 50';
! CREATE FUNCTION functext_F_4(int) RETURNS bool LANGUAGE 'sql'
!        STRICT AS 'SELECT $1 = 50';
! SELECT proname, proisstrict FROM pg_proc
!        WHERE oid in ('functext_F_1'::regproc,
!                      'functext_F_2'::regproc,
!                      'functext_F_3'::regproc,
!                      'functext_F_4'::regproc) ORDER BY proname;
!    proname    | proisstrict 
! --------------+-------------
!  functext_f_1 | f
!  functext_f_2 | f
!  functext_f_3 | t
!  functext_f_4 | t
! (4 rows)
! 
! ALTER FUNCTION functext_F_1(int) IMMUTABLE;	-- unrelated change, no effect
! ALTER FUNCTION functext_F_2(int) STRICT;
! ALTER FUNCTION functext_F_3(int) CALLED ON NULL INPUT;
! SELECT proname, proisstrict FROM pg_proc
!        WHERE oid in ('functext_F_1'::regproc,
!                      'functext_F_2'::regproc,
!                      'functext_F_3'::regproc,
!                      'functext_F_4'::regproc) ORDER BY proname;
!    proname    | proisstrict 
! --------------+-------------
!  functext_f_1 | f
!  functext_f_2 | t
!  functext_f_3 | f
!  functext_f_4 | t
! (4 rows)
! 
! -- information_schema tests
! CREATE FUNCTION functest_IS_1(a int, b int default 1, c text default 'foo')
!     RETURNS int
!     LANGUAGE SQL
!     AS 'SELECT $1 + $2';
! CREATE FUNCTION functest_IS_2(out a int, b int default 1)
!     RETURNS int
!     LANGUAGE SQL
!     AS 'SELECT $1';
! CREATE FUNCTION functest_IS_3(a int default 1, out b int)
!     RETURNS int
!     LANGUAGE SQL
!     AS 'SELECT $1';
! SELECT routine_name, ordinal_position, parameter_name, parameter_default
!     FROM information_schema.parameters JOIN information_schema.routines USING (specific_schema, specific_name)
!     WHERE routine_schema = 'temp_func_test' AND routine_name ~ '^functest_is_'
!     ORDER BY 1, 2;
!  routine_name  | ordinal_position | parameter_name | parameter_default 
! ---------------+------------------+----------------+-------------------
!  functest_is_1 |                1 | a              | 
!  functest_is_1 |                2 | b              | 1
!  functest_is_1 |                3 | c              | 'foo'::text
!  functest_is_2 |                1 | a              | 
!  functest_is_2 |                2 | b              | 1
!  functest_is_3 |                1 | a              | 1
!  functest_is_3 |                2 | b              | 
! (7 rows)
! 
! -- Cleanups
! DROP SCHEMA temp_func_test CASCADE;
! NOTICE:  drop cascades to 19 other objects
! DETAIL:  drop cascades to function functest_a_1(text,date)
! drop cascades to function functest_a_2(text[])
! drop cascades to function functest_a_3()
! drop cascades to function functest_b_1(integer)
! drop cascades to function functest_b_2(integer)
! drop cascades to function functest_b_3(integer)
! drop cascades to function functest_b_4(integer)
! drop cascades to function functext_c_1(integer)
! drop cascades to function functext_c_2(integer)
! drop cascades to function functext_c_3(integer)
! drop cascades to function functext_e_1(integer)
! drop cascades to function functext_e_2(integer)
! drop cascades to function functext_f_1(integer)
! drop cascades to function functext_f_2(integer)
! drop cascades to function functext_f_3(integer)
! drop cascades to function functext_f_4(integer)
! drop cascades to function functest_is_1(integer,integer,text)
! drop cascades to function functest_is_2(integer)
! drop cascades to function functest_is_3(integer)
! DROP USER regtest_unpriv_user;
! RESET search_path;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/create_cast.out	Sun Oct  3 21:26:00 2010
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/create_cast.out	Tue Oct 28 15:53:05 2014
***************
*** 1,74 ****
! --
! -- CREATE_CAST
! --
! -- Create some types to test with
! CREATE TYPE casttesttype;
! CREATE FUNCTION casttesttype_in(cstring)
!    RETURNS casttesttype
!    AS 'textin'
!    LANGUAGE internal STRICT;
! NOTICE:  return type casttesttype is only a shell
! CREATE FUNCTION casttesttype_out(casttesttype)
!    RETURNS cstring
!    AS 'textout'
!    LANGUAGE internal STRICT;
! NOTICE:  argument type casttesttype is only a shell
! CREATE TYPE casttesttype (
!    internallength = variable,
!    input = casttesttype_in,
!    output = casttesttype_out,
!    alignment = int4
! );
! -- a dummy function to test with
! CREATE FUNCTION casttestfunc(casttesttype) RETURNS int4 LANGUAGE SQL AS
! $$ SELECT 1; $$;
! SELECT casttestfunc('foo'::text); -- fails, as there's no cast
! ERROR:  function casttestfunc(text) does not exist
! LINE 1: SELECT casttestfunc('foo'::text);
!                ^
! HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
! -- Try binary coercion cast
! CREATE CAST (text AS casttesttype) WITHOUT FUNCTION;
! SELECT casttestfunc('foo'::text); -- doesn't work, as the cast is explicit
! ERROR:  function casttestfunc(text) does not exist
! LINE 1: SELECT casttestfunc('foo'::text);
!                ^
! HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
! SELECT casttestfunc('foo'::text::casttesttype); -- should work
!  casttestfunc 
! --------------
!             1
! (1 row)
! 
! DROP CAST (text AS casttesttype); -- cleanup
! -- Try IMPLICIT binary coercion cast
! CREATE CAST (text AS casttesttype) WITHOUT FUNCTION AS IMPLICIT;
! SELECT casttestfunc('foo'::text); -- Should work now
!  casttestfunc 
! --------------
!             1
! (1 row)
! 
! -- Try I/O conversion cast.
! SELECT 1234::int4::casttesttype; -- No cast yet, should fail
! ERROR:  cannot cast type integer to casttesttype
! LINE 1: SELECT 1234::int4::casttesttype;
!                          ^
! CREATE CAST (int4 AS casttesttype) WITH INOUT;
! SELECT 1234::int4::casttesttype; -- Should work now
!  casttesttype 
! --------------
!  1234
! (1 row)
! 
! DROP CAST (int4 AS casttesttype);
! -- Try cast with a function
! CREATE FUNCTION int4_casttesttype(int4) RETURNS casttesttype LANGUAGE SQL AS
! $$ SELECT ('foo'::text || $1::text)::casttesttype; $$;
! CREATE CAST (int4 AS casttesttype) WITH FUNCTION int4_casttesttype(int4) AS IMPLICIT;
! SELECT 1234::int4::casttesttype; -- Should work now
!  casttesttype 
! --------------
!  foo1234
! (1 row)
! 
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/constraints.out	Tue Oct 28 15:52:48 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/constraints.out	Tue Oct 28 15:53:05 2014
***************
*** 1,647 ****
! --
! -- CONSTRAINTS
! -- Constraints can be specified with:
! --  - DEFAULT clause
! --  - CHECK clauses
! --  - PRIMARY KEY clauses
! --  - UNIQUE clauses
! --  - EXCLUDE clauses
! --
! --
! -- DEFAULT syntax
! --
! CREATE TABLE DEFAULT_TBL (i int DEFAULT 100,
! 	x text DEFAULT 'vadim', f float8 DEFAULT 123.456);
! INSERT INTO DEFAULT_TBL VALUES (1, 'thomas', 57.0613);
! INSERT INTO DEFAULT_TBL VALUES (1, 'bruce');
! INSERT INTO DEFAULT_TBL (i, f) VALUES (2, 987.654);
! INSERT INTO DEFAULT_TBL (x) VALUES ('marc');
! INSERT INTO DEFAULT_TBL VALUES (3, null, 1.0);
! SELECT '' AS five, * FROM DEFAULT_TBL;
!  five |  i  |   x    |    f    
! ------+-----+--------+---------
!       |   1 | thomas | 57.0613
!       |   1 | bruce  | 123.456
!       |   2 | vadim  | 987.654
!       | 100 | marc   | 123.456
!       |   3 |        |       1
! (5 rows)
! 
! CREATE SEQUENCE DEFAULT_SEQ;
! CREATE TABLE DEFAULTEXPR_TBL (i1 int DEFAULT 100 + (200-199) * 2,
! 	i2 int DEFAULT nextval('default_seq'));
! INSERT INTO DEFAULTEXPR_TBL VALUES (-1, -2);
! INSERT INTO DEFAULTEXPR_TBL (i1) VALUES (-3);
! INSERT INTO DEFAULTEXPR_TBL (i2) VALUES (-4);
! INSERT INTO DEFAULTEXPR_TBL (i2) VALUES (NULL);
! SELECT '' AS four, * FROM DEFAULTEXPR_TBL;
!  four | i1  | i2 
! ------+-----+----
!       |  -1 | -2
!       |  -3 |  1
!       | 102 | -4
!       | 102 |   
! (4 rows)
! 
! -- syntax errors
! --  test for extraneous comma
! CREATE TABLE error_tbl (i int DEFAULT (100, ));
! ERROR:  syntax error at or near ")"
! LINE 1: CREATE TABLE error_tbl (i int DEFAULT (100, ));
!                                                     ^
! --  this will fail because gram.y uses b_expr not a_expr for defaults,
! --  to avoid a shift/reduce conflict that arises from NOT NULL being
! --  part of the column definition syntax:
! CREATE TABLE error_tbl (b1 bool DEFAULT 1 IN (1, 2));
! ERROR:  syntax error at or near "IN"
! LINE 1: CREATE TABLE error_tbl (b1 bool DEFAULT 1 IN (1, 2));
!                                                   ^
! --  this should work, however:
! CREATE TABLE error_tbl (b1 bool DEFAULT (1 IN (1, 2)));
! DROP TABLE error_tbl;
! --
! -- CHECK syntax
! --
! CREATE TABLE CHECK_TBL (x int,
! 	CONSTRAINT CHECK_CON CHECK (x > 3));
! INSERT INTO CHECK_TBL VALUES (5);
! INSERT INTO CHECK_TBL VALUES (4);
! INSERT INTO CHECK_TBL VALUES (3);
! ERROR:  new row for relation "check_tbl" violates check constraint "check_con"
! DETAIL:  Failing row contains (3).
! INSERT INTO CHECK_TBL VALUES (2);
! ERROR:  new row for relation "check_tbl" violates check constraint "check_con"
! DETAIL:  Failing row contains (2).
! INSERT INTO CHECK_TBL VALUES (6);
! INSERT INTO CHECK_TBL VALUES (1);
! ERROR:  new row for relation "check_tbl" violates check constraint "check_con"
! DETAIL:  Failing row contains (1).
! SELECT '' AS three, * FROM CHECK_TBL;
!  three | x 
! -------+---
!        | 5
!        | 4
!        | 6
! (3 rows)
! 
! CREATE SEQUENCE CHECK_SEQ;
! CREATE TABLE CHECK2_TBL (x int, y text, z int,
! 	CONSTRAINT SEQUENCE_CON
! 	CHECK (x > 3 and y <> 'check failed' and z < 8));
! INSERT INTO CHECK2_TBL VALUES (4, 'check ok', -2);
! INSERT INTO CHECK2_TBL VALUES (1, 'x check failed', -2);
! ERROR:  new row for relation "check2_tbl" violates check constraint "sequence_con"
! DETAIL:  Failing row contains (1, x check failed, -2).
! INSERT INTO CHECK2_TBL VALUES (5, 'z check failed', 10);
! ERROR:  new row for relation "check2_tbl" violates check constraint "sequence_con"
! DETAIL:  Failing row contains (5, z check failed, 10).
! INSERT INTO CHECK2_TBL VALUES (0, 'check failed', -2);
! ERROR:  new row for relation "check2_tbl" violates check constraint "sequence_con"
! DETAIL:  Failing row contains (0, check failed, -2).
! INSERT INTO CHECK2_TBL VALUES (6, 'check failed', 11);
! ERROR:  new row for relation "check2_tbl" violates check constraint "sequence_con"
! DETAIL:  Failing row contains (6, check failed, 11).
! INSERT INTO CHECK2_TBL VALUES (7, 'check ok', 7);
! SELECT '' AS two, * from CHECK2_TBL;
!  two | x |    y     | z  
! -----+---+----------+----
!      | 4 | check ok | -2
!      | 7 | check ok |  7
! (2 rows)
! 
! --
! -- Check constraints on INSERT
! --
! CREATE SEQUENCE INSERT_SEQ;
! CREATE TABLE INSERT_TBL (x INT DEFAULT nextval('insert_seq'),
! 	y TEXT DEFAULT '-NULL-',
! 	z INT DEFAULT -1 * currval('insert_seq'),
! 	CONSTRAINT INSERT_CON CHECK (x >= 3 AND y <> 'check failed' AND x < 8),
! 	CHECK (x + z = 0));
! INSERT INTO INSERT_TBL(x,z) VALUES (2, -2);
! ERROR:  new row for relation "insert_tbl" violates check constraint "insert_con"
! DETAIL:  Failing row contains (2, -NULL-, -2).
! SELECT '' AS zero, * FROM INSERT_TBL;
!  zero | x | y | z 
! ------+---+---+---
! (0 rows)
! 
! SELECT 'one' AS one, nextval('insert_seq');
!  one | nextval 
! -----+---------
!  one |       1
! (1 row)
! 
! INSERT INTO INSERT_TBL(y) VALUES ('Y');
! ERROR:  new row for relation "insert_tbl" violates check constraint "insert_con"
! DETAIL:  Failing row contains (2, Y, -2).
! INSERT INTO INSERT_TBL(y) VALUES ('Y');
! INSERT INTO INSERT_TBL(x,z) VALUES (1, -2);
! ERROR:  new row for relation "insert_tbl" violates check constraint "insert_tbl_check"
! DETAIL:  Failing row contains (1, -NULL-, -2).
! INSERT INTO INSERT_TBL(z,x) VALUES (-7,  7);
! INSERT INTO INSERT_TBL VALUES (5, 'check failed', -5);
! ERROR:  new row for relation "insert_tbl" violates check constraint "insert_con"
! DETAIL:  Failing row contains (5, check failed, -5).
! INSERT INTO INSERT_TBL VALUES (7, '!check failed', -7);
! INSERT INTO INSERT_TBL(y) VALUES ('-!NULL-');
! SELECT '' AS four, * FROM INSERT_TBL;
!  four | x |       y       | z  
! ------+---+---------------+----
!       | 3 | Y             | -3
!       | 7 | -NULL-        | -7
!       | 7 | !check failed | -7
!       | 4 | -!NULL-       | -4
! (4 rows)
! 
! INSERT INTO INSERT_TBL(y,z) VALUES ('check failed', 4);
! ERROR:  new row for relation "insert_tbl" violates check constraint "insert_tbl_check"
! DETAIL:  Failing row contains (5, check failed, 4).
! INSERT INTO INSERT_TBL(x,y) VALUES (5, 'check failed');
! ERROR:  new row for relation "insert_tbl" violates check constraint "insert_con"
! DETAIL:  Failing row contains (5, check failed, -5).
! INSERT INTO INSERT_TBL(x,y) VALUES (5, '!check failed');
! INSERT INTO INSERT_TBL(y) VALUES ('-!NULL-');
! SELECT '' AS six, * FROM INSERT_TBL;
!  six | x |       y       | z  
! -----+---+---------------+----
!      | 3 | Y             | -3
!      | 7 | -NULL-        | -7
!      | 7 | !check failed | -7
!      | 4 | -!NULL-       | -4
!      | 5 | !check failed | -5
!      | 6 | -!NULL-       | -6
! (6 rows)
! 
! SELECT 'seven' AS one, nextval('insert_seq');
!   one  | nextval 
! -------+---------
!  seven |       7
! (1 row)
! 
! INSERT INTO INSERT_TBL(y) VALUES ('Y');
! ERROR:  new row for relation "insert_tbl" violates check constraint "insert_con"
! DETAIL:  Failing row contains (8, Y, -8).
! SELECT 'eight' AS one, currval('insert_seq');
!   one  | currval 
! -------+---------
!  eight |       8
! (1 row)
! 
! -- According to SQL, it is OK to insert a record that gives rise to NULL
! -- constraint-condition results.  Postgres used to reject this, but it
! -- was wrong:
! INSERT INTO INSERT_TBL VALUES (null, null, null);
! SELECT '' AS nine, * FROM INSERT_TBL;
!  nine | x |       y       | z  
! ------+---+---------------+----
!       | 3 | Y             | -3
!       | 7 | -NULL-        | -7
!       | 7 | !check failed | -7
!       | 4 | -!NULL-       | -4
!       | 5 | !check failed | -5
!       | 6 | -!NULL-       | -6
!       |   |               |   
! (7 rows)
! 
! --
! -- Check constraints on system columns
! --
! CREATE TABLE SYS_COL_CHECK_TBL (city text, state text, is_capital bool,
!                   altitude int,
!                   CHECK (NOT (is_capital AND tableoid::regclass::text = 'sys_col_check_tbl')));
! INSERT INTO SYS_COL_CHECK_TBL VALUES ('Seattle', 'Washington', false, 100);
! INSERT INTO SYS_COL_CHECK_TBL VALUES ('Olympia', 'Washington', true, 100);
! ERROR:  new row for relation "sys_col_check_tbl" violates check constraint "sys_col_check_tbl_check"
! DETAIL:  Failing row contains (Olympia, Washington, t, 100).
! SELECT *, tableoid::regclass::text FROM SYS_COL_CHECK_TBL;
!   city   |   state    | is_capital | altitude |     tableoid      
! ---------+------------+------------+----------+-------------------
!  Seattle | Washington | f          |      100 | sys_col_check_tbl
! (1 row)
! 
! DROP TABLE SYS_COL_CHECK_TBL;
! --
! -- Check constraints on system columns other then TableOid should return error
! --
! CREATE TABLE SYS_COL_CHECK_TBL (city text, state text, is_capital bool,
!                   altitude int,
! 				  CHECK (NOT (is_capital AND ctid::text = 'sys_col_check_tbl')));
! ERROR:  system column "ctid" reference in check constraint is invalid
! --
! -- Check inheritance of defaults and constraints
! --
! CREATE TABLE INSERT_CHILD (cx INT default 42,
! 	cy INT CHECK (cy > x))
! 	INHERITS (INSERT_TBL);
! INSERT INTO INSERT_CHILD(x,z,cy) VALUES (7,-7,11);
! INSERT INTO INSERT_CHILD(x,z,cy) VALUES (7,-7,6);
! ERROR:  new row for relation "insert_child" violates check constraint "insert_child_check"
! DETAIL:  Failing row contains (7, -NULL-, -7, 42, 6).
! INSERT INTO INSERT_CHILD(x,z,cy) VALUES (6,-7,7);
! ERROR:  new row for relation "insert_child" violates check constraint "insert_tbl_check"
! DETAIL:  Failing row contains (6, -NULL-, -7, 42, 7).
! INSERT INTO INSERT_CHILD(x,y,z,cy) VALUES (6,'check failed',-6,7);
! ERROR:  new row for relation "insert_child" violates check constraint "insert_con"
! DETAIL:  Failing row contains (6, check failed, -6, 42, 7).
! SELECT * FROM INSERT_CHILD;
!  x |   y    | z  | cx | cy 
! ---+--------+----+----+----
!  7 | -NULL- | -7 | 42 | 11
! (1 row)
! 
! DROP TABLE INSERT_CHILD;
! --
! -- Check NO INHERIT type of constraints and inheritance
! --
! CREATE TABLE ATACC1 (TEST INT
! 	CHECK (TEST > 0) NO INHERIT);
! CREATE TABLE ATACC2 (TEST2 INT) INHERITS (ATACC1);
! -- check constraint is not there on child
! INSERT INTO ATACC2 (TEST) VALUES (-3);
! -- check constraint is there on parent
! INSERT INTO ATACC1 (TEST) VALUES (-3);
! ERROR:  new row for relation "atacc1" violates check constraint "atacc1_test_check"
! DETAIL:  Failing row contains (-3).
! DROP TABLE ATACC1 CASCADE;
! NOTICE:  drop cascades to table atacc2
! CREATE TABLE ATACC1 (TEST INT, TEST2 INT
! 	CHECK (TEST > 0), CHECK (TEST2 > 10) NO INHERIT);
! CREATE TABLE ATACC2 () INHERITS (ATACC1);
! -- check constraint is there on child
! INSERT INTO ATACC2 (TEST) VALUES (-3);
! ERROR:  new row for relation "atacc2" violates check constraint "atacc1_test_check"
! DETAIL:  Failing row contains (-3, null).
! -- check constraint is there on parent
! INSERT INTO ATACC1 (TEST) VALUES (-3);
! ERROR:  new row for relation "atacc1" violates check constraint "atacc1_test_check"
! DETAIL:  Failing row contains (-3, null).
! -- check constraint is not there on child
! INSERT INTO ATACC2 (TEST2) VALUES (3);
! -- check constraint is there on parent
! INSERT INTO ATACC1 (TEST2) VALUES (3);
! ERROR:  new row for relation "atacc1" violates check constraint "atacc1_test2_check"
! DETAIL:  Failing row contains (null, 3).
! DROP TABLE ATACC1 CASCADE;
! NOTICE:  drop cascades to table atacc2
! --
! -- Check constraints on INSERT INTO
! --
! DELETE FROM INSERT_TBL;
! ALTER SEQUENCE INSERT_SEQ RESTART WITH 4;
! CREATE TABLE tmp (xd INT, yd TEXT, zd INT);
! INSERT INTO tmp VALUES (null, 'Y', null);
! INSERT INTO tmp VALUES (5, '!check failed', null);
! INSERT INTO tmp VALUES (null, 'try again', null);
! INSERT INTO INSERT_TBL(y) select yd from tmp;
! SELECT '' AS three, * FROM INSERT_TBL;
!  three | x |       y       | z  
! -------+---+---------------+----
!        | 4 | Y             | -4
!        | 5 | !check failed | -5
!        | 6 | try again     | -6
! (3 rows)
! 
! INSERT INTO INSERT_TBL SELECT * FROM tmp WHERE yd = 'try again';
! INSERT INTO INSERT_TBL(y,z) SELECT yd, -7 FROM tmp WHERE yd = 'try again';
! INSERT INTO INSERT_TBL(y,z) SELECT yd, -8 FROM tmp WHERE yd = 'try again';
! ERROR:  new row for relation "insert_tbl" violates check constraint "insert_con"
! DETAIL:  Failing row contains (8, try again, -8).
! SELECT '' AS four, * FROM INSERT_TBL;
!  four | x |       y       | z  
! ------+---+---------------+----
!       | 4 | Y             | -4
!       | 5 | !check failed | -5
!       | 6 | try again     | -6
!       |   | try again     |   
!       | 7 | try again     | -7
! (5 rows)
! 
! DROP TABLE tmp;
! --
! -- Check constraints on UPDATE
! --
! UPDATE INSERT_TBL SET x = NULL WHERE x = 5;
! UPDATE INSERT_TBL SET x = 6 WHERE x = 6;
! UPDATE INSERT_TBL SET x = -z, z = -x;
! UPDATE INSERT_TBL SET x = z, z = x;
! ERROR:  new row for relation "insert_tbl" violates check constraint "insert_con"
! DETAIL:  Failing row contains (-4, Y, 4).
! SELECT * FROM INSERT_TBL;
!  x |       y       | z  
! ---+---------------+----
!  4 | Y             | -4
!    | try again     |   
!  7 | try again     | -7
!  5 | !check failed |   
!  6 | try again     | -6
! (5 rows)
! 
! -- DROP TABLE INSERT_TBL;
! --
! -- Check constraints on COPY FROM
! --
! CREATE TABLE COPY_TBL (x INT, y TEXT, z INT,
! 	CONSTRAINT COPY_CON
! 	CHECK (x > 3 AND y <> 'check failed' AND x < 7 ));
! COPY COPY_TBL FROM '/Users/decibel/pgsql/HEAD/src/test/regress/data/constro.data';
! SELECT '' AS two, * FROM COPY_TBL;
!  two | x |       y       | z 
! -----+---+---------------+---
!      | 4 | !check failed | 5
!      | 6 | OK            | 4
! (2 rows)
! 
! COPY COPY_TBL FROM '/Users/decibel/pgsql/HEAD/src/test/regress/data/constrf.data';
! ERROR:  new row for relation "copy_tbl" violates check constraint "copy_con"
! DETAIL:  Failing row contains (7, check failed, 6).
! CONTEXT:  COPY copy_tbl, line 2: "7	check failed	6"
! SELECT * FROM COPY_TBL;
!  x |       y       | z 
! ---+---------------+---
!  4 | !check failed | 5
!  6 | OK            | 4
! (2 rows)
! 
! --
! -- Primary keys
! --
! CREATE TABLE PRIMARY_TBL (i int PRIMARY KEY, t text);
! INSERT INTO PRIMARY_TBL VALUES (1, 'one');
! INSERT INTO PRIMARY_TBL VALUES (2, 'two');
! INSERT INTO PRIMARY_TBL VALUES (1, 'three');
! ERROR:  duplicate key value violates unique constraint "primary_tbl_pkey"
! DETAIL:  Key (i)=(1) already exists.
! INSERT INTO PRIMARY_TBL VALUES (4, 'three');
! INSERT INTO PRIMARY_TBL VALUES (5, 'one');
! INSERT INTO PRIMARY_TBL (t) VALUES ('six');
! ERROR:  null value in column "i" violates not-null constraint
! DETAIL:  Failing row contains (null, six).
! SELECT '' AS four, * FROM PRIMARY_TBL;
!  four | i |   t   
! ------+---+-------
!       | 1 | one
!       | 2 | two
!       | 4 | three
!       | 5 | one
! (4 rows)
! 
! DROP TABLE PRIMARY_TBL;
! CREATE TABLE PRIMARY_TBL (i int, t text,
! 	PRIMARY KEY(i,t));
! INSERT INTO PRIMARY_TBL VALUES (1, 'one');
! INSERT INTO PRIMARY_TBL VALUES (2, 'two');
! INSERT INTO PRIMARY_TBL VALUES (1, 'three');
! INSERT INTO PRIMARY_TBL VALUES (4, 'three');
! INSERT INTO PRIMARY_TBL VALUES (5, 'one');
! INSERT INTO PRIMARY_TBL (t) VALUES ('six');
! ERROR:  null value in column "i" violates not-null constraint
! DETAIL:  Failing row contains (null, six).
! SELECT '' AS three, * FROM PRIMARY_TBL;
!  three | i |   t   
! -------+---+-------
!        | 1 | one
!        | 2 | two
!        | 1 | three
!        | 4 | three
!        | 5 | one
! (5 rows)
! 
! DROP TABLE PRIMARY_TBL;
! --
! -- Unique keys
! --
! CREATE TABLE UNIQUE_TBL (i int UNIQUE, t text);
! INSERT INTO UNIQUE_TBL VALUES (1, 'one');
! INSERT INTO UNIQUE_TBL VALUES (2, 'two');
! INSERT INTO UNIQUE_TBL VALUES (1, 'three');
! ERROR:  duplicate key value violates unique constraint "unique_tbl_i_key"
! DETAIL:  Key (i)=(1) already exists.
! INSERT INTO UNIQUE_TBL VALUES (4, 'four');
! INSERT INTO UNIQUE_TBL VALUES (5, 'one');
! INSERT INTO UNIQUE_TBL (t) VALUES ('six');
! INSERT INTO UNIQUE_TBL (t) VALUES ('seven');
! SELECT '' AS five, * FROM UNIQUE_TBL;
!  five | i |   t   
! ------+---+-------
!       | 1 | one
!       | 2 | two
!       | 4 | four
!       | 5 | one
!       |   | six
!       |   | seven
! (6 rows)
! 
! DROP TABLE UNIQUE_TBL;
! CREATE TABLE UNIQUE_TBL (i int, t text,
! 	UNIQUE(i,t));
! INSERT INTO UNIQUE_TBL VALUES (1, 'one');
! INSERT INTO UNIQUE_TBL VALUES (2, 'two');
! INSERT INTO UNIQUE_TBL VALUES (1, 'three');
! INSERT INTO UNIQUE_TBL VALUES (1, 'one');
! ERROR:  duplicate key value violates unique constraint "unique_tbl_i_t_key"
! DETAIL:  Key (i, t)=(1, one) already exists.
! INSERT INTO UNIQUE_TBL VALUES (5, 'one');
! INSERT INTO UNIQUE_TBL (t) VALUES ('six');
! SELECT '' AS five, * FROM UNIQUE_TBL;
!  five | i |   t   
! ------+---+-------
!       | 1 | one
!       | 2 | two
!       | 1 | three
!       | 5 | one
!       |   | six
! (5 rows)
! 
! DROP TABLE UNIQUE_TBL;
! --
! -- Deferrable unique constraints
! --
! CREATE TABLE unique_tbl (i int UNIQUE DEFERRABLE, t text);
! INSERT INTO unique_tbl VALUES (0, 'one');
! INSERT INTO unique_tbl VALUES (1, 'two');
! INSERT INTO unique_tbl VALUES (2, 'tree');
! INSERT INTO unique_tbl VALUES (3, 'four');
! INSERT INTO unique_tbl VALUES (4, 'five');
! BEGIN;
! -- default is immediate so this should fail right away
! UPDATE unique_tbl SET i = 1 WHERE i = 0;
! ERROR:  duplicate key value violates unique constraint "unique_tbl_i_key"
! DETAIL:  Key (i)=(1) already exists.
! ROLLBACK;
! -- check is done at end of statement, so this should succeed
! UPDATE unique_tbl SET i = i+1;
! SELECT * FROM unique_tbl;
!  i |  t   
! ---+------
!  1 | one
!  2 | two
!  3 | tree
!  4 | four
!  5 | five
! (5 rows)
! 
! -- explicitly defer the constraint
! BEGIN;
! SET CONSTRAINTS unique_tbl_i_key DEFERRED;
! INSERT INTO unique_tbl VALUES (3, 'three');
! DELETE FROM unique_tbl WHERE t = 'tree'; -- makes constraint valid again
! COMMIT; -- should succeed
! SELECT * FROM unique_tbl;
!  i |   t   
! ---+-------
!  1 | one
!  2 | two
!  4 | four
!  5 | five
!  3 | three
! (5 rows)
! 
! -- try adding an initially deferred constraint
! ALTER TABLE unique_tbl DROP CONSTRAINT unique_tbl_i_key;
! ALTER TABLE unique_tbl ADD CONSTRAINT unique_tbl_i_key
! 	UNIQUE (i) DEFERRABLE INITIALLY DEFERRED;
! BEGIN;
! INSERT INTO unique_tbl VALUES (1, 'five');
! INSERT INTO unique_tbl VALUES (5, 'one');
! UPDATE unique_tbl SET i = 4 WHERE i = 2;
! UPDATE unique_tbl SET i = 2 WHERE i = 4 AND t = 'four';
! DELETE FROM unique_tbl WHERE i = 1 AND t = 'one';
! DELETE FROM unique_tbl WHERE i = 5 AND t = 'five';
! COMMIT;
! SELECT * FROM unique_tbl;
!  i |   t   
! ---+-------
!  3 | three
!  1 | five
!  5 | one
!  4 | two
!  2 | four
! (5 rows)
! 
! -- should fail at commit-time
! BEGIN;
! INSERT INTO unique_tbl VALUES (3, 'Three'); -- should succeed for now
! COMMIT; -- should fail
! ERROR:  duplicate key value violates unique constraint "unique_tbl_i_key"
! DETAIL:  Key (i)=(3) already exists.
! -- make constraint check immediate
! BEGIN;
! SET CONSTRAINTS ALL IMMEDIATE;
! INSERT INTO unique_tbl VALUES (3, 'Three'); -- should fail
! ERROR:  duplicate key value violates unique constraint "unique_tbl_i_key"
! DETAIL:  Key (i)=(3) already exists.
! COMMIT;
! -- forced check when SET CONSTRAINTS is called
! BEGIN;
! SET CONSTRAINTS ALL DEFERRED;
! INSERT INTO unique_tbl VALUES (3, 'Three'); -- should succeed for now
! SET CONSTRAINTS ALL IMMEDIATE; -- should fail
! ERROR:  duplicate key value violates unique constraint "unique_tbl_i_key"
! DETAIL:  Key (i)=(3) already exists.
! COMMIT;
! -- test a HOT update that invalidates the conflicting tuple.
! -- the trigger should still fire and catch the violation
! BEGIN;
! INSERT INTO unique_tbl VALUES (3, 'Three'); -- should succeed for now
! UPDATE unique_tbl SET t = 'THREE' WHERE i = 3 AND t = 'Three';
! COMMIT; -- should fail
! ERROR:  duplicate key value violates unique constraint "unique_tbl_i_key"
! DETAIL:  Key (i)=(3) already exists.
! SELECT * FROM unique_tbl;
!  i |   t   
! ---+-------
!  3 | three
!  1 | five
!  5 | one
!  4 | two
!  2 | four
! (5 rows)
! 
! -- test a HOT update that modifies the newly inserted tuple,
! -- but should succeed because we then remove the other conflicting tuple.
! BEGIN;
! INSERT INTO unique_tbl VALUES(3, 'tree'); -- should succeed for now
! UPDATE unique_tbl SET t = 'threex' WHERE t = 'tree';
! DELETE FROM unique_tbl WHERE t = 'three';
! SELECT * FROM unique_tbl;
!  i |   t    
! ---+--------
!  1 | five
!  5 | one
!  4 | two
!  2 | four
!  3 | threex
! (5 rows)
! 
! COMMIT;
! SELECT * FROM unique_tbl;
!  i |   t    
! ---+--------
!  1 | five
!  5 | one
!  4 | two
!  2 | four
!  3 | threex
! (5 rows)
! 
! DROP TABLE unique_tbl;
! --
! -- EXCLUDE constraints
! --
! CREATE TABLE circles (
!   c1 CIRCLE,
!   c2 TEXT,
!   EXCLUDE USING gist
!     (c1 WITH &&, (c2::circle) WITH &&)
!     WHERE (circle_center(c1) <> '(0,0)')
! );
! -- these should succeed because they don't match the index predicate
! INSERT INTO circles VALUES('<(0,0), 5>', '<(0,0), 5>');
! INSERT INTO circles VALUES('<(0,0), 5>', '<(0,0), 4>');
! -- succeed
! INSERT INTO circles VALUES('<(10,10), 10>', '<(0,0), 5>');
! -- fail, overlaps
! INSERT INTO circles VALUES('<(20,20), 10>', '<(0,0), 4>');
! ERROR:  conflicting key value violates exclusion constraint "circles_c1_c2_excl"
! DETAIL:  Key (c1, (c2::circle))=(<(20,20),10>, <(0,0),4>) conflicts with existing key (c1, (c2::circle))=(<(10,10),10>, <(0,0),5>).
! -- succeed because c1 doesn't overlap
! INSERT INTO circles VALUES('<(20,20), 1>', '<(0,0), 5>');
! -- succeed because c2 doesn't overlap
! INSERT INTO circles VALUES('<(20,20), 10>', '<(10,10), 5>');
! -- should fail on existing data without the WHERE clause
! ALTER TABLE circles ADD EXCLUDE USING gist
!   (c1 WITH &&, (c2::circle) WITH &&);
! ERROR:  could not create exclusion constraint "circles_c1_c2_excl1"
! DETAIL:  Key (c1, (c2::circle))=(<(0,0),5>, <(0,0),5>) conflicts with key (c1, (c2::circle))=(<(0,0),5>, <(0,0),4>).
! -- try reindexing an existing constraint
! REINDEX INDEX circles_c1_c2_excl;
! DROP TABLE circles;
! -- Check deferred exclusion constraint
! CREATE TABLE deferred_excl (
!   f1 int,
!   CONSTRAINT deferred_excl_con EXCLUDE (f1 WITH =) INITIALLY DEFERRED
! );
! INSERT INTO deferred_excl VALUES(1);
! INSERT INTO deferred_excl VALUES(2);
! INSERT INTO deferred_excl VALUES(1); -- fail
! ERROR:  conflicting key value violates exclusion constraint "deferred_excl_con"
! DETAIL:  Key (f1)=(1) conflicts with existing key (f1)=(1).
! BEGIN;
! INSERT INTO deferred_excl VALUES(2); -- no fail here
! COMMIT; -- should fail here
! ERROR:  conflicting key value violates exclusion constraint "deferred_excl_con"
! DETAIL:  Key (f1)=(2) conflicts with existing key (f1)=(2).
! BEGIN;
! INSERT INTO deferred_excl VALUES(3);
! INSERT INTO deferred_excl VALUES(3); -- no fail here
! COMMIT; -- should fail here
! ERROR:  conflicting key value violates exclusion constraint "deferred_excl_con"
! DETAIL:  Key (f1)=(3) conflicts with existing key (f1)=(3).
! ALTER TABLE deferred_excl DROP CONSTRAINT deferred_excl_con;
! -- This should fail, but worth testing because of HOT updates
! UPDATE deferred_excl SET f1 = 3;
! ALTER TABLE deferred_excl ADD EXCLUDE (f1 WITH =);
! ERROR:  could not create exclusion constraint "deferred_excl_f1_excl"
! DETAIL:  Key (f1)=(3) conflicts with key (f1)=(3).
! DROP TABLE deferred_excl;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/triggers.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/triggers.out	Tue Oct 28 15:53:05 2014
***************
*** 1,1733 ****
! --
! -- TRIGGERS
! --
! create table pkeys (pkey1 int4 not null, pkey2 text not null);
! create table fkeys (fkey1 int4, fkey2 text, fkey3 int);
! create table fkeys2 (fkey21 int4, fkey22 text, pkey23 int not null);
! create index fkeys_i on fkeys (fkey1, fkey2);
! create index fkeys2_i on fkeys2 (fkey21, fkey22);
! create index fkeys2p_i on fkeys2 (pkey23);
! insert into pkeys values (10, '1');
! insert into pkeys values (20, '2');
! insert into pkeys values (30, '3');
! insert into pkeys values (40, '4');
! insert into pkeys values (50, '5');
! insert into pkeys values (60, '6');
! create unique index pkeys_i on pkeys (pkey1, pkey2);
! --
! -- For fkeys:
! -- 	(fkey1, fkey2)	--> pkeys (pkey1, pkey2)
! -- 	(fkey3)		--> fkeys2 (pkey23)
! --
! create trigger check_fkeys_pkey_exist
! 	before insert or update on fkeys
! 	for each row
! 	execute procedure
! 	check_primary_key ('fkey1', 'fkey2', 'pkeys', 'pkey1', 'pkey2');
! create trigger check_fkeys_pkey2_exist
! 	before insert or update on fkeys
! 	for each row
! 	execute procedure check_primary_key ('fkey3', 'fkeys2', 'pkey23');
! --
! -- For fkeys2:
! -- 	(fkey21, fkey22)	--> pkeys (pkey1, pkey2)
! --
! create trigger check_fkeys2_pkey_exist
! 	before insert or update on fkeys2
! 	for each row
! 	execute procedure
! 	check_primary_key ('fkey21', 'fkey22', 'pkeys', 'pkey1', 'pkey2');
! -- Test comments
! COMMENT ON TRIGGER check_fkeys2_pkey_bad ON fkeys2 IS 'wrong';
! ERROR:  trigger "check_fkeys2_pkey_bad" for table "fkeys2" does not exist
! COMMENT ON TRIGGER check_fkeys2_pkey_exist ON fkeys2 IS 'right';
! COMMENT ON TRIGGER check_fkeys2_pkey_exist ON fkeys2 IS NULL;
! --
! -- For pkeys:
! -- 	ON DELETE/UPDATE (pkey1, pkey2) CASCADE:
! -- 		fkeys (fkey1, fkey2) and fkeys2 (fkey21, fkey22)
! --
! create trigger check_pkeys_fkey_cascade
! 	before delete or update on pkeys
! 	for each row
! 	execute procedure
! 	check_foreign_key (2, 'cascade', 'pkey1', 'pkey2',
! 	'fkeys', 'fkey1', 'fkey2', 'fkeys2', 'fkey21', 'fkey22');
! --
! -- For fkeys2:
! -- 	ON DELETE/UPDATE (pkey23) RESTRICT:
! -- 		fkeys (fkey3)
! --
! create trigger check_fkeys2_fkey_restrict
! 	before delete or update on fkeys2
! 	for each row
! 	execute procedure check_foreign_key (1, 'restrict', 'pkey23', 'fkeys', 'fkey3');
! insert into fkeys2 values (10, '1', 1);
! insert into fkeys2 values (30, '3', 2);
! insert into fkeys2 values (40, '4', 5);
! insert into fkeys2 values (50, '5', 3);
! -- no key in pkeys
! insert into fkeys2 values (70, '5', 3);
! ERROR:  tuple references non-existent key
! DETAIL:  Trigger "check_fkeys2_pkey_exist" found tuple referencing non-existent key in "pkeys".
! insert into fkeys values (10, '1', 2);
! insert into fkeys values (30, '3', 3);
! insert into fkeys values (40, '4', 2);
! insert into fkeys values (50, '5', 2);
! -- no key in pkeys
! insert into fkeys values (70, '5', 1);
! ERROR:  tuple references non-existent key
! DETAIL:  Trigger "check_fkeys_pkey_exist" found tuple referencing non-existent key in "pkeys".
! -- no key in fkeys2
! insert into fkeys values (60, '6', 4);
! ERROR:  tuple references non-existent key
! DETAIL:  Trigger "check_fkeys_pkey2_exist" found tuple referencing non-existent key in "fkeys2".
! delete from pkeys where pkey1 = 30 and pkey2 = '3';
! NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted
! ERROR:  "check_fkeys2_fkey_restrict": tuple is referenced in "fkeys"
! CONTEXT:  SQL statement "delete from fkeys2 where fkey21 = $1 and fkey22 = $2 "
! delete from pkeys where pkey1 = 40 and pkey2 = '4';
! NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted
! NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys2 are deleted
! update pkeys set pkey1 = 7, pkey2 = '70' where pkey1 = 50 and pkey2 = '5';
! NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted
! ERROR:  "check_fkeys2_fkey_restrict": tuple is referenced in "fkeys"
! CONTEXT:  SQL statement "delete from fkeys2 where fkey21 = $1 and fkey22 = $2 "
! update pkeys set pkey1 = 7, pkey2 = '70' where pkey1 = 10 and pkey2 = '1';
! NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted
! NOTICE:  check_pkeys_fkey_cascade: 1 tuple(s) of fkeys2 are deleted
! DROP TABLE pkeys;
! DROP TABLE fkeys;
! DROP TABLE fkeys2;
! -- -- I've disabled the funny_dup17 test because the new semantics
! -- -- of AFTER ROW triggers, which get now fired at the end of a
! -- -- query always, cause funny_dup17 to enter an endless loop.
! -- --
! -- --      Jan
! --
! -- create table dup17 (x int4);
! --
! -- create trigger dup17_before
! -- 	before insert on dup17
! -- 	for each row
! -- 	execute procedure
! -- 	funny_dup17 ()
! -- ;
! --
! -- insert into dup17 values (17);
! -- select count(*) from dup17;
! -- insert into dup17 values (17);
! -- select count(*) from dup17;
! --
! -- drop trigger dup17_before on dup17;
! --
! -- create trigger dup17_after
! -- 	after insert on dup17
! -- 	for each row
! -- 	execute procedure
! -- 	funny_dup17 ()
! -- ;
! -- insert into dup17 values (13);
! -- select count(*) from dup17 where x = 13;
! -- insert into dup17 values (13);
! -- select count(*) from dup17 where x = 13;
! --
! -- DROP TABLE dup17;
! create sequence ttdummy_seq increment 10 start 0 minvalue 0;
! create table tttest (
! 	price_id	int4,
! 	price_val	int4,
! 	price_on	int4,
! 	price_off	int4 default 999999
! );
! create trigger ttdummy
! 	before delete or update on tttest
! 	for each row
! 	execute procedure
! 	ttdummy (price_on, price_off);
! create trigger ttserial
! 	before insert or update on tttest
! 	for each row
! 	execute procedure
! 	autoinc (price_on, ttdummy_seq);
! insert into tttest values (1, 1, null);
! insert into tttest values (2, 2, null);
! insert into tttest values (3, 3, 0);
! select * from tttest;
!  price_id | price_val | price_on | price_off 
! ----------+-----------+----------+-----------
!         1 |         1 |       10 |    999999
!         2 |         2 |       20 |    999999
!         3 |         3 |       30 |    999999
! (3 rows)
! 
! delete from tttest where price_id = 2;
! select * from tttest;
!  price_id | price_val | price_on | price_off 
! ----------+-----------+----------+-----------
!         1 |         1 |       10 |    999999
!         3 |         3 |       30 |    999999
!         2 |         2 |       20 |        40
! (3 rows)
! 
! -- what do we see ?
! -- get current prices
! select * from tttest where price_off = 999999;
!  price_id | price_val | price_on | price_off 
! ----------+-----------+----------+-----------
!         1 |         1 |       10 |    999999
!         3 |         3 |       30 |    999999
! (2 rows)
! 
! -- change price for price_id == 3
! update tttest set price_val = 30 where price_id = 3;
! select * from tttest;
!  price_id | price_val | price_on | price_off 
! ----------+-----------+----------+-----------
!         1 |         1 |       10 |    999999
!         2 |         2 |       20 |        40
!         3 |        30 |       50 |    999999
!         3 |         3 |       30 |        50
! (4 rows)
! 
! -- now we want to change pric_id in ALL tuples
! -- this gets us not what we need
! update tttest set price_id = 5 where price_id = 3;
! select * from tttest;
!  price_id | price_val | price_on | price_off 
! ----------+-----------+----------+-----------
!         1 |         1 |       10 |    999999
!         2 |         2 |       20 |        40
!         3 |         3 |       30 |        50
!         5 |        30 |       60 |    999999
!         3 |        30 |       50 |        60
! (5 rows)
! 
! -- restore data as before last update:
! select set_ttdummy(0);
!  set_ttdummy 
! -------------
!            1
! (1 row)
! 
! delete from tttest where price_id = 5;
! update tttest set price_off = 999999 where price_val = 30;
! select * from tttest;
!  price_id | price_val | price_on | price_off 
! ----------+-----------+----------+-----------
!         1 |         1 |       10 |    999999
!         2 |         2 |       20 |        40
!         3 |         3 |       30 |        50
!         3 |        30 |       50 |    999999
! (4 rows)
! 
! -- and try change price_id now!
! update tttest set price_id = 5 where price_id = 3;
! select * from tttest;
!  price_id | price_val | price_on | price_off 
! ----------+-----------+----------+-----------
!         1 |         1 |       10 |    999999
!         2 |         2 |       20 |        40
!         5 |         3 |       30 |        50
!         5 |        30 |       50 |    999999
! (4 rows)
! 
! -- isn't it what we need ?
! select set_ttdummy(1);
!  set_ttdummy 
! -------------
!            0
! (1 row)
! 
! -- we want to correct some "date"
! update tttest set price_on = -1 where price_id = 1;
! ERROR:  ttdummy (tttest): you cannot change price_on and/or price_off columns (use set_ttdummy)
! -- but this doesn't work
! -- try in this way
! select set_ttdummy(0);
!  set_ttdummy 
! -------------
!            1
! (1 row)
! 
! update tttest set price_on = -1 where price_id = 1;
! select * from tttest;
!  price_id | price_val | price_on | price_off 
! ----------+-----------+----------+-----------
!         2 |         2 |       20 |        40
!         5 |         3 |       30 |        50
!         5 |        30 |       50 |    999999
!         1 |         1 |       -1 |    999999
! (4 rows)
! 
! -- isn't it what we need ?
! -- get price for price_id == 5 as it was @ "date" 35
! select * from tttest where price_on <= 35 and price_off > 35 and price_id = 5;
!  price_id | price_val | price_on | price_off 
! ----------+-----------+----------+-----------
!         5 |         3 |       30 |        50
! (1 row)
! 
! drop table tttest;
! drop sequence ttdummy_seq;
! --
! -- tests for per-statement triggers
! --
! CREATE TABLE log_table (tstamp timestamp default timeofday()::timestamp);
! CREATE TABLE main_table (a int, b int);
! COPY main_table (a,b) FROM stdin;
! CREATE FUNCTION trigger_func() RETURNS trigger LANGUAGE plpgsql AS '
! BEGIN
! 	RAISE NOTICE ''trigger_func(%) called: action = %, when = %, level = %'', TG_ARGV[0], TG_OP, TG_WHEN, TG_LEVEL;
! 	RETURN NULL;
! END;';
! CREATE TRIGGER before_ins_stmt_trig BEFORE INSERT ON main_table
! FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func('before_ins_stmt');
! CREATE TRIGGER after_ins_stmt_trig AFTER INSERT ON main_table
! FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func('after_ins_stmt');
! --
! -- if neither 'FOR EACH ROW' nor 'FOR EACH STATEMENT' was specified,
! -- CREATE TRIGGER should default to 'FOR EACH STATEMENT'
! --
! CREATE TRIGGER after_upd_stmt_trig AFTER UPDATE ON main_table
! EXECUTE PROCEDURE trigger_func('after_upd_stmt');
! CREATE TRIGGER after_upd_row_trig AFTER UPDATE ON main_table
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('after_upd_row');
! INSERT INTO main_table DEFAULT VALUES;
! NOTICE:  trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
! NOTICE:  trigger_func(after_ins_stmt) called: action = INSERT, when = AFTER, level = STATEMENT
! UPDATE main_table SET a = a + 1 WHERE b < 30;
! NOTICE:  trigger_func(after_upd_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! -- UPDATE that affects zero rows should still call per-statement trigger
! UPDATE main_table SET a = a + 2 WHERE b > 100;
! NOTICE:  trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! -- COPY should fire per-row and per-statement INSERT triggers
! COPY main_table (a, b) FROM stdin;
! NOTICE:  trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
! NOTICE:  trigger_func(after_ins_stmt) called: action = INSERT, when = AFTER, level = STATEMENT
! SELECT * FROM main_table ORDER BY a, b;
!  a  | b  
! ----+----
!   6 | 10
!  21 | 20
!  30 | 40
!  31 | 10
!  50 | 35
!  50 | 60
!  81 | 15
!     |   
! (8 rows)
! 
! --
! -- test triggers with WHEN clause
! --
! CREATE TRIGGER modified_a BEFORE UPDATE OF a ON main_table
! FOR EACH ROW WHEN (OLD.a <> NEW.a) EXECUTE PROCEDURE trigger_func('modified_a');
! CREATE TRIGGER modified_any BEFORE UPDATE OF a ON main_table
! FOR EACH ROW WHEN (OLD.* IS DISTINCT FROM NEW.*) EXECUTE PROCEDURE trigger_func('modified_any');
! CREATE TRIGGER insert_a AFTER INSERT ON main_table
! FOR EACH ROW WHEN (NEW.a = 123) EXECUTE PROCEDURE trigger_func('insert_a');
! CREATE TRIGGER delete_a AFTER DELETE ON main_table
! FOR EACH ROW WHEN (OLD.a = 123) EXECUTE PROCEDURE trigger_func('delete_a');
! CREATE TRIGGER insert_when BEFORE INSERT ON main_table
! FOR EACH STATEMENT WHEN (true) EXECUTE PROCEDURE trigger_func('insert_when');
! CREATE TRIGGER delete_when AFTER DELETE ON main_table
! FOR EACH STATEMENT WHEN (true) EXECUTE PROCEDURE trigger_func('delete_when');
! INSERT INTO main_table (a) VALUES (123), (456);
! NOTICE:  trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
! NOTICE:  trigger_func(insert_when) called: action = INSERT, when = BEFORE, level = STATEMENT
! NOTICE:  trigger_func(insert_a) called: action = INSERT, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_ins_stmt) called: action = INSERT, when = AFTER, level = STATEMENT
! COPY main_table FROM stdin;
! NOTICE:  trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
! NOTICE:  trigger_func(insert_when) called: action = INSERT, when = BEFORE, level = STATEMENT
! NOTICE:  trigger_func(insert_a) called: action = INSERT, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_ins_stmt) called: action = INSERT, when = AFTER, level = STATEMENT
! DELETE FROM main_table WHERE a IN (123, 456);
! NOTICE:  trigger_func(delete_a) called: action = DELETE, when = AFTER, level = ROW
! NOTICE:  trigger_func(delete_a) called: action = DELETE, when = AFTER, level = ROW
! NOTICE:  trigger_func(delete_when) called: action = DELETE, when = AFTER, level = STATEMENT
! UPDATE main_table SET a = 50, b = 60;
! NOTICE:  trigger_func(modified_any) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(modified_any) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(modified_a) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(modified_a) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(modified_a) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(modified_a) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(modified_a) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(after_upd_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! SELECT * FROM main_table ORDER BY a, b;
!  a  | b  
! ----+----
!   6 | 10
!  21 | 20
!  30 | 40
!  31 | 10
!  50 | 35
!  50 | 60
!  81 | 15
!     |   
! (8 rows)
! 
! SELECT pg_get_triggerdef(oid, true) FROM pg_trigger WHERE tgrelid = 'main_table'::regclass AND tgname = 'modified_a';
!                                                              pg_get_triggerdef                                                              
! --------------------------------------------------------------------------------------------------------------------------------------------
!  CREATE TRIGGER modified_a BEFORE UPDATE OF a ON main_table FOR EACH ROW WHEN (old.a <> new.a) EXECUTE PROCEDURE trigger_func('modified_a')
! (1 row)
! 
! SELECT pg_get_triggerdef(oid, false) FROM pg_trigger WHERE tgrelid = 'main_table'::regclass AND tgname = 'modified_a';
!                                                               pg_get_triggerdef                                                               
! ----------------------------------------------------------------------------------------------------------------------------------------------
!  CREATE TRIGGER modified_a BEFORE UPDATE OF a ON main_table FOR EACH ROW WHEN ((old.a <> new.a)) EXECUTE PROCEDURE trigger_func('modified_a')
! (1 row)
! 
! SELECT pg_get_triggerdef(oid, true) FROM pg_trigger WHERE tgrelid = 'main_table'::regclass AND tgname = 'modified_any';
!                                                                       pg_get_triggerdef                                                                       
! --------------------------------------------------------------------------------------------------------------------------------------------------------------
!  CREATE TRIGGER modified_any BEFORE UPDATE OF a ON main_table FOR EACH ROW WHEN (old.* IS DISTINCT FROM new.*) EXECUTE PROCEDURE trigger_func('modified_any')
! (1 row)
! 
! DROP TRIGGER modified_a ON main_table;
! DROP TRIGGER modified_any ON main_table;
! DROP TRIGGER insert_a ON main_table;
! DROP TRIGGER delete_a ON main_table;
! DROP TRIGGER insert_when ON main_table;
! DROP TRIGGER delete_when ON main_table;
! -- Test column-level triggers
! DROP TRIGGER after_upd_row_trig ON main_table;
! CREATE TRIGGER before_upd_a_row_trig BEFORE UPDATE OF a ON main_table
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('before_upd_a_row');
! CREATE TRIGGER after_upd_b_row_trig AFTER UPDATE OF b ON main_table
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('after_upd_b_row');
! CREATE TRIGGER after_upd_a_b_row_trig AFTER UPDATE OF a, b ON main_table
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('after_upd_a_b_row');
! CREATE TRIGGER before_upd_a_stmt_trig BEFORE UPDATE OF a ON main_table
! FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func('before_upd_a_stmt');
! CREATE TRIGGER after_upd_b_stmt_trig AFTER UPDATE OF b ON main_table
! FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func('after_upd_b_stmt');
! SELECT pg_get_triggerdef(oid) FROM pg_trigger WHERE tgrelid = 'main_table'::regclass AND tgname = 'after_upd_a_b_row_trig';
!                                                              pg_get_triggerdef                                                             
! -------------------------------------------------------------------------------------------------------------------------------------------
!  CREATE TRIGGER after_upd_a_b_row_trig AFTER UPDATE OF a, b ON main_table FOR EACH ROW EXECUTE PROCEDURE trigger_func('after_upd_a_b_row')
! (1 row)
! 
! UPDATE main_table SET a = 50;
! NOTICE:  trigger_func(before_upd_a_stmt) called: action = UPDATE, when = BEFORE, level = STATEMENT
! NOTICE:  trigger_func(before_upd_a_row) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(before_upd_a_row) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(before_upd_a_row) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(before_upd_a_row) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(before_upd_a_row) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(before_upd_a_row) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(before_upd_a_row) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(before_upd_a_row) called: action = UPDATE, when = BEFORE, level = ROW
! NOTICE:  trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! UPDATE main_table SET b = 10;
! NOTICE:  trigger_func(after_upd_a_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_a_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_a_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_a_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_a_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_a_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_a_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_a_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_b_row) called: action = UPDATE, when = AFTER, level = ROW
! NOTICE:  trigger_func(after_upd_b_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! NOTICE:  trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! --
! -- Test case for bug with BEFORE trigger followed by AFTER trigger with WHEN
! --
! CREATE TABLE some_t (some_col boolean NOT NULL);
! CREATE FUNCTION dummy_update_func() RETURNS trigger AS $$
! BEGIN
!   RAISE NOTICE 'dummy_update_func(%) called: action = %, old = %, new = %',
!     TG_ARGV[0], TG_OP, OLD, NEW;
!   RETURN NEW;
! END;
! $$ LANGUAGE plpgsql;
! CREATE TRIGGER some_trig_before BEFORE UPDATE ON some_t FOR EACH ROW
!   EXECUTE PROCEDURE dummy_update_func('before');
! CREATE TRIGGER some_trig_aftera AFTER UPDATE ON some_t FOR EACH ROW
!   WHEN (NOT OLD.some_col AND NEW.some_col)
!   EXECUTE PROCEDURE dummy_update_func('aftera');
! CREATE TRIGGER some_trig_afterb AFTER UPDATE ON some_t FOR EACH ROW
!   WHEN (NOT NEW.some_col)
!   EXECUTE PROCEDURE dummy_update_func('afterb');
! INSERT INTO some_t VALUES (TRUE);
! UPDATE some_t SET some_col = TRUE;
! NOTICE:  dummy_update_func(before) called: action = UPDATE, old = (t), new = (t)
! UPDATE some_t SET some_col = FALSE;
! NOTICE:  dummy_update_func(before) called: action = UPDATE, old = (t), new = (f)
! NOTICE:  dummy_update_func(afterb) called: action = UPDATE, old = (t), new = (f)
! UPDATE some_t SET some_col = TRUE;
! NOTICE:  dummy_update_func(before) called: action = UPDATE, old = (f), new = (t)
! NOTICE:  dummy_update_func(aftera) called: action = UPDATE, old = (f), new = (t)
! DROP TABLE some_t;
! -- bogus cases
! CREATE TRIGGER error_upd_and_col BEFORE UPDATE OR UPDATE OF a ON main_table
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('error_upd_and_col');
! ERROR:  duplicate trigger events specified at or near "ON"
! LINE 1: ...ER error_upd_and_col BEFORE UPDATE OR UPDATE OF a ON main_ta...
!                                                              ^
! CREATE TRIGGER error_upd_a_a BEFORE UPDATE OF a, a ON main_table
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('error_upd_a_a');
! ERROR:  column "a" specified more than once
! CREATE TRIGGER error_ins_a BEFORE INSERT OF a ON main_table
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('error_ins_a');
! ERROR:  syntax error at or near "OF"
! LINE 1: CREATE TRIGGER error_ins_a BEFORE INSERT OF a ON main_table
!                                                  ^
! CREATE TRIGGER error_ins_when BEFORE INSERT OR UPDATE ON main_table
! FOR EACH ROW WHEN (OLD.a <> NEW.a)
! EXECUTE PROCEDURE trigger_func('error_ins_old');
! ERROR:  INSERT trigger's WHEN condition cannot reference OLD values
! LINE 2: FOR EACH ROW WHEN (OLD.a <> NEW.a)
!                            ^
! CREATE TRIGGER error_del_when BEFORE DELETE OR UPDATE ON main_table
! FOR EACH ROW WHEN (OLD.a <> NEW.a)
! EXECUTE PROCEDURE trigger_func('error_del_new');
! ERROR:  DELETE trigger's WHEN condition cannot reference NEW values
! LINE 2: FOR EACH ROW WHEN (OLD.a <> NEW.a)
!                                     ^
! CREATE TRIGGER error_del_when BEFORE INSERT OR UPDATE ON main_table
! FOR EACH ROW WHEN (NEW.tableoid <> 0)
! EXECUTE PROCEDURE trigger_func('error_when_sys_column');
! ERROR:  BEFORE trigger's WHEN condition cannot reference NEW system columns
! LINE 2: FOR EACH ROW WHEN (NEW.tableoid <> 0)
!                            ^
! CREATE TRIGGER error_stmt_when BEFORE UPDATE OF a ON main_table
! FOR EACH STATEMENT WHEN (OLD.* IS DISTINCT FROM NEW.*)
! EXECUTE PROCEDURE trigger_func('error_stmt_when');
! ERROR:  statement trigger's WHEN condition cannot reference column values
! LINE 2: FOR EACH STATEMENT WHEN (OLD.* IS DISTINCT FROM NEW.*)
!                                  ^
! -- check dependency restrictions
! ALTER TABLE main_table DROP COLUMN b;
! ERROR:  cannot drop table main_table column b because other objects depend on it
! DETAIL:  trigger after_upd_b_row_trig on table main_table depends on table main_table column b
! trigger after_upd_a_b_row_trig on table main_table depends on table main_table column b
! trigger after_upd_b_stmt_trig on table main_table depends on table main_table column b
! HINT:  Use DROP ... CASCADE to drop the dependent objects too.
! -- this should succeed, but we'll roll it back to keep the triggers around
! begin;
! DROP TRIGGER after_upd_a_b_row_trig ON main_table;
! DROP TRIGGER after_upd_b_row_trig ON main_table;
! DROP TRIGGER after_upd_b_stmt_trig ON main_table;
! ALTER TABLE main_table DROP COLUMN b;
! rollback;
! -- Test enable/disable triggers
! create table trigtest (i serial primary key);
! -- test that disabling RI triggers works
! create table trigtest2 (i int references trigtest(i) on delete cascade);
! create function trigtest() returns trigger as $$
! begin
! 	raise notice '% % % %', TG_RELNAME, TG_OP, TG_WHEN, TG_LEVEL;
! 	return new;
! end;$$ language plpgsql;
! create trigger trigtest_b_row_tg before insert or update or delete on trigtest
! for each row execute procedure trigtest();
! create trigger trigtest_a_row_tg after insert or update or delete on trigtest
! for each row execute procedure trigtest();
! create trigger trigtest_b_stmt_tg before insert or update or delete on trigtest
! for each statement execute procedure trigtest();
! create trigger trigtest_a_stmt_tg after insert or update or delete on trigtest
! for each statement execute procedure trigtest();
! insert into trigtest default values;
! NOTICE:  trigtest INSERT BEFORE STATEMENT
! NOTICE:  trigtest INSERT BEFORE ROW
! NOTICE:  trigtest INSERT AFTER ROW
! NOTICE:  trigtest INSERT AFTER STATEMENT
! alter table trigtest disable trigger trigtest_b_row_tg;
! insert into trigtest default values;
! NOTICE:  trigtest INSERT BEFORE STATEMENT
! NOTICE:  trigtest INSERT AFTER ROW
! NOTICE:  trigtest INSERT AFTER STATEMENT
! alter table trigtest disable trigger user;
! insert into trigtest default values;
! alter table trigtest enable trigger trigtest_a_stmt_tg;
! insert into trigtest default values;
! NOTICE:  trigtest INSERT AFTER STATEMENT
! insert into trigtest2 values(1);
! insert into trigtest2 values(2);
! delete from trigtest where i=2;
! NOTICE:  trigtest DELETE AFTER STATEMENT
! select * from trigtest2;
!  i 
! ---
!  1
! (1 row)
! 
! alter table trigtest disable trigger all;
! delete from trigtest where i=1;
! select * from trigtest2;
!  i 
! ---
!  1
! (1 row)
! 
! -- ensure we still insert, even when all triggers are disabled
! insert into trigtest default values;
! select *  from trigtest;
!  i 
! ---
!  3
!  4
!  5
! (3 rows)
! 
! drop table trigtest2;
! drop table trigtest;
! -- dump trigger data
! CREATE TABLE trigger_test (
!         i int,
!         v varchar
! );
! CREATE OR REPLACE FUNCTION trigger_data()  RETURNS trigger
! LANGUAGE plpgsql AS $$
! 
! declare
! 
! 	argstr text;
! 	relid text;
! 
! begin
! 
! 	relid := TG_relid::regclass;
! 
! 	-- plpgsql can't discover its trigger data in a hash like perl and python
! 	-- can, or by a sort of reflection like tcl can,
! 	-- so we have to hard code the names.
! 	raise NOTICE 'TG_NAME: %', TG_name;
! 	raise NOTICE 'TG_WHEN: %', TG_when;
! 	raise NOTICE 'TG_LEVEL: %', TG_level;
! 	raise NOTICE 'TG_OP: %', TG_op;
! 	raise NOTICE 'TG_RELID::regclass: %', relid;
! 	raise NOTICE 'TG_RELNAME: %', TG_relname;
! 	raise NOTICE 'TG_TABLE_NAME: %', TG_table_name;
! 	raise NOTICE 'TG_TABLE_SCHEMA: %', TG_table_schema;
! 	raise NOTICE 'TG_NARGS: %', TG_nargs;
! 
! 	argstr := '[';
! 	for i in 0 .. TG_nargs - 1 loop
! 		if i > 0 then
! 			argstr := argstr || ', ';
! 		end if;
! 		argstr := argstr || TG_argv[i];
! 	end loop;
! 	argstr := argstr || ']';
! 	raise NOTICE 'TG_ARGV: %', argstr;
! 
! 	if TG_OP != 'INSERT' then
! 		raise NOTICE 'OLD: %', OLD;
! 	end if;
! 
! 	if TG_OP != 'DELETE' then
! 		raise NOTICE 'NEW: %', NEW;
! 	end if;
! 
! 	if TG_OP = 'DELETE' then
! 		return OLD;
! 	else
! 		return NEW;
! 	end if;
! 
! end;
! $$;
! CREATE TRIGGER show_trigger_data_trig
! BEFORE INSERT OR UPDATE OR DELETE ON trigger_test
! FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
! insert into trigger_test values(1,'insert');
! NOTICE:  TG_NAME: show_trigger_data_trig
! NOTICE:  TG_WHEN: BEFORE
! NOTICE:  TG_LEVEL: ROW
! NOTICE:  TG_OP: INSERT
! NOTICE:  TG_RELID::regclass: trigger_test
! NOTICE:  TG_RELNAME: trigger_test
! NOTICE:  TG_TABLE_NAME: trigger_test
! NOTICE:  TG_TABLE_SCHEMA: public
! NOTICE:  TG_NARGS: 2
! NOTICE:  TG_ARGV: [23, skidoo]
! NOTICE:  NEW: (1,insert)
! update trigger_test set v = 'update' where i = 1;
! NOTICE:  TG_NAME: show_trigger_data_trig
! NOTICE:  TG_WHEN: BEFORE
! NOTICE:  TG_LEVEL: ROW
! NOTICE:  TG_OP: UPDATE
! NOTICE:  TG_RELID::regclass: trigger_test
! NOTICE:  TG_RELNAME: trigger_test
! NOTICE:  TG_TABLE_NAME: trigger_test
! NOTICE:  TG_TABLE_SCHEMA: public
! NOTICE:  TG_NARGS: 2
! NOTICE:  TG_ARGV: [23, skidoo]
! NOTICE:  OLD: (1,insert)
! NOTICE:  NEW: (1,update)
! delete from trigger_test;
! NOTICE:  TG_NAME: show_trigger_data_trig
! NOTICE:  TG_WHEN: BEFORE
! NOTICE:  TG_LEVEL: ROW
! NOTICE:  TG_OP: DELETE
! NOTICE:  TG_RELID::regclass: trigger_test
! NOTICE:  TG_RELNAME: trigger_test
! NOTICE:  TG_TABLE_NAME: trigger_test
! NOTICE:  TG_TABLE_SCHEMA: public
! NOTICE:  TG_NARGS: 2
! NOTICE:  TG_ARGV: [23, skidoo]
! NOTICE:  OLD: (1,update)
! DROP TRIGGER show_trigger_data_trig on trigger_test;
! DROP FUNCTION trigger_data();
! DROP TABLE trigger_test;
! --
! -- Test use of row comparisons on OLD/NEW
! --
! CREATE TABLE trigger_test (f1 int, f2 text, f3 text);
! -- this is the obvious (and wrong...) way to compare rows
! CREATE FUNCTION mytrigger() RETURNS trigger LANGUAGE plpgsql as $$
! begin
! 	if row(old.*) = row(new.*) then
! 		raise notice 'row % not changed', new.f1;
! 	else
! 		raise notice 'row % changed', new.f1;
! 	end if;
! 	return new;
! end$$;
! CREATE TRIGGER t
! BEFORE UPDATE ON trigger_test
! FOR EACH ROW EXECUTE PROCEDURE mytrigger();
! INSERT INTO trigger_test VALUES(1, 'foo', 'bar');
! INSERT INTO trigger_test VALUES(2, 'baz', 'quux');
! UPDATE trigger_test SET f3 = 'bar';
! NOTICE:  row 1 not changed
! NOTICE:  row 2 changed
! UPDATE trigger_test SET f3 = NULL;
! NOTICE:  row 1 changed
! NOTICE:  row 2 changed
! -- this demonstrates that the above isn't really working as desired:
! UPDATE trigger_test SET f3 = NULL;
! NOTICE:  row 1 changed
! NOTICE:  row 2 changed
! -- the right way when considering nulls is
! CREATE OR REPLACE FUNCTION mytrigger() RETURNS trigger LANGUAGE plpgsql as $$
! begin
! 	if row(old.*) is distinct from row(new.*) then
! 		raise notice 'row % changed', new.f1;
! 	else
! 		raise notice 'row % not changed', new.f1;
! 	end if;
! 	return new;
! end$$;
! UPDATE trigger_test SET f3 = 'bar';
! NOTICE:  row 1 changed
! NOTICE:  row 2 changed
! UPDATE trigger_test SET f3 = NULL;
! NOTICE:  row 1 changed
! NOTICE:  row 2 changed
! UPDATE trigger_test SET f3 = NULL;
! NOTICE:  row 1 not changed
! NOTICE:  row 2 not changed
! DROP TABLE trigger_test;
! DROP FUNCTION mytrigger();
! -- Test snapshot management in serializable transactions involving triggers
! -- per bug report in 6bc73d4c0910042358k3d1adff3qa36f8df75198ecea@mail.gmail.com
! CREATE FUNCTION serializable_update_trig() RETURNS trigger LANGUAGE plpgsql AS
! $$
! declare
! 	rec record;
! begin
! 	new.description = 'updated in trigger';
! 	return new;
! end;
! $$;
! CREATE TABLE serializable_update_tab (
! 	id int,
! 	filler  text,
! 	description text
! );
! CREATE TRIGGER serializable_update_trig BEFORE UPDATE ON serializable_update_tab
! 	FOR EACH ROW EXECUTE PROCEDURE serializable_update_trig();
! INSERT INTO serializable_update_tab SELECT a, repeat('xyzxz', 100), 'new'
! 	FROM generate_series(1, 50) a;
! BEGIN;
! SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
! UPDATE serializable_update_tab SET description = 'no no', id = 1 WHERE id = 1;
! COMMIT;
! SELECT description FROM serializable_update_tab WHERE id = 1;
!     description     
! --------------------
!  updated in trigger
! (1 row)
! 
! DROP TABLE serializable_update_tab;
! -- minimal update trigger
! CREATE TABLE min_updates_test (
! 	f1	text,
! 	f2 int,
! 	f3 int);
! CREATE TABLE min_updates_test_oids (
! 	f1	text,
! 	f2 int,
! 	f3 int) WITH OIDS;
! INSERT INTO min_updates_test VALUES ('a',1,2),('b','2',null);
! INSERT INTO min_updates_test_oids VALUES ('a',1,2),('b','2',null);
! CREATE TRIGGER z_min_update
! BEFORE UPDATE ON min_updates_test
! FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();
! CREATE TRIGGER z_min_update
! BEFORE UPDATE ON min_updates_test_oids
! FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();
! \set QUIET false
! UPDATE min_updates_test SET f1 = f1;
! UPDATE 0
! UPDATE min_updates_test SET f2 = f2 + 1;
! UPDATE 2
! UPDATE min_updates_test SET f3 = 2 WHERE f3 is null;
! UPDATE 1
! UPDATE min_updates_test_oids SET f1 = f1;
! UPDATE 0
! UPDATE min_updates_test_oids SET f2 = f2 + 1;
! UPDATE 2
! UPDATE min_updates_test_oids SET f3 = 2 WHERE f3 is null;
! UPDATE 1
! \set QUIET true
! SELECT * FROM min_updates_test;
!  f1 | f2 | f3 
! ----+----+----
!  a  |  2 |  2
!  b  |  3 |  2
! (2 rows)
! 
! SELECT * FROM min_updates_test_oids;
!  f1 | f2 | f3 
! ----+----+----
!  a  |  2 |  2
!  b  |  3 |  2
! (2 rows)
! 
! DROP TABLE min_updates_test;
! DROP TABLE min_updates_test_oids;
! --
! -- Test triggers on views
! --
! CREATE VIEW main_view AS SELECT a, b FROM main_table;
! -- VIEW trigger function
! CREATE OR REPLACE FUNCTION view_trigger() RETURNS trigger
! LANGUAGE plpgsql AS $$
! declare
!     argstr text := '';
! begin
!     for i in 0 .. TG_nargs - 1 loop
!         if i > 0 then
!             argstr := argstr || ', ';
!         end if;
!         argstr := argstr || TG_argv[i];
!     end loop;
! 
!     raise notice '% % % % (%)', TG_RELNAME, TG_WHEN, TG_OP, TG_LEVEL, argstr;
! 
!     if TG_LEVEL = 'ROW' then
!         if TG_OP = 'INSERT' then
!             raise NOTICE 'NEW: %', NEW;
!             INSERT INTO main_table VALUES (NEW.a, NEW.b);
!             RETURN NEW;
!         end if;
! 
!         if TG_OP = 'UPDATE' then
!             raise NOTICE 'OLD: %, NEW: %', OLD, NEW;
!             UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b;
!             if NOT FOUND then RETURN NULL; end if;
!             RETURN NEW;
!         end if;
! 
!         if TG_OP = 'DELETE' then
!             raise NOTICE 'OLD: %', OLD;
!             DELETE FROM main_table WHERE a = OLD.a AND b = OLD.b;
!             if NOT FOUND then RETURN NULL; end if;
!             RETURN OLD;
!         end if;
!     end if;
! 
!     RETURN NULL;
! end;
! $$;
! -- Before row triggers aren't allowed on views
! CREATE TRIGGER invalid_trig BEFORE INSERT ON main_view
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('before_ins_row');
! ERROR:  "main_view" is a view
! DETAIL:  Views cannot have row-level BEFORE or AFTER triggers.
! CREATE TRIGGER invalid_trig BEFORE UPDATE ON main_view
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('before_upd_row');
! ERROR:  "main_view" is a view
! DETAIL:  Views cannot have row-level BEFORE or AFTER triggers.
! CREATE TRIGGER invalid_trig BEFORE DELETE ON main_view
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('before_del_row');
! ERROR:  "main_view" is a view
! DETAIL:  Views cannot have row-level BEFORE or AFTER triggers.
! -- After row triggers aren't allowed on views
! CREATE TRIGGER invalid_trig AFTER INSERT ON main_view
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('before_ins_row');
! ERROR:  "main_view" is a view
! DETAIL:  Views cannot have row-level BEFORE or AFTER triggers.
! CREATE TRIGGER invalid_trig AFTER UPDATE ON main_view
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('before_upd_row');
! ERROR:  "main_view" is a view
! DETAIL:  Views cannot have row-level BEFORE or AFTER triggers.
! CREATE TRIGGER invalid_trig AFTER DELETE ON main_view
! FOR EACH ROW EXECUTE PROCEDURE trigger_func('before_del_row');
! ERROR:  "main_view" is a view
! DETAIL:  Views cannot have row-level BEFORE or AFTER triggers.
! -- Truncate triggers aren't allowed on views
! CREATE TRIGGER invalid_trig BEFORE TRUNCATE ON main_view
! EXECUTE PROCEDURE trigger_func('before_tru_row');
! ERROR:  "main_view" is a view
! DETAIL:  Views cannot have TRUNCATE triggers.
! CREATE TRIGGER invalid_trig AFTER TRUNCATE ON main_view
! EXECUTE PROCEDURE trigger_func('before_tru_row');
! ERROR:  "main_view" is a view
! DETAIL:  Views cannot have TRUNCATE triggers.
! -- INSTEAD OF triggers aren't allowed on tables
! CREATE TRIGGER invalid_trig INSTEAD OF INSERT ON main_table
! FOR EACH ROW EXECUTE PROCEDURE view_trigger('instead_of_ins');
! ERROR:  "main_table" is a table
! DETAIL:  Tables cannot have INSTEAD OF triggers.
! CREATE TRIGGER invalid_trig INSTEAD OF UPDATE ON main_table
! FOR EACH ROW EXECUTE PROCEDURE view_trigger('instead_of_upd');
! ERROR:  "main_table" is a table
! DETAIL:  Tables cannot have INSTEAD OF triggers.
! CREATE TRIGGER invalid_trig INSTEAD OF DELETE ON main_table
! FOR EACH ROW EXECUTE PROCEDURE view_trigger('instead_of_del');
! ERROR:  "main_table" is a table
! DETAIL:  Tables cannot have INSTEAD OF triggers.
! -- Don't support WHEN clauses with INSTEAD OF triggers
! CREATE TRIGGER invalid_trig INSTEAD OF UPDATE ON main_view
! FOR EACH ROW WHEN (OLD.a <> NEW.a) EXECUTE PROCEDURE view_trigger('instead_of_upd');
! ERROR:  INSTEAD OF triggers cannot have WHEN conditions
! -- Don't support column-level INSTEAD OF triggers
! CREATE TRIGGER invalid_trig INSTEAD OF UPDATE OF a ON main_view
! FOR EACH ROW EXECUTE PROCEDURE view_trigger('instead_of_upd');
! ERROR:  INSTEAD OF triggers cannot have column lists
! -- Don't support statement-level INSTEAD OF triggers
! CREATE TRIGGER invalid_trig INSTEAD OF UPDATE ON main_view
! EXECUTE PROCEDURE view_trigger('instead_of_upd');
! ERROR:  INSTEAD OF triggers must be FOR EACH ROW
! -- Valid INSTEAD OF triggers
! CREATE TRIGGER instead_of_insert_trig INSTEAD OF INSERT ON main_view
! FOR EACH ROW EXECUTE PROCEDURE view_trigger('instead_of_ins');
! CREATE TRIGGER instead_of_update_trig INSTEAD OF UPDATE ON main_view
! FOR EACH ROW EXECUTE PROCEDURE view_trigger('instead_of_upd');
! CREATE TRIGGER instead_of_delete_trig INSTEAD OF DELETE ON main_view
! FOR EACH ROW EXECUTE PROCEDURE view_trigger('instead_of_del');
! -- Valid BEFORE statement VIEW triggers
! CREATE TRIGGER before_ins_stmt_trig BEFORE INSERT ON main_view
! FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('before_view_ins_stmt');
! CREATE TRIGGER before_upd_stmt_trig BEFORE UPDATE ON main_view
! FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('before_view_upd_stmt');
! CREATE TRIGGER before_del_stmt_trig BEFORE DELETE ON main_view
! FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('before_view_del_stmt');
! -- Valid AFTER statement VIEW triggers
! CREATE TRIGGER after_ins_stmt_trig AFTER INSERT ON main_view
! FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('after_view_ins_stmt');
! CREATE TRIGGER after_upd_stmt_trig AFTER UPDATE ON main_view
! FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('after_view_upd_stmt');
! CREATE TRIGGER after_del_stmt_trig AFTER DELETE ON main_view
! FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('after_view_del_stmt');
! \set QUIET false
! -- Insert into view using trigger
! INSERT INTO main_view VALUES (20, 30);
! NOTICE:  main_view BEFORE INSERT STATEMENT (before_view_ins_stmt)
! NOTICE:  main_view INSTEAD OF INSERT ROW (instead_of_ins)
! NOTICE:  NEW: (20,30)
! NOTICE:  trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
! CONTEXT:  SQL statement "INSERT INTO main_table VALUES (NEW.a, NEW.b)"
! PL/pgSQL function view_trigger() line 17 at SQL statement
! NOTICE:  trigger_func(after_ins_stmt) called: action = INSERT, when = AFTER, level = STATEMENT
! CONTEXT:  SQL statement "INSERT INTO main_table VALUES (NEW.a, NEW.b)"
! PL/pgSQL function view_trigger() line 17 at SQL statement
! NOTICE:  main_view AFTER INSERT STATEMENT (after_view_ins_stmt)
! INSERT 0 1
! INSERT INTO main_view VALUES (21, 31) RETURNING a, b;
! NOTICE:  main_view BEFORE INSERT STATEMENT (before_view_ins_stmt)
! NOTICE:  main_view INSTEAD OF INSERT ROW (instead_of_ins)
! NOTICE:  NEW: (21,31)
! NOTICE:  trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
! CONTEXT:  SQL statement "INSERT INTO main_table VALUES (NEW.a, NEW.b)"
! PL/pgSQL function view_trigger() line 17 at SQL statement
! NOTICE:  trigger_func(after_ins_stmt) called: action = INSERT, when = AFTER, level = STATEMENT
! CONTEXT:  SQL statement "INSERT INTO main_table VALUES (NEW.a, NEW.b)"
! PL/pgSQL function view_trigger() line 17 at SQL statement
! NOTICE:  main_view AFTER INSERT STATEMENT (after_view_ins_stmt)
!  a  | b  
! ----+----
!  21 | 31
! (1 row)
! 
! INSERT 0 1
! -- Table trigger will prevent updates
! UPDATE main_view SET b = 31 WHERE a = 20;
! NOTICE:  main_view BEFORE UPDATE STATEMENT (before_view_upd_stmt)
! NOTICE:  main_view INSTEAD OF UPDATE ROW (instead_of_upd)
! NOTICE:  OLD: (20,30), NEW: (20,31)
! NOTICE:  trigger_func(before_upd_a_stmt) called: action = UPDATE, when = BEFORE, level = STATEMENT
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(before_upd_a_row) called: action = UPDATE, when = BEFORE, level = ROW
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(after_upd_b_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  main_view AFTER UPDATE STATEMENT (after_view_upd_stmt)
! UPDATE 0
! UPDATE main_view SET b = 32 WHERE a = 21 AND b = 31 RETURNING a, b;
! NOTICE:  main_view BEFORE UPDATE STATEMENT (before_view_upd_stmt)
! NOTICE:  main_view INSTEAD OF UPDATE ROW (instead_of_upd)
! NOTICE:  OLD: (21,31), NEW: (21,32)
! NOTICE:  trigger_func(before_upd_a_stmt) called: action = UPDATE, when = BEFORE, level = STATEMENT
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(before_upd_a_row) called: action = UPDATE, when = BEFORE, level = ROW
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(after_upd_b_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  main_view AFTER UPDATE STATEMENT (after_view_upd_stmt)
!  a | b 
! ---+---
! (0 rows)
! 
! UPDATE 0
! -- Remove table trigger to allow updates
! DROP TRIGGER before_upd_a_row_trig ON main_table;
! DROP TRIGGER
! UPDATE main_view SET b = 31 WHERE a = 20;
! NOTICE:  main_view BEFORE UPDATE STATEMENT (before_view_upd_stmt)
! NOTICE:  main_view INSTEAD OF UPDATE ROW (instead_of_upd)
! NOTICE:  OLD: (20,30), NEW: (20,31)
! NOTICE:  trigger_func(before_upd_a_stmt) called: action = UPDATE, when = BEFORE, level = STATEMENT
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(after_upd_a_b_row) called: action = UPDATE, when = AFTER, level = ROW
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(after_upd_b_row) called: action = UPDATE, when = AFTER, level = ROW
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(after_upd_b_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  main_view AFTER UPDATE STATEMENT (after_view_upd_stmt)
! UPDATE 1
! UPDATE main_view SET b = 32 WHERE a = 21 AND b = 31 RETURNING a, b;
! NOTICE:  main_view BEFORE UPDATE STATEMENT (before_view_upd_stmt)
! NOTICE:  main_view INSTEAD OF UPDATE ROW (instead_of_upd)
! NOTICE:  OLD: (21,31), NEW: (21,32)
! NOTICE:  trigger_func(before_upd_a_stmt) called: action = UPDATE, when = BEFORE, level = STATEMENT
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(after_upd_a_b_row) called: action = UPDATE, when = AFTER, level = ROW
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(after_upd_b_row) called: action = UPDATE, when = AFTER, level = ROW
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(after_upd_b_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
! CONTEXT:  SQL statement "UPDATE main_table SET a = NEW.a, b = NEW.b WHERE a = OLD.a AND b = OLD.b"
! PL/pgSQL function view_trigger() line 23 at SQL statement
! NOTICE:  main_view AFTER UPDATE STATEMENT (after_view_upd_stmt)
!  a  | b  
! ----+----
!  21 | 32
! (1 row)
! 
! UPDATE 1
! -- Before and after stmt triggers should fire even when no rows are affected
! UPDATE main_view SET b = 0 WHERE false;
! NOTICE:  main_view BEFORE UPDATE STATEMENT (before_view_upd_stmt)
! NOTICE:  main_view AFTER UPDATE STATEMENT (after_view_upd_stmt)
! UPDATE 0
! -- Delete from view using trigger
! DELETE FROM main_view WHERE a IN (20,21);
! NOTICE:  main_view BEFORE DELETE STATEMENT (before_view_del_stmt)
! NOTICE:  main_view INSTEAD OF DELETE ROW (instead_of_del)
! NOTICE:  OLD: (21,10)
! NOTICE:  main_view INSTEAD OF DELETE ROW (instead_of_del)
! NOTICE:  OLD: (20,31)
! NOTICE:  main_view INSTEAD OF DELETE ROW (instead_of_del)
! NOTICE:  OLD: (21,32)
! NOTICE:  main_view AFTER DELETE STATEMENT (after_view_del_stmt)
! DELETE 3
! DELETE FROM main_view WHERE a = 31 RETURNING a, b;
! NOTICE:  main_view BEFORE DELETE STATEMENT (before_view_del_stmt)
! NOTICE:  main_view INSTEAD OF DELETE ROW (instead_of_del)
! NOTICE:  OLD: (31,10)
! NOTICE:  main_view AFTER DELETE STATEMENT (after_view_del_stmt)
!  a  | b  
! ----+----
!  31 | 10
! (1 row)
! 
! DELETE 1
! \set QUIET true
! -- Describe view should list triggers
! \d main_view
!    View "public.main_view"
!  Column |  Type   | Modifiers 
! --------+---------+-----------
!  a      | integer | 
!  b      | integer | 
! Triggers:
!     after_del_stmt_trig AFTER DELETE ON main_view FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('after_view_del_stmt')
!     after_ins_stmt_trig AFTER INSERT ON main_view FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('after_view_ins_stmt')
!     after_upd_stmt_trig AFTER UPDATE ON main_view FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('after_view_upd_stmt')
!     before_del_stmt_trig BEFORE DELETE ON main_view FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('before_view_del_stmt')
!     before_ins_stmt_trig BEFORE INSERT ON main_view FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('before_view_ins_stmt')
!     before_upd_stmt_trig BEFORE UPDATE ON main_view FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('before_view_upd_stmt')
!     instead_of_delete_trig INSTEAD OF DELETE ON main_view FOR EACH ROW EXECUTE PROCEDURE view_trigger('instead_of_del')
!     instead_of_insert_trig INSTEAD OF INSERT ON main_view FOR EACH ROW EXECUTE PROCEDURE view_trigger('instead_of_ins')
!     instead_of_update_trig INSTEAD OF UPDATE ON main_view FOR EACH ROW EXECUTE PROCEDURE view_trigger('instead_of_upd')
! 
! -- Test dropping view triggers
! DROP TRIGGER instead_of_insert_trig ON main_view;
! DROP TRIGGER instead_of_delete_trig ON main_view;
! \d+ main_view
!                View "public.main_view"
!  Column |  Type   | Modifiers | Storage | Description 
! --------+---------+-----------+---------+-------------
!  a      | integer |           | plain   | 
!  b      | integer |           | plain   | 
! View definition:
!  SELECT main_table.a,
!     main_table.b
!    FROM main_table;
! Triggers:
!     after_del_stmt_trig AFTER DELETE ON main_view FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('after_view_del_stmt')
!     after_ins_stmt_trig AFTER INSERT ON main_view FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('after_view_ins_stmt')
!     after_upd_stmt_trig AFTER UPDATE ON main_view FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('after_view_upd_stmt')
!     before_del_stmt_trig BEFORE DELETE ON main_view FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('before_view_del_stmt')
!     before_ins_stmt_trig BEFORE INSERT ON main_view FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('before_view_ins_stmt')
!     before_upd_stmt_trig BEFORE UPDATE ON main_view FOR EACH STATEMENT EXECUTE PROCEDURE view_trigger('before_view_upd_stmt')
!     instead_of_update_trig INSTEAD OF UPDATE ON main_view FOR EACH ROW EXECUTE PROCEDURE view_trigger('instead_of_upd')
! 
! DROP VIEW main_view;
! --
! -- Test triggers on a join view
! --
! CREATE TABLE country_table (
!     country_id        serial primary key,
!     country_name    text unique not null,
!     continent        text not null
! );
! INSERT INTO country_table (country_name, continent)
!     VALUES ('Japan', 'Asia'),
!            ('UK', 'Europe'),
!            ('USA', 'North America')
!     RETURNING *;
!  country_id | country_name |   continent   
! ------------+--------------+---------------
!           1 | Japan        | Asia
!           2 | UK           | Europe
!           3 | USA          | North America
! (3 rows)
! 
! CREATE TABLE city_table (
!     city_id        serial primary key,
!     city_name    text not null,
!     population    bigint,
!     country_id    int references country_table
! );
! CREATE VIEW city_view AS
!     SELECT city_id, city_name, population, country_name, continent
!     FROM city_table ci
!     LEFT JOIN country_table co ON co.country_id = ci.country_id;
! CREATE FUNCTION city_insert() RETURNS trigger LANGUAGE plpgsql AS $$
! declare
!     ctry_id int;
! begin
!     if NEW.country_name IS NOT NULL then
!         SELECT country_id, continent INTO ctry_id, NEW.continent
!             FROM country_table WHERE country_name = NEW.country_name;
!         if NOT FOUND then
!             raise exception 'No such country: "%"', NEW.country_name;
!         end if;
!     else
!         NEW.continent := NULL;
!     end if;
! 
!     if NEW.city_id IS NOT NULL then
!         INSERT INTO city_table
!             VALUES(NEW.city_id, NEW.city_name, NEW.population, ctry_id);
!     else
!         INSERT INTO city_table(city_name, population, country_id)
!             VALUES(NEW.city_name, NEW.population, ctry_id)
!             RETURNING city_id INTO NEW.city_id;
!     end if;
! 
!     RETURN NEW;
! end;
! $$;
! CREATE TRIGGER city_insert_trig INSTEAD OF INSERT ON city_view
! FOR EACH ROW EXECUTE PROCEDURE city_insert();
! CREATE FUNCTION city_delete() RETURNS trigger LANGUAGE plpgsql AS $$
! begin
!     DELETE FROM city_table WHERE city_id = OLD.city_id;
!     if NOT FOUND then RETURN NULL; end if;
!     RETURN OLD;
! end;
! $$;
! CREATE TRIGGER city_delete_trig INSTEAD OF DELETE ON city_view
! FOR EACH ROW EXECUTE PROCEDURE city_delete();
! CREATE FUNCTION city_update() RETURNS trigger LANGUAGE plpgsql AS $$
! declare
!     ctry_id int;
! begin
!     if NEW.country_name IS DISTINCT FROM OLD.country_name then
!         SELECT country_id, continent INTO ctry_id, NEW.continent
!             FROM country_table WHERE country_name = NEW.country_name;
!         if NOT FOUND then
!             raise exception 'No such country: "%"', NEW.country_name;
!         end if;
! 
!         UPDATE city_table SET city_name = NEW.city_name,
!                               population = NEW.population,
!                               country_id = ctry_id
!             WHERE city_id = OLD.city_id;
!     else
!         UPDATE city_table SET city_name = NEW.city_name,
!                               population = NEW.population
!             WHERE city_id = OLD.city_id;
!         NEW.continent := OLD.continent;
!     end if;
! 
!     if NOT FOUND then RETURN NULL; end if;
!     RETURN NEW;
! end;
! $$;
! CREATE TRIGGER city_update_trig INSTEAD OF UPDATE ON city_view
! FOR EACH ROW EXECUTE PROCEDURE city_update();
! \set QUIET false
! -- INSERT .. RETURNING
! INSERT INTO city_view(city_name) VALUES('Tokyo') RETURNING *;
!  city_id | city_name | population | country_name | continent 
! ---------+-----------+------------+--------------+-----------
!        1 | Tokyo     |            |              | 
! (1 row)
! 
! INSERT 0 1
! INSERT INTO city_view(city_name, population) VALUES('London', 7556900) RETURNING *;
!  city_id | city_name | population | country_name | continent 
! ---------+-----------+------------+--------------+-----------
!        2 | London    |    7556900 |              | 
! (1 row)
! 
! INSERT 0 1
! INSERT INTO city_view(city_name, country_name) VALUES('Washington DC', 'USA') RETURNING *;
!  city_id |   city_name   | population | country_name |   continent   
! ---------+---------------+------------+--------------+---------------
!        3 | Washington DC |            | USA          | North America
! (1 row)
! 
! INSERT 0 1
! INSERT INTO city_view(city_id, city_name) VALUES(123456, 'New York') RETURNING *;
!  city_id | city_name | population | country_name | continent 
! ---------+-----------+------------+--------------+-----------
!   123456 | New York  |            |              | 
! (1 row)
! 
! INSERT 0 1
! INSERT INTO city_view VALUES(234567, 'Birmingham', 1016800, 'UK', 'EU') RETURNING *;
!  city_id | city_name  | population | country_name | continent 
! ---------+------------+------------+--------------+-----------
!   234567 | Birmingham |    1016800 | UK           | Europe
! (1 row)
! 
! INSERT 0 1
! -- UPDATE .. RETURNING
! UPDATE city_view SET country_name = 'Japon' WHERE city_name = 'Tokyo'; -- error
! ERROR:  No such country: "Japon"
! UPDATE city_view SET country_name = 'Japan' WHERE city_name = 'Takyo'; -- no match
! UPDATE 0
! UPDATE city_view SET country_name = 'Japan' WHERE city_name = 'Tokyo' RETURNING *; -- OK
!  city_id | city_name | population | country_name | continent 
! ---------+-----------+------------+--------------+-----------
!        1 | Tokyo     |            | Japan        | Asia
! (1 row)
! 
! UPDATE 1
! UPDATE city_view SET population = 13010279 WHERE city_name = 'Tokyo' RETURNING *;
!  city_id | city_name | population | country_name | continent 
! ---------+-----------+------------+--------------+-----------
!        1 | Tokyo     |   13010279 | Japan        | Asia
! (1 row)
! 
! UPDATE 1
! UPDATE city_view SET country_name = 'UK' WHERE city_name = 'New York' RETURNING *;
!  city_id | city_name | population | country_name | continent 
! ---------+-----------+------------+--------------+-----------
!   123456 | New York  |            | UK           | Europe
! (1 row)
! 
! UPDATE 1
! UPDATE city_view SET country_name = 'USA', population = 8391881 WHERE city_name = 'New York' RETURNING *;
!  city_id | city_name | population | country_name |   continent   
! ---------+-----------+------------+--------------+---------------
!   123456 | New York  |    8391881 | USA          | North America
! (1 row)
! 
! UPDATE 1
! UPDATE city_view SET continent = 'EU' WHERE continent = 'Europe' RETURNING *;
!  city_id | city_name  | population | country_name | continent 
! ---------+------------+------------+--------------+-----------
!   234567 | Birmingham |    1016800 | UK           | Europe
! (1 row)
! 
! UPDATE 1
! UPDATE city_view v1 SET country_name = v2.country_name FROM city_view v2
!     WHERE v2.city_name = 'Birmingham' AND v1.city_name = 'London' RETURNING *;
!  city_id | city_name | population | country_name | continent | city_id | city_name  | population | country_name | continent 
! ---------+-----------+------------+--------------+-----------+---------+------------+------------+--------------+-----------
!        2 | London    |    7556900 | UK           | Europe    |  234567 | Birmingham |    1016800 | UK           | Europe
! (1 row)
! 
! UPDATE 1
! -- DELETE .. RETURNING
! DELETE FROM city_view WHERE city_name = 'Birmingham' RETURNING *;
!  city_id | city_name  | population | country_name | continent 
! ---------+------------+------------+--------------+-----------
!   234567 | Birmingham |    1016800 | UK           | Europe
! (1 row)
! 
! DELETE 1
! \set QUIET true
! -- read-only view with WHERE clause
! CREATE VIEW european_city_view AS
!     SELECT * FROM city_view WHERE continent = 'Europe';
! SELECT count(*) FROM european_city_view;
!  count 
! -------
!      1
! (1 row)
! 
! CREATE FUNCTION no_op_trig_fn() RETURNS trigger LANGUAGE plpgsql
! AS 'begin RETURN NULL; end';
! CREATE TRIGGER no_op_trig INSTEAD OF INSERT OR UPDATE OR DELETE
! ON european_city_view FOR EACH ROW EXECUTE PROCEDURE no_op_trig_fn();
! \set QUIET false
! INSERT INTO european_city_view VALUES (0, 'x', 10000, 'y', 'z');
! INSERT 0 0
! UPDATE european_city_view SET population = 10000;
! UPDATE 0
! DELETE FROM european_city_view;
! DELETE 0
! \set QUIET true
! -- rules bypassing no-op triggers
! CREATE RULE european_city_insert_rule AS ON INSERT TO european_city_view
! DO INSTEAD INSERT INTO city_view
! VALUES (NEW.city_id, NEW.city_name, NEW.population, NEW.country_name, NEW.continent)
! RETURNING *;
! CREATE RULE european_city_update_rule AS ON UPDATE TO european_city_view
! DO INSTEAD UPDATE city_view SET
!     city_name = NEW.city_name,
!     population = NEW.population,
!     country_name = NEW.country_name
! WHERE city_id = OLD.city_id
! RETURNING NEW.*;
! CREATE RULE european_city_delete_rule AS ON DELETE TO european_city_view
! DO INSTEAD DELETE FROM city_view WHERE city_id = OLD.city_id RETURNING *;
! \set QUIET false
! -- INSERT not limited by view's WHERE clause, but UPDATE AND DELETE are
! INSERT INTO european_city_view(city_name, country_name)
!     VALUES ('Cambridge', 'USA') RETURNING *;
!  city_id | city_name | population | country_name |   continent   
! ---------+-----------+------------+--------------+---------------
!        4 | Cambridge |            | USA          | North America
! (1 row)
! 
! INSERT 0 1
! UPDATE european_city_view SET country_name = 'UK'
!     WHERE city_name = 'Cambridge';
! UPDATE 0
! DELETE FROM european_city_view WHERE city_name = 'Cambridge';
! DELETE 0
! -- UPDATE and DELETE via rule and trigger
! UPDATE city_view SET country_name = 'UK'
!     WHERE city_name = 'Cambridge' RETURNING *;
!  city_id | city_name | population | country_name | continent 
! ---------+-----------+------------+--------------+-----------
!        4 | Cambridge |            | UK           | Europe
! (1 row)
! 
! UPDATE 1
! UPDATE european_city_view SET population = 122800
!     WHERE city_name = 'Cambridge' RETURNING *;
!  city_id | city_name | population | country_name | continent 
! ---------+-----------+------------+--------------+-----------
!        4 | Cambridge |     122800 | UK           | Europe
! (1 row)
! 
! UPDATE 1
! DELETE FROM european_city_view WHERE city_name = 'Cambridge' RETURNING *;
!  city_id | city_name | population | country_name | continent 
! ---------+-----------+------------+--------------+-----------
!        4 | Cambridge |     122800 | UK           | Europe
! (1 row)
! 
! DELETE 1
! -- join UPDATE test
! UPDATE city_view v SET population = 599657
!     FROM city_table ci, country_table co
!     WHERE ci.city_name = 'Washington DC' and co.country_name = 'USA'
!     AND v.city_id = ci.city_id AND v.country_name = co.country_name
!     RETURNING co.country_id, v.country_name,
!               v.city_id, v.city_name, v.population;
!  country_id | country_name | city_id |   city_name   | population 
! ------------+--------------+---------+---------------+------------
!           3 | USA          |       3 | Washington DC |     599657
! (1 row)
! 
! UPDATE 1
! \set QUIET true
! SELECT * FROM city_view;
!  city_id |   city_name   | population | country_name |   continent   
! ---------+---------------+------------+--------------+---------------
!        1 | Tokyo         |   13010279 | Japan        | Asia
!   123456 | New York      |    8391881 | USA          | North America
!        2 | London        |    7556900 | UK           | Europe
!        3 | Washington DC |     599657 | USA          | North America
! (4 rows)
! 
! DROP TABLE city_table CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to view city_view
! drop cascades to view european_city_view
! DROP TABLE country_table;
! -- Test pg_trigger_depth()
! create table depth_a (id int not null primary key);
! create table depth_b (id int not null primary key);
! create table depth_c (id int not null primary key);
! create function depth_a_tf() returns trigger
!   language plpgsql as $$
! begin
!   raise notice '%: depth = %', tg_name, pg_trigger_depth();
!   insert into depth_b values (new.id);
!   raise notice '%: depth = %', tg_name, pg_trigger_depth();
!   return new;
! end;
! $$;
! create trigger depth_a_tr before insert on depth_a
!   for each row execute procedure depth_a_tf();
! create function depth_b_tf() returns trigger
!   language plpgsql as $$
! begin
!   raise notice '%: depth = %', tg_name, pg_trigger_depth();
!   begin
!     execute 'insert into depth_c values (' || new.id::text || ')';
!   exception
!     when sqlstate 'U9999' then
!       raise notice 'SQLSTATE = U9999: depth = %', pg_trigger_depth();
!   end;
!   raise notice '%: depth = %', tg_name, pg_trigger_depth();
!   if new.id = 1 then
!     execute 'insert into depth_c values (' || new.id::text || ')';
!   end if;
!   return new;
! end;
! $$;
! create trigger depth_b_tr before insert on depth_b
!   for each row execute procedure depth_b_tf();
! create function depth_c_tf() returns trigger
!   language plpgsql as $$
! begin
!   raise notice '%: depth = %', tg_name, pg_trigger_depth();
!   if new.id = 1 then
!     raise exception sqlstate 'U9999';
!   end if;
!   raise notice '%: depth = %', tg_name, pg_trigger_depth();
!   return new;
! end;
! $$;
! create trigger depth_c_tr before insert on depth_c
!   for each row execute procedure depth_c_tf();
! select pg_trigger_depth();
!  pg_trigger_depth 
! ------------------
!                 0
! (1 row)
! 
! insert into depth_a values (1);
! NOTICE:  depth_a_tr: depth = 1
! NOTICE:  depth_b_tr: depth = 2
! CONTEXT:  SQL statement "insert into depth_b values (new.id)"
! PL/pgSQL function depth_a_tf() line 4 at SQL statement
! NOTICE:  depth_c_tr: depth = 3
! CONTEXT:  SQL statement "insert into depth_c values (1)"
! PL/pgSQL function depth_b_tf() line 5 at EXECUTE statement
! SQL statement "insert into depth_b values (new.id)"
! PL/pgSQL function depth_a_tf() line 4 at SQL statement
! NOTICE:  SQLSTATE = U9999: depth = 2
! CONTEXT:  SQL statement "insert into depth_b values (new.id)"
! PL/pgSQL function depth_a_tf() line 4 at SQL statement
! NOTICE:  depth_b_tr: depth = 2
! CONTEXT:  SQL statement "insert into depth_b values (new.id)"
! PL/pgSQL function depth_a_tf() line 4 at SQL statement
! NOTICE:  depth_c_tr: depth = 3
! CONTEXT:  SQL statement "insert into depth_c values (1)"
! PL/pgSQL function depth_b_tf() line 12 at EXECUTE statement
! SQL statement "insert into depth_b values (new.id)"
! PL/pgSQL function depth_a_tf() line 4 at SQL statement
! ERROR:  U9999
! CONTEXT:  SQL statement "insert into depth_c values (1)"
! PL/pgSQL function depth_b_tf() line 12 at EXECUTE statement
! SQL statement "insert into depth_b values (new.id)"
! PL/pgSQL function depth_a_tf() line 4 at SQL statement
! select pg_trigger_depth();
!  pg_trigger_depth 
! ------------------
!                 0
! (1 row)
! 
! insert into depth_a values (2);
! NOTICE:  depth_a_tr: depth = 1
! NOTICE:  depth_b_tr: depth = 2
! CONTEXT:  SQL statement "insert into depth_b values (new.id)"
! PL/pgSQL function depth_a_tf() line 4 at SQL statement
! NOTICE:  depth_c_tr: depth = 3
! CONTEXT:  SQL statement "insert into depth_c values (2)"
! PL/pgSQL function depth_b_tf() line 5 at EXECUTE statement
! SQL statement "insert into depth_b values (new.id)"
! PL/pgSQL function depth_a_tf() line 4 at SQL statement
! NOTICE:  depth_c_tr: depth = 3
! CONTEXT:  SQL statement "insert into depth_c values (2)"
! PL/pgSQL function depth_b_tf() line 5 at EXECUTE statement
! SQL statement "insert into depth_b values (new.id)"
! PL/pgSQL function depth_a_tf() line 4 at SQL statement
! NOTICE:  depth_b_tr: depth = 2
! CONTEXT:  SQL statement "insert into depth_b values (new.id)"
! PL/pgSQL function depth_a_tf() line 4 at SQL statement
! NOTICE:  depth_a_tr: depth = 1
! select pg_trigger_depth();
!  pg_trigger_depth 
! ------------------
!                 0
! (1 row)
! 
! drop table depth_a, depth_b, depth_c;
! drop function depth_a_tf();
! drop function depth_b_tf();
! drop function depth_c_tf();
! --
! -- Test updates to rows during firing of BEFORE ROW triggers.
! -- As of 9.2, such cases should be rejected (see bug #6123).
! --
! create temp table parent (
!     aid int not null primary key,
!     val1 text,
!     val2 text,
!     val3 text,
!     val4 text,
!     bcnt int not null default 0);
! create temp table child (
!     bid int not null primary key,
!     aid int not null,
!     val1 text);
! create function parent_upd_func()
!   returns trigger language plpgsql as
! $$
! begin
!   if old.val1 <> new.val1 then
!     new.val2 = new.val1;
!     delete from child where child.aid = new.aid and child.val1 = new.val1;
!   end if;
!   return new;
! end;
! $$;
! create trigger parent_upd_trig before update on parent
!   for each row execute procedure parent_upd_func();
! create function parent_del_func()
!   returns trigger language plpgsql as
! $$
! begin
!   delete from child where aid = old.aid;
!   return old;
! end;
! $$;
! create trigger parent_del_trig before delete on parent
!   for each row execute procedure parent_del_func();
! create function child_ins_func()
!   returns trigger language plpgsql as
! $$
! begin
!   update parent set bcnt = bcnt + 1 where aid = new.aid;
!   return new;
! end;
! $$;
! create trigger child_ins_trig after insert on child
!   for each row execute procedure child_ins_func();
! create function child_del_func()
!   returns trigger language plpgsql as
! $$
! begin
!   update parent set bcnt = bcnt - 1 where aid = old.aid;
!   return old;
! end;
! $$;
! create trigger child_del_trig after delete on child
!   for each row execute procedure child_del_func();
! insert into parent values (1, 'a', 'a', 'a', 'a', 0);
! insert into child values (10, 1, 'b');
! select * from parent; select * from child;
!  aid | val1 | val2 | val3 | val4 | bcnt 
! -----+------+------+------+------+------
!    1 | a    | a    | a    | a    |    1
! (1 row)
! 
!  bid | aid | val1 
! -----+-----+------
!   10 |   1 | b
! (1 row)
! 
! update parent set val1 = 'b' where aid = 1; -- should fail
! ERROR:  tuple to be updated was already modified by an operation triggered by the current command
! HINT:  Consider using an AFTER trigger instead of a BEFORE trigger to propagate changes to other rows.
! select * from parent; select * from child;
!  aid | val1 | val2 | val3 | val4 | bcnt 
! -----+------+------+------+------+------
!    1 | a    | a    | a    | a    |    1
! (1 row)
! 
!  bid | aid | val1 
! -----+-----+------
!   10 |   1 | b
! (1 row)
! 
! delete from parent where aid = 1; -- should fail
! ERROR:  tuple to be updated was already modified by an operation triggered by the current command
! HINT:  Consider using an AFTER trigger instead of a BEFORE trigger to propagate changes to other rows.
! select * from parent; select * from child;
!  aid | val1 | val2 | val3 | val4 | bcnt 
! -----+------+------+------+------+------
!    1 | a    | a    | a    | a    |    1
! (1 row)
! 
!  bid | aid | val1 
! -----+-----+------
!   10 |   1 | b
! (1 row)
! 
! -- replace the trigger function with one that restarts the deletion after
! -- having modified a child
! create or replace function parent_del_func()
!   returns trigger language plpgsql as
! $$
! begin
!   delete from child where aid = old.aid;
!   if found then
!     delete from parent where aid = old.aid;
!     return null; -- cancel outer deletion
!   end if;
!   return old;
! end;
! $$;
! delete from parent where aid = 1;
! select * from parent; select * from child;
!  aid | val1 | val2 | val3 | val4 | bcnt 
! -----+------+------+------+------+------
! (0 rows)
! 
!  bid | aid | val1 
! -----+-----+------
! (0 rows)
! 
! drop table parent, child;
! drop function parent_upd_func();
! drop function parent_del_func();
! drop function child_ins_func();
! drop function child_del_func();
! -- similar case, but with a self-referencing FK so that parent and child
! -- rows can be affected by a single operation
! create temp table self_ref_trigger (
!     id int primary key,
!     parent int references self_ref_trigger,
!     data text,
!     nchildren int not null default 0
! );
! create function self_ref_trigger_ins_func()
!   returns trigger language plpgsql as
! $$
! begin
!   if new.parent is not null then
!     update self_ref_trigger set nchildren = nchildren + 1
!       where id = new.parent;
!   end if;
!   return new;
! end;
! $$;
! create trigger self_ref_trigger_ins_trig before insert on self_ref_trigger
!   for each row execute procedure self_ref_trigger_ins_func();
! create function self_ref_trigger_del_func()
!   returns trigger language plpgsql as
! $$
! begin
!   if old.parent is not null then
!     update self_ref_trigger set nchildren = nchildren - 1
!       where id = old.parent;
!   end if;
!   return old;
! end;
! $$;
! create trigger self_ref_trigger_del_trig before delete on self_ref_trigger
!   for each row execute procedure self_ref_trigger_del_func();
! insert into self_ref_trigger values (1, null, 'root');
! insert into self_ref_trigger values (2, 1, 'root child A');
! insert into self_ref_trigger values (3, 1, 'root child B');
! insert into self_ref_trigger values (4, 2, 'grandchild 1');
! insert into self_ref_trigger values (5, 3, 'grandchild 2');
! update self_ref_trigger set data = 'root!' where id = 1;
! select * from self_ref_trigger;
!  id | parent |     data     | nchildren 
! ----+--------+--------------+-----------
!   2 |      1 | root child A |         1
!   4 |      2 | grandchild 1 |         0
!   3 |      1 | root child B |         1
!   5 |      3 | grandchild 2 |         0
!   1 |        | root!        |         2
! (5 rows)
! 
! delete from self_ref_trigger;
! ERROR:  tuple to be updated was already modified by an operation triggered by the current command
! HINT:  Consider using an AFTER trigger instead of a BEFORE trigger to propagate changes to other rows.
! select * from self_ref_trigger;
!  id | parent |     data     | nchildren 
! ----+--------+--------------+-----------
!   2 |      1 | root child A |         1
!   4 |      2 | grandchild 1 |         0
!   3 |      1 | root child B |         1
!   5 |      3 | grandchild 2 |         0
!   1 |        | root!        |         2
! (5 rows)
! 
! drop table self_ref_trigger;
! drop function self_ref_trigger_ins_func();
! drop function self_ref_trigger_del_func();
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/inherit.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/inherit.out	Tue Oct 28 15:53:05 2014
***************
*** 1,1457 ****
! --
! -- Test inheritance features
! --
! CREATE TABLE a (aa TEXT);
! CREATE TABLE b (bb TEXT) INHERITS (a);
! CREATE TABLE c (cc TEXT) INHERITS (a);
! CREATE TABLE d (dd TEXT) INHERITS (b,c,a);
! NOTICE:  merging multiple inherited definitions of column "aa"
! NOTICE:  merging multiple inherited definitions of column "aa"
! INSERT INTO a(aa) VALUES('aaa');
! INSERT INTO a(aa) VALUES('aaaa');
! INSERT INTO a(aa) VALUES('aaaaa');
! INSERT INTO a(aa) VALUES('aaaaaa');
! INSERT INTO a(aa) VALUES('aaaaaaa');
! INSERT INTO a(aa) VALUES('aaaaaaaa');
! INSERT INTO b(aa) VALUES('bbb');
! INSERT INTO b(aa) VALUES('bbbb');
! INSERT INTO b(aa) VALUES('bbbbb');
! INSERT INTO b(aa) VALUES('bbbbbb');
! INSERT INTO b(aa) VALUES('bbbbbbb');
! INSERT INTO b(aa) VALUES('bbbbbbbb');
! INSERT INTO c(aa) VALUES('ccc');
! INSERT INTO c(aa) VALUES('cccc');
! INSERT INTO c(aa) VALUES('ccccc');
! INSERT INTO c(aa) VALUES('cccccc');
! INSERT INTO c(aa) VALUES('ccccccc');
! INSERT INTO c(aa) VALUES('cccccccc');
! INSERT INTO d(aa) VALUES('ddd');
! INSERT INTO d(aa) VALUES('dddd');
! INSERT INTO d(aa) VALUES('ddddd');
! INSERT INTO d(aa) VALUES('dddddd');
! INSERT INTO d(aa) VALUES('ddddddd');
! INSERT INTO d(aa) VALUES('dddddddd');
! SELECT relname, a.* FROM a, pg_class where a.tableoid = pg_class.oid;
!  relname |    aa    
! ---------+----------
!  a       | aaa
!  a       | aaaa
!  a       | aaaaa
!  a       | aaaaaa
!  a       | aaaaaaa
!  a       | aaaaaaaa
!  b       | bbb
!  b       | bbbb
!  b       | bbbbb
!  b       | bbbbbb
!  b       | bbbbbbb
!  b       | bbbbbbbb
!  c       | ccc
!  c       | cccc
!  c       | ccccc
!  c       | cccccc
!  c       | ccccccc
!  c       | cccccccc
!  d       | ddd
!  d       | dddd
!  d       | ddddd
!  d       | dddddd
!  d       | ddddddd
!  d       | dddddddd
! (24 rows)
! 
! SELECT relname, b.* FROM b, pg_class where b.tableoid = pg_class.oid;
!  relname |    aa    | bb 
! ---------+----------+----
!  b       | bbb      | 
!  b       | bbbb     | 
!  b       | bbbbb    | 
!  b       | bbbbbb   | 
!  b       | bbbbbbb  | 
!  b       | bbbbbbbb | 
!  d       | ddd      | 
!  d       | dddd     | 
!  d       | ddddd    | 
!  d       | dddddd   | 
!  d       | ddddddd  | 
!  d       | dddddddd | 
! (12 rows)
! 
! SELECT relname, c.* FROM c, pg_class where c.tableoid = pg_class.oid;
!  relname |    aa    | cc 
! ---------+----------+----
!  c       | ccc      | 
!  c       | cccc     | 
!  c       | ccccc    | 
!  c       | cccccc   | 
!  c       | ccccccc  | 
!  c       | cccccccc | 
!  d       | ddd      | 
!  d       | dddd     | 
!  d       | ddddd    | 
!  d       | dddddd   | 
!  d       | ddddddd  | 
!  d       | dddddddd | 
! (12 rows)
! 
! SELECT relname, d.* FROM d, pg_class where d.tableoid = pg_class.oid;
!  relname |    aa    | bb | cc | dd 
! ---------+----------+----+----+----
!  d       | ddd      |    |    | 
!  d       | dddd     |    |    | 
!  d       | ddddd    |    |    | 
!  d       | dddddd   |    |    | 
!  d       | ddddddd  |    |    | 
!  d       | dddddddd |    |    | 
! (6 rows)
! 
! SELECT relname, a.* FROM ONLY a, pg_class where a.tableoid = pg_class.oid;
!  relname |    aa    
! ---------+----------
!  a       | aaa
!  a       | aaaa
!  a       | aaaaa
!  a       | aaaaaa
!  a       | aaaaaaa
!  a       | aaaaaaaa
! (6 rows)
! 
! SELECT relname, b.* FROM ONLY b, pg_class where b.tableoid = pg_class.oid;
!  relname |    aa    | bb 
! ---------+----------+----
!  b       | bbb      | 
!  b       | bbbb     | 
!  b       | bbbbb    | 
!  b       | bbbbbb   | 
!  b       | bbbbbbb  | 
!  b       | bbbbbbbb | 
! (6 rows)
! 
! SELECT relname, c.* FROM ONLY c, pg_class where c.tableoid = pg_class.oid;
!  relname |    aa    | cc 
! ---------+----------+----
!  c       | ccc      | 
!  c       | cccc     | 
!  c       | ccccc    | 
!  c       | cccccc   | 
!  c       | ccccccc  | 
!  c       | cccccccc | 
! (6 rows)
! 
! SELECT relname, d.* FROM ONLY d, pg_class where d.tableoid = pg_class.oid;
!  relname |    aa    | bb | cc | dd 
! ---------+----------+----+----+----
!  d       | ddd      |    |    | 
!  d       | dddd     |    |    | 
!  d       | ddddd    |    |    | 
!  d       | dddddd   |    |    | 
!  d       | ddddddd  |    |    | 
!  d       | dddddddd |    |    | 
! (6 rows)
! 
! UPDATE a SET aa='zzzz' WHERE aa='aaaa';
! UPDATE ONLY a SET aa='zzzzz' WHERE aa='aaaaa';
! UPDATE b SET aa='zzz' WHERE aa='aaa';
! UPDATE ONLY b SET aa='zzz' WHERE aa='aaa';
! UPDATE a SET aa='zzzzzz' WHERE aa LIKE 'aaa%';
! SELECT relname, a.* FROM a, pg_class where a.tableoid = pg_class.oid;
!  relname |    aa    
! ---------+----------
!  a       | zzzz
!  a       | zzzzz
!  a       | zzzzzz
!  a       | zzzzzz
!  a       | zzzzzz
!  a       | zzzzzz
!  b       | bbb
!  b       | bbbb
!  b       | bbbbb
!  b       | bbbbbb
!  b       | bbbbbbb
!  b       | bbbbbbbb
!  c       | ccc
!  c       | cccc
!  c       | ccccc
!  c       | cccccc
!  c       | ccccccc
!  c       | cccccccc
!  d       | ddd
!  d       | dddd
!  d       | ddddd
!  d       | dddddd
!  d       | ddddddd
!  d       | dddddddd
! (24 rows)
! 
! SELECT relname, b.* FROM b, pg_class where b.tableoid = pg_class.oid;
!  relname |    aa    | bb 
! ---------+----------+----
!  b       | bbb      | 
!  b       | bbbb     | 
!  b       | bbbbb    | 
!  b       | bbbbbb   | 
!  b       | bbbbbbb  | 
!  b       | bbbbbbbb | 
!  d       | ddd      | 
!  d       | dddd     | 
!  d       | ddddd    | 
!  d       | dddddd   | 
!  d       | ddddddd  | 
!  d       | dddddddd | 
! (12 rows)
! 
! SELECT relname, c.* FROM c, pg_class where c.tableoid = pg_class.oid;
!  relname |    aa    | cc 
! ---------+----------+----
!  c       | ccc      | 
!  c       | cccc     | 
!  c       | ccccc    | 
!  c       | cccccc   | 
!  c       | ccccccc  | 
!  c       | cccccccc | 
!  d       | ddd      | 
!  d       | dddd     | 
!  d       | ddddd    | 
!  d       | dddddd   | 
!  d       | ddddddd  | 
!  d       | dddddddd | 
! (12 rows)
! 
! SELECT relname, d.* FROM d, pg_class where d.tableoid = pg_class.oid;
!  relname |    aa    | bb | cc | dd 
! ---------+----------+----+----+----
!  d       | ddd      |    |    | 
!  d       | dddd     |    |    | 
!  d       | ddddd    |    |    | 
!  d       | dddddd   |    |    | 
!  d       | ddddddd  |    |    | 
!  d       | dddddddd |    |    | 
! (6 rows)
! 
! SELECT relname, a.* FROM ONLY a, pg_class where a.tableoid = pg_class.oid;
!  relname |   aa   
! ---------+--------
!  a       | zzzz
!  a       | zzzzz
!  a       | zzzzzz
!  a       | zzzzzz
!  a       | zzzzzz
!  a       | zzzzzz
! (6 rows)
! 
! SELECT relname, b.* FROM ONLY b, pg_class where b.tableoid = pg_class.oid;
!  relname |    aa    | bb 
! ---------+----------+----
!  b       | bbb      | 
!  b       | bbbb     | 
!  b       | bbbbb    | 
!  b       | bbbbbb   | 
!  b       | bbbbbbb  | 
!  b       | bbbbbbbb | 
! (6 rows)
! 
! SELECT relname, c.* FROM ONLY c, pg_class where c.tableoid = pg_class.oid;
!  relname |    aa    | cc 
! ---------+----------+----
!  c       | ccc      | 
!  c       | cccc     | 
!  c       | ccccc    | 
!  c       | cccccc   | 
!  c       | ccccccc  | 
!  c       | cccccccc | 
! (6 rows)
! 
! SELECT relname, d.* FROM ONLY d, pg_class where d.tableoid = pg_class.oid;
!  relname |    aa    | bb | cc | dd 
! ---------+----------+----+----+----
!  d       | ddd      |    |    | 
!  d       | dddd     |    |    | 
!  d       | ddddd    |    |    | 
!  d       | dddddd   |    |    | 
!  d       | ddddddd  |    |    | 
!  d       | dddddddd |    |    | 
! (6 rows)
! 
! UPDATE b SET aa='new';
! SELECT relname, a.* FROM a, pg_class where a.tableoid = pg_class.oid;
!  relname |    aa    
! ---------+----------
!  a       | zzzz
!  a       | zzzzz
!  a       | zzzzzz
!  a       | zzzzzz
!  a       | zzzzzz
!  a       | zzzzzz
!  b       | new
!  b       | new
!  b       | new
!  b       | new
!  b       | new
!  b       | new
!  c       | ccc
!  c       | cccc
!  c       | ccccc
!  c       | cccccc
!  c       | ccccccc
!  c       | cccccccc
!  d       | new
!  d       | new
!  d       | new
!  d       | new
!  d       | new
!  d       | new
! (24 rows)
! 
! SELECT relname, b.* FROM b, pg_class where b.tableoid = pg_class.oid;
!  relname | aa  | bb 
! ---------+-----+----
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  d       | new | 
!  d       | new | 
!  d       | new | 
!  d       | new | 
!  d       | new | 
!  d       | new | 
! (12 rows)
! 
! SELECT relname, c.* FROM c, pg_class where c.tableoid = pg_class.oid;
!  relname |    aa    | cc 
! ---------+----------+----
!  c       | ccc      | 
!  c       | cccc     | 
!  c       | ccccc    | 
!  c       | cccccc   | 
!  c       | ccccccc  | 
!  c       | cccccccc | 
!  d       | new      | 
!  d       | new      | 
!  d       | new      | 
!  d       | new      | 
!  d       | new      | 
!  d       | new      | 
! (12 rows)
! 
! SELECT relname, d.* FROM d, pg_class where d.tableoid = pg_class.oid;
!  relname | aa  | bb | cc | dd 
! ---------+-----+----+----+----
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
! (6 rows)
! 
! SELECT relname, a.* FROM ONLY a, pg_class where a.tableoid = pg_class.oid;
!  relname |   aa   
! ---------+--------
!  a       | zzzz
!  a       | zzzzz
!  a       | zzzzzz
!  a       | zzzzzz
!  a       | zzzzzz
!  a       | zzzzzz
! (6 rows)
! 
! SELECT relname, b.* FROM ONLY b, pg_class where b.tableoid = pg_class.oid;
!  relname | aa  | bb 
! ---------+-----+----
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  b       | new | 
! (6 rows)
! 
! SELECT relname, c.* FROM ONLY c, pg_class where c.tableoid = pg_class.oid;
!  relname |    aa    | cc 
! ---------+----------+----
!  c       | ccc      | 
!  c       | cccc     | 
!  c       | ccccc    | 
!  c       | cccccc   | 
!  c       | ccccccc  | 
!  c       | cccccccc | 
! (6 rows)
! 
! SELECT relname, d.* FROM ONLY d, pg_class where d.tableoid = pg_class.oid;
!  relname | aa  | bb | cc | dd 
! ---------+-----+----+----+----
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
! (6 rows)
! 
! UPDATE a SET aa='new';
! DELETE FROM ONLY c WHERE aa='new';
! SELECT relname, a.* FROM a, pg_class where a.tableoid = pg_class.oid;
!  relname | aa  
! ---------+-----
!  a       | new
!  a       | new
!  a       | new
!  a       | new
!  a       | new
!  a       | new
!  b       | new
!  b       | new
!  b       | new
!  b       | new
!  b       | new
!  b       | new
!  d       | new
!  d       | new
!  d       | new
!  d       | new
!  d       | new
!  d       | new
! (18 rows)
! 
! SELECT relname, b.* FROM b, pg_class where b.tableoid = pg_class.oid;
!  relname | aa  | bb 
! ---------+-----+----
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  d       | new | 
!  d       | new | 
!  d       | new | 
!  d       | new | 
!  d       | new | 
!  d       | new | 
! (12 rows)
! 
! SELECT relname, c.* FROM c, pg_class where c.tableoid = pg_class.oid;
!  relname | aa  | cc 
! ---------+-----+----
!  d       | new | 
!  d       | new | 
!  d       | new | 
!  d       | new | 
!  d       | new | 
!  d       | new | 
! (6 rows)
! 
! SELECT relname, d.* FROM d, pg_class where d.tableoid = pg_class.oid;
!  relname | aa  | bb | cc | dd 
! ---------+-----+----+----+----
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
! (6 rows)
! 
! SELECT relname, a.* FROM ONLY a, pg_class where a.tableoid = pg_class.oid;
!  relname | aa  
! ---------+-----
!  a       | new
!  a       | new
!  a       | new
!  a       | new
!  a       | new
!  a       | new
! (6 rows)
! 
! SELECT relname, b.* FROM ONLY b, pg_class where b.tableoid = pg_class.oid;
!  relname | aa  | bb 
! ---------+-----+----
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  b       | new | 
!  b       | new | 
! (6 rows)
! 
! SELECT relname, c.* FROM ONLY c, pg_class where c.tableoid = pg_class.oid;
!  relname | aa | cc 
! ---------+----+----
! (0 rows)
! 
! SELECT relname, d.* FROM ONLY d, pg_class where d.tableoid = pg_class.oid;
!  relname | aa  | bb | cc | dd 
! ---------+-----+----+----+----
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
!  d       | new |    |    | 
! (6 rows)
! 
! DELETE FROM a;
! SELECT relname, a.* FROM a, pg_class where a.tableoid = pg_class.oid;
!  relname | aa 
! ---------+----
! (0 rows)
! 
! SELECT relname, b.* FROM b, pg_class where b.tableoid = pg_class.oid;
!  relname | aa | bb 
! ---------+----+----
! (0 rows)
! 
! SELECT relname, c.* FROM c, pg_class where c.tableoid = pg_class.oid;
!  relname | aa | cc 
! ---------+----+----
! (0 rows)
! 
! SELECT relname, d.* FROM d, pg_class where d.tableoid = pg_class.oid;
!  relname | aa | bb | cc | dd 
! ---------+----+----+----+----
! (0 rows)
! 
! SELECT relname, a.* FROM ONLY a, pg_class where a.tableoid = pg_class.oid;
!  relname | aa 
! ---------+----
! (0 rows)
! 
! SELECT relname, b.* FROM ONLY b, pg_class where b.tableoid = pg_class.oid;
!  relname | aa | bb 
! ---------+----+----
! (0 rows)
! 
! SELECT relname, c.* FROM ONLY c, pg_class where c.tableoid = pg_class.oid;
!  relname | aa | cc 
! ---------+----+----
! (0 rows)
! 
! SELECT relname, d.* FROM ONLY d, pg_class where d.tableoid = pg_class.oid;
!  relname | aa | bb | cc | dd 
! ---------+----+----+----+----
! (0 rows)
! 
! -- Confirm PRIMARY KEY adds NOT NULL constraint to child table
! CREATE TEMP TABLE z (b TEXT, PRIMARY KEY(aa, b)) inherits (a);
! INSERT INTO z VALUES (NULL, 'text'); -- should fail
! ERROR:  null value in column "aa" violates not-null constraint
! DETAIL:  Failing row contains (null, text).
! -- Check UPDATE with inherited target and an inherited source table
! create temp table foo(f1 int, f2 int);
! create temp table foo2(f3 int) inherits (foo);
! create temp table bar(f1 int, f2 int);
! create temp table bar2(f3 int) inherits (bar);
! insert into foo values(1,1);
! insert into foo values(3,3);
! insert into foo2 values(2,2,2);
! insert into foo2 values(3,3,3);
! insert into bar values(1,1);
! insert into bar values(2,2);
! insert into bar values(3,3);
! insert into bar values(4,4);
! insert into bar2 values(1,1,1);
! insert into bar2 values(2,2,2);
! insert into bar2 values(3,3,3);
! insert into bar2 values(4,4,4);
! update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
! select tableoid::regclass::text as relname, bar.* from bar order by 1,2;
!  relname | f1 | f2  
! ---------+----+-----
!  bar     |  1 | 101
!  bar     |  2 | 102
!  bar     |  3 | 103
!  bar     |  4 |   4
!  bar2    |  1 | 101
!  bar2    |  2 | 102
!  bar2    |  3 | 103
!  bar2    |  4 |   4
! (8 rows)
! 
! -- Check UPDATE with inherited target and an appendrel subquery
! update bar set f2 = f2 + 100
! from
!   ( select f1 from foo union all select f1+3 from foo ) ss
! where bar.f1 = ss.f1;
! select tableoid::regclass::text as relname, bar.* from bar order by 1,2;
!  relname | f1 | f2  
! ---------+----+-----
!  bar     |  1 | 201
!  bar     |  2 | 202
!  bar     |  3 | 203
!  bar     |  4 | 104
!  bar2    |  1 | 201
!  bar2    |  2 | 202
!  bar2    |  3 | 203
!  bar2    |  4 | 104
! (8 rows)
! 
! /* Test multiple inheritance of column defaults */
! CREATE TABLE firstparent (tomorrow date default now()::date + 1);
! CREATE TABLE secondparent (tomorrow date default  now() :: date  +  1);
! CREATE TABLE jointchild () INHERITS (firstparent, secondparent);  -- ok
! NOTICE:  merging multiple inherited definitions of column "tomorrow"
! CREATE TABLE thirdparent (tomorrow date default now()::date - 1);
! CREATE TABLE otherchild () INHERITS (firstparent, thirdparent);  -- not ok
! NOTICE:  merging multiple inherited definitions of column "tomorrow"
! ERROR:  column "tomorrow" inherits conflicting default values
! HINT:  To resolve the conflict, specify a default explicitly.
! CREATE TABLE otherchild (tomorrow date default now())
!   INHERITS (firstparent, thirdparent);  -- ok, child resolves ambiguous default
! NOTICE:  merging multiple inherited definitions of column "tomorrow"
! NOTICE:  merging column "tomorrow" with inherited definition
! DROP TABLE firstparent, secondparent, jointchild, thirdparent, otherchild;
! -- Test changing the type of inherited columns
! insert into d values('test','one','two','three');
! alter table a alter column aa type integer using bit_length(aa);
! select * from d;
!  aa | bb  | cc  |  dd   
! ----+-----+-----+-------
!  32 | one | two | three
! (1 row)
! 
! -- Test non-inheritable parent constraints
! create table p1(ff1 int);
! alter table p1 add constraint p1chk check (ff1 > 0) no inherit;
! alter table p1 add constraint p2chk check (ff1 > 10);
! -- connoinherit should be true for NO INHERIT constraint
! select pc.relname, pgc.conname, pgc.contype, pgc.conislocal, pgc.coninhcount, pgc.connoinherit from pg_class as pc inner join pg_constraint as pgc on (pgc.conrelid = pc.oid) where pc.relname = 'p1' order by 1,2;
!  relname | conname | contype | conislocal | coninhcount | connoinherit 
! ---------+---------+---------+------------+-------------+--------------
!  p1      | p1chk   | c       | t          |           0 | t
!  p1      | p2chk   | c       | t          |           0 | f
! (2 rows)
! 
! -- Test that child does not inherit NO INHERIT constraints
! create table c1 () inherits (p1);
! \d p1
!       Table "public.p1"
!  Column |  Type   | Modifiers 
! --------+---------+-----------
!  ff1    | integer | 
! Check constraints:
!     "p1chk" CHECK (ff1 > 0) NO INHERIT
!     "p2chk" CHECK (ff1 > 10)
! Number of child tables: 1 (Use \d+ to list them.)
! 
! \d c1
!       Table "public.c1"
!  Column |  Type   | Modifiers 
! --------+---------+-----------
!  ff1    | integer | 
! Check constraints:
!     "p2chk" CHECK (ff1 > 10)
! Inherits: p1
! 
! drop table p1 cascade;
! NOTICE:  drop cascades to table c1
! -- Tests for casting between the rowtypes of parent and child
! -- tables. See the pgsql-hackers thread beginning Dec. 4/04
! create table base (i integer);
! create table derived () inherits (base);
! insert into derived (i) values (0);
! select derived::base from derived;
!  derived 
! ---------
!  (0)
! (1 row)
! 
! drop table derived;
! drop table base;
! create table p1(ff1 int);
! create table p2(f1 text);
! create function p2text(p2) returns text as 'select $1.f1' language sql;
! create table c1(f3 int) inherits(p1,p2);
! insert into c1 values(123456789, 'hi', 42);
! select p2text(c1.*) from c1;
!  p2text 
! --------
!  hi
! (1 row)
! 
! drop function p2text(p2);
! drop table c1;
! drop table p2;
! drop table p1;
! CREATE TABLE ac (aa TEXT);
! alter table ac add constraint ac_check check (aa is not null);
! CREATE TABLE bc (bb TEXT) INHERITS (ac);
! select pc.relname, pgc.conname, pgc.contype, pgc.conislocal, pgc.coninhcount, pgc.consrc from pg_class as pc inner join pg_constraint as pgc on (pgc.conrelid = pc.oid) where pc.relname in ('ac', 'bc') order by 1,2;
!  relname | conname  | contype | conislocal | coninhcount |      consrc      
! ---------+----------+---------+------------+-------------+------------------
!  ac      | ac_check | c       | t          |           0 | (aa IS NOT NULL)
!  bc      | ac_check | c       | f          |           1 | (aa IS NOT NULL)
! (2 rows)
! 
! insert into ac (aa) values (NULL);
! ERROR:  new row for relation "ac" violates check constraint "ac_check"
! DETAIL:  Failing row contains (null).
! insert into bc (aa) values (NULL);
! ERROR:  new row for relation "bc" violates check constraint "ac_check"
! DETAIL:  Failing row contains (null, null).
! alter table bc drop constraint ac_check;  -- fail, disallowed
! ERROR:  cannot drop inherited constraint "ac_check" of relation "bc"
! alter table ac drop constraint ac_check;
! select pc.relname, pgc.conname, pgc.contype, pgc.conislocal, pgc.coninhcount, pgc.consrc from pg_class as pc inner join pg_constraint as pgc on (pgc.conrelid = pc.oid) where pc.relname in ('ac', 'bc') order by 1,2;
!  relname | conname | contype | conislocal | coninhcount | consrc 
! ---------+---------+---------+------------+-------------+--------
! (0 rows)
! 
! -- try the unnamed-constraint case
! alter table ac add check (aa is not null);
! select pc.relname, pgc.conname, pgc.contype, pgc.conislocal, pgc.coninhcount, pgc.consrc from pg_class as pc inner join pg_constraint as pgc on (pgc.conrelid = pc.oid) where pc.relname in ('ac', 'bc') order by 1,2;
!  relname |   conname   | contype | conislocal | coninhcount |      consrc      
! ---------+-------------+---------+------------+-------------+------------------
!  ac      | ac_aa_check | c       | t          |           0 | (aa IS NOT NULL)
!  bc      | ac_aa_check | c       | f          |           1 | (aa IS NOT NULL)
! (2 rows)
! 
! insert into ac (aa) values (NULL);
! ERROR:  new row for relation "ac" violates check constraint "ac_aa_check"
! DETAIL:  Failing row contains (null).
! insert into bc (aa) values (NULL);
! ERROR:  new row for relation "bc" violates check constraint "ac_aa_check"
! DETAIL:  Failing row contains (null, null).
! alter table bc drop constraint ac_aa_check;  -- fail, disallowed
! ERROR:  cannot drop inherited constraint "ac_aa_check" of relation "bc"
! alter table ac drop constraint ac_aa_check;
! select pc.relname, pgc.conname, pgc.contype, pgc.conislocal, pgc.coninhcount, pgc.consrc from pg_class as pc inner join pg_constraint as pgc on (pgc.conrelid = pc.oid) where pc.relname in ('ac', 'bc') order by 1,2;
!  relname | conname | contype | conislocal | coninhcount | consrc 
! ---------+---------+---------+------------+-------------+--------
! (0 rows)
! 
! alter table ac add constraint ac_check check (aa is not null);
! alter table bc no inherit ac;
! select pc.relname, pgc.conname, pgc.contype, pgc.conislocal, pgc.coninhcount, pgc.consrc from pg_class as pc inner join pg_constraint as pgc on (pgc.conrelid = pc.oid) where pc.relname in ('ac', 'bc') order by 1,2;
!  relname | conname  | contype | conislocal | coninhcount |      consrc      
! ---------+----------+---------+------------+-------------+------------------
!  ac      | ac_check | c       | t          |           0 | (aa IS NOT NULL)
!  bc      | ac_check | c       | t          |           0 | (aa IS NOT NULL)
! (2 rows)
! 
! alter table bc drop constraint ac_check;
! select pc.relname, pgc.conname, pgc.contype, pgc.conislocal, pgc.coninhcount, pgc.consrc from pg_class as pc inner join pg_constraint as pgc on (pgc.conrelid = pc.oid) where pc.relname in ('ac', 'bc') order by 1,2;
!  relname | conname  | contype | conislocal | coninhcount |      consrc      
! ---------+----------+---------+------------+-------------+------------------
!  ac      | ac_check | c       | t          |           0 | (aa IS NOT NULL)
! (1 row)
! 
! alter table ac drop constraint ac_check;
! select pc.relname, pgc.conname, pgc.contype, pgc.conislocal, pgc.coninhcount, pgc.consrc from pg_class as pc inner join pg_constraint as pgc on (pgc.conrelid = pc.oid) where pc.relname in ('ac', 'bc') order by 1,2;
!  relname | conname | contype | conislocal | coninhcount | consrc 
! ---------+---------+---------+------------+-------------+--------
! (0 rows)
! 
! drop table bc;
! drop table ac;
! create table ac (a int constraint check_a check (a <> 0));
! create table bc (a int constraint check_a check (a <> 0), b int constraint check_b check (b <> 0)) inherits (ac);
! NOTICE:  merging column "a" with inherited definition
! NOTICE:  merging constraint "check_a" with inherited definition
! select pc.relname, pgc.conname, pgc.contype, pgc.conislocal, pgc.coninhcount, pgc.consrc from pg_class as pc inner join pg_constraint as pgc on (pgc.conrelid = pc.oid) where pc.relname in ('ac', 'bc') order by 1,2;
!  relname | conname | contype | conislocal | coninhcount |  consrc  
! ---------+---------+---------+------------+-------------+----------
!  ac      | check_a | c       | t          |           0 | (a <> 0)
!  bc      | check_a | c       | t          |           1 | (a <> 0)
!  bc      | check_b | c       | t          |           0 | (b <> 0)
! (3 rows)
! 
! drop table bc;
! drop table ac;
! create table ac (a int constraint check_a check (a <> 0));
! create table bc (b int constraint check_b check (b <> 0));
! create table cc (c int constraint check_c check (c <> 0)) inherits (ac, bc);
! select pc.relname, pgc.conname, pgc.contype, pgc.conislocal, pgc.coninhcount, pgc.consrc from pg_class as pc inner join pg_constraint as pgc on (pgc.conrelid = pc.oid) where pc.relname in ('ac', 'bc', 'cc') order by 1,2;
!  relname | conname | contype | conislocal | coninhcount |  consrc  
! ---------+---------+---------+------------+-------------+----------
!  ac      | check_a | c       | t          |           0 | (a <> 0)
!  bc      | check_b | c       | t          |           0 | (b <> 0)
!  cc      | check_a | c       | f          |           1 | (a <> 0)
!  cc      | check_b | c       | f          |           1 | (b <> 0)
!  cc      | check_c | c       | t          |           0 | (c <> 0)
! (5 rows)
! 
! alter table cc no inherit bc;
! select pc.relname, pgc.conname, pgc.contype, pgc.conislocal, pgc.coninhcount, pgc.consrc from pg_class as pc inner join pg_constraint as pgc on (pgc.conrelid = pc.oid) where pc.relname in ('ac', 'bc', 'cc') order by 1,2;
!  relname | conname | contype | conislocal | coninhcount |  consrc  
! ---------+---------+---------+------------+-------------+----------
!  ac      | check_a | c       | t          |           0 | (a <> 0)
!  bc      | check_b | c       | t          |           0 | (b <> 0)
!  cc      | check_a | c       | f          |           1 | (a <> 0)
!  cc      | check_b | c       | t          |           0 | (b <> 0)
!  cc      | check_c | c       | t          |           0 | (c <> 0)
! (5 rows)
! 
! drop table cc;
! drop table bc;
! drop table ac;
! create table p1(f1 int);
! create table p2(f2 int);
! create table c1(f3 int) inherits(p1,p2);
! insert into c1 values(1,-1,2);
! alter table p2 add constraint cc check (f2>0);  -- fail
! ERROR:  check constraint "cc" is violated by some row
! alter table p2 add check (f2>0);  -- check it without a name, too
! ERROR:  check constraint "p2_f2_check" is violated by some row
! delete from c1;
! insert into c1 values(1,1,2);
! alter table p2 add check (f2>0);
! insert into c1 values(1,-1,2);  -- fail
! ERROR:  new row for relation "c1" violates check constraint "p2_f2_check"
! DETAIL:  Failing row contains (1, -1, 2).
! create table c2(f3 int) inherits(p1,p2);
! \d c2
!       Table "public.c2"
!  Column |  Type   | Modifiers 
! --------+---------+-----------
!  f1     | integer | 
!  f2     | integer | 
!  f3     | integer | 
! Check constraints:
!     "p2_f2_check" CHECK (f2 > 0)
! Inherits: p1,
!           p2
! 
! create table c3 (f4 int) inherits(c1,c2);
! NOTICE:  merging multiple inherited definitions of column "f1"
! NOTICE:  merging multiple inherited definitions of column "f2"
! NOTICE:  merging multiple inherited definitions of column "f3"
! \d c3
!       Table "public.c3"
!  Column |  Type   | Modifiers 
! --------+---------+-----------
!  f1     | integer | 
!  f2     | integer | 
!  f3     | integer | 
!  f4     | integer | 
! Check constraints:
!     "p2_f2_check" CHECK (f2 > 0)
! Inherits: c1,
!           c2
! 
! drop table p1 cascade;
! NOTICE:  drop cascades to 3 other objects
! DETAIL:  drop cascades to table c1
! drop cascades to table c2
! drop cascades to table c3
! drop table p2 cascade;
! create table pp1 (f1 int);
! create table cc1 (f2 text, f3 int) inherits (pp1);
! alter table pp1 add column a1 int check (a1 > 0);
! \d cc1
!       Table "public.cc1"
!  Column |  Type   | Modifiers 
! --------+---------+-----------
!  f1     | integer | 
!  f2     | text    | 
!  f3     | integer | 
!  a1     | integer | 
! Check constraints:
!     "pp1_a1_check" CHECK (a1 > 0)
! Inherits: pp1
! 
! create table cc2(f4 float) inherits(pp1,cc1);
! NOTICE:  merging multiple inherited definitions of column "f1"
! NOTICE:  merging multiple inherited definitions of column "a1"
! \d cc2
!           Table "public.cc2"
!  Column |       Type       | Modifiers 
! --------+------------------+-----------
!  f1     | integer          | 
!  a1     | integer          | 
!  f2     | text             | 
!  f3     | integer          | 
!  f4     | double precision | 
! Check constraints:
!     "pp1_a1_check" CHECK (a1 > 0)
! Inherits: pp1,
!           cc1
! 
! alter table pp1 add column a2 int check (a2 > 0);
! NOTICE:  merging definition of column "a2" for child "cc2"
! NOTICE:  merging constraint "pp1_a2_check" with inherited definition
! \d cc2
!           Table "public.cc2"
!  Column |       Type       | Modifiers 
! --------+------------------+-----------
!  f1     | integer          | 
!  a1     | integer          | 
!  f2     | text             | 
!  f3     | integer          | 
!  f4     | double precision | 
!  a2     | integer          | 
! Check constraints:
!     "pp1_a1_check" CHECK (a1 > 0)
!     "pp1_a2_check" CHECK (a2 > 0)
! Inherits: pp1,
!           cc1
! 
! drop table pp1 cascade;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to table cc1
! drop cascades to table cc2
! -- Test for renaming in simple multiple inheritance
! CREATE TABLE inht1 (a int, b int);
! CREATE TABLE inhs1 (b int, c int);
! CREATE TABLE inhts (d int) INHERITS (inht1, inhs1);
! NOTICE:  merging multiple inherited definitions of column "b"
! ALTER TABLE inht1 RENAME a TO aa;
! ALTER TABLE inht1 RENAME b TO bb;                -- to be failed
! ERROR:  cannot rename inherited column "b"
! ALTER TABLE inhts RENAME aa TO aaa;      -- to be failed
! ERROR:  cannot rename inherited column "aa"
! ALTER TABLE inhts RENAME d TO dd;
! \d+ inhts
!                         Table "public.inhts"
!  Column |  Type   | Modifiers | Storage | Stats target | Description 
! --------+---------+-----------+---------+--------------+-------------
!  aa     | integer |           | plain   |              | 
!  b      | integer |           | plain   |              | 
!  c      | integer |           | plain   |              | 
!  dd     | integer |           | plain   |              | 
! Inherits: inht1,
!           inhs1
! 
! DROP TABLE inhts;
! -- Test for renaming in diamond inheritance
! CREATE TABLE inht2 (x int) INHERITS (inht1);
! CREATE TABLE inht3 (y int) INHERITS (inht1);
! CREATE TABLE inht4 (z int) INHERITS (inht2, inht3);
! NOTICE:  merging multiple inherited definitions of column "aa"
! NOTICE:  merging multiple inherited definitions of column "b"
! ALTER TABLE inht1 RENAME aa TO aaa;
! \d+ inht4
!                         Table "public.inht4"
!  Column |  Type   | Modifiers | Storage | Stats target | Description 
! --------+---------+-----------+---------+--------------+-------------
!  aaa    | integer |           | plain   |              | 
!  b      | integer |           | plain   |              | 
!  x      | integer |           | plain   |              | 
!  y      | integer |           | plain   |              | 
!  z      | integer |           | plain   |              | 
! Inherits: inht2,
!           inht3
! 
! CREATE TABLE inhts (d int) INHERITS (inht2, inhs1);
! NOTICE:  merging multiple inherited definitions of column "b"
! ALTER TABLE inht1 RENAME aaa TO aaaa;
! ALTER TABLE inht1 RENAME b TO bb;                -- to be failed
! ERROR:  cannot rename inherited column "b"
! \d+ inhts
!                         Table "public.inhts"
!  Column |  Type   | Modifiers | Storage | Stats target | Description 
! --------+---------+-----------+---------+--------------+-------------
!  aaaa   | integer |           | plain   |              | 
!  b      | integer |           | plain   |              | 
!  x      | integer |           | plain   |              | 
!  c      | integer |           | plain   |              | 
!  d      | integer |           | plain   |              | 
! Inherits: inht2,
!           inhs1
! 
! WITH RECURSIVE r AS (
!   SELECT 'inht1'::regclass AS inhrelid
! UNION ALL
!   SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent
! )
! SELECT a.attrelid::regclass, a.attname, a.attinhcount, e.expected
!   FROM (SELECT inhrelid, count(*) AS expected FROM pg_inherits
!         WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid) e
!   JOIN pg_attribute a ON e.inhrelid = a.attrelid WHERE NOT attislocal
!   ORDER BY a.attrelid::regclass::name, a.attnum;
!  attrelid | attname | attinhcount | expected 
! ----------+---------+-------------+----------
!  inht2    | aaaa    |           1 |        1
!  inht2    | b       |           1 |        1
!  inht3    | aaaa    |           1 |        1
!  inht3    | b       |           1 |        1
!  inht4    | aaaa    |           2 |        2
!  inht4    | b       |           2 |        2
!  inht4    | x       |           1 |        2
!  inht4    | y       |           1 |        2
!  inhts    | aaaa    |           1 |        1
!  inhts    | b       |           2 |        1
!  inhts    | x       |           1 |        1
!  inhts    | c       |           1 |        1
! (12 rows)
! 
! DROP TABLE inht1, inhs1 CASCADE;
! NOTICE:  drop cascades to 4 other objects
! DETAIL:  drop cascades to table inht2
! drop cascades to table inhts
! drop cascades to table inht3
! drop cascades to table inht4
! -- Test non-inheritable indices [UNIQUE, EXCLUDE] constraints
! CREATE TABLE test_constraints (id int, val1 varchar, val2 int, UNIQUE(val1, val2));
! CREATE TABLE test_constraints_inh () INHERITS (test_constraints);
! \d+ test_constraints
!                         Table "public.test_constraints"
!  Column |       Type        | Modifiers | Storage  | Stats target | Description 
! --------+-------------------+-----------+----------+--------------+-------------
!  id     | integer           |           | plain    |              | 
!  val1   | character varying |           | extended |              | 
!  val2   | integer           |           | plain    |              | 
! Indexes:
!     "test_constraints_val1_val2_key" UNIQUE CONSTRAINT, btree (val1, val2)
! Child tables: test_constraints_inh
! 
! ALTER TABLE ONLY test_constraints DROP CONSTRAINT test_constraints_val1_val2_key;
! \d+ test_constraints
!                         Table "public.test_constraints"
!  Column |       Type        | Modifiers | Storage  | Stats target | Description 
! --------+-------------------+-----------+----------+--------------+-------------
!  id     | integer           |           | plain    |              | 
!  val1   | character varying |           | extended |              | 
!  val2   | integer           |           | plain    |              | 
! Child tables: test_constraints_inh
! 
! \d+ test_constraints_inh
!                       Table "public.test_constraints_inh"
!  Column |       Type        | Modifiers | Storage  | Stats target | Description 
! --------+-------------------+-----------+----------+--------------+-------------
!  id     | integer           |           | plain    |              | 
!  val1   | character varying |           | extended |              | 
!  val2   | integer           |           | plain    |              | 
! Inherits: test_constraints
! 
! DROP TABLE test_constraints_inh;
! DROP TABLE test_constraints;
! CREATE TABLE test_ex_constraints (
!     c circle,
!     EXCLUDE USING gist (c WITH &&)
! );
! CREATE TABLE test_ex_constraints_inh () INHERITS (test_ex_constraints);
! \d+ test_ex_constraints
!                  Table "public.test_ex_constraints"
!  Column |  Type  | Modifiers | Storage | Stats target | Description 
! --------+--------+-----------+---------+--------------+-------------
!  c      | circle |           | plain   |              | 
! Indexes:
!     "test_ex_constraints_c_excl" EXCLUDE USING gist (c WITH &&)
! Child tables: test_ex_constraints_inh
! 
! ALTER TABLE test_ex_constraints DROP CONSTRAINT test_ex_constraints_c_excl;
! \d+ test_ex_constraints
!                  Table "public.test_ex_constraints"
!  Column |  Type  | Modifiers | Storage | Stats target | Description 
! --------+--------+-----------+---------+--------------+-------------
!  c      | circle |           | plain   |              | 
! Child tables: test_ex_constraints_inh
! 
! \d+ test_ex_constraints_inh
!                Table "public.test_ex_constraints_inh"
!  Column |  Type  | Modifiers | Storage | Stats target | Description 
! --------+--------+-----------+---------+--------------+-------------
!  c      | circle |           | plain   |              | 
! Inherits: test_ex_constraints
! 
! DROP TABLE test_ex_constraints_inh;
! DROP TABLE test_ex_constraints;
! -- Test non-inheritable foreign key constraints
! CREATE TABLE test_primary_constraints(id int PRIMARY KEY);
! CREATE TABLE test_foreign_constraints(id1 int REFERENCES test_primary_constraints(id));
! CREATE TABLE test_foreign_constraints_inh () INHERITS (test_foreign_constraints);
! \d+ test_primary_constraints
!                Table "public.test_primary_constraints"
!  Column |  Type   | Modifiers | Storage | Stats target | Description 
! --------+---------+-----------+---------+--------------+-------------
!  id     | integer | not null  | plain   |              | 
! Indexes:
!     "test_primary_constraints_pkey" PRIMARY KEY, btree (id)
! Referenced by:
!     TABLE "test_foreign_constraints" CONSTRAINT "test_foreign_constraints_id1_fkey" FOREIGN KEY (id1) REFERENCES test_primary_constraints(id)
! 
! \d+ test_foreign_constraints
!                Table "public.test_foreign_constraints"
!  Column |  Type   | Modifiers | Storage | Stats target | Description 
! --------+---------+-----------+---------+--------------+-------------
!  id1    | integer |           | plain   |              | 
! Foreign-key constraints:
!     "test_foreign_constraints_id1_fkey" FOREIGN KEY (id1) REFERENCES test_primary_constraints(id)
! Child tables: test_foreign_constraints_inh
! 
! ALTER TABLE test_foreign_constraints DROP CONSTRAINT test_foreign_constraints_id1_fkey;
! \d+ test_foreign_constraints
!                Table "public.test_foreign_constraints"
!  Column |  Type   | Modifiers | Storage | Stats target | Description 
! --------+---------+-----------+---------+--------------+-------------
!  id1    | integer |           | plain   |              | 
! Child tables: test_foreign_constraints_inh
! 
! \d+ test_foreign_constraints_inh
!              Table "public.test_foreign_constraints_inh"
!  Column |  Type   | Modifiers | Storage | Stats target | Description 
! --------+---------+-----------+---------+--------------+-------------
!  id1    | integer |           | plain   |              | 
! Inherits: test_foreign_constraints
! 
! DROP TABLE test_foreign_constraints_inh;
! DROP TABLE test_foreign_constraints;
! DROP TABLE test_primary_constraints;
! --
! -- Test parameterized append plans for inheritance trees
! --
! create temp table patest0 (id, x) as
!   select x, x from generate_series(0,1000) x;
! create temp table patest1() inherits (patest0);
! insert into patest1
!   select x, x from generate_series(0,1000) x;
! create temp table patest2() inherits (patest0);
! insert into patest2
!   select x, x from generate_series(0,1000) x;
! create index patest0i on patest0(id);
! create index patest1i on patest1(id);
! create index patest2i on patest2(id);
! analyze patest0;
! analyze patest1;
! analyze patest2;
! explain (costs off)
! select * from patest0 join (select f1 from int4_tbl limit 1) ss on id = f1;
!                     QUERY PLAN                    
! --------------------------------------------------
!  Nested Loop
!    ->  Limit
!          ->  Seq Scan on int4_tbl
!    ->  Append
!          ->  Index Scan using patest0i on patest0
!                Index Cond: (id = int4_tbl.f1)
!          ->  Index Scan using patest1i on patest1
!                Index Cond: (id = int4_tbl.f1)
!          ->  Index Scan using patest2i on patest2
!                Index Cond: (id = int4_tbl.f1)
! (10 rows)
! 
! select * from patest0 join (select f1 from int4_tbl limit 1) ss on id = f1;
!  id | x | f1 
! ----+---+----
!   0 | 0 |  0
!   0 | 0 |  0
!   0 | 0 |  0
! (3 rows)
! 
! drop index patest2i;
! explain (costs off)
! select * from patest0 join (select f1 from int4_tbl limit 1) ss on id = f1;
!                     QUERY PLAN                    
! --------------------------------------------------
!  Nested Loop
!    ->  Limit
!          ->  Seq Scan on int4_tbl
!    ->  Append
!          ->  Index Scan using patest0i on patest0
!                Index Cond: (id = int4_tbl.f1)
!          ->  Index Scan using patest1i on patest1
!                Index Cond: (id = int4_tbl.f1)
!          ->  Seq Scan on patest2
!                Filter: (int4_tbl.f1 = id)
! (10 rows)
! 
! select * from patest0 join (select f1 from int4_tbl limit 1) ss on id = f1;
!  id | x | f1 
! ----+---+----
!   0 | 0 |  0
!   0 | 0 |  0
!   0 | 0 |  0
! (3 rows)
! 
! drop table patest0 cascade;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to table patest1
! drop cascades to table patest2
! --
! -- Test merge-append plans for inheritance trees
! --
! create table matest0 (id serial primary key, name text);
! create table matest1 (id integer primary key) inherits (matest0);
! NOTICE:  merging column "id" with inherited definition
! create table matest2 (id integer primary key) inherits (matest0);
! NOTICE:  merging column "id" with inherited definition
! create table matest3 (id integer primary key) inherits (matest0);
! NOTICE:  merging column "id" with inherited definition
! create index matest0i on matest0 ((1-id));
! create index matest1i on matest1 ((1-id));
! -- create index matest2i on matest2 ((1-id));  -- intentionally missing
! create index matest3i on matest3 ((1-id));
! insert into matest1 (name) values ('Test 1');
! insert into matest1 (name) values ('Test 2');
! insert into matest2 (name) values ('Test 3');
! insert into matest2 (name) values ('Test 4');
! insert into matest3 (name) values ('Test 5');
! insert into matest3 (name) values ('Test 6');
! set enable_indexscan = off;  -- force use of seqscan/sort, so no merge
! explain (verbose, costs off) select * from matest0 order by 1-id;
!                          QUERY PLAN                         
! ------------------------------------------------------------
!  Sort
!    Output: matest0.id, matest0.name, ((1 - matest0.id))
!    Sort Key: ((1 - matest0.id))
!    ->  Result
!          Output: matest0.id, matest0.name, (1 - matest0.id)
!          ->  Append
!                ->  Seq Scan on public.matest0
!                      Output: matest0.id, matest0.name
!                ->  Seq Scan on public.matest1
!                      Output: matest1.id, matest1.name
!                ->  Seq Scan on public.matest2
!                      Output: matest2.id, matest2.name
!                ->  Seq Scan on public.matest3
!                      Output: matest3.id, matest3.name
! (14 rows)
! 
! select * from matest0 order by 1-id;
!  id |  name  
! ----+--------
!   6 | Test 6
!   5 | Test 5
!   4 | Test 4
!   3 | Test 3
!   2 | Test 2
!   1 | Test 1
! (6 rows)
! 
! explain (verbose, costs off) select min(1-id) from matest0;
!                QUERY PLAN               
! ----------------------------------------
!  Aggregate
!    Output: min((1 - matest0.id))
!    ->  Append
!          ->  Seq Scan on public.matest0
!                Output: matest0.id
!          ->  Seq Scan on public.matest1
!                Output: matest1.id
!          ->  Seq Scan on public.matest2
!                Output: matest2.id
!          ->  Seq Scan on public.matest3
!                Output: matest3.id
! (11 rows)
! 
! select min(1-id) from matest0;
!  min 
! -----
!   -5
! (1 row)
! 
! reset enable_indexscan;
! set enable_seqscan = off;  -- plan with fewest seqscans should be merge
! explain (verbose, costs off) select * from matest0 order by 1-id;
!                             QUERY PLAN                            
! ------------------------------------------------------------------
!  Merge Append
!    Sort Key: ((1 - matest0.id))
!    ->  Index Scan using matest0i on public.matest0
!          Output: matest0.id, matest0.name, (1 - matest0.id)
!    ->  Index Scan using matest1i on public.matest1
!          Output: matest1.id, matest1.name, (1 - matest1.id)
!    ->  Sort
!          Output: matest2.id, matest2.name, ((1 - matest2.id))
!          Sort Key: ((1 - matest2.id))
!          ->  Seq Scan on public.matest2
!                Output: matest2.id, matest2.name, (1 - matest2.id)
!    ->  Index Scan using matest3i on public.matest3
!          Output: matest3.id, matest3.name, (1 - matest3.id)
! (13 rows)
! 
! select * from matest0 order by 1-id;
!  id |  name  
! ----+--------
!   6 | Test 6
!   5 | Test 5
!   4 | Test 4
!   3 | Test 3
!   2 | Test 2
!   1 | Test 1
! (6 rows)
! 
! explain (verbose, costs off) select min(1-id) from matest0;
!                                 QUERY PLAN                                
! --------------------------------------------------------------------------
!  Result
!    Output: $0
!    InitPlan 1 (returns $0)
!      ->  Limit
!            Output: ((1 - matest0.id))
!            ->  Result
!                  Output: ((1 - matest0.id))
!                  ->  Merge Append
!                        Sort Key: ((1 - matest0.id))
!                        ->  Index Scan using matest0i on public.matest0
!                              Output: matest0.id, (1 - matest0.id)
!                              Index Cond: ((1 - matest0.id) IS NOT NULL)
!                        ->  Index Scan using matest1i on public.matest1
!                              Output: matest1.id, (1 - matest1.id)
!                              Index Cond: ((1 - matest1.id) IS NOT NULL)
!                        ->  Sort
!                              Output: matest2.id, ((1 - matest2.id))
!                              Sort Key: ((1 - matest2.id))
!                              ->  Bitmap Heap Scan on public.matest2
!                                    Output: matest2.id, (1 - matest2.id)
!                                    Filter: ((1 - matest2.id) IS NOT NULL)
!                                    ->  Bitmap Index Scan on matest2_pkey
!                        ->  Index Scan using matest3i on public.matest3
!                              Output: matest3.id, (1 - matest3.id)
!                              Index Cond: ((1 - matest3.id) IS NOT NULL)
! (25 rows)
! 
! select min(1-id) from matest0;
!  min 
! -----
!   -5
! (1 row)
! 
! reset enable_seqscan;
! drop table matest0 cascade;
! NOTICE:  drop cascades to 3 other objects
! DETAIL:  drop cascades to table matest1
! drop cascades to table matest2
! drop cascades to table matest3
! --
! -- Test merge-append for UNION ALL append relations
! --
! set enable_seqscan = off;
! set enable_indexscan = on;
! set enable_bitmapscan = off;
! -- Check handling of duplicated, constant, or volatile targetlist items
! explain (costs off)
! SELECT thousand, tenthous FROM tenk1
! UNION ALL
! SELECT thousand, thousand FROM tenk1
! ORDER BY thousand, tenthous;
!                                QUERY PLAN                                
! -------------------------------------------------------------------------
!  Merge Append
!    Sort Key: tenk1.thousand, tenk1.tenthous
!    ->  Index Only Scan using tenk1_thous_tenthous on tenk1
!    ->  Sort
!          Sort Key: tenk1_1.thousand, tenk1_1.thousand
!          ->  Index Only Scan using tenk1_thous_tenthous on tenk1 tenk1_1
! (6 rows)
! 
! explain (costs off)
! SELECT thousand, tenthous, thousand+tenthous AS x FROM tenk1
! UNION ALL
! SELECT 42, 42, hundred FROM tenk1
! ORDER BY thousand, tenthous;
!                             QUERY PLAN                            
! ------------------------------------------------------------------
!  Merge Append
!    Sort Key: tenk1.thousand, tenk1.tenthous
!    ->  Index Only Scan using tenk1_thous_tenthous on tenk1
!    ->  Sort
!          Sort Key: (42), (42)
!          ->  Index Only Scan using tenk1_hundred on tenk1 tenk1_1
! (6 rows)
! 
! explain (costs off)
! SELECT thousand, tenthous FROM tenk1
! UNION ALL
! SELECT thousand, random()::integer FROM tenk1
! ORDER BY thousand, tenthous;
!                                QUERY PLAN                                
! -------------------------------------------------------------------------
!  Merge Append
!    Sort Key: tenk1.thousand, tenk1.tenthous
!    ->  Index Only Scan using tenk1_thous_tenthous on tenk1
!    ->  Sort
!          Sort Key: tenk1_1.thousand, ((random())::integer)
!          ->  Index Only Scan using tenk1_thous_tenthous on tenk1 tenk1_1
! (6 rows)
! 
! -- Check min/max aggregate optimization
! explain (costs off)
! SELECT min(x) FROM
!   (SELECT unique1 AS x FROM tenk1 a
!    UNION ALL
!    SELECT unique2 AS x FROM tenk1 b) s;
!                              QUERY PLAN                             
! --------------------------------------------------------------------
!  Result
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Merge Append
!                  Sort Key: a.unique1
!                  ->  Index Only Scan using tenk1_unique1 on tenk1 a
!                        Index Cond: (unique1 IS NOT NULL)
!                  ->  Index Only Scan using tenk1_unique2 on tenk1 b
!                        Index Cond: (unique2 IS NOT NULL)
! (9 rows)
! 
! explain (costs off)
! SELECT min(y) FROM
!   (SELECT unique1 AS x, unique1 AS y FROM tenk1 a
!    UNION ALL
!    SELECT unique2 AS x, unique2 AS y FROM tenk1 b) s;
!                              QUERY PLAN                             
! --------------------------------------------------------------------
!  Result
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Merge Append
!                  Sort Key: a.unique1
!                  ->  Index Only Scan using tenk1_unique1 on tenk1 a
!                        Index Cond: (unique1 IS NOT NULL)
!                  ->  Index Only Scan using tenk1_unique2 on tenk1 b
!                        Index Cond: (unique2 IS NOT NULL)
! (9 rows)
! 
! -- XXX planner doesn't recognize that index on unique2 is sufficiently sorted
! explain (costs off)
! SELECT x, y FROM
!   (SELECT thousand AS x, tenthous AS y FROM tenk1 a
!    UNION ALL
!    SELECT unique2 AS x, unique2 AS y FROM tenk1 b) s
! ORDER BY x, y;
!                          QUERY PLAN                          
! -------------------------------------------------------------
!  Merge Append
!    Sort Key: a.thousand, a.tenthous
!    ->  Index Only Scan using tenk1_thous_tenthous on tenk1 a
!    ->  Sort
!          Sort Key: b.unique2, b.unique2
!          ->  Index Only Scan using tenk1_unique2 on tenk1 b
! (6 rows)
! 
! -- exercise rescan code path via a repeatedly-evaluated subquery
! explain (costs off)
! SELECT
!     ARRAY(SELECT f.i FROM (
!         (SELECT d + g.i FROM generate_series(4, 30, 3) d ORDER BY 1)
!         UNION ALL
!         (SELECT d + g.i FROM generate_series(0, 30, 5) d ORDER BY 1)
!     ) f(i)
!     ORDER BY f.i LIMIT 10)
! FROM generate_series(1, 3) g(i);
!                            QUERY PLAN                           
! ----------------------------------------------------------------
!  Function Scan on generate_series g
!    SubPlan 1
!      ->  Limit
!            ->  Merge Append
!                  Sort Key: ((d.d + g.i))
!                  ->  Sort
!                        Sort Key: ((d.d + g.i))
!                        ->  Function Scan on generate_series d
!                  ->  Sort
!                        Sort Key: ((d_1.d + g.i))
!                        ->  Function Scan on generate_series d_1
! (11 rows)
! 
! SELECT
!     ARRAY(SELECT f.i FROM (
!         (SELECT d + g.i FROM generate_series(4, 30, 3) d ORDER BY 1)
!         UNION ALL
!         (SELECT d + g.i FROM generate_series(0, 30, 5) d ORDER BY 1)
!     ) f(i)
!     ORDER BY f.i LIMIT 10)
! FROM generate_series(1, 3) g(i);
!             array             
! ------------------------------
!  {1,5,6,8,11,11,14,16,17,20}
!  {2,6,7,9,12,12,15,17,18,21}
!  {3,7,8,10,13,13,16,18,19,22}
! (3 rows)
! 
! reset enable_seqscan;
! reset enable_indexscan;
! reset enable_bitmapscan;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/create_table_like.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/create_table_like.out	Tue Oct 28 15:53:05 2014
***************
*** 1,230 ****
! /* Test inheritance of structure (LIKE) */
! CREATE TABLE inhx (xx text DEFAULT 'text');
! /*
!  * Test double inheritance
!  *
!  * Ensure that defaults are NOT included unless
!  * INCLUDING DEFAULTS is specified
!  */
! CREATE TABLE ctla (aa TEXT);
! CREATE TABLE ctlb (bb TEXT) INHERITS (ctla);
! CREATE TABLE foo (LIKE nonexistent);
! ERROR:  relation "nonexistent" does not exist
! LINE 1: CREATE TABLE foo (LIKE nonexistent);
!                                ^
! CREATE TABLE inhe (ee text, LIKE inhx) inherits (ctlb);
! INSERT INTO inhe VALUES ('ee-col1', 'ee-col2', DEFAULT, 'ee-col4');
! SELECT * FROM inhe; /* Columns aa, bb, xx value NULL, ee */
!    aa    |   bb    | ee |   xx    
! ---------+---------+----+---------
!  ee-col1 | ee-col2 |    | ee-col4
! (1 row)
! 
! SELECT * FROM inhx; /* Empty set since LIKE inherits structure only */
!  xx 
! ----
! (0 rows)
! 
! SELECT * FROM ctlb; /* Has ee entry */
!    aa    |   bb    
! ---------+---------
!  ee-col1 | ee-col2
! (1 row)
! 
! SELECT * FROM ctla; /* Has ee entry */
!    aa    
! ---------
!  ee-col1
! (1 row)
! 
! CREATE TABLE inhf (LIKE inhx, LIKE inhx); /* Throw error */
! ERROR:  column "xx" specified more than once
! CREATE TABLE inhf (LIKE inhx INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
! INSERT INTO inhf DEFAULT VALUES;
! SELECT * FROM inhf; /* Single entry with value 'text' */
!   xx  
! ------
!  text
! (1 row)
! 
! ALTER TABLE inhx add constraint foo CHECK (xx = 'text');
! ALTER TABLE inhx ADD PRIMARY KEY (xx);
! CREATE TABLE inhg (LIKE inhx); /* Doesn't copy constraint */
! INSERT INTO inhg VALUES ('foo');
! DROP TABLE inhg;
! CREATE TABLE inhg (x text, LIKE inhx INCLUDING CONSTRAINTS, y text); /* Copies constraints */
! INSERT INTO inhg VALUES ('x', 'text', 'y'); /* Succeeds */
! INSERT INTO inhg VALUES ('x', 'text', 'y'); /* Succeeds -- Unique constraints not copied */
! INSERT INTO inhg VALUES ('x', 'foo',  'y');  /* fails due to constraint */
! ERROR:  new row for relation "inhg" violates check constraint "foo"
! DETAIL:  Failing row contains (x, foo, y).
! SELECT * FROM inhg; /* Two records with three columns in order x=x, xx=text, y=y */
!  x |  xx  | y 
! ---+------+---
!  x | text | y
!  x | text | y
! (2 rows)
! 
! DROP TABLE inhg;
! CREATE TABLE inhg (x text, LIKE inhx INCLUDING INDEXES, y text); /* copies indexes */
! INSERT INTO inhg VALUES (5, 10);
! INSERT INTO inhg VALUES (20, 10); -- should fail
! ERROR:  duplicate key value violates unique constraint "inhg_pkey"
! DETAIL:  Key (xx)=(10) already exists.
! DROP TABLE inhg;
! /* Multiple primary keys creation should fail */
! CREATE TABLE inhg (x text, LIKE inhx INCLUDING INDEXES, PRIMARY KEY(x)); /* fails */
! ERROR:  multiple primary keys for table "inhg" are not allowed
! CREATE TABLE inhz (xx text DEFAULT 'text', yy int UNIQUE);
! CREATE UNIQUE INDEX inhz_xx_idx on inhz (xx) WHERE xx <> 'test';
! /* Ok to create multiple unique indexes */
! CREATE TABLE inhg (x text UNIQUE, LIKE inhz INCLUDING INDEXES);
! INSERT INTO inhg (xx, yy, x) VALUES ('test', 5, 10);
! INSERT INTO inhg (xx, yy, x) VALUES ('test', 10, 15);
! INSERT INTO inhg (xx, yy, x) VALUES ('foo', 10, 15); -- should fail
! ERROR:  duplicate key value violates unique constraint "inhg_x_key"
! DETAIL:  Key (x)=(15) already exists.
! DROP TABLE inhg;
! DROP TABLE inhz;
! -- including storage and comments
! CREATE TABLE ctlt1 (a text CHECK (length(a) > 2) PRIMARY KEY, b text);
! CREATE INDEX ctlt1_b_key ON ctlt1 (b);
! CREATE INDEX ctlt1_fnidx ON ctlt1 ((a || b));
! COMMENT ON COLUMN ctlt1.a IS 'A';
! COMMENT ON COLUMN ctlt1.b IS 'B';
! COMMENT ON CONSTRAINT ctlt1_a_check ON ctlt1 IS 't1_a_check';
! COMMENT ON INDEX ctlt1_pkey IS 'index pkey';
! COMMENT ON INDEX ctlt1_b_key IS 'index b_key';
! ALTER TABLE ctlt1 ALTER COLUMN a SET STORAGE MAIN;
! CREATE TABLE ctlt2 (c text);
! ALTER TABLE ctlt2 ALTER COLUMN c SET STORAGE EXTERNAL;
! COMMENT ON COLUMN ctlt2.c IS 'C';
! CREATE TABLE ctlt3 (a text CHECK (length(a) < 5), c text);
! ALTER TABLE ctlt3 ALTER COLUMN c SET STORAGE EXTERNAL;
! ALTER TABLE ctlt3 ALTER COLUMN a SET STORAGE MAIN;
! COMMENT ON COLUMN ctlt3.a IS 'A3';
! COMMENT ON COLUMN ctlt3.c IS 'C';
! COMMENT ON CONSTRAINT ctlt3_a_check ON ctlt3 IS 't3_a_check';
! CREATE TABLE ctlt4 (a text, c text);
! ALTER TABLE ctlt4 ALTER COLUMN c SET STORAGE EXTERNAL;
! CREATE TABLE ctlt12_storage (LIKE ctlt1 INCLUDING STORAGE, LIKE ctlt2 INCLUDING STORAGE);
! \d+ ctlt12_storage
!                    Table "public.ctlt12_storage"
!  Column | Type | Modifiers | Storage  | Stats target | Description 
! --------+------+-----------+----------+--------------+-------------
!  a      | text | not null  | main     |              | 
!  b      | text |           | extended |              | 
!  c      | text |           | external |              | 
! 
! CREATE TABLE ctlt12_comments (LIKE ctlt1 INCLUDING COMMENTS, LIKE ctlt2 INCLUDING COMMENTS);
! \d+ ctlt12_comments
!                   Table "public.ctlt12_comments"
!  Column | Type | Modifiers | Storage  | Stats target | Description 
! --------+------+-----------+----------+--------------+-------------
!  a      | text | not null  | extended |              | A
!  b      | text |           | extended |              | B
!  c      | text |           | extended |              | C
! 
! CREATE TABLE ctlt1_inh (LIKE ctlt1 INCLUDING CONSTRAINTS INCLUDING COMMENTS) INHERITS (ctlt1);
! NOTICE:  merging column "a" with inherited definition
! NOTICE:  merging column "b" with inherited definition
! NOTICE:  merging constraint "ctlt1_a_check" with inherited definition
! \d+ ctlt1_inh
!                      Table "public.ctlt1_inh"
!  Column | Type | Modifiers | Storage  | Stats target | Description 
! --------+------+-----------+----------+--------------+-------------
!  a      | text | not null  | main     |              | A
!  b      | text |           | extended |              | B
! Check constraints:
!     "ctlt1_a_check" CHECK (length(a) > 2)
! Inherits: ctlt1
! 
! SELECT description FROM pg_description, pg_constraint c WHERE classoid = 'pg_constraint'::regclass AND objoid = c.oid AND c.conrelid = 'ctlt1_inh'::regclass;
!  description 
! -------------
!  t1_a_check
! (1 row)
! 
! CREATE TABLE ctlt13_inh () INHERITS (ctlt1, ctlt3);
! NOTICE:  merging multiple inherited definitions of column "a"
! \d+ ctlt13_inh
!                      Table "public.ctlt13_inh"
!  Column | Type | Modifiers | Storage  | Stats target | Description 
! --------+------+-----------+----------+--------------+-------------
!  a      | text | not null  | main     |              | 
!  b      | text |           | extended |              | 
!  c      | text |           | external |              | 
! Check constraints:
!     "ctlt1_a_check" CHECK (length(a) > 2)
!     "ctlt3_a_check" CHECK (length(a) < 5)
! Inherits: ctlt1,
!           ctlt3
! 
! CREATE TABLE ctlt13_like (LIKE ctlt3 INCLUDING CONSTRAINTS INCLUDING COMMENTS INCLUDING STORAGE) INHERITS (ctlt1);
! NOTICE:  merging column "a" with inherited definition
! \d+ ctlt13_like
!                     Table "public.ctlt13_like"
!  Column | Type | Modifiers | Storage  | Stats target | Description 
! --------+------+-----------+----------+--------------+-------------
!  a      | text | not null  | main     |              | A3
!  b      | text |           | extended |              | 
!  c      | text |           | external |              | C
! Check constraints:
!     "ctlt1_a_check" CHECK (length(a) > 2)
!     "ctlt3_a_check" CHECK (length(a) < 5)
! Inherits: ctlt1
! 
! SELECT description FROM pg_description, pg_constraint c WHERE classoid = 'pg_constraint'::regclass AND objoid = c.oid AND c.conrelid = 'ctlt13_like'::regclass;
!  description 
! -------------
!  t3_a_check
! (1 row)
! 
! CREATE TABLE ctlt_all (LIKE ctlt1 INCLUDING ALL);
! \d+ ctlt_all
!                       Table "public.ctlt_all"
!  Column | Type | Modifiers | Storage  | Stats target | Description 
! --------+------+-----------+----------+--------------+-------------
!  a      | text | not null  | main     |              | A
!  b      | text |           | extended |              | B
! Indexes:
!     "ctlt_all_pkey" PRIMARY KEY, btree (a)
!     "ctlt_all_b_idx" btree (b)
!     "ctlt_all_expr_idx" btree ((a || b))
! Check constraints:
!     "ctlt1_a_check" CHECK (length(a) > 2)
! 
! SELECT c.relname, objsubid, description FROM pg_description, pg_index i, pg_class c WHERE classoid = 'pg_class'::regclass AND objoid = i.indexrelid AND c.oid = i.indexrelid AND i.indrelid = 'ctlt_all'::regclass ORDER BY c.relname, objsubid;
!     relname     | objsubid | description 
! ----------------+----------+-------------
!  ctlt_all_b_idx |        0 | index b_key
!  ctlt_all_pkey  |        0 | index pkey
! (2 rows)
! 
! CREATE TABLE inh_error1 () INHERITS (ctlt1, ctlt4);
! NOTICE:  merging multiple inherited definitions of column "a"
! ERROR:  inherited column "a" has a storage parameter conflict
! DETAIL:  MAIN versus EXTENDED
! CREATE TABLE inh_error2 (LIKE ctlt4 INCLUDING STORAGE) INHERITS (ctlt1);
! NOTICE:  merging column "a" with inherited definition
! ERROR:  column "a" has a storage parameter conflict
! DETAIL:  MAIN versus EXTENDED
! DROP TABLE ctlt1, ctlt2, ctlt3, ctlt4, ctlt12_storage, ctlt12_comments, ctlt1_inh, ctlt13_inh, ctlt13_like, ctlt_all, ctla, ctlb CASCADE;
! NOTICE:  drop cascades to table inhe
! /* LIKE with other relation kinds */
! CREATE TABLE ctlt4 (a int, b text);
! CREATE SEQUENCE ctlseq1;
! CREATE TABLE ctlt10 (LIKE ctlseq1);  -- fail
! ERROR:  "ctlseq1" is not a table, view, materialized view, composite type, or foreign table
! LINE 1: CREATE TABLE ctlt10 (LIKE ctlseq1);
!                                   ^
! CREATE VIEW ctlv1 AS SELECT * FROM ctlt4;
! CREATE TABLE ctlt11 (LIKE ctlv1);
! CREATE TABLE ctlt11a (LIKE ctlv1 INCLUDING ALL);
! CREATE TYPE ctlty1 AS (a int, b text);
! CREATE TABLE ctlt12 (LIKE ctlty1);
! DROP SEQUENCE ctlseq1;
! DROP TYPE ctlty1;
! DROP VIEW ctlv1;
! DROP TABLE IF EXISTS ctlt4, ctlt10, ctlt11, ctlt11a, ctlt12;
! NOTICE:  table "ctlt10" does not exist, skipping
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/typed_table.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/typed_table.out	Tue Oct 28 15:53:05 2014
***************
*** 1,108 ****
! CREATE TABLE ttable1 OF nothing;
! ERROR:  type "nothing" does not exist
! CREATE TYPE person_type AS (id int, name text);
! CREATE TABLE persons OF person_type;
! CREATE TABLE IF NOT EXISTS persons OF person_type;
! NOTICE:  relation "persons" already exists, skipping
! SELECT * FROM persons;
!  id | name 
! ----+------
! (0 rows)
! 
! \d persons
!     Table "public.persons"
!  Column |  Type   | Modifiers 
! --------+---------+-----------
!  id     | integer | 
!  name   | text    | 
! Typed table of type: person_type
! 
! CREATE FUNCTION get_all_persons() RETURNS SETOF person_type
! LANGUAGE SQL
! AS $$
!     SELECT * FROM persons;
! $$;
! SELECT * FROM get_all_persons();
!  id | name 
! ----+------
! (0 rows)
! 
! -- certain ALTER TABLE operations on typed tables are not allowed
! ALTER TABLE persons ADD COLUMN comment text;
! ERROR:  cannot add column to typed table
! ALTER TABLE persons DROP COLUMN name;
! ERROR:  cannot drop column from typed table
! ALTER TABLE persons RENAME COLUMN id TO num;
! ERROR:  cannot rename column of typed table
! ALTER TABLE persons ALTER COLUMN name TYPE varchar;
! ERROR:  cannot alter column type of typed table
! CREATE TABLE stuff (id int);
! ALTER TABLE persons INHERIT stuff;
! ERROR:  cannot change inheritance of typed table
! CREATE TABLE personsx OF person_type (myname WITH OPTIONS NOT NULL); -- error
! ERROR:  column "myname" does not exist
! CREATE TABLE persons2 OF person_type (
!     id WITH OPTIONS PRIMARY KEY,
!     UNIQUE (name)
! );
! \d persons2
!    Table "public.persons2"
!  Column |  Type   | Modifiers 
! --------+---------+-----------
!  id     | integer | not null
!  name   | text    | 
! Indexes:
!     "persons2_pkey" PRIMARY KEY, btree (id)
!     "persons2_name_key" UNIQUE CONSTRAINT, btree (name)
! Typed table of type: person_type
! 
! CREATE TABLE persons3 OF person_type (
!     PRIMARY KEY (id),
!     name WITH OPTIONS DEFAULT ''
! );
! \d persons3
!        Table "public.persons3"
!  Column |  Type   |    Modifiers     
! --------+---------+------------------
!  id     | integer | not null
!  name   | text    | default ''::text
! Indexes:
!     "persons3_pkey" PRIMARY KEY, btree (id)
! Typed table of type: person_type
! 
! CREATE TABLE persons4 OF person_type (
!     name WITH OPTIONS NOT NULL,
!     name WITH OPTIONS DEFAULT ''  -- error, specified more than once
! );
! ERROR:  column "name" specified more than once
! DROP TYPE person_type RESTRICT;
! ERROR:  cannot drop type person_type because other objects depend on it
! DETAIL:  table persons depends on type person_type
! function get_all_persons() depends on type person_type
! table persons2 depends on type person_type
! table persons3 depends on type person_type
! HINT:  Use DROP ... CASCADE to drop the dependent objects too.
! DROP TYPE person_type CASCADE;
! NOTICE:  drop cascades to 4 other objects
! DETAIL:  drop cascades to table persons
! drop cascades to function get_all_persons()
! drop cascades to table persons2
! drop cascades to table persons3
! CREATE TABLE persons5 OF stuff; -- only CREATE TYPE AS types may be used
! ERROR:  type stuff is not a composite type
! DROP TABLE stuff;
! -- implicit casting
! CREATE TYPE person_type AS (id int, name text);
! CREATE TABLE persons OF person_type;
! INSERT INTO persons VALUES (1, 'test');
! CREATE FUNCTION namelen(person_type) RETURNS int LANGUAGE SQL AS $$ SELECT length($1.name) $$;
! SELECT id, namelen(persons) FROM persons;
!  id | namelen 
! ----+---------
!   1 |       4
! (1 row)
! 
! DROP TYPE person_type CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to table persons
! drop cascades to function namelen(person_type)
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/vacuum.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/vacuum.out	Tue Oct 28 15:53:05 2014
***************
*** 1,71 ****
! --
! -- VACUUM
! --
! CREATE TABLE vactst (i INT);
! INSERT INTO vactst VALUES (1);
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst VALUES (0);
! SELECT count(*) FROM vactst;
!  count 
! -------
!   2049
! (1 row)
! 
! DELETE FROM vactst WHERE i != 0;
! SELECT * FROM vactst;
!  i 
! ---
!  0
! (1 row)
! 
! VACUUM FULL vactst;
! UPDATE vactst SET i = i + 1;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst SELECT * FROM vactst;
! INSERT INTO vactst VALUES (0);
! SELECT count(*) FROM vactst;
!  count 
! -------
!   2049
! (1 row)
! 
! DELETE FROM vactst WHERE i != 0;
! VACUUM (FULL) vactst;
! DELETE FROM vactst;
! SELECT * FROM vactst;
!  i 
! ---
! (0 rows)
! 
! VACUUM (FULL, FREEZE) vactst;
! VACUUM (ANALYZE, FULL) vactst;
! CREATE TABLE vaccluster (i INT PRIMARY KEY);
! ALTER TABLE vaccluster CLUSTER ON vaccluster_pkey;
! INSERT INTO vaccluster SELECT * FROM vactst;
! CLUSTER vaccluster;
! VACUUM FULL pg_am;
! VACUUM FULL pg_class;
! VACUUM FULL pg_database;
! VACUUM FULL vaccluster;
! VACUUM FULL vactst;
! DROP TABLE vaccluster;
! DROP TABLE vactst;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/drop_if_exists.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/drop_if_exists.out	Tue Oct 28 15:53:05 2014
***************
*** 1,298 ****
! --
! -- IF EXISTS tests
! --
! -- table (will be really dropped at the end)
! DROP TABLE test_exists;
! ERROR:  table "test_exists" does not exist
! DROP TABLE IF EXISTS test_exists;
! NOTICE:  table "test_exists" does not exist, skipping
! CREATE TABLE test_exists (a int, b text);
! -- view
! DROP VIEW test_view_exists;
! ERROR:  view "test_view_exists" does not exist
! DROP VIEW IF EXISTS test_view_exists;
! NOTICE:  view "test_view_exists" does not exist, skipping
! CREATE VIEW test_view_exists AS select * from test_exists;
! DROP VIEW IF EXISTS test_view_exists;
! DROP VIEW test_view_exists;
! ERROR:  view "test_view_exists" does not exist
! -- index
! DROP INDEX test_index_exists;
! ERROR:  index "test_index_exists" does not exist
! DROP INDEX IF EXISTS test_index_exists;
! NOTICE:  index "test_index_exists" does not exist, skipping
! CREATE INDEX test_index_exists on test_exists(a);
! DROP INDEX IF EXISTS test_index_exists;
! DROP INDEX test_index_exists;
! ERROR:  index "test_index_exists" does not exist
! -- sequence
! DROP SEQUENCE test_sequence_exists;
! ERROR:  sequence "test_sequence_exists" does not exist
! DROP SEQUENCE IF EXISTS test_sequence_exists;
! NOTICE:  sequence "test_sequence_exists" does not exist, skipping
! CREATE SEQUENCE test_sequence_exists;
! DROP SEQUENCE IF EXISTS test_sequence_exists;
! DROP SEQUENCE test_sequence_exists;
! ERROR:  sequence "test_sequence_exists" does not exist
! -- schema
! DROP SCHEMA test_schema_exists;
! ERROR:  schema "test_schema_exists" does not exist
! DROP SCHEMA IF EXISTS test_schema_exists;
! NOTICE:  schema "test_schema_exists" does not exist, skipping
! CREATE SCHEMA test_schema_exists;
! DROP SCHEMA IF EXISTS test_schema_exists;
! DROP SCHEMA test_schema_exists;
! ERROR:  schema "test_schema_exists" does not exist
! -- type
! DROP TYPE test_type_exists;
! ERROR:  type "test_type_exists" does not exist
! DROP TYPE IF EXISTS test_type_exists;
! NOTICE:  type "test_type_exists" does not exist, skipping
! CREATE type test_type_exists as (a int, b text);
! DROP TYPE IF EXISTS test_type_exists;
! DROP TYPE test_type_exists;
! ERROR:  type "test_type_exists" does not exist
! -- domain
! DROP DOMAIN test_domain_exists;
! ERROR:  type "test_domain_exists" does not exist
! DROP DOMAIN IF EXISTS test_domain_exists;
! NOTICE:  type "test_domain_exists" does not exist, skipping
! CREATE domain test_domain_exists as int not null check (value > 0);
! DROP DOMAIN IF EXISTS test_domain_exists;
! DROP DOMAIN test_domain_exists;
! ERROR:  type "test_domain_exists" does not exist
! ---
! --- role/user/group
! ---
! CREATE USER tu1;
! CREATE ROLE tr1;
! CREATE GROUP tg1;
! DROP USER tu2;
! ERROR:  role "tu2" does not exist
! DROP USER IF EXISTS tu1, tu2;
! NOTICE:  role "tu2" does not exist, skipping
! DROP USER tu1;
! ERROR:  role "tu1" does not exist
! DROP ROLE tr2;
! ERROR:  role "tr2" does not exist
! DROP ROLE IF EXISTS tr1, tr2;
! NOTICE:  role "tr2" does not exist, skipping
! DROP ROLE tr1;
! ERROR:  role "tr1" does not exist
! DROP GROUP tg2;
! ERROR:  role "tg2" does not exist
! DROP GROUP IF EXISTS tg1, tg2;
! NOTICE:  role "tg2" does not exist, skipping
! DROP GROUP tg1;
! ERROR:  role "tg1" does not exist
! -- collation
! DROP COLLATION IF EXISTS test_collation_exists;
! NOTICE:  collation "test_collation_exists" does not exist, skipping
! -- conversion
! DROP CONVERSION test_conversion_exists;
! ERROR:  conversion "test_conversion_exists" does not exist
! DROP CONVERSION IF EXISTS test_conversion_exists;
! NOTICE:  conversion "test_conversion_exists" does not exist, skipping
! CREATE CONVERSION test_conversion_exists
!     FOR 'LATIN1' TO 'UTF8' FROM iso8859_1_to_utf8;
! DROP CONVERSION test_conversion_exists;
! -- text search parser
! DROP TEXT SEARCH PARSER test_tsparser_exists;
! ERROR:  text search parser "test_tsparser_exists" does not exist
! DROP TEXT SEARCH PARSER IF EXISTS test_tsparser_exists;
! NOTICE:  text search parser "test_tsparser_exists" does not exist, skipping
! -- text search dictionary
! DROP TEXT SEARCH DICTIONARY test_tsdict_exists;
! ERROR:  text search dictionary "test_tsdict_exists" does not exist
! DROP TEXT SEARCH DICTIONARY IF EXISTS test_tsdict_exists;
! NOTICE:  text search dictionary "test_tsdict_exists" does not exist, skipping
! CREATE TEXT SEARCH DICTIONARY test_tsdict_exists (
!         Template=ispell,
!         DictFile=ispell_sample,
!         AffFile=ispell_sample
! );
! DROP TEXT SEARCH DICTIONARY test_tsdict_exists;
! -- test search template
! DROP TEXT SEARCH TEMPLATE test_tstemplate_exists;
! ERROR:  text search template "test_tstemplate_exists" does not exist
! DROP TEXT SEARCH TEMPLATE IF EXISTS test_tstemplate_exists;
! NOTICE:  text search template "test_tstemplate_exists" does not exist, skipping
! -- text search configuration
! DROP TEXT SEARCH CONFIGURATION test_tsconfig_exists;
! ERROR:  text search configuration "test_tsconfig_exists" does not exist
! DROP TEXT SEARCH CONFIGURATION IF EXISTS test_tsconfig_exists;
! NOTICE:  text search configuration "test_tsconfig_exists" does not exist, skipping
! CREATE TEXT SEARCH CONFIGURATION test_tsconfig_exists (COPY=english);
! DROP TEXT SEARCH CONFIGURATION test_tsconfig_exists;
! -- extension
! DROP EXTENSION test_extension_exists;
! ERROR:  extension "test_extension_exists" does not exist
! DROP EXTENSION IF EXISTS test_extension_exists;
! NOTICE:  extension "test_extension_exists" does not exist, skipping
! -- functions
! DROP FUNCTION test_function_exists();
! ERROR:  function test_function_exists() does not exist
! DROP FUNCTION IF EXISTS test_function_exists();
! NOTICE:  function test_function_exists() does not exist, skipping
! DROP FUNCTION test_function_exists(int, text, int[]);
! ERROR:  function test_function_exists(integer, text, integer[]) does not exist
! DROP FUNCTION IF EXISTS test_function_exists(int, text, int[]);
! NOTICE:  function test_function_exists(pg_catalog.int4,text,pg_catalog.int4[]) does not exist, skipping
! -- aggregate
! DROP AGGREGATE test_aggregate_exists(*);
! ERROR:  aggregate test_aggregate_exists(*) does not exist
! DROP AGGREGATE IF EXISTS test_aggregate_exists(*);
! NOTICE:  aggregate test_aggregate_exists() does not exist, skipping
! DROP AGGREGATE test_aggregate_exists(int);
! ERROR:  aggregate test_aggregate_exists(integer) does not exist
! DROP AGGREGATE IF EXISTS test_aggregate_exists(int);
! NOTICE:  aggregate test_aggregate_exists(pg_catalog.int4) does not exist, skipping
! -- operator
! DROP OPERATOR @#@ (int, int);
! ERROR:  operator does not exist: integer @#@ integer
! DROP OPERATOR IF EXISTS @#@ (int, int);
! NOTICE:  operator @#@ does not exist, skipping
! CREATE OPERATOR @#@
!         (leftarg = int8, rightarg = int8, procedure = int8xor);
! DROP OPERATOR @#@ (int8, int8);
! -- language
! DROP LANGUAGE test_language_exists;
! ERROR:  language "test_language_exists" does not exist
! DROP LANGUAGE IF EXISTS test_language_exists;
! NOTICE:  language "test_language_exists" does not exist, skipping
! -- cast
! DROP CAST (text AS text);
! ERROR:  cast from type text to type text does not exist
! DROP CAST IF EXISTS (text AS text);
! NOTICE:  cast from type text to type text does not exist, skipping
! -- trigger
! DROP TRIGGER test_trigger_exists ON test_exists;
! ERROR:  trigger "test_trigger_exists" for table "test_exists" does not exist
! DROP TRIGGER IF EXISTS test_trigger_exists ON test_exists;
! NOTICE:  trigger "test_trigger_exists" for relation "test_exists" does not exist, skipping
! DROP TRIGGER test_trigger_exists ON no_such_table;
! ERROR:  relation "no_such_table" does not exist
! DROP TRIGGER IF EXISTS test_trigger_exists ON no_such_table;
! NOTICE:  relation "no_such_table" does not exist, skipping
! DROP TRIGGER test_trigger_exists ON no_such_schema.no_such_table;
! ERROR:  schema "no_such_schema" does not exist
! DROP TRIGGER IF EXISTS test_trigger_exists ON no_such_schema.no_such_table;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! CREATE TRIGGER test_trigger_exists
!     BEFORE UPDATE ON test_exists
!     FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();
! DROP TRIGGER test_trigger_exists ON test_exists;
! -- rule
! DROP RULE test_rule_exists ON test_exists;
! ERROR:  rule "test_rule_exists" for relation "test_exists" does not exist
! DROP RULE IF EXISTS test_rule_exists ON test_exists;
! NOTICE:  rule "test_rule_exists" for relation "test_exists" does not exist, skipping
! DROP RULE test_rule_exists ON no_such_table;
! ERROR:  relation "no_such_table" does not exist
! DROP RULE IF EXISTS test_rule_exists ON no_such_table;
! NOTICE:  relation "no_such_table" does not exist, skipping
! DROP RULE test_rule_exists ON no_such_schema.no_such_table;
! ERROR:  schema "no_such_schema" does not exist
! DROP RULE IF EXISTS test_rule_exists ON no_such_schema.no_such_table;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! CREATE RULE test_rule_exists AS ON INSERT TO test_exists
!     DO INSTEAD
!     INSERT INTO test_exists VALUES (NEW.a, NEW.b || NEW.a::text);
! DROP RULE test_rule_exists ON test_exists;
! -- foreign data wrapper
! DROP FOREIGN DATA WRAPPER test_fdw_exists;
! ERROR:  foreign-data wrapper "test_fdw_exists" does not exist
! DROP FOREIGN DATA WRAPPER IF EXISTS test_fdw_exists;
! NOTICE:  foreign-data wrapper "test_fdw_exists" does not exist, skipping
! -- foreign server
! DROP SERVER test_server_exists;
! ERROR:  server "test_server_exists" does not exist
! DROP SERVER IF EXISTS test_server_exists;
! NOTICE:  server "test_server_exists" does not exist, skipping
! -- operator class
! DROP OPERATOR CLASS test_operator_class USING btree;
! ERROR:  operator class "test_operator_class" does not exist for access method "btree"
! DROP OPERATOR CLASS IF EXISTS test_operator_class USING btree;
! NOTICE:  operator class "test_operator_class" does not exist for access method "btree", skipping
! DROP OPERATOR CLASS test_operator_class USING no_such_am;
! ERROR:  access method "no_such_am" does not exist
! DROP OPERATOR CLASS IF EXISTS test_operator_class USING no_such_am;
! ERROR:  access method "no_such_am" does not exist
! -- operator family
! DROP OPERATOR FAMILY test_operator_family USING btree;
! ERROR:  operator family "test_operator_family" does not exist for access method "btree"
! DROP OPERATOR FAMILY IF EXISTS test_operator_family USING btree;
! NOTICE:  operator family "test_operator_family" does not exist for access method "btree", skipping
! DROP OPERATOR FAMILY test_operator_family USING no_such_am;
! ERROR:  access method "no_such_am" does not exist
! DROP OPERATOR FAMILY IF EXISTS test_operator_family USING no_such_am;
! ERROR:  access method "no_such_am" does not exist
! -- drop the table
! DROP TABLE IF EXISTS test_exists;
! DROP TABLE test_exists;
! ERROR:  table "test_exists" does not exist
! -- be tolerant with missing schemas, types, etc
! DROP AGGREGATE IF EXISTS no_such_schema.foo(int);
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP AGGREGATE IF EXISTS foo(no_such_type);
! NOTICE:  type "no_such_type" does not exist, skipping
! DROP AGGREGATE IF EXISTS foo(no_such_schema.no_such_type);
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP CAST IF EXISTS (INTEGER AS no_such_type2);
! NOTICE:  type "no_such_type2" does not exist, skipping
! DROP CAST IF EXISTS (no_such_type1 AS INTEGER);
! NOTICE:  type "no_such_type1" does not exist, skipping
! DROP CAST IF EXISTS (INTEGER AS no_such_schema.bar);
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP CAST IF EXISTS (no_such_schema.foo AS INTEGER);
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP COLLATION IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP CONVERSION IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP DOMAIN IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP FOREIGN TABLE IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP FUNCTION IF EXISTS no_such_schema.foo();
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP FUNCTION IF EXISTS foo(no_such_type);
! NOTICE:  type "no_such_type" does not exist, skipping
! DROP FUNCTION IF EXISTS foo(no_such_schema.no_such_type);
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP INDEX IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP MATERIALIZED VIEW IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP OPERATOR IF EXISTS no_such_schema.+ (int, int);
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP OPERATOR IF EXISTS + (no_such_type, no_such_type);
! NOTICE:  type "no_such_type" does not exist, skipping
! DROP OPERATOR IF EXISTS + (no_such_schema.no_such_type, no_such_schema.no_such_type);
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP OPERATOR IF EXISTS # (NONE, no_such_schema.no_such_type);
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP OPERATOR CLASS IF EXISTS no_such_schema.widget_ops USING btree;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP OPERATOR FAMILY IF EXISTS no_such_schema.float_ops USING btree;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP RULE IF EXISTS foo ON no_such_schema.bar;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP SEQUENCE IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP TABLE IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP TEXT SEARCH CONFIGURATION IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP TEXT SEARCH DICTIONARY IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP TEXT SEARCH PARSER IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP TEXT SEARCH TEMPLATE IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP TRIGGER IF EXISTS foo ON no_such_schema.bar;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP TYPE IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
! DROP VIEW IF EXISTS no_such_schema.foo;
! NOTICE:  schema "no_such_schema" does not exist, skipping
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/updatable_views.out	Thu Oct 16 14:31:37 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/updatable_views.out	Tue Oct 28 15:53:05 2014
***************
*** 1,2266 ****
! --
! -- UPDATABLE VIEWS
! --
! -- check that non-updatable views and columns are rejected with useful error
! -- messages
! CREATE TABLE base_tbl (a int PRIMARY KEY, b text DEFAULT 'Unspecified');
! INSERT INTO base_tbl SELECT i, 'Row ' || i FROM generate_series(-2, 2) g(i);
! CREATE VIEW ro_view1 AS SELECT DISTINCT a, b FROM base_tbl; -- DISTINCT not supported
! CREATE VIEW ro_view2 AS SELECT a, b FROM base_tbl GROUP BY a, b; -- GROUP BY not supported
! CREATE VIEW ro_view3 AS SELECT 1 FROM base_tbl HAVING max(a) > 0; -- HAVING not supported
! CREATE VIEW ro_view4 AS SELECT count(*) FROM base_tbl; -- Aggregate functions not supported
! CREATE VIEW ro_view5 AS SELECT a, rank() OVER() FROM base_tbl; -- Window functions not supported
! CREATE VIEW ro_view6 AS SELECT a, b FROM base_tbl UNION SELECT -a, b FROM base_tbl; -- Set ops not supported
! CREATE VIEW ro_view7 AS WITH t AS (SELECT a, b FROM base_tbl) SELECT * FROM t; -- WITH not supported
! CREATE VIEW ro_view8 AS SELECT a, b FROM base_tbl ORDER BY a OFFSET 1; -- OFFSET not supported
! CREATE VIEW ro_view9 AS SELECT a, b FROM base_tbl ORDER BY a LIMIT 1; -- LIMIT not supported
! CREATE VIEW ro_view10 AS SELECT 1 AS a; -- No base relations
! CREATE VIEW ro_view11 AS SELECT b1.a, b2.b FROM base_tbl b1, base_tbl b2; -- Multiple base relations
! CREATE VIEW ro_view12 AS SELECT * FROM generate_series(1, 10) AS g(a); -- SRF in rangetable
! CREATE VIEW ro_view13 AS SELECT a, b FROM (SELECT * FROM base_tbl) AS t; -- Subselect in rangetable
! CREATE VIEW rw_view14 AS SELECT ctid, a, b FROM base_tbl; -- System columns may be part of an updatable view
! CREATE VIEW rw_view15 AS SELECT a, upper(b) FROM base_tbl; -- Expression/function may be part of an updatable view
! CREATE VIEW rw_view16 AS SELECT a, b, a AS aa FROM base_tbl; -- Repeated column may be part of an updatable view
! CREATE VIEW ro_view17 AS SELECT * FROM ro_view1; -- Base relation not updatable
! CREATE VIEW ro_view18 AS SELECT * FROM (VALUES(1)) AS tmp(a); -- VALUES in rangetable
! CREATE SEQUENCE seq;
! CREATE VIEW ro_view19 AS SELECT * FROM seq; -- View based on a sequence
! CREATE VIEW ro_view20 AS SELECT a, b, generate_series(1, a) g FROM base_tbl; -- SRF in targetlist not supported
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name LIKE E'r_\\_view%'
!  ORDER BY table_name;
!  table_name | is_insertable_into 
! ------------+--------------------
!  ro_view1   | NO
!  ro_view10  | NO
!  ro_view11  | NO
!  ro_view12  | NO
!  ro_view13  | NO
!  ro_view17  | NO
!  ro_view18  | NO
!  ro_view19  | NO
!  ro_view2   | NO
!  ro_view20  | NO
!  ro_view3   | NO
!  ro_view4   | NO
!  ro_view5   | NO
!  ro_view6   | NO
!  ro_view7   | NO
!  ro_view8   | NO
!  ro_view9   | NO
!  rw_view14  | YES
!  rw_view15  | YES
!  rw_view16  | YES
! (20 rows)
! 
! SELECT table_name, is_updatable, is_insertable_into
!   FROM information_schema.views
!  WHERE table_name LIKE E'r_\\_view%'
!  ORDER BY table_name;
!  table_name | is_updatable | is_insertable_into 
! ------------+--------------+--------------------
!  ro_view1   | NO           | NO
!  ro_view10  | NO           | NO
!  ro_view11  | NO           | NO
!  ro_view12  | NO           | NO
!  ro_view13  | NO           | NO
!  ro_view17  | NO           | NO
!  ro_view18  | NO           | NO
!  ro_view19  | NO           | NO
!  ro_view2   | NO           | NO
!  ro_view20  | NO           | NO
!  ro_view3   | NO           | NO
!  ro_view4   | NO           | NO
!  ro_view5   | NO           | NO
!  ro_view6   | NO           | NO
!  ro_view7   | NO           | NO
!  ro_view8   | NO           | NO
!  ro_view9   | NO           | NO
!  rw_view14  | YES          | YES
!  rw_view15  | YES          | YES
!  rw_view16  | YES          | YES
! (20 rows)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name LIKE E'r_\\_view%'
!  ORDER BY table_name, ordinal_position;
!  table_name |  column_name  | is_updatable 
! ------------+---------------+--------------
!  ro_view1   | a             | NO
!  ro_view1   | b             | NO
!  ro_view10  | a             | NO
!  ro_view11  | a             | NO
!  ro_view11  | b             | NO
!  ro_view12  | a             | NO
!  ro_view13  | a             | NO
!  ro_view13  | b             | NO
!  ro_view17  | a             | NO
!  ro_view17  | b             | NO
!  ro_view18  | a             | NO
!  ro_view19  | sequence_name | NO
!  ro_view19  | last_value    | NO
!  ro_view19  | start_value   | NO
!  ro_view19  | increment_by  | NO
!  ro_view19  | max_value     | NO
!  ro_view19  | min_value     | NO
!  ro_view19  | cache_value   | NO
!  ro_view19  | log_cnt       | NO
!  ro_view19  | is_cycled     | NO
!  ro_view19  | is_called     | NO
!  ro_view2   | a             | NO
!  ro_view2   | b             | NO
!  ro_view20  | a             | NO
!  ro_view20  | b             | NO
!  ro_view20  | g             | NO
!  ro_view3   | ?column?      | NO
!  ro_view4   | count         | NO
!  ro_view5   | a             | NO
!  ro_view5   | rank          | NO
!  ro_view6   | a             | NO
!  ro_view6   | b             | NO
!  ro_view7   | a             | NO
!  ro_view7   | b             | NO
!  ro_view8   | a             | NO
!  ro_view8   | b             | NO
!  ro_view9   | a             | NO
!  ro_view9   | b             | NO
!  rw_view14  | ctid          | NO
!  rw_view14  | a             | YES
!  rw_view14  | b             | YES
!  rw_view15  | a             | YES
!  rw_view15  | upper         | NO
!  rw_view16  | a             | YES
!  rw_view16  | b             | YES
!  rw_view16  | aa            | YES
! (46 rows)
! 
! -- Read-only views
! DELETE FROM ro_view1;
! ERROR:  cannot delete from view "ro_view1"
! DETAIL:  Views containing DISTINCT are not automatically updatable.
! HINT:  To enable deleting from the view, provide an INSTEAD OF DELETE trigger or an unconditional ON DELETE DO INSTEAD rule.
! DELETE FROM ro_view2;
! ERROR:  cannot delete from view "ro_view2"
! DETAIL:  Views containing GROUP BY are not automatically updatable.
! HINT:  To enable deleting from the view, provide an INSTEAD OF DELETE trigger or an unconditional ON DELETE DO INSTEAD rule.
! DELETE FROM ro_view3;
! ERROR:  cannot delete from view "ro_view3"
! DETAIL:  Views containing HAVING are not automatically updatable.
! HINT:  To enable deleting from the view, provide an INSTEAD OF DELETE trigger or an unconditional ON DELETE DO INSTEAD rule.
! DELETE FROM ro_view4;
! ERROR:  cannot delete from view "ro_view4"
! DETAIL:  Views that return aggregate functions are not automatically updatable.
! HINT:  To enable deleting from the view, provide an INSTEAD OF DELETE trigger or an unconditional ON DELETE DO INSTEAD rule.
! DELETE FROM ro_view5;
! ERROR:  cannot delete from view "ro_view5"
! DETAIL:  Views that return window functions are not automatically updatable.
! HINT:  To enable deleting from the view, provide an INSTEAD OF DELETE trigger or an unconditional ON DELETE DO INSTEAD rule.
! DELETE FROM ro_view6;
! ERROR:  cannot delete from view "ro_view6"
! DETAIL:  Views containing UNION, INTERSECT, or EXCEPT are not automatically updatable.
! HINT:  To enable deleting from the view, provide an INSTEAD OF DELETE trigger or an unconditional ON DELETE DO INSTEAD rule.
! UPDATE ro_view7 SET a=a+1;
! ERROR:  cannot update view "ro_view7"
! DETAIL:  Views containing WITH are not automatically updatable.
! HINT:  To enable updating the view, provide an INSTEAD OF UPDATE trigger or an unconditional ON UPDATE DO INSTEAD rule.
! UPDATE ro_view8 SET a=a+1;
! ERROR:  cannot update view "ro_view8"
! DETAIL:  Views containing LIMIT or OFFSET are not automatically updatable.
! HINT:  To enable updating the view, provide an INSTEAD OF UPDATE trigger or an unconditional ON UPDATE DO INSTEAD rule.
! UPDATE ro_view9 SET a=a+1;
! ERROR:  cannot update view "ro_view9"
! DETAIL:  Views containing LIMIT or OFFSET are not automatically updatable.
! HINT:  To enable updating the view, provide an INSTEAD OF UPDATE trigger or an unconditional ON UPDATE DO INSTEAD rule.
! UPDATE ro_view10 SET a=a+1;
! ERROR:  cannot update view "ro_view10"
! DETAIL:  Views that do not select from a single table or view are not automatically updatable.
! HINT:  To enable updating the view, provide an INSTEAD OF UPDATE trigger or an unconditional ON UPDATE DO INSTEAD rule.
! UPDATE ro_view11 SET a=a+1;
! ERROR:  cannot update view "ro_view11"
! DETAIL:  Views that do not select from a single table or view are not automatically updatable.
! HINT:  To enable updating the view, provide an INSTEAD OF UPDATE trigger or an unconditional ON UPDATE DO INSTEAD rule.
! UPDATE ro_view12 SET a=a+1;
! ERROR:  cannot update view "ro_view12"
! DETAIL:  Views that do not select from a single table or view are not automatically updatable.
! HINT:  To enable updating the view, provide an INSTEAD OF UPDATE trigger or an unconditional ON UPDATE DO INSTEAD rule.
! INSERT INTO ro_view13 VALUES (3, 'Row 3');
! ERROR:  cannot insert into view "ro_view13"
! DETAIL:  Views that do not select from a single table or view are not automatically updatable.
! HINT:  To enable inserting into the view, provide an INSTEAD OF INSERT trigger or an unconditional ON INSERT DO INSTEAD rule.
! -- Partially updatable view
! INSERT INTO rw_view14 VALUES (null, 3, 'Row 3'); -- should fail
! ERROR:  cannot insert into column "ctid" of view "rw_view14"
! DETAIL:  View columns that refer to system columns are not updatable.
! INSERT INTO rw_view14 (a, b) VALUES (3, 'Row 3'); -- should be OK
! UPDATE rw_view14 SET ctid=null WHERE a=3; -- should fail
! ERROR:  cannot update column "ctid" of view "rw_view14"
! DETAIL:  View columns that refer to system columns are not updatable.
! UPDATE rw_view14 SET b='ROW 3' WHERE a=3; -- should be OK
! SELECT * FROM base_tbl;
!  a  |   b    
! ----+--------
!  -2 | Row -2
!  -1 | Row -1
!   0 | Row 0
!   1 | Row 1
!   2 | Row 2
!   3 | ROW 3
! (6 rows)
! 
! DELETE FROM rw_view14 WHERE a=3; -- should be OK
! -- Partially updatable view
! INSERT INTO rw_view15 VALUES (3, 'ROW 3'); -- should fail
! ERROR:  cannot insert into column "upper" of view "rw_view15"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! INSERT INTO rw_view15 (a) VALUES (3); -- should be OK
! ALTER VIEW rw_view15 ALTER COLUMN upper SET DEFAULT 'NOT SET';
! INSERT INTO rw_view15 (a) VALUES (4); -- should fail
! ERROR:  cannot insert into column "upper" of view "rw_view15"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! UPDATE rw_view15 SET upper='ROW 3' WHERE a=3; -- should fail
! ERROR:  cannot update column "upper" of view "rw_view15"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! UPDATE rw_view15 SET upper=DEFAULT WHERE a=3; -- should fail
! ERROR:  cannot update column "upper" of view "rw_view15"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! UPDATE rw_view15 SET a=4 WHERE a=3; -- should be OK
! SELECT * FROM base_tbl;
!  a  |      b      
! ----+-------------
!  -2 | Row -2
!  -1 | Row -1
!   0 | Row 0
!   1 | Row 1
!   2 | Row 2
!   4 | Unspecified
! (6 rows)
! 
! DELETE FROM rw_view15 WHERE a=4; -- should be OK
! -- Partially updatable view
! INSERT INTO rw_view16 VALUES (3, 'Row 3', 3); -- should fail
! ERROR:  multiple assignments to same column "a"
! INSERT INTO rw_view16 (a, b) VALUES (3, 'Row 3'); -- should be OK
! UPDATE rw_view16 SET a=3, aa=-3 WHERE a=3; -- should fail
! ERROR:  multiple assignments to same column "a"
! UPDATE rw_view16 SET aa=-3 WHERE a=3; -- should be OK
! SELECT * FROM base_tbl;
!  a  |   b    
! ----+--------
!  -2 | Row -2
!  -1 | Row -1
!   0 | Row 0
!   1 | Row 1
!   2 | Row 2
!  -3 | Row 3
! (6 rows)
! 
! DELETE FROM rw_view16 WHERE a=-3; -- should be OK
! -- Read-only views
! INSERT INTO ro_view17 VALUES (3, 'ROW 3');
! ERROR:  cannot insert into view "ro_view1"
! DETAIL:  Views containing DISTINCT are not automatically updatable.
! HINT:  To enable inserting into the view, provide an INSTEAD OF INSERT trigger or an unconditional ON INSERT DO INSTEAD rule.
! DELETE FROM ro_view18;
! ERROR:  cannot delete from view "ro_view18"
! DETAIL:  Views that do not select from a single table or view are not automatically updatable.
! HINT:  To enable deleting from the view, provide an INSTEAD OF DELETE trigger or an unconditional ON DELETE DO INSTEAD rule.
! UPDATE ro_view19 SET max_value=1000;
! ERROR:  cannot update view "ro_view19"
! DETAIL:  Views that do not select from a single table or view are not automatically updatable.
! HINT:  To enable updating the view, provide an INSTEAD OF UPDATE trigger or an unconditional ON UPDATE DO INSTEAD rule.
! UPDATE ro_view20 SET b=upper(b);
! ERROR:  cannot update view "ro_view20"
! DETAIL:  Views that return set-returning functions are not automatically updatable.
! HINT:  To enable updating the view, provide an INSTEAD OF UPDATE trigger or an unconditional ON UPDATE DO INSTEAD rule.
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to 16 other objects
! DETAIL:  drop cascades to view ro_view1
! drop cascades to view ro_view17
! drop cascades to view ro_view2
! drop cascades to view ro_view3
! drop cascades to view ro_view5
! drop cascades to view ro_view6
! drop cascades to view ro_view7
! drop cascades to view ro_view8
! drop cascades to view ro_view9
! drop cascades to view ro_view11
! drop cascades to view ro_view13
! drop cascades to view rw_view15
! drop cascades to view rw_view16
! drop cascades to view ro_view20
! drop cascades to view ro_view4
! drop cascades to view rw_view14
! DROP VIEW ro_view10, ro_view12, ro_view18;
! DROP SEQUENCE seq CASCADE;
! NOTICE:  drop cascades to view ro_view19
! -- simple updatable view
! CREATE TABLE base_tbl (a int PRIMARY KEY, b text DEFAULT 'Unspecified');
! INSERT INTO base_tbl SELECT i, 'Row ' || i FROM generate_series(-2, 2) g(i);
! CREATE VIEW rw_view1 AS SELECT * FROM base_tbl WHERE a>0;
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name = 'rw_view1';
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view1   | YES
! (1 row)
! 
! SELECT table_name, is_updatable, is_insertable_into
!   FROM information_schema.views
!  WHERE table_name = 'rw_view1';
!  table_name | is_updatable | is_insertable_into 
! ------------+--------------+--------------------
!  rw_view1   | YES          | YES
! (1 row)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name = 'rw_view1'
!  ORDER BY ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view1   | a           | YES
!  rw_view1   | b           | YES
! (2 rows)
! 
! INSERT INTO rw_view1 VALUES (3, 'Row 3');
! INSERT INTO rw_view1 (a) VALUES (4);
! UPDATE rw_view1 SET a=5 WHERE a=4;
! DELETE FROM rw_view1 WHERE b='Row 2';
! SELECT * FROM base_tbl;
!  a  |      b      
! ----+-------------
!  -2 | Row -2
!  -1 | Row -1
!   0 | Row 0
!   1 | Row 1
!   3 | Row 3
!   5 | Unspecified
! (6 rows)
! 
! EXPLAIN (costs off) UPDATE rw_view1 SET a=6 WHERE a=5;
!                     QUERY PLAN                    
! --------------------------------------------------
!  Update on base_tbl
!    ->  Index Scan using base_tbl_pkey on base_tbl
!          Index Cond: ((a > 0) AND (a = 5))
! (3 rows)
! 
! EXPLAIN (costs off) DELETE FROM rw_view1 WHERE a=5;
!                     QUERY PLAN                    
! --------------------------------------------------
!  Delete on base_tbl
!    ->  Index Scan using base_tbl_pkey on base_tbl
!          Index Cond: ((a > 0) AND (a = 5))
! (3 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to view rw_view1
! -- view on top of view
! CREATE TABLE base_tbl (a int PRIMARY KEY, b text DEFAULT 'Unspecified');
! INSERT INTO base_tbl SELECT i, 'Row ' || i FROM generate_series(-2, 2) g(i);
! CREATE VIEW rw_view1 AS SELECT b AS bb, a AS aa FROM base_tbl WHERE a>0;
! CREATE VIEW rw_view2 AS SELECT aa AS aaa, bb AS bbb FROM rw_view1 WHERE aa<10;
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name = 'rw_view2';
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view2   | YES
! (1 row)
! 
! SELECT table_name, is_updatable, is_insertable_into
!   FROM information_schema.views
!  WHERE table_name = 'rw_view2';
!  table_name | is_updatable | is_insertable_into 
! ------------+--------------+--------------------
!  rw_view2   | YES          | YES
! (1 row)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name = 'rw_view2'
!  ORDER BY ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view2   | aaa         | YES
!  rw_view2   | bbb         | YES
! (2 rows)
! 
! INSERT INTO rw_view2 VALUES (3, 'Row 3');
! INSERT INTO rw_view2 (aaa) VALUES (4);
! SELECT * FROM rw_view2;
!  aaa |     bbb     
! -----+-------------
!    1 | Row 1
!    2 | Row 2
!    3 | Row 3
!    4 | Unspecified
! (4 rows)
! 
! UPDATE rw_view2 SET bbb='Row 4' WHERE aaa=4;
! DELETE FROM rw_view2 WHERE aaa=2;
! SELECT * FROM rw_view2;
!  aaa |  bbb  
! -----+-------
!    1 | Row 1
!    3 | Row 3
!    4 | Row 4
! (3 rows)
! 
! EXPLAIN (costs off) UPDATE rw_view2 SET aaa=5 WHERE aaa=4;
!                        QUERY PLAN                       
! --------------------------------------------------------
!  Update on base_tbl
!    ->  Index Scan using base_tbl_pkey on base_tbl
!          Index Cond: ((a < 10) AND (a > 0) AND (a = 4))
! (3 rows)
! 
! EXPLAIN (costs off) DELETE FROM rw_view2 WHERE aaa=4;
!                        QUERY PLAN                       
! --------------------------------------------------------
!  Delete on base_tbl
!    ->  Index Scan using base_tbl_pkey on base_tbl
!          Index Cond: ((a < 10) AND (a > 0) AND (a = 4))
! (3 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to view rw_view1
! drop cascades to view rw_view2
! -- view on top of view with rules
! CREATE TABLE base_tbl (a int PRIMARY KEY, b text DEFAULT 'Unspecified');
! INSERT INTO base_tbl SELECT i, 'Row ' || i FROM generate_series(-2, 2) g(i);
! CREATE VIEW rw_view1 AS SELECT * FROM base_tbl WHERE a>0 OFFSET 0; -- not updatable without rules/triggers
! CREATE VIEW rw_view2 AS SELECT * FROM rw_view1 WHERE a<10;
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view1   | NO
!  rw_view2   | NO
! (2 rows)
! 
! SELECT table_name, is_updatable, is_insertable_into
!   FROM information_schema.views
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_updatable | is_insertable_into 
! ------------+--------------+--------------------
!  rw_view1   | NO           | NO
!  rw_view2   | NO           | NO
! (2 rows)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name, ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view1   | a           | NO
!  rw_view1   | b           | NO
!  rw_view2   | a           | NO
!  rw_view2   | b           | NO
! (4 rows)
! 
! CREATE RULE rw_view1_ins_rule AS ON INSERT TO rw_view1
!   DO INSTEAD INSERT INTO base_tbl VALUES (NEW.a, NEW.b) RETURNING *;
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view1   | YES
!  rw_view2   | YES
! (2 rows)
! 
! SELECT table_name, is_updatable, is_insertable_into
!   FROM information_schema.views
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_updatable | is_insertable_into 
! ------------+--------------+--------------------
!  rw_view1   | NO           | YES
!  rw_view2   | NO           | YES
! (2 rows)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name, ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view1   | a           | NO
!  rw_view1   | b           | NO
!  rw_view2   | a           | NO
!  rw_view2   | b           | NO
! (4 rows)
! 
! CREATE RULE rw_view1_upd_rule AS ON UPDATE TO rw_view1
!   DO INSTEAD UPDATE base_tbl SET b=NEW.b WHERE a=OLD.a RETURNING NEW.*;
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view1   | YES
!  rw_view2   | YES
! (2 rows)
! 
! SELECT table_name, is_updatable, is_insertable_into
!   FROM information_schema.views
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_updatable | is_insertable_into 
! ------------+--------------+--------------------
!  rw_view1   | NO           | YES
!  rw_view2   | NO           | YES
! (2 rows)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name, ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view1   | a           | NO
!  rw_view1   | b           | NO
!  rw_view2   | a           | NO
!  rw_view2   | b           | NO
! (4 rows)
! 
! CREATE RULE rw_view1_del_rule AS ON DELETE TO rw_view1
!   DO INSTEAD DELETE FROM base_tbl WHERE a=OLD.a RETURNING OLD.*;
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view1   | YES
!  rw_view2   | YES
! (2 rows)
! 
! SELECT table_name, is_updatable, is_insertable_into
!   FROM information_schema.views
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_updatable | is_insertable_into 
! ------------+--------------+--------------------
!  rw_view1   | YES          | YES
!  rw_view2   | YES          | YES
! (2 rows)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name, ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view1   | a           | YES
!  rw_view1   | b           | YES
!  rw_view2   | a           | YES
!  rw_view2   | b           | YES
! (4 rows)
! 
! INSERT INTO rw_view2 VALUES (3, 'Row 3') RETURNING *;
!  a |   b   
! ---+-------
!  3 | Row 3
! (1 row)
! 
! UPDATE rw_view2 SET b='Row three' WHERE a=3 RETURNING *;
!  a |     b     
! ---+-----------
!  3 | Row three
! (1 row)
! 
! SELECT * FROM rw_view2;
!  a |     b     
! ---+-----------
!  1 | Row 1
!  2 | Row 2
!  3 | Row three
! (3 rows)
! 
! DELETE FROM rw_view2 WHERE a=3 RETURNING *;
!  a |     b     
! ---+-----------
!  3 | Row three
! (1 row)
! 
! SELECT * FROM rw_view2;
!  a |   b   
! ---+-------
!  1 | Row 1
!  2 | Row 2
! (2 rows)
! 
! EXPLAIN (costs off) UPDATE rw_view2 SET a=3 WHERE a=2;
!                            QUERY PLAN                           
! ----------------------------------------------------------------
!  Update on base_tbl
!    ->  Nested Loop
!          ->  Index Scan using base_tbl_pkey on base_tbl
!                Index Cond: (a = 2)
!          ->  Subquery Scan on rw_view1
!                Filter: ((rw_view1.a < 10) AND (rw_view1.a = 2))
!                ->  Bitmap Heap Scan on base_tbl base_tbl_1
!                      Recheck Cond: (a > 0)
!                      ->  Bitmap Index Scan on base_tbl_pkey
!                            Index Cond: (a > 0)
! (10 rows)
! 
! EXPLAIN (costs off) DELETE FROM rw_view2 WHERE a=2;
!                            QUERY PLAN                           
! ----------------------------------------------------------------
!  Delete on base_tbl
!    ->  Nested Loop
!          ->  Index Scan using base_tbl_pkey on base_tbl
!                Index Cond: (a = 2)
!          ->  Subquery Scan on rw_view1
!                Filter: ((rw_view1.a < 10) AND (rw_view1.a = 2))
!                ->  Bitmap Heap Scan on base_tbl base_tbl_1
!                      Recheck Cond: (a > 0)
!                      ->  Bitmap Index Scan on base_tbl_pkey
!                            Index Cond: (a > 0)
! (10 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to view rw_view1
! drop cascades to view rw_view2
! -- view on top of view with triggers
! CREATE TABLE base_tbl (a int PRIMARY KEY, b text DEFAULT 'Unspecified');
! INSERT INTO base_tbl SELECT i, 'Row ' || i FROM generate_series(-2, 2) g(i);
! CREATE VIEW rw_view1 AS SELECT * FROM base_tbl WHERE a>0 OFFSET 0; -- not updatable without rules/triggers
! CREATE VIEW rw_view2 AS SELECT * FROM rw_view1 WHERE a<10;
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view1   | NO
!  rw_view2   | NO
! (2 rows)
! 
! SELECT table_name, is_updatable, is_insertable_into,
!        is_trigger_updatable, is_trigger_deletable,
!        is_trigger_insertable_into
!   FROM information_schema.views
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into 
! ------------+--------------+--------------------+----------------------+----------------------+----------------------------
!  rw_view1   | NO           | NO                 | NO                   | NO                   | NO
!  rw_view2   | NO           | NO                 | NO                   | NO                   | NO
! (2 rows)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name, ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view1   | a           | NO
!  rw_view1   | b           | NO
!  rw_view2   | a           | NO
!  rw_view2   | b           | NO
! (4 rows)
! 
! CREATE FUNCTION rw_view1_trig_fn()
! RETURNS trigger AS
! $$
! BEGIN
!   IF TG_OP = 'INSERT' THEN
!     INSERT INTO base_tbl VALUES (NEW.a, NEW.b);
!     RETURN NEW;
!   ELSIF TG_OP = 'UPDATE' THEN
!     UPDATE base_tbl SET b=NEW.b WHERE a=OLD.a;
!     RETURN NEW;
!   ELSIF TG_OP = 'DELETE' THEN
!     DELETE FROM base_tbl WHERE a=OLD.a;
!     RETURN OLD;
!   END IF;
! END;
! $$
! LANGUAGE plpgsql;
! CREATE TRIGGER rw_view1_ins_trig INSTEAD OF INSERT ON rw_view1
!   FOR EACH ROW EXECUTE PROCEDURE rw_view1_trig_fn();
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view1   | NO
!  rw_view2   | NO
! (2 rows)
! 
! SELECT table_name, is_updatable, is_insertable_into,
!        is_trigger_updatable, is_trigger_deletable,
!        is_trigger_insertable_into
!   FROM information_schema.views
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into 
! ------------+--------------+--------------------+----------------------+----------------------+----------------------------
!  rw_view1   | NO           | NO                 | NO                   | NO                   | YES
!  rw_view2   | NO           | NO                 | NO                   | NO                   | NO
! (2 rows)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name, ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view1   | a           | NO
!  rw_view1   | b           | NO
!  rw_view2   | a           | NO
!  rw_view2   | b           | NO
! (4 rows)
! 
! CREATE TRIGGER rw_view1_upd_trig INSTEAD OF UPDATE ON rw_view1
!   FOR EACH ROW EXECUTE PROCEDURE rw_view1_trig_fn();
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view1   | NO
!  rw_view2   | NO
! (2 rows)
! 
! SELECT table_name, is_updatable, is_insertable_into,
!        is_trigger_updatable, is_trigger_deletable,
!        is_trigger_insertable_into
!   FROM information_schema.views
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into 
! ------------+--------------+--------------------+----------------------+----------------------+----------------------------
!  rw_view1   | NO           | NO                 | YES                  | NO                   | YES
!  rw_view2   | NO           | NO                 | NO                   | NO                   | NO
! (2 rows)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name, ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view1   | a           | NO
!  rw_view1   | b           | NO
!  rw_view2   | a           | NO
!  rw_view2   | b           | NO
! (4 rows)
! 
! CREATE TRIGGER rw_view1_del_trig INSTEAD OF DELETE ON rw_view1
!   FOR EACH ROW EXECUTE PROCEDURE rw_view1_trig_fn();
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view1   | NO
!  rw_view2   | NO
! (2 rows)
! 
! SELECT table_name, is_updatable, is_insertable_into,
!        is_trigger_updatable, is_trigger_deletable,
!        is_trigger_insertable_into
!   FROM information_schema.views
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name;
!  table_name | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into 
! ------------+--------------+--------------------+----------------------+----------------------+----------------------------
!  rw_view1   | NO           | NO                 | YES                  | YES                  | YES
!  rw_view2   | NO           | NO                 | NO                   | NO                   | NO
! (2 rows)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name LIKE 'rw_view%'
!  ORDER BY table_name, ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view1   | a           | NO
!  rw_view1   | b           | NO
!  rw_view2   | a           | NO
!  rw_view2   | b           | NO
! (4 rows)
! 
! INSERT INTO rw_view2 VALUES (3, 'Row 3') RETURNING *;
!  a |   b   
! ---+-------
!  3 | Row 3
! (1 row)
! 
! UPDATE rw_view2 SET b='Row three' WHERE a=3 RETURNING *;
!  a |     b     
! ---+-----------
!  3 | Row three
! (1 row)
! 
! SELECT * FROM rw_view2;
!  a |     b     
! ---+-----------
!  1 | Row 1
!  2 | Row 2
!  3 | Row three
! (3 rows)
! 
! DELETE FROM rw_view2 WHERE a=3 RETURNING *;
!  a |     b     
! ---+-----------
!  3 | Row three
! (1 row)
! 
! SELECT * FROM rw_view2;
!  a |   b   
! ---+-------
!  1 | Row 1
!  2 | Row 2
! (2 rows)
! 
! EXPLAIN (costs off) UPDATE rw_view2 SET a=3 WHERE a=2;
!                         QUERY PLAN                        
! ----------------------------------------------------------
!  Update on rw_view1 rw_view1_1
!    ->  Subquery Scan on rw_view1
!          Filter: ((rw_view1.a < 10) AND (rw_view1.a = 2))
!          ->  Bitmap Heap Scan on base_tbl
!                Recheck Cond: (a > 0)
!                ->  Bitmap Index Scan on base_tbl_pkey
!                      Index Cond: (a > 0)
! (7 rows)
! 
! EXPLAIN (costs off) DELETE FROM rw_view2 WHERE a=2;
!                         QUERY PLAN                        
! ----------------------------------------------------------
!  Delete on rw_view1 rw_view1_1
!    ->  Subquery Scan on rw_view1
!          Filter: ((rw_view1.a < 10) AND (rw_view1.a = 2))
!          ->  Bitmap Heap Scan on base_tbl
!                Recheck Cond: (a > 0)
!                ->  Bitmap Index Scan on base_tbl_pkey
!                      Index Cond: (a > 0)
! (7 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to view rw_view1
! drop cascades to view rw_view2
! DROP FUNCTION rw_view1_trig_fn();
! -- update using whole row from view
! CREATE TABLE base_tbl (a int PRIMARY KEY, b text DEFAULT 'Unspecified');
! INSERT INTO base_tbl SELECT i, 'Row ' || i FROM generate_series(-2, 2) g(i);
! CREATE VIEW rw_view1 AS SELECT b AS bb, a AS aa FROM base_tbl;
! CREATE FUNCTION rw_view1_aa(x rw_view1)
!   RETURNS int AS $$ SELECT x.aa $$ LANGUAGE sql;
! UPDATE rw_view1 v SET bb='Updated row 2' WHERE rw_view1_aa(v)=2
!   RETURNING rw_view1_aa(v), v.bb;
!  rw_view1_aa |      bb       
! -------------+---------------
!            2 | Updated row 2
! (1 row)
! 
! SELECT * FROM base_tbl;
!  a  |       b       
! ----+---------------
!  -2 | Row -2
!  -1 | Row -1
!   0 | Row 0
!   1 | Row 1
!   2 | Updated row 2
! (5 rows)
! 
! EXPLAIN (costs off)
! UPDATE rw_view1 v SET bb='Updated row 2' WHERE rw_view1_aa(v)=2
!   RETURNING rw_view1_aa(v), v.bb;
!                     QUERY PLAN                    
! --------------------------------------------------
!  Update on base_tbl
!    ->  Index Scan using base_tbl_pkey on base_tbl
!          Index Cond: (a = 2)
! (3 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to view rw_view1
! drop cascades to function rw_view1_aa(rw_view1)
! -- permissions checks
! CREATE USER view_user1;
! CREATE USER view_user2;
! SET SESSION AUTHORIZATION view_user1;
! CREATE TABLE base_tbl(a int, b text, c float);
! INSERT INTO base_tbl VALUES (1, 'Row 1', 1.0);
! CREATE VIEW rw_view1 AS SELECT b AS bb, c AS cc, a AS aa FROM base_tbl;
! INSERT INTO rw_view1 VALUES ('Row 2', 2.0, 2);
! GRANT SELECT ON base_tbl TO view_user2;
! GRANT SELECT ON rw_view1 TO view_user2;
! GRANT UPDATE (a,c) ON base_tbl TO view_user2;
! GRANT UPDATE (bb,cc) ON rw_view1 TO view_user2;
! RESET SESSION AUTHORIZATION;
! SET SESSION AUTHORIZATION view_user2;
! CREATE VIEW rw_view2 AS SELECT b AS bb, c AS cc, a AS aa FROM base_tbl;
! SELECT * FROM base_tbl; -- ok
!  a |   b   | c 
! ---+-------+---
!  1 | Row 1 | 1
!  2 | Row 2 | 2
! (2 rows)
! 
! SELECT * FROM rw_view1; -- ok
!   bb   | cc | aa 
! -------+----+----
!  Row 1 |  1 |  1
!  Row 2 |  2 |  2
! (2 rows)
! 
! SELECT * FROM rw_view2; -- ok
!   bb   | cc | aa 
! -------+----+----
!  Row 1 |  1 |  1
!  Row 2 |  2 |  2
! (2 rows)
! 
! INSERT INTO base_tbl VALUES (3, 'Row 3', 3.0); -- not allowed
! ERROR:  permission denied for relation base_tbl
! INSERT INTO rw_view1 VALUES ('Row 3', 3.0, 3); -- not allowed
! ERROR:  permission denied for relation rw_view1
! INSERT INTO rw_view2 VALUES ('Row 3', 3.0, 3); -- not allowed
! ERROR:  permission denied for relation base_tbl
! UPDATE base_tbl SET a=a, c=c; -- ok
! UPDATE base_tbl SET b=b; -- not allowed
! ERROR:  permission denied for relation base_tbl
! UPDATE rw_view1 SET bb=bb, cc=cc; -- ok
! UPDATE rw_view1 SET aa=aa; -- not allowed
! ERROR:  permission denied for relation rw_view1
! UPDATE rw_view2 SET aa=aa, cc=cc; -- ok
! UPDATE rw_view2 SET bb=bb; -- not allowed
! ERROR:  permission denied for relation base_tbl
! DELETE FROM base_tbl; -- not allowed
! ERROR:  permission denied for relation base_tbl
! DELETE FROM rw_view1; -- not allowed
! ERROR:  permission denied for relation rw_view1
! DELETE FROM rw_view2; -- not allowed
! ERROR:  permission denied for relation base_tbl
! RESET SESSION AUTHORIZATION;
! SET SESSION AUTHORIZATION view_user1;
! GRANT INSERT, DELETE ON base_tbl TO view_user2;
! RESET SESSION AUTHORIZATION;
! SET SESSION AUTHORIZATION view_user2;
! INSERT INTO base_tbl VALUES (3, 'Row 3', 3.0); -- ok
! INSERT INTO rw_view1 VALUES ('Row 4', 4.0, 4); -- not allowed
! ERROR:  permission denied for relation rw_view1
! INSERT INTO rw_view2 VALUES ('Row 4', 4.0, 4); -- ok
! DELETE FROM base_tbl WHERE a=1; -- ok
! DELETE FROM rw_view1 WHERE aa=2; -- not allowed
! ERROR:  permission denied for relation rw_view1
! DELETE FROM rw_view2 WHERE aa=2; -- ok
! SELECT * FROM base_tbl;
!  a |   b   | c 
! ---+-------+---
!  3 | Row 3 | 3
!  4 | Row 4 | 4
! (2 rows)
! 
! RESET SESSION AUTHORIZATION;
! SET SESSION AUTHORIZATION view_user1;
! REVOKE INSERT, DELETE ON base_tbl FROM view_user2;
! GRANT INSERT, DELETE ON rw_view1 TO view_user2;
! RESET SESSION AUTHORIZATION;
! SET SESSION AUTHORIZATION view_user2;
! INSERT INTO base_tbl VALUES (5, 'Row 5', 5.0); -- not allowed
! ERROR:  permission denied for relation base_tbl
! INSERT INTO rw_view1 VALUES ('Row 5', 5.0, 5); -- ok
! INSERT INTO rw_view2 VALUES ('Row 6', 6.0, 6); -- not allowed
! ERROR:  permission denied for relation base_tbl
! DELETE FROM base_tbl WHERE a=3; -- not allowed
! ERROR:  permission denied for relation base_tbl
! DELETE FROM rw_view1 WHERE aa=3; -- ok
! DELETE FROM rw_view2 WHERE aa=4; -- not allowed
! ERROR:  permission denied for relation base_tbl
! SELECT * FROM base_tbl;
!  a |   b   | c 
! ---+-------+---
!  4 | Row 4 | 4
!  5 | Row 5 | 5
! (2 rows)
! 
! RESET SESSION AUTHORIZATION;
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to view rw_view1
! drop cascades to view rw_view2
! DROP USER view_user1;
! DROP USER view_user2;
! -- column defaults
! CREATE TABLE base_tbl (a int PRIMARY KEY, b text DEFAULT 'Unspecified', c serial);
! INSERT INTO base_tbl VALUES (1, 'Row 1');
! INSERT INTO base_tbl VALUES (2, 'Row 2');
! INSERT INTO base_tbl VALUES (3);
! CREATE VIEW rw_view1 AS SELECT a AS aa, b AS bb FROM base_tbl;
! ALTER VIEW rw_view1 ALTER COLUMN bb SET DEFAULT 'View default';
! INSERT INTO rw_view1 VALUES (4, 'Row 4');
! INSERT INTO rw_view1 (aa) VALUES (5);
! SELECT * FROM base_tbl;
!  a |      b       | c 
! ---+--------------+---
!  1 | Row 1        | 1
!  2 | Row 2        | 2
!  3 | Unspecified  | 3
!  4 | Row 4        | 4
!  5 | View default | 5
! (5 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to view rw_view1
! -- Table having triggers
! CREATE TABLE base_tbl (a int PRIMARY KEY, b text DEFAULT 'Unspecified');
! INSERT INTO base_tbl VALUES (1, 'Row 1');
! INSERT INTO base_tbl VALUES (2, 'Row 2');
! CREATE FUNCTION rw_view1_trig_fn()
! RETURNS trigger AS
! $$
! BEGIN
!   IF TG_OP = 'INSERT' THEN
!     UPDATE base_tbl SET b=NEW.b WHERE a=1;
!     RETURN NULL;
!   END IF;
!   RETURN NULL;
! END;
! $$
! LANGUAGE plpgsql;
! CREATE TRIGGER rw_view1_ins_trig AFTER INSERT ON base_tbl
!   FOR EACH ROW EXECUTE PROCEDURE rw_view1_trig_fn();
! CREATE VIEW rw_view1 AS SELECT a AS aa, b AS bb FROM base_tbl;
! INSERT INTO rw_view1 VALUES (3, 'Row 3');
! select * from base_tbl;
!  a |   b   
! ---+-------
!  2 | Row 2
!  3 | Row 3
!  1 | Row 3
! (3 rows)
! 
! DROP VIEW rw_view1;
! DROP TRIGGER rw_view1_ins_trig on base_tbl;
! DROP FUNCTION rw_view1_trig_fn();
! DROP TABLE base_tbl;
! -- view with ORDER BY
! CREATE TABLE base_tbl (a int, b int);
! INSERT INTO base_tbl VALUES (1,2), (4,5), (3,-3);
! CREATE VIEW rw_view1 AS SELECT * FROM base_tbl ORDER BY a+b;
! SELECT * FROM rw_view1;
!  a | b  
! ---+----
!  3 | -3
!  1 |  2
!  4 |  5
! (3 rows)
! 
! INSERT INTO rw_view1 VALUES (7,-8);
! SELECT * FROM rw_view1;
!  a | b  
! ---+----
!  7 | -8
!  3 | -3
!  1 |  2
!  4 |  5
! (4 rows)
! 
! EXPLAIN (verbose, costs off) UPDATE rw_view1 SET b = b + 1 RETURNING *;
!                          QUERY PLAN                          
! -------------------------------------------------------------
!  Update on public.base_tbl
!    Output: base_tbl.a, base_tbl.b
!    ->  Seq Scan on public.base_tbl
!          Output: base_tbl.a, (base_tbl.b + 1), base_tbl.ctid
! (4 rows)
! 
! UPDATE rw_view1 SET b = b + 1 RETURNING *;
!  a | b  
! ---+----
!  1 |  3
!  4 |  6
!  3 | -2
!  7 | -7
! (4 rows)
! 
! SELECT * FROM rw_view1;
!  a | b  
! ---+----
!  7 | -7
!  3 | -2
!  1 |  3
!  4 |  6
! (4 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to view rw_view1
! -- multiple array-column updates
! CREATE TABLE base_tbl (a int, arr int[]);
! INSERT INTO base_tbl VALUES (1,ARRAY[2]), (3,ARRAY[4]);
! CREATE VIEW rw_view1 AS SELECT * FROM base_tbl;
! UPDATE rw_view1 SET arr[1] = 42, arr[2] = 77 WHERE a = 3;
! SELECT * FROM rw_view1;
!  a |   arr   
! ---+---------
!  1 | {2}
!  3 | {42,77}
! (2 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to view rw_view1
! -- views with updatable and non-updatable columns
! CREATE TABLE base_tbl(a float);
! INSERT INTO base_tbl SELECT i/10.0 FROM generate_series(1,10) g(i);
! CREATE VIEW rw_view1 AS
!   SELECT ctid, sin(a) s, a, cos(a) c
!   FROM base_tbl
!   WHERE a != 0
!   ORDER BY abs(a);
! INSERT INTO rw_view1 VALUES (null, null, 1.1, null); -- should fail
! ERROR:  cannot insert into column "ctid" of view "rw_view1"
! DETAIL:  View columns that refer to system columns are not updatable.
! INSERT INTO rw_view1 (s, c, a) VALUES (null, null, 1.1); -- should fail
! ERROR:  cannot insert into column "s" of view "rw_view1"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! INSERT INTO rw_view1 (a) VALUES (1.1) RETURNING a, s, c; -- OK
!   a  |         s         |         c         
! -----+-------------------+-------------------
!  1.1 | 0.891207360061435 | 0.453596121425577
! (1 row)
! 
! UPDATE rw_view1 SET s = s WHERE a = 1.1; -- should fail
! ERROR:  cannot update column "s" of view "rw_view1"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! UPDATE rw_view1 SET a = 1.05 WHERE a = 1.1 RETURNING s; -- OK
!          s         
! -------------------
!  0.867423225594017
! (1 row)
! 
! DELETE FROM rw_view1 WHERE a = 1.05; -- OK
! CREATE VIEW rw_view2 AS
!   SELECT s, c, s/c t, a base_a, ctid
!   FROM rw_view1;
! INSERT INTO rw_view2 VALUES (null, null, null, 1.1, null); -- should fail
! ERROR:  cannot insert into column "t" of view "rw_view2"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! INSERT INTO rw_view2(s, c, base_a) VALUES (null, null, 1.1); -- should fail
! ERROR:  cannot insert into column "s" of view "rw_view1"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! INSERT INTO rw_view2(base_a) VALUES (1.1) RETURNING t; -- OK
!         t         
! ------------------
!  1.96475965724865
! (1 row)
! 
! UPDATE rw_view2 SET s = s WHERE base_a = 1.1; -- should fail
! ERROR:  cannot update column "s" of view "rw_view1"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! UPDATE rw_view2 SET t = t WHERE base_a = 1.1; -- should fail
! ERROR:  cannot update column "t" of view "rw_view2"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! UPDATE rw_view2 SET base_a = 1.05 WHERE base_a = 1.1; -- OK
! DELETE FROM rw_view2 WHERE base_a = 1.05 RETURNING base_a, s, c, t; -- OK
!  base_a |         s         |         c         |        t         
! --------+-------------------+-------------------+------------------
!    1.05 | 0.867423225594017 | 0.497571047891727 | 1.74331530998317
! (1 row)
! 
! CREATE VIEW rw_view3 AS
!   SELECT s, c, s/c t, ctid
!   FROM rw_view1;
! INSERT INTO rw_view3 VALUES (null, null, null, null); -- should fail
! ERROR:  cannot insert into column "t" of view "rw_view3"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! INSERT INTO rw_view3(s) VALUES (null); -- should fail
! ERROR:  cannot insert into column "s" of view "rw_view1"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! UPDATE rw_view3 SET s = s; -- should fail
! ERROR:  cannot update column "s" of view "rw_view1"
! DETAIL:  View columns that are not columns of their base relation are not updatable.
! DELETE FROM rw_view3 WHERE s = sin(0.1); -- should be OK
! SELECT * FROM base_tbl ORDER BY a;
!   a  
! -----
!  0.2
!  0.3
!  0.4
!  0.5
!  0.6
!  0.7
!  0.8
!  0.9
!    1
! (9 rows)
! 
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name LIKE E'r_\\_view%'
!  ORDER BY table_name;
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view1   | YES
!  rw_view2   | YES
!  rw_view3   | NO
! (3 rows)
! 
! SELECT table_name, is_updatable, is_insertable_into
!   FROM information_schema.views
!  WHERE table_name LIKE E'r_\\_view%'
!  ORDER BY table_name;
!  table_name | is_updatable | is_insertable_into 
! ------------+--------------+--------------------
!  rw_view1   | YES          | YES
!  rw_view2   | YES          | YES
!  rw_view3   | NO           | NO
! (3 rows)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name LIKE E'r_\\_view%'
!  ORDER BY table_name, ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view1   | ctid        | NO
!  rw_view1   | s           | NO
!  rw_view1   | a           | YES
!  rw_view1   | c           | NO
!  rw_view2   | s           | NO
!  rw_view2   | c           | NO
!  rw_view2   | t           | NO
!  rw_view2   | base_a      | YES
!  rw_view2   | ctid        | NO
!  rw_view3   | s           | NO
!  rw_view3   | c           | NO
!  rw_view3   | t           | NO
!  rw_view3   | ctid        | NO
! (13 rows)
! 
! SELECT events & 4 != 0 AS upd,
!        events & 8 != 0 AS ins,
!        events & 16 != 0 AS del
!   FROM pg_catalog.pg_relation_is_updatable('rw_view3'::regclass, false) t(events);
!  upd | ins | del 
! -----+-----+-----
!  f   | f   | t
! (1 row)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to 3 other objects
! DETAIL:  drop cascades to view rw_view1
! drop cascades to view rw_view2
! drop cascades to view rw_view3
! -- inheritance tests
! CREATE TABLE base_tbl_parent (a int);
! CREATE TABLE base_tbl_child (CHECK (a > 0)) INHERITS (base_tbl_parent);
! INSERT INTO base_tbl_parent SELECT * FROM generate_series(-8, -1);
! INSERT INTO base_tbl_child SELECT * FROM generate_series(1, 8);
! CREATE VIEW rw_view1 AS SELECT * FROM base_tbl_parent;
! CREATE VIEW rw_view2 AS SELECT * FROM ONLY base_tbl_parent;
! SELECT * FROM rw_view1 ORDER BY a;
!  a  
! ----
!  -8
!  -7
!  -6
!  -5
!  -4
!  -3
!  -2
!  -1
!   1
!   2
!   3
!   4
!   5
!   6
!   7
!   8
! (16 rows)
! 
! SELECT * FROM ONLY rw_view1 ORDER BY a;
!  a  
! ----
!  -8
!  -7
!  -6
!  -5
!  -4
!  -3
!  -2
!  -1
!   1
!   2
!   3
!   4
!   5
!   6
!   7
!   8
! (16 rows)
! 
! SELECT * FROM rw_view2 ORDER BY a;
!  a  
! ----
!  -8
!  -7
!  -6
!  -5
!  -4
!  -3
!  -2
!  -1
! (8 rows)
! 
! INSERT INTO rw_view1 VALUES (-100), (100);
! INSERT INTO rw_view2 VALUES (-200), (200);
! UPDATE rw_view1 SET a = a*10 WHERE a IN (-1, 1); -- Should produce -10 and 10
! UPDATE ONLY rw_view1 SET a = a*10 WHERE a IN (-2, 2); -- Should produce -20 and 20
! UPDATE rw_view2 SET a = a*10 WHERE a IN (-3, 3); -- Should produce -30 only
! UPDATE ONLY rw_view2 SET a = a*10 WHERE a IN (-4, 4); -- Should produce -40 only
! DELETE FROM rw_view1 WHERE a IN (-5, 5); -- Should delete -5 and 5
! DELETE FROM ONLY rw_view1 WHERE a IN (-6, 6); -- Should delete -6 and 6
! DELETE FROM rw_view2 WHERE a IN (-7, 7); -- Should delete -7 only
! DELETE FROM ONLY rw_view2 WHERE a IN (-8, 8); -- Should delete -8 only
! SELECT * FROM ONLY base_tbl_parent ORDER BY a;
!   a   
! ------
!  -200
!  -100
!   -40
!   -30
!   -20
!   -10
!   100
!   200
! (8 rows)
! 
! SELECT * FROM base_tbl_child ORDER BY a;
!  a  
! ----
!   3
!   4
!   7
!   8
!  10
!  20
! (6 rows)
! 
! DROP TABLE base_tbl_parent, base_tbl_child CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to view rw_view1
! drop cascades to view rw_view2
! -- simple WITH CHECK OPTION
! CREATE TABLE base_tbl (a int, b int DEFAULT 10);
! INSERT INTO base_tbl VALUES (1,2), (2,3), (1,-1);
! CREATE VIEW rw_view1 AS SELECT * FROM base_tbl WHERE a < b
!   WITH LOCAL CHECK OPTION;
! \d+ rw_view1
!                 View "public.rw_view1"
!  Column |  Type   | Modifiers | Storage | Description 
! --------+---------+-----------+---------+-------------
!  a      | integer |           | plain   | 
!  b      | integer |           | plain   | 
! View definition:
!  SELECT base_tbl.a,
!     base_tbl.b
!    FROM base_tbl
!   WHERE base_tbl.a < base_tbl.b;
! Options: check_option=local
! 
! SELECT * FROM information_schema.views WHERE table_name = 'rw_view1';
!  table_catalog | table_schema | table_name |          view_definition           | check_option | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into 
! ---------------+--------------+------------+------------------------------------+--------------+--------------+--------------------+----------------------+----------------------+----------------------------
!  regression    | public       | rw_view1   |  SELECT base_tbl.a,               +| LOCAL        | YES          | YES                | NO                   | NO                   | NO
!                |              |            |     base_tbl.b                    +|              |              |                    |                      |                      | 
!                |              |            |    FROM base_tbl                  +|              |              |                    |                      |                      | 
!                |              |            |   WHERE (base_tbl.a < base_tbl.b); |              |              |                    |                      |                      | 
! (1 row)
! 
! INSERT INTO rw_view1 VALUES(3,4); -- ok
! INSERT INTO rw_view1 VALUES(4,3); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (4, 3).
! INSERT INTO rw_view1 VALUES(5,null); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (5, null).
! UPDATE rw_view1 SET b = 5 WHERE a = 3; -- ok
! UPDATE rw_view1 SET b = -5 WHERE a = 3; -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (3, -5).
! INSERT INTO rw_view1(a) VALUES (9); -- ok
! INSERT INTO rw_view1(a) VALUES (10); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (10, 10).
! SELECT * FROM base_tbl;
!  a | b  
! ---+----
!  1 |  2
!  2 |  3
!  1 | -1
!  3 |  5
!  9 | 10
! (5 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to view rw_view1
! -- WITH LOCAL/CASCADED CHECK OPTION
! CREATE TABLE base_tbl (a int);
! CREATE VIEW rw_view1 AS SELECT * FROM base_tbl WHERE a > 0;
! CREATE VIEW rw_view2 AS SELECT * FROM rw_view1 WHERE a < 10
!   WITH CHECK OPTION; -- implicitly cascaded
! \d+ rw_view2
!                 View "public.rw_view2"
!  Column |  Type   | Modifiers | Storage | Description 
! --------+---------+-----------+---------+-------------
!  a      | integer |           | plain   | 
! View definition:
!  SELECT rw_view1.a
!    FROM rw_view1
!   WHERE rw_view1.a < 10;
! Options: check_option=cascaded
! 
! SELECT * FROM information_schema.views WHERE table_name = 'rw_view2';
!  table_catalog | table_schema | table_name |      view_definition       | check_option | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into 
! ---------------+--------------+------------+----------------------------+--------------+--------------+--------------------+----------------------+----------------------+----------------------------
!  regression    | public       | rw_view2   |  SELECT rw_view1.a        +| CASCADED     | YES          | YES                | NO                   | NO                   | NO
!                |              |            |    FROM rw_view1          +|              |              |                    |                      |                      | 
!                |              |            |   WHERE (rw_view1.a < 10); |              |              |                    |                      |                      | 
! (1 row)
! 
! INSERT INTO rw_view2 VALUES (-5); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (-5).
! INSERT INTO rw_view2 VALUES (5); -- ok
! INSERT INTO rw_view2 VALUES (15); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view2"
! DETAIL:  Failing row contains (15).
! SELECT * FROM base_tbl;
!  a 
! ---
!  5
! (1 row)
! 
! UPDATE rw_view2 SET a = a - 10; -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (-5).
! UPDATE rw_view2 SET a = a + 10; -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view2"
! DETAIL:  Failing row contains (15).
! CREATE OR REPLACE VIEW rw_view2 AS SELECT * FROM rw_view1 WHERE a < 10
!   WITH LOCAL CHECK OPTION;
! \d+ rw_view2
!                 View "public.rw_view2"
!  Column |  Type   | Modifiers | Storage | Description 
! --------+---------+-----------+---------+-------------
!  a      | integer |           | plain   | 
! View definition:
!  SELECT rw_view1.a
!    FROM rw_view1
!   WHERE rw_view1.a < 10;
! Options: check_option=local
! 
! SELECT * FROM information_schema.views WHERE table_name = 'rw_view2';
!  table_catalog | table_schema | table_name |      view_definition       | check_option | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into 
! ---------------+--------------+------------+----------------------------+--------------+--------------+--------------------+----------------------+----------------------+----------------------------
!  regression    | public       | rw_view2   |  SELECT rw_view1.a        +| LOCAL        | YES          | YES                | NO                   | NO                   | NO
!                |              |            |    FROM rw_view1          +|              |              |                    |                      |                      | 
!                |              |            |   WHERE (rw_view1.a < 10); |              |              |                    |                      |                      | 
! (1 row)
! 
! INSERT INTO rw_view2 VALUES (-10); -- ok, but not in view
! INSERT INTO rw_view2 VALUES (20); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view2"
! DETAIL:  Failing row contains (20).
! SELECT * FROM base_tbl;
!   a  
! -----
!    5
!  -10
! (2 rows)
! 
! ALTER VIEW rw_view1 SET (check_option=here); -- invalid
! ERROR:  invalid value for "check_option" option
! DETAIL:  Valid values are "local" and "cascaded".
! ALTER VIEW rw_view1 SET (check_option=local);
! INSERT INTO rw_view2 VALUES (-20); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (-20).
! INSERT INTO rw_view2 VALUES (30); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view2"
! DETAIL:  Failing row contains (30).
! ALTER VIEW rw_view2 RESET (check_option);
! \d+ rw_view2
!                 View "public.rw_view2"
!  Column |  Type   | Modifiers | Storage | Description 
! --------+---------+-----------+---------+-------------
!  a      | integer |           | plain   | 
! View definition:
!  SELECT rw_view1.a
!    FROM rw_view1
!   WHERE rw_view1.a < 10;
! 
! SELECT * FROM information_schema.views WHERE table_name = 'rw_view2';
!  table_catalog | table_schema | table_name |      view_definition       | check_option | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into 
! ---------------+--------------+------------+----------------------------+--------------+--------------+--------------------+----------------------+----------------------+----------------------------
!  regression    | public       | rw_view2   |  SELECT rw_view1.a        +| NONE         | YES          | YES                | NO                   | NO                   | NO
!                |              |            |    FROM rw_view1          +|              |              |                    |                      |                      | 
!                |              |            |   WHERE (rw_view1.a < 10); |              |              |                    |                      |                      | 
! (1 row)
! 
! INSERT INTO rw_view2 VALUES (30); -- ok, but not in view
! SELECT * FROM base_tbl;
!   a  
! -----
!    5
!  -10
!   30
! (3 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to view rw_view1
! drop cascades to view rw_view2
! -- WITH CHECK OPTION with no local view qual
! CREATE TABLE base_tbl (a int);
! CREATE VIEW rw_view1 AS SELECT * FROM base_tbl WITH CHECK OPTION;
! CREATE VIEW rw_view2 AS SELECT * FROM rw_view1 WHERE a > 0;
! CREATE VIEW rw_view3 AS SELECT * FROM rw_view2 WITH CHECK OPTION;
! SELECT * FROM information_schema.views WHERE table_name LIKE E'rw\\_view_' ORDER BY table_name;
!  table_catalog | table_schema | table_name |      view_definition      | check_option | is_updatable | is_insertable_into | is_trigger_updatable | is_trigger_deletable | is_trigger_insertable_into 
! ---------------+--------------+------------+---------------------------+--------------+--------------+--------------------+----------------------+----------------------+----------------------------
!  regression    | public       | rw_view1   |  SELECT base_tbl.a       +| CASCADED     | YES          | YES                | NO                   | NO                   | NO
!                |              |            |    FROM base_tbl;         |              |              |                    |                      |                      | 
!  regression    | public       | rw_view2   |  SELECT rw_view1.a       +| NONE         | YES          | YES                | NO                   | NO                   | NO
!                |              |            |    FROM rw_view1         +|              |              |                    |                      |                      | 
!                |              |            |   WHERE (rw_view1.a > 0); |              |              |                    |                      |                      | 
!  regression    | public       | rw_view3   |  SELECT rw_view2.a       +| CASCADED     | YES          | YES                | NO                   | NO                   | NO
!                |              |            |    FROM rw_view2;         |              |              |                    |                      |                      | 
! (3 rows)
! 
! INSERT INTO rw_view1 VALUES (-1); -- ok
! INSERT INTO rw_view1 VALUES (1); -- ok
! INSERT INTO rw_view2 VALUES (-2); -- ok, but not in view
! INSERT INTO rw_view2 VALUES (2); -- ok
! INSERT INTO rw_view3 VALUES (-3); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view2"
! DETAIL:  Failing row contains (-3).
! INSERT INTO rw_view3 VALUES (3); -- ok
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to 3 other objects
! DETAIL:  drop cascades to view rw_view1
! drop cascades to view rw_view2
! drop cascades to view rw_view3
! -- WITH CHECK OPTION with scalar array ops
! CREATE TABLE base_tbl (a int, b int[]);
! CREATE VIEW rw_view1 AS SELECT * FROM base_tbl WHERE a = ANY (b)
!   WITH CHECK OPTION;
! INSERT INTO rw_view1 VALUES (1, ARRAY[1,2,3]); -- ok
! INSERT INTO rw_view1 VALUES (10, ARRAY[4,5]); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (10, {4,5}).
! UPDATE rw_view1 SET b[2] = -b[2] WHERE a = 1; -- ok
! UPDATE rw_view1 SET b[1] = -b[1] WHERE a = 1; -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (1, {-1,-2,3}).
! PREPARE ins(int, int[]) AS INSERT INTO rw_view1 VALUES($1, $2);
! EXECUTE ins(2, ARRAY[1,2,3]); -- ok
! EXECUTE ins(10, ARRAY[4,5]); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (10, {4,5}).
! DEALLOCATE PREPARE ins;
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to view rw_view1
! -- WITH CHECK OPTION with subquery
! CREATE TABLE base_tbl (a int);
! CREATE TABLE ref_tbl (a int PRIMARY KEY);
! INSERT INTO ref_tbl SELECT * FROM generate_series(1,10);
! CREATE VIEW rw_view1 AS
!   SELECT * FROM base_tbl b
!   WHERE EXISTS(SELECT 1 FROM ref_tbl r WHERE r.a = b.a)
!   WITH CHECK OPTION;
! INSERT INTO rw_view1 VALUES (5); -- ok
! INSERT INTO rw_view1 VALUES (15); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (15).
! UPDATE rw_view1 SET a = a + 5; -- ok
! UPDATE rw_view1 SET a = a + 5; -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (15).
! EXPLAIN (costs off) INSERT INTO rw_view1 VALUES (5);
!                           QUERY PLAN                           
! ---------------------------------------------------------------
!  Insert on base_tbl b
!    ->  Result
!          SubPlan 1
!            ->  Index Only Scan using ref_tbl_pkey on ref_tbl r
!                  Index Cond: (a = b.a)
!          SubPlan 2
!            ->  Seq Scan on ref_tbl r_1
! (7 rows)
! 
! EXPLAIN (costs off) UPDATE rw_view1 SET a = a + 5;
!                            QUERY PLAN                            
! -----------------------------------------------------------------
!  Update on base_tbl b
!    ->  Hash Semi Join
!          Hash Cond: (b.a = r.a)
!          ->  Seq Scan on base_tbl b
!          ->  Hash
!                ->  Seq Scan on ref_tbl r
!          SubPlan 1
!            ->  Index Only Scan using ref_tbl_pkey on ref_tbl r_1
!                  Index Cond: (a = b.a)
!          SubPlan 2
!            ->  Seq Scan on ref_tbl r_2
! (11 rows)
! 
! DROP TABLE base_tbl, ref_tbl CASCADE;
! NOTICE:  drop cascades to view rw_view1
! -- WITH CHECK OPTION with BEFORE trigger on base table
! CREATE TABLE base_tbl (a int, b int);
! CREATE FUNCTION base_tbl_trig_fn()
! RETURNS trigger AS
! $$
! BEGIN
!   NEW.b := 10;
!   RETURN NEW;
! END;
! $$
! LANGUAGE plpgsql;
! CREATE TRIGGER base_tbl_trig BEFORE INSERT OR UPDATE ON base_tbl
!   FOR EACH ROW EXECUTE PROCEDURE base_tbl_trig_fn();
! CREATE VIEW rw_view1 AS SELECT * FROM base_tbl WHERE a < b WITH CHECK OPTION;
! INSERT INTO rw_view1 VALUES (5,0); -- ok
! INSERT INTO rw_view1 VALUES (15, 20); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (15, 10).
! UPDATE rw_view1 SET a = 20, b = 30; -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view1"
! DETAIL:  Failing row contains (20, 10).
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to view rw_view1
! DROP FUNCTION base_tbl_trig_fn();
! -- WITH LOCAL CHECK OPTION with INSTEAD OF trigger on base view
! CREATE TABLE base_tbl (a int, b int);
! CREATE VIEW rw_view1 AS SELECT a FROM base_tbl WHERE a < b;
! CREATE FUNCTION rw_view1_trig_fn()
! RETURNS trigger AS
! $$
! BEGIN
!   IF TG_OP = 'INSERT' THEN
!     INSERT INTO base_tbl VALUES (NEW.a, 10);
!     RETURN NEW;
!   ELSIF TG_OP = 'UPDATE' THEN
!     UPDATE base_tbl SET a=NEW.a WHERE a=OLD.a;
!     RETURN NEW;
!   ELSIF TG_OP = 'DELETE' THEN
!     DELETE FROM base_tbl WHERE a=OLD.a;
!     RETURN OLD;
!   END IF;
! END;
! $$
! LANGUAGE plpgsql;
! CREATE TRIGGER rw_view1_trig
!   INSTEAD OF INSERT OR UPDATE OR DELETE ON rw_view1
!   FOR EACH ROW EXECUTE PROCEDURE rw_view1_trig_fn();
! CREATE VIEW rw_view2 AS
!   SELECT * FROM rw_view1 WHERE a > 0 WITH LOCAL CHECK OPTION;
! INSERT INTO rw_view2 VALUES (-5); -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view2"
! DETAIL:  Failing row contains (-5).
! INSERT INTO rw_view2 VALUES (5); -- ok
! INSERT INTO rw_view2 VALUES (50); -- ok, but not in view
! UPDATE rw_view2 SET a = a - 10; -- should fail
! ERROR:  new row violates WITH CHECK OPTION for "rw_view2"
! DETAIL:  Failing row contains (-5).
! SELECT * FROM base_tbl;
!  a  | b  
! ----+----
!   5 | 10
!  50 | 10
! (2 rows)
! 
! -- Check option won't cascade down to base view with INSTEAD OF triggers
! ALTER VIEW rw_view2 SET (check_option=cascaded);
! INSERT INTO rw_view2 VALUES (100); -- ok, but not in view (doesn't fail rw_view1's check)
! UPDATE rw_view2 SET a = 200 WHERE a = 5; -- ok, but not in view (doesn't fail rw_view1's check)
! SELECT * FROM base_tbl;
!   a  | b  
! -----+----
!   50 | 10
!  100 | 10
!  200 | 10
! (3 rows)
! 
! -- Neither local nor cascaded check options work with INSTEAD rules
! DROP TRIGGER rw_view1_trig ON rw_view1;
! CREATE RULE rw_view1_ins_rule AS ON INSERT TO rw_view1
!   DO INSTEAD INSERT INTO base_tbl VALUES (NEW.a, 10);
! CREATE RULE rw_view1_upd_rule AS ON UPDATE TO rw_view1
!   DO INSTEAD UPDATE base_tbl SET a=NEW.a WHERE a=OLD.a;
! INSERT INTO rw_view2 VALUES (-10); -- ok, but not in view (doesn't fail rw_view2's check)
! INSERT INTO rw_view2 VALUES (5); -- ok
! INSERT INTO rw_view2 VALUES (20); -- ok, but not in view (doesn't fail rw_view1's check)
! UPDATE rw_view2 SET a = 30 WHERE a = 5; -- ok, but not in view (doesn't fail rw_view1's check)
! INSERT INTO rw_view2 VALUES (5); -- ok
! UPDATE rw_view2 SET a = -5 WHERE a = 5; -- ok, but not in view (doesn't fail rw_view2's check)
! SELECT * FROM base_tbl;
!   a  | b  
! -----+----
!   50 | 10
!  100 | 10
!  200 | 10
!  -10 | 10
!   20 | 10
!   30 | 10
!   -5 | 10
! (7 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to view rw_view1
! drop cascades to view rw_view2
! DROP FUNCTION rw_view1_trig_fn();
! CREATE TABLE base_tbl (a int);
! CREATE VIEW rw_view1 AS SELECT a,10 AS b FROM base_tbl;
! CREATE RULE rw_view1_ins_rule AS ON INSERT TO rw_view1
!   DO INSTEAD INSERT INTO base_tbl VALUES (NEW.a);
! CREATE VIEW rw_view2 AS
!   SELECT * FROM rw_view1 WHERE a > b WITH LOCAL CHECK OPTION;
! INSERT INTO rw_view2 VALUES (2,3); -- ok, but not in view (doesn't fail rw_view2's check)
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to view rw_view1
! drop cascades to view rw_view2
! -- security barrier view
! CREATE TABLE base_tbl (person text, visibility text);
! INSERT INTO base_tbl VALUES ('Tom', 'public'),
!                             ('Dick', 'private'),
!                             ('Harry', 'public');
! CREATE VIEW rw_view1 AS
!   SELECT person FROM base_tbl WHERE visibility = 'public';
! CREATE FUNCTION snoop(anyelement)
! RETURNS boolean AS
! $$
! BEGIN
!   RAISE NOTICE 'snooped value: %', $1;
!   RETURN true;
! END;
! $$
! LANGUAGE plpgsql COST 0.000001;
! CREATE OR REPLACE FUNCTION leakproof(anyelement)
! RETURNS boolean AS
! $$
! BEGIN
!   RETURN true;
! END;
! $$
! LANGUAGE plpgsql STRICT IMMUTABLE LEAKPROOF;
! SELECT * FROM rw_view1 WHERE snoop(person);
! NOTICE:  snooped value: Tom
! NOTICE:  snooped value: Dick
! NOTICE:  snooped value: Harry
!  person 
! --------
!  Tom
!  Harry
! (2 rows)
! 
! UPDATE rw_view1 SET person=person WHERE snoop(person);
! NOTICE:  snooped value: Tom
! NOTICE:  snooped value: Dick
! NOTICE:  snooped value: Harry
! DELETE FROM rw_view1 WHERE NOT snoop(person);
! NOTICE:  snooped value: Dick
! NOTICE:  snooped value: Tom
! NOTICE:  snooped value: Harry
! ALTER VIEW rw_view1 SET (security_barrier = true);
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name = 'rw_view1';
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view1   | YES
! (1 row)
! 
! SELECT table_name, is_updatable, is_insertable_into
!   FROM information_schema.views
!  WHERE table_name = 'rw_view1';
!  table_name | is_updatable | is_insertable_into 
! ------------+--------------+--------------------
!  rw_view1   | YES          | YES
! (1 row)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name = 'rw_view1'
!  ORDER BY ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view1   | person      | YES
! (1 row)
! 
! SELECT * FROM rw_view1 WHERE snoop(person);
! NOTICE:  snooped value: Tom
! NOTICE:  snooped value: Harry
!  person 
! --------
!  Tom
!  Harry
! (2 rows)
! 
! UPDATE rw_view1 SET person=person WHERE snoop(person);
! NOTICE:  snooped value: Tom
! NOTICE:  snooped value: Harry
! DELETE FROM rw_view1 WHERE NOT snoop(person);
! NOTICE:  snooped value: Tom
! NOTICE:  snooped value: Harry
! EXPLAIN (costs off) SELECT * FROM rw_view1 WHERE snoop(person);
!                   QUERY PLAN                   
! -----------------------------------------------
!  Subquery Scan on rw_view1
!    Filter: snoop(rw_view1.person)
!    ->  Seq Scan on base_tbl
!          Filter: (visibility = 'public'::text)
! (4 rows)
! 
! EXPLAIN (costs off) UPDATE rw_view1 SET person=person WHERE snoop(person);
!                      QUERY PLAN                      
! -----------------------------------------------------
!  Update on base_tbl base_tbl_1
!    ->  Subquery Scan on base_tbl
!          Filter: snoop(base_tbl.person)
!          ->  Seq Scan on base_tbl base_tbl_2
!                Filter: (visibility = 'public'::text)
! (5 rows)
! 
! EXPLAIN (costs off) DELETE FROM rw_view1 WHERE NOT snoop(person);
!                      QUERY PLAN                      
! -----------------------------------------------------
!  Delete on base_tbl base_tbl_1
!    ->  Subquery Scan on base_tbl
!          Filter: (NOT snoop(base_tbl.person))
!          ->  Seq Scan on base_tbl base_tbl_2
!                Filter: (visibility = 'public'::text)
! (5 rows)
! 
! -- security barrier view on top of security barrier view
! CREATE VIEW rw_view2 WITH (security_barrier = true) AS
!   SELECT * FROM rw_view1 WHERE snoop(person);
! SELECT table_name, is_insertable_into
!   FROM information_schema.tables
!  WHERE table_name = 'rw_view2';
!  table_name | is_insertable_into 
! ------------+--------------------
!  rw_view2   | YES
! (1 row)
! 
! SELECT table_name, is_updatable, is_insertable_into
!   FROM information_schema.views
!  WHERE table_name = 'rw_view2';
!  table_name | is_updatable | is_insertable_into 
! ------------+--------------+--------------------
!  rw_view2   | YES          | YES
! (1 row)
! 
! SELECT table_name, column_name, is_updatable
!   FROM information_schema.columns
!  WHERE table_name = 'rw_view2'
!  ORDER BY ordinal_position;
!  table_name | column_name | is_updatable 
! ------------+-------------+--------------
!  rw_view2   | person      | YES
! (1 row)
! 
! SELECT * FROM rw_view2 WHERE snoop(person);
! NOTICE:  snooped value: Tom
! NOTICE:  snooped value: Tom
! NOTICE:  snooped value: Harry
! NOTICE:  snooped value: Harry
!  person 
! --------
!  Tom
!  Harry
! (2 rows)
! 
! UPDATE rw_view2 SET person=person WHERE snoop(person);
! NOTICE:  snooped value: Tom
! NOTICE:  snooped value: Tom
! NOTICE:  snooped value: Harry
! NOTICE:  snooped value: Harry
! DELETE FROM rw_view2 WHERE NOT snoop(person);
! NOTICE:  snooped value: Tom
! NOTICE:  snooped value: Tom
! NOTICE:  snooped value: Harry
! NOTICE:  snooped value: Harry
! EXPLAIN (costs off) SELECT * FROM rw_view2 WHERE snoop(person);
!                      QUERY PLAN                      
! -----------------------------------------------------
!  Subquery Scan on rw_view2
!    Filter: snoop(rw_view2.person)
!    ->  Subquery Scan on rw_view1
!          Filter: snoop(rw_view1.person)
!          ->  Seq Scan on base_tbl
!                Filter: (visibility = 'public'::text)
! (6 rows)
! 
! EXPLAIN (costs off) UPDATE rw_view2 SET person=person WHERE snoop(person);
!                         QUERY PLAN                         
! -----------------------------------------------------------
!  Update on base_tbl base_tbl_1
!    ->  Subquery Scan on base_tbl
!          Filter: snoop(base_tbl.person)
!          ->  Subquery Scan on base_tbl_2
!                Filter: snoop(base_tbl_2.person)
!                ->  Seq Scan on base_tbl base_tbl_3
!                      Filter: (visibility = 'public'::text)
! (7 rows)
! 
! EXPLAIN (costs off) DELETE FROM rw_view2 WHERE NOT snoop(person);
!                         QUERY PLAN                         
! -----------------------------------------------------------
!  Delete on base_tbl base_tbl_1
!    ->  Subquery Scan on base_tbl
!          Filter: (NOT snoop(base_tbl.person))
!          ->  Subquery Scan on base_tbl_2
!                Filter: snoop(base_tbl_2.person)
!                ->  Seq Scan on base_tbl base_tbl_3
!                      Filter: (visibility = 'public'::text)
! (7 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to view rw_view1
! drop cascades to view rw_view2
! -- security barrier view on top of table with rules
! CREATE TABLE base_tbl(id int PRIMARY KEY, data text, deleted boolean);
! INSERT INTO base_tbl VALUES (1, 'Row 1', false), (2, 'Row 2', true);
! CREATE RULE base_tbl_ins_rule AS ON INSERT TO base_tbl
!   WHERE EXISTS (SELECT 1 FROM base_tbl t WHERE t.id = new.id)
!   DO INSTEAD
!     UPDATE base_tbl SET data = new.data, deleted = false WHERE id = new.id;
! CREATE RULE base_tbl_del_rule AS ON DELETE TO base_tbl
!   DO INSTEAD
!     UPDATE base_tbl SET deleted = true WHERE id = old.id;
! CREATE VIEW rw_view1 WITH (security_barrier=true) AS
!   SELECT id, data FROM base_tbl WHERE NOT deleted;
! SELECT * FROM rw_view1;
!  id | data  
! ----+-------
!   1 | Row 1
! (1 row)
! 
! EXPLAIN (costs off) DELETE FROM rw_view1 WHERE id = 1 AND snoop(data);
!                                QUERY PLAN                                
! -------------------------------------------------------------------------
!  Update on base_tbl base_tbl_1
!    ->  Nested Loop
!          ->  Index Scan using base_tbl_pkey on base_tbl base_tbl_1
!                Index Cond: (id = 1)
!          ->  Subquery Scan on base_tbl
!                Filter: snoop(base_tbl.data)
!                ->  Index Scan using base_tbl_pkey on base_tbl base_tbl_2
!                      Index Cond: (id = 1)
!                      Filter: (NOT deleted)
! (9 rows)
! 
! DELETE FROM rw_view1 WHERE id = 1 AND snoop(data);
! NOTICE:  snooped value: Row 1
! EXPLAIN (costs off) INSERT INTO rw_view1 VALUES (2, 'New row 2');
!                         QUERY PLAN                         
! -----------------------------------------------------------
!  Insert on base_tbl
!    InitPlan 1 (returns $0)
!      ->  Index Only Scan using base_tbl_pkey on base_tbl t
!            Index Cond: (id = 2)
!    ->  Result
!          One-Time Filter: ($0 IS NOT TRUE)
!  
!  Update on base_tbl
!    InitPlan 1 (returns $0)
!      ->  Index Only Scan using base_tbl_pkey on base_tbl t
!            Index Cond: (id = 2)
!    ->  Result
!          One-Time Filter: $0
!          ->  Index Scan using base_tbl_pkey on base_tbl
!                Index Cond: (id = 2)
! (15 rows)
! 
! INSERT INTO rw_view1 VALUES (2, 'New row 2');
! SELECT * FROM base_tbl;
!  id |   data    | deleted 
! ----+-----------+---------
!   1 | Row 1     | t
!   2 | New row 2 | f
! (2 rows)
! 
! DROP TABLE base_tbl CASCADE;
! NOTICE:  drop cascades to view rw_view1
! -- security barrier view based on inheiritance set
! CREATE TABLE t1 (a int, b float, c text);
! CREATE INDEX t1_a_idx ON t1(a);
! INSERT INTO t1
! SELECT i,i,'t1' FROM generate_series(1,10) g(i);
! ANALYZE t1;
! CREATE TABLE t11 (d text) INHERITS (t1);
! CREATE INDEX t11_a_idx ON t11(a);
! INSERT INTO t11
! SELECT i,i,'t11','t11d' FROM generate_series(1,10) g(i);
! ANALYZE t11;
! CREATE TABLE t12 (e int[]) INHERITS (t1);
! CREATE INDEX t12_a_idx ON t12(a);
! INSERT INTO t12
! SELECT i,i,'t12','{1,2}'::int[] FROM generate_series(1,10) g(i);
! ANALYZE t12;
! CREATE TABLE t111 () INHERITS (t11, t12);
! NOTICE:  merging multiple inherited definitions of column "a"
! NOTICE:  merging multiple inherited definitions of column "b"
! NOTICE:  merging multiple inherited definitions of column "c"
! CREATE INDEX t111_a_idx ON t111(a);
! INSERT INTO t111
! SELECT i,i,'t111','t111d','{1,1,1}'::int[] FROM generate_series(1,10) g(i);
! ANALYZE t111;
! CREATE VIEW v1 WITH (security_barrier=true) AS
! SELECT *, (SELECT d FROM t11 WHERE t11.a = t1.a LIMIT 1) AS d
! FROM t1
! WHERE a > 5 AND EXISTS(SELECT 1 FROM t12 WHERE t12.a = t1.a);
! SELECT * FROM v1 WHERE a=3; -- should not see anything
!  a | b | c | d 
! ---+---+---+---
! (0 rows)
! 
! SELECT * FROM v1 WHERE a=8;
!  a | b |  c   |  d   
! ---+---+------+------
!  8 | 8 | t1   | t11d
!  8 | 8 | t11  | t11d
!  8 | 8 | t12  | t11d
!  8 | 8 | t111 | t11d
! (4 rows)
! 
! EXPLAIN (VERBOSE, COSTS OFF)
! UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a = 3;
!                                         QUERY PLAN                                         
! -------------------------------------------------------------------------------------------
!  Update on public.t1 t1_4
!    ->  Subquery Scan on t1
!          Output: 100, t1.b, t1.c, t1.ctid
!          Filter: snoop(t1.a)
!          ->  Nested Loop Semi Join
!                Output: t1_5.ctid, t1_5.a, t1_5.b, t1_5.c
!                ->  Seq Scan on public.t1 t1_5
!                      Output: t1_5.ctid, t1_5.a, t1_5.b, t1_5.c
!                      Filter: ((t1_5.a > 5) AND (t1_5.a = 3) AND leakproof(t1_5.a))
!                ->  Append
!                      ->  Seq Scan on public.t12
!                            Output: t12.a
!                            Filter: (t12.a = 3)
!                      ->  Seq Scan on public.t111
!                            Output: t111.a
!                            Filter: (t111.a = 3)
!    ->  Subquery Scan on t1_1
!          Output: 100, t1_1.b, t1_1.c, t1_1.d, t1_1.ctid
!          Filter: snoop(t1_1.a)
!          ->  Nested Loop Semi Join
!                Output: t11.ctid, t11.a, t11.b, t11.c, t11.d
!                ->  Seq Scan on public.t11
!                      Output: t11.ctid, t11.a, t11.b, t11.c, t11.d
!                      Filter: ((t11.a > 5) AND (t11.a = 3) AND leakproof(t11.a))
!                ->  Append
!                      ->  Seq Scan on public.t12 t12_1
!                            Output: t12_1.a
!                            Filter: (t12_1.a = 3)
!                      ->  Seq Scan on public.t111 t111_1
!                            Output: t111_1.a
!                            Filter: (t111_1.a = 3)
!    ->  Subquery Scan on t1_2
!          Output: 100, t1_2.b, t1_2.c, t1_2.e, t1_2.ctid
!          Filter: snoop(t1_2.a)
!          ->  Nested Loop Semi Join
!                Output: t12_2.ctid, t12_2.a, t12_2.b, t12_2.c, t12_2.e
!                ->  Seq Scan on public.t12 t12_2
!                      Output: t12_2.ctid, t12_2.a, t12_2.b, t12_2.c, t12_2.e
!                      Filter: ((t12_2.a > 5) AND (t12_2.a = 3) AND leakproof(t12_2.a))
!                ->  Append
!                      ->  Seq Scan on public.t12 t12_3
!                            Output: t12_3.a
!                            Filter: (t12_3.a = 3)
!                      ->  Seq Scan on public.t111 t111_2
!                            Output: t111_2.a
!                            Filter: (t111_2.a = 3)
!    ->  Subquery Scan on t1_3
!          Output: 100, t1_3.b, t1_3.c, t1_3.d, t1_3.e, t1_3.ctid
!          Filter: snoop(t1_3.a)
!          ->  Nested Loop Semi Join
!                Output: t111_3.ctid, t111_3.a, t111_3.b, t111_3.c, t111_3.d, t111_3.e
!                ->  Seq Scan on public.t111 t111_3
!                      Output: t111_3.ctid, t111_3.a, t111_3.b, t111_3.c, t111_3.d, t111_3.e
!                      Filter: ((t111_3.a > 5) AND (t111_3.a = 3) AND leakproof(t111_3.a))
!                ->  Append
!                      ->  Seq Scan on public.t12 t12_4
!                            Output: t12_4.a
!                            Filter: (t12_4.a = 3)
!                      ->  Seq Scan on public.t111 t111_4
!                            Output: t111_4.a
!                            Filter: (t111_4.a = 3)
! (61 rows)
! 
! UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a = 3;
! SELECT * FROM v1 WHERE a=100; -- Nothing should have been changed to 100
!  a | b | c | d 
! ---+---+---+---
! (0 rows)
! 
! SELECT * FROM t1 WHERE a=100; -- Nothing should have been changed to 100
!  a | b | c 
! ---+---+---
! (0 rows)
! 
! EXPLAIN (VERBOSE, COSTS OFF)
! UPDATE v1 SET a=a+1 WHERE snoop(a) AND leakproof(a) AND a = 8;
!                                         QUERY PLAN                                         
! -------------------------------------------------------------------------------------------
!  Update on public.t1 t1_4
!    ->  Subquery Scan on t1
!          Output: (t1.a + 1), t1.b, t1.c, t1.ctid
!          Filter: snoop(t1.a)
!          ->  Nested Loop Semi Join
!                Output: t1_5.a, t1_5.ctid, t1_5.b, t1_5.c
!                ->  Seq Scan on public.t1 t1_5
!                      Output: t1_5.a, t1_5.ctid, t1_5.b, t1_5.c
!                      Filter: ((t1_5.a > 5) AND (t1_5.a = 8) AND leakproof(t1_5.a))
!                ->  Append
!                      ->  Seq Scan on public.t12
!                            Output: t12.a
!                            Filter: (t12.a = 8)
!                      ->  Seq Scan on public.t111
!                            Output: t111.a
!                            Filter: (t111.a = 8)
!    ->  Subquery Scan on t1_1
!          Output: (t1_1.a + 1), t1_1.b, t1_1.c, t1_1.d, t1_1.ctid
!          Filter: snoop(t1_1.a)
!          ->  Nested Loop Semi Join
!                Output: t11.a, t11.ctid, t11.b, t11.c, t11.d
!                ->  Seq Scan on public.t11
!                      Output: t11.a, t11.ctid, t11.b, t11.c, t11.d
!                      Filter: ((t11.a > 5) AND (t11.a = 8) AND leakproof(t11.a))
!                ->  Append
!                      ->  Seq Scan on public.t12 t12_1
!                            Output: t12_1.a
!                            Filter: (t12_1.a = 8)
!                      ->  Seq Scan on public.t111 t111_1
!                            Output: t111_1.a
!                            Filter: (t111_1.a = 8)
!    ->  Subquery Scan on t1_2
!          Output: (t1_2.a + 1), t1_2.b, t1_2.c, t1_2.e, t1_2.ctid
!          Filter: snoop(t1_2.a)
!          ->  Nested Loop Semi Join
!                Output: t12_2.a, t12_2.ctid, t12_2.b, t12_2.c, t12_2.e
!                ->  Seq Scan on public.t12 t12_2
!                      Output: t12_2.a, t12_2.ctid, t12_2.b, t12_2.c, t12_2.e
!                      Filter: ((t12_2.a > 5) AND (t12_2.a = 8) AND leakproof(t12_2.a))
!                ->  Append
!                      ->  Seq Scan on public.t12 t12_3
!                            Output: t12_3.a
!                            Filter: (t12_3.a = 8)
!                      ->  Seq Scan on public.t111 t111_2
!                            Output: t111_2.a
!                            Filter: (t111_2.a = 8)
!    ->  Subquery Scan on t1_3
!          Output: (t1_3.a + 1), t1_3.b, t1_3.c, t1_3.d, t1_3.e, t1_3.ctid
!          Filter: snoop(t1_3.a)
!          ->  Nested Loop Semi Join
!                Output: t111_3.a, t111_3.ctid, t111_3.b, t111_3.c, t111_3.d, t111_3.e
!                ->  Seq Scan on public.t111 t111_3
!                      Output: t111_3.a, t111_3.ctid, t111_3.b, t111_3.c, t111_3.d, t111_3.e
!                      Filter: ((t111_3.a > 5) AND (t111_3.a = 8) AND leakproof(t111_3.a))
!                ->  Append
!                      ->  Seq Scan on public.t12 t12_4
!                            Output: t12_4.a
!                            Filter: (t12_4.a = 8)
!                      ->  Seq Scan on public.t111 t111_4
!                            Output: t111_4.a
!                            Filter: (t111_4.a = 8)
! (61 rows)
! 
! UPDATE v1 SET a=a+1 WHERE snoop(a) AND leakproof(a) AND a = 8;
! NOTICE:  snooped value: 8
! NOTICE:  snooped value: 8
! NOTICE:  snooped value: 8
! NOTICE:  snooped value: 8
! SELECT * FROM v1 WHERE b=8;
!  a | b |  c   |  d   
! ---+---+------+------
!  9 | 8 | t1   | t11d
!  9 | 8 | t11  | t11d
!  9 | 8 | t12  | t11d
!  9 | 8 | t111 | t11d
! (4 rows)
! 
! DELETE FROM v1 WHERE snoop(a) AND leakproof(a); -- should not delete everything, just where a>5
! NOTICE:  snooped value: 6
! NOTICE:  snooped value: 7
! NOTICE:  snooped value: 9
! NOTICE:  snooped value: 10
! NOTICE:  snooped value: 9
! NOTICE:  snooped value: 6
! NOTICE:  snooped value: 7
! NOTICE:  snooped value: 9
! NOTICE:  snooped value: 10
! NOTICE:  snooped value: 9
! NOTICE:  snooped value: 6
! NOTICE:  snooped value: 7
! NOTICE:  snooped value: 9
! NOTICE:  snooped value: 10
! NOTICE:  snooped value: 9
! NOTICE:  snooped value: 6
! NOTICE:  snooped value: 7
! NOTICE:  snooped value: 9
! NOTICE:  snooped value: 10
! NOTICE:  snooped value: 9
! TABLE t1; -- verify all a<=5 are intact
!  a | b |  c   
! ---+---+------
!  1 | 1 | t1
!  2 | 2 | t1
!  3 | 3 | t1
!  4 | 4 | t1
!  5 | 5 | t1
!  1 | 1 | t11
!  2 | 2 | t11
!  3 | 3 | t11
!  4 | 4 | t11
!  5 | 5 | t11
!  1 | 1 | t12
!  2 | 2 | t12
!  3 | 3 | t12
!  4 | 4 | t12
!  5 | 5 | t12
!  1 | 1 | t111
!  2 | 2 | t111
!  3 | 3 | t111
!  4 | 4 | t111
!  5 | 5 | t111
! (20 rows)
! 
! DROP TABLE t1, t11, t12, t111 CASCADE;
! NOTICE:  drop cascades to view v1
! DROP FUNCTION snoop(anyelement);
! DROP FUNCTION leakproof(anyelement);
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/sanity_check.out	Thu Oct 16 14:31:37 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/sanity_check.out	Tue Oct 28 15:53:05 2014
***************
*** 1,190 ****
! VACUUM;
! --
! -- sanity check, if we don't have indices the test will take years to
! -- complete.  But skip TOAST relations (since they will have varying
! -- names depending on the current OID counter) as well as temp tables
! -- of other backends (to avoid timing-dependent behavior).
! --
! -- temporarily disable fancy output, so catalog changes create less diff noise
! \a\t
! SELECT relname, relhasindex
!    FROM pg_class c LEFT JOIN pg_namespace n ON n.oid = relnamespace
!    WHERE relkind = 'r' AND (nspname ~ '^pg_temp_') IS NOT TRUE
!    ORDER BY relname;
! a|f
! a_star|f
! abstime_tbl|f
! aggtest|f
! array_index_op_test|t
! array_op_test|f
! b|f
! b_star|f
! box_tbl|f
! bprime|f
! bt_f8_heap|t
! bt_i4_heap|t
! bt_name_heap|t
! bt_txt_heap|t
! c|f
! c_star|f
! char_tbl|f
! check2_tbl|f
! check_tbl|f
! circle_tbl|t
! city|f
! copy_tbl|f
! d|f
! d_star|f
! date_tbl|f
! default_tbl|f
! defaultexpr_tbl|f
! dept|f
! dupindexcols|t
! e_star|f
! emp|f
! equipment_r|f
! f_star|f
! fast_emp4000|t
! float4_tbl|f
! float8_tbl|f
! func_index_heap|t
! hash_f8_heap|t
! hash_i4_heap|t
! hash_name_heap|t
! hash_txt_heap|t
! hobbies_r|f
! ihighway|t
! inet_tbl|f
! inhf|f
! inhx|t
! insert_tbl|f
! int2_tbl|f
! int4_tbl|f
! int8_tbl|f
! interval_tbl|f
! iportaltest|f
! kd_point_tbl|t
! line_tbl|f
! log_table|f
! lseg_tbl|f
! main_table|f
! money_data|f
! num_data|f
! num_exp_add|t
! num_exp_div|t
! num_exp_ln|t
! num_exp_log10|t
! num_exp_mul|t
! num_exp_power_10_ln|t
! num_exp_sqrt|t
! num_exp_sub|t
! num_input_test|f
! num_result|f
! onek|t
! onek2|t
! path_tbl|f
! person|f
! pg_aggregate|t
! pg_am|t
! pg_amop|t
! pg_amproc|t
! pg_attrdef|t
! pg_attribute|t
! pg_auth_members|t
! pg_authid|t
! pg_cast|t
! pg_class|t
! pg_collation|t
! pg_constraint|t
! pg_conversion|t
! pg_database|t
! pg_db_role_setting|t
! pg_default_acl|t
! pg_depend|t
! pg_description|t
! pg_enum|t
! pg_event_trigger|t
! pg_extension|t
! pg_foreign_data_wrapper|t
! pg_foreign_server|t
! pg_foreign_table|t
! pg_index|t
! pg_inherits|t
! pg_language|t
! pg_largeobject|t
! pg_largeobject_metadata|t
! pg_namespace|t
! pg_opclass|t
! pg_operator|t
! pg_opfamily|t
! pg_pltemplate|t
! pg_proc|t
! pg_range|t
! pg_rewrite|t
! pg_rowsecurity|t
! pg_seclabel|t
! pg_shdepend|t
! pg_shdescription|t
! pg_shseclabel|t
! pg_statistic|t
! pg_tablespace|t
! pg_trigger|t
! pg_ts_config|t
! pg_ts_config_map|t
! pg_ts_dict|t
! pg_ts_parser|t
! pg_ts_template|t
! pg_type|t
! pg_user_mapping|t
! point_tbl|t
! polygon_tbl|t
! quad_point_tbl|t
! radix_text_tbl|t
! ramp|f
! real_city|f
! reltime_tbl|f
! road|t
! shighway|t
! slow_emp4000|f
! sql_features|f
! sql_implementation_info|f
! sql_languages|f
! sql_packages|f
! sql_parts|f
! sql_sizing|f
! sql_sizing_profiles|f
! stud_emp|f
! student|f
! tenk1|t
! tenk2|t
! test_range_excl|t
! test_range_gist|t
! test_range_spgist|t
! test_tsvector|f
! testjsonb|f
! text_tbl|f
! time_tbl|f
! timestamp_tbl|f
! timestamptz_tbl|f
! timetz_tbl|f
! tinterval_tbl|f
! varchar_tbl|f
! -- restore normal output mode
! \a\t
! --
! -- another sanity check: every system catalog that has OIDs should have
! -- a unique index on OID.  This ensures that the OIDs will be unique,
! -- even after the OID counter wraps around.
! -- We exclude non-system tables from the check by looking at nspname.
! --
! SELECT relname, nspname
! FROM pg_class c LEFT JOIN pg_namespace n ON n.oid = relnamespace
! WHERE relhasoids
!     AND ((nspname ~ '^pg_') IS NOT FALSE)
!     AND NOT EXISTS (SELECT 1 FROM pg_index i WHERE indrelid = c.oid
!                     AND indkey[0] = -2 AND indnatts = 1
!                     AND indisunique AND indimmediate);
!  relname | nspname 
! ---------+---------
! (0 rows)
! 
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/errors.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/errors.out	Tue Oct 28 15:53:05 2014
***************
*** 1,447 ****
! --
! -- ERRORS
! --
! -- bad in postquel, but ok in postsql
! select 1;
!  ?column? 
! ----------
!         1
! (1 row)
! 
! --
! -- UNSUPPORTED STUFF
! -- doesn't work
! -- notify pg_class
! --
! --
! -- SELECT
! -- this used to be a syntax error, but now we allow an empty target list
! select;
! --
! (1 row)
! 
! -- no such relation
! select * from nonesuch;
! ERROR:  relation "nonesuch" does not exist
! LINE 1: select * from nonesuch;
!                       ^
! -- bad name in target list
! select nonesuch from pg_database;
! ERROR:  column "nonesuch" does not exist
! LINE 1: select nonesuch from pg_database;
!                ^
! -- empty distinct list isn't OK
! select distinct from pg_database;
! ERROR:  SELECT DISTINCT must have at least one column
! -- bad attribute name on lhs of operator
! select * from pg_database where nonesuch = pg_database.datname;
! ERROR:  column "nonesuch" does not exist
! LINE 1: select * from pg_database where nonesuch = pg_database.datna...
!                                         ^
! -- bad attribute name on rhs of operator
! select * from pg_database where pg_database.datname = nonesuch;
! ERROR:  column "nonesuch" does not exist
! LINE 1: ...ect * from pg_database where pg_database.datname = nonesuch;
!                                                               ^
! -- bad attribute name in select distinct on
! select distinct on (foobar) * from pg_database;
! ERROR:  column "foobar" does not exist
! LINE 1: select distinct on (foobar) * from pg_database;
!                             ^
! --
! -- DELETE
! -- missing relation name (this had better not wildcard!)
! delete from;
! ERROR:  syntax error at or near ";"
! LINE 1: delete from;
!                    ^
! -- no such relation
! delete from nonesuch;
! ERROR:  relation "nonesuch" does not exist
! LINE 1: delete from nonesuch;
!                     ^
! --
! -- DROP
! -- missing relation name (this had better not wildcard!)
! drop table;
! ERROR:  syntax error at or near ";"
! LINE 1: drop table;
!                   ^
! -- no such relation
! drop table nonesuch;
! ERROR:  table "nonesuch" does not exist
! --
! -- ALTER TABLE
! -- relation renaming
! -- missing relation name
! alter table rename;
! ERROR:  syntax error at or near ";"
! LINE 1: alter table rename;
!                           ^
! -- no such relation
! alter table nonesuch rename to newnonesuch;
! ERROR:  relation "nonesuch" does not exist
! -- no such relation
! alter table nonesuch rename to stud_emp;
! ERROR:  relation "nonesuch" does not exist
! -- conflict
! alter table stud_emp rename to aggtest;
! ERROR:  relation "aggtest" already exists
! -- self-conflict
! alter table stud_emp rename to stud_emp;
! ERROR:  relation "stud_emp" already exists
! -- attribute renaming
! -- no such relation
! alter table nonesuchrel rename column nonesuchatt to newnonesuchatt;
! ERROR:  relation "nonesuchrel" does not exist
! -- no such attribute
! alter table emp rename column nonesuchatt to newnonesuchatt;
! ERROR:  column "nonesuchatt" does not exist
! -- conflict
! alter table emp rename column salary to manager;
! ERROR:  column "manager" of relation "stud_emp" already exists
! -- conflict
! alter table emp rename column salary to oid;
! ERROR:  column name "oid" conflicts with a system column name
! --
! -- TRANSACTION STUFF
! -- not in a xact
! abort;
! WARNING:  there is no transaction in progress
! -- not in a xact
! end;
! WARNING:  there is no transaction in progress
! --
! -- CREATE AGGREGATE
! -- sfunc/finalfunc type disagreement
! create aggregate newavg2 (sfunc = int4pl,
! 			  basetype = int4,
! 			  stype = int4,
! 			  finalfunc = int2um,
! 			  initcond = '0');
! ERROR:  function int2um(integer) does not exist
! -- left out basetype
! create aggregate newcnt1 (sfunc = int4inc,
! 			  stype = int4,
! 			  initcond = '0');
! ERROR:  aggregate input type must be specified
! --
! -- DROP INDEX
! -- missing index name
! drop index;
! ERROR:  syntax error at or near ";"
! LINE 1: drop index;
!                   ^
! -- bad index name
! drop index 314159;
! ERROR:  syntax error at or near "314159"
! LINE 1: drop index 314159;
!                    ^
! -- no such index
! drop index nonesuch;
! ERROR:  index "nonesuch" does not exist
! --
! -- DROP AGGREGATE
! -- missing aggregate name
! drop aggregate;
! ERROR:  syntax error at or near ";"
! LINE 1: drop aggregate;
!                       ^
! -- missing aggregate type
! drop aggregate newcnt1;
! ERROR:  syntax error at or near ";"
! LINE 1: drop aggregate newcnt1;
!                               ^
! -- bad aggregate name
! drop aggregate 314159 (int);
! ERROR:  syntax error at or near "314159"
! LINE 1: drop aggregate 314159 (int);
!                        ^
! -- bad aggregate type
! drop aggregate newcnt (nonesuch);
! ERROR:  type "nonesuch" does not exist
! -- no such aggregate
! drop aggregate nonesuch (int4);
! ERROR:  aggregate nonesuch(integer) does not exist
! -- no such aggregate for type
! drop aggregate newcnt (float4);
! ERROR:  aggregate newcnt(real) does not exist
! --
! -- DROP FUNCTION
! -- missing function name
! drop function ();
! ERROR:  syntax error at or near "("
! LINE 1: drop function ();
!                       ^
! -- bad function name
! drop function 314159();
! ERROR:  syntax error at or near "314159"
! LINE 1: drop function 314159();
!                       ^
! -- no such function
! drop function nonesuch();
! ERROR:  function nonesuch() does not exist
! --
! -- DROP TYPE
! -- missing type name
! drop type;
! ERROR:  syntax error at or near ";"
! LINE 1: drop type;
!                  ^
! -- bad type name
! drop type 314159;
! ERROR:  syntax error at or near "314159"
! LINE 1: drop type 314159;
!                   ^
! -- no such type
! drop type nonesuch;
! ERROR:  type "nonesuch" does not exist
! --
! -- DROP OPERATOR
! -- missing everything
! drop operator;
! ERROR:  syntax error at or near ";"
! LINE 1: drop operator;
!                      ^
! -- bad operator name
! drop operator equals;
! ERROR:  syntax error at or near ";"
! LINE 1: drop operator equals;
!                             ^
! -- missing type list
! drop operator ===;
! ERROR:  syntax error at or near ";"
! LINE 1: drop operator ===;
!                          ^
! -- missing parentheses
! drop operator int4, int4;
! ERROR:  syntax error at or near ","
! LINE 1: drop operator int4, int4;
!                           ^
! -- missing operator name
! drop operator (int4, int4);
! ERROR:  syntax error at or near "("
! LINE 1: drop operator (int4, int4);
!                       ^
! -- missing type list contents
! drop operator === ();
! ERROR:  syntax error at or near ")"
! LINE 1: drop operator === ();
!                            ^
! -- no such operator
! drop operator === (int4);
! ERROR:  missing argument
! LINE 1: drop operator === (int4);
!                                ^
! HINT:  Use NONE to denote the missing argument of a unary operator.
! -- no such operator by that name
! drop operator === (int4, int4);
! ERROR:  operator does not exist: integer === integer
! -- no such type1
! drop operator = (nonesuch);
! ERROR:  missing argument
! LINE 1: drop operator = (nonesuch);
!                                  ^
! HINT:  Use NONE to denote the missing argument of a unary operator.
! -- no such type1
! drop operator = ( , int4);
! ERROR:  syntax error at or near ","
! LINE 1: drop operator = ( , int4);
!                           ^
! -- no such type1
! drop operator = (nonesuch, int4);
! ERROR:  type "nonesuch" does not exist
! -- no such type2
! drop operator = (int4, nonesuch);
! ERROR:  type "nonesuch" does not exist
! -- no such type2
! drop operator = (int4, );
! ERROR:  syntax error at or near ")"
! LINE 1: drop operator = (int4, );
!                                ^
! --
! -- DROP RULE
! -- missing rule name
! drop rule;
! ERROR:  syntax error at or near ";"
! LINE 1: drop rule;
!                  ^
! -- bad rule name
! drop rule 314159;
! ERROR:  syntax error at or near "314159"
! LINE 1: drop rule 314159;
!                   ^
! -- no such rule
! drop rule nonesuch on noplace;
! ERROR:  relation "noplace" does not exist
! -- these postquel variants are no longer supported
! drop tuple rule nonesuch;
! ERROR:  syntax error at or near "tuple"
! LINE 1: drop tuple rule nonesuch;
!              ^
! drop instance rule nonesuch on noplace;
! ERROR:  syntax error at or near "instance"
! LINE 1: drop instance rule nonesuch on noplace;
!              ^
! drop rewrite rule nonesuch;
! ERROR:  syntax error at or near "rewrite"
! LINE 1: drop rewrite rule nonesuch;
!              ^
! --
! -- Check that division-by-zero is properly caught.
! --
! select 1/0;
! ERROR:  division by zero
! select 1::int8/0;
! ERROR:  division by zero
! select 1/0::int8;
! ERROR:  division by zero
! select 1::int2/0;
! ERROR:  division by zero
! select 1/0::int2;
! ERROR:  division by zero
! select 1::numeric/0;
! ERROR:  division by zero
! select 1/0::numeric;
! ERROR:  division by zero
! select 1::float8/0;
! ERROR:  division by zero
! select 1/0::float8;
! ERROR:  division by zero
! select 1::float4/0;
! ERROR:  division by zero
! select 1/0::float4;
! ERROR:  division by zero
! --
! -- Test psql's reporting of syntax error location
! --
! xxx;
! ERROR:  syntax error at or near "xxx"
! LINE 1: xxx;
!         ^
! CREATE foo;
! ERROR:  syntax error at or near "foo"
! LINE 1: CREATE foo;
!                ^
! CREATE TABLE ;
! ERROR:  syntax error at or near ";"
! LINE 1: CREATE TABLE ;
!                      ^
! CREATE TABLE
! \g
! ERROR:  syntax error at end of input
! LINE 1: CREATE TABLE
!                     ^
! INSERT INTO foo VALUES(123) foo;
! ERROR:  syntax error at or near "foo"
! LINE 1: INSERT INTO foo VALUES(123) foo;
!                                     ^
! INSERT INTO 123
! VALUES(123);
! ERROR:  syntax error at or near "123"
! LINE 1: INSERT INTO 123
!                     ^
! INSERT INTO foo
! VALUES(123) 123
! ;
! ERROR:  syntax error at or near "123"
! LINE 2: VALUES(123) 123
!                     ^
! -- with a tab
! CREATE TABLE foo
!   (id INT4 UNIQUE NOT NULL, id2 TEXT NOT NULL PRIMARY KEY,
! 	id3 INTEGER NOT NUL,
!    id4 INT4 UNIQUE NOT NULL, id5 TEXT UNIQUE NOT NULL);
! ERROR:  syntax error at or near "NUL"
! LINE 3:  id3 INTEGER NOT NUL,
!                          ^
! -- long line to be truncated on the left
! CREATE TABLE foo(id INT4 UNIQUE NOT NULL, id2 TEXT NOT NULL PRIMARY KEY, id3 INTEGER NOT NUL,
! id4 INT4 UNIQUE NOT NULL, id5 TEXT UNIQUE NOT NULL);
! ERROR:  syntax error at or near "NUL"
! LINE 1: ...OT NULL, id2 TEXT NOT NULL PRIMARY KEY, id3 INTEGER NOT NUL,
!                                                                    ^
! -- long line to be truncated on the right
! CREATE TABLE foo(
! id3 INTEGER NOT NUL, id4 INT4 UNIQUE NOT NULL, id5 TEXT UNIQUE NOT NULL, id INT4 UNIQUE NOT NULL, id2 TEXT NOT NULL PRIMARY KEY);
! ERROR:  syntax error at or near "NUL"
! LINE 2: id3 INTEGER NOT NUL, id4 INT4 UNIQUE NOT NULL, id5 TEXT UNIQ...
!                         ^
! -- long line to be truncated both ways
! CREATE TABLE foo(id INT4 UNIQUE NOT NULL, id2 TEXT NOT NULL PRIMARY KEY, id3 INTEGER NOT NUL, id4 INT4 UNIQUE NOT NULL, id5 TEXT UNIQUE NOT NULL);
! ERROR:  syntax error at or near "NUL"
! LINE 1: ...L, id2 TEXT NOT NULL PRIMARY KEY, id3 INTEGER NOT NUL, id4 I...
!                                                              ^
! -- long line to be truncated on the left, many lines
! CREATE
! TEMPORARY
! TABLE
! foo(id INT4 UNIQUE NOT NULL, id2 TEXT NOT NULL PRIMARY KEY, id3 INTEGER NOT NUL,
! id4 INT4
! UNIQUE
! NOT
! NULL,
! id5 TEXT
! UNIQUE
! NOT
! NULL)
! ;
! ERROR:  syntax error at or near "NUL"
! LINE 4: ...OT NULL, id2 TEXT NOT NULL PRIMARY KEY, id3 INTEGER NOT NUL,
!                                                                    ^
! -- long line to be truncated on the right, many lines
! CREATE
! TEMPORARY
! TABLE
! foo(
! id3 INTEGER NOT NUL, id4 INT4 UNIQUE NOT NULL, id5 TEXT UNIQUE NOT NULL, id INT4 UNIQUE NOT NULL, id2 TEXT NOT NULL PRIMARY KEY)
! ;
! ERROR:  syntax error at or near "NUL"
! LINE 5: id3 INTEGER NOT NUL, id4 INT4 UNIQUE NOT NULL, id5 TEXT UNIQ...
!                         ^
! -- long line to be truncated both ways, many lines
! CREATE
! TEMPORARY
! TABLE
! foo
! (id
! INT4
! UNIQUE NOT NULL, idx INT4 UNIQUE NOT NULL, idy INT4 UNIQUE NOT NULL, id2 TEXT NOT NULL PRIMARY KEY, id3 INTEGER NOT NUL, id4 INT4 UNIQUE NOT NULL, id5 TEXT UNIQUE NOT NULL,
! idz INT4 UNIQUE NOT NULL,
! idv INT4 UNIQUE NOT NULL);
! ERROR:  syntax error at or near "NUL"
! LINE 7: ...L, id2 TEXT NOT NULL PRIMARY KEY, id3 INTEGER NOT NUL, id4 I...
!                                                              ^
! -- more than 10 lines...
! CREATE
! TEMPORARY
! TABLE
! foo
! (id
! INT4
! UNIQUE
! NOT
! NULL
! ,
! idm
! INT4
! UNIQUE
! NOT
! NULL,
! idx INT4 UNIQUE NOT NULL, idy INT4 UNIQUE NOT NULL, id2 TEXT NOT NULL PRIMARY KEY, id3 INTEGER NOT NUL, id4 INT4 UNIQUE NOT NULL, id5 TEXT UNIQUE NOT NULL,
! idz INT4 UNIQUE NOT NULL,
! idv
! INT4
! UNIQUE
! NOT
! NULL);
! ERROR:  syntax error at or near "NUL"
! LINE 16: ...L, id2 TEXT NOT NULL PRIMARY KEY, id3 INTEGER NOT NUL, id4 I...
!                                                               ^
! -- Check that stack depth detection mechanism works and
! -- max_stack_depth is not set too high
! create function infinite_recurse() returns int as
! 'select infinite_recurse()' language sql;
! \set VERBOSITY terse
! select infinite_recurse();
! ERROR:  stack depth limit exceeded
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/select.out	Sun Dec 12 20:21:38 2010
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/select.out	Tue Oct 28 15:53:05 2014
***************
*** 1,783 ****
! --
! -- SELECT
! --
! -- btree index
! -- awk '{if($1<10){print;}else{next;}}' onek.data | sort +0n -1
! --
! SELECT * FROM onek
!    WHERE onek.unique1 < 10
!    ORDER BY onek.unique1;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!        0 |     998 |   0 |    0 |   0 |      0 |       0 |        0 |           0 |         0 |        0 |   0 |    1 | AAAAAA   | KMBAAA   | OOOOxx
!        1 |     214 |   1 |    1 |   1 |      1 |       1 |        1 |           1 |         1 |        1 |   2 |    3 | BAAAAA   | GIAAAA   | OOOOxx
!        2 |     326 |   0 |    2 |   2 |      2 |       2 |        2 |           2 |         2 |        2 |   4 |    5 | CAAAAA   | OMAAAA   | OOOOxx
!        3 |     431 |   1 |    3 |   3 |      3 |       3 |        3 |           3 |         3 |        3 |   6 |    7 | DAAAAA   | PQAAAA   | VVVVxx
!        4 |     833 |   0 |    0 |   4 |      4 |       4 |        4 |           4 |         4 |        4 |   8 |    9 | EAAAAA   | BGBAAA   | HHHHxx
!        5 |     541 |   1 |    1 |   5 |      5 |       5 |        5 |           5 |         5 |        5 |  10 |   11 | FAAAAA   | VUAAAA   | HHHHxx
!        6 |     978 |   0 |    2 |   6 |      6 |       6 |        6 |           6 |         6 |        6 |  12 |   13 | GAAAAA   | QLBAAA   | OOOOxx
!        7 |     647 |   1 |    3 |   7 |      7 |       7 |        7 |           7 |         7 |        7 |  14 |   15 | HAAAAA   | XYAAAA   | VVVVxx
!        8 |     653 |   0 |    0 |   8 |      8 |       8 |        8 |           8 |         8 |        8 |  16 |   17 | IAAAAA   | DZAAAA   | HHHHxx
!        9 |      49 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |         9 |        9 |  18 |   19 | JAAAAA   | XBAAAA   | HHHHxx
! (10 rows)
! 
! --
! -- awk '{if($1<20){print $1,$14;}else{next;}}' onek.data | sort +0nr -1
! --
! SELECT onek.unique1, onek.stringu1 FROM onek
!    WHERE onek.unique1 < 20
!    ORDER BY unique1 using >;
!  unique1 | stringu1 
! ---------+----------
!       19 | TAAAAA
!       18 | SAAAAA
!       17 | RAAAAA
!       16 | QAAAAA
!       15 | PAAAAA
!       14 | OAAAAA
!       13 | NAAAAA
!       12 | MAAAAA
!       11 | LAAAAA
!       10 | KAAAAA
!        9 | JAAAAA
!        8 | IAAAAA
!        7 | HAAAAA
!        6 | GAAAAA
!        5 | FAAAAA
!        4 | EAAAAA
!        3 | DAAAAA
!        2 | CAAAAA
!        1 | BAAAAA
!        0 | AAAAAA
! (20 rows)
! 
! --
! -- awk '{if($1>980){print $1,$14;}else{next;}}' onek.data | sort +1d -2
! --
! SELECT onek.unique1, onek.stringu1 FROM onek
!    WHERE onek.unique1 > 980
!    ORDER BY stringu1 using <;
!  unique1 | stringu1 
! ---------+----------
!      988 | AMAAAA
!      989 | BMAAAA
!      990 | CMAAAA
!      991 | DMAAAA
!      992 | EMAAAA
!      993 | FMAAAA
!      994 | GMAAAA
!      995 | HMAAAA
!      996 | IMAAAA
!      997 | JMAAAA
!      998 | KMAAAA
!      999 | LMAAAA
!      981 | TLAAAA
!      982 | ULAAAA
!      983 | VLAAAA
!      984 | WLAAAA
!      985 | XLAAAA
!      986 | YLAAAA
!      987 | ZLAAAA
! (19 rows)
! 
! --
! -- awk '{if($1>980){print $1,$16;}else{next;}}' onek.data |
! -- sort +1d -2 +0nr -1
! --
! SELECT onek.unique1, onek.string4 FROM onek
!    WHERE onek.unique1 > 980
!    ORDER BY string4 using <, unique1 using >;
!  unique1 | string4 
! ---------+---------
!      999 | AAAAxx
!      995 | AAAAxx
!      983 | AAAAxx
!      982 | AAAAxx
!      981 | AAAAxx
!      998 | HHHHxx
!      997 | HHHHxx
!      993 | HHHHxx
!      990 | HHHHxx
!      986 | HHHHxx
!      996 | OOOOxx
!      991 | OOOOxx
!      988 | OOOOxx
!      987 | OOOOxx
!      985 | OOOOxx
!      994 | VVVVxx
!      992 | VVVVxx
!      989 | VVVVxx
!      984 | VVVVxx
! (19 rows)
! 
! --
! -- awk '{if($1>980){print $1,$16;}else{next;}}' onek.data |
! -- sort +1dr -2 +0n -1
! --
! SELECT onek.unique1, onek.string4 FROM onek
!    WHERE onek.unique1 > 980
!    ORDER BY string4 using >, unique1 using <;
!  unique1 | string4 
! ---------+---------
!      984 | VVVVxx
!      989 | VVVVxx
!      992 | VVVVxx
!      994 | VVVVxx
!      985 | OOOOxx
!      987 | OOOOxx
!      988 | OOOOxx
!      991 | OOOOxx
!      996 | OOOOxx
!      986 | HHHHxx
!      990 | HHHHxx
!      993 | HHHHxx
!      997 | HHHHxx
!      998 | HHHHxx
!      981 | AAAAxx
!      982 | AAAAxx
!      983 | AAAAxx
!      995 | AAAAxx
!      999 | AAAAxx
! (19 rows)
! 
! --
! -- awk '{if($1<20){print $1,$16;}else{next;}}' onek.data |
! -- sort +0nr -1 +1d -2
! --
! SELECT onek.unique1, onek.string4 FROM onek
!    WHERE onek.unique1 < 20
!    ORDER BY unique1 using >, string4 using <;
!  unique1 | string4 
! ---------+---------
!       19 | OOOOxx
!       18 | VVVVxx
!       17 | HHHHxx
!       16 | OOOOxx
!       15 | VVVVxx
!       14 | AAAAxx
!       13 | OOOOxx
!       12 | AAAAxx
!       11 | OOOOxx
!       10 | AAAAxx
!        9 | HHHHxx
!        8 | HHHHxx
!        7 | VVVVxx
!        6 | OOOOxx
!        5 | HHHHxx
!        4 | HHHHxx
!        3 | VVVVxx
!        2 | OOOOxx
!        1 | OOOOxx
!        0 | OOOOxx
! (20 rows)
! 
! --
! -- awk '{if($1<20){print $1,$16;}else{next;}}' onek.data |
! -- sort +0n -1 +1dr -2
! --
! SELECT onek.unique1, onek.string4 FROM onek
!    WHERE onek.unique1 < 20
!    ORDER BY unique1 using <, string4 using >;
!  unique1 | string4 
! ---------+---------
!        0 | OOOOxx
!        1 | OOOOxx
!        2 | OOOOxx
!        3 | VVVVxx
!        4 | HHHHxx
!        5 | HHHHxx
!        6 | OOOOxx
!        7 | VVVVxx
!        8 | HHHHxx
!        9 | HHHHxx
!       10 | AAAAxx
!       11 | OOOOxx
!       12 | AAAAxx
!       13 | OOOOxx
!       14 | AAAAxx
!       15 | VVVVxx
!       16 | OOOOxx
!       17 | HHHHxx
!       18 | VVVVxx
!       19 | OOOOxx
! (20 rows)
! 
! --
! -- test partial btree indexes
! --
! -- As of 7.2, planner probably won't pick an indexscan without stats,
! -- so ANALYZE first.  Also, we want to prevent it from picking a bitmapscan
! -- followed by sort, because that could hide index ordering problems.
! --
! ANALYZE onek2;
! SET enable_seqscan TO off;
! SET enable_bitmapscan TO off;
! SET enable_sort TO off;
! --
! -- awk '{if($1<10){print $0;}else{next;}}' onek.data | sort +0n -1
! --
! SELECT onek2.* FROM onek2 WHERE onek2.unique1 < 10;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!        0 |     998 |   0 |    0 |   0 |      0 |       0 |        0 |           0 |         0 |        0 |   0 |    1 | AAAAAA   | KMBAAA   | OOOOxx
!        1 |     214 |   1 |    1 |   1 |      1 |       1 |        1 |           1 |         1 |        1 |   2 |    3 | BAAAAA   | GIAAAA   | OOOOxx
!        2 |     326 |   0 |    2 |   2 |      2 |       2 |        2 |           2 |         2 |        2 |   4 |    5 | CAAAAA   | OMAAAA   | OOOOxx
!        3 |     431 |   1 |    3 |   3 |      3 |       3 |        3 |           3 |         3 |        3 |   6 |    7 | DAAAAA   | PQAAAA   | VVVVxx
!        4 |     833 |   0 |    0 |   4 |      4 |       4 |        4 |           4 |         4 |        4 |   8 |    9 | EAAAAA   | BGBAAA   | HHHHxx
!        5 |     541 |   1 |    1 |   5 |      5 |       5 |        5 |           5 |         5 |        5 |  10 |   11 | FAAAAA   | VUAAAA   | HHHHxx
!        6 |     978 |   0 |    2 |   6 |      6 |       6 |        6 |           6 |         6 |        6 |  12 |   13 | GAAAAA   | QLBAAA   | OOOOxx
!        7 |     647 |   1 |    3 |   7 |      7 |       7 |        7 |           7 |         7 |        7 |  14 |   15 | HAAAAA   | XYAAAA   | VVVVxx
!        8 |     653 |   0 |    0 |   8 |      8 |       8 |        8 |           8 |         8 |        8 |  16 |   17 | IAAAAA   | DZAAAA   | HHHHxx
!        9 |      49 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |         9 |        9 |  18 |   19 | JAAAAA   | XBAAAA   | HHHHxx
! (10 rows)
! 
! --
! -- awk '{if($1<20){print $1,$14;}else{next;}}' onek.data | sort +0nr -1
! --
! SELECT onek2.unique1, onek2.stringu1 FROM onek2
!     WHERE onek2.unique1 < 20
!     ORDER BY unique1 using >;
!  unique1 | stringu1 
! ---------+----------
!       19 | TAAAAA
!       18 | SAAAAA
!       17 | RAAAAA
!       16 | QAAAAA
!       15 | PAAAAA
!       14 | OAAAAA
!       13 | NAAAAA
!       12 | MAAAAA
!       11 | LAAAAA
!       10 | KAAAAA
!        9 | JAAAAA
!        8 | IAAAAA
!        7 | HAAAAA
!        6 | GAAAAA
!        5 | FAAAAA
!        4 | EAAAAA
!        3 | DAAAAA
!        2 | CAAAAA
!        1 | BAAAAA
!        0 | AAAAAA
! (20 rows)
! 
! --
! -- awk '{if($1>980){print $1,$14;}else{next;}}' onek.data | sort +1d -2
! --
! SELECT onek2.unique1, onek2.stringu1 FROM onek2
!    WHERE onek2.unique1 > 980;
!  unique1 | stringu1 
! ---------+----------
!      981 | TLAAAA
!      982 | ULAAAA
!      983 | VLAAAA
!      984 | WLAAAA
!      985 | XLAAAA
!      986 | YLAAAA
!      987 | ZLAAAA
!      988 | AMAAAA
!      989 | BMAAAA
!      990 | CMAAAA
!      991 | DMAAAA
!      992 | EMAAAA
!      993 | FMAAAA
!      994 | GMAAAA
!      995 | HMAAAA
!      996 | IMAAAA
!      997 | JMAAAA
!      998 | KMAAAA
!      999 | LMAAAA
! (19 rows)
! 
! RESET enable_seqscan;
! RESET enable_bitmapscan;
! RESET enable_sort;
! SELECT two, stringu1, ten, string4
!    INTO TABLE tmp
!    FROM onek;
! --
! -- awk '{print $1,$2;}' person.data |
! -- awk '{if(NF!=2){print $3,$2;}else{print;}}' - emp.data |
! -- awk '{if(NF!=2){print $3,$2;}else{print;}}' - student.data |
! -- awk 'BEGIN{FS="      ";}{if(NF!=2){print $4,$5;}else{print;}}' - stud_emp.data
! --
! -- SELECT name, age FROM person*; ??? check if different
! SELECT p.name, p.age FROM person* p;
!   name   | age 
! ---------+-----
!  mike    |  40
!  joe     |  20
!  sally   |  34
!  sandra  |  19
!  alex    |  30
!  sue     |  50
!  denise  |  24
!  sarah   |  88
!  teresa  |  38
!  nan     |  28
!  leah    |  68
!  wendy   |  78
!  melissa |  28
!  joan    |  18
!  mary    |   8
!  jane    |  58
!  liza    |  38
!  jean    |  28
!  jenifer |  38
!  juanita |  58
!  susan   |  78
!  zena    |  98
!  martie  |  88
!  chris   |  78
!  pat     |  18
!  zola    |  58
!  louise  |  98
!  edna    |  18
!  bertha  |  88
!  sumi    |  38
!  koko    |  88
!  gina    |  18
!  rean    |  48
!  sharon  |  78
!  paula   |  68
!  julie   |  68
!  belinda |  38
!  karen   |  48
!  carina  |  58
!  diane   |  18
!  esther  |  98
!  trudy   |  88
!  fanny   |   8
!  carmen  |  78
!  lita    |  25
!  pamela  |  48
!  sandy   |  38
!  trisha  |  88
!  uma     |  78
!  velma   |  68
!  sharon  |  25
!  sam     |  30
!  bill    |  20
!  fred    |  28
!  larry   |  60
!  jeff    |  23
!  cim     |  30
!  linda   |  19
! (58 rows)
! 
! --
! -- awk '{print $1,$2;}' person.data |
! -- awk '{if(NF!=2){print $3,$2;}else{print;}}' - emp.data |
! -- awk '{if(NF!=2){print $3,$2;}else{print;}}' - student.data |
! -- awk 'BEGIN{FS="      ";}{if(NF!=1){print $4,$5;}else{print;}}' - stud_emp.data |
! -- sort +1nr -2
! --
! SELECT p.name, p.age FROM person* p ORDER BY age using >, name;
!   name   | age 
! ---------+-----
!  esther  |  98
!  louise  |  98
!  zena    |  98
!  bertha  |  88
!  koko    |  88
!  martie  |  88
!  sarah   |  88
!  trisha  |  88
!  trudy   |  88
!  carmen  |  78
!  chris   |  78
!  sharon  |  78
!  susan   |  78
!  uma     |  78
!  wendy   |  78
!  julie   |  68
!  leah    |  68
!  paula   |  68
!  velma   |  68
!  larry   |  60
!  carina  |  58
!  jane    |  58
!  juanita |  58
!  zola    |  58
!  sue     |  50
!  karen   |  48
!  pamela  |  48
!  rean    |  48
!  mike    |  40
!  belinda |  38
!  jenifer |  38
!  liza    |  38
!  sandy   |  38
!  sumi    |  38
!  teresa  |  38
!  sally   |  34
!  alex    |  30
!  cim     |  30
!  sam     |  30
!  fred    |  28
!  jean    |  28
!  melissa |  28
!  nan     |  28
!  lita    |  25
!  sharon  |  25
!  denise  |  24
!  jeff    |  23
!  bill    |  20
!  joe     |  20
!  linda   |  19
!  sandra  |  19
!  diane   |  18
!  edna    |  18
!  gina    |  18
!  joan    |  18
!  pat     |  18
!  fanny   |   8
!  mary    |   8
! (58 rows)
! 
! --
! -- Test some cases involving whole-row Var referencing a subquery
! --
! select foo from (select 1) as foo;
!  foo 
! -----
!  (1)
! (1 row)
! 
! select foo from (select null) as foo;
!  foo 
! -----
!  ()
! (1 row)
! 
! select foo from (select 'xyzzy',1,null) as foo;
!     foo     
! ------------
!  (xyzzy,1,)
! (1 row)
! 
! --
! -- Test VALUES lists
! --
! select * from onek, (values(147, 'RFAAAA'), (931, 'VJAAAA')) as v (i, j)
!     WHERE onek.unique1 = v.i and onek.stringu1 = v.j;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 |  i  |   j    
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------+-----+--------
!      147 |       0 |   1 |    3 |   7 |      7 |       7 |       47 |         147 |       147 |      147 |  14 |   15 | RFAAAA   | AAAAAA   | AAAAxx  | 147 | RFAAAA
!      931 |       1 |   1 |    3 |   1 |     11 |       1 |       31 |         131 |       431 |      931 |   2 |    3 | VJAAAA   | BAAAAA   | HHHHxx  | 931 | VJAAAA
! (2 rows)
! 
! -- a more complex case
! -- looks like we're coding lisp :-)
! select * from onek,
!   (values ((select i from
!     (values(10000), (2), (389), (1000), (2000), ((select 10029))) as foo(i)
!     order by i asc limit 1))) bar (i)
!   where onek.unique1 = bar.i;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 | i 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------+---
!        2 |     326 |   0 |    2 |   2 |      2 |       2 |        2 |           2 |         2 |        2 |   4 |    5 | CAAAAA   | OMAAAA   | OOOOxx  | 2
! (1 row)
! 
! -- try VALUES in a subquery
! select * from onek
!     where (unique1,ten) in (values (1,1), (20,0), (99,9), (17,99))
!     order by unique1;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!        1 |     214 |   1 |    1 |   1 |      1 |       1 |        1 |           1 |         1 |        1 |   2 |    3 | BAAAAA   | GIAAAA   | OOOOxx
!       20 |     306 |   0 |    0 |   0 |      0 |       0 |       20 |          20 |        20 |       20 |   0 |    1 | UAAAAA   | ULAAAA   | OOOOxx
!       99 |     101 |   1 |    3 |   9 |     19 |       9 |       99 |          99 |        99 |       99 |  18 |   19 | VDAAAA   | XDAAAA   | HHHHxx
! (3 rows)
! 
! -- VALUES is also legal as a standalone query or a set-operation member
! VALUES (1,2), (3,4+4), (7,77.7);
!  column1 | column2 
! ---------+---------
!        1 |       2
!        3 |       8
!        7 |    77.7
! (3 rows)
! 
! VALUES (1,2), (3,4+4), (7,77.7)
! UNION ALL
! SELECT 2+2, 57
! UNION ALL
! TABLE int8_tbl;
!      column1      |      column2      
! ------------------+-------------------
!                 1 |                 2
!                 3 |                 8
!                 7 |              77.7
!                 4 |                57
!               123 |               456
!               123 |  4567890123456789
!  4567890123456789 |               123
!  4567890123456789 |  4567890123456789
!  4567890123456789 | -4567890123456789
! (9 rows)
! 
! --
! -- Test ORDER BY options
! --
! CREATE TEMP TABLE foo (f1 int);
! INSERT INTO foo VALUES (42),(3),(10),(7),(null),(null),(1);
! SELECT * FROM foo ORDER BY f1;
!  f1 
! ----
!   1
!   3
!   7
!  10
!  42
!    
!    
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 ASC;	-- same thing
!  f1 
! ----
!   1
!   3
!   7
!  10
!  42
!    
!    
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 NULLS FIRST;
!  f1 
! ----
!    
!    
!   1
!   3
!   7
!  10
!  42
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 DESC;
!  f1 
! ----
!    
!    
!  42
!  10
!   7
!   3
!   1
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 DESC NULLS LAST;
!  f1 
! ----
!  42
!  10
!   7
!   3
!   1
!    
!    
! (7 rows)
! 
! -- check if indexscans do the right things
! CREATE INDEX fooi ON foo (f1);
! SET enable_sort = false;
! SELECT * FROM foo ORDER BY f1;
!  f1 
! ----
!   1
!   3
!   7
!  10
!  42
!    
!    
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 NULLS FIRST;
!  f1 
! ----
!    
!    
!   1
!   3
!   7
!  10
!  42
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 DESC;
!  f1 
! ----
!    
!    
!  42
!  10
!   7
!   3
!   1
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 DESC NULLS LAST;
!  f1 
! ----
!  42
!  10
!   7
!   3
!   1
!    
!    
! (7 rows)
! 
! DROP INDEX fooi;
! CREATE INDEX fooi ON foo (f1 DESC);
! SELECT * FROM foo ORDER BY f1;
!  f1 
! ----
!   1
!   3
!   7
!  10
!  42
!    
!    
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 NULLS FIRST;
!  f1 
! ----
!    
!    
!   1
!   3
!   7
!  10
!  42
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 DESC;
!  f1 
! ----
!    
!    
!  42
!  10
!   7
!   3
!   1
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 DESC NULLS LAST;
!  f1 
! ----
!  42
!  10
!   7
!   3
!   1
!    
!    
! (7 rows)
! 
! DROP INDEX fooi;
! CREATE INDEX fooi ON foo (f1 DESC NULLS LAST);
! SELECT * FROM foo ORDER BY f1;
!  f1 
! ----
!   1
!   3
!   7
!  10
!  42
!    
!    
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 NULLS FIRST;
!  f1 
! ----
!    
!    
!   1
!   3
!   7
!  10
!  42
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 DESC;
!  f1 
! ----
!    
!    
!  42
!  10
!   7
!   3
!   1
! (7 rows)
! 
! SELECT * FROM foo ORDER BY f1 DESC NULLS LAST;
!  f1 
! ----
!  42
!  10
!   7
!   3
!   1
!    
!    
! (7 rows)
! 
! --
! -- Test some corner cases that have been known to confuse the planner
! --
! -- ORDER BY on a constant doesn't really need any sorting
! SELECT 1 AS x ORDER BY x;
!  x 
! ---
!  1
! (1 row)
! 
! -- But ORDER BY on a set-valued expression does
! create function sillysrf(int) returns setof int as
!   'values (1),(10),(2),($1)' language sql immutable;
! select sillysrf(42);
!  sillysrf 
! ----------
!         1
!        10
!         2
!        42
! (4 rows)
! 
! select sillysrf(-1) order by 1;
!  sillysrf 
! ----------
!        -1
!         1
!         2
!        10
! (4 rows)
! 
! drop function sillysrf(int);
! -- X = X isn't a no-op, it's effectively X IS NOT NULL assuming = is strict
! -- (see bug #5084)
! select * from (values (2),(null),(1)) v(k) where k = k order by k;
!  k 
! ---
!  1
!  2
! (2 rows)
! 
! select * from (values (2),(null),(1)) v(k) where k = k;
!  k 
! ---
!  2
!  1
! (2 rows)
! 
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/select_into.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/select_into.out	Tue Oct 28 15:53:05 2014
***************
*** 1,96 ****
! --
! -- SELECT_INTO
! --
! SELECT *
!    INTO TABLE tmp1
!    FROM onek
!    WHERE onek.unique1 < 2;
! DROP TABLE tmp1;
! SELECT *
!    INTO TABLE tmp1
!    FROM onek2
!    WHERE onek2.unique1 < 2;
! DROP TABLE tmp1;
! --
! -- SELECT INTO and INSERT permission, if owner is not allowed to insert.
! --
! CREATE SCHEMA selinto_schema;
! CREATE USER selinto_user;
! ALTER DEFAULT PRIVILEGES FOR ROLE selinto_user
! 	  REVOKE INSERT ON TABLES FROM selinto_user;
! GRANT ALL ON SCHEMA selinto_schema TO public;
! SET SESSION AUTHORIZATION selinto_user;
! SELECT * INTO TABLE selinto_schema.tmp1
! 	  FROM pg_class WHERE relname like '%a%';	-- Error
! ERROR:  permission denied for relation tmp1
! SELECT oid AS clsoid, relname, relnatts + 10 AS x
! 	  INTO selinto_schema.tmp2
! 	  FROM pg_class WHERE relname like '%b%';	-- Error
! ERROR:  permission denied for relation tmp2
! CREATE TABLE selinto_schema.tmp3 (a,b,c)
! 	   AS SELECT oid,relname,relacl FROM pg_class
! 	   WHERE relname like '%c%';	-- Error
! ERROR:  permission denied for relation tmp3
! RESET SESSION AUTHORIZATION;
! ALTER DEFAULT PRIVILEGES FOR ROLE selinto_user
! 	  GRANT INSERT ON TABLES TO selinto_user;
! SET SESSION AUTHORIZATION selinto_user;
! SELECT * INTO TABLE selinto_schema.tmp1
! 	  FROM pg_class WHERE relname like '%a%';	-- OK
! SELECT oid AS clsoid, relname, relnatts + 10 AS x
! 	  INTO selinto_schema.tmp2
! 	  FROM pg_class WHERE relname like '%b%';	-- OK
! CREATE TABLE selinto_schema.tmp3 (a,b,c)
! 	   AS SELECT oid,relname,relacl FROM pg_class
! 	   WHERE relname like '%c%';	-- OK
! RESET SESSION AUTHORIZATION;
! DROP SCHEMA selinto_schema CASCADE;
! NOTICE:  drop cascades to 3 other objects
! DETAIL:  drop cascades to table selinto_schema.tmp1
! drop cascades to table selinto_schema.tmp2
! drop cascades to table selinto_schema.tmp3
! DROP USER selinto_user;
! --
! -- CREATE TABLE AS/SELECT INTO as last command in a SQL function
! -- have been known to cause problems
! --
! CREATE FUNCTION make_table() RETURNS VOID
! AS $$
!   CREATE TABLE created_table AS SELECT * FROM int8_tbl;
! $$ LANGUAGE SQL;
! SELECT make_table();
!  make_table 
! ------------
!  
! (1 row)
! 
! SELECT * FROM created_table;
!         q1        |        q2         
! ------------------+-------------------
!               123 |               456
!               123 |  4567890123456789
!  4567890123456789 |               123
!  4567890123456789 |  4567890123456789
!  4567890123456789 | -4567890123456789
! (5 rows)
! 
! DROP TABLE created_table;
! --
! -- Disallowed uses of SELECT ... INTO.  All should fail
! --
! DECLARE foo CURSOR FOR SELECT 1 INTO b;
! ERROR:  SELECT ... INTO is not allowed here
! LINE 1: DECLARE foo CURSOR FOR SELECT 1 INTO b;
!                                              ^
! COPY (SELECT 1 INTO frak UNION SELECT 2) TO 'blob';
! ERROR:  COPY (SELECT INTO) is not supported
! SELECT * FROM (SELECT 1 INTO f) bar;
! ERROR:  SELECT ... INTO is not allowed here
! LINE 1: SELECT * FROM (SELECT 1 INTO f) bar;
!                                      ^
! CREATE VIEW foo AS SELECT 1 INTO b;
! ERROR:  views must not contain SELECT INTO
! INSERT INTO b SELECT 1 INTO f;
! ERROR:  SELECT ... INTO is not allowed here
! LINE 1: INSERT INTO b SELECT 1 INTO f;
!                                     ^
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/select_distinct.out	Sun Oct  3 21:26:00 2010
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/select_distinct.out	Tue Oct 28 15:53:05 2014
***************
*** 1,222 ****
! --
! -- SELECT_DISTINCT
! --
! --
! -- awk '{print $3;}' onek.data | sort -n | uniq
! --
! SELECT DISTINCT two FROM tmp ORDER BY 1;
!  two 
! -----
!    0
!    1
! (2 rows)
! 
! --
! -- awk '{print $5;}' onek.data | sort -n | uniq
! --
! SELECT DISTINCT ten FROM tmp ORDER BY 1;
!  ten 
! -----
!    0
!    1
!    2
!    3
!    4
!    5
!    6
!    7
!    8
!    9
! (10 rows)
! 
! --
! -- awk '{print $16;}' onek.data | sort -d | uniq
! --
! SELECT DISTINCT string4 FROM tmp ORDER BY 1;
!  string4 
! ---------
!  AAAAxx
!  HHHHxx
!  OOOOxx
!  VVVVxx
! (4 rows)
! 
! --
! -- awk '{print $3,$16,$5;}' onek.data | sort -d | uniq |
! -- sort +0n -1 +1d -2 +2n -3
! --
! SELECT DISTINCT two, string4, ten
!    FROM tmp
!    ORDER BY two using <, string4 using <, ten using <;
!  two | string4 | ten 
! -----+---------+-----
!    0 | AAAAxx  |   0
!    0 | AAAAxx  |   2
!    0 | AAAAxx  |   4
!    0 | AAAAxx  |   6
!    0 | AAAAxx  |   8
!    0 | HHHHxx  |   0
!    0 | HHHHxx  |   2
!    0 | HHHHxx  |   4
!    0 | HHHHxx  |   6
!    0 | HHHHxx  |   8
!    0 | OOOOxx  |   0
!    0 | OOOOxx  |   2
!    0 | OOOOxx  |   4
!    0 | OOOOxx  |   6
!    0 | OOOOxx  |   8
!    0 | VVVVxx  |   0
!    0 | VVVVxx  |   2
!    0 | VVVVxx  |   4
!    0 | VVVVxx  |   6
!    0 | VVVVxx  |   8
!    1 | AAAAxx  |   1
!    1 | AAAAxx  |   3
!    1 | AAAAxx  |   5
!    1 | AAAAxx  |   7
!    1 | AAAAxx  |   9
!    1 | HHHHxx  |   1
!    1 | HHHHxx  |   3
!    1 | HHHHxx  |   5
!    1 | HHHHxx  |   7
!    1 | HHHHxx  |   9
!    1 | OOOOxx  |   1
!    1 | OOOOxx  |   3
!    1 | OOOOxx  |   5
!    1 | OOOOxx  |   7
!    1 | OOOOxx  |   9
!    1 | VVVVxx  |   1
!    1 | VVVVxx  |   3
!    1 | VVVVxx  |   5
!    1 | VVVVxx  |   7
!    1 | VVVVxx  |   9
! (40 rows)
! 
! --
! -- awk '{print $2;}' person.data |
! -- awk '{if(NF!=1){print $2;}else{print;}}' - emp.data |
! -- awk '{if(NF!=1){print $2;}else{print;}}' - student.data |
! -- awk 'BEGIN{FS="      ";}{if(NF!=1){print $5;}else{print;}}' - stud_emp.data |
! -- sort -n -r | uniq
! --
! SELECT DISTINCT p.age FROM person* p ORDER BY age using >;
!  age 
! -----
!   98
!   88
!   78
!   68
!   60
!   58
!   50
!   48
!   40
!   38
!   34
!   30
!   28
!   25
!   24
!   23
!   20
!   19
!   18
!    8
! (20 rows)
! 
! --
! -- Also, some tests of IS DISTINCT FROM, which doesn't quite deserve its
! -- very own regression file.
! --
! CREATE TEMP TABLE disttable (f1 integer);
! INSERT INTO DISTTABLE VALUES(1);
! INSERT INTO DISTTABLE VALUES(2);
! INSERT INTO DISTTABLE VALUES(3);
! INSERT INTO DISTTABLE VALUES(NULL);
! -- basic cases
! SELECT f1, f1 IS DISTINCT FROM 2 as "not 2" FROM disttable;
!  f1 | not 2 
! ----+-------
!   1 | t
!   2 | f
!   3 | t
!     | t
! (4 rows)
! 
! SELECT f1, f1 IS DISTINCT FROM NULL as "not null" FROM disttable;
!  f1 | not null 
! ----+----------
!   1 | t
!   2 | t
!   3 | t
!     | f
! (4 rows)
! 
! SELECT f1, f1 IS DISTINCT FROM f1 as "false" FROM disttable;
!  f1 | false 
! ----+-------
!   1 | f
!   2 | f
!   3 | f
!     | f
! (4 rows)
! 
! SELECT f1, f1 IS DISTINCT FROM f1+1 as "not null" FROM disttable;
!  f1 | not null 
! ----+----------
!   1 | t
!   2 | t
!   3 | t
!     | f
! (4 rows)
! 
! -- check that optimizer constant-folds it properly
! SELECT 1 IS DISTINCT FROM 2 as "yes";
!  yes 
! -----
!  t
! (1 row)
! 
! SELECT 2 IS DISTINCT FROM 2 as "no";
!  no 
! ----
!  f
! (1 row)
! 
! SELECT 2 IS DISTINCT FROM null as "yes";
!  yes 
! -----
!  t
! (1 row)
! 
! SELECT null IS DISTINCT FROM null as "no";
!  no 
! ----
!  f
! (1 row)
! 
! -- negated form
! SELECT 1 IS NOT DISTINCT FROM 2 as "no";
!  no 
! ----
!  f
! (1 row)
! 
! SELECT 2 IS NOT DISTINCT FROM 2 as "yes";
!  yes 
! -----
!  t
! (1 row)
! 
! SELECT 2 IS NOT DISTINCT FROM null as "no";
!  no 
! ----
!  f
! (1 row)
! 
! SELECT null IS NOT DISTINCT FROM null as "yes";
!  yes 
! -----
!  t
! (1 row)
! 
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/select_distinct_on.out	Sun Oct  3 21:26:00 2010
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/select_distinct_on.out	Tue Oct 28 15:53:05 2014
***************
*** 1,75 ****
! --
! -- SELECT_DISTINCT_ON
! --
! SELECT DISTINCT ON (string4) string4, two, ten
!    FROM tmp
!    ORDER BY string4 using <, two using >, ten using <;
!  string4 | two | ten 
! ---------+-----+-----
!  AAAAxx  |   1 |   1
!  HHHHxx  |   1 |   1
!  OOOOxx  |   1 |   1
!  VVVVxx  |   1 |   1
! (4 rows)
! 
! -- this will fail due to conflict of ordering requirements
! SELECT DISTINCT ON (string4, ten) string4, two, ten
!    FROM tmp
!    ORDER BY string4 using <, two using <, ten using <;
! ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY expressions
! LINE 1: SELECT DISTINCT ON (string4, ten) string4, two, ten
!                                      ^
! SELECT DISTINCT ON (string4, ten) string4, ten, two
!    FROM tmp
!    ORDER BY string4 using <, ten using >, two using <;
!  string4 | ten | two 
! ---------+-----+-----
!  AAAAxx  |   9 |   1
!  AAAAxx  |   8 |   0
!  AAAAxx  |   7 |   1
!  AAAAxx  |   6 |   0
!  AAAAxx  |   5 |   1
!  AAAAxx  |   4 |   0
!  AAAAxx  |   3 |   1
!  AAAAxx  |   2 |   0
!  AAAAxx  |   1 |   1
!  AAAAxx  |   0 |   0
!  HHHHxx  |   9 |   1
!  HHHHxx  |   8 |   0
!  HHHHxx  |   7 |   1
!  HHHHxx  |   6 |   0
!  HHHHxx  |   5 |   1
!  HHHHxx  |   4 |   0
!  HHHHxx  |   3 |   1
!  HHHHxx  |   2 |   0
!  HHHHxx  |   1 |   1
!  HHHHxx  |   0 |   0
!  OOOOxx  |   9 |   1
!  OOOOxx  |   8 |   0
!  OOOOxx  |   7 |   1
!  OOOOxx  |   6 |   0
!  OOOOxx  |   5 |   1
!  OOOOxx  |   4 |   0
!  OOOOxx  |   3 |   1
!  OOOOxx  |   2 |   0
!  OOOOxx  |   1 |   1
!  OOOOxx  |   0 |   0
!  VVVVxx  |   9 |   1
!  VVVVxx  |   8 |   0
!  VVVVxx  |   7 |   1
!  VVVVxx  |   6 |   0
!  VVVVxx  |   5 |   1
!  VVVVxx  |   4 |   0
!  VVVVxx  |   3 |   1
!  VVVVxx  |   2 |   0
!  VVVVxx  |   1 |   1
!  VVVVxx  |   0 |   0
! (40 rows)
! 
! -- bug #5049: early 8.4.x chokes on volatile DISTINCT ON clauses
! select distinct on (1) floor(random()) as r, f1 from int4_tbl order by 1,2;
!  r |     f1      
! ---+-------------
!  0 | -2147483647
! (1 row)
! 
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/select_implicit.out	Sun Dec 12 20:21:38 2010
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/select_implicit.out	Tue Oct 28 15:53:05 2014
***************
*** 1,336 ****
! --
! -- SELECT_IMPLICIT
! -- Test cases for queries with ordering terms missing from the target list.
! -- This used to be called "junkfilter.sql".
! -- The parser uses the term "resjunk" to handle these cases.
! -- - thomas 1998-07-09
! --
! -- load test data
! CREATE TABLE test_missing_target (a int, b int, c char(8), d char);
! INSERT INTO test_missing_target VALUES (0, 1, 'XXXX', 'A');
! INSERT INTO test_missing_target VALUES (1, 2, 'ABAB', 'b');
! INSERT INTO test_missing_target VALUES (2, 2, 'ABAB', 'c');
! INSERT INTO test_missing_target VALUES (3, 3, 'BBBB', 'D');
! INSERT INTO test_missing_target VALUES (4, 3, 'BBBB', 'e');
! INSERT INTO test_missing_target VALUES (5, 3, 'bbbb', 'F');
! INSERT INTO test_missing_target VALUES (6, 4, 'cccc', 'g');
! INSERT INTO test_missing_target VALUES (7, 4, 'cccc', 'h');
! INSERT INTO test_missing_target VALUES (8, 4, 'CCCC', 'I');
! INSERT INTO test_missing_target VALUES (9, 4, 'CCCC', 'j');
! --   w/ existing GROUP BY target
! SELECT c, count(*) FROM test_missing_target GROUP BY test_missing_target.c ORDER BY c;
!     c     | count 
! ----------+-------
!  ABAB     |     2
!  BBBB     |     2
!  CCCC     |     2
!  XXXX     |     1
!  bbbb     |     1
!  cccc     |     2
! (6 rows)
! 
! --   w/o existing GROUP BY target using a relation name in GROUP BY clause
! SELECT count(*) FROM test_missing_target GROUP BY test_missing_target.c ORDER BY c;
!  count 
! -------
!      2
!      2
!      2
!      1
!      1
!      2
! (6 rows)
! 
! --   w/o existing GROUP BY target and w/o existing a different ORDER BY target
! --   failure expected
! SELECT count(*) FROM test_missing_target GROUP BY a ORDER BY b;
! ERROR:  column "test_missing_target.b" must appear in the GROUP BY clause or be used in an aggregate function
! LINE 1: ...ECT count(*) FROM test_missing_target GROUP BY a ORDER BY b;
!                                                                      ^
! --   w/o existing GROUP BY target and w/o existing same ORDER BY target
! SELECT count(*) FROM test_missing_target GROUP BY b ORDER BY b;
!  count 
! -------
!      1
!      2
!      3
!      4
! (4 rows)
! 
! --   w/ existing GROUP BY target using a relation name in target
! SELECT test_missing_target.b, count(*)
!   FROM test_missing_target GROUP BY b ORDER BY b;
!  b | count 
! ---+-------
!  1 |     1
!  2 |     2
!  3 |     3
!  4 |     4
! (4 rows)
! 
! --   w/o existing GROUP BY target
! SELECT c FROM test_missing_target ORDER BY a;
!     c     
! ----------
!  XXXX    
!  ABAB    
!  ABAB    
!  BBBB    
!  BBBB    
!  bbbb    
!  cccc    
!  cccc    
!  CCCC    
!  CCCC    
! (10 rows)
! 
! --   w/o existing ORDER BY target
! SELECT count(*) FROM test_missing_target GROUP BY b ORDER BY b desc;
!  count 
! -------
!      4
!      3
!      2
!      1
! (4 rows)
! 
! --   group using reference number
! SELECT count(*) FROM test_missing_target ORDER BY 1 desc;
!  count 
! -------
!     10
! (1 row)
! 
! --   order using reference number
! SELECT c, count(*) FROM test_missing_target GROUP BY 1 ORDER BY 1;
!     c     | count 
! ----------+-------
!  ABAB     |     2
!  BBBB     |     2
!  CCCC     |     2
!  XXXX     |     1
!  bbbb     |     1
!  cccc     |     2
! (6 rows)
! 
! --   group using reference number out of range
! --   failure expected
! SELECT c, count(*) FROM test_missing_target GROUP BY 3;
! ERROR:  GROUP BY position 3 is not in select list
! LINE 1: SELECT c, count(*) FROM test_missing_target GROUP BY 3;
!                                                              ^
! --   group w/o existing GROUP BY and ORDER BY target under ambiguous condition
! --   failure expected
! SELECT count(*) FROM test_missing_target x, test_missing_target y
! 	WHERE x.a = y.a
! 	GROUP BY b ORDER BY b;
! ERROR:  column reference "b" is ambiguous
! LINE 3:  GROUP BY b ORDER BY b;
!                              ^
! --   order w/ target under ambiguous condition
! --   failure NOT expected
! SELECT a, a FROM test_missing_target
! 	ORDER BY a;
!  a | a 
! ---+---
!  0 | 0
!  1 | 1
!  2 | 2
!  3 | 3
!  4 | 4
!  5 | 5
!  6 | 6
!  7 | 7
!  8 | 8
!  9 | 9
! (10 rows)
! 
! --   order expression w/ target under ambiguous condition
! --   failure NOT expected
! SELECT a/2, a/2 FROM test_missing_target
! 	ORDER BY a/2;
!  ?column? | ?column? 
! ----------+----------
!         0 |        0
!         0 |        0
!         1 |        1
!         1 |        1
!         2 |        2
!         2 |        2
!         3 |        3
!         3 |        3
!         4 |        4
!         4 |        4
! (10 rows)
! 
! --   group expression w/ target under ambiguous condition
! --   failure NOT expected
! SELECT a/2, a/2 FROM test_missing_target
! 	GROUP BY a/2 ORDER BY a/2;
!  ?column? | ?column? 
! ----------+----------
!         0 |        0
!         1 |        1
!         2 |        2
!         3 |        3
!         4 |        4
! (5 rows)
! 
! --   group w/ existing GROUP BY target under ambiguous condition
! SELECT x.b, count(*) FROM test_missing_target x, test_missing_target y
! 	WHERE x.a = y.a
! 	GROUP BY x.b ORDER BY x.b;
!  b | count 
! ---+-------
!  1 |     1
!  2 |     2
!  3 |     3
!  4 |     4
! (4 rows)
! 
! --   group w/o existing GROUP BY target under ambiguous condition
! SELECT count(*) FROM test_missing_target x, test_missing_target y
! 	WHERE x.a = y.a
! 	GROUP BY x.b ORDER BY x.b;
!  count 
! -------
!      1
!      2
!      3
!      4
! (4 rows)
! 
! --   group w/o existing GROUP BY target under ambiguous condition
! --   into a table
! SELECT count(*) INTO TABLE test_missing_target2
! FROM test_missing_target x, test_missing_target y
! 	WHERE x.a = y.a
! 	GROUP BY x.b ORDER BY x.b;
! SELECT * FROM test_missing_target2;
!  count 
! -------
!      1
!      2
!      3
!      4
! (4 rows)
! 
! --  Functions and expressions
! --   w/ existing GROUP BY target
! SELECT a%2, count(b) FROM test_missing_target
! GROUP BY test_missing_target.a%2
! ORDER BY test_missing_target.a%2;
!  ?column? | count 
! ----------+-------
!         0 |     5
!         1 |     5
! (2 rows)
! 
! --   w/o existing GROUP BY target using a relation name in GROUP BY clause
! SELECT count(c) FROM test_missing_target
! GROUP BY lower(test_missing_target.c)
! ORDER BY lower(test_missing_target.c);
!  count 
! -------
!      2
!      3
!      4
!      1
! (4 rows)
! 
! --   w/o existing GROUP BY target and w/o existing a different ORDER BY target
! --   failure expected
! SELECT count(a) FROM test_missing_target GROUP BY a ORDER BY b;
! ERROR:  column "test_missing_target.b" must appear in the GROUP BY clause or be used in an aggregate function
! LINE 1: ...ECT count(a) FROM test_missing_target GROUP BY a ORDER BY b;
!                                                                      ^
! --   w/o existing GROUP BY target and w/o existing same ORDER BY target
! SELECT count(b) FROM test_missing_target GROUP BY b/2 ORDER BY b/2;
!  count 
! -------
!      1
!      5
!      4
! (3 rows)
! 
! --   w/ existing GROUP BY target using a relation name in target
! SELECT lower(test_missing_target.c), count(c)
!   FROM test_missing_target GROUP BY lower(c) ORDER BY lower(c);
!  lower | count 
! -------+-------
!  abab  |     2
!  bbbb  |     3
!  cccc  |     4
!  xxxx  |     1
! (4 rows)
! 
! --   w/o existing GROUP BY target
! SELECT a FROM test_missing_target ORDER BY upper(d);
!  a 
! ---
!  0
!  1
!  2
!  3
!  4
!  5
!  6
!  7
!  8
!  9
! (10 rows)
! 
! --   w/o existing ORDER BY target
! SELECT count(b) FROM test_missing_target
! 	GROUP BY (b + 1) / 2 ORDER BY (b + 1) / 2 desc;
!  count 
! -------
!      7
!      3
! (2 rows)
! 
! --   group w/o existing GROUP BY and ORDER BY target under ambiguous condition
! --   failure expected
! SELECT count(x.a) FROM test_missing_target x, test_missing_target y
! 	WHERE x.a = y.a
! 	GROUP BY b/2 ORDER BY b/2;
! ERROR:  column reference "b" is ambiguous
! LINE 3:  GROUP BY b/2 ORDER BY b/2;
!                                ^
! --   group w/ existing GROUP BY target under ambiguous condition
! SELECT x.b/2, count(x.b) FROM test_missing_target x, test_missing_target y
! 	WHERE x.a = y.a
! 	GROUP BY x.b/2 ORDER BY x.b/2;
!  ?column? | count 
! ----------+-------
!         0 |     1
!         1 |     5
!         2 |     4
! (3 rows)
! 
! --   group w/o existing GROUP BY target under ambiguous condition
! --   failure expected due to ambiguous b in count(b)
! SELECT count(b) FROM test_missing_target x, test_missing_target y
! 	WHERE x.a = y.a
! 	GROUP BY x.b/2;
! ERROR:  column reference "b" is ambiguous
! LINE 1: SELECT count(b) FROM test_missing_target x, test_missing_tar...
!                      ^
! --   group w/o existing GROUP BY target under ambiguous condition
! --   into a table
! SELECT count(x.b) INTO TABLE test_missing_target3
! FROM test_missing_target x, test_missing_target y
! 	WHERE x.a = y.a
! 	GROUP BY x.b/2 ORDER BY x.b/2;
! SELECT * FROM test_missing_target3;
!  count 
! -------
!      1
!      5
!      4
! (3 rows)
! 
! --   Cleanup
! DROP TABLE test_missing_target;
! DROP TABLE test_missing_target2;
! DROP TABLE test_missing_target3;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/select_having.out	Sun Oct  3 21:26:00 2010
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/select_having.out	Tue Oct 28 15:53:05 2014
***************
*** 1,93 ****
! --
! -- SELECT_HAVING
! --
! -- load test data
! CREATE TABLE test_having (a int, b int, c char(8), d char);
! INSERT INTO test_having VALUES (0, 1, 'XXXX', 'A');
! INSERT INTO test_having VALUES (1, 2, 'AAAA', 'b');
! INSERT INTO test_having VALUES (2, 2, 'AAAA', 'c');
! INSERT INTO test_having VALUES (3, 3, 'BBBB', 'D');
! INSERT INTO test_having VALUES (4, 3, 'BBBB', 'e');
! INSERT INTO test_having VALUES (5, 3, 'bbbb', 'F');
! INSERT INTO test_having VALUES (6, 4, 'cccc', 'g');
! INSERT INTO test_having VALUES (7, 4, 'cccc', 'h');
! INSERT INTO test_having VALUES (8, 4, 'CCCC', 'I');
! INSERT INTO test_having VALUES (9, 4, 'CCCC', 'j');
! SELECT b, c FROM test_having
! 	GROUP BY b, c HAVING count(*) = 1 ORDER BY b, c;
!  b |    c     
! ---+----------
!  1 | XXXX    
!  3 | bbbb    
! (2 rows)
! 
! -- HAVING is effectively equivalent to WHERE in this case
! SELECT b, c FROM test_having
! 	GROUP BY b, c HAVING b = 3 ORDER BY b, c;
!  b |    c     
! ---+----------
!  3 | BBBB    
!  3 | bbbb    
! (2 rows)
! 
! SELECT lower(c), count(c) FROM test_having
! 	GROUP BY lower(c) HAVING count(*) > 2 OR min(a) = max(a)
! 	ORDER BY lower(c);
!  lower | count 
! -------+-------
!  bbbb  |     3
!  cccc  |     4
!  xxxx  |     1
! (3 rows)
! 
! SELECT c, max(a) FROM test_having
! 	GROUP BY c HAVING count(*) > 2 OR min(a) = max(a)
! 	ORDER BY c;
!     c     | max 
! ----------+-----
!  XXXX     |   0
!  bbbb     |   5
! (2 rows)
! 
! -- test degenerate cases involving HAVING without GROUP BY
! -- Per SQL spec, these should generate 0 or 1 row, even without aggregates
! SELECT min(a), max(a) FROM test_having HAVING min(a) = max(a);
!  min | max 
! -----+-----
! (0 rows)
! 
! SELECT min(a), max(a) FROM test_having HAVING min(a) < max(a);
!  min | max 
! -----+-----
!    0 |   9
! (1 row)
! 
! -- errors: ungrouped column references
! SELECT a FROM test_having HAVING min(a) < max(a);
! ERROR:  column "test_having.a" must appear in the GROUP BY clause or be used in an aggregate function
! LINE 1: SELECT a FROM test_having HAVING min(a) < max(a);
!                ^
! SELECT 1 AS one FROM test_having HAVING a > 1;
! ERROR:  column "test_having.a" must appear in the GROUP BY clause or be used in an aggregate function
! LINE 1: SELECT 1 AS one FROM test_having HAVING a > 1;
!                                                 ^
! -- the really degenerate case: need not scan table at all
! SELECT 1 AS one FROM test_having HAVING 1 > 2;
!  one 
! -----
! (0 rows)
! 
! SELECT 1 AS one FROM test_having HAVING 1 < 2;
!  one 
! -----
!    1
! (1 row)
! 
! -- and just to prove that we aren't scanning the table:
! SELECT 1 AS one FROM test_having WHERE 1/a = 1 HAVING 1 < 2;
!  one 
! -----
!    1
! (1 row)
! 
! DROP TABLE test_having;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/subselect.out	Thu Oct 16 14:31:37 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/subselect.out	Tue Oct 28 15:53:05 2014
***************
*** 1,823 ****
! --
! -- SUBSELECT
! --
! SELECT 1 AS one WHERE 1 IN (SELECT 1);
!  one 
! -----
!    1
! (1 row)
! 
! SELECT 1 AS zero WHERE 1 NOT IN (SELECT 1);
!  zero 
! ------
! (0 rows)
! 
! SELECT 1 AS zero WHERE 1 IN (SELECT 2);
!  zero 
! ------
! (0 rows)
! 
! -- Check grammar's handling of extra parens in assorted contexts
! SELECT * FROM (SELECT 1 AS x) ss;
!  x 
! ---
!  1
! (1 row)
! 
! SELECT * FROM ((SELECT 1 AS x)) ss;
!  x 
! ---
!  1
! (1 row)
! 
! (SELECT 2) UNION SELECT 2;
!  ?column? 
! ----------
!         2
! (1 row)
! 
! ((SELECT 2)) UNION SELECT 2;
!  ?column? 
! ----------
!         2
! (1 row)
! 
! SELECT ((SELECT 2) UNION SELECT 2);
!  ?column? 
! ----------
!         2
! (1 row)
! 
! SELECT (((SELECT 2)) UNION SELECT 2);
!  ?column? 
! ----------
!         2
! (1 row)
! 
! SELECT (SELECT ARRAY[1,2,3])[1];
!  array 
! -------
!      1
! (1 row)
! 
! SELECT ((SELECT ARRAY[1,2,3]))[2];
!  array 
! -------
!      2
! (1 row)
! 
! SELECT (((SELECT ARRAY[1,2,3])))[3];
!  array 
! -------
!      3
! (1 row)
! 
! -- Set up some simple test tables
! CREATE TABLE SUBSELECT_TBL (
!   f1 integer,
!   f2 integer,
!   f3 float
! );
! INSERT INTO SUBSELECT_TBL VALUES (1, 2, 3);
! INSERT INTO SUBSELECT_TBL VALUES (2, 3, 4);
! INSERT INTO SUBSELECT_TBL VALUES (3, 4, 5);
! INSERT INTO SUBSELECT_TBL VALUES (1, 1, 1);
! INSERT INTO SUBSELECT_TBL VALUES (2, 2, 2);
! INSERT INTO SUBSELECT_TBL VALUES (3, 3, 3);
! INSERT INTO SUBSELECT_TBL VALUES (6, 7, 8);
! INSERT INTO SUBSELECT_TBL VALUES (8, 9, NULL);
! SELECT '' AS eight, * FROM SUBSELECT_TBL;
!  eight | f1 | f2 | f3 
! -------+----+----+----
!        |  1 |  2 |  3
!        |  2 |  3 |  4
!        |  3 |  4 |  5
!        |  1 |  1 |  1
!        |  2 |  2 |  2
!        |  3 |  3 |  3
!        |  6 |  7 |  8
!        |  8 |  9 |   
! (8 rows)
! 
! -- Uncorrelated subselects
! SELECT '' AS two, f1 AS "Constant Select" FROM SUBSELECT_TBL
!   WHERE f1 IN (SELECT 1);
!  two | Constant Select 
! -----+-----------------
!      |               1
!      |               1
! (2 rows)
! 
! SELECT '' AS six, f1 AS "Uncorrelated Field" FROM SUBSELECT_TBL
!   WHERE f1 IN (SELECT f2 FROM SUBSELECT_TBL);
!  six | Uncorrelated Field 
! -----+--------------------
!      |                  1
!      |                  2
!      |                  3
!      |                  1
!      |                  2
!      |                  3
! (6 rows)
! 
! SELECT '' AS six, f1 AS "Uncorrelated Field" FROM SUBSELECT_TBL
!   WHERE f1 IN (SELECT f2 FROM SUBSELECT_TBL WHERE
!     f2 IN (SELECT f1 FROM SUBSELECT_TBL));
!  six | Uncorrelated Field 
! -----+--------------------
!      |                  1
!      |                  2
!      |                  3
!      |                  1
!      |                  2
!      |                  3
! (6 rows)
! 
! SELECT '' AS three, f1, f2
!   FROM SUBSELECT_TBL
!   WHERE (f1, f2) NOT IN (SELECT f2, CAST(f3 AS int4) FROM SUBSELECT_TBL
!                          WHERE f3 IS NOT NULL);
!  three | f1 | f2 
! -------+----+----
!        |  1 |  2
!        |  6 |  7
!        |  8 |  9
! (3 rows)
! 
! -- Correlated subselects
! SELECT '' AS six, f1 AS "Correlated Field", f2 AS "Second Field"
!   FROM SUBSELECT_TBL upper
!   WHERE f1 IN (SELECT f2 FROM SUBSELECT_TBL WHERE f1 = upper.f1);
!  six | Correlated Field | Second Field 
! -----+------------------+--------------
!      |                1 |            2
!      |                2 |            3
!      |                3 |            4
!      |                1 |            1
!      |                2 |            2
!      |                3 |            3
! (6 rows)
! 
! SELECT '' AS six, f1 AS "Correlated Field", f3 AS "Second Field"
!   FROM SUBSELECT_TBL upper
!   WHERE f1 IN
!     (SELECT f2 FROM SUBSELECT_TBL WHERE CAST(upper.f2 AS float) = f3);
!  six | Correlated Field | Second Field 
! -----+------------------+--------------
!      |                2 |            4
!      |                3 |            5
!      |                1 |            1
!      |                2 |            2
!      |                3 |            3
! (5 rows)
! 
! SELECT '' AS six, f1 AS "Correlated Field", f3 AS "Second Field"
!   FROM SUBSELECT_TBL upper
!   WHERE f3 IN (SELECT upper.f1 + f2 FROM SUBSELECT_TBL
!                WHERE f2 = CAST(f3 AS integer));
!  six | Correlated Field | Second Field 
! -----+------------------+--------------
!      |                1 |            3
!      |                2 |            4
!      |                3 |            5
!      |                6 |            8
! (4 rows)
! 
! SELECT '' AS five, f1 AS "Correlated Field"
!   FROM SUBSELECT_TBL
!   WHERE (f1, f2) IN (SELECT f2, CAST(f3 AS int4) FROM SUBSELECT_TBL
!                      WHERE f3 IS NOT NULL);
!  five | Correlated Field 
! ------+------------------
!       |                2
!       |                3
!       |                1
!       |                2
!       |                3
! (5 rows)
! 
! --
! -- Use some existing tables in the regression test
! --
! SELECT '' AS eight, ss.f1 AS "Correlated Field", ss.f3 AS "Second Field"
!   FROM SUBSELECT_TBL ss
!   WHERE f1 NOT IN (SELECT f1+1 FROM INT4_TBL
!                    WHERE f1 != ss.f1 AND f1 < 2147483647);
!  eight | Correlated Field | Second Field 
! -------+------------------+--------------
!        |                2 |            4
!        |                3 |            5
!        |                2 |            2
!        |                3 |            3
!        |                6 |            8
!        |                8 |             
! (6 rows)
! 
! select q1, float8(count(*)) / (select count(*) from int8_tbl)
! from int8_tbl group by q1 order by q1;
!         q1        | ?column? 
! ------------------+----------
!               123 |      0.4
!  4567890123456789 |      0.6
! (2 rows)
! 
! --
! -- Test cases to catch unpleasant interactions between IN-join processing
! -- and subquery pullup.
! --
! select count(*) from
!   (select 1 from tenk1 a
!    where unique1 IN (select hundred from tenk1 b)) ss;
!  count 
! -------
!    100
! (1 row)
! 
! select count(distinct ss.ten) from
!   (select ten from tenk1 a
!    where unique1 IN (select hundred from tenk1 b)) ss;
!  count 
! -------
!     10
! (1 row)
! 
! select count(*) from
!   (select 1 from tenk1 a
!    where unique1 IN (select distinct hundred from tenk1 b)) ss;
!  count 
! -------
!    100
! (1 row)
! 
! select count(distinct ss.ten) from
!   (select ten from tenk1 a
!    where unique1 IN (select distinct hundred from tenk1 b)) ss;
!  count 
! -------
!     10
! (1 row)
! 
! --
! -- Test cases to check for overenthusiastic optimization of
! -- "IN (SELECT DISTINCT ...)" and related cases.  Per example from
! -- Luca Pireddu and Michael Fuhr.
! --
! CREATE TEMP TABLE foo (id integer);
! CREATE TEMP TABLE bar (id1 integer, id2 integer);
! INSERT INTO foo VALUES (1);
! INSERT INTO bar VALUES (1, 1);
! INSERT INTO bar VALUES (2, 2);
! INSERT INTO bar VALUES (3, 1);
! -- These cases require an extra level of distinct-ing above subquery s
! SELECT * FROM foo WHERE id IN
!     (SELECT id2 FROM (SELECT DISTINCT id1, id2 FROM bar) AS s);
!  id 
! ----
!   1
! (1 row)
! 
! SELECT * FROM foo WHERE id IN
!     (SELECT id2 FROM (SELECT id1,id2 FROM bar GROUP BY id1,id2) AS s);
!  id 
! ----
!   1
! (1 row)
! 
! SELECT * FROM foo WHERE id IN
!     (SELECT id2 FROM (SELECT id1, id2 FROM bar UNION
!                       SELECT id1, id2 FROM bar) AS s);
!  id 
! ----
!   1
! (1 row)
! 
! -- These cases do not
! SELECT * FROM foo WHERE id IN
!     (SELECT id2 FROM (SELECT DISTINCT ON (id2) id1, id2 FROM bar) AS s);
!  id 
! ----
!   1
! (1 row)
! 
! SELECT * FROM foo WHERE id IN
!     (SELECT id2 FROM (SELECT id2 FROM bar GROUP BY id2) AS s);
!  id 
! ----
!   1
! (1 row)
! 
! SELECT * FROM foo WHERE id IN
!     (SELECT id2 FROM (SELECT id2 FROM bar UNION
!                       SELECT id2 FROM bar) AS s);
!  id 
! ----
!   1
! (1 row)
! 
! --
! -- Test case to catch problems with multiply nested sub-SELECTs not getting
! -- recalculated properly.  Per bug report from Didier Moens.
! --
! CREATE TABLE orderstest (
!     approver_ref integer,
!     po_ref integer,
!     ordercanceled boolean
! );
! INSERT INTO orderstest VALUES (1, 1, false);
! INSERT INTO orderstest VALUES (66, 5, false);
! INSERT INTO orderstest VALUES (66, 6, false);
! INSERT INTO orderstest VALUES (66, 7, false);
! INSERT INTO orderstest VALUES (66, 1, true);
! INSERT INTO orderstest VALUES (66, 8, false);
! INSERT INTO orderstest VALUES (66, 1, false);
! INSERT INTO orderstest VALUES (77, 1, false);
! INSERT INTO orderstest VALUES (1, 1, false);
! INSERT INTO orderstest VALUES (66, 1, false);
! INSERT INTO orderstest VALUES (1, 1, false);
! CREATE VIEW orders_view AS
! SELECT *,
! (SELECT CASE
!    WHEN ord.approver_ref=1 THEN '---' ELSE 'Approved'
!  END) AS "Approved",
! (SELECT CASE
!  WHEN ord.ordercanceled
!  THEN 'Canceled'
!  ELSE
!   (SELECT CASE
! 		WHEN ord.po_ref=1
! 		THEN
! 		 (SELECT CASE
! 				WHEN ord.approver_ref=1
! 				THEN '---'
! 				ELSE 'Approved'
! 			END)
! 		ELSE 'PO'
! 	END)
! END) AS "Status",
! (CASE
!  WHEN ord.ordercanceled
!  THEN 'Canceled'
!  ELSE
!   (CASE
! 		WHEN ord.po_ref=1
! 		THEN
! 		 (CASE
! 				WHEN ord.approver_ref=1
! 				THEN '---'
! 				ELSE 'Approved'
! 			END)
! 		ELSE 'PO'
! 	END)
! END) AS "Status_OK"
! FROM orderstest ord;
! SELECT * FROM orders_view;
!  approver_ref | po_ref | ordercanceled | Approved |  Status  | Status_OK 
! --------------+--------+---------------+----------+----------+-----------
!             1 |      1 | f             | ---      | ---      | ---
!            66 |      5 | f             | Approved | PO       | PO
!            66 |      6 | f             | Approved | PO       | PO
!            66 |      7 | f             | Approved | PO       | PO
!            66 |      1 | t             | Approved | Canceled | Canceled
!            66 |      8 | f             | Approved | PO       | PO
!            66 |      1 | f             | Approved | Approved | Approved
!            77 |      1 | f             | Approved | Approved | Approved
!             1 |      1 | f             | ---      | ---      | ---
!            66 |      1 | f             | Approved | Approved | Approved
!             1 |      1 | f             | ---      | ---      | ---
! (11 rows)
! 
! DROP TABLE orderstest cascade;
! NOTICE:  drop cascades to view orders_view
! --
! -- Test cases to catch situations where rule rewriter fails to propagate
! -- hasSubLinks flag correctly.  Per example from Kyle Bateman.
! --
! create temp table parts (
!     partnum     text,
!     cost        float8
! );
! create temp table shipped (
!     ttype       char(2),
!     ordnum      int4,
!     partnum     text,
!     value       float8
! );
! create temp view shipped_view as
!     select * from shipped where ttype = 'wt';
! create rule shipped_view_insert as on insert to shipped_view do instead
!     insert into shipped values('wt', new.ordnum, new.partnum, new.value);
! insert into parts (partnum, cost) values (1, 1234.56);
! insert into shipped_view (ordnum, partnum, value)
!     values (0, 1, (select cost from parts where partnum = '1'));
! select * from shipped_view;
!  ttype | ordnum | partnum |  value  
! -------+--------+---------+---------
!  wt    |      0 | 1       | 1234.56
! (1 row)
! 
! create rule shipped_view_update as on update to shipped_view do instead
!     update shipped set partnum = new.partnum, value = new.value
!         where ttype = new.ttype and ordnum = new.ordnum;
! update shipped_view set value = 11
!     from int4_tbl a join int4_tbl b
!       on (a.f1 = (select f1 from int4_tbl c where c.f1=b.f1))
!     where ordnum = a.f1;
! select * from shipped_view;
!  ttype | ordnum | partnum | value 
! -------+--------+---------+-------
!  wt    |      0 | 1       |    11
! (1 row)
! 
! select f1, ss1 as relabel from
!     (select *, (select sum(f1) from int4_tbl b where f1 >= a.f1) as ss1
!      from int4_tbl a) ss;
!      f1      |  relabel   
! -------------+------------
!            0 | 2147607103
!       123456 | 2147607103
!      -123456 | 2147483647
!   2147483647 | 2147483647
!  -2147483647 |          0
! (5 rows)
! 
! --
! -- Test cases involving PARAM_EXEC parameters and min/max index optimizations.
! -- Per bug report from David Sanchez i Gregori.
! --
! select * from (
!   select max(unique1) from tenk1 as a
!   where exists (select 1 from tenk1 as b where b.thousand = a.unique2)
! ) ss;
!  max  
! ------
!  9997
! (1 row)
! 
! select * from (
!   select min(unique1) from tenk1 as a
!   where not exists (select 1 from tenk1 as b where b.unique2 = 10000)
! ) ss;
!  min 
! -----
!    0
! (1 row)
! 
! --
! -- Test that an IN implemented using a UniquePath does unique-ification
! -- with the right semantics, as per bug #4113.  (Unfortunately we have
! -- no simple way to ensure that this test case actually chooses that type
! -- of plan, but it does in releases 7.4-8.3.  Note that an ordering difference
! -- here might mean that some other plan type is being used, rendering the test
! -- pointless.)
! --
! create temp table numeric_table (num_col numeric);
! insert into numeric_table values (1), (1.000000000000000000001), (2), (3);
! create temp table float_table (float_col float8);
! insert into float_table values (1), (2), (3);
! select * from float_table
!   where float_col in (select num_col from numeric_table);
!  float_col 
! -----------
!          1
!          2
!          3
! (3 rows)
! 
! select * from numeric_table
!   where num_col in (select float_col from float_table);
!          num_col         
! -------------------------
!                        1
!  1.000000000000000000001
!                        2
!                        3
! (4 rows)
! 
! --
! -- Test case for bug #4290: bogus calculation of subplan param sets
! --
! create temp table ta (id int primary key, val int);
! insert into ta values(1,1);
! insert into ta values(2,2);
! create temp table tb (id int primary key, aval int);
! insert into tb values(1,1);
! insert into tb values(2,1);
! insert into tb values(3,2);
! insert into tb values(4,2);
! create temp table tc (id int primary key, aid int);
! insert into tc values(1,1);
! insert into tc values(2,2);
! select
!   ( select min(tb.id) from tb
!     where tb.aval = (select ta.val from ta where ta.id = tc.aid) ) as min_tb_id
! from tc;
!  min_tb_id 
! -----------
!          1
!          3
! (2 rows)
! 
! --
! -- Test case for 8.3 "failed to locate grouping columns" bug
! --
! create temp table t1 (f1 numeric(14,0), f2 varchar(30));
! select * from
!   (select distinct f1, f2, (select f2 from t1 x where x.f1 = up.f1) as fs
!    from t1 up) ss
! group by f1,f2,fs;
!  f1 | f2 | fs 
! ----+----+----
! (0 rows)
! 
! --
! -- Test case for bug #5514 (mishandling of whole-row Vars in subselects)
! --
! create temp table table_a(id integer);
! insert into table_a values (42);
! create temp view view_a as select * from table_a;
! select view_a from view_a;
!  view_a 
! --------
!  (42)
! (1 row)
! 
! select (select view_a) from view_a;
!  view_a 
! --------
!  (42)
! (1 row)
! 
! select (select (select view_a)) from view_a;
!  view_a 
! --------
!  (42)
! (1 row)
! 
! select (select (a.*)::text) from view_a a;
!   a   
! ------
!  (42)
! (1 row)
! 
! --
! -- Check that whole-row Vars reading the result of a subselect don't include
! -- any junk columns therein
! --
! select q from (select max(f1) from int4_tbl group by f1 order by f1) q;
!        q       
! ---------------
!  (-2147483647)
!  (-123456)
!  (0)
!  (123456)
!  (2147483647)
! (5 rows)
! 
! with q as (select max(f1) from int4_tbl group by f1 order by f1)
!   select q from q;
!        q       
! ---------------
!  (-2147483647)
!  (-123456)
!  (0)
!  (123456)
!  (2147483647)
! (5 rows)
! 
! --
! -- Test case for sublinks pushed down into subselects via join alias expansion
! --
! select
!   (select sq1) as qq1
! from
!   (select exists(select 1 from int4_tbl where f1 = q2) as sq1, 42 as dummy
!    from int8_tbl) sq0
!   join
!   int4_tbl i4 on dummy = i4.f1;
!  qq1 
! -----
! (0 rows)
! 
! --
! -- Test case for cross-type partial matching in hashed subplan (bug #7597)
! --
! create temp table outer_7597 (f1 int4, f2 int4);
! insert into outer_7597 values (0, 0);
! insert into outer_7597 values (1, 0);
! insert into outer_7597 values (0, null);
! insert into outer_7597 values (1, null);
! create temp table inner_7597(c1 int8, c2 int8);
! insert into inner_7597 values(0, null);
! select * from outer_7597 where (f1, f2) not in (select * from inner_7597);
!  f1 | f2 
! ----+----
!   1 |  0
!   1 |   
! (2 rows)
! 
! --
! -- Test case for premature memory release during hashing of subplan output
! --
! select '1'::text in (select '1'::name union all select '1'::name);
!  ?column? 
! ----------
!  t
! (1 row)
! 
! --
! -- Test case for planner bug with nested EXISTS handling
! --
! select a.thousand from tenk1 a, tenk1 b
! where a.thousand = b.thousand
!   and exists ( select 1 from tenk1 c where b.hundred = c.hundred
!                    and not exists ( select 1 from tenk1 d
!                                     where a.thousand = d.thousand ) );
!  thousand 
! ----------
! (0 rows)
! 
! --
! -- Check that nested sub-selects are not pulled up if they contain volatiles
! --
! explain (verbose, costs off)
!   select x, x from
!     (select (select now()) as x from (values(1),(2)) v(y)) ss;
!         QUERY PLAN         
! ---------------------------
!  Values Scan on "*VALUES*"
!    Output: $0, $1
!    InitPlan 1 (returns $0)
!      ->  Result
!            Output: now()
!    InitPlan 2 (returns $1)
!      ->  Result
!            Output: now()
! (8 rows)
! 
! explain (verbose, costs off)
!   select x, x from
!     (select (select random()) as x from (values(1),(2)) v(y)) ss;
!             QUERY PLAN            
! ----------------------------------
!  Subquery Scan on ss
!    Output: ss.x, ss.x
!    ->  Values Scan on "*VALUES*"
!          Output: $0
!          InitPlan 1 (returns $0)
!            ->  Result
!                  Output: random()
! (7 rows)
! 
! explain (verbose, costs off)
!   select x, x from
!     (select (select now() where y=y) as x from (values(1),(2)) v(y)) ss;
!                               QUERY PLAN                              
! ----------------------------------------------------------------------
!  Values Scan on "*VALUES*"
!    Output: (SubPlan 1), (SubPlan 2)
!    SubPlan 1
!      ->  Result
!            Output: now()
!            One-Time Filter: ("*VALUES*".column1 = "*VALUES*".column1)
!    SubPlan 2
!      ->  Result
!            Output: now()
!            One-Time Filter: ("*VALUES*".column1 = "*VALUES*".column1)
! (10 rows)
! 
! explain (verbose, costs off)
!   select x, x from
!     (select (select random() where y=y) as x from (values(1),(2)) v(y)) ss;
!                                  QUERY PLAN                                 
! ----------------------------------------------------------------------------
!  Subquery Scan on ss
!    Output: ss.x, ss.x
!    ->  Values Scan on "*VALUES*"
!          Output: (SubPlan 1)
!          SubPlan 1
!            ->  Result
!                  Output: random()
!                  One-Time Filter: ("*VALUES*".column1 = "*VALUES*".column1)
! (8 rows)
! 
! --
! -- Check we behave sanely in corner case of empty SELECT list (bug #8648)
! --
! create temp table nocolumns();
! select exists(select * from nocolumns);
!  exists 
! --------
!  f
! (1 row)
! 
! --
! -- Check sane behavior with nested IN SubLinks
! --
! explain (verbose, costs off)
! select * from int4_tbl where
!   (case when f1 in (select unique1 from tenk1 a) then f1 else null end) in
!   (select ten from tenk1 b);
!                                                                                       QUERY PLAN                                                                                       
! ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
!  Nested Loop Semi Join
!    Output: int4_tbl.f1
!    Join Filter: (CASE WHEN (hashed SubPlan 1) THEN int4_tbl.f1 ELSE NULL::integer END = b.ten)
!    ->  Seq Scan on public.int4_tbl
!          Output: int4_tbl.f1
!    ->  Seq Scan on public.tenk1 b
!          Output: b.unique1, b.unique2, b.two, b.four, b.ten, b.twenty, b.hundred, b.thousand, b.twothousand, b.fivethous, b.tenthous, b.odd, b.even, b.stringu1, b.stringu2, b.string4
!    SubPlan 1
!      ->  Index Only Scan using tenk1_unique1 on public.tenk1 a
!            Output: a.unique1
! (10 rows)
! 
! select * from int4_tbl where
!   (case when f1 in (select unique1 from tenk1 a) then f1 else null end) in
!   (select ten from tenk1 b);
!  f1 
! ----
!   0
! (1 row)
! 
! --
! -- Check for incorrect optimization when IN subquery contains a SRF
! --
! explain (verbose, costs off)
! select * from int4_tbl o where (f1, f1) in
!   (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
!                               QUERY PLAN                              
! ----------------------------------------------------------------------
!  Hash Join
!    Output: o.f1
!    Hash Cond: (o.f1 = "ANY_subquery".f1)
!    ->  Seq Scan on public.int4_tbl o
!          Output: o.f1
!    ->  Hash
!          Output: "ANY_subquery".f1, "ANY_subquery".g
!          ->  HashAggregate
!                Output: "ANY_subquery".f1, "ANY_subquery".g
!                Group Key: "ANY_subquery".f1, "ANY_subquery".g
!                ->  Subquery Scan on "ANY_subquery"
!                      Output: "ANY_subquery".f1, "ANY_subquery".g
!                      Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
!                      ->  HashAggregate
!                            Output: i.f1, (generate_series(1, 2) / 10)
!                            Group Key: i.f1
!                            ->  Seq Scan on public.int4_tbl i
!                                  Output: i.f1
! (18 rows)
! 
! select * from int4_tbl o where (f1, f1) in
!   (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
!  f1 
! ----
!   0
! (1 row)
! 
! --
! -- check for over-optimization of whole-row Var referencing an Append plan
! --
! select (select q from
!          (select 1,2,3 where f1 > 0
!           union all
!           select 4,5,6.0 where f1 <= 0
!          ) q )
! from int4_tbl;
!      q     
! -----------
!  (4,5,6.0)
!  (1,2,3)
!  (4,5,6.0)
!  (1,2,3)
!  (4,5,6.0)
! (5 rows)
! 
! --
! -- Check that volatile quals aren't pushed down past a DISTINCT:
! -- nextval() should not be called more than the nominal number of times
! --
! create temp sequence ts1;
! select * from
!   (select distinct ten from tenk1) ss
!   where ten < 10 + nextval('ts1')
!   order by 1;
!  ten 
! -----
!    0
!    1
!    2
!    3
!    4
!    5
!    6
!    7
!    8
!    9
! (10 rows)
! 
! select nextval('ts1');
!  nextval 
! ---------
!       11
! (1 row)
! 
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/union.out	Thu Oct 16 14:31:37 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/union.out	Tue Oct 28 15:53:05 2014
***************
*** 1,712 ****
! --
! -- UNION (also INTERSECT, EXCEPT)
! --
! -- Simple UNION constructs
! SELECT 1 AS two UNION SELECT 2;
!  two 
! -----
!    1
!    2
! (2 rows)
! 
! SELECT 1 AS one UNION SELECT 1;
!  one 
! -----
!    1
! (1 row)
! 
! SELECT 1 AS two UNION ALL SELECT 2;
!  two 
! -----
!    1
!    2
! (2 rows)
! 
! SELECT 1 AS two UNION ALL SELECT 1;
!  two 
! -----
!    1
!    1
! (2 rows)
! 
! SELECT 1 AS three UNION SELECT 2 UNION SELECT 3;
!  three 
! -------
!      1
!      2
!      3
! (3 rows)
! 
! SELECT 1 AS two UNION SELECT 2 UNION SELECT 2;
!  two 
! -----
!    1
!    2
! (2 rows)
! 
! SELECT 1 AS three UNION SELECT 2 UNION ALL SELECT 2;
!  three 
! -------
!      1
!      2
!      2
! (3 rows)
! 
! SELECT 1.1 AS two UNION SELECT 2.2;
!  two 
! -----
!  1.1
!  2.2
! (2 rows)
! 
! -- Mixed types
! SELECT 1.1 AS two UNION SELECT 2;
!  two 
! -----
!  1.1
!    2
! (2 rows)
! 
! SELECT 1 AS two UNION SELECT 2.2;
!  two 
! -----
!    1
!  2.2
! (2 rows)
! 
! SELECT 1 AS one UNION SELECT 1.0::float8;
!  one 
! -----
!    1
! (1 row)
! 
! SELECT 1.1 AS two UNION ALL SELECT 2;
!  two 
! -----
!  1.1
!    2
! (2 rows)
! 
! SELECT 1.0::float8 AS two UNION ALL SELECT 1;
!  two 
! -----
!    1
!    1
! (2 rows)
! 
! SELECT 1.1 AS three UNION SELECT 2 UNION SELECT 3;
!  three 
! -------
!    1.1
!      2
!      3
! (3 rows)
! 
! SELECT 1.1::float8 AS two UNION SELECT 2 UNION SELECT 2.0::float8 ORDER BY 1;
!  two 
! -----
!  1.1
!    2
! (2 rows)
! 
! SELECT 1.1 AS three UNION SELECT 2 UNION ALL SELECT 2;
!  three 
! -------
!    1.1
!      2
!      2
! (3 rows)
! 
! SELECT 1.1 AS two UNION (SELECT 2 UNION ALL SELECT 2);
!  two 
! -----
!  1.1
!    2
! (2 rows)
! 
! --
! -- Try testing from tables...
! --
! SELECT f1 AS five FROM FLOAT8_TBL
! UNION
! SELECT f1 FROM FLOAT8_TBL
! ORDER BY 1;
!          five          
! -----------------------
!  -1.2345678901234e+200
!                -1004.3
!                 -34.84
!  -1.2345678901234e-200
!                      0
! (5 rows)
! 
! SELECT f1 AS ten FROM FLOAT8_TBL
! UNION ALL
! SELECT f1 FROM FLOAT8_TBL;
!           ten          
! -----------------------
!                      0
!                 -34.84
!                -1004.3
!  -1.2345678901234e+200
!  -1.2345678901234e-200
!                      0
!                 -34.84
!                -1004.3
!  -1.2345678901234e+200
!  -1.2345678901234e-200
! (10 rows)
! 
! SELECT f1 AS nine FROM FLOAT8_TBL
! UNION
! SELECT f1 FROM INT4_TBL
! ORDER BY 1;
!          nine          
! -----------------------
!  -1.2345678901234e+200
!            -2147483647
!                -123456
!                -1004.3
!                 -34.84
!  -1.2345678901234e-200
!                      0
!                 123456
!             2147483647
! (9 rows)
! 
! SELECT f1 AS ten FROM FLOAT8_TBL
! UNION ALL
! SELECT f1 FROM INT4_TBL;
!           ten          
! -----------------------
!                      0
!                 -34.84
!                -1004.3
!  -1.2345678901234e+200
!  -1.2345678901234e-200
!                      0
!                 123456
!                -123456
!             2147483647
!            -2147483647
! (10 rows)
! 
! SELECT f1 AS five FROM FLOAT8_TBL
!   WHERE f1 BETWEEN -1e6 AND 1e6
! UNION
! SELECT f1 FROM INT4_TBL
!   WHERE f1 BETWEEN 0 AND 1000000;
!          five          
! -----------------------
!                -1004.3
!                 -34.84
!  -1.2345678901234e-200
!                      0
!                 123456
! (5 rows)
! 
! SELECT CAST(f1 AS char(4)) AS three FROM VARCHAR_TBL
! UNION
! SELECT f1 FROM CHAR_TBL
! ORDER BY 1;
!  three 
! -------
!  a   
!  ab  
!  abcd
! (3 rows)
! 
! SELECT f1 AS three FROM VARCHAR_TBL
! UNION
! SELECT CAST(f1 AS varchar) FROM CHAR_TBL
! ORDER BY 1;
!  three 
! -------
!  a
!  ab
!  abcd
! (3 rows)
! 
! SELECT f1 AS eight FROM VARCHAR_TBL
! UNION ALL
! SELECT f1 FROM CHAR_TBL;
!  eight 
! -------
!  a
!  ab
!  abcd
!  abcd
!  a
!  ab
!  abcd
!  abcd
! (8 rows)
! 
! SELECT f1 AS five FROM TEXT_TBL
! UNION
! SELECT f1 FROM VARCHAR_TBL
! UNION
! SELECT TRIM(TRAILING FROM f1) FROM CHAR_TBL
! ORDER BY 1;
!        five        
! -------------------
!  a
!  ab
!  abcd
!  doh!
!  hi de ho neighbor
! (5 rows)
! 
! --
! -- INTERSECT and EXCEPT
! --
! SELECT q2 FROM int8_tbl INTERSECT SELECT q1 FROM int8_tbl;
!         q2        
! ------------------
!  4567890123456789
!               123
! (2 rows)
! 
! SELECT q2 FROM int8_tbl INTERSECT ALL SELECT q1 FROM int8_tbl;
!         q2        
! ------------------
!  4567890123456789
!  4567890123456789
!               123
! (3 rows)
! 
! SELECT q2 FROM int8_tbl EXCEPT SELECT q1 FROM int8_tbl ORDER BY 1;
!         q2         
! -------------------
!  -4567890123456789
!                456
! (2 rows)
! 
! SELECT q2 FROM int8_tbl EXCEPT ALL SELECT q1 FROM int8_tbl ORDER BY 1;
!         q2         
! -------------------
!  -4567890123456789
!                456
! (2 rows)
! 
! SELECT q2 FROM int8_tbl EXCEPT ALL SELECT DISTINCT q1 FROM int8_tbl ORDER BY 1;
!         q2         
! -------------------
!  -4567890123456789
!                456
!   4567890123456789
! (3 rows)
! 
! SELECT q1 FROM int8_tbl EXCEPT SELECT q2 FROM int8_tbl;
!  q1 
! ----
! (0 rows)
! 
! SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q2 FROM int8_tbl;
!         q1        
! ------------------
!  4567890123456789
!               123
! (2 rows)
! 
! SELECT q1 FROM int8_tbl EXCEPT ALL SELECT DISTINCT q2 FROM int8_tbl;
!         q1        
! ------------------
!  4567890123456789
!  4567890123456789
!               123
! (3 rows)
! 
! SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q1 FROM int8_tbl FOR NO KEY UPDATE;
! ERROR:  FOR NO KEY UPDATE is not allowed with UNION/INTERSECT/EXCEPT
! --
! -- Mixed types
! --
! SELECT f1 FROM float8_tbl INTERSECT SELECT f1 FROM int4_tbl;
!  f1 
! ----
!   0
! (1 row)
! 
! SELECT f1 FROM float8_tbl EXCEPT SELECT f1 FROM int4_tbl ORDER BY 1;
!           f1           
! -----------------------
!  -1.2345678901234e+200
!                -1004.3
!                 -34.84
!  -1.2345678901234e-200
! (4 rows)
! 
! --
! -- Operator precedence and (((((extra))))) parentheses
! --
! SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl;
!         q1         
! -------------------
!   4567890123456789
!                123
!                456
!   4567890123456789
!                123
!   4567890123456789
!  -4567890123456789
! (7 rows)
! 
! SELECT q1 FROM int8_tbl INTERSECT (((SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl)));
!         q1        
! ------------------
!  4567890123456789
!               123
! (2 rows)
! 
! (((SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl))) UNION ALL SELECT q2 FROM int8_tbl;
!         q1         
! -------------------
!   4567890123456789
!                123
!                456
!   4567890123456789
!                123
!   4567890123456789
!  -4567890123456789
! (7 rows)
! 
! SELECT q1 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl EXCEPT SELECT q1 FROM int8_tbl ORDER BY 1;
!         q1         
! -------------------
!  -4567890123456789
!                456
! (2 rows)
! 
! SELECT q1 FROM int8_tbl UNION ALL (((SELECT q2 FROM int8_tbl EXCEPT SELECT q1 FROM int8_tbl ORDER BY 1)));
!         q1         
! -------------------
!                123
!                123
!   4567890123456789
!   4567890123456789
!   4567890123456789
!  -4567890123456789
!                456
! (7 rows)
! 
! (((SELECT q1 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl))) EXCEPT SELECT q1 FROM int8_tbl ORDER BY 1;
!         q1         
! -------------------
!  -4567890123456789
!                456
! (2 rows)
! 
! --
! -- Subqueries with ORDER BY & LIMIT clauses
! --
! -- In this syntax, ORDER BY/LIMIT apply to the result of the EXCEPT
! SELECT q1,q2 FROM int8_tbl EXCEPT SELECT q2,q1 FROM int8_tbl
! ORDER BY q2,q1;
!         q1        |        q2         
! ------------------+-------------------
!  4567890123456789 | -4567890123456789
!               123 |               456
! (2 rows)
! 
! -- This should fail, because q2 isn't a name of an EXCEPT output column
! SELECT q1 FROM int8_tbl EXCEPT SELECT q2 FROM int8_tbl ORDER BY q2 LIMIT 1;
! ERROR:  column "q2" does not exist
! LINE 1: ... int8_tbl EXCEPT SELECT q2 FROM int8_tbl ORDER BY q2 LIMIT 1...
!                                                              ^
! HINT:  There is a column named "q2" in table "*SELECT* 2", but it cannot be referenced from this part of the query.
! -- But this should work:
! SELECT q1 FROM int8_tbl EXCEPT (((SELECT q2 FROM int8_tbl ORDER BY q2 LIMIT 1)));
!         q1        
! ------------------
!  4567890123456789
!               123
! (2 rows)
! 
! --
! -- New syntaxes (7.1) permit new tests
! --
! (((((select * from int8_tbl)))));
!         q1        |        q2         
! ------------------+-------------------
!               123 |               456
!               123 |  4567890123456789
!  4567890123456789 |               123
!  4567890123456789 |  4567890123456789
!  4567890123456789 | -4567890123456789
! (5 rows)
! 
! --
! -- Check handling of a case with unknown constants.  We don't guarantee
! -- an undecorated constant will work in all cases, but historically this
! -- usage has worked, so test we don't break it.
! --
! SELECT a.f1 FROM (SELECT 'test' AS f1 FROM varchar_tbl) a
! UNION
! SELECT b.f1 FROM (SELECT f1 FROM varchar_tbl) b
! ORDER BY 1;
!   f1  
! ------
!  a
!  ab
!  abcd
!  test
! (4 rows)
! 
! -- This should fail, but it should produce an error cursor
! SELECT '3.4'::numeric UNION SELECT 'foo';
! ERROR:  invalid input syntax for type numeric: "foo"
! LINE 1: SELECT '3.4'::numeric UNION SELECT 'foo';
!                                            ^
! --
! -- Test that expression-index constraints can be pushed down through
! -- UNION or UNION ALL
! --
! CREATE TEMP TABLE t1 (a text, b text);
! CREATE INDEX t1_ab_idx on t1 ((a || b));
! CREATE TEMP TABLE t2 (ab text primary key);
! INSERT INTO t1 VALUES ('a', 'b'), ('x', 'y');
! INSERT INTO t2 VALUES ('ab'), ('xy');
! set enable_seqscan = off;
! set enable_indexscan = on;
! set enable_bitmapscan = off;
! explain (costs off)
!  SELECT * FROM
!  (SELECT a || b AS ab FROM t1
!   UNION ALL
!   SELECT * FROM t2) t
!  WHERE ab = 'ab';
!                  QUERY PLAN                  
! ---------------------------------------------
!  Append
!    ->  Index Scan using t1_ab_idx on t1
!          Index Cond: ((a || b) = 'ab'::text)
!    ->  Index Only Scan using t2_pkey on t2
!          Index Cond: (ab = 'ab'::text)
! (5 rows)
! 
! explain (costs off)
!  SELECT * FROM
!  (SELECT a || b AS ab FROM t1
!   UNION
!   SELECT * FROM t2) t
!  WHERE ab = 'ab';
!                     QUERY PLAN                     
! ---------------------------------------------------
!  HashAggregate
!    Group Key: ((t1.a || t1.b))
!    ->  Append
!          ->  Index Scan using t1_ab_idx on t1
!                Index Cond: ((a || b) = 'ab'::text)
!          ->  Index Only Scan using t2_pkey on t2
!                Index Cond: (ab = 'ab'::text)
! (7 rows)
! 
! --
! -- Test that ORDER BY for UNION ALL can be pushed down to inheritance
! -- children.
! --
! CREATE TEMP TABLE t1c (b text, a text);
! ALTER TABLE t1c INHERIT t1;
! CREATE TEMP TABLE t2c (primary key (ab)) INHERITS (t2);
! INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'), ('m', 'n'), ('e', 'f');
! INSERT INTO t2c VALUES ('vw'), ('cd'), ('mn'), ('ef');
! CREATE INDEX t1c_ab_idx on t1c ((a || b));
! set enable_seqscan = on;
! set enable_indexonlyscan = off;
! explain (costs off)
!   SELECT * FROM
!   (SELECT a || b AS ab FROM t1
!    UNION ALL
!    SELECT ab FROM t2) t
!   ORDER BY 1 LIMIT 8;
!                    QUERY PLAN                   
! ------------------------------------------------
!  Limit
!    ->  Merge Append
!          Sort Key: ((t1.a || t1.b))
!          ->  Index Scan using t1_ab_idx on t1
!          ->  Index Scan using t1c_ab_idx on t1c
!          ->  Index Scan using t2_pkey on t2
!          ->  Index Scan using t2c_pkey on t2c
! (7 rows)
! 
!   SELECT * FROM
!   (SELECT a || b AS ab FROM t1
!    UNION ALL
!    SELECT ab FROM t2) t
!   ORDER BY 1 LIMIT 8;
!  ab 
! ----
!  ab
!  ab
!  cd
!  dc
!  ef
!  fe
!  mn
!  nm
! (8 rows)
! 
! reset enable_seqscan;
! reset enable_indexscan;
! reset enable_bitmapscan;
! -- This simpler variant of the above test has been observed to fail differently
! create table events (event_id int primary key);
! create table other_events (event_id int primary key);
! create table events_child () inherits (events);
! explain (costs off)
! select event_id
!  from (select event_id from events
!        union all
!        select event_id from other_events) ss
!  order by event_id;
!                         QUERY PLAN                        
! ----------------------------------------------------------
!  Merge Append
!    Sort Key: events.event_id
!    ->  Index Scan using events_pkey on events
!    ->  Sort
!          Sort Key: events_child.event_id
!          ->  Seq Scan on events_child
!    ->  Index Scan using other_events_pkey on other_events
! (7 rows)
! 
! drop table events_child, events, other_events;
! reset enable_indexonlyscan;
! -- Test constraint exclusion of UNION ALL subqueries
! explain (costs off)
!  SELECT * FROM
!   (SELECT 1 AS t, * FROM tenk1 a
!    UNION ALL
!    SELECT 2 AS t, * FROM tenk1 b) c
!  WHERE t = 2;
!         QUERY PLAN         
! ---------------------------
!  Append
!    ->  Seq Scan on tenk1 b
! (2 rows)
! 
! -- Test that we push quals into UNION sub-selects only when it's safe
! explain (costs off)
! SELECT * FROM
!   (SELECT 1 AS t, 2 AS x
!    UNION
!    SELECT 2 AS t, 4 AS x) ss
! WHERE x < 4;
!                  QUERY PLAN                 
! --------------------------------------------
!  Unique
!    ->  Sort
!          Sort Key: (1), (2)
!          ->  Append
!                ->  Result
!                ->  Result
!                      One-Time Filter: false
! (7 rows)
! 
! SELECT * FROM
!   (SELECT 1 AS t, 2 AS x
!    UNION
!    SELECT 2 AS t, 4 AS x) ss
! WHERE x < 4;
!  t | x 
! ---+---
!  1 | 2
! (1 row)
! 
! explain (costs off)
! SELECT * FROM
!   (SELECT 1 AS t, generate_series(1,10) AS x
!    UNION
!    SELECT 2 AS t, 4 AS x) ss
! WHERE x < 4
! ORDER BY x;
!                        QUERY PLAN                       
! --------------------------------------------------------
!  Sort
!    Sort Key: ss.x
!    ->  Subquery Scan on ss
!          Filter: (ss.x < 4)
!          ->  HashAggregate
!                Group Key: (1), (generate_series(1, 10))
!                ->  Append
!                      ->  Result
!                      ->  Result
! (9 rows)
! 
! SELECT * FROM
!   (SELECT 1 AS t, generate_series(1,10) AS x
!    UNION
!    SELECT 2 AS t, 4 AS x) ss
! WHERE x < 4
! ORDER BY x;
!  t | x 
! ---+---
!  1 | 1
!  1 | 2
!  1 | 3
! (3 rows)
! 
! explain (costs off)
! SELECT * FROM
!   (SELECT 1 AS t, (random()*3)::int AS x
!    UNION
!    SELECT 2 AS t, 4 AS x) ss
! WHERE x > 3;
!                                  QUERY PLAN                                 
! ----------------------------------------------------------------------------
!  Subquery Scan on ss
!    Filter: (ss.x > 3)
!    ->  Unique
!          ->  Sort
!                Sort Key: (1), (((random() * 3::double precision))::integer)
!                ->  Append
!                      ->  Result
!                      ->  Result
! (8 rows)
! 
! SELECT * FROM
!   (SELECT 1 AS t, (random()*3)::int AS x
!    UNION
!    SELECT 2 AS t, 4 AS x) ss
! WHERE x > 3;
!  t | x 
! ---+---
!  2 | 4
! (1 row)
! 
! -- Test proper handling of parameterized appendrel paths when the
! -- potential join qual is expensive
! create function expensivefunc(int) returns int
! language plpgsql immutable strict cost 10000
! as $$begin return $1; end$$;
! create temp table t3 as select generate_series(-1000,1000) as x;
! create index t3i on t3 (expensivefunc(x));
! analyze t3;
! explain (costs off)
! select * from
!   (select * from t3 a union all select * from t3 b) ss
!   join int4_tbl on f1 = expensivefunc(x);
!                          QUERY PLAN                         
! ------------------------------------------------------------
!  Nested Loop
!    ->  Seq Scan on int4_tbl
!    ->  Append
!          ->  Index Scan using t3i on t3 a
!                Index Cond: (expensivefunc(x) = int4_tbl.f1)
!          ->  Index Scan using t3i on t3 b
!                Index Cond: (expensivefunc(x) = int4_tbl.f1)
! (7 rows)
! 
! select * from
!   (select * from t3 a union all select * from t3 b) ss
!   join int4_tbl on f1 = expensivefunc(x);
!  x | f1 
! ---+----
!  0 |  0
!  0 |  0
! (2 rows)
! 
! drop table t3;
! drop function expensivefunc(int);
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/case.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/case.out	Tue Oct 28 15:53:05 2014
***************
*** 1,303 ****
! --
! -- CASE
! -- Test the case statement
! --
! CREATE TABLE CASE_TBL (
!   i integer,
!   f double precision
! );
! CREATE TABLE CASE2_TBL (
!   i integer,
!   j integer
! );
! INSERT INTO CASE_TBL VALUES (1, 10.1);
! INSERT INTO CASE_TBL VALUES (2, 20.2);
! INSERT INTO CASE_TBL VALUES (3, -30.3);
! INSERT INTO CASE_TBL VALUES (4, NULL);
! INSERT INTO CASE2_TBL VALUES (1, -1);
! INSERT INTO CASE2_TBL VALUES (2, -2);
! INSERT INTO CASE2_TBL VALUES (3, -3);
! INSERT INTO CASE2_TBL VALUES (2, -4);
! INSERT INTO CASE2_TBL VALUES (1, NULL);
! INSERT INTO CASE2_TBL VALUES (NULL, -6);
! --
! -- Simplest examples without tables
! --
! SELECT '3' AS "One",
!   CASE
!     WHEN 1 < 2 THEN 3
!   END AS "Simple WHEN";
!  One | Simple WHEN 
! -----+-------------
!  3   |           3
! (1 row)
! 
! SELECT '<NULL>' AS "One",
!   CASE
!     WHEN 1 > 2 THEN 3
!   END AS "Simple default";
!   One   | Simple default 
! --------+----------------
!  <NULL> |               
! (1 row)
! 
! SELECT '3' AS "One",
!   CASE
!     WHEN 1 < 2 THEN 3
!     ELSE 4
!   END AS "Simple ELSE";
!  One | Simple ELSE 
! -----+-------------
!  3   |           3
! (1 row)
! 
! SELECT '4' AS "One",
!   CASE
!     WHEN 1 > 2 THEN 3
!     ELSE 4
!   END AS "ELSE default";
!  One | ELSE default 
! -----+--------------
!  4   |            4
! (1 row)
! 
! SELECT '6' AS "One",
!   CASE
!     WHEN 1 > 2 THEN 3
!     WHEN 4 < 5 THEN 6
!     ELSE 7
!   END AS "Two WHEN with default";
!  One | Two WHEN with default 
! -----+-----------------------
!  6   |                     6
! (1 row)
! 
! -- Constant-expression folding shouldn't evaluate unreachable subexpressions
! SELECT CASE WHEN 1=0 THEN 1/0 WHEN 1=1 THEN 1 ELSE 2/0 END;
!  case 
! ------
!     1
! (1 row)
! 
! SELECT CASE 1 WHEN 0 THEN 1/0 WHEN 1 THEN 1 ELSE 2/0 END;
!  case 
! ------
!     1
! (1 row)
! 
! -- However we do not currently suppress folding of potentially
! -- reachable subexpressions
! SELECT CASE WHEN i > 100 THEN 1/0 ELSE 0 END FROM case_tbl;
! ERROR:  division by zero
! -- Test for cases involving untyped literals in test expression
! SELECT CASE 'a' WHEN 'a' THEN 1 ELSE 2 END;
!  case 
! ------
!     1
! (1 row)
! 
! --
! -- Examples of targets involving tables
! --
! SELECT '' AS "Five",
!   CASE
!     WHEN i >= 3 THEN i
!   END AS ">= 3 or Null"
!   FROM CASE_TBL;
!  Five | >= 3 or Null 
! ------+--------------
!       |             
!       |             
!       |            3
!       |            4
! (4 rows)
! 
! SELECT '' AS "Five",
!   CASE WHEN i >= 3 THEN (i + i)
!        ELSE i
!   END AS "Simplest Math"
!   FROM CASE_TBL;
!  Five | Simplest Math 
! ------+---------------
!       |             1
!       |             2
!       |             6
!       |             8
! (4 rows)
! 
! SELECT '' AS "Five", i AS "Value",
!   CASE WHEN (i < 0) THEN 'small'
!        WHEN (i = 0) THEN 'zero'
!        WHEN (i = 1) THEN 'one'
!        WHEN (i = 2) THEN 'two'
!        ELSE 'big'
!   END AS "Category"
!   FROM CASE_TBL;
!  Five | Value | Category 
! ------+-------+----------
!       |     1 | one
!       |     2 | two
!       |     3 | big
!       |     4 | big
! (4 rows)
! 
! SELECT '' AS "Five",
!   CASE WHEN ((i < 0) or (i < 0)) THEN 'small'
!        WHEN ((i = 0) or (i = 0)) THEN 'zero'
!        WHEN ((i = 1) or (i = 1)) THEN 'one'
!        WHEN ((i = 2) or (i = 2)) THEN 'two'
!        ELSE 'big'
!   END AS "Category"
!   FROM CASE_TBL;
!  Five | Category 
! ------+----------
!       | one
!       | two
!       | big
!       | big
! (4 rows)
! 
! --
! -- Examples of qualifications involving tables
! --
! --
! -- NULLIF() and COALESCE()
! -- Shorthand forms for typical CASE constructs
! --  defined in the SQL standard.
! --
! SELECT * FROM CASE_TBL WHERE COALESCE(f,i) = 4;
!  i | f 
! ---+---
!  4 |  
! (1 row)
! 
! SELECT * FROM CASE_TBL WHERE NULLIF(f,i) = 2;
!  i | f 
! ---+---
! (0 rows)
! 
! SELECT COALESCE(a.f, b.i, b.j)
!   FROM CASE_TBL a, CASE2_TBL b;
!  coalesce 
! ----------
!      10.1
!      20.2
!     -30.3
!         1
!      10.1
!      20.2
!     -30.3
!         2
!      10.1
!      20.2
!     -30.3
!         3
!      10.1
!      20.2
!     -30.3
!         2
!      10.1
!      20.2
!     -30.3
!         1
!      10.1
!      20.2
!     -30.3
!        -6
! (24 rows)
! 
! SELECT *
!   FROM CASE_TBL a, CASE2_TBL b
!   WHERE COALESCE(a.f, b.i, b.j) = 2;
!  i | f | i | j  
! ---+---+---+----
!  4 |   | 2 | -2
!  4 |   | 2 | -4
! (2 rows)
! 
! SELECT '' AS Five, NULLIF(a.i,b.i) AS "NULLIF(a.i,b.i)",
!   NULLIF(b.i, 4) AS "NULLIF(b.i,4)"
!   FROM CASE_TBL a, CASE2_TBL b;
!  five | NULLIF(a.i,b.i) | NULLIF(b.i,4) 
! ------+-----------------+---------------
!       |                 |             1
!       |               2 |             1
!       |               3 |             1
!       |               4 |             1
!       |               1 |             2
!       |                 |             2
!       |               3 |             2
!       |               4 |             2
!       |               1 |             3
!       |               2 |             3
!       |                 |             3
!       |               4 |             3
!       |               1 |             2
!       |                 |             2
!       |               3 |             2
!       |               4 |             2
!       |                 |             1
!       |               2 |             1
!       |               3 |             1
!       |               4 |             1
!       |               1 |              
!       |               2 |              
!       |               3 |              
!       |               4 |              
! (24 rows)
! 
! SELECT '' AS "Two", *
!   FROM CASE_TBL a, CASE2_TBL b
!   WHERE COALESCE(f,b.i) = 2;
!  Two | i | f | i | j  
! -----+---+---+---+----
!      | 4 |   | 2 | -2
!      | 4 |   | 2 | -4
! (2 rows)
! 
! --
! -- Examples of updates involving tables
! --
! UPDATE CASE_TBL
!   SET i = CASE WHEN i >= 3 THEN (- i)
!                 ELSE (2 * i) END;
! SELECT * FROM CASE_TBL;
!  i  |   f   
! ----+-------
!   2 |  10.1
!   4 |  20.2
!  -3 | -30.3
!  -4 |      
! (4 rows)
! 
! UPDATE CASE_TBL
!   SET i = CASE WHEN i >= 2 THEN (2 * i)
!                 ELSE (3 * i) END;
! SELECT * FROM CASE_TBL;
!   i  |   f   
! -----+-------
!    4 |  10.1
!    8 |  20.2
!   -9 | -30.3
!  -12 |      
! (4 rows)
! 
! UPDATE CASE_TBL
!   SET i = CASE WHEN b.i >= 2 THEN (2 * j)
!                 ELSE (3 * j) END
!   FROM CASE2_TBL b
!   WHERE j = -CASE_TBL.i;
! SELECT * FROM CASE_TBL;
!   i  |   f   
! -----+-------
!    8 |  20.2
!   -9 | -30.3
!  -12 |      
!   -8 |  10.1
! (4 rows)
! 
! --
! -- Clean up
! --
! DROP TABLE CASE_TBL;
! DROP TABLE CASE2_TBL;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/join.out	Thu Oct 16 14:31:37 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/join.out	Tue Oct 28 15:53:05 2014
***************
*** 1,4341 ****
! --
! -- JOIN
! -- Test JOIN clauses
! --
! CREATE TABLE J1_TBL (
!   i integer,
!   j integer,
!   t text
! );
! CREATE TABLE J2_TBL (
!   i integer,
!   k integer
! );
! INSERT INTO J1_TBL VALUES (1, 4, 'one');
! INSERT INTO J1_TBL VALUES (2, 3, 'two');
! INSERT INTO J1_TBL VALUES (3, 2, 'three');
! INSERT INTO J1_TBL VALUES (4, 1, 'four');
! INSERT INTO J1_TBL VALUES (5, 0, 'five');
! INSERT INTO J1_TBL VALUES (6, 6, 'six');
! INSERT INTO J1_TBL VALUES (7, 7, 'seven');
! INSERT INTO J1_TBL VALUES (8, 8, 'eight');
! INSERT INTO J1_TBL VALUES (0, NULL, 'zero');
! INSERT INTO J1_TBL VALUES (NULL, NULL, 'null');
! INSERT INTO J1_TBL VALUES (NULL, 0, 'zero');
! INSERT INTO J2_TBL VALUES (1, -1);
! INSERT INTO J2_TBL VALUES (2, 2);
! INSERT INTO J2_TBL VALUES (3, -3);
! INSERT INTO J2_TBL VALUES (2, 4);
! INSERT INTO J2_TBL VALUES (5, -5);
! INSERT INTO J2_TBL VALUES (5, -5);
! INSERT INTO J2_TBL VALUES (0, NULL);
! INSERT INTO J2_TBL VALUES (NULL, NULL);
! INSERT INTO J2_TBL VALUES (NULL, 0);
! --
! -- CORRELATION NAMES
! -- Make sure that table/column aliases are supported
! -- before diving into more complex join syntax.
! --
! SELECT '' AS "xxx", *
!   FROM J1_TBL AS tx;
!  xxx | i | j |   t   
! -----+---+---+-------
!      | 1 | 4 | one
!      | 2 | 3 | two
!      | 3 | 2 | three
!      | 4 | 1 | four
!      | 5 | 0 | five
!      | 6 | 6 | six
!      | 7 | 7 | seven
!      | 8 | 8 | eight
!      | 0 |   | zero
!      |   |   | null
!      |   | 0 | zero
! (11 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL tx;
!  xxx | i | j |   t   
! -----+---+---+-------
!      | 1 | 4 | one
!      | 2 | 3 | two
!      | 3 | 2 | three
!      | 4 | 1 | four
!      | 5 | 0 | five
!      | 6 | 6 | six
!      | 7 | 7 | seven
!      | 8 | 8 | eight
!      | 0 |   | zero
!      |   |   | null
!      |   | 0 | zero
! (11 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL AS t1 (a, b, c);
!  xxx | a | b |   c   
! -----+---+---+-------
!      | 1 | 4 | one
!      | 2 | 3 | two
!      | 3 | 2 | three
!      | 4 | 1 | four
!      | 5 | 0 | five
!      | 6 | 6 | six
!      | 7 | 7 | seven
!      | 8 | 8 | eight
!      | 0 |   | zero
!      |   |   | null
!      |   | 0 | zero
! (11 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL t1 (a, b, c);
!  xxx | a | b |   c   
! -----+---+---+-------
!      | 1 | 4 | one
!      | 2 | 3 | two
!      | 3 | 2 | three
!      | 4 | 1 | four
!      | 5 | 0 | five
!      | 6 | 6 | six
!      | 7 | 7 | seven
!      | 8 | 8 | eight
!      | 0 |   | zero
!      |   |   | null
!      |   | 0 | zero
! (11 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL t1 (a, b, c), J2_TBL t2 (d, e);
!  xxx | a | b |   c   | d | e  
! -----+---+---+-------+---+----
!      | 1 | 4 | one   | 1 | -1
!      | 2 | 3 | two   | 1 | -1
!      | 3 | 2 | three | 1 | -1
!      | 4 | 1 | four  | 1 | -1
!      | 5 | 0 | five  | 1 | -1
!      | 6 | 6 | six   | 1 | -1
!      | 7 | 7 | seven | 1 | -1
!      | 8 | 8 | eight | 1 | -1
!      | 0 |   | zero  | 1 | -1
!      |   |   | null  | 1 | -1
!      |   | 0 | zero  | 1 | -1
!      | 1 | 4 | one   | 2 |  2
!      | 2 | 3 | two   | 2 |  2
!      | 3 | 2 | three | 2 |  2
!      | 4 | 1 | four  | 2 |  2
!      | 5 | 0 | five  | 2 |  2
!      | 6 | 6 | six   | 2 |  2
!      | 7 | 7 | seven | 2 |  2
!      | 8 | 8 | eight | 2 |  2
!      | 0 |   | zero  | 2 |  2
!      |   |   | null  | 2 |  2
!      |   | 0 | zero  | 2 |  2
!      | 1 | 4 | one   | 3 | -3
!      | 2 | 3 | two   | 3 | -3
!      | 3 | 2 | three | 3 | -3
!      | 4 | 1 | four  | 3 | -3
!      | 5 | 0 | five  | 3 | -3
!      | 6 | 6 | six   | 3 | -3
!      | 7 | 7 | seven | 3 | -3
!      | 8 | 8 | eight | 3 | -3
!      | 0 |   | zero  | 3 | -3
!      |   |   | null  | 3 | -3
!      |   | 0 | zero  | 3 | -3
!      | 1 | 4 | one   | 2 |  4
!      | 2 | 3 | two   | 2 |  4
!      | 3 | 2 | three | 2 |  4
!      | 4 | 1 | four  | 2 |  4
!      | 5 | 0 | five  | 2 |  4
!      | 6 | 6 | six   | 2 |  4
!      | 7 | 7 | seven | 2 |  4
!      | 8 | 8 | eight | 2 |  4
!      | 0 |   | zero  | 2 |  4
!      |   |   | null  | 2 |  4
!      |   | 0 | zero  | 2 |  4
!      | 1 | 4 | one   | 5 | -5
!      | 2 | 3 | two   | 5 | -5
!      | 3 | 2 | three | 5 | -5
!      | 4 | 1 | four  | 5 | -5
!      | 5 | 0 | five  | 5 | -5
!      | 6 | 6 | six   | 5 | -5
!      | 7 | 7 | seven | 5 | -5
!      | 8 | 8 | eight | 5 | -5
!      | 0 |   | zero  | 5 | -5
!      |   |   | null  | 5 | -5
!      |   | 0 | zero  | 5 | -5
!      | 1 | 4 | one   | 5 | -5
!      | 2 | 3 | two   | 5 | -5
!      | 3 | 2 | three | 5 | -5
!      | 4 | 1 | four  | 5 | -5
!      | 5 | 0 | five  | 5 | -5
!      | 6 | 6 | six   | 5 | -5
!      | 7 | 7 | seven | 5 | -5
!      | 8 | 8 | eight | 5 | -5
!      | 0 |   | zero  | 5 | -5
!      |   |   | null  | 5 | -5
!      |   | 0 | zero  | 5 | -5
!      | 1 | 4 | one   | 0 |   
!      | 2 | 3 | two   | 0 |   
!      | 3 | 2 | three | 0 |   
!      | 4 | 1 | four  | 0 |   
!      | 5 | 0 | five  | 0 |   
!      | 6 | 6 | six   | 0 |   
!      | 7 | 7 | seven | 0 |   
!      | 8 | 8 | eight | 0 |   
!      | 0 |   | zero  | 0 |   
!      |   |   | null  | 0 |   
!      |   | 0 | zero  | 0 |   
!      | 1 | 4 | one   |   |   
!      | 2 | 3 | two   |   |   
!      | 3 | 2 | three |   |   
!      | 4 | 1 | four  |   |   
!      | 5 | 0 | five  |   |   
!      | 6 | 6 | six   |   |   
!      | 7 | 7 | seven |   |   
!      | 8 | 8 | eight |   |   
!      | 0 |   | zero  |   |   
!      |   |   | null  |   |   
!      |   | 0 | zero  |   |   
!      | 1 | 4 | one   |   |  0
!      | 2 | 3 | two   |   |  0
!      | 3 | 2 | three |   |  0
!      | 4 | 1 | four  |   |  0
!      | 5 | 0 | five  |   |  0
!      | 6 | 6 | six   |   |  0
!      | 7 | 7 | seven |   |  0
!      | 8 | 8 | eight |   |  0
!      | 0 |   | zero  |   |  0
!      |   |   | null  |   |  0
!      |   | 0 | zero  |   |  0
! (99 rows)
! 
! SELECT '' AS "xxx", t1.a, t2.e
!   FROM J1_TBL t1 (a, b, c), J2_TBL t2 (d, e)
!   WHERE t1.a = t2.d;
!  xxx | a | e  
! -----+---+----
!      | 0 |   
!      | 1 | -1
!      | 2 |  2
!      | 2 |  4
!      | 3 | -3
!      | 5 | -5
!      | 5 | -5
! (7 rows)
! 
! --
! -- CROSS JOIN
! -- Qualifications are not allowed on cross joins,
! -- which degenerate into a standard unqualified inner join.
! --
! SELECT '' AS "xxx", *
!   FROM J1_TBL CROSS JOIN J2_TBL;
!  xxx | i | j |   t   | i | k  
! -----+---+---+-------+---+----
!      | 1 | 4 | one   | 1 | -1
!      | 2 | 3 | two   | 1 | -1
!      | 3 | 2 | three | 1 | -1
!      | 4 | 1 | four  | 1 | -1
!      | 5 | 0 | five  | 1 | -1
!      | 6 | 6 | six   | 1 | -1
!      | 7 | 7 | seven | 1 | -1
!      | 8 | 8 | eight | 1 | -1
!      | 0 |   | zero  | 1 | -1
!      |   |   | null  | 1 | -1
!      |   | 0 | zero  | 1 | -1
!      | 1 | 4 | one   | 2 |  2
!      | 2 | 3 | two   | 2 |  2
!      | 3 | 2 | three | 2 |  2
!      | 4 | 1 | four  | 2 |  2
!      | 5 | 0 | five  | 2 |  2
!      | 6 | 6 | six   | 2 |  2
!      | 7 | 7 | seven | 2 |  2
!      | 8 | 8 | eight | 2 |  2
!      | 0 |   | zero  | 2 |  2
!      |   |   | null  | 2 |  2
!      |   | 0 | zero  | 2 |  2
!      | 1 | 4 | one   | 3 | -3
!      | 2 | 3 | two   | 3 | -3
!      | 3 | 2 | three | 3 | -3
!      | 4 | 1 | four  | 3 | -3
!      | 5 | 0 | five  | 3 | -3
!      | 6 | 6 | six   | 3 | -3
!      | 7 | 7 | seven | 3 | -3
!      | 8 | 8 | eight | 3 | -3
!      | 0 |   | zero  | 3 | -3
!      |   |   | null  | 3 | -3
!      |   | 0 | zero  | 3 | -3
!      | 1 | 4 | one   | 2 |  4
!      | 2 | 3 | two   | 2 |  4
!      | 3 | 2 | three | 2 |  4
!      | 4 | 1 | four  | 2 |  4
!      | 5 | 0 | five  | 2 |  4
!      | 6 | 6 | six   | 2 |  4
!      | 7 | 7 | seven | 2 |  4
!      | 8 | 8 | eight | 2 |  4
!      | 0 |   | zero  | 2 |  4
!      |   |   | null  | 2 |  4
!      |   | 0 | zero  | 2 |  4
!      | 1 | 4 | one   | 5 | -5
!      | 2 | 3 | two   | 5 | -5
!      | 3 | 2 | three | 5 | -5
!      | 4 | 1 | four  | 5 | -5
!      | 5 | 0 | five  | 5 | -5
!      | 6 | 6 | six   | 5 | -5
!      | 7 | 7 | seven | 5 | -5
!      | 8 | 8 | eight | 5 | -5
!      | 0 |   | zero  | 5 | -5
!      |   |   | null  | 5 | -5
!      |   | 0 | zero  | 5 | -5
!      | 1 | 4 | one   | 5 | -5
!      | 2 | 3 | two   | 5 | -5
!      | 3 | 2 | three | 5 | -5
!      | 4 | 1 | four  | 5 | -5
!      | 5 | 0 | five  | 5 | -5
!      | 6 | 6 | six   | 5 | -5
!      | 7 | 7 | seven | 5 | -5
!      | 8 | 8 | eight | 5 | -5
!      | 0 |   | zero  | 5 | -5
!      |   |   | null  | 5 | -5
!      |   | 0 | zero  | 5 | -5
!      | 1 | 4 | one   | 0 |   
!      | 2 | 3 | two   | 0 |   
!      | 3 | 2 | three | 0 |   
!      | 4 | 1 | four  | 0 |   
!      | 5 | 0 | five  | 0 |   
!      | 6 | 6 | six   | 0 |   
!      | 7 | 7 | seven | 0 |   
!      | 8 | 8 | eight | 0 |   
!      | 0 |   | zero  | 0 |   
!      |   |   | null  | 0 |   
!      |   | 0 | zero  | 0 |   
!      | 1 | 4 | one   |   |   
!      | 2 | 3 | two   |   |   
!      | 3 | 2 | three |   |   
!      | 4 | 1 | four  |   |   
!      | 5 | 0 | five  |   |   
!      | 6 | 6 | six   |   |   
!      | 7 | 7 | seven |   |   
!      | 8 | 8 | eight |   |   
!      | 0 |   | zero  |   |   
!      |   |   | null  |   |   
!      |   | 0 | zero  |   |   
!      | 1 | 4 | one   |   |  0
!      | 2 | 3 | two   |   |  0
!      | 3 | 2 | three |   |  0
!      | 4 | 1 | four  |   |  0
!      | 5 | 0 | five  |   |  0
!      | 6 | 6 | six   |   |  0
!      | 7 | 7 | seven |   |  0
!      | 8 | 8 | eight |   |  0
!      | 0 |   | zero  |   |  0
!      |   |   | null  |   |  0
!      |   | 0 | zero  |   |  0
! (99 rows)
! 
! -- ambiguous column
! SELECT '' AS "xxx", i, k, t
!   FROM J1_TBL CROSS JOIN J2_TBL;
! ERROR:  column reference "i" is ambiguous
! LINE 1: SELECT '' AS "xxx", i, k, t
!                             ^
! -- resolve previous ambiguity by specifying the table name
! SELECT '' AS "xxx", t1.i, k, t
!   FROM J1_TBL t1 CROSS JOIN J2_TBL t2;
!  xxx | i | k  |   t   
! -----+---+----+-------
!      | 1 | -1 | one
!      | 2 | -1 | two
!      | 3 | -1 | three
!      | 4 | -1 | four
!      | 5 | -1 | five
!      | 6 | -1 | six
!      | 7 | -1 | seven
!      | 8 | -1 | eight
!      | 0 | -1 | zero
!      |   | -1 | null
!      |   | -1 | zero
!      | 1 |  2 | one
!      | 2 |  2 | two
!      | 3 |  2 | three
!      | 4 |  2 | four
!      | 5 |  2 | five
!      | 6 |  2 | six
!      | 7 |  2 | seven
!      | 8 |  2 | eight
!      | 0 |  2 | zero
!      |   |  2 | null
!      |   |  2 | zero
!      | 1 | -3 | one
!      | 2 | -3 | two
!      | 3 | -3 | three
!      | 4 | -3 | four
!      | 5 | -3 | five
!      | 6 | -3 | six
!      | 7 | -3 | seven
!      | 8 | -3 | eight
!      | 0 | -3 | zero
!      |   | -3 | null
!      |   | -3 | zero
!      | 1 |  4 | one
!      | 2 |  4 | two
!      | 3 |  4 | three
!      | 4 |  4 | four
!      | 5 |  4 | five
!      | 6 |  4 | six
!      | 7 |  4 | seven
!      | 8 |  4 | eight
!      | 0 |  4 | zero
!      |   |  4 | null
!      |   |  4 | zero
!      | 1 | -5 | one
!      | 2 | -5 | two
!      | 3 | -5 | three
!      | 4 | -5 | four
!      | 5 | -5 | five
!      | 6 | -5 | six
!      | 7 | -5 | seven
!      | 8 | -5 | eight
!      | 0 | -5 | zero
!      |   | -5 | null
!      |   | -5 | zero
!      | 1 | -5 | one
!      | 2 | -5 | two
!      | 3 | -5 | three
!      | 4 | -5 | four
!      | 5 | -5 | five
!      | 6 | -5 | six
!      | 7 | -5 | seven
!      | 8 | -5 | eight
!      | 0 | -5 | zero
!      |   | -5 | null
!      |   | -5 | zero
!      | 1 |    | one
!      | 2 |    | two
!      | 3 |    | three
!      | 4 |    | four
!      | 5 |    | five
!      | 6 |    | six
!      | 7 |    | seven
!      | 8 |    | eight
!      | 0 |    | zero
!      |   |    | null
!      |   |    | zero
!      | 1 |    | one
!      | 2 |    | two
!      | 3 |    | three
!      | 4 |    | four
!      | 5 |    | five
!      | 6 |    | six
!      | 7 |    | seven
!      | 8 |    | eight
!      | 0 |    | zero
!      |   |    | null
!      |   |    | zero
!      | 1 |  0 | one
!      | 2 |  0 | two
!      | 3 |  0 | three
!      | 4 |  0 | four
!      | 5 |  0 | five
!      | 6 |  0 | six
!      | 7 |  0 | seven
!      | 8 |  0 | eight
!      | 0 |  0 | zero
!      |   |  0 | null
!      |   |  0 | zero
! (99 rows)
! 
! SELECT '' AS "xxx", ii, tt, kk
!   FROM (J1_TBL CROSS JOIN J2_TBL)
!     AS tx (ii, jj, tt, ii2, kk);
!  xxx | ii |  tt   | kk 
! -----+----+-------+----
!      |  1 | one   | -1
!      |  2 | two   | -1
!      |  3 | three | -1
!      |  4 | four  | -1
!      |  5 | five  | -1
!      |  6 | six   | -1
!      |  7 | seven | -1
!      |  8 | eight | -1
!      |  0 | zero  | -1
!      |    | null  | -1
!      |    | zero  | -1
!      |  1 | one   |  2
!      |  2 | two   |  2
!      |  3 | three |  2
!      |  4 | four  |  2
!      |  5 | five  |  2
!      |  6 | six   |  2
!      |  7 | seven |  2
!      |  8 | eight |  2
!      |  0 | zero  |  2
!      |    | null  |  2
!      |    | zero  |  2
!      |  1 | one   | -3
!      |  2 | two   | -3
!      |  3 | three | -3
!      |  4 | four  | -3
!      |  5 | five  | -3
!      |  6 | six   | -3
!      |  7 | seven | -3
!      |  8 | eight | -3
!      |  0 | zero  | -3
!      |    | null  | -3
!      |    | zero  | -3
!      |  1 | one   |  4
!      |  2 | two   |  4
!      |  3 | three |  4
!      |  4 | four  |  4
!      |  5 | five  |  4
!      |  6 | six   |  4
!      |  7 | seven |  4
!      |  8 | eight |  4
!      |  0 | zero  |  4
!      |    | null  |  4
!      |    | zero  |  4
!      |  1 | one   | -5
!      |  2 | two   | -5
!      |  3 | three | -5
!      |  4 | four  | -5
!      |  5 | five  | -5
!      |  6 | six   | -5
!      |  7 | seven | -5
!      |  8 | eight | -5
!      |  0 | zero  | -5
!      |    | null  | -5
!      |    | zero  | -5
!      |  1 | one   | -5
!      |  2 | two   | -5
!      |  3 | three | -5
!      |  4 | four  | -5
!      |  5 | five  | -5
!      |  6 | six   | -5
!      |  7 | seven | -5
!      |  8 | eight | -5
!      |  0 | zero  | -5
!      |    | null  | -5
!      |    | zero  | -5
!      |  1 | one   |   
!      |  2 | two   |   
!      |  3 | three |   
!      |  4 | four  |   
!      |  5 | five  |   
!      |  6 | six   |   
!      |  7 | seven |   
!      |  8 | eight |   
!      |  0 | zero  |   
!      |    | null  |   
!      |    | zero  |   
!      |  1 | one   |   
!      |  2 | two   |   
!      |  3 | three |   
!      |  4 | four  |   
!      |  5 | five  |   
!      |  6 | six   |   
!      |  7 | seven |   
!      |  8 | eight |   
!      |  0 | zero  |   
!      |    | null  |   
!      |    | zero  |   
!      |  1 | one   |  0
!      |  2 | two   |  0
!      |  3 | three |  0
!      |  4 | four  |  0
!      |  5 | five  |  0
!      |  6 | six   |  0
!      |  7 | seven |  0
!      |  8 | eight |  0
!      |  0 | zero  |  0
!      |    | null  |  0
!      |    | zero  |  0
! (99 rows)
! 
! SELECT '' AS "xxx", tx.ii, tx.jj, tx.kk
!   FROM (J1_TBL t1 (a, b, c) CROSS JOIN J2_TBL t2 (d, e))
!     AS tx (ii, jj, tt, ii2, kk);
!  xxx | ii | jj | kk 
! -----+----+----+----
!      |  1 |  4 | -1
!      |  2 |  3 | -1
!      |  3 |  2 | -1
!      |  4 |  1 | -1
!      |  5 |  0 | -1
!      |  6 |  6 | -1
!      |  7 |  7 | -1
!      |  8 |  8 | -1
!      |  0 |    | -1
!      |    |    | -1
!      |    |  0 | -1
!      |  1 |  4 |  2
!      |  2 |  3 |  2
!      |  3 |  2 |  2
!      |  4 |  1 |  2
!      |  5 |  0 |  2
!      |  6 |  6 |  2
!      |  7 |  7 |  2
!      |  8 |  8 |  2
!      |  0 |    |  2
!      |    |    |  2
!      |    |  0 |  2
!      |  1 |  4 | -3
!      |  2 |  3 | -3
!      |  3 |  2 | -3
!      |  4 |  1 | -3
!      |  5 |  0 | -3
!      |  6 |  6 | -3
!      |  7 |  7 | -3
!      |  8 |  8 | -3
!      |  0 |    | -3
!      |    |    | -3
!      |    |  0 | -3
!      |  1 |  4 |  4
!      |  2 |  3 |  4
!      |  3 |  2 |  4
!      |  4 |  1 |  4
!      |  5 |  0 |  4
!      |  6 |  6 |  4
!      |  7 |  7 |  4
!      |  8 |  8 |  4
!      |  0 |    |  4
!      |    |    |  4
!      |    |  0 |  4
!      |  1 |  4 | -5
!      |  2 |  3 | -5
!      |  3 |  2 | -5
!      |  4 |  1 | -5
!      |  5 |  0 | -5
!      |  6 |  6 | -5
!      |  7 |  7 | -5
!      |  8 |  8 | -5
!      |  0 |    | -5
!      |    |    | -5
!      |    |  0 | -5
!      |  1 |  4 | -5
!      |  2 |  3 | -5
!      |  3 |  2 | -5
!      |  4 |  1 | -5
!      |  5 |  0 | -5
!      |  6 |  6 | -5
!      |  7 |  7 | -5
!      |  8 |  8 | -5
!      |  0 |    | -5
!      |    |    | -5
!      |    |  0 | -5
!      |  1 |  4 |   
!      |  2 |  3 |   
!      |  3 |  2 |   
!      |  4 |  1 |   
!      |  5 |  0 |   
!      |  6 |  6 |   
!      |  7 |  7 |   
!      |  8 |  8 |   
!      |  0 |    |   
!      |    |    |   
!      |    |  0 |   
!      |  1 |  4 |   
!      |  2 |  3 |   
!      |  3 |  2 |   
!      |  4 |  1 |   
!      |  5 |  0 |   
!      |  6 |  6 |   
!      |  7 |  7 |   
!      |  8 |  8 |   
!      |  0 |    |   
!      |    |    |   
!      |    |  0 |   
!      |  1 |  4 |  0
!      |  2 |  3 |  0
!      |  3 |  2 |  0
!      |  4 |  1 |  0
!      |  5 |  0 |  0
!      |  6 |  6 |  0
!      |  7 |  7 |  0
!      |  8 |  8 |  0
!      |  0 |    |  0
!      |    |    |  0
!      |    |  0 |  0
! (99 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL CROSS JOIN J2_TBL a CROSS JOIN J2_TBL b;
!  xxx | i | j |   t   | i | k  | i | k  
! -----+---+---+-------+---+----+---+----
!      | 1 | 4 | one   | 1 | -1 | 1 | -1
!      | 1 | 4 | one   | 1 | -1 | 2 |  2
!      | 1 | 4 | one   | 1 | -1 | 3 | -3
!      | 1 | 4 | one   | 1 | -1 | 2 |  4
!      | 1 | 4 | one   | 1 | -1 | 5 | -5
!      | 1 | 4 | one   | 1 | -1 | 5 | -5
!      | 1 | 4 | one   | 1 | -1 | 0 |   
!      | 1 | 4 | one   | 1 | -1 |   |   
!      | 1 | 4 | one   | 1 | -1 |   |  0
!      | 2 | 3 | two   | 1 | -1 | 1 | -1
!      | 2 | 3 | two   | 1 | -1 | 2 |  2
!      | 2 | 3 | two   | 1 | -1 | 3 | -3
!      | 2 | 3 | two   | 1 | -1 | 2 |  4
!      | 2 | 3 | two   | 1 | -1 | 5 | -5
!      | 2 | 3 | two   | 1 | -1 | 5 | -5
!      | 2 | 3 | two   | 1 | -1 | 0 |   
!      | 2 | 3 | two   | 1 | -1 |   |   
!      | 2 | 3 | two   | 1 | -1 |   |  0
!      | 3 | 2 | three | 1 | -1 | 1 | -1
!      | 3 | 2 | three | 1 | -1 | 2 |  2
!      | 3 | 2 | three | 1 | -1 | 3 | -3
!      | 3 | 2 | three | 1 | -1 | 2 |  4
!      | 3 | 2 | three | 1 | -1 | 5 | -5
!      | 3 | 2 | three | 1 | -1 | 5 | -5
!      | 3 | 2 | three | 1 | -1 | 0 |   
!      | 3 | 2 | three | 1 | -1 |   |   
!      | 3 | 2 | three | 1 | -1 |   |  0
!      | 4 | 1 | four  | 1 | -1 | 1 | -1
!      | 4 | 1 | four  | 1 | -1 | 2 |  2
!      | 4 | 1 | four  | 1 | -1 | 3 | -3
!      | 4 | 1 | four  | 1 | -1 | 2 |  4
!      | 4 | 1 | four  | 1 | -1 | 5 | -5
!      | 4 | 1 | four  | 1 | -1 | 5 | -5
!      | 4 | 1 | four  | 1 | -1 | 0 |   
!      | 4 | 1 | four  | 1 | -1 |   |   
!      | 4 | 1 | four  | 1 | -1 |   |  0
!      | 5 | 0 | five  | 1 | -1 | 1 | -1
!      | 5 | 0 | five  | 1 | -1 | 2 |  2
!      | 5 | 0 | five  | 1 | -1 | 3 | -3
!      | 5 | 0 | five  | 1 | -1 | 2 |  4
!      | 5 | 0 | five  | 1 | -1 | 5 | -5
!      | 5 | 0 | five  | 1 | -1 | 5 | -5
!      | 5 | 0 | five  | 1 | -1 | 0 |   
!      | 5 | 0 | five  | 1 | -1 |   |   
!      | 5 | 0 | five  | 1 | -1 |   |  0
!      | 6 | 6 | six   | 1 | -1 | 1 | -1
!      | 6 | 6 | six   | 1 | -1 | 2 |  2
!      | 6 | 6 | six   | 1 | -1 | 3 | -3
!      | 6 | 6 | six   | 1 | -1 | 2 |  4
!      | 6 | 6 | six   | 1 | -1 | 5 | -5
!      | 6 | 6 | six   | 1 | -1 | 5 | -5
!      | 6 | 6 | six   | 1 | -1 | 0 |   
!      | 6 | 6 | six   | 1 | -1 |   |   
!      | 6 | 6 | six   | 1 | -1 |   |  0
!      | 7 | 7 | seven | 1 | -1 | 1 | -1
!      | 7 | 7 | seven | 1 | -1 | 2 |  2
!      | 7 | 7 | seven | 1 | -1 | 3 | -3
!      | 7 | 7 | seven | 1 | -1 | 2 |  4
!      | 7 | 7 | seven | 1 | -1 | 5 | -5
!      | 7 | 7 | seven | 1 | -1 | 5 | -5
!      | 7 | 7 | seven | 1 | -1 | 0 |   
!      | 7 | 7 | seven | 1 | -1 |   |   
!      | 7 | 7 | seven | 1 | -1 |   |  0
!      | 8 | 8 | eight | 1 | -1 | 1 | -1
!      | 8 | 8 | eight | 1 | -1 | 2 |  2
!      | 8 | 8 | eight | 1 | -1 | 3 | -3
!      | 8 | 8 | eight | 1 | -1 | 2 |  4
!      | 8 | 8 | eight | 1 | -1 | 5 | -5
!      | 8 | 8 | eight | 1 | -1 | 5 | -5
!      | 8 | 8 | eight | 1 | -1 | 0 |   
!      | 8 | 8 | eight | 1 | -1 |   |   
!      | 8 | 8 | eight | 1 | -1 |   |  0
!      | 0 |   | zero  | 1 | -1 | 1 | -1
!      | 0 |   | zero  | 1 | -1 | 2 |  2
!      | 0 |   | zero  | 1 | -1 | 3 | -3
!      | 0 |   | zero  | 1 | -1 | 2 |  4
!      | 0 |   | zero  | 1 | -1 | 5 | -5
!      | 0 |   | zero  | 1 | -1 | 5 | -5
!      | 0 |   | zero  | 1 | -1 | 0 |   
!      | 0 |   | zero  | 1 | -1 |   |   
!      | 0 |   | zero  | 1 | -1 |   |  0
!      |   |   | null  | 1 | -1 | 1 | -1
!      |   |   | null  | 1 | -1 | 2 |  2
!      |   |   | null  | 1 | -1 | 3 | -3
!      |   |   | null  | 1 | -1 | 2 |  4
!      |   |   | null  | 1 | -1 | 5 | -5
!      |   |   | null  | 1 | -1 | 5 | -5
!      |   |   | null  | 1 | -1 | 0 |   
!      |   |   | null  | 1 | -1 |   |   
!      |   |   | null  | 1 | -1 |   |  0
!      |   | 0 | zero  | 1 | -1 | 1 | -1
!      |   | 0 | zero  | 1 | -1 | 2 |  2
!      |   | 0 | zero  | 1 | -1 | 3 | -3
!      |   | 0 | zero  | 1 | -1 | 2 |  4
!      |   | 0 | zero  | 1 | -1 | 5 | -5
!      |   | 0 | zero  | 1 | -1 | 5 | -5
!      |   | 0 | zero  | 1 | -1 | 0 |   
!      |   | 0 | zero  | 1 | -1 |   |   
!      |   | 0 | zero  | 1 | -1 |   |  0
!      | 1 | 4 | one   | 2 |  2 | 1 | -1
!      | 1 | 4 | one   | 2 |  2 | 2 |  2
!      | 1 | 4 | one   | 2 |  2 | 3 | -3
!      | 1 | 4 | one   | 2 |  2 | 2 |  4
!      | 1 | 4 | one   | 2 |  2 | 5 | -5
!      | 1 | 4 | one   | 2 |  2 | 5 | -5
!      | 1 | 4 | one   | 2 |  2 | 0 |   
!      | 1 | 4 | one   | 2 |  2 |   |   
!      | 1 | 4 | one   | 2 |  2 |   |  0
!      | 2 | 3 | two   | 2 |  2 | 1 | -1
!      | 2 | 3 | two   | 2 |  2 | 2 |  2
!      | 2 | 3 | two   | 2 |  2 | 3 | -3
!      | 2 | 3 | two   | 2 |  2 | 2 |  4
!      | 2 | 3 | two   | 2 |  2 | 5 | -5
!      | 2 | 3 | two   | 2 |  2 | 5 | -5
!      | 2 | 3 | two   | 2 |  2 | 0 |   
!      | 2 | 3 | two   | 2 |  2 |   |   
!      | 2 | 3 | two   | 2 |  2 |   |  0
!      | 3 | 2 | three | 2 |  2 | 1 | -1
!      | 3 | 2 | three | 2 |  2 | 2 |  2
!      | 3 | 2 | three | 2 |  2 | 3 | -3
!      | 3 | 2 | three | 2 |  2 | 2 |  4
!      | 3 | 2 | three | 2 |  2 | 5 | -5
!      | 3 | 2 | three | 2 |  2 | 5 | -5
!      | 3 | 2 | three | 2 |  2 | 0 |   
!      | 3 | 2 | three | 2 |  2 |   |   
!      | 3 | 2 | three | 2 |  2 |   |  0
!      | 4 | 1 | four  | 2 |  2 | 1 | -1
!      | 4 | 1 | four  | 2 |  2 | 2 |  2
!      | 4 | 1 | four  | 2 |  2 | 3 | -3
!      | 4 | 1 | four  | 2 |  2 | 2 |  4
!      | 4 | 1 | four  | 2 |  2 | 5 | -5
!      | 4 | 1 | four  | 2 |  2 | 5 | -5
!      | 4 | 1 | four  | 2 |  2 | 0 |   
!      | 4 | 1 | four  | 2 |  2 |   |   
!      | 4 | 1 | four  | 2 |  2 |   |  0
!      | 5 | 0 | five  | 2 |  2 | 1 | -1
!      | 5 | 0 | five  | 2 |  2 | 2 |  2
!      | 5 | 0 | five  | 2 |  2 | 3 | -3
!      | 5 | 0 | five  | 2 |  2 | 2 |  4
!      | 5 | 0 | five  | 2 |  2 | 5 | -5
!      | 5 | 0 | five  | 2 |  2 | 5 | -5
!      | 5 | 0 | five  | 2 |  2 | 0 |   
!      | 5 | 0 | five  | 2 |  2 |   |   
!      | 5 | 0 | five  | 2 |  2 |   |  0
!      | 6 | 6 | six   | 2 |  2 | 1 | -1
!      | 6 | 6 | six   | 2 |  2 | 2 |  2
!      | 6 | 6 | six   | 2 |  2 | 3 | -3
!      | 6 | 6 | six   | 2 |  2 | 2 |  4
!      | 6 | 6 | six   | 2 |  2 | 5 | -5
!      | 6 | 6 | six   | 2 |  2 | 5 | -5
!      | 6 | 6 | six   | 2 |  2 | 0 |   
!      | 6 | 6 | six   | 2 |  2 |   |   
!      | 6 | 6 | six   | 2 |  2 |   |  0
!      | 7 | 7 | seven | 2 |  2 | 1 | -1
!      | 7 | 7 | seven | 2 |  2 | 2 |  2
!      | 7 | 7 | seven | 2 |  2 | 3 | -3
!      | 7 | 7 | seven | 2 |  2 | 2 |  4
!      | 7 | 7 | seven | 2 |  2 | 5 | -5
!      | 7 | 7 | seven | 2 |  2 | 5 | -5
!      | 7 | 7 | seven | 2 |  2 | 0 |   
!      | 7 | 7 | seven | 2 |  2 |   |   
!      | 7 | 7 | seven | 2 |  2 |   |  0
!      | 8 | 8 | eight | 2 |  2 | 1 | -1
!      | 8 | 8 | eight | 2 |  2 | 2 |  2
!      | 8 | 8 | eight | 2 |  2 | 3 | -3
!      | 8 | 8 | eight | 2 |  2 | 2 |  4
!      | 8 | 8 | eight | 2 |  2 | 5 | -5
!      | 8 | 8 | eight | 2 |  2 | 5 | -5
!      | 8 | 8 | eight | 2 |  2 | 0 |   
!      | 8 | 8 | eight | 2 |  2 |   |   
!      | 8 | 8 | eight | 2 |  2 |   |  0
!      | 0 |   | zero  | 2 |  2 | 1 | -1
!      | 0 |   | zero  | 2 |  2 | 2 |  2
!      | 0 |   | zero  | 2 |  2 | 3 | -3
!      | 0 |   | zero  | 2 |  2 | 2 |  4
!      | 0 |   | zero  | 2 |  2 | 5 | -5
!      | 0 |   | zero  | 2 |  2 | 5 | -5
!      | 0 |   | zero  | 2 |  2 | 0 |   
!      | 0 |   | zero  | 2 |  2 |   |   
!      | 0 |   | zero  | 2 |  2 |   |  0
!      |   |   | null  | 2 |  2 | 1 | -1
!      |   |   | null  | 2 |  2 | 2 |  2
!      |   |   | null  | 2 |  2 | 3 | -3
!      |   |   | null  | 2 |  2 | 2 |  4
!      |   |   | null  | 2 |  2 | 5 | -5
!      |   |   | null  | 2 |  2 | 5 | -5
!      |   |   | null  | 2 |  2 | 0 |   
!      |   |   | null  | 2 |  2 |   |   
!      |   |   | null  | 2 |  2 |   |  0
!      |   | 0 | zero  | 2 |  2 | 1 | -1
!      |   | 0 | zero  | 2 |  2 | 2 |  2
!      |   | 0 | zero  | 2 |  2 | 3 | -3
!      |   | 0 | zero  | 2 |  2 | 2 |  4
!      |   | 0 | zero  | 2 |  2 | 5 | -5
!      |   | 0 | zero  | 2 |  2 | 5 | -5
!      |   | 0 | zero  | 2 |  2 | 0 |   
!      |   | 0 | zero  | 2 |  2 |   |   
!      |   | 0 | zero  | 2 |  2 |   |  0
!      | 1 | 4 | one   | 3 | -3 | 1 | -1
!      | 1 | 4 | one   | 3 | -3 | 2 |  2
!      | 1 | 4 | one   | 3 | -3 | 3 | -3
!      | 1 | 4 | one   | 3 | -3 | 2 |  4
!      | 1 | 4 | one   | 3 | -3 | 5 | -5
!      | 1 | 4 | one   | 3 | -3 | 5 | -5
!      | 1 | 4 | one   | 3 | -3 | 0 |   
!      | 1 | 4 | one   | 3 | -3 |   |   
!      | 1 | 4 | one   | 3 | -3 |   |  0
!      | 2 | 3 | two   | 3 | -3 | 1 | -1
!      | 2 | 3 | two   | 3 | -3 | 2 |  2
!      | 2 | 3 | two   | 3 | -3 | 3 | -3
!      | 2 | 3 | two   | 3 | -3 | 2 |  4
!      | 2 | 3 | two   | 3 | -3 | 5 | -5
!      | 2 | 3 | two   | 3 | -3 | 5 | -5
!      | 2 | 3 | two   | 3 | -3 | 0 |   
!      | 2 | 3 | two   | 3 | -3 |   |   
!      | 2 | 3 | two   | 3 | -3 |   |  0
!      | 3 | 2 | three | 3 | -3 | 1 | -1
!      | 3 | 2 | three | 3 | -3 | 2 |  2
!      | 3 | 2 | three | 3 | -3 | 3 | -3
!      | 3 | 2 | three | 3 | -3 | 2 |  4
!      | 3 | 2 | three | 3 | -3 | 5 | -5
!      | 3 | 2 | three | 3 | -3 | 5 | -5
!      | 3 | 2 | three | 3 | -3 | 0 |   
!      | 3 | 2 | three | 3 | -3 |   |   
!      | 3 | 2 | three | 3 | -3 |   |  0
!      | 4 | 1 | four  | 3 | -3 | 1 | -1
!      | 4 | 1 | four  | 3 | -3 | 2 |  2
!      | 4 | 1 | four  | 3 | -3 | 3 | -3
!      | 4 | 1 | four  | 3 | -3 | 2 |  4
!      | 4 | 1 | four  | 3 | -3 | 5 | -5
!      | 4 | 1 | four  | 3 | -3 | 5 | -5
!      | 4 | 1 | four  | 3 | -3 | 0 |   
!      | 4 | 1 | four  | 3 | -3 |   |   
!      | 4 | 1 | four  | 3 | -3 |   |  0
!      | 5 | 0 | five  | 3 | -3 | 1 | -1
!      | 5 | 0 | five  | 3 | -3 | 2 |  2
!      | 5 | 0 | five  | 3 | -3 | 3 | -3
!      | 5 | 0 | five  | 3 | -3 | 2 |  4
!      | 5 | 0 | five  | 3 | -3 | 5 | -5
!      | 5 | 0 | five  | 3 | -3 | 5 | -5
!      | 5 | 0 | five  | 3 | -3 | 0 |   
!      | 5 | 0 | five  | 3 | -3 |   |   
!      | 5 | 0 | five  | 3 | -3 |   |  0
!      | 6 | 6 | six   | 3 | -3 | 1 | -1
!      | 6 | 6 | six   | 3 | -3 | 2 |  2
!      | 6 | 6 | six   | 3 | -3 | 3 | -3
!      | 6 | 6 | six   | 3 | -3 | 2 |  4
!      | 6 | 6 | six   | 3 | -3 | 5 | -5
!      | 6 | 6 | six   | 3 | -3 | 5 | -5
!      | 6 | 6 | six   | 3 | -3 | 0 |   
!      | 6 | 6 | six   | 3 | -3 |   |   
!      | 6 | 6 | six   | 3 | -3 |   |  0
!      | 7 | 7 | seven | 3 | -3 | 1 | -1
!      | 7 | 7 | seven | 3 | -3 | 2 |  2
!      | 7 | 7 | seven | 3 | -3 | 3 | -3
!      | 7 | 7 | seven | 3 | -3 | 2 |  4
!      | 7 | 7 | seven | 3 | -3 | 5 | -5
!      | 7 | 7 | seven | 3 | -3 | 5 | -5
!      | 7 | 7 | seven | 3 | -3 | 0 |   
!      | 7 | 7 | seven | 3 | -3 |   |   
!      | 7 | 7 | seven | 3 | -3 |   |  0
!      | 8 | 8 | eight | 3 | -3 | 1 | -1
!      | 8 | 8 | eight | 3 | -3 | 2 |  2
!      | 8 | 8 | eight | 3 | -3 | 3 | -3
!      | 8 | 8 | eight | 3 | -3 | 2 |  4
!      | 8 | 8 | eight | 3 | -3 | 5 | -5
!      | 8 | 8 | eight | 3 | -3 | 5 | -5
!      | 8 | 8 | eight | 3 | -3 | 0 |   
!      | 8 | 8 | eight | 3 | -3 |   |   
!      | 8 | 8 | eight | 3 | -3 |   |  0
!      | 0 |   | zero  | 3 | -3 | 1 | -1
!      | 0 |   | zero  | 3 | -3 | 2 |  2
!      | 0 |   | zero  | 3 | -3 | 3 | -3
!      | 0 |   | zero  | 3 | -3 | 2 |  4
!      | 0 |   | zero  | 3 | -3 | 5 | -5
!      | 0 |   | zero  | 3 | -3 | 5 | -5
!      | 0 |   | zero  | 3 | -3 | 0 |   
!      | 0 |   | zero  | 3 | -3 |   |   
!      | 0 |   | zero  | 3 | -3 |   |  0
!      |   |   | null  | 3 | -3 | 1 | -1
!      |   |   | null  | 3 | -3 | 2 |  2
!      |   |   | null  | 3 | -3 | 3 | -3
!      |   |   | null  | 3 | -3 | 2 |  4
!      |   |   | null  | 3 | -3 | 5 | -5
!      |   |   | null  | 3 | -3 | 5 | -5
!      |   |   | null  | 3 | -3 | 0 |   
!      |   |   | null  | 3 | -3 |   |   
!      |   |   | null  | 3 | -3 |   |  0
!      |   | 0 | zero  | 3 | -3 | 1 | -1
!      |   | 0 | zero  | 3 | -3 | 2 |  2
!      |   | 0 | zero  | 3 | -3 | 3 | -3
!      |   | 0 | zero  | 3 | -3 | 2 |  4
!      |   | 0 | zero  | 3 | -3 | 5 | -5
!      |   | 0 | zero  | 3 | -3 | 5 | -5
!      |   | 0 | zero  | 3 | -3 | 0 |   
!      |   | 0 | zero  | 3 | -3 |   |   
!      |   | 0 | zero  | 3 | -3 |   |  0
!      | 1 | 4 | one   | 2 |  4 | 1 | -1
!      | 1 | 4 | one   | 2 |  4 | 2 |  2
!      | 1 | 4 | one   | 2 |  4 | 3 | -3
!      | 1 | 4 | one   | 2 |  4 | 2 |  4
!      | 1 | 4 | one   | 2 |  4 | 5 | -5
!      | 1 | 4 | one   | 2 |  4 | 5 | -5
!      | 1 | 4 | one   | 2 |  4 | 0 |   
!      | 1 | 4 | one   | 2 |  4 |   |   
!      | 1 | 4 | one   | 2 |  4 |   |  0
!      | 2 | 3 | two   | 2 |  4 | 1 | -1
!      | 2 | 3 | two   | 2 |  4 | 2 |  2
!      | 2 | 3 | two   | 2 |  4 | 3 | -3
!      | 2 | 3 | two   | 2 |  4 | 2 |  4
!      | 2 | 3 | two   | 2 |  4 | 5 | -5
!      | 2 | 3 | two   | 2 |  4 | 5 | -5
!      | 2 | 3 | two   | 2 |  4 | 0 |   
!      | 2 | 3 | two   | 2 |  4 |   |   
!      | 2 | 3 | two   | 2 |  4 |   |  0
!      | 3 | 2 | three | 2 |  4 | 1 | -1
!      | 3 | 2 | three | 2 |  4 | 2 |  2
!      | 3 | 2 | three | 2 |  4 | 3 | -3
!      | 3 | 2 | three | 2 |  4 | 2 |  4
!      | 3 | 2 | three | 2 |  4 | 5 | -5
!      | 3 | 2 | three | 2 |  4 | 5 | -5
!      | 3 | 2 | three | 2 |  4 | 0 |   
!      | 3 | 2 | three | 2 |  4 |   |   
!      | 3 | 2 | three | 2 |  4 |   |  0
!      | 4 | 1 | four  | 2 |  4 | 1 | -1
!      | 4 | 1 | four  | 2 |  4 | 2 |  2
!      | 4 | 1 | four  | 2 |  4 | 3 | -3
!      | 4 | 1 | four  | 2 |  4 | 2 |  4
!      | 4 | 1 | four  | 2 |  4 | 5 | -5
!      | 4 | 1 | four  | 2 |  4 | 5 | -5
!      | 4 | 1 | four  | 2 |  4 | 0 |   
!      | 4 | 1 | four  | 2 |  4 |   |   
!      | 4 | 1 | four  | 2 |  4 |   |  0
!      | 5 | 0 | five  | 2 |  4 | 1 | -1
!      | 5 | 0 | five  | 2 |  4 | 2 |  2
!      | 5 | 0 | five  | 2 |  4 | 3 | -3
!      | 5 | 0 | five  | 2 |  4 | 2 |  4
!      | 5 | 0 | five  | 2 |  4 | 5 | -5
!      | 5 | 0 | five  | 2 |  4 | 5 | -5
!      | 5 | 0 | five  | 2 |  4 | 0 |   
!      | 5 | 0 | five  | 2 |  4 |   |   
!      | 5 | 0 | five  | 2 |  4 |   |  0
!      | 6 | 6 | six   | 2 |  4 | 1 | -1
!      | 6 | 6 | six   | 2 |  4 | 2 |  2
!      | 6 | 6 | six   | 2 |  4 | 3 | -3
!      | 6 | 6 | six   | 2 |  4 | 2 |  4
!      | 6 | 6 | six   | 2 |  4 | 5 | -5
!      | 6 | 6 | six   | 2 |  4 | 5 | -5
!      | 6 | 6 | six   | 2 |  4 | 0 |   
!      | 6 | 6 | six   | 2 |  4 |   |   
!      | 6 | 6 | six   | 2 |  4 |   |  0
!      | 7 | 7 | seven | 2 |  4 | 1 | -1
!      | 7 | 7 | seven | 2 |  4 | 2 |  2
!      | 7 | 7 | seven | 2 |  4 | 3 | -3
!      | 7 | 7 | seven | 2 |  4 | 2 |  4
!      | 7 | 7 | seven | 2 |  4 | 5 | -5
!      | 7 | 7 | seven | 2 |  4 | 5 | -5
!      | 7 | 7 | seven | 2 |  4 | 0 |   
!      | 7 | 7 | seven | 2 |  4 |   |   
!      | 7 | 7 | seven | 2 |  4 |   |  0
!      | 8 | 8 | eight | 2 |  4 | 1 | -1
!      | 8 | 8 | eight | 2 |  4 | 2 |  2
!      | 8 | 8 | eight | 2 |  4 | 3 | -3
!      | 8 | 8 | eight | 2 |  4 | 2 |  4
!      | 8 | 8 | eight | 2 |  4 | 5 | -5
!      | 8 | 8 | eight | 2 |  4 | 5 | -5
!      | 8 | 8 | eight | 2 |  4 | 0 |   
!      | 8 | 8 | eight | 2 |  4 |   |   
!      | 8 | 8 | eight | 2 |  4 |   |  0
!      | 0 |   | zero  | 2 |  4 | 1 | -1
!      | 0 |   | zero  | 2 |  4 | 2 |  2
!      | 0 |   | zero  | 2 |  4 | 3 | -3
!      | 0 |   | zero  | 2 |  4 | 2 |  4
!      | 0 |   | zero  | 2 |  4 | 5 | -5
!      | 0 |   | zero  | 2 |  4 | 5 | -5
!      | 0 |   | zero  | 2 |  4 | 0 |   
!      | 0 |   | zero  | 2 |  4 |   |   
!      | 0 |   | zero  | 2 |  4 |   |  0
!      |   |   | null  | 2 |  4 | 1 | -1
!      |   |   | null  | 2 |  4 | 2 |  2
!      |   |   | null  | 2 |  4 | 3 | -3
!      |   |   | null  | 2 |  4 | 2 |  4
!      |   |   | null  | 2 |  4 | 5 | -5
!      |   |   | null  | 2 |  4 | 5 | -5
!      |   |   | null  | 2 |  4 | 0 |   
!      |   |   | null  | 2 |  4 |   |   
!      |   |   | null  | 2 |  4 |   |  0
!      |   | 0 | zero  | 2 |  4 | 1 | -1
!      |   | 0 | zero  | 2 |  4 | 2 |  2
!      |   | 0 | zero  | 2 |  4 | 3 | -3
!      |   | 0 | zero  | 2 |  4 | 2 |  4
!      |   | 0 | zero  | 2 |  4 | 5 | -5
!      |   | 0 | zero  | 2 |  4 | 5 | -5
!      |   | 0 | zero  | 2 |  4 | 0 |   
!      |   | 0 | zero  | 2 |  4 |   |   
!      |   | 0 | zero  | 2 |  4 |   |  0
!      | 1 | 4 | one   | 5 | -5 | 1 | -1
!      | 1 | 4 | one   | 5 | -5 | 2 |  2
!      | 1 | 4 | one   | 5 | -5 | 3 | -3
!      | 1 | 4 | one   | 5 | -5 | 2 |  4
!      | 1 | 4 | one   | 5 | -5 | 5 | -5
!      | 1 | 4 | one   | 5 | -5 | 5 | -5
!      | 1 | 4 | one   | 5 | -5 | 0 |   
!      | 1 | 4 | one   | 5 | -5 |   |   
!      | 1 | 4 | one   | 5 | -5 |   |  0
!      | 2 | 3 | two   | 5 | -5 | 1 | -1
!      | 2 | 3 | two   | 5 | -5 | 2 |  2
!      | 2 | 3 | two   | 5 | -5 | 3 | -3
!      | 2 | 3 | two   | 5 | -5 | 2 |  4
!      | 2 | 3 | two   | 5 | -5 | 5 | -5
!      | 2 | 3 | two   | 5 | -5 | 5 | -5
!      | 2 | 3 | two   | 5 | -5 | 0 |   
!      | 2 | 3 | two   | 5 | -5 |   |   
!      | 2 | 3 | two   | 5 | -5 |   |  0
!      | 3 | 2 | three | 5 | -5 | 1 | -1
!      | 3 | 2 | three | 5 | -5 | 2 |  2
!      | 3 | 2 | three | 5 | -5 | 3 | -3
!      | 3 | 2 | three | 5 | -5 | 2 |  4
!      | 3 | 2 | three | 5 | -5 | 5 | -5
!      | 3 | 2 | three | 5 | -5 | 5 | -5
!      | 3 | 2 | three | 5 | -5 | 0 |   
!      | 3 | 2 | three | 5 | -5 |   |   
!      | 3 | 2 | three | 5 | -5 |   |  0
!      | 4 | 1 | four  | 5 | -5 | 1 | -1
!      | 4 | 1 | four  | 5 | -5 | 2 |  2
!      | 4 | 1 | four  | 5 | -5 | 3 | -3
!      | 4 | 1 | four  | 5 | -5 | 2 |  4
!      | 4 | 1 | four  | 5 | -5 | 5 | -5
!      | 4 | 1 | four  | 5 | -5 | 5 | -5
!      | 4 | 1 | four  | 5 | -5 | 0 |   
!      | 4 | 1 | four  | 5 | -5 |   |   
!      | 4 | 1 | four  | 5 | -5 |   |  0
!      | 5 | 0 | five  | 5 | -5 | 1 | -1
!      | 5 | 0 | five  | 5 | -5 | 2 |  2
!      | 5 | 0 | five  | 5 | -5 | 3 | -3
!      | 5 | 0 | five  | 5 | -5 | 2 |  4
!      | 5 | 0 | five  | 5 | -5 | 5 | -5
!      | 5 | 0 | five  | 5 | -5 | 5 | -5
!      | 5 | 0 | five  | 5 | -5 | 0 |   
!      | 5 | 0 | five  | 5 | -5 |   |   
!      | 5 | 0 | five  | 5 | -5 |   |  0
!      | 6 | 6 | six   | 5 | -5 | 1 | -1
!      | 6 | 6 | six   | 5 | -5 | 2 |  2
!      | 6 | 6 | six   | 5 | -5 | 3 | -3
!      | 6 | 6 | six   | 5 | -5 | 2 |  4
!      | 6 | 6 | six   | 5 | -5 | 5 | -5
!      | 6 | 6 | six   | 5 | -5 | 5 | -5
!      | 6 | 6 | six   | 5 | -5 | 0 |   
!      | 6 | 6 | six   | 5 | -5 |   |   
!      | 6 | 6 | six   | 5 | -5 |   |  0
!      | 7 | 7 | seven | 5 | -5 | 1 | -1
!      | 7 | 7 | seven | 5 | -5 | 2 |  2
!      | 7 | 7 | seven | 5 | -5 | 3 | -3
!      | 7 | 7 | seven | 5 | -5 | 2 |  4
!      | 7 | 7 | seven | 5 | -5 | 5 | -5
!      | 7 | 7 | seven | 5 | -5 | 5 | -5
!      | 7 | 7 | seven | 5 | -5 | 0 |   
!      | 7 | 7 | seven | 5 | -5 |   |   
!      | 7 | 7 | seven | 5 | -5 |   |  0
!      | 8 | 8 | eight | 5 | -5 | 1 | -1
!      | 8 | 8 | eight | 5 | -5 | 2 |  2
!      | 8 | 8 | eight | 5 | -5 | 3 | -3
!      | 8 | 8 | eight | 5 | -5 | 2 |  4
!      | 8 | 8 | eight | 5 | -5 | 5 | -5
!      | 8 | 8 | eight | 5 | -5 | 5 | -5
!      | 8 | 8 | eight | 5 | -5 | 0 |   
!      | 8 | 8 | eight | 5 | -5 |   |   
!      | 8 | 8 | eight | 5 | -5 |   |  0
!      | 0 |   | zero  | 5 | -5 | 1 | -1
!      | 0 |   | zero  | 5 | -5 | 2 |  2
!      | 0 |   | zero  | 5 | -5 | 3 | -3
!      | 0 |   | zero  | 5 | -5 | 2 |  4
!      | 0 |   | zero  | 5 | -5 | 5 | -5
!      | 0 |   | zero  | 5 | -5 | 5 | -5
!      | 0 |   | zero  | 5 | -5 | 0 |   
!      | 0 |   | zero  | 5 | -5 |   |   
!      | 0 |   | zero  | 5 | -5 |   |  0
!      |   |   | null  | 5 | -5 | 1 | -1
!      |   |   | null  | 5 | -5 | 2 |  2
!      |   |   | null  | 5 | -5 | 3 | -3
!      |   |   | null  | 5 | -5 | 2 |  4
!      |   |   | null  | 5 | -5 | 5 | -5
!      |   |   | null  | 5 | -5 | 5 | -5
!      |   |   | null  | 5 | -5 | 0 |   
!      |   |   | null  | 5 | -5 |   |   
!      |   |   | null  | 5 | -5 |   |  0
!      |   | 0 | zero  | 5 | -5 | 1 | -1
!      |   | 0 | zero  | 5 | -5 | 2 |  2
!      |   | 0 | zero  | 5 | -5 | 3 | -3
!      |   | 0 | zero  | 5 | -5 | 2 |  4
!      |   | 0 | zero  | 5 | -5 | 5 | -5
!      |   | 0 | zero  | 5 | -5 | 5 | -5
!      |   | 0 | zero  | 5 | -5 | 0 |   
!      |   | 0 | zero  | 5 | -5 |   |   
!      |   | 0 | zero  | 5 | -5 |   |  0
!      | 1 | 4 | one   | 5 | -5 | 1 | -1
!      | 1 | 4 | one   | 5 | -5 | 2 |  2
!      | 1 | 4 | one   | 5 | -5 | 3 | -3
!      | 1 | 4 | one   | 5 | -5 | 2 |  4
!      | 1 | 4 | one   | 5 | -5 | 5 | -5
!      | 1 | 4 | one   | 5 | -5 | 5 | -5
!      | 1 | 4 | one   | 5 | -5 | 0 |   
!      | 1 | 4 | one   | 5 | -5 |   |   
!      | 1 | 4 | one   | 5 | -5 |   |  0
!      | 2 | 3 | two   | 5 | -5 | 1 | -1
!      | 2 | 3 | two   | 5 | -5 | 2 |  2
!      | 2 | 3 | two   | 5 | -5 | 3 | -3
!      | 2 | 3 | two   | 5 | -5 | 2 |  4
!      | 2 | 3 | two   | 5 | -5 | 5 | -5
!      | 2 | 3 | two   | 5 | -5 | 5 | -5
!      | 2 | 3 | two   | 5 | -5 | 0 |   
!      | 2 | 3 | two   | 5 | -5 |   |   
!      | 2 | 3 | two   | 5 | -5 |   |  0
!      | 3 | 2 | three | 5 | -5 | 1 | -1
!      | 3 | 2 | three | 5 | -5 | 2 |  2
!      | 3 | 2 | three | 5 | -5 | 3 | -3
!      | 3 | 2 | three | 5 | -5 | 2 |  4
!      | 3 | 2 | three | 5 | -5 | 5 | -5
!      | 3 | 2 | three | 5 | -5 | 5 | -5
!      | 3 | 2 | three | 5 | -5 | 0 |   
!      | 3 | 2 | three | 5 | -5 |   |   
!      | 3 | 2 | three | 5 | -5 |   |  0
!      | 4 | 1 | four  | 5 | -5 | 1 | -1
!      | 4 | 1 | four  | 5 | -5 | 2 |  2
!      | 4 | 1 | four  | 5 | -5 | 3 | -3
!      | 4 | 1 | four  | 5 | -5 | 2 |  4
!      | 4 | 1 | four  | 5 | -5 | 5 | -5
!      | 4 | 1 | four  | 5 | -5 | 5 | -5
!      | 4 | 1 | four  | 5 | -5 | 0 |   
!      | 4 | 1 | four  | 5 | -5 |   |   
!      | 4 | 1 | four  | 5 | -5 |   |  0
!      | 5 | 0 | five  | 5 | -5 | 1 | -1
!      | 5 | 0 | five  | 5 | -5 | 2 |  2
!      | 5 | 0 | five  | 5 | -5 | 3 | -3
!      | 5 | 0 | five  | 5 | -5 | 2 |  4
!      | 5 | 0 | five  | 5 | -5 | 5 | -5
!      | 5 | 0 | five  | 5 | -5 | 5 | -5
!      | 5 | 0 | five  | 5 | -5 | 0 |   
!      | 5 | 0 | five  | 5 | -5 |   |   
!      | 5 | 0 | five  | 5 | -5 |   |  0
!      | 6 | 6 | six   | 5 | -5 | 1 | -1
!      | 6 | 6 | six   | 5 | -5 | 2 |  2
!      | 6 | 6 | six   | 5 | -5 | 3 | -3
!      | 6 | 6 | six   | 5 | -5 | 2 |  4
!      | 6 | 6 | six   | 5 | -5 | 5 | -5
!      | 6 | 6 | six   | 5 | -5 | 5 | -5
!      | 6 | 6 | six   | 5 | -5 | 0 |   
!      | 6 | 6 | six   | 5 | -5 |   |   
!      | 6 | 6 | six   | 5 | -5 |   |  0
!      | 7 | 7 | seven | 5 | -5 | 1 | -1
!      | 7 | 7 | seven | 5 | -5 | 2 |  2
!      | 7 | 7 | seven | 5 | -5 | 3 | -3
!      | 7 | 7 | seven | 5 | -5 | 2 |  4
!      | 7 | 7 | seven | 5 | -5 | 5 | -5
!      | 7 | 7 | seven | 5 | -5 | 5 | -5
!      | 7 | 7 | seven | 5 | -5 | 0 |   
!      | 7 | 7 | seven | 5 | -5 |   |   
!      | 7 | 7 | seven | 5 | -5 |   |  0
!      | 8 | 8 | eight | 5 | -5 | 1 | -1
!      | 8 | 8 | eight | 5 | -5 | 2 |  2
!      | 8 | 8 | eight | 5 | -5 | 3 | -3
!      | 8 | 8 | eight | 5 | -5 | 2 |  4
!      | 8 | 8 | eight | 5 | -5 | 5 | -5
!      | 8 | 8 | eight | 5 | -5 | 5 | -5
!      | 8 | 8 | eight | 5 | -5 | 0 |   
!      | 8 | 8 | eight | 5 | -5 |   |   
!      | 8 | 8 | eight | 5 | -5 |   |  0
!      | 0 |   | zero  | 5 | -5 | 1 | -1
!      | 0 |   | zero  | 5 | -5 | 2 |  2
!      | 0 |   | zero  | 5 | -5 | 3 | -3
!      | 0 |   | zero  | 5 | -5 | 2 |  4
!      | 0 |   | zero  | 5 | -5 | 5 | -5
!      | 0 |   | zero  | 5 | -5 | 5 | -5
!      | 0 |   | zero  | 5 | -5 | 0 |   
!      | 0 |   | zero  | 5 | -5 |   |   
!      | 0 |   | zero  | 5 | -5 |   |  0
!      |   |   | null  | 5 | -5 | 1 | -1
!      |   |   | null  | 5 | -5 | 2 |  2
!      |   |   | null  | 5 | -5 | 3 | -3
!      |   |   | null  | 5 | -5 | 2 |  4
!      |   |   | null  | 5 | -5 | 5 | -5
!      |   |   | null  | 5 | -5 | 5 | -5
!      |   |   | null  | 5 | -5 | 0 |   
!      |   |   | null  | 5 | -5 |   |   
!      |   |   | null  | 5 | -5 |   |  0
!      |   | 0 | zero  | 5 | -5 | 1 | -1
!      |   | 0 | zero  | 5 | -5 | 2 |  2
!      |   | 0 | zero  | 5 | -5 | 3 | -3
!      |   | 0 | zero  | 5 | -5 | 2 |  4
!      |   | 0 | zero  | 5 | -5 | 5 | -5
!      |   | 0 | zero  | 5 | -5 | 5 | -5
!      |   | 0 | zero  | 5 | -5 | 0 |   
!      |   | 0 | zero  | 5 | -5 |   |   
!      |   | 0 | zero  | 5 | -5 |   |  0
!      | 1 | 4 | one   | 0 |    | 1 | -1
!      | 1 | 4 | one   | 0 |    | 2 |  2
!      | 1 | 4 | one   | 0 |    | 3 | -3
!      | 1 | 4 | one   | 0 |    | 2 |  4
!      | 1 | 4 | one   | 0 |    | 5 | -5
!      | 1 | 4 | one   | 0 |    | 5 | -5
!      | 1 | 4 | one   | 0 |    | 0 |   
!      | 1 | 4 | one   | 0 |    |   |   
!      | 1 | 4 | one   | 0 |    |   |  0
!      | 2 | 3 | two   | 0 |    | 1 | -1
!      | 2 | 3 | two   | 0 |    | 2 |  2
!      | 2 | 3 | two   | 0 |    | 3 | -3
!      | 2 | 3 | two   | 0 |    | 2 |  4
!      | 2 | 3 | two   | 0 |    | 5 | -5
!      | 2 | 3 | two   | 0 |    | 5 | -5
!      | 2 | 3 | two   | 0 |    | 0 |   
!      | 2 | 3 | two   | 0 |    |   |   
!      | 2 | 3 | two   | 0 |    |   |  0
!      | 3 | 2 | three | 0 |    | 1 | -1
!      | 3 | 2 | three | 0 |    | 2 |  2
!      | 3 | 2 | three | 0 |    | 3 | -3
!      | 3 | 2 | three | 0 |    | 2 |  4
!      | 3 | 2 | three | 0 |    | 5 | -5
!      | 3 | 2 | three | 0 |    | 5 | -5
!      | 3 | 2 | three | 0 |    | 0 |   
!      | 3 | 2 | three | 0 |    |   |   
!      | 3 | 2 | three | 0 |    |   |  0
!      | 4 | 1 | four  | 0 |    | 1 | -1
!      | 4 | 1 | four  | 0 |    | 2 |  2
!      | 4 | 1 | four  | 0 |    | 3 | -3
!      | 4 | 1 | four  | 0 |    | 2 |  4
!      | 4 | 1 | four  | 0 |    | 5 | -5
!      | 4 | 1 | four  | 0 |    | 5 | -5
!      | 4 | 1 | four  | 0 |    | 0 |   
!      | 4 | 1 | four  | 0 |    |   |   
!      | 4 | 1 | four  | 0 |    |   |  0
!      | 5 | 0 | five  | 0 |    | 1 | -1
!      | 5 | 0 | five  | 0 |    | 2 |  2
!      | 5 | 0 | five  | 0 |    | 3 | -3
!      | 5 | 0 | five  | 0 |    | 2 |  4
!      | 5 | 0 | five  | 0 |    | 5 | -5
!      | 5 | 0 | five  | 0 |    | 5 | -5
!      | 5 | 0 | five  | 0 |    | 0 |   
!      | 5 | 0 | five  | 0 |    |   |   
!      | 5 | 0 | five  | 0 |    |   |  0
!      | 6 | 6 | six   | 0 |    | 1 | -1
!      | 6 | 6 | six   | 0 |    | 2 |  2
!      | 6 | 6 | six   | 0 |    | 3 | -3
!      | 6 | 6 | six   | 0 |    | 2 |  4
!      | 6 | 6 | six   | 0 |    | 5 | -5
!      | 6 | 6 | six   | 0 |    | 5 | -5
!      | 6 | 6 | six   | 0 |    | 0 |   
!      | 6 | 6 | six   | 0 |    |   |   
!      | 6 | 6 | six   | 0 |    |   |  0
!      | 7 | 7 | seven | 0 |    | 1 | -1
!      | 7 | 7 | seven | 0 |    | 2 |  2
!      | 7 | 7 | seven | 0 |    | 3 | -3
!      | 7 | 7 | seven | 0 |    | 2 |  4
!      | 7 | 7 | seven | 0 |    | 5 | -5
!      | 7 | 7 | seven | 0 |    | 5 | -5
!      | 7 | 7 | seven | 0 |    | 0 |   
!      | 7 | 7 | seven | 0 |    |   |   
!      | 7 | 7 | seven | 0 |    |   |  0
!      | 8 | 8 | eight | 0 |    | 1 | -1
!      | 8 | 8 | eight | 0 |    | 2 |  2
!      | 8 | 8 | eight | 0 |    | 3 | -3
!      | 8 | 8 | eight | 0 |    | 2 |  4
!      | 8 | 8 | eight | 0 |    | 5 | -5
!      | 8 | 8 | eight | 0 |    | 5 | -5
!      | 8 | 8 | eight | 0 |    | 0 |   
!      | 8 | 8 | eight | 0 |    |   |   
!      | 8 | 8 | eight | 0 |    |   |  0
!      | 0 |   | zero  | 0 |    | 1 | -1
!      | 0 |   | zero  | 0 |    | 2 |  2
!      | 0 |   | zero  | 0 |    | 3 | -3
!      | 0 |   | zero  | 0 |    | 2 |  4
!      | 0 |   | zero  | 0 |    | 5 | -5
!      | 0 |   | zero  | 0 |    | 5 | -5
!      | 0 |   | zero  | 0 |    | 0 |   
!      | 0 |   | zero  | 0 |    |   |   
!      | 0 |   | zero  | 0 |    |   |  0
!      |   |   | null  | 0 |    | 1 | -1
!      |   |   | null  | 0 |    | 2 |  2
!      |   |   | null  | 0 |    | 3 | -3
!      |   |   | null  | 0 |    | 2 |  4
!      |   |   | null  | 0 |    | 5 | -5
!      |   |   | null  | 0 |    | 5 | -5
!      |   |   | null  | 0 |    | 0 |   
!      |   |   | null  | 0 |    |   |   
!      |   |   | null  | 0 |    |   |  0
!      |   | 0 | zero  | 0 |    | 1 | -1
!      |   | 0 | zero  | 0 |    | 2 |  2
!      |   | 0 | zero  | 0 |    | 3 | -3
!      |   | 0 | zero  | 0 |    | 2 |  4
!      |   | 0 | zero  | 0 |    | 5 | -5
!      |   | 0 | zero  | 0 |    | 5 | -5
!      |   | 0 | zero  | 0 |    | 0 |   
!      |   | 0 | zero  | 0 |    |   |   
!      |   | 0 | zero  | 0 |    |   |  0
!      | 1 | 4 | one   |   |    | 1 | -1
!      | 1 | 4 | one   |   |    | 2 |  2
!      | 1 | 4 | one   |   |    | 3 | -3
!      | 1 | 4 | one   |   |    | 2 |  4
!      | 1 | 4 | one   |   |    | 5 | -5
!      | 1 | 4 | one   |   |    | 5 | -5
!      | 1 | 4 | one   |   |    | 0 |   
!      | 1 | 4 | one   |   |    |   |   
!      | 1 | 4 | one   |   |    |   |  0
!      | 2 | 3 | two   |   |    | 1 | -1
!      | 2 | 3 | two   |   |    | 2 |  2
!      | 2 | 3 | two   |   |    | 3 | -3
!      | 2 | 3 | two   |   |    | 2 |  4
!      | 2 | 3 | two   |   |    | 5 | -5
!      | 2 | 3 | two   |   |    | 5 | -5
!      | 2 | 3 | two   |   |    | 0 |   
!      | 2 | 3 | two   |   |    |   |   
!      | 2 | 3 | two   |   |    |   |  0
!      | 3 | 2 | three |   |    | 1 | -1
!      | 3 | 2 | three |   |    | 2 |  2
!      | 3 | 2 | three |   |    | 3 | -3
!      | 3 | 2 | three |   |    | 2 |  4
!      | 3 | 2 | three |   |    | 5 | -5
!      | 3 | 2 | three |   |    | 5 | -5
!      | 3 | 2 | three |   |    | 0 |   
!      | 3 | 2 | three |   |    |   |   
!      | 3 | 2 | three |   |    |   |  0
!      | 4 | 1 | four  |   |    | 1 | -1
!      | 4 | 1 | four  |   |    | 2 |  2
!      | 4 | 1 | four  |   |    | 3 | -3
!      | 4 | 1 | four  |   |    | 2 |  4
!      | 4 | 1 | four  |   |    | 5 | -5
!      | 4 | 1 | four  |   |    | 5 | -5
!      | 4 | 1 | four  |   |    | 0 |   
!      | 4 | 1 | four  |   |    |   |   
!      | 4 | 1 | four  |   |    |   |  0
!      | 5 | 0 | five  |   |    | 1 | -1
!      | 5 | 0 | five  |   |    | 2 |  2
!      | 5 | 0 | five  |   |    | 3 | -3
!      | 5 | 0 | five  |   |    | 2 |  4
!      | 5 | 0 | five  |   |    | 5 | -5
!      | 5 | 0 | five  |   |    | 5 | -5
!      | 5 | 0 | five  |   |    | 0 |   
!      | 5 | 0 | five  |   |    |   |   
!      | 5 | 0 | five  |   |    |   |  0
!      | 6 | 6 | six   |   |    | 1 | -1
!      | 6 | 6 | six   |   |    | 2 |  2
!      | 6 | 6 | six   |   |    | 3 | -3
!      | 6 | 6 | six   |   |    | 2 |  4
!      | 6 | 6 | six   |   |    | 5 | -5
!      | 6 | 6 | six   |   |    | 5 | -5
!      | 6 | 6 | six   |   |    | 0 |   
!      | 6 | 6 | six   |   |    |   |   
!      | 6 | 6 | six   |   |    |   |  0
!      | 7 | 7 | seven |   |    | 1 | -1
!      | 7 | 7 | seven |   |    | 2 |  2
!      | 7 | 7 | seven |   |    | 3 | -3
!      | 7 | 7 | seven |   |    | 2 |  4
!      | 7 | 7 | seven |   |    | 5 | -5
!      | 7 | 7 | seven |   |    | 5 | -5
!      | 7 | 7 | seven |   |    | 0 |   
!      | 7 | 7 | seven |   |    |   |   
!      | 7 | 7 | seven |   |    |   |  0
!      | 8 | 8 | eight |   |    | 1 | -1
!      | 8 | 8 | eight |   |    | 2 |  2
!      | 8 | 8 | eight |   |    | 3 | -3
!      | 8 | 8 | eight |   |    | 2 |  4
!      | 8 | 8 | eight |   |    | 5 | -5
!      | 8 | 8 | eight |   |    | 5 | -5
!      | 8 | 8 | eight |   |    | 0 |   
!      | 8 | 8 | eight |   |    |   |   
!      | 8 | 8 | eight |   |    |   |  0
!      | 0 |   | zero  |   |    | 1 | -1
!      | 0 |   | zero  |   |    | 2 |  2
!      | 0 |   | zero  |   |    | 3 | -3
!      | 0 |   | zero  |   |    | 2 |  4
!      | 0 |   | zero  |   |    | 5 | -5
!      | 0 |   | zero  |   |    | 5 | -5
!      | 0 |   | zero  |   |    | 0 |   
!      | 0 |   | zero  |   |    |   |   
!      | 0 |   | zero  |   |    |   |  0
!      |   |   | null  |   |    | 1 | -1
!      |   |   | null  |   |    | 2 |  2
!      |   |   | null  |   |    | 3 | -3
!      |   |   | null  |   |    | 2 |  4
!      |   |   | null  |   |    | 5 | -5
!      |   |   | null  |   |    | 5 | -5
!      |   |   | null  |   |    | 0 |   
!      |   |   | null  |   |    |   |   
!      |   |   | null  |   |    |   |  0
!      |   | 0 | zero  |   |    | 1 | -1
!      |   | 0 | zero  |   |    | 2 |  2
!      |   | 0 | zero  |   |    | 3 | -3
!      |   | 0 | zero  |   |    | 2 |  4
!      |   | 0 | zero  |   |    | 5 | -5
!      |   | 0 | zero  |   |    | 5 | -5
!      |   | 0 | zero  |   |    | 0 |   
!      |   | 0 | zero  |   |    |   |   
!      |   | 0 | zero  |   |    |   |  0
!      | 1 | 4 | one   |   |  0 | 1 | -1
!      | 1 | 4 | one   |   |  0 | 2 |  2
!      | 1 | 4 | one   |   |  0 | 3 | -3
!      | 1 | 4 | one   |   |  0 | 2 |  4
!      | 1 | 4 | one   |   |  0 | 5 | -5
!      | 1 | 4 | one   |   |  0 | 5 | -5
!      | 1 | 4 | one   |   |  0 | 0 |   
!      | 1 | 4 | one   |   |  0 |   |   
!      | 1 | 4 | one   |   |  0 |   |  0
!      | 2 | 3 | two   |   |  0 | 1 | -1
!      | 2 | 3 | two   |   |  0 | 2 |  2
!      | 2 | 3 | two   |   |  0 | 3 | -3
!      | 2 | 3 | two   |   |  0 | 2 |  4
!      | 2 | 3 | two   |   |  0 | 5 | -5
!      | 2 | 3 | two   |   |  0 | 5 | -5
!      | 2 | 3 | two   |   |  0 | 0 |   
!      | 2 | 3 | two   |   |  0 |   |   
!      | 2 | 3 | two   |   |  0 |   |  0
!      | 3 | 2 | three |   |  0 | 1 | -1
!      | 3 | 2 | three |   |  0 | 2 |  2
!      | 3 | 2 | three |   |  0 | 3 | -3
!      | 3 | 2 | three |   |  0 | 2 |  4
!      | 3 | 2 | three |   |  0 | 5 | -5
!      | 3 | 2 | three |   |  0 | 5 | -5
!      | 3 | 2 | three |   |  0 | 0 |   
!      | 3 | 2 | three |   |  0 |   |   
!      | 3 | 2 | three |   |  0 |   |  0
!      | 4 | 1 | four  |   |  0 | 1 | -1
!      | 4 | 1 | four  |   |  0 | 2 |  2
!      | 4 | 1 | four  |   |  0 | 3 | -3
!      | 4 | 1 | four  |   |  0 | 2 |  4
!      | 4 | 1 | four  |   |  0 | 5 | -5
!      | 4 | 1 | four  |   |  0 | 5 | -5
!      | 4 | 1 | four  |   |  0 | 0 |   
!      | 4 | 1 | four  |   |  0 |   |   
!      | 4 | 1 | four  |   |  0 |   |  0
!      | 5 | 0 | five  |   |  0 | 1 | -1
!      | 5 | 0 | five  |   |  0 | 2 |  2
!      | 5 | 0 | five  |   |  0 | 3 | -3
!      | 5 | 0 | five  |   |  0 | 2 |  4
!      | 5 | 0 | five  |   |  0 | 5 | -5
!      | 5 | 0 | five  |   |  0 | 5 | -5
!      | 5 | 0 | five  |   |  0 | 0 |   
!      | 5 | 0 | five  |   |  0 |   |   
!      | 5 | 0 | five  |   |  0 |   |  0
!      | 6 | 6 | six   |   |  0 | 1 | -1
!      | 6 | 6 | six   |   |  0 | 2 |  2
!      | 6 | 6 | six   |   |  0 | 3 | -3
!      | 6 | 6 | six   |   |  0 | 2 |  4
!      | 6 | 6 | six   |   |  0 | 5 | -5
!      | 6 | 6 | six   |   |  0 | 5 | -5
!      | 6 | 6 | six   |   |  0 | 0 |   
!      | 6 | 6 | six   |   |  0 |   |   
!      | 6 | 6 | six   |   |  0 |   |  0
!      | 7 | 7 | seven |   |  0 | 1 | -1
!      | 7 | 7 | seven |   |  0 | 2 |  2
!      | 7 | 7 | seven |   |  0 | 3 | -3
!      | 7 | 7 | seven |   |  0 | 2 |  4
!      | 7 | 7 | seven |   |  0 | 5 | -5
!      | 7 | 7 | seven |   |  0 | 5 | -5
!      | 7 | 7 | seven |   |  0 | 0 |   
!      | 7 | 7 | seven |   |  0 |   |   
!      | 7 | 7 | seven |   |  0 |   |  0
!      | 8 | 8 | eight |   |  0 | 1 | -1
!      | 8 | 8 | eight |   |  0 | 2 |  2
!      | 8 | 8 | eight |   |  0 | 3 | -3
!      | 8 | 8 | eight |   |  0 | 2 |  4
!      | 8 | 8 | eight |   |  0 | 5 | -5
!      | 8 | 8 | eight |   |  0 | 5 | -5
!      | 8 | 8 | eight |   |  0 | 0 |   
!      | 8 | 8 | eight |   |  0 |   |   
!      | 8 | 8 | eight |   |  0 |   |  0
!      | 0 |   | zero  |   |  0 | 1 | -1
!      | 0 |   | zero  |   |  0 | 2 |  2
!      | 0 |   | zero  |   |  0 | 3 | -3
!      | 0 |   | zero  |   |  0 | 2 |  4
!      | 0 |   | zero  |   |  0 | 5 | -5
!      | 0 |   | zero  |   |  0 | 5 | -5
!      | 0 |   | zero  |   |  0 | 0 |   
!      | 0 |   | zero  |   |  0 |   |   
!      | 0 |   | zero  |   |  0 |   |  0
!      |   |   | null  |   |  0 | 1 | -1
!      |   |   | null  |   |  0 | 2 |  2
!      |   |   | null  |   |  0 | 3 | -3
!      |   |   | null  |   |  0 | 2 |  4
!      |   |   | null  |   |  0 | 5 | -5
!      |   |   | null  |   |  0 | 5 | -5
!      |   |   | null  |   |  0 | 0 |   
!      |   |   | null  |   |  0 |   |   
!      |   |   | null  |   |  0 |   |  0
!      |   | 0 | zero  |   |  0 | 1 | -1
!      |   | 0 | zero  |   |  0 | 2 |  2
!      |   | 0 | zero  |   |  0 | 3 | -3
!      |   | 0 | zero  |   |  0 | 2 |  4
!      |   | 0 | zero  |   |  0 | 5 | -5
!      |   | 0 | zero  |   |  0 | 5 | -5
!      |   | 0 | zero  |   |  0 | 0 |   
!      |   | 0 | zero  |   |  0 |   |   
!      |   | 0 | zero  |   |  0 |   |  0
! (891 rows)
! 
! --
! --
! -- Inner joins (equi-joins)
! --
! --
! --
! -- Inner joins (equi-joins) with USING clause
! -- The USING syntax changes the shape of the resulting table
! -- by including a column in the USING clause only once in the result.
! --
! -- Inner equi-join on specified column
! SELECT '' AS "xxx", *
!   FROM J1_TBL INNER JOIN J2_TBL USING (i);
!  xxx | i | j |   t   | k  
! -----+---+---+-------+----
!      | 0 |   | zero  |   
!      | 1 | 4 | one   | -1
!      | 2 | 3 | two   |  2
!      | 2 | 3 | two   |  4
!      | 3 | 2 | three | -3
!      | 5 | 0 | five  | -5
!      | 5 | 0 | five  | -5
! (7 rows)
! 
! -- Same as above, slightly different syntax
! SELECT '' AS "xxx", *
!   FROM J1_TBL JOIN J2_TBL USING (i);
!  xxx | i | j |   t   | k  
! -----+---+---+-------+----
!      | 0 |   | zero  |   
!      | 1 | 4 | one   | -1
!      | 2 | 3 | two   |  2
!      | 2 | 3 | two   |  4
!      | 3 | 2 | three | -3
!      | 5 | 0 | five  | -5
!      | 5 | 0 | five  | -5
! (7 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL t1 (a, b, c) JOIN J2_TBL t2 (a, d) USING (a)
!   ORDER BY a, d;
!  xxx | a | b |   c   | d  
! -----+---+---+-------+----
!      | 0 |   | zero  |   
!      | 1 | 4 | one   | -1
!      | 2 | 3 | two   |  2
!      | 2 | 3 | two   |  4
!      | 3 | 2 | three | -3
!      | 5 | 0 | five  | -5
!      | 5 | 0 | five  | -5
! (7 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL t1 (a, b, c) JOIN J2_TBL t2 (a, b) USING (b)
!   ORDER BY b, t1.a;
!  xxx | b | a |   c   | a 
! -----+---+---+-------+---
!      | 0 | 5 | five  |  
!      | 0 |   | zero  |  
!      | 2 | 3 | three | 2
!      | 4 | 1 | one   | 2
! (4 rows)
! 
! --
! -- NATURAL JOIN
! -- Inner equi-join on all columns with the same name
! --
! SELECT '' AS "xxx", *
!   FROM J1_TBL NATURAL JOIN J2_TBL;
!  xxx | i | j |   t   | k  
! -----+---+---+-------+----
!      | 0 |   | zero  |   
!      | 1 | 4 | one   | -1
!      | 2 | 3 | two   |  2
!      | 2 | 3 | two   |  4
!      | 3 | 2 | three | -3
!      | 5 | 0 | five  | -5
!      | 5 | 0 | five  | -5
! (7 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL t1 (a, b, c) NATURAL JOIN J2_TBL t2 (a, d);
!  xxx | a | b |   c   | d  
! -----+---+---+-------+----
!      | 0 |   | zero  |   
!      | 1 | 4 | one   | -1
!      | 2 | 3 | two   |  2
!      | 2 | 3 | two   |  4
!      | 3 | 2 | three | -3
!      | 5 | 0 | five  | -5
!      | 5 | 0 | five  | -5
! (7 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL t1 (a, b, c) NATURAL JOIN J2_TBL t2 (d, a);
!  xxx | a | b |  c   | d 
! -----+---+---+------+---
!      | 0 |   | zero |  
!      | 2 | 3 | two  | 2
!      | 4 | 1 | four | 2
! (3 rows)
! 
! -- mismatch number of columns
! -- currently, Postgres will fill in with underlying names
! SELECT '' AS "xxx", *
!   FROM J1_TBL t1 (a, b) NATURAL JOIN J2_TBL t2 (a);
!  xxx | a | b |   t   | k  
! -----+---+---+-------+----
!      | 0 |   | zero  |   
!      | 1 | 4 | one   | -1
!      | 2 | 3 | two   |  2
!      | 2 | 3 | two   |  4
!      | 3 | 2 | three | -3
!      | 5 | 0 | five  | -5
!      | 5 | 0 | five  | -5
! (7 rows)
! 
! --
! -- Inner joins (equi-joins)
! --
! SELECT '' AS "xxx", *
!   FROM J1_TBL JOIN J2_TBL ON (J1_TBL.i = J2_TBL.i);
!  xxx | i | j |   t   | i | k  
! -----+---+---+-------+---+----
!      | 0 |   | zero  | 0 |   
!      | 1 | 4 | one   | 1 | -1
!      | 2 | 3 | two   | 2 |  2
!      | 2 | 3 | two   | 2 |  4
!      | 3 | 2 | three | 3 | -3
!      | 5 | 0 | five  | 5 | -5
!      | 5 | 0 | five  | 5 | -5
! (7 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL JOIN J2_TBL ON (J1_TBL.i = J2_TBL.k);
!  xxx | i | j |  t   | i | k 
! -----+---+---+------+---+---
!      | 0 |   | zero |   | 0
!      | 2 | 3 | two  | 2 | 2
!      | 4 | 1 | four | 2 | 4
! (3 rows)
! 
! --
! -- Non-equi-joins
! --
! SELECT '' AS "xxx", *
!   FROM J1_TBL JOIN J2_TBL ON (J1_TBL.i <= J2_TBL.k);
!  xxx | i | j |   t   | i | k 
! -----+---+---+-------+---+---
!      | 1 | 4 | one   | 2 | 2
!      | 2 | 3 | two   | 2 | 2
!      | 0 |   | zero  | 2 | 2
!      | 1 | 4 | one   | 2 | 4
!      | 2 | 3 | two   | 2 | 4
!      | 3 | 2 | three | 2 | 4
!      | 4 | 1 | four  | 2 | 4
!      | 0 |   | zero  | 2 | 4
!      | 0 |   | zero  |   | 0
! (9 rows)
! 
! --
! -- Outer joins
! -- Note that OUTER is a noise word
! --
! SELECT '' AS "xxx", *
!   FROM J1_TBL LEFT OUTER JOIN J2_TBL USING (i)
!   ORDER BY i, k, t;
!  xxx | i | j |   t   | k  
! -----+---+---+-------+----
!      | 0 |   | zero  |   
!      | 1 | 4 | one   | -1
!      | 2 | 3 | two   |  2
!      | 2 | 3 | two   |  4
!      | 3 | 2 | three | -3
!      | 4 | 1 | four  |   
!      | 5 | 0 | five  | -5
!      | 5 | 0 | five  | -5
!      | 6 | 6 | six   |   
!      | 7 | 7 | seven |   
!      | 8 | 8 | eight |   
!      |   |   | null  |   
!      |   | 0 | zero  |   
! (13 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL LEFT JOIN J2_TBL USING (i)
!   ORDER BY i, k, t;
!  xxx | i | j |   t   | k  
! -----+---+---+-------+----
!      | 0 |   | zero  |   
!      | 1 | 4 | one   | -1
!      | 2 | 3 | two   |  2
!      | 2 | 3 | two   |  4
!      | 3 | 2 | three | -3
!      | 4 | 1 | four  |   
!      | 5 | 0 | five  | -5
!      | 5 | 0 | five  | -5
!      | 6 | 6 | six   |   
!      | 7 | 7 | seven |   
!      | 8 | 8 | eight |   
!      |   |   | null  |   
!      |   | 0 | zero  |   
! (13 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL RIGHT OUTER JOIN J2_TBL USING (i);
!  xxx | i | j |   t   | k  
! -----+---+---+-------+----
!      | 0 |   | zero  |   
!      | 1 | 4 | one   | -1
!      | 2 | 3 | two   |  2
!      | 2 | 3 | two   |  4
!      | 3 | 2 | three | -3
!      | 5 | 0 | five  | -5
!      | 5 | 0 | five  | -5
!      |   |   |       |   
!      |   |   |       |  0
! (9 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL RIGHT JOIN J2_TBL USING (i);
!  xxx | i | j |   t   | k  
! -----+---+---+-------+----
!      | 0 |   | zero  |   
!      | 1 | 4 | one   | -1
!      | 2 | 3 | two   |  2
!      | 2 | 3 | two   |  4
!      | 3 | 2 | three | -3
!      | 5 | 0 | five  | -5
!      | 5 | 0 | five  | -5
!      |   |   |       |   
!      |   |   |       |  0
! (9 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL FULL OUTER JOIN J2_TBL USING (i)
!   ORDER BY i, k, t;
!  xxx | i | j |   t   | k  
! -----+---+---+-------+----
!      | 0 |   | zero  |   
!      | 1 | 4 | one   | -1
!      | 2 | 3 | two   |  2
!      | 2 | 3 | two   |  4
!      | 3 | 2 | three | -3
!      | 4 | 1 | four  |   
!      | 5 | 0 | five  | -5
!      | 5 | 0 | five  | -5
!      | 6 | 6 | six   |   
!      | 7 | 7 | seven |   
!      | 8 | 8 | eight |   
!      |   |   |       |  0
!      |   |   | null  |   
!      |   | 0 | zero  |   
!      |   |   |       |   
! (15 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL FULL JOIN J2_TBL USING (i)
!   ORDER BY i, k, t;
!  xxx | i | j |   t   | k  
! -----+---+---+-------+----
!      | 0 |   | zero  |   
!      | 1 | 4 | one   | -1
!      | 2 | 3 | two   |  2
!      | 2 | 3 | two   |  4
!      | 3 | 2 | three | -3
!      | 4 | 1 | four  |   
!      | 5 | 0 | five  | -5
!      | 5 | 0 | five  | -5
!      | 6 | 6 | six   |   
!      | 7 | 7 | seven |   
!      | 8 | 8 | eight |   
!      |   |   |       |  0
!      |   |   | null  |   
!      |   | 0 | zero  |   
!      |   |   |       |   
! (15 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL LEFT JOIN J2_TBL USING (i) WHERE (k = 1);
!  xxx | i | j | t | k 
! -----+---+---+---+---
! (0 rows)
! 
! SELECT '' AS "xxx", *
!   FROM J1_TBL LEFT JOIN J2_TBL USING (i) WHERE (i = 1);
!  xxx | i | j |  t  | k  
! -----+---+---+-----+----
!      | 1 | 4 | one | -1
! (1 row)
! 
! --
! -- More complicated constructs
! --
! --
! -- Multiway full join
! --
! CREATE TABLE t1 (name TEXT, n INTEGER);
! CREATE TABLE t2 (name TEXT, n INTEGER);
! CREATE TABLE t3 (name TEXT, n INTEGER);
! INSERT INTO t1 VALUES ( 'bb', 11 );
! INSERT INTO t2 VALUES ( 'bb', 12 );
! INSERT INTO t2 VALUES ( 'cc', 22 );
! INSERT INTO t2 VALUES ( 'ee', 42 );
! INSERT INTO t3 VALUES ( 'bb', 13 );
! INSERT INTO t3 VALUES ( 'cc', 23 );
! INSERT INTO t3 VALUES ( 'dd', 33 );
! SELECT * FROM t1 FULL JOIN t2 USING (name) FULL JOIN t3 USING (name);
!  name | n  | n  | n  
! ------+----+----+----
!  bb   | 11 | 12 | 13
!  cc   |    | 22 | 23
!  dd   |    |    | 33
!  ee   |    | 42 |   
! (4 rows)
! 
! --
! -- Test interactions of join syntax and subqueries
! --
! -- Basic cases (we expect planner to pull up the subquery here)
! SELECT * FROM
! (SELECT * FROM t2) as s2
! INNER JOIN
! (SELECT * FROM t3) s3
! USING (name);
!  name | n  | n  
! ------+----+----
!  bb   | 12 | 13
!  cc   | 22 | 23
! (2 rows)
! 
! SELECT * FROM
! (SELECT * FROM t2) as s2
! LEFT JOIN
! (SELECT * FROM t3) s3
! USING (name);
!  name | n  | n  
! ------+----+----
!  bb   | 12 | 13
!  cc   | 22 | 23
!  ee   | 42 |   
! (3 rows)
! 
! SELECT * FROM
! (SELECT * FROM t2) as s2
! FULL JOIN
! (SELECT * FROM t3) s3
! USING (name);
!  name | n  | n  
! ------+----+----
!  bb   | 12 | 13
!  cc   | 22 | 23
!  dd   |    | 33
!  ee   | 42 |   
! (4 rows)
! 
! -- Cases with non-nullable expressions in subquery results;
! -- make sure these go to null as expected
! SELECT * FROM
! (SELECT name, n as s2_n, 2 as s2_2 FROM t2) as s2
! NATURAL INNER JOIN
! (SELECT name, n as s3_n, 3 as s3_2 FROM t3) s3;
!  name | s2_n | s2_2 | s3_n | s3_2 
! ------+------+------+------+------
!  bb   |   12 |    2 |   13 |    3
!  cc   |   22 |    2 |   23 |    3
! (2 rows)
! 
! SELECT * FROM
! (SELECT name, n as s2_n, 2 as s2_2 FROM t2) as s2
! NATURAL LEFT JOIN
! (SELECT name, n as s3_n, 3 as s3_2 FROM t3) s3;
!  name | s2_n | s2_2 | s3_n | s3_2 
! ------+------+------+------+------
!  bb   |   12 |    2 |   13 |    3
!  cc   |   22 |    2 |   23 |    3
!  ee   |   42 |    2 |      |     
! (3 rows)
! 
! SELECT * FROM
! (SELECT name, n as s2_n, 2 as s2_2 FROM t2) as s2
! NATURAL FULL JOIN
! (SELECT name, n as s3_n, 3 as s3_2 FROM t3) s3;
!  name | s2_n | s2_2 | s3_n | s3_2 
! ------+------+------+------+------
!  bb   |   12 |    2 |   13 |    3
!  cc   |   22 |    2 |   23 |    3
!  dd   |      |      |   33 |    3
!  ee   |   42 |    2 |      |     
! (4 rows)
! 
! SELECT * FROM
! (SELECT name, n as s1_n, 1 as s1_1 FROM t1) as s1
! NATURAL INNER JOIN
! (SELECT name, n as s2_n, 2 as s2_2 FROM t2) as s2
! NATURAL INNER JOIN
! (SELECT name, n as s3_n, 3 as s3_2 FROM t3) s3;
!  name | s1_n | s1_1 | s2_n | s2_2 | s3_n | s3_2 
! ------+------+------+------+------+------+------
!  bb   |   11 |    1 |   12 |    2 |   13 |    3
! (1 row)
! 
! SELECT * FROM
! (SELECT name, n as s1_n, 1 as s1_1 FROM t1) as s1
! NATURAL FULL JOIN
! (SELECT name, n as s2_n, 2 as s2_2 FROM t2) as s2
! NATURAL FULL JOIN
! (SELECT name, n as s3_n, 3 as s3_2 FROM t3) s3;
!  name | s1_n | s1_1 | s2_n | s2_2 | s3_n | s3_2 
! ------+------+------+------+------+------+------
!  bb   |   11 |    1 |   12 |    2 |   13 |    3
!  cc   |      |      |   22 |    2 |   23 |    3
!  dd   |      |      |      |      |   33 |    3
!  ee   |      |      |   42 |    2 |      |     
! (4 rows)
! 
! SELECT * FROM
! (SELECT name, n as s1_n FROM t1) as s1
! NATURAL FULL JOIN
!   (SELECT * FROM
!     (SELECT name, n as s2_n FROM t2) as s2
!     NATURAL FULL JOIN
!     (SELECT name, n as s3_n FROM t3) as s3
!   ) ss2;
!  name | s1_n | s2_n | s3_n 
! ------+------+------+------
!  bb   |   11 |   12 |   13
!  cc   |      |   22 |   23
!  dd   |      |      |   33
!  ee   |      |   42 |     
! (4 rows)
! 
! SELECT * FROM
! (SELECT name, n as s1_n FROM t1) as s1
! NATURAL FULL JOIN
!   (SELECT * FROM
!     (SELECT name, n as s2_n, 2 as s2_2 FROM t2) as s2
!     NATURAL FULL JOIN
!     (SELECT name, n as s3_n FROM t3) as s3
!   ) ss2;
!  name | s1_n | s2_n | s2_2 | s3_n 
! ------+------+------+------+------
!  bb   |   11 |   12 |    2 |   13
!  cc   |      |   22 |    2 |   23
!  dd   |      |      |      |   33
!  ee   |      |   42 |    2 |     
! (4 rows)
! 
! -- Test for propagation of nullability constraints into sub-joins
! create temp table x (x1 int, x2 int);
! insert into x values (1,11);
! insert into x values (2,22);
! insert into x values (3,null);
! insert into x values (4,44);
! insert into x values (5,null);
! create temp table y (y1 int, y2 int);
! insert into y values (1,111);
! insert into y values (2,222);
! insert into y values (3,333);
! insert into y values (4,null);
! select * from x;
!  x1 | x2 
! ----+----
!   1 | 11
!   2 | 22
!   3 |   
!   4 | 44
!   5 |   
! (5 rows)
! 
! select * from y;
!  y1 | y2  
! ----+-----
!   1 | 111
!   2 | 222
!   3 | 333
!   4 |    
! (4 rows)
! 
! select * from x left join y on (x1 = y1 and x2 is not null);
!  x1 | x2 | y1 | y2  
! ----+----+----+-----
!   1 | 11 |  1 | 111
!   2 | 22 |  2 | 222
!   3 |    |    |    
!   4 | 44 |  4 |    
!   5 |    |    |    
! (5 rows)
! 
! select * from x left join y on (x1 = y1 and y2 is not null);
!  x1 | x2 | y1 | y2  
! ----+----+----+-----
!   1 | 11 |  1 | 111
!   2 | 22 |  2 | 222
!   3 |    |  3 | 333
!   4 | 44 |    |    
!   5 |    |    |    
! (5 rows)
! 
! select * from (x left join y on (x1 = y1)) left join x xx(xx1,xx2)
! on (x1 = xx1);
!  x1 | x2 | y1 | y2  | xx1 | xx2 
! ----+----+----+-----+-----+-----
!   1 | 11 |  1 | 111 |   1 |  11
!   2 | 22 |  2 | 222 |   2 |  22
!   3 |    |  3 | 333 |   3 |    
!   4 | 44 |  4 |     |   4 |  44
!   5 |    |    |     |   5 |    
! (5 rows)
! 
! select * from (x left join y on (x1 = y1)) left join x xx(xx1,xx2)
! on (x1 = xx1 and x2 is not null);
!  x1 | x2 | y1 | y2  | xx1 | xx2 
! ----+----+----+-----+-----+-----
!   1 | 11 |  1 | 111 |   1 |  11
!   2 | 22 |  2 | 222 |   2 |  22
!   3 |    |  3 | 333 |     |    
!   4 | 44 |  4 |     |   4 |  44
!   5 |    |    |     |     |    
! (5 rows)
! 
! select * from (x left join y on (x1 = y1)) left join x xx(xx1,xx2)
! on (x1 = xx1 and y2 is not null);
!  x1 | x2 | y1 | y2  | xx1 | xx2 
! ----+----+----+-----+-----+-----
!   1 | 11 |  1 | 111 |   1 |  11
!   2 | 22 |  2 | 222 |   2 |  22
!   3 |    |  3 | 333 |   3 |    
!   4 | 44 |  4 |     |     |    
!   5 |    |    |     |     |    
! (5 rows)
! 
! select * from (x left join y on (x1 = y1)) left join x xx(xx1,xx2)
! on (x1 = xx1 and xx2 is not null);
!  x1 | x2 | y1 | y2  | xx1 | xx2 
! ----+----+----+-----+-----+-----
!   1 | 11 |  1 | 111 |   1 |  11
!   2 | 22 |  2 | 222 |   2 |  22
!   3 |    |  3 | 333 |     |    
!   4 | 44 |  4 |     |   4 |  44
!   5 |    |    |     |     |    
! (5 rows)
! 
! -- these should NOT give the same answers as above
! select * from (x left join y on (x1 = y1)) left join x xx(xx1,xx2)
! on (x1 = xx1) where (x2 is not null);
!  x1 | x2 | y1 | y2  | xx1 | xx2 
! ----+----+----+-----+-----+-----
!   1 | 11 |  1 | 111 |   1 |  11
!   2 | 22 |  2 | 222 |   2 |  22
!   4 | 44 |  4 |     |   4 |  44
! (3 rows)
! 
! select * from (x left join y on (x1 = y1)) left join x xx(xx1,xx2)
! on (x1 = xx1) where (y2 is not null);
!  x1 | x2 | y1 | y2  | xx1 | xx2 
! ----+----+----+-----+-----+-----
!   1 | 11 |  1 | 111 |   1 |  11
!   2 | 22 |  2 | 222 |   2 |  22
!   3 |    |  3 | 333 |   3 |    
! (3 rows)
! 
! select * from (x left join y on (x1 = y1)) left join x xx(xx1,xx2)
! on (x1 = xx1) where (xx2 is not null);
!  x1 | x2 | y1 | y2  | xx1 | xx2 
! ----+----+----+-----+-----+-----
!   1 | 11 |  1 | 111 |   1 |  11
!   2 | 22 |  2 | 222 |   2 |  22
!   4 | 44 |  4 |     |   4 |  44
! (3 rows)
! 
! --
! -- regression test: check for bug with propagation of implied equality
! -- to outside an IN
! --
! select count(*) from tenk1 a where unique1 in
!   (select unique1 from tenk1 b join tenk1 c using (unique1)
!    where b.unique2 = 42);
!  count 
! -------
!      1
! (1 row)
! 
! --
! -- regression test: check for failure to generate a plan with multiple
! -- degenerate IN clauses
! --
! select count(*) from tenk1 x where
!   x.unique1 in (select a.f1 from int4_tbl a,float8_tbl b where a.f1=b.f1) and
!   x.unique1 = 0 and
!   x.unique1 in (select aa.f1 from int4_tbl aa,float8_tbl bb where aa.f1=bb.f1);
!  count 
! -------
!      1
! (1 row)
! 
! -- try that with GEQO too
! begin;
! set geqo = on;
! set geqo_threshold = 2;
! select count(*) from tenk1 x where
!   x.unique1 in (select a.f1 from int4_tbl a,float8_tbl b where a.f1=b.f1) and
!   x.unique1 = 0 and
!   x.unique1 in (select aa.f1 from int4_tbl aa,float8_tbl bb where aa.f1=bb.f1);
!  count 
! -------
!      1
! (1 row)
! 
! rollback;
! --
! -- Clean up
! --
! DROP TABLE t1;
! DROP TABLE t2;
! DROP TABLE t3;
! DROP TABLE J1_TBL;
! DROP TABLE J2_TBL;
! -- Both DELETE and UPDATE allow the specification of additional tables
! -- to "join" against to determine which rows should be modified.
! CREATE TEMP TABLE t1 (a int, b int);
! CREATE TEMP TABLE t2 (a int, b int);
! CREATE TEMP TABLE t3 (x int, y int);
! INSERT INTO t1 VALUES (5, 10);
! INSERT INTO t1 VALUES (15, 20);
! INSERT INTO t1 VALUES (100, 100);
! INSERT INTO t1 VALUES (200, 1000);
! INSERT INTO t2 VALUES (200, 2000);
! INSERT INTO t3 VALUES (5, 20);
! INSERT INTO t3 VALUES (6, 7);
! INSERT INTO t3 VALUES (7, 8);
! INSERT INTO t3 VALUES (500, 100);
! DELETE FROM t3 USING t1 table1 WHERE t3.x = table1.a;
! SELECT * FROM t3;
!   x  |  y  
! -----+-----
!    6 |   7
!    7 |   8
!  500 | 100
! (3 rows)
! 
! DELETE FROM t3 USING t1 JOIN t2 USING (a) WHERE t3.x > t1.a;
! SELECT * FROM t3;
!  x | y 
! ---+---
!  6 | 7
!  7 | 8
! (2 rows)
! 
! DELETE FROM t3 USING t3 t3_other WHERE t3.x = t3_other.x AND t3.y = t3_other.y;
! SELECT * FROM t3;
!  x | y 
! ---+---
! (0 rows)
! 
! -- Test join against inheritance tree
! create temp table t2a () inherits (t2);
! insert into t2a values (200, 2001);
! select * from t1 left join t2 on (t1.a = t2.a);
!   a  |  b   |  a  |  b   
! -----+------+-----+------
!    5 |   10 |     |     
!   15 |   20 |     |     
!  100 |  100 |     |     
!  200 | 1000 | 200 | 2000
!  200 | 1000 | 200 | 2001
! (5 rows)
! 
! --
! -- regression test for 8.1 merge right join bug
! --
! CREATE TEMP TABLE tt1 ( tt1_id int4, joincol int4 );
! INSERT INTO tt1 VALUES (1, 11);
! INSERT INTO tt1 VALUES (2, NULL);
! CREATE TEMP TABLE tt2 ( tt2_id int4, joincol int4 );
! INSERT INTO tt2 VALUES (21, 11);
! INSERT INTO tt2 VALUES (22, 11);
! set enable_hashjoin to off;
! set enable_nestloop to off;
! -- these should give the same results
! select tt1.*, tt2.* from tt1 left join tt2 on tt1.joincol = tt2.joincol;
!  tt1_id | joincol | tt2_id | joincol 
! --------+---------+--------+---------
!       1 |      11 |     21 |      11
!       1 |      11 |     22 |      11
!       2 |         |        |        
! (3 rows)
! 
! select tt1.*, tt2.* from tt2 right join tt1 on tt1.joincol = tt2.joincol;
!  tt1_id | joincol | tt2_id | joincol 
! --------+---------+--------+---------
!       1 |      11 |     21 |      11
!       1 |      11 |     22 |      11
!       2 |         |        |        
! (3 rows)
! 
! reset enable_hashjoin;
! reset enable_nestloop;
! --
! -- regression test for 8.2 bug with improper re-ordering of left joins
! --
! create temp table tt3(f1 int, f2 text);
! insert into tt3 select x, repeat('xyzzy', 100) from generate_series(1,10000) x;
! create index tt3i on tt3(f1);
! analyze tt3;
! create temp table tt4(f1 int);
! insert into tt4 values (0),(1),(9999);
! analyze tt4;
! SELECT a.f1
! FROM tt4 a
! LEFT JOIN (
!         SELECT b.f1
!         FROM tt3 b LEFT JOIN tt3 c ON (b.f1 = c.f1)
!         WHERE c.f1 IS NULL
! ) AS d ON (a.f1 = d.f1)
! WHERE d.f1 IS NULL;
!   f1  
! ------
!     0
!     1
!  9999
! (3 rows)
! 
! --
! -- regression test for problems of the sort depicted in bug #3494
! --
! create temp table tt5(f1 int, f2 int);
! create temp table tt6(f1 int, f2 int);
! insert into tt5 values(1, 10);
! insert into tt5 values(1, 11);
! insert into tt6 values(1, 9);
! insert into tt6 values(1, 2);
! insert into tt6 values(2, 9);
! select * from tt5,tt6 where tt5.f1 = tt6.f1 and tt5.f1 = tt5.f2 - tt6.f2;
!  f1 | f2 | f1 | f2 
! ----+----+----+----
!   1 | 10 |  1 |  9
! (1 row)
! 
! --
! -- regression test for problems of the sort depicted in bug #3588
! --
! create temp table xx (pkxx int);
! create temp table yy (pkyy int, pkxx int);
! insert into xx values (1);
! insert into xx values (2);
! insert into xx values (3);
! insert into yy values (101, 1);
! insert into yy values (201, 2);
! insert into yy values (301, NULL);
! select yy.pkyy as yy_pkyy, yy.pkxx as yy_pkxx, yya.pkyy as yya_pkyy,
!        xxa.pkxx as xxa_pkxx, xxb.pkxx as xxb_pkxx
! from yy
!      left join (SELECT * FROM yy where pkyy = 101) as yya ON yy.pkyy = yya.pkyy
!      left join xx xxa on yya.pkxx = xxa.pkxx
!      left join xx xxb on coalesce (xxa.pkxx, 1) = xxb.pkxx;
!  yy_pkyy | yy_pkxx | yya_pkyy | xxa_pkxx | xxb_pkxx 
! ---------+---------+----------+----------+----------
!      101 |       1 |      101 |        1 |        1
!      201 |       2 |          |          |        1
!      301 |         |          |          |        1
! (3 rows)
! 
! --
! -- regression test for improper pushing of constants across outer-join clauses
! -- (as seen in early 8.2.x releases)
! --
! create temp table zt1 (f1 int primary key);
! create temp table zt2 (f2 int primary key);
! create temp table zt3 (f3 int primary key);
! insert into zt1 values(53);
! insert into zt2 values(53);
! select * from
!   zt2 left join zt3 on (f2 = f3)
!       left join zt1 on (f3 = f1)
! where f2 = 53;
!  f2 | f3 | f1 
! ----+----+----
!  53 |    |   
! (1 row)
! 
! create temp view zv1 as select *,'dummy'::text AS junk from zt1;
! select * from
!   zt2 left join zt3 on (f2 = f3)
!       left join zv1 on (f3 = f1)
! where f2 = 53;
!  f2 | f3 | f1 | junk 
! ----+----+----+------
!  53 |    |    | 
! (1 row)
! 
! --
! -- regression test for improper extraction of OR indexqual conditions
! -- (as seen in early 8.3.x releases)
! --
! select a.unique2, a.ten, b.tenthous, b.unique2, b.hundred
! from tenk1 a left join tenk1 b on a.unique2 = b.tenthous
! where a.unique1 = 42 and
!       ((b.unique2 is null and a.ten = 2) or b.hundred = 3);
!  unique2 | ten | tenthous | unique2 | hundred 
! ---------+-----+----------+---------+---------
! (0 rows)
! 
! --
! -- test proper positioning of one-time quals in EXISTS (8.4devel bug)
! --
! prepare foo(bool) as
!   select count(*) from tenk1 a left join tenk1 b
!     on (a.unique2 = b.unique1 and exists
!         (select 1 from tenk1 c where c.thousand = b.unique2 and $1));
! execute foo(true);
!  count 
! -------
!  10000
! (1 row)
! 
! execute foo(false);
!  count 
! -------
!  10000
! (1 row)
! 
! --
! -- test for sane behavior with noncanonical merge clauses, per bug #4926
! --
! begin;
! set enable_mergejoin = 1;
! set enable_hashjoin = 0;
! set enable_nestloop = 0;
! create temp table a (i integer);
! create temp table b (x integer, y integer);
! select * from a left join b on i = x and i = y and x = i;
!  i | x | y 
! ---+---+---
! (0 rows)
! 
! rollback;
! --
! -- test NULL behavior of whole-row Vars, per bug #5025
! --
! select t1.q2, count(t2.*)
! from int8_tbl t1 left join int8_tbl t2 on (t1.q2 = t2.q1)
! group by t1.q2 order by 1;
!         q2         | count 
! -------------------+-------
!  -4567890123456789 |     0
!                123 |     2
!                456 |     0
!   4567890123456789 |     6
! (4 rows)
! 
! select t1.q2, count(t2.*)
! from int8_tbl t1 left join (select * from int8_tbl) t2 on (t1.q2 = t2.q1)
! group by t1.q2 order by 1;
!         q2         | count 
! -------------------+-------
!  -4567890123456789 |     0
!                123 |     2
!                456 |     0
!   4567890123456789 |     6
! (4 rows)
! 
! select t1.q2, count(t2.*)
! from int8_tbl t1 left join (select * from int8_tbl offset 0) t2 on (t1.q2 = t2.q1)
! group by t1.q2 order by 1;
!         q2         | count 
! -------------------+-------
!  -4567890123456789 |     0
!                123 |     2
!                456 |     0
!   4567890123456789 |     6
! (4 rows)
! 
! select t1.q2, count(t2.*)
! from int8_tbl t1 left join
!   (select q1, case when q2=1 then 1 else q2 end as q2 from int8_tbl) t2
!   on (t1.q2 = t2.q1)
! group by t1.q2 order by 1;
!         q2         | count 
! -------------------+-------
!  -4567890123456789 |     0
!                123 |     2
!                456 |     0
!   4567890123456789 |     6
! (4 rows)
! 
! --
! -- test incorrect failure to NULL pulled-up subexpressions
! --
! begin;
! create temp table a (
!      code char not null,
!      constraint a_pk primary key (code)
! );
! create temp table b (
!      a char not null,
!      num integer not null,
!      constraint b_pk primary key (a, num)
! );
! create temp table c (
!      name char not null,
!      a char,
!      constraint c_pk primary key (name)
! );
! insert into a (code) values ('p');
! insert into a (code) values ('q');
! insert into b (a, num) values ('p', 1);
! insert into b (a, num) values ('p', 2);
! insert into c (name, a) values ('A', 'p');
! insert into c (name, a) values ('B', 'q');
! insert into c (name, a) values ('C', null);
! select c.name, ss.code, ss.b_cnt, ss.const
! from c left join
!   (select a.code, coalesce(b_grp.cnt, 0) as b_cnt, -1 as const
!    from a left join
!      (select count(1) as cnt, b.a from b group by b.a) as b_grp
!      on a.code = b_grp.a
!   ) as ss
!   on (c.a = ss.code)
! order by c.name;
!  name | code | b_cnt | const 
! ------+------+-------+-------
!  A    | p    |     2 |    -1
!  B    | q    |     0 |    -1
!  C    |      |       |      
! (3 rows)
! 
! rollback;
! --
! -- test incorrect handling of placeholders that only appear in targetlists,
! -- per bug #6154
! --
! SELECT * FROM
! ( SELECT 1 as key1 ) sub1
! LEFT JOIN
! ( SELECT sub3.key3, sub4.value2, COALESCE(sub4.value2, 66) as value3 FROM
!     ( SELECT 1 as key3 ) sub3
!     LEFT JOIN
!     ( SELECT sub5.key5, COALESCE(sub6.value1, 1) as value2 FROM
!         ( SELECT 1 as key5 ) sub5
!         LEFT JOIN
!         ( SELECT 2 as key6, 42 as value1 ) sub6
!         ON sub5.key5 = sub6.key6
!     ) sub4
!     ON sub4.key5 = sub3.key3
! ) sub2
! ON sub1.key1 = sub2.key3;
!  key1 | key3 | value2 | value3 
! ------+------+--------+--------
!     1 |    1 |      1 |      1
! (1 row)
! 
! -- test the path using join aliases, too
! SELECT * FROM
! ( SELECT 1 as key1 ) sub1
! LEFT JOIN
! ( SELECT sub3.key3, value2, COALESCE(value2, 66) as value3 FROM
!     ( SELECT 1 as key3 ) sub3
!     LEFT JOIN
!     ( SELECT sub5.key5, COALESCE(sub6.value1, 1) as value2 FROM
!         ( SELECT 1 as key5 ) sub5
!         LEFT JOIN
!         ( SELECT 2 as key6, 42 as value1 ) sub6
!         ON sub5.key5 = sub6.key6
!     ) sub4
!     ON sub4.key5 = sub3.key3
! ) sub2
! ON sub1.key1 = sub2.key3;
!  key1 | key3 | value2 | value3 
! ------+------+--------+--------
!     1 |    1 |      1 |      1
! (1 row)
! 
! --
! -- test case where a PlaceHolderVar is used as a nestloop parameter
! --
! EXPLAIN (COSTS OFF)
! SELECT qq, unique1
!   FROM
!   ( SELECT COALESCE(q1, 0) AS qq FROM int8_tbl a ) AS ss1
!   FULL OUTER JOIN
!   ( SELECT COALESCE(q2, -1) AS qq FROM int8_tbl b ) AS ss2
!   USING (qq)
!   INNER JOIN tenk1 c ON qq = unique2;
!                                               QUERY PLAN                                               
! -------------------------------------------------------------------------------------------------------
!  Nested Loop
!    ->  Hash Full Join
!          Hash Cond: (COALESCE(a.q1, 0::bigint) = COALESCE(b.q2, (-1)::bigint))
!          ->  Seq Scan on int8_tbl a
!          ->  Hash
!                ->  Seq Scan on int8_tbl b
!    ->  Index Scan using tenk1_unique2 on tenk1 c
!          Index Cond: (unique2 = COALESCE((COALESCE(a.q1, 0::bigint)), (COALESCE(b.q2, (-1)::bigint))))
! (8 rows)
! 
! SELECT qq, unique1
!   FROM
!   ( SELECT COALESCE(q1, 0) AS qq FROM int8_tbl a ) AS ss1
!   FULL OUTER JOIN
!   ( SELECT COALESCE(q2, -1) AS qq FROM int8_tbl b ) AS ss2
!   USING (qq)
!   INNER JOIN tenk1 c ON qq = unique2;
!  qq  | unique1 
! -----+---------
!  123 |    4596
!  123 |    4596
!  456 |    7318
! (3 rows)
! 
! --
! -- nested nestloops can require nested PlaceHolderVars
! --
! create temp table nt1 (
!   id int primary key,
!   a1 boolean,
!   a2 boolean
! );
! create temp table nt2 (
!   id int primary key,
!   nt1_id int,
!   b1 boolean,
!   b2 boolean,
!   foreign key (nt1_id) references nt1(id)
! );
! create temp table nt3 (
!   id int primary key,
!   nt2_id int,
!   c1 boolean,
!   foreign key (nt2_id) references nt2(id)
! );
! insert into nt1 values (1,true,true);
! insert into nt1 values (2,true,false);
! insert into nt1 values (3,false,false);
! insert into nt2 values (1,1,true,true);
! insert into nt2 values (2,2,true,false);
! insert into nt2 values (3,3,false,false);
! insert into nt3 values (1,1,true);
! insert into nt3 values (2,2,false);
! insert into nt3 values (3,3,true);
! explain (costs off)
! select nt3.id
! from nt3 as nt3
!   left join
!     (select nt2.*, (nt2.b1 and ss1.a3) AS b3
!      from nt2 as nt2
!        left join
!          (select nt1.*, (nt1.id is not null) as a3 from nt1) as ss1
!          on ss1.id = nt2.nt1_id
!     ) as ss2
!     on ss2.id = nt3.nt2_id
! where nt3.id = 1 and ss2.b3;
!                   QUERY PLAN                   
! -----------------------------------------------
!  Nested Loop
!    ->  Nested Loop
!          ->  Index Scan using nt3_pkey on nt3
!                Index Cond: (id = 1)
!          ->  Index Scan using nt2_pkey on nt2
!                Index Cond: (id = nt3.nt2_id)
!    ->  Index Only Scan using nt1_pkey on nt1
!          Index Cond: (id = nt2.nt1_id)
!          Filter: (nt2.b1 AND (id IS NOT NULL))
! (9 rows)
! 
! select nt3.id
! from nt3 as nt3
!   left join
!     (select nt2.*, (nt2.b1 and ss1.a3) AS b3
!      from nt2 as nt2
!        left join
!          (select nt1.*, (nt1.id is not null) as a3 from nt1) as ss1
!          on ss1.id = nt2.nt1_id
!     ) as ss2
!     on ss2.id = nt3.nt2_id
! where nt3.id = 1 and ss2.b3;
!  id 
! ----
!   1
! (1 row)
! 
! --
! -- test case where a PlaceHolderVar is propagated into a subquery
! --
! explain (costs off)
! select * from
!   int8_tbl t1 left join
!   (select q1 as x, 42 as y from int8_tbl t2) ss
!   on t1.q2 = ss.x
! where
!   1 = (select 1 from int8_tbl t3 where ss.y is not null limit 1)
! order by 1,2;
!                         QUERY PLAN                         
! -----------------------------------------------------------
!  Sort
!    Sort Key: t1.q1, t1.q2
!    ->  Hash Left Join
!          Hash Cond: (t1.q2 = t2.q1)
!          Filter: (1 = (SubPlan 1))
!          ->  Seq Scan on int8_tbl t1
!          ->  Hash
!                ->  Seq Scan on int8_tbl t2
!          SubPlan 1
!            ->  Limit
!                  ->  Result
!                        One-Time Filter: ((42) IS NOT NULL)
!                        ->  Seq Scan on int8_tbl t3
! (13 rows)
! 
! select * from
!   int8_tbl t1 left join
!   (select q1 as x, 42 as y from int8_tbl t2) ss
!   on t1.q2 = ss.x
! where
!   1 = (select 1 from int8_tbl t3 where ss.y is not null limit 1)
! order by 1,2;
!         q1        |        q2        |        x         | y  
! ------------------+------------------+------------------+----
!               123 | 4567890123456789 | 4567890123456789 | 42
!               123 | 4567890123456789 | 4567890123456789 | 42
!               123 | 4567890123456789 | 4567890123456789 | 42
!  4567890123456789 |              123 |              123 | 42
!  4567890123456789 |              123 |              123 | 42
!  4567890123456789 | 4567890123456789 | 4567890123456789 | 42
!  4567890123456789 | 4567890123456789 | 4567890123456789 | 42
!  4567890123456789 | 4567890123456789 | 4567890123456789 | 42
! (8 rows)
! 
! --
! -- test the corner cases FULL JOIN ON TRUE and FULL JOIN ON FALSE
! --
! select * from int4_tbl a full join int4_tbl b on true;
!      f1      |     f1      
! -------------+-------------
!            0 |           0
!            0 |      123456
!            0 |     -123456
!            0 |  2147483647
!            0 | -2147483647
!       123456 |           0
!       123456 |      123456
!       123456 |     -123456
!       123456 |  2147483647
!       123456 | -2147483647
!      -123456 |           0
!      -123456 |      123456
!      -123456 |     -123456
!      -123456 |  2147483647
!      -123456 | -2147483647
!   2147483647 |           0
!   2147483647 |      123456
!   2147483647 |     -123456
!   2147483647 |  2147483647
!   2147483647 | -2147483647
!  -2147483647 |           0
!  -2147483647 |      123456
!  -2147483647 |     -123456
!  -2147483647 |  2147483647
!  -2147483647 | -2147483647
! (25 rows)
! 
! select * from int4_tbl a full join int4_tbl b on false;
!      f1      |     f1      
! -------------+-------------
!              |           0
!              |      123456
!              |     -123456
!              |  2147483647
!              | -2147483647
!            0 |            
!       123456 |            
!      -123456 |            
!   2147483647 |            
!  -2147483647 |            
! (10 rows)
! 
! --
! -- test for ability to use a cartesian join when necessary
! --
! explain (costs off)
! select * from
!   tenk1 join int4_tbl on f1 = twothousand,
!   int4(sin(1)) q1,
!   int4(sin(0)) q2
! where q1 = thousand or q2 = thousand;
!                                QUERY PLAN                               
! ------------------------------------------------------------------------
!  Hash Join
!    Hash Cond: (tenk1.twothousand = int4_tbl.f1)
!    ->  Nested Loop
!          ->  Nested Loop
!                ->  Function Scan on q1
!                ->  Function Scan on q2
!          ->  Bitmap Heap Scan on tenk1
!                Recheck Cond: ((q1.q1 = thousand) OR (q2.q2 = thousand))
!                ->  BitmapOr
!                      ->  Bitmap Index Scan on tenk1_thous_tenthous
!                            Index Cond: (q1.q1 = thousand)
!                      ->  Bitmap Index Scan on tenk1_thous_tenthous
!                            Index Cond: (q2.q2 = thousand)
!    ->  Hash
!          ->  Seq Scan on int4_tbl
! (15 rows)
! 
! explain (costs off)
! select * from
!   tenk1 join int4_tbl on f1 = twothousand,
!   int4(sin(1)) q1,
!   int4(sin(0)) q2
! where thousand = (q1 + q2);
!                           QUERY PLAN                          
! --------------------------------------------------------------
!  Hash Join
!    Hash Cond: (tenk1.twothousand = int4_tbl.f1)
!    ->  Nested Loop
!          ->  Nested Loop
!                ->  Function Scan on q1
!                ->  Function Scan on q2
!          ->  Bitmap Heap Scan on tenk1
!                Recheck Cond: (thousand = (q1.q1 + q2.q2))
!                ->  Bitmap Index Scan on tenk1_thous_tenthous
!                      Index Cond: (thousand = (q1.q1 + q2.q2))
!    ->  Hash
!          ->  Seq Scan on int4_tbl
! (12 rows)
! 
! --
! -- test extraction of restriction OR clauses from join OR clause
! -- (we used to only do this for indexable clauses)
! --
! explain (costs off)
! select * from tenk1 a join tenk1 b on
!   (a.unique1 = 1 and b.unique1 = 2) or (a.unique2 = 3 and b.hundred = 4);
!                                            QUERY PLAN                                            
! -------------------------------------------------------------------------------------------------
!  Nested Loop
!    Join Filter: (((a.unique1 = 1) AND (b.unique1 = 2)) OR ((a.unique2 = 3) AND (b.hundred = 4)))
!    ->  Bitmap Heap Scan on tenk1 b
!          Recheck Cond: ((unique1 = 2) OR (hundred = 4))
!          ->  BitmapOr
!                ->  Bitmap Index Scan on tenk1_unique1
!                      Index Cond: (unique1 = 2)
!                ->  Bitmap Index Scan on tenk1_hundred
!                      Index Cond: (hundred = 4)
!    ->  Materialize
!          ->  Bitmap Heap Scan on tenk1 a
!                Recheck Cond: ((unique1 = 1) OR (unique2 = 3))
!                ->  BitmapOr
!                      ->  Bitmap Index Scan on tenk1_unique1
!                            Index Cond: (unique1 = 1)
!                      ->  Bitmap Index Scan on tenk1_unique2
!                            Index Cond: (unique2 = 3)
! (17 rows)
! 
! explain (costs off)
! select * from tenk1 a join tenk1 b on
!   (a.unique1 = 1 and b.unique1 = 2) or (a.unique2 = 3 and b.ten = 4);
!                                          QUERY PLAN                                          
! ---------------------------------------------------------------------------------------------
!  Nested Loop
!    Join Filter: (((a.unique1 = 1) AND (b.unique1 = 2)) OR ((a.unique2 = 3) AND (b.ten = 4)))
!    ->  Seq Scan on tenk1 b
!          Filter: ((unique1 = 2) OR (ten = 4))
!    ->  Materialize
!          ->  Bitmap Heap Scan on tenk1 a
!                Recheck Cond: ((unique1 = 1) OR (unique2 = 3))
!                ->  BitmapOr
!                      ->  Bitmap Index Scan on tenk1_unique1
!                            Index Cond: (unique1 = 1)
!                      ->  Bitmap Index Scan on tenk1_unique2
!                            Index Cond: (unique2 = 3)
! (12 rows)
! 
! explain (costs off)
! select * from tenk1 a join tenk1 b on
!   (a.unique1 = 1 and b.unique1 = 2) or
!   ((a.unique2 = 3 or a.unique2 = 7) and b.hundred = 4);
!                                                       QUERY PLAN                                                      
! ----------------------------------------------------------------------------------------------------------------------
!  Nested Loop
!    Join Filter: (((a.unique1 = 1) AND (b.unique1 = 2)) OR (((a.unique2 = 3) OR (a.unique2 = 7)) AND (b.hundred = 4)))
!    ->  Bitmap Heap Scan on tenk1 b
!          Recheck Cond: ((unique1 = 2) OR (hundred = 4))
!          ->  BitmapOr
!                ->  Bitmap Index Scan on tenk1_unique1
!                      Index Cond: (unique1 = 2)
!                ->  Bitmap Index Scan on tenk1_hundred
!                      Index Cond: (hundred = 4)
!    ->  Materialize
!          ->  Bitmap Heap Scan on tenk1 a
!                Recheck Cond: ((unique1 = 1) OR (unique2 = 3) OR (unique2 = 7))
!                ->  BitmapOr
!                      ->  Bitmap Index Scan on tenk1_unique1
!                            Index Cond: (unique1 = 1)
!                      ->  Bitmap Index Scan on tenk1_unique2
!                            Index Cond: (unique2 = 3)
!                      ->  Bitmap Index Scan on tenk1_unique2
!                            Index Cond: (unique2 = 7)
! (19 rows)
! 
! --
! -- test placement of movable quals in a parameterized join tree
! --
! explain (costs off)
! select * from tenk1 t1 left join
!   (tenk1 t2 join tenk1 t3 on t2.thousand = t3.unique2)
!   on t1.hundred = t2.hundred and t1.ten = t3.ten
! where t1.unique1 = 1;
!                        QUERY PLAN                       
! --------------------------------------------------------
!  Nested Loop Left Join
!    ->  Index Scan using tenk1_unique1 on tenk1 t1
!          Index Cond: (unique1 = 1)
!    ->  Nested Loop
!          Join Filter: (t1.ten = t3.ten)
!          ->  Bitmap Heap Scan on tenk1 t2
!                Recheck Cond: (t1.hundred = hundred)
!                ->  Bitmap Index Scan on tenk1_hundred
!                      Index Cond: (t1.hundred = hundred)
!          ->  Index Scan using tenk1_unique2 on tenk1 t3
!                Index Cond: (unique2 = t2.thousand)
! (11 rows)
! 
! explain (costs off)
! select * from tenk1 t1 left join
!   (tenk1 t2 join tenk1 t3 on t2.thousand = t3.unique2)
!   on t1.hundred = t2.hundred and t1.ten + t2.ten = t3.ten
! where t1.unique1 = 1;
!                        QUERY PLAN                       
! --------------------------------------------------------
!  Nested Loop Left Join
!    ->  Index Scan using tenk1_unique1 on tenk1 t1
!          Index Cond: (unique1 = 1)
!    ->  Nested Loop
!          Join Filter: ((t1.ten + t2.ten) = t3.ten)
!          ->  Bitmap Heap Scan on tenk1 t2
!                Recheck Cond: (t1.hundred = hundred)
!                ->  Bitmap Index Scan on tenk1_hundred
!                      Index Cond: (t1.hundred = hundred)
!          ->  Index Scan using tenk1_unique2 on tenk1 t3
!                Index Cond: (unique2 = t2.thousand)
! (11 rows)
! 
! explain (costs off)
! select count(*) from
!   tenk1 a join tenk1 b on a.unique1 = b.unique2
!   left join tenk1 c on a.unique2 = b.unique1 and c.thousand = a.thousand
!   join int4_tbl on b.thousand = f1;
!                                QUERY PLAN                                
! -------------------------------------------------------------------------
!  Aggregate
!    ->  Nested Loop Left Join
!          Join Filter: (a.unique2 = b.unique1)
!          ->  Nested Loop
!                ->  Nested Loop
!                      ->  Seq Scan on int4_tbl
!                      ->  Bitmap Heap Scan on tenk1 b
!                            Recheck Cond: (thousand = int4_tbl.f1)
!                            ->  Bitmap Index Scan on tenk1_thous_tenthous
!                                  Index Cond: (thousand = int4_tbl.f1)
!                ->  Index Scan using tenk1_unique1 on tenk1 a
!                      Index Cond: (unique1 = b.unique2)
!          ->  Index Only Scan using tenk1_thous_tenthous on tenk1 c
!                Index Cond: (thousand = a.thousand)
! (14 rows)
! 
! select count(*) from
!   tenk1 a join tenk1 b on a.unique1 = b.unique2
!   left join tenk1 c on a.unique2 = b.unique1 and c.thousand = a.thousand
!   join int4_tbl on b.thousand = f1;
!  count 
! -------
!     10
! (1 row)
! 
! explain (costs off)
! select b.unique1 from
!   tenk1 a join tenk1 b on a.unique1 = b.unique2
!   left join tenk1 c on b.unique1 = 42 and c.thousand = a.thousand
!   join int4_tbl i1 on b.thousand = f1
!   right join int4_tbl i2 on i2.f1 = b.tenthous
!   order by 1;
!                                        QUERY PLAN                                        
! -----------------------------------------------------------------------------------------
!  Sort
!    Sort Key: b.unique1
!    ->  Nested Loop Left Join
!          ->  Seq Scan on int4_tbl i2
!          ->  Nested Loop Left Join
!                Join Filter: (b.unique1 = 42)
!                ->  Nested Loop
!                      ->  Nested Loop
!                            ->  Seq Scan on int4_tbl i1
!                            ->  Index Scan using tenk1_thous_tenthous on tenk1 b
!                                  Index Cond: ((thousand = i1.f1) AND (i2.f1 = tenthous))
!                      ->  Index Scan using tenk1_unique1 on tenk1 a
!                            Index Cond: (unique1 = b.unique2)
!                ->  Index Only Scan using tenk1_thous_tenthous on tenk1 c
!                      Index Cond: (thousand = a.thousand)
! (15 rows)
! 
! select b.unique1 from
!   tenk1 a join tenk1 b on a.unique1 = b.unique2
!   left join tenk1 c on b.unique1 = 42 and c.thousand = a.thousand
!   join int4_tbl i1 on b.thousand = f1
!   right join int4_tbl i2 on i2.f1 = b.tenthous
!   order by 1;
!  unique1 
! ---------
!        0
!         
!         
!         
!         
! (5 rows)
! 
! explain (costs off)
! select * from
! (
!   select unique1, q1, coalesce(unique1, -1) + q1 as fault
!   from int8_tbl left join tenk1 on (q2 = unique2)
! ) ss
! where fault = 122
! order by fault;
!                            QUERY PLAN                            
! -----------------------------------------------------------------
!  Nested Loop Left Join
!    Filter: ((COALESCE(tenk1.unique1, (-1)) + int8_tbl.q1) = 122)
!    ->  Seq Scan on int8_tbl
!    ->  Index Scan using tenk1_unique2 on tenk1
!          Index Cond: (int8_tbl.q2 = unique2)
! (5 rows)
! 
! select * from
! (
!   select unique1, q1, coalesce(unique1, -1) + q1 as fault
!   from int8_tbl left join tenk1 on (q2 = unique2)
! ) ss
! where fault = 122
! order by fault;
!  unique1 | q1  | fault 
! ---------+-----+-------
!          | 123 |   122
! (1 row)
! 
! --
! -- test handling of potential equivalence clauses above outer joins
! --
! explain (costs off)
! select q1, unique2, thousand, hundred
!   from int8_tbl a left join tenk1 b on q1 = unique2
!   where coalesce(thousand,123) = q1 and q1 = coalesce(hundred,123);
!                                       QUERY PLAN                                      
! --------------------------------------------------------------------------------------
!  Nested Loop Left Join
!    Filter: ((COALESCE(b.thousand, 123) = a.q1) AND (a.q1 = COALESCE(b.hundred, 123)))
!    ->  Seq Scan on int8_tbl a
!    ->  Index Scan using tenk1_unique2 on tenk1 b
!          Index Cond: (a.q1 = unique2)
! (5 rows)
! 
! select q1, unique2, thousand, hundred
!   from int8_tbl a left join tenk1 b on q1 = unique2
!   where coalesce(thousand,123) = q1 and q1 = coalesce(hundred,123);
!  q1 | unique2 | thousand | hundred 
! ----+---------+----------+---------
! (0 rows)
! 
! explain (costs off)
! select f1, unique2, case when unique2 is null then f1 else 0 end
!   from int4_tbl a left join tenk1 b on f1 = unique2
!   where (case when unique2 is null then f1 else 0 end) = 0;
!                              QUERY PLAN                             
! --------------------------------------------------------------------
!  Nested Loop Left Join
!    Filter: (CASE WHEN (b.unique2 IS NULL) THEN a.f1 ELSE 0 END = 0)
!    ->  Seq Scan on int4_tbl a
!    ->  Index Only Scan using tenk1_unique2 on tenk1 b
!          Index Cond: (unique2 = a.f1)
! (5 rows)
! 
! select f1, unique2, case when unique2 is null then f1 else 0 end
!   from int4_tbl a left join tenk1 b on f1 = unique2
!   where (case when unique2 is null then f1 else 0 end) = 0;
!  f1 | unique2 | case 
! ----+---------+------
!   0 |       0 |    0
! (1 row)
! 
! --
! -- another case with equivalence clauses above outer joins (bug #8591)
! --
! explain (costs off)
! select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
!   from tenk1 a left join tenk1 b on b.thousand = a.unique1                        left join tenk1 c on c.unique2 = coalesce(b.twothousand, a.twothousand)
!   where a.unique2 = 5530 and coalesce(b.twothousand, a.twothousand) = 44;
!                                          QUERY PLAN                                          
! ---------------------------------------------------------------------------------------------
!  Nested Loop Left Join
!    ->  Nested Loop Left Join
!          Filter: (COALESCE(b.twothousand, a.twothousand) = 44)
!          ->  Index Scan using tenk1_unique2 on tenk1 a
!                Index Cond: (unique2 = 5530)
!          ->  Bitmap Heap Scan on tenk1 b
!                Recheck Cond: (thousand = a.unique1)
!                ->  Bitmap Index Scan on tenk1_thous_tenthous
!                      Index Cond: (thousand = a.unique1)
!    ->  Index Scan using tenk1_unique2 on tenk1 c
!          Index Cond: ((unique2 = COALESCE(b.twothousand, a.twothousand)) AND (unique2 = 44))
! (11 rows)
! 
! select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
!   from tenk1 a left join tenk1 b on b.thousand = a.unique1                        left join tenk1 c on c.unique2 = coalesce(b.twothousand, a.twothousand)
!   where a.unique2 = 5530 and coalesce(b.twothousand, a.twothousand) = 44;
!  unique1 | unique1 | unique1 | coalesce 
! ---------+---------+---------+----------
! (0 rows)
! 
! --
! -- check handling of join aliases when flattening multiple levels of subquery
! --
! explain (verbose, costs off)
! select foo1.join_key as foo1_id, foo3.join_key AS foo3_id, bug_field from
!   (values (0),(1)) foo1(join_key)
! left join
!   (select join_key, bug_field from
!     (select ss1.join_key, ss1.bug_field from
!       (select f1 as join_key, 666 as bug_field from int4_tbl i1) ss1
!     ) foo2
!    left join
!     (select unique2 as join_key from tenk1 i2) ss2
!    using (join_key)
!   ) foo3
! using (join_key);
!                                 QUERY PLAN                                
! --------------------------------------------------------------------------
!  Nested Loop Left Join
!    Output: "*VALUES*".column1, i1.f1, (666)
!    Join Filter: ("*VALUES*".column1 = i1.f1)
!    ->  Values Scan on "*VALUES*"
!          Output: "*VALUES*".column1
!    ->  Materialize
!          Output: i1.f1, (666)
!          ->  Nested Loop Left Join
!                Output: i1.f1, 666
!                ->  Seq Scan on public.int4_tbl i1
!                      Output: i1.f1
!                ->  Index Only Scan using tenk1_unique2 on public.tenk1 i2
!                      Output: i2.unique2
!                      Index Cond: (i2.unique2 = i1.f1)
! (14 rows)
! 
! select foo1.join_key as foo1_id, foo3.join_key AS foo3_id, bug_field from
!   (values (0),(1)) foo1(join_key)
! left join
!   (select join_key, bug_field from
!     (select ss1.join_key, ss1.bug_field from
!       (select f1 as join_key, 666 as bug_field from int4_tbl i1) ss1
!     ) foo2
!    left join
!     (select unique2 as join_key from tenk1 i2) ss2
!    using (join_key)
!   ) foo3
! using (join_key);
!  foo1_id | foo3_id | bug_field 
! ---------+---------+-----------
!        0 |       0 |       666
!        1 |         |          
! (2 rows)
! 
! --
! -- test ability to push constants through outer join clauses
! --
! explain (costs off)
!   select * from int4_tbl a left join tenk1 b on f1 = unique2 where f1 = 0;
!                    QUERY PLAN                    
! -------------------------------------------------
!  Nested Loop Left Join
!    Join Filter: (a.f1 = b.unique2)
!    ->  Seq Scan on int4_tbl a
!          Filter: (f1 = 0)
!    ->  Index Scan using tenk1_unique2 on tenk1 b
!          Index Cond: (unique2 = 0)
! (6 rows)
! 
! explain (costs off)
!   select * from tenk1 a full join tenk1 b using(unique2) where unique2 = 42;
!                    QUERY PLAN                    
! -------------------------------------------------
!  Merge Full Join
!    Merge Cond: (a.unique2 = b.unique2)
!    ->  Index Scan using tenk1_unique2 on tenk1 a
!          Index Cond: (unique2 = 42)
!    ->  Index Scan using tenk1_unique2 on tenk1 b
!          Index Cond: (unique2 = 42)
! (6 rows)
! 
! --
! -- test join removal
! --
! begin;
! CREATE TEMP TABLE a (id int PRIMARY KEY, b_id int);
! CREATE TEMP TABLE b (id int PRIMARY KEY, c_id int);
! CREATE TEMP TABLE c (id int PRIMARY KEY);
! CREATE TEMP TABLE d (a int, b int);
! INSERT INTO a VALUES (0, 0), (1, NULL);
! INSERT INTO b VALUES (0, 0), (1, NULL);
! INSERT INTO c VALUES (0), (1);
! INSERT INTO d VALUES (1,3), (2,2), (3,1);
! -- all three cases should be optimizable into a simple seqscan
! explain (costs off) SELECT a.* FROM a LEFT JOIN b ON a.b_id = b.id;
!   QUERY PLAN   
! ---------------
!  Seq Scan on a
! (1 row)
! 
! explain (costs off) SELECT b.* FROM b LEFT JOIN c ON b.c_id = c.id;
!   QUERY PLAN   
! ---------------
!  Seq Scan on b
! (1 row)
! 
! explain (costs off)
!   SELECT a.* FROM a LEFT JOIN (b left join c on b.c_id = c.id)
!   ON (a.b_id = b.id);
!   QUERY PLAN   
! ---------------
!  Seq Scan on a
! (1 row)
! 
! -- check optimization of outer join within another special join
! explain (costs off)
! select id from a where id in (
! 	select b.id from b left join c on b.id = c.id
! );
!          QUERY PLAN         
! ----------------------------
!  Hash Semi Join
!    Hash Cond: (a.id = b.id)
!    ->  Seq Scan on a
!    ->  Hash
!          ->  Seq Scan on b
! (5 rows)
! 
! -- check that join removal works for a left join when joining a subquery
! -- that is guaranteed to be unique by its GROUP BY clause
! explain (costs off)
! select d.* from d left join (select * from b group by b.id, b.c_id) s
!   on d.a = s.id and d.b = s.c_id;
!   QUERY PLAN   
! ---------------
!  Seq Scan on d
! (1 row)
! 
! -- similarly, but keying off a DISTINCT clause
! explain (costs off)
! select d.* from d left join (select distinct * from b) s
!   on d.a = s.id and d.b = s.c_id;
!   QUERY PLAN   
! ---------------
!  Seq Scan on d
! (1 row)
! 
! -- join removal is not possible when the GROUP BY contains a column that is
! -- not in the join condition
! explain (costs off)
! select d.* from d left join (select * from b group by b.id, b.c_id) s
!   on d.a = s.id;
!                  QUERY PLAN                  
! ---------------------------------------------
!  Merge Left Join
!    Merge Cond: (d.a = s.id)
!    ->  Sort
!          Sort Key: d.a
!          ->  Seq Scan on d
!    ->  Sort
!          Sort Key: s.id
!          ->  Subquery Scan on s
!                ->  HashAggregate
!                      Group Key: b.id, b.c_id
!                      ->  Seq Scan on b
! (11 rows)
! 
! -- similarly, but keying off a DISTINCT clause
! explain (costs off)
! select d.* from d left join (select distinct * from b) s
!   on d.a = s.id;
!                  QUERY PLAN                  
! ---------------------------------------------
!  Merge Left Join
!    Merge Cond: (d.a = s.id)
!    ->  Sort
!          Sort Key: d.a
!          ->  Seq Scan on d
!    ->  Sort
!          Sort Key: s.id
!          ->  Subquery Scan on s
!                ->  HashAggregate
!                      Group Key: b.id, b.c_id
!                      ->  Seq Scan on b
! (11 rows)
! 
! -- check join removal works when uniqueness of the join condition is enforced
! -- by a UNION
! explain (costs off)
! select d.* from d left join (select id from a union select id from b) s
!   on d.a = s.id;
!   QUERY PLAN   
! ---------------
!  Seq Scan on d
! (1 row)
! 
! -- check join removal with a cross-type comparison operator
! explain (costs off)
! select i8.* from int8_tbl i8 left join (select f1 from int4_tbl group by f1) i4
!   on i8.q1 = i4.f1;
!        QUERY PLAN        
! -------------------------
!  Seq Scan on int8_tbl i8
! (1 row)
! 
! rollback;
! create temp table parent (k int primary key, pd int);
! create temp table child (k int unique, cd int);
! insert into parent values (1, 10), (2, 20), (3, 30);
! insert into child values (1, 100), (4, 400);
! -- this case is optimizable
! select p.* from parent p left join child c on (p.k = c.k);
!  k | pd 
! ---+----
!  1 | 10
!  2 | 20
!  3 | 30
! (3 rows)
! 
! explain (costs off)
!   select p.* from parent p left join child c on (p.k = c.k);
!       QUERY PLAN      
! ----------------------
!  Seq Scan on parent p
! (1 row)
! 
! -- this case is not
! select p.*, linked from parent p
!   left join (select c.*, true as linked from child c) as ss
!   on (p.k = ss.k);
!  k | pd | linked 
! ---+----+--------
!  1 | 10 | t
!  2 | 20 | 
!  3 | 30 | 
! (3 rows)
! 
! explain (costs off)
!   select p.*, linked from parent p
!     left join (select c.*, true as linked from child c) as ss
!     on (p.k = ss.k);
!            QUERY PLAN            
! ---------------------------------
!  Hash Left Join
!    Hash Cond: (p.k = c.k)
!    ->  Seq Scan on parent p
!    ->  Hash
!          ->  Seq Scan on child c
! (5 rows)
! 
! -- check for a 9.0rc1 bug: join removal breaks pseudoconstant qual handling
! select p.* from
!   parent p left join child c on (p.k = c.k)
!   where p.k = 1 and p.k = 2;
!  k | pd 
! ---+----
! (0 rows)
! 
! explain (costs off)
! select p.* from
!   parent p left join child c on (p.k = c.k)
!   where p.k = 1 and p.k = 2;
!                    QUERY PLAN                   
! ------------------------------------------------
!  Result
!    One-Time Filter: false
!    ->  Index Scan using parent_pkey on parent p
!          Index Cond: (k = 1)
! (4 rows)
! 
! select p.* from
!   (parent p left join child c on (p.k = c.k)) join parent x on p.k = x.k
!   where p.k = 1 and p.k = 2;
!  k | pd 
! ---+----
! (0 rows)
! 
! explain (costs off)
! select p.* from
!   (parent p left join child c on (p.k = c.k)) join parent x on p.k = x.k
!   where p.k = 1 and p.k = 2;
!         QUERY PLAN        
! --------------------------
!  Result
!    One-Time Filter: false
! (2 rows)
! 
! -- bug 5255: this is not optimizable by join removal
! begin;
! CREATE TEMP TABLE a (id int PRIMARY KEY);
! CREATE TEMP TABLE b (id int PRIMARY KEY, a_id int);
! INSERT INTO a VALUES (0), (1);
! INSERT INTO b VALUES (0, 0), (1, NULL);
! SELECT * FROM b LEFT JOIN a ON (b.a_id = a.id) WHERE (a.id IS NULL OR a.id > 0);
!  id | a_id | id 
! ----+------+----
!   1 |      |   
! (1 row)
! 
! SELECT b.* FROM b LEFT JOIN a ON (b.a_id = a.id) WHERE (a.id IS NULL OR a.id > 0);
!  id | a_id 
! ----+------
!   1 |     
! (1 row)
! 
! rollback;
! -- another join removal bug: this is not optimizable, either
! begin;
! create temp table innertab (id int8 primary key, dat1 int8);
! insert into innertab values(123, 42);
! SELECT * FROM
!     (SELECT 1 AS x) ss1
!   LEFT JOIN
!     (SELECT q1, q2, COALESCE(dat1, q1) AS y
!      FROM int8_tbl LEFT JOIN innertab ON q2 = id) ss2
!   ON true;
!  x |        q1        |        q2         |        y         
! ---+------------------+-------------------+------------------
!  1 |              123 |               456 |              123
!  1 |              123 |  4567890123456789 |              123
!  1 | 4567890123456789 |               123 |               42
!  1 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  1 | 4567890123456789 | -4567890123456789 | 4567890123456789
! (5 rows)
! 
! rollback;
! -- bug #8444: we've historically allowed duplicate aliases within aliased JOINs
! select * from
!   int8_tbl x join (int4_tbl x cross join int4_tbl y) j on q1 = f1; -- error
! ERROR:  column reference "f1" is ambiguous
! LINE 2: ..._tbl x join (int4_tbl x cross join int4_tbl y) j on q1 = f1;
!                                                                     ^
! select * from
!   int8_tbl x join (int4_tbl x cross join int4_tbl y) j on q1 = y.f1; -- error
! ERROR:  invalid reference to FROM-clause entry for table "y"
! LINE 2: ...bl x join (int4_tbl x cross join int4_tbl y) j on q1 = y.f1;
!                                                                   ^
! HINT:  There is an entry for table "y", but it cannot be referenced from this part of the query.
! select * from
!   int8_tbl x join (int4_tbl x cross join int4_tbl y(ff)) j on q1 = f1; -- ok
!  q1 | q2 | f1 | ff 
! ----+----+----+----
! (0 rows)
! 
! --
! -- Test LATERAL
! --
! select unique2, x.*
! from tenk1 a, lateral (select * from int4_tbl b where f1 = a.unique1) x;
!  unique2 | f1 
! ---------+----
!     9998 |  0
! (1 row)
! 
! explain (costs off)
!   select unique2, x.*
!   from tenk1 a, lateral (select * from int4_tbl b where f1 = a.unique1) x;
!                    QUERY PLAN                    
! -------------------------------------------------
!  Nested Loop
!    ->  Seq Scan on int4_tbl b
!    ->  Index Scan using tenk1_unique1 on tenk1 a
!          Index Cond: (unique1 = b.f1)
! (4 rows)
! 
! select unique2, x.*
! from int4_tbl x, lateral (select unique2 from tenk1 where f1 = unique1) ss;
!  unique2 | f1 
! ---------+----
!     9998 |  0
! (1 row)
! 
! explain (costs off)
!   select unique2, x.*
!   from int4_tbl x, lateral (select unique2 from tenk1 where f1 = unique1) ss;
!                   QUERY PLAN                   
! -----------------------------------------------
!  Nested Loop
!    ->  Seq Scan on int4_tbl x
!    ->  Index Scan using tenk1_unique1 on tenk1
!          Index Cond: (unique1 = x.f1)
! (4 rows)
! 
! explain (costs off)
!   select unique2, x.*
!   from int4_tbl x cross join lateral (select unique2 from tenk1 where f1 = unique1) ss;
!                   QUERY PLAN                   
! -----------------------------------------------
!  Nested Loop
!    ->  Seq Scan on int4_tbl x
!    ->  Index Scan using tenk1_unique1 on tenk1
!          Index Cond: (unique1 = x.f1)
! (4 rows)
! 
! select unique2, x.*
! from int4_tbl x left join lateral (select unique1, unique2 from tenk1 where f1 = unique1) ss on true;
!  unique2 |     f1      
! ---------+-------------
!     9998 |           0
!          |      123456
!          |     -123456
!          |  2147483647
!          | -2147483647
! (5 rows)
! 
! explain (costs off)
!   select unique2, x.*
!   from int4_tbl x left join lateral (select unique1, unique2 from tenk1 where f1 = unique1) ss on true;
!                   QUERY PLAN                   
! -----------------------------------------------
!  Nested Loop Left Join
!    ->  Seq Scan on int4_tbl x
!    ->  Index Scan using tenk1_unique1 on tenk1
!          Index Cond: (x.f1 = unique1)
! (4 rows)
! 
! -- check scoping of lateral versus parent references
! -- the first of these should return int8_tbl.q2, the second int8_tbl.q1
! select *, (select r from (select q1 as q2) x, (select q2 as r) y) from int8_tbl;
!         q1        |        q2         |         r         
! ------------------+-------------------+-------------------
!               123 |               456 |               456
!               123 |  4567890123456789 |  4567890123456789
!  4567890123456789 |               123 |               123
!  4567890123456789 |  4567890123456789 |  4567890123456789
!  4567890123456789 | -4567890123456789 | -4567890123456789
! (5 rows)
! 
! select *, (select r from (select q1 as q2) x, lateral (select q2 as r) y) from int8_tbl;
!         q1        |        q2         |        r         
! ------------------+-------------------+------------------
!               123 |               456 |              123
!               123 |  4567890123456789 |              123
!  4567890123456789 |               123 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 | 4567890123456789
! (5 rows)
! 
! -- lateral with function in FROM
! select count(*) from tenk1 a, lateral generate_series(1,two) g;
!  count 
! -------
!   5000
! (1 row)
! 
! explain (costs off)
!   select count(*) from tenk1 a, lateral generate_series(1,two) g;
!                    QUERY PLAN                   
! ------------------------------------------------
!  Aggregate
!    ->  Nested Loop
!          ->  Seq Scan on tenk1 a
!          ->  Function Scan on generate_series g
! (4 rows)
! 
! explain (costs off)
!   select count(*) from tenk1 a cross join lateral generate_series(1,two) g;
!                    QUERY PLAN                   
! ------------------------------------------------
!  Aggregate
!    ->  Nested Loop
!          ->  Seq Scan on tenk1 a
!          ->  Function Scan on generate_series g
! (4 rows)
! 
! -- don't need the explicit LATERAL keyword for functions
! explain (costs off)
!   select count(*) from tenk1 a, generate_series(1,two) g;
!                    QUERY PLAN                   
! ------------------------------------------------
!  Aggregate
!    ->  Nested Loop
!          ->  Seq Scan on tenk1 a
!          ->  Function Scan on generate_series g
! (4 rows)
! 
! -- lateral with UNION ALL subselect
! explain (costs off)
!   select * from generate_series(100,200) g,
!     lateral (select * from int8_tbl a where g = q1 union all
!              select * from int8_tbl b where g = q2) ss;
!                 QUERY PLAN                
! ------------------------------------------
!  Nested Loop
!    ->  Function Scan on generate_series g
!    ->  Append
!          ->  Seq Scan on int8_tbl a
!                Filter: (g.g = q1)
!          ->  Seq Scan on int8_tbl b
!                Filter: (g.g = q2)
! (7 rows)
! 
! select * from generate_series(100,200) g,
!   lateral (select * from int8_tbl a where g = q1 union all
!            select * from int8_tbl b where g = q2) ss;
!   g  |        q1        |        q2        
! -----+------------------+------------------
!  123 |              123 |              456
!  123 |              123 | 4567890123456789
!  123 | 4567890123456789 |              123
! (3 rows)
! 
! -- lateral with VALUES
! explain (costs off)
!   select count(*) from tenk1 a,
!     tenk1 b join lateral (values(a.unique1)) ss(x) on b.unique2 = ss.x;
!                             QUERY PLAN                            
! ------------------------------------------------------------------
!  Aggregate
!    ->  Hash Join
!          Hash Cond: ("*VALUES*".column1 = b.unique2)
!          ->  Nested Loop
!                ->  Index Only Scan using tenk1_unique1 on tenk1 a
!                ->  Values Scan on "*VALUES*"
!          ->  Hash
!                ->  Index Only Scan using tenk1_unique2 on tenk1 b
! (8 rows)
! 
! select count(*) from tenk1 a,
!   tenk1 b join lateral (values(a.unique1)) ss(x) on b.unique2 = ss.x;
!  count 
! -------
!  10000
! (1 row)
! 
! -- lateral injecting a strange outer join condition
! explain (costs off)
!   select * from int8_tbl a,
!     int8_tbl x left join lateral (select a.q1 from int4_tbl y) ss(z)
!       on x.q2 = ss.z;
!                 QUERY PLAN                
! ------------------------------------------
!  Nested Loop
!    ->  Seq Scan on int8_tbl a
!    ->  Hash Left Join
!          Hash Cond: (x.q2 = (a.q1))
!          ->  Seq Scan on int8_tbl x
!          ->  Hash
!                ->  Seq Scan on int4_tbl y
! (7 rows)
! 
! select * from int8_tbl a,
!   int8_tbl x left join lateral (select a.q1 from int4_tbl y) ss(z)
!     on x.q2 = ss.z;
!         q1        |        q2         |        q1        |        q2         |        z         
! ------------------+-------------------+------------------+-------------------+------------------
!               123 |               456 |              123 |               456 |                 
!               123 |               456 |              123 |  4567890123456789 |                 
!               123 |               456 | 4567890123456789 |               123 |              123
!               123 |               456 | 4567890123456789 |               123 |              123
!               123 |               456 | 4567890123456789 |               123 |              123
!               123 |               456 | 4567890123456789 |               123 |              123
!               123 |               456 | 4567890123456789 |               123 |              123
!               123 |               456 | 4567890123456789 |  4567890123456789 |                 
!               123 |               456 | 4567890123456789 | -4567890123456789 |                 
!               123 |  4567890123456789 |              123 |               456 |                 
!               123 |  4567890123456789 |              123 |  4567890123456789 |                 
!               123 |  4567890123456789 | 4567890123456789 |               123 |              123
!               123 |  4567890123456789 | 4567890123456789 |               123 |              123
!               123 |  4567890123456789 | 4567890123456789 |               123 |              123
!               123 |  4567890123456789 | 4567890123456789 |               123 |              123
!               123 |  4567890123456789 | 4567890123456789 |               123 |              123
!               123 |  4567890123456789 | 4567890123456789 |  4567890123456789 |                 
!               123 |  4567890123456789 | 4567890123456789 | -4567890123456789 |                 
!  4567890123456789 |               123 |              123 |               456 |                 
!  4567890123456789 |               123 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 |               123 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 |               123 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 |               123 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 |               123 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 |               123 | 4567890123456789 |               123 |                 
!  4567890123456789 |               123 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 |               123 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 |               123 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 |               123 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 |               123 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 |               123 | 4567890123456789 | -4567890123456789 |                 
!  4567890123456789 |  4567890123456789 |              123 |               456 |                 
!  4567890123456789 |  4567890123456789 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |               123 |                 
!  4567890123456789 |  4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 | -4567890123456789 |                 
!  4567890123456789 | -4567890123456789 |              123 |               456 |                 
!  4567890123456789 | -4567890123456789 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 |              123 |  4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 | 4567890123456789 |               123 |                 
!  4567890123456789 | -4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 | 4567890123456789 | -4567890123456789 |                 
! (57 rows)
! 
! -- lateral reference to a join alias variable
! select * from (select f1/2 as x from int4_tbl) ss1 join int4_tbl i4 on x = f1,
!   lateral (select x) ss2(y);
!  x | f1 | y 
! ---+----+---
!  0 |  0 | 0
! (1 row)
! 
! select * from (select f1 as x from int4_tbl) ss1 join int4_tbl i4 on x = f1,
!   lateral (values(x)) ss2(y);
!       x      |     f1      |      y      
! -------------+-------------+-------------
!            0 |           0 |           0
!       123456 |      123456 |      123456
!      -123456 |     -123456 |     -123456
!   2147483647 |  2147483647 |  2147483647
!  -2147483647 | -2147483647 | -2147483647
! (5 rows)
! 
! select * from ((select f1/2 as x from int4_tbl) ss1 join int4_tbl i4 on x = f1) j,
!   lateral (select x) ss2(y);
!  x | f1 | y 
! ---+----+---
!  0 |  0 | 0
! (1 row)
! 
! -- lateral references requiring pullup
! select * from (values(1)) x(lb),
!   lateral generate_series(lb,4) x4;
!  lb | x4 
! ----+----
!   1 |  1
!   1 |  2
!   1 |  3
!   1 |  4
! (4 rows)
! 
! select * from (select f1/1000000000 from int4_tbl) x(lb),
!   lateral generate_series(lb,4) x4;
!  lb | x4 
! ----+----
!   0 |  0
!   0 |  1
!   0 |  2
!   0 |  3
!   0 |  4
!   0 |  0
!   0 |  1
!   0 |  2
!   0 |  3
!   0 |  4
!   0 |  0
!   0 |  1
!   0 |  2
!   0 |  3
!   0 |  4
!   2 |  2
!   2 |  3
!   2 |  4
!  -2 | -2
!  -2 | -1
!  -2 |  0
!  -2 |  1
!  -2 |  2
!  -2 |  3
!  -2 |  4
! (25 rows)
! 
! select * from (values(1)) x(lb),
!   lateral (values(lb)) y(lbcopy);
!  lb | lbcopy 
! ----+--------
!   1 |      1
! (1 row)
! 
! select * from (values(1)) x(lb),
!   lateral (select lb from int4_tbl) y(lbcopy);
!  lb | lbcopy 
! ----+--------
!   1 |      1
!   1 |      1
!   1 |      1
!   1 |      1
!   1 |      1
! (5 rows)
! 
! select * from
!   int8_tbl x left join (select q1,coalesce(q2,0) q2 from int8_tbl) y on x.q2 = y.q1,
!   lateral (values(x.q1,y.q1,y.q2)) v(xq1,yq1,yq2);
!         q1        |        q2         |        q1        |        q2         |       xq1        |       yq1        |        yq2        
! ------------------+-------------------+------------------+-------------------+------------------+------------------+-------------------
!               123 |               456 |                  |                   |              123 |                  |                  
!               123 |  4567890123456789 | 4567890123456789 | -4567890123456789 |              123 | 4567890123456789 | -4567890123456789
!               123 |  4567890123456789 | 4567890123456789 |  4567890123456789 |              123 | 4567890123456789 |  4567890123456789
!               123 |  4567890123456789 | 4567890123456789 |               123 |              123 | 4567890123456789 |               123
!  4567890123456789 |               123 |              123 |  4567890123456789 | 4567890123456789 |              123 |  4567890123456789
!  4567890123456789 |               123 |              123 |               456 | 4567890123456789 |              123 |               456
!  4567890123456789 |  4567890123456789 | 4567890123456789 | -4567890123456789 | 4567890123456789 | 4567890123456789 | -4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789 | 4567890123456789 |  4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |               123 | 4567890123456789 | 4567890123456789 |               123
!  4567890123456789 | -4567890123456789 |                  |                   | 4567890123456789 |                  |                  
! (10 rows)
! 
! select * from
!   int8_tbl x left join (select q1,coalesce(q2,0) q2 from int8_tbl) y on x.q2 = y.q1,
!   lateral (select x.q1,y.q1,y.q2) v(xq1,yq1,yq2);
!         q1        |        q2         |        q1        |        q2         |       xq1        |       yq1        |        yq2        
! ------------------+-------------------+------------------+-------------------+------------------+------------------+-------------------
!               123 |               456 |                  |                   |              123 |                  |                  
!               123 |  4567890123456789 | 4567890123456789 | -4567890123456789 |              123 | 4567890123456789 | -4567890123456789
!               123 |  4567890123456789 | 4567890123456789 |  4567890123456789 |              123 | 4567890123456789 |  4567890123456789
!               123 |  4567890123456789 | 4567890123456789 |               123 |              123 | 4567890123456789 |               123
!  4567890123456789 |               123 |              123 |  4567890123456789 | 4567890123456789 |              123 |  4567890123456789
!  4567890123456789 |               123 |              123 |               456 | 4567890123456789 |              123 |               456
!  4567890123456789 |  4567890123456789 | 4567890123456789 | -4567890123456789 | 4567890123456789 | 4567890123456789 | -4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789 | 4567890123456789 |  4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |               123 | 4567890123456789 | 4567890123456789 |               123
!  4567890123456789 | -4567890123456789 |                  |                   | 4567890123456789 |                  |                  
! (10 rows)
! 
! select x.* from
!   int8_tbl x left join (select q1,coalesce(q2,0) q2 from int8_tbl) y on x.q2 = y.q1,
!   lateral (select x.q1,y.q1,y.q2) v(xq1,yq1,yq2);
!         q1        |        q2         
! ------------------+-------------------
!               123 |               456
!               123 |  4567890123456789
!               123 |  4567890123456789
!               123 |  4567890123456789
!  4567890123456789 |               123
!  4567890123456789 |               123
!  4567890123456789 |  4567890123456789
!  4567890123456789 |  4567890123456789
!  4567890123456789 |  4567890123456789
!  4567890123456789 | -4567890123456789
! (10 rows)
! 
! select v.* from
!   (int8_tbl x left join (select q1,coalesce(q2,0) q2 from int8_tbl) y on x.q2 = y.q1)
!   left join int4_tbl z on z.f1 = x.q2,
!   lateral (select x.q1,y.q1 union all select x.q2,y.q2) v(vx,vy);
!         vx         |        vy         
! -------------------+-------------------
!                123 |                  
!                456 |                  
!                123 |  4567890123456789
!   4567890123456789 | -4567890123456789
!                123 |  4567890123456789
!   4567890123456789 |  4567890123456789
!                123 |  4567890123456789
!   4567890123456789 |               123
!   4567890123456789 |               123
!                123 |  4567890123456789
!   4567890123456789 |               123
!                123 |               456
!   4567890123456789 |  4567890123456789
!   4567890123456789 | -4567890123456789
!   4567890123456789 |  4567890123456789
!   4567890123456789 |  4567890123456789
!   4567890123456789 |  4567890123456789
!   4567890123456789 |               123
!   4567890123456789 |                  
!  -4567890123456789 |                  
! (20 rows)
! 
! select v.* from
!   (int8_tbl x left join (select q1,(select coalesce(q2,0)) q2 from int8_tbl) y on x.q2 = y.q1)
!   left join int4_tbl z on z.f1 = x.q2,
!   lateral (select x.q1,y.q1 union all select x.q2,y.q2) v(vx,vy);
!         vx         |        vy         
! -------------------+-------------------
!                123 |                  
!                456 |                  
!                123 |  4567890123456789
!   4567890123456789 | -4567890123456789
!                123 |  4567890123456789
!   4567890123456789 |  4567890123456789
!                123 |  4567890123456789
!   4567890123456789 |               123
!   4567890123456789 |               123
!                123 |  4567890123456789
!   4567890123456789 |               123
!                123 |               456
!   4567890123456789 |  4567890123456789
!   4567890123456789 | -4567890123456789
!   4567890123456789 |  4567890123456789
!   4567890123456789 |  4567890123456789
!   4567890123456789 |  4567890123456789
!   4567890123456789 |               123
!   4567890123456789 |                  
!  -4567890123456789 |                  
! (20 rows)
! 
! create temp table dual();
! insert into dual default values;
! analyze dual;
! select v.* from
!   (int8_tbl x left join (select q1,(select coalesce(q2,0)) q2 from int8_tbl) y on x.q2 = y.q1)
!   left join int4_tbl z on z.f1 = x.q2,
!   lateral (select x.q1,y.q1 from dual union all select x.q2,y.q2 from dual) v(vx,vy);
!         vx         |        vy         
! -------------------+-------------------
!                123 |                  
!                456 |                  
!                123 |  4567890123456789
!   4567890123456789 | -4567890123456789
!                123 |  4567890123456789
!   4567890123456789 |  4567890123456789
!                123 |  4567890123456789
!   4567890123456789 |               123
!   4567890123456789 |               123
!                123 |  4567890123456789
!   4567890123456789 |               123
!                123 |               456
!   4567890123456789 |  4567890123456789
!   4567890123456789 | -4567890123456789
!   4567890123456789 |  4567890123456789
!   4567890123456789 |  4567890123456789
!   4567890123456789 |  4567890123456789
!   4567890123456789 |               123
!   4567890123456789 |                  
!  -4567890123456789 |                  
! (20 rows)
! 
! explain (verbose, costs off)
! select * from
!   int8_tbl a left join
!   lateral (select *, a.q2 as x from int8_tbl b) ss on a.q2 = ss.q1;
!                 QUERY PLAN                
! ------------------------------------------
!  Nested Loop Left Join
!    Output: a.q1, a.q2, b.q1, b.q2, (a.q2)
!    ->  Seq Scan on public.int8_tbl a
!          Output: a.q1, a.q2
!    ->  Seq Scan on public.int8_tbl b
!          Output: b.q1, b.q2, a.q2
!          Filter: (a.q2 = b.q1)
! (7 rows)
! 
! select * from
!   int8_tbl a left join
!   lateral (select *, a.q2 as x from int8_tbl b) ss on a.q2 = ss.q1;
!         q1        |        q2         |        q1        |        q2         |        x         
! ------------------+-------------------+------------------+-------------------+------------------
!               123 |               456 |                  |                   |                 
!               123 |  4567890123456789 | 4567890123456789 |               123 | 4567890123456789
!               123 |  4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!               123 |  4567890123456789 | 4567890123456789 | -4567890123456789 | 4567890123456789
!  4567890123456789 |               123 |              123 |               456 |              123
!  4567890123456789 |               123 |              123 |  4567890123456789 |              123
!  4567890123456789 |  4567890123456789 | 4567890123456789 |               123 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 | -4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 |                  |                   |                 
! (10 rows)
! 
! explain (verbose, costs off)
! select * from
!   int8_tbl a left join
!   lateral (select *, coalesce(a.q2, 42) as x from int8_tbl b) ss on a.q2 = ss.q1;
!                            QUERY PLAN                           
! ----------------------------------------------------------------
!  Nested Loop Left Join
!    Output: a.q1, a.q2, b.q1, b.q2, (COALESCE(a.q2, 42::bigint))
!    ->  Seq Scan on public.int8_tbl a
!          Output: a.q1, a.q2
!    ->  Seq Scan on public.int8_tbl b
!          Output: b.q1, b.q2, COALESCE(a.q2, 42::bigint)
!          Filter: (a.q2 = b.q1)
! (7 rows)
! 
! select * from
!   int8_tbl a left join
!   lateral (select *, coalesce(a.q2, 42) as x from int8_tbl b) ss on a.q2 = ss.q1;
!         q1        |        q2         |        q1        |        q2         |        x         
! ------------------+-------------------+------------------+-------------------+------------------
!               123 |               456 |                  |                   |                 
!               123 |  4567890123456789 | 4567890123456789 |               123 | 4567890123456789
!               123 |  4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!               123 |  4567890123456789 | 4567890123456789 | -4567890123456789 | 4567890123456789
!  4567890123456789 |               123 |              123 |               456 |              123
!  4567890123456789 |               123 |              123 |  4567890123456789 |              123
!  4567890123456789 |  4567890123456789 | 4567890123456789 |               123 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |  4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 | -4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 |                  |                   |                 
! (10 rows)
! 
! -- lateral can result in join conditions appearing below their
! -- real semantic level
! explain (verbose, costs off)
! select * from int4_tbl i left join
!   lateral (select * from int2_tbl j where i.f1 = j.f1) k on true;
!                 QUERY PLAN                 
! -------------------------------------------
!  Hash Left Join
!    Output: i.f1, j.f1
!    Hash Cond: (i.f1 = j.f1)
!    ->  Seq Scan on public.int4_tbl i
!          Output: i.f1
!    ->  Hash
!          Output: j.f1
!          ->  Seq Scan on public.int2_tbl j
!                Output: j.f1
! (9 rows)
! 
! select * from int4_tbl i left join
!   lateral (select * from int2_tbl j where i.f1 = j.f1) k on true;
!      f1      | f1 
! -------------+----
!            0 |  0
!       123456 |   
!      -123456 |   
!   2147483647 |   
!  -2147483647 |   
! (5 rows)
! 
! explain (verbose, costs off)
! select * from int4_tbl i left join
!   lateral (select coalesce(i) from int2_tbl j where i.f1 = j.f1) k on true;
!              QUERY PLAN              
! -------------------------------------
!  Nested Loop Left Join
!    Output: i.f1, (COALESCE(i.*))
!    ->  Seq Scan on public.int4_tbl i
!          Output: i.f1, i.*
!    ->  Seq Scan on public.int2_tbl j
!          Output: j.f1, COALESCE(i.*)
!          Filter: (i.f1 = j.f1)
! (7 rows)
! 
! select * from int4_tbl i left join
!   lateral (select coalesce(i) from int2_tbl j where i.f1 = j.f1) k on true;
!      f1      | coalesce 
! -------------+----------
!            0 | (0)
!       123456 | 
!      -123456 | 
!   2147483647 | 
!  -2147483647 | 
! (5 rows)
! 
! explain (verbose, costs off)
! select * from int4_tbl a,
!   lateral (
!     select * from int4_tbl b left join int8_tbl c on (b.f1 = q1 and a.f1 = q2)
!   ) ss;
!                    QUERY PLAN                    
! -------------------------------------------------
!  Nested Loop
!    Output: a.f1, b.f1, c.q1, c.q2
!    ->  Seq Scan on public.int4_tbl a
!          Output: a.f1
!    ->  Hash Left Join
!          Output: b.f1, c.q1, c.q2
!          Hash Cond: (b.f1 = c.q1)
!          ->  Seq Scan on public.int4_tbl b
!                Output: b.f1
!          ->  Hash
!                Output: c.q1, c.q2
!                ->  Seq Scan on public.int8_tbl c
!                      Output: c.q1, c.q2
!                      Filter: (a.f1 = c.q2)
! (14 rows)
! 
! select * from int4_tbl a,
!   lateral (
!     select * from int4_tbl b left join int8_tbl c on (b.f1 = q1 and a.f1 = q2)
!   ) ss;
!      f1      |     f1      | q1 | q2 
! -------------+-------------+----+----
!            0 |           0 |    |   
!            0 |      123456 |    |   
!            0 |     -123456 |    |   
!            0 |  2147483647 |    |   
!            0 | -2147483647 |    |   
!       123456 |           0 |    |   
!       123456 |      123456 |    |   
!       123456 |     -123456 |    |   
!       123456 |  2147483647 |    |   
!       123456 | -2147483647 |    |   
!      -123456 |           0 |    |   
!      -123456 |      123456 |    |   
!      -123456 |     -123456 |    |   
!      -123456 |  2147483647 |    |   
!      -123456 | -2147483647 |    |   
!   2147483647 |           0 |    |   
!   2147483647 |      123456 |    |   
!   2147483647 |     -123456 |    |   
!   2147483647 |  2147483647 |    |   
!   2147483647 | -2147483647 |    |   
!  -2147483647 |           0 |    |   
!  -2147483647 |      123456 |    |   
!  -2147483647 |     -123456 |    |   
!  -2147483647 |  2147483647 |    |   
!  -2147483647 | -2147483647 |    |   
! (25 rows)
! 
! -- lateral reference in a PlaceHolderVar evaluated at join level
! explain (verbose, costs off)
! select * from
!   int8_tbl a left join lateral
!   (select b.q1 as bq1, c.q1 as cq1, least(a.q1,b.q1,c.q1) from
!    int8_tbl b cross join int8_tbl c) ss
!   on a.q2 = ss.bq1;
!                          QUERY PLAN                          
! -------------------------------------------------------------
!  Nested Loop Left Join
!    Output: a.q1, a.q2, b.q1, c.q1, (LEAST(a.q1, b.q1, c.q1))
!    ->  Seq Scan on public.int8_tbl a
!          Output: a.q1, a.q2
!    ->  Nested Loop
!          Output: b.q1, c.q1, LEAST(a.q1, b.q1, c.q1)
!          Join Filter: (a.q2 = b.q1)
!          ->  Seq Scan on public.int8_tbl b
!                Output: b.q1, b.q2
!          ->  Materialize
!                Output: c.q1
!                ->  Seq Scan on public.int8_tbl c
!                      Output: c.q1
! (13 rows)
! 
! select * from
!   int8_tbl a left join lateral
!   (select b.q1 as bq1, c.q1 as cq1, least(a.q1,b.q1,c.q1) from
!    int8_tbl b cross join int8_tbl c) ss
!   on a.q2 = ss.bq1;
!         q1        |        q2         |       bq1        |       cq1        |      least       
! ------------------+-------------------+------------------+------------------+------------------
!               123 |               456 |                  |                  |                 
!               123 |  4567890123456789 | 4567890123456789 |              123 |              123
!               123 |  4567890123456789 | 4567890123456789 |              123 |              123
!               123 |  4567890123456789 | 4567890123456789 | 4567890123456789 |              123
!               123 |  4567890123456789 | 4567890123456789 | 4567890123456789 |              123
!               123 |  4567890123456789 | 4567890123456789 | 4567890123456789 |              123
!               123 |  4567890123456789 | 4567890123456789 |              123 |              123
!               123 |  4567890123456789 | 4567890123456789 |              123 |              123
!               123 |  4567890123456789 | 4567890123456789 | 4567890123456789 |              123
!               123 |  4567890123456789 | 4567890123456789 | 4567890123456789 |              123
!               123 |  4567890123456789 | 4567890123456789 | 4567890123456789 |              123
!               123 |  4567890123456789 | 4567890123456789 |              123 |              123
!               123 |  4567890123456789 | 4567890123456789 |              123 |              123
!               123 |  4567890123456789 | 4567890123456789 | 4567890123456789 |              123
!               123 |  4567890123456789 | 4567890123456789 | 4567890123456789 |              123
!               123 |  4567890123456789 | 4567890123456789 | 4567890123456789 |              123
!  4567890123456789 |               123 |              123 |              123 |              123
!  4567890123456789 |               123 |              123 |              123 |              123
!  4567890123456789 |               123 |              123 | 4567890123456789 |              123
!  4567890123456789 |               123 |              123 | 4567890123456789 |              123
!  4567890123456789 |               123 |              123 | 4567890123456789 |              123
!  4567890123456789 |               123 |              123 |              123 |              123
!  4567890123456789 |               123 |              123 |              123 |              123
!  4567890123456789 |               123 |              123 | 4567890123456789 |              123
!  4567890123456789 |               123 |              123 | 4567890123456789 |              123
!  4567890123456789 |               123 |              123 | 4567890123456789 |              123
!  4567890123456789 |  4567890123456789 | 4567890123456789 |              123 |              123
!  4567890123456789 |  4567890123456789 | 4567890123456789 |              123 |              123
!  4567890123456789 |  4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |              123 |              123
!  4567890123456789 |  4567890123456789 | 4567890123456789 |              123 |              123
!  4567890123456789 |  4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 |              123 |              123
!  4567890123456789 |  4567890123456789 | 4567890123456789 |              123 |              123
!  4567890123456789 |  4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789
!  4567890123456789 |  4567890123456789 | 4567890123456789 | 4567890123456789 | 4567890123456789
!  4567890123456789 | -4567890123456789 |                  |                  |                 
! (42 rows)
! 
! -- case requiring nested PlaceHolderVars
! explain (verbose, costs off)
! select * from
!   int8_tbl c left join (
!     int8_tbl a left join (select q1, coalesce(q2,42) as x from int8_tbl b) ss1
!       on a.q2 = ss1.q1
!     cross join
!     lateral (select q1, coalesce(ss1.x,q2) as y from int8_tbl d) ss2
!   ) on c.q2 = ss2.q1,
!   lateral (select ss2.y) ss3;
!                                                                                   QUERY PLAN                                                                                  
! ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
!  Nested Loop
!    Output: c.q1, c.q2, a.q1, a.q2, b.q1, (COALESCE(b.q2, 42::bigint)), d.q1, (COALESCE((COALESCE(b.q2, 42::bigint)), d.q2)), ((COALESCE((COALESCE(b.q2, 42::bigint)), d.q2)))
!    ->  Hash Right Join
!          Output: c.q1, c.q2, a.q1, a.q2, b.q1, d.q1, (COALESCE(b.q2, 42::bigint)), (COALESCE((COALESCE(b.q2, 42::bigint)), d.q2))
!          Hash Cond: (d.q1 = c.q2)
!          ->  Nested Loop
!                Output: a.q1, a.q2, b.q1, d.q1, (COALESCE(b.q2, 42::bigint)), (COALESCE((COALESCE(b.q2, 42::bigint)), d.q2))
!                ->  Hash Left Join
!                      Output: a.q1, a.q2, b.q1, (COALESCE(b.q2, 42::bigint))
!                      Hash Cond: (a.q2 = b.q1)
!                      ->  Seq Scan on public.int8_tbl a
!                            Output: a.q1, a.q2
!                      ->  Hash
!                            Output: b.q1, (COALESCE(b.q2, 42::bigint))
!                            ->  Seq Scan on public.int8_tbl b
!                                  Output: b.q1, COALESCE(b.q2, 42::bigint)
!                ->  Seq Scan on public.int8_tbl d
!                      Output: d.q1, COALESCE((COALESCE(b.q2, 42::bigint)), d.q2)
!          ->  Hash
!                Output: c.q1, c.q2
!                ->  Seq Scan on public.int8_tbl c
!                      Output: c.q1, c.q2
!    ->  Result
!          Output: (COALESCE((COALESCE(b.q2, 42::bigint)), d.q2))
! (24 rows)
! 
! -- case that breaks the old ph_may_need optimization
! explain (verbose, costs off)
! select c.*,a.*,ss1.q1,ss2.q1,ss3.* from
!   int8_tbl c left join (
!     int8_tbl a left join
!       (select q1, coalesce(q2,f1) as x from int8_tbl b, int4_tbl b2
!        where q1 < f1) ss1
!       on a.q2 = ss1.q1
!     cross join
!     lateral (select q1, coalesce(ss1.x,q2) as y from int8_tbl d) ss2
!   ) on c.q2 = ss2.q1,
!   lateral (select * from int4_tbl i where ss2.y > f1) ss3;
!                                                QUERY PLAN                                                
! ---------------------------------------------------------------------------------------------------------
!  Nested Loop
!    Output: c.q1, c.q2, a.q1, a.q2, b.q1, d.q1, i.f1
!    Join Filter: ((COALESCE((COALESCE(b.q2, (b2.f1)::bigint)), d.q2)) > i.f1)
!    ->  Hash Right Join
!          Output: c.q1, c.q2, a.q1, a.q2, b.q1, d.q1, (COALESCE((COALESCE(b.q2, (b2.f1)::bigint)), d.q2))
!          Hash Cond: (d.q1 = c.q2)
!          ->  Nested Loop
!                Output: a.q1, a.q2, b.q1, d.q1, (COALESCE((COALESCE(b.q2, (b2.f1)::bigint)), d.q2))
!                ->  Hash Right Join
!                      Output: a.q1, a.q2, b.q1, (COALESCE(b.q2, (b2.f1)::bigint))
!                      Hash Cond: (b.q1 = a.q2)
!                      ->  Nested Loop
!                            Output: b.q1, COALESCE(b.q2, (b2.f1)::bigint)
!                            Join Filter: (b.q1 < b2.f1)
!                            ->  Seq Scan on public.int8_tbl b
!                                  Output: b.q1, b.q2
!                            ->  Materialize
!                                  Output: b2.f1
!                                  ->  Seq Scan on public.int4_tbl b2
!                                        Output: b2.f1
!                      ->  Hash
!                            Output: a.q1, a.q2
!                            ->  Seq Scan on public.int8_tbl a
!                                  Output: a.q1, a.q2
!                ->  Seq Scan on public.int8_tbl d
!                      Output: d.q1, COALESCE((COALESCE(b.q2, (b2.f1)::bigint)), d.q2)
!          ->  Hash
!                Output: c.q1, c.q2
!                ->  Seq Scan on public.int8_tbl c
!                      Output: c.q1, c.q2
!    ->  Materialize
!          Output: i.f1
!          ->  Seq Scan on public.int4_tbl i
!                Output: i.f1
! (34 rows)
! 
! -- check processing of postponed quals (bug #9041)
! explain (verbose, costs off)
! select * from
!   (select 1 as x) x cross join (select 2 as y) y
!   left join lateral (
!     select * from (select 3 as z) z where z.z = x.x
!   ) zz on zz.z = y.y;
!                   QUERY PLAN                  
! ----------------------------------------------
!  Nested Loop Left Join
!    Output: (1), (2), (3)
!    Join Filter: (((3) = (1)) AND ((3) = (2)))
!    ->  Nested Loop
!          Output: (1), (2)
!          ->  Result
!                Output: 1
!          ->  Result
!                Output: 2
!    ->  Result
!          Output: 3
! (11 rows)
! 
! -- test some error cases where LATERAL should have been used but wasn't
! select f1,g from int4_tbl a, (select f1 as g) ss;
! ERROR:  column "f1" does not exist
! LINE 1: select f1,g from int4_tbl a, (select f1 as g) ss;
!                                              ^
! HINT:  There is a column named "f1" in table "a", but it cannot be referenced from this part of the query.
! select f1,g from int4_tbl a, (select a.f1 as g) ss;
! ERROR:  invalid reference to FROM-clause entry for table "a"
! LINE 1: select f1,g from int4_tbl a, (select a.f1 as g) ss;
!                                              ^
! HINT:  There is an entry for table "a", but it cannot be referenced from this part of the query.
! select f1,g from int4_tbl a cross join (select f1 as g) ss;
! ERROR:  column "f1" does not exist
! LINE 1: select f1,g from int4_tbl a cross join (select f1 as g) ss;
!                                                        ^
! HINT:  There is a column named "f1" in table "a", but it cannot be referenced from this part of the query.
! select f1,g from int4_tbl a cross join (select a.f1 as g) ss;
! ERROR:  invalid reference to FROM-clause entry for table "a"
! LINE 1: select f1,g from int4_tbl a cross join (select a.f1 as g) ss...
!                                                        ^
! HINT:  There is an entry for table "a", but it cannot be referenced from this part of the query.
! -- SQL:2008 says the left table is in scope but illegal to access here
! select f1,g from int4_tbl a right join lateral generate_series(0, a.f1) g on true;
! ERROR:  invalid reference to FROM-clause entry for table "a"
! LINE 1: ... int4_tbl a right join lateral generate_series(0, a.f1) g on...
!                                                              ^
! DETAIL:  The combining JOIN type must be INNER or LEFT for a LATERAL reference.
! select f1,g from int4_tbl a full join lateral generate_series(0, a.f1) g on true;
! ERROR:  invalid reference to FROM-clause entry for table "a"
! LINE 1: ...m int4_tbl a full join lateral generate_series(0, a.f1) g on...
!                                                              ^
! DETAIL:  The combining JOIN type must be INNER or LEFT for a LATERAL reference.
! -- check we complain about ambiguous table references
! select * from
!   int8_tbl x cross join (int4_tbl x cross join lateral (select x.f1) ss);
! ERROR:  table reference "x" is ambiguous
! LINE 2: ...cross join (int4_tbl x cross join lateral (select x.f1) ss);
!                                                              ^
! -- LATERAL can be used to put an aggregate into the FROM clause of its query
! select 1 from tenk1 a, lateral (select max(a.unique1) from int4_tbl b) ss;
! ERROR:  aggregate functions are not allowed in FROM clause of their own query level
! LINE 1: select 1 from tenk1 a, lateral (select max(a.unique1) from i...
!                                                ^
! -- check behavior of LATERAL in UPDATE/DELETE
! create temp table xx1 as select f1 as x1, -f1 as x2 from int4_tbl;
! -- error, can't do this:
! update xx1 set x2 = f1 from (select * from int4_tbl where f1 = x1) ss;
! ERROR:  column "x1" does not exist
! LINE 1: ... set x2 = f1 from (select * from int4_tbl where f1 = x1) ss;
!                                                                 ^
! HINT:  There is a column named "x1" in table "xx1", but it cannot be referenced from this part of the query.
! update xx1 set x2 = f1 from (select * from int4_tbl where f1 = xx1.x1) ss;
! ERROR:  invalid reference to FROM-clause entry for table "xx1"
! LINE 1: ...t x2 = f1 from (select * from int4_tbl where f1 = xx1.x1) ss...
!                                                              ^
! HINT:  There is an entry for table "xx1", but it cannot be referenced from this part of the query.
! -- can't do it even with LATERAL:
! update xx1 set x2 = f1 from lateral (select * from int4_tbl where f1 = x1) ss;
! ERROR:  invalid reference to FROM-clause entry for table "xx1"
! LINE 1: ...= f1 from lateral (select * from int4_tbl where f1 = x1) ss;
!                                                                 ^
! HINT:  There is an entry for table "xx1", but it cannot be referenced from this part of the query.
! -- we might in future allow something like this, but for now it's an error:
! update xx1 set x2 = f1 from xx1, lateral (select * from int4_tbl where f1 = x1) ss;
! ERROR:  table name "xx1" specified more than once
! -- also errors:
! delete from xx1 using (select * from int4_tbl where f1 = x1) ss;
! ERROR:  column "x1" does not exist
! LINE 1: ...te from xx1 using (select * from int4_tbl where f1 = x1) ss;
!                                                                 ^
! HINT:  There is a column named "x1" in table "xx1", but it cannot be referenced from this part of the query.
! delete from xx1 using (select * from int4_tbl where f1 = xx1.x1) ss;
! ERROR:  invalid reference to FROM-clause entry for table "xx1"
! LINE 1: ...from xx1 using (select * from int4_tbl where f1 = xx1.x1) ss...
!                                                              ^
! HINT:  There is an entry for table "xx1", but it cannot be referenced from this part of the query.
! delete from xx1 using lateral (select * from int4_tbl where f1 = x1) ss;
! ERROR:  invalid reference to FROM-clause entry for table "xx1"
! LINE 1: ...xx1 using lateral (select * from int4_tbl where f1 = x1) ss;
!                                                                 ^
! HINT:  There is an entry for table "xx1", but it cannot be referenced from this part of the query.
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/aggregates.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/aggregates.out	Tue Oct 28 15:53:05 2014
***************
*** 1,1582 ****
! --
! -- AGGREGATES
! --
! SELECT avg(four) AS avg_1 FROM onek;
!        avg_1        
! --------------------
!  1.5000000000000000
! (1 row)
! 
! SELECT avg(a) AS avg_32 FROM aggtest WHERE a < 100;
!        avg_32        
! ---------------------
!  32.6666666666666667
! (1 row)
! 
! -- In 7.1, avg(float4) is computed using float8 arithmetic.
! -- Round the result to 3 digits to avoid platform-specific results.
! SELECT avg(b)::numeric(10,3) AS avg_107_943 FROM aggtest;
!  avg_107_943 
! -------------
!      107.943
! (1 row)
! 
! SELECT avg(gpa) AS avg_3_4 FROM ONLY student;
!  avg_3_4 
! ---------
!      3.4
! (1 row)
! 
! SELECT sum(four) AS sum_1500 FROM onek;
!  sum_1500 
! ----------
!      1500
! (1 row)
! 
! SELECT sum(a) AS sum_198 FROM aggtest;
!  sum_198 
! ---------
!      198
! (1 row)
! 
! SELECT sum(b) AS avg_431_773 FROM aggtest;
!  avg_431_773 
! -------------
!      431.773
! (1 row)
! 
! SELECT sum(gpa) AS avg_6_8 FROM ONLY student;
!  avg_6_8 
! ---------
!      6.8
! (1 row)
! 
! SELECT max(four) AS max_3 FROM onek;
!  max_3 
! -------
!      3
! (1 row)
! 
! SELECT max(a) AS max_100 FROM aggtest;
!  max_100 
! ---------
!      100
! (1 row)
! 
! SELECT max(aggtest.b) AS max_324_78 FROM aggtest;
!  max_324_78 
! ------------
!      324.78
! (1 row)
! 
! SELECT max(student.gpa) AS max_3_7 FROM student;
!  max_3_7 
! ---------
!      3.7
! (1 row)
! 
! SELECT stddev_pop(b) FROM aggtest;
!    stddev_pop    
! -----------------
!  131.10703231895
! (1 row)
! 
! SELECT stddev_samp(b) FROM aggtest;
!    stddev_samp    
! ------------------
!  151.389360803998
! (1 row)
! 
! SELECT var_pop(b) FROM aggtest;
!      var_pop      
! ------------------
!  17189.0539234823
! (1 row)
! 
! SELECT var_samp(b) FROM aggtest;
!      var_samp     
! ------------------
!  22918.7385646431
! (1 row)
! 
! SELECT stddev_pop(b::numeric) FROM aggtest;
!     stddev_pop    
! ------------------
!  131.107032862199
! (1 row)
! 
! SELECT stddev_samp(b::numeric) FROM aggtest;
!    stddev_samp    
! ------------------
!  151.389361431288
! (1 row)
! 
! SELECT var_pop(b::numeric) FROM aggtest;
!       var_pop       
! --------------------
!  17189.054065929769
! (1 row)
! 
! SELECT var_samp(b::numeric) FROM aggtest;
!       var_samp      
! --------------------
!  22918.738754573025
! (1 row)
! 
! -- population variance is defined for a single tuple, sample variance
! -- is not
! SELECT var_pop(1.0), var_samp(2.0);
!  var_pop | var_samp 
! ---------+----------
!        0 |         
! (1 row)
! 
! SELECT stddev_pop(3.0::numeric), stddev_samp(4.0::numeric);
!  stddev_pop | stddev_samp 
! ------------+-------------
!           0 |            
! (1 row)
! 
! -- verify correct results for null and NaN inputs
! select sum(null::int4) from generate_series(1,3);
!  sum 
! -----
!     
! (1 row)
! 
! select sum(null::int8) from generate_series(1,3);
!  sum 
! -----
!     
! (1 row)
! 
! select sum(null::numeric) from generate_series(1,3);
!  sum 
! -----
!     
! (1 row)
! 
! select sum(null::float8) from generate_series(1,3);
!  sum 
! -----
!     
! (1 row)
! 
! select avg(null::int4) from generate_series(1,3);
!  avg 
! -----
!     
! (1 row)
! 
! select avg(null::int8) from generate_series(1,3);
!  avg 
! -----
!     
! (1 row)
! 
! select avg(null::numeric) from generate_series(1,3);
!  avg 
! -----
!     
! (1 row)
! 
! select avg(null::float8) from generate_series(1,3);
!  avg 
! -----
!     
! (1 row)
! 
! select sum('NaN'::numeric) from generate_series(1,3);
!  sum 
! -----
!  NaN
! (1 row)
! 
! select avg('NaN'::numeric) from generate_series(1,3);
!  avg 
! -----
!  NaN
! (1 row)
! 
! -- SQL2003 binary aggregates
! SELECT regr_count(b, a) FROM aggtest;
!  regr_count 
! ------------
!           4
! (1 row)
! 
! SELECT regr_sxx(b, a) FROM aggtest;
!  regr_sxx 
! ----------
!      5099
! (1 row)
! 
! SELECT regr_syy(b, a) FROM aggtest;
!      regr_syy     
! ------------------
!  68756.2156939293
! (1 row)
! 
! SELECT regr_sxy(b, a) FROM aggtest;
!      regr_sxy     
! ------------------
!  2614.51582155004
! (1 row)
! 
! SELECT regr_avgx(b, a), regr_avgy(b, a) FROM aggtest;
!  regr_avgx |    regr_avgy     
! -----------+------------------
!       49.5 | 107.943152273074
! (1 row)
! 
! SELECT regr_r2(b, a) FROM aggtest;
!       regr_r2       
! --------------------
!  0.0194977982031803
! (1 row)
! 
! SELECT regr_slope(b, a), regr_intercept(b, a) FROM aggtest;
!     regr_slope     |  regr_intercept  
! -------------------+------------------
!  0.512750700441271 | 82.5619926012309
! (1 row)
! 
! SELECT covar_pop(b, a), covar_samp(b, a) FROM aggtest;
!     covar_pop    |    covar_samp    
! -----------------+------------------
!  653.62895538751 | 871.505273850014
! (1 row)
! 
! SELECT corr(b, a) FROM aggtest;
!        corr        
! -------------------
!  0.139634516517873
! (1 row)
! 
! SELECT count(four) AS cnt_1000 FROM onek;
!  cnt_1000 
! ----------
!      1000
! (1 row)
! 
! SELECT count(DISTINCT four) AS cnt_4 FROM onek;
!  cnt_4 
! -------
!      4
! (1 row)
! 
! select ten, count(*), sum(four) from onek
! group by ten order by ten;
!  ten | count | sum 
! -----+-------+-----
!    0 |   100 | 100
!    1 |   100 | 200
!    2 |   100 | 100
!    3 |   100 | 200
!    4 |   100 | 100
!    5 |   100 | 200
!    6 |   100 | 100
!    7 |   100 | 200
!    8 |   100 | 100
!    9 |   100 | 200
! (10 rows)
! 
! select ten, count(four), sum(DISTINCT four) from onek
! group by ten order by ten;
!  ten | count | sum 
! -----+-------+-----
!    0 |   100 |   2
!    1 |   100 |   4
!    2 |   100 |   2
!    3 |   100 |   4
!    4 |   100 |   2
!    5 |   100 |   4
!    6 |   100 |   2
!    7 |   100 |   4
!    8 |   100 |   2
!    9 |   100 |   4
! (10 rows)
! 
! -- user-defined aggregates
! SELECT newavg(four) AS avg_1 FROM onek;
!        avg_1        
! --------------------
!  1.5000000000000000
! (1 row)
! 
! SELECT newsum(four) AS sum_1500 FROM onek;
!  sum_1500 
! ----------
!      1500
! (1 row)
! 
! SELECT newcnt(four) AS cnt_1000 FROM onek;
!  cnt_1000 
! ----------
!      1000
! (1 row)
! 
! SELECT newcnt(*) AS cnt_1000 FROM onek;
!  cnt_1000 
! ----------
!      1000
! (1 row)
! 
! SELECT oldcnt(*) AS cnt_1000 FROM onek;
!  cnt_1000 
! ----------
!      1000
! (1 row)
! 
! SELECT sum2(q1,q2) FROM int8_tbl;
!        sum2        
! -------------------
!  18271560493827981
! (1 row)
! 
! -- test for outer-level aggregates
! -- this should work
! select ten, sum(distinct four) from onek a
! group by ten
! having exists (select 1 from onek b where sum(distinct a.four) = b.four);
!  ten | sum 
! -----+-----
!    0 |   2
!    2 |   2
!    4 |   2
!    6 |   2
!    8 |   2
! (5 rows)
! 
! -- this should fail because subquery has an agg of its own in WHERE
! select ten, sum(distinct four) from onek a
! group by ten
! having exists (select 1 from onek b
!                where sum(distinct a.four + b.four) = b.four);
! ERROR:  aggregate functions are not allowed in WHERE
! LINE 4:                where sum(distinct a.four + b.four) = b.four)...
!                              ^
! -- Test handling of sublinks within outer-level aggregates.
! -- Per bug report from Daniel Grace.
! select
!   (select max((select i.unique2 from tenk1 i where i.unique1 = o.unique1)))
! from tenk1 o;
!  max  
! ------
!  9999
! (1 row)
! 
! --
! -- test for bitwise integer aggregates
! --
! CREATE TEMPORARY TABLE bitwise_test(
!   i2 INT2,
!   i4 INT4,
!   i8 INT8,
!   i INTEGER,
!   x INT2,
!   y BIT(4)
! );
! -- empty case
! SELECT
!   BIT_AND(i2) AS "?",
!   BIT_OR(i4)  AS "?"
! FROM bitwise_test;
!  ? | ? 
! ---+---
!    |  
! (1 row)
! 
! COPY bitwise_test FROM STDIN NULL 'null';
! SELECT
!   BIT_AND(i2) AS "1",
!   BIT_AND(i4) AS "1",
!   BIT_AND(i8) AS "1",
!   BIT_AND(i)  AS "?",
!   BIT_AND(x)  AS "0",
!   BIT_AND(y)  AS "0100",
!   BIT_OR(i2)  AS "7",
!   BIT_OR(i4)  AS "7",
!   BIT_OR(i8)  AS "7",
!   BIT_OR(i)   AS "?",
!   BIT_OR(x)   AS "7",
!   BIT_OR(y)   AS "1101"
! FROM bitwise_test;
!  1 | 1 | 1 | ? | 0 | 0100 | 7 | 7 | 7 | ? | 7 | 1101 
! ---+---+---+---+---+------+---+---+---+---+---+------
!  1 | 1 | 1 | 1 | 0 | 0100 | 7 | 7 | 7 | 3 | 7 | 1101
! (1 row)
! 
! --
! -- test boolean aggregates
! --
! -- first test all possible transition and final states
! SELECT
!   -- boolean and transitions
!   -- null because strict
!   booland_statefunc(NULL, NULL)  IS NULL AS "t",
!   booland_statefunc(TRUE, NULL)  IS NULL AS "t",
!   booland_statefunc(FALSE, NULL) IS NULL AS "t",
!   booland_statefunc(NULL, TRUE)  IS NULL AS "t",
!   booland_statefunc(NULL, FALSE) IS NULL AS "t",
!   -- and actual computations
!   booland_statefunc(TRUE, TRUE) AS "t",
!   NOT booland_statefunc(TRUE, FALSE) AS "t",
!   NOT booland_statefunc(FALSE, TRUE) AS "t",
!   NOT booland_statefunc(FALSE, FALSE) AS "t";
!  t | t | t | t | t | t | t | t | t 
! ---+---+---+---+---+---+---+---+---
!  t | t | t | t | t | t | t | t | t
! (1 row)
! 
! SELECT
!   -- boolean or transitions
!   -- null because strict
!   boolor_statefunc(NULL, NULL)  IS NULL AS "t",
!   boolor_statefunc(TRUE, NULL)  IS NULL AS "t",
!   boolor_statefunc(FALSE, NULL) IS NULL AS "t",
!   boolor_statefunc(NULL, TRUE)  IS NULL AS "t",
!   boolor_statefunc(NULL, FALSE) IS NULL AS "t",
!   -- actual computations
!   boolor_statefunc(TRUE, TRUE) AS "t",
!   boolor_statefunc(TRUE, FALSE) AS "t",
!   boolor_statefunc(FALSE, TRUE) AS "t",
!   NOT boolor_statefunc(FALSE, FALSE) AS "t";
!  t | t | t | t | t | t | t | t | t 
! ---+---+---+---+---+---+---+---+---
!  t | t | t | t | t | t | t | t | t
! (1 row)
! 
! CREATE TEMPORARY TABLE bool_test(
!   b1 BOOL,
!   b2 BOOL,
!   b3 BOOL,
!   b4 BOOL);
! -- empty case
! SELECT
!   BOOL_AND(b1)   AS "n",
!   BOOL_OR(b3)    AS "n"
! FROM bool_test;
!  n | n 
! ---+---
!    | 
! (1 row)
! 
! COPY bool_test FROM STDIN NULL 'null';
! SELECT
!   BOOL_AND(b1)     AS "f",
!   BOOL_AND(b2)     AS "t",
!   BOOL_AND(b3)     AS "f",
!   BOOL_AND(b4)     AS "n",
!   BOOL_AND(NOT b2) AS "f",
!   BOOL_AND(NOT b3) AS "t"
! FROM bool_test;
!  f | t | f | n | f | t 
! ---+---+---+---+---+---
!  f | t | f |   | f | t
! (1 row)
! 
! SELECT
!   EVERY(b1)     AS "f",
!   EVERY(b2)     AS "t",
!   EVERY(b3)     AS "f",
!   EVERY(b4)     AS "n",
!   EVERY(NOT b2) AS "f",
!   EVERY(NOT b3) AS "t"
! FROM bool_test;
!  f | t | f | n | f | t 
! ---+---+---+---+---+---
!  f | t | f |   | f | t
! (1 row)
! 
! SELECT
!   BOOL_OR(b1)      AS "t",
!   BOOL_OR(b2)      AS "t",
!   BOOL_OR(b3)      AS "f",
!   BOOL_OR(b4)      AS "n",
!   BOOL_OR(NOT b2)  AS "f",
!   BOOL_OR(NOT b3)  AS "t"
! FROM bool_test;
!  t | t | f | n | f | t 
! ---+---+---+---+---+---
!  t | t | f |   | f | t
! (1 row)
! 
! --
! -- Test cases that should be optimized into indexscans instead of
! -- the generic aggregate implementation.
! --
! -- Basic cases
! explain (costs off)
!   select min(unique1) from tenk1;
!                          QUERY PLAN                         
! ------------------------------------------------------------
!  Result
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Index Only Scan using tenk1_unique1 on tenk1
!                  Index Cond: (unique1 IS NOT NULL)
! (5 rows)
! 
! select min(unique1) from tenk1;
!  min 
! -----
!    0
! (1 row)
! 
! explain (costs off)
!   select max(unique1) from tenk1;
!                              QUERY PLAN                              
! ---------------------------------------------------------------------
!  Result
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Index Only Scan Backward using tenk1_unique1 on tenk1
!                  Index Cond: (unique1 IS NOT NULL)
! (5 rows)
! 
! select max(unique1) from tenk1;
!  max  
! ------
!  9999
! (1 row)
! 
! explain (costs off)
!   select max(unique1) from tenk1 where unique1 < 42;
!                                QUERY PLAN                               
! ------------------------------------------------------------------------
!  Result
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Index Only Scan Backward using tenk1_unique1 on tenk1
!                  Index Cond: ((unique1 IS NOT NULL) AND (unique1 < 42))
! (5 rows)
! 
! select max(unique1) from tenk1 where unique1 < 42;
!  max 
! -----
!   41
! (1 row)
! 
! explain (costs off)
!   select max(unique1) from tenk1 where unique1 > 42;
!                                QUERY PLAN                               
! ------------------------------------------------------------------------
!  Result
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Index Only Scan Backward using tenk1_unique1 on tenk1
!                  Index Cond: ((unique1 IS NOT NULL) AND (unique1 > 42))
! (5 rows)
! 
! select max(unique1) from tenk1 where unique1 > 42;
!  max  
! ------
!  9999
! (1 row)
! 
! explain (costs off)
!   select max(unique1) from tenk1 where unique1 > 42000;
!                                 QUERY PLAN                                 
! ---------------------------------------------------------------------------
!  Result
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Index Only Scan Backward using tenk1_unique1 on tenk1
!                  Index Cond: ((unique1 IS NOT NULL) AND (unique1 > 42000))
! (5 rows)
! 
! select max(unique1) from tenk1 where unique1 > 42000;
!  max 
! -----
!     
! (1 row)
! 
! -- multi-column index (uses tenk1_thous_tenthous)
! explain (costs off)
!   select max(tenthous) from tenk1 where thousand = 33;
!                                  QUERY PLAN                                 
! ----------------------------------------------------------------------------
!  Result
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1
!                  Index Cond: ((thousand = 33) AND (tenthous IS NOT NULL))
! (5 rows)
! 
! select max(tenthous) from tenk1 where thousand = 33;
!  max  
! ------
!  9033
! (1 row)
! 
! explain (costs off)
!   select min(tenthous) from tenk1 where thousand = 33;
!                                 QUERY PLAN                                
! --------------------------------------------------------------------------
!  Result
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Index Only Scan using tenk1_thous_tenthous on tenk1
!                  Index Cond: ((thousand = 33) AND (tenthous IS NOT NULL))
! (5 rows)
! 
! select min(tenthous) from tenk1 where thousand = 33;
!  min 
! -----
!   33
! (1 row)
! 
! -- check parameter propagation into an indexscan subquery
! explain (costs off)
!   select f1, (select min(unique1) from tenk1 where unique1 > f1) AS gt
!     from int4_tbl;
!                                        QUERY PLAN                                        
! -----------------------------------------------------------------------------------------
!  Seq Scan on int4_tbl
!    SubPlan 2
!      ->  Result
!            InitPlan 1 (returns $1)
!              ->  Limit
!                    ->  Index Only Scan using tenk1_unique1 on tenk1
!                          Index Cond: ((unique1 IS NOT NULL) AND (unique1 > int4_tbl.f1))
! (7 rows)
! 
! select f1, (select min(unique1) from tenk1 where unique1 > f1) AS gt
!   from int4_tbl;
!      f1      | gt 
! -------------+----
!            0 |  1
!       123456 |   
!      -123456 |  0
!   2147483647 |   
!  -2147483647 |  0
! (5 rows)
! 
! -- check some cases that were handled incorrectly in 8.3.0
! explain (costs off)
!   select distinct max(unique2) from tenk1;
!                              QUERY PLAN                              
! ---------------------------------------------------------------------
!  HashAggregate
!    Group Key: $0
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Index Only Scan Backward using tenk1_unique2 on tenk1
!                  Index Cond: (unique2 IS NOT NULL)
!    ->  Result
! (7 rows)
! 
! select distinct max(unique2) from tenk1;
!  max  
! ------
!  9999
! (1 row)
! 
! explain (costs off)
!   select max(unique2) from tenk1 order by 1;
!                              QUERY PLAN                              
! ---------------------------------------------------------------------
!  Sort
!    Sort Key: ($0)
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Index Only Scan Backward using tenk1_unique2 on tenk1
!                  Index Cond: (unique2 IS NOT NULL)
!    ->  Result
! (7 rows)
! 
! select max(unique2) from tenk1 order by 1;
!  max  
! ------
!  9999
! (1 row)
! 
! explain (costs off)
!   select max(unique2) from tenk1 order by max(unique2);
!                              QUERY PLAN                              
! ---------------------------------------------------------------------
!  Sort
!    Sort Key: ($0)
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Index Only Scan Backward using tenk1_unique2 on tenk1
!                  Index Cond: (unique2 IS NOT NULL)
!    ->  Result
! (7 rows)
! 
! select max(unique2) from tenk1 order by max(unique2);
!  max  
! ------
!  9999
! (1 row)
! 
! explain (costs off)
!   select max(unique2) from tenk1 order by max(unique2)+1;
!                              QUERY PLAN                              
! ---------------------------------------------------------------------
!  Sort
!    Sort Key: (($0 + 1))
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Index Only Scan Backward using tenk1_unique2 on tenk1
!                  Index Cond: (unique2 IS NOT NULL)
!    ->  Result
! (7 rows)
! 
! select max(unique2) from tenk1 order by max(unique2)+1;
!  max  
! ------
!  9999
! (1 row)
! 
! explain (costs off)
!   select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
!                              QUERY PLAN                              
! ---------------------------------------------------------------------
!  Sort
!    Sort Key: (generate_series(1, 3))
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Index Only Scan Backward using tenk1_unique2 on tenk1
!                  Index Cond: (unique2 IS NOT NULL)
!    ->  Result
! (7 rows)
! 
! select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
!  max  | g 
! ------+---
!  9999 | 3
!  9999 | 2
!  9999 | 1
! (3 rows)
! 
! -- try it on an inheritance tree
! create table minmaxtest(f1 int);
! create table minmaxtest1() inherits (minmaxtest);
! create table minmaxtest2() inherits (minmaxtest);
! create table minmaxtest3() inherits (minmaxtest);
! create index minmaxtesti on minmaxtest(f1);
! create index minmaxtest1i on minmaxtest1(f1);
! create index minmaxtest2i on minmaxtest2(f1 desc);
! create index minmaxtest3i on minmaxtest3(f1) where f1 is not null;
! insert into minmaxtest values(11), (12);
! insert into minmaxtest1 values(13), (14);
! insert into minmaxtest2 values(15), (16);
! insert into minmaxtest3 values(17), (18);
! explain (costs off)
!   select min(f1), max(f1) from minmaxtest;
!                                           QUERY PLAN                                          
! ----------------------------------------------------------------------------------------------
!  Result
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Merge Append
!                  Sort Key: minmaxtest.f1
!                  ->  Index Only Scan using minmaxtesti on minmaxtest
!                        Index Cond: (f1 IS NOT NULL)
!                  ->  Index Only Scan using minmaxtest1i on minmaxtest1
!                        Index Cond: (f1 IS NOT NULL)
!                  ->  Index Only Scan Backward using minmaxtest2i on minmaxtest2
!                        Index Cond: (f1 IS NOT NULL)
!                  ->  Index Only Scan using minmaxtest3i on minmaxtest3
!                        Index Cond: (f1 IS NOT NULL)
!    InitPlan 2 (returns $1)
!      ->  Limit
!            ->  Merge Append
!                  Sort Key: minmaxtest_1.f1
!                  ->  Index Only Scan Backward using minmaxtesti on minmaxtest minmaxtest_1
!                        Index Cond: (f1 IS NOT NULL)
!                  ->  Index Only Scan Backward using minmaxtest1i on minmaxtest1 minmaxtest1_1
!                        Index Cond: (f1 IS NOT NULL)
!                  ->  Index Only Scan using minmaxtest2i on minmaxtest2 minmaxtest2_1
!                        Index Cond: (f1 IS NOT NULL)
!                  ->  Index Only Scan Backward using minmaxtest3i on minmaxtest3 minmaxtest3_1
!                        Index Cond: (f1 IS NOT NULL)
! (25 rows)
! 
! select min(f1), max(f1) from minmaxtest;
!  min | max 
! -----+-----
!   11 |  18
! (1 row)
! 
! -- DISTINCT doesn't do anything useful here, but it shouldn't fail
! explain (costs off)
!   select distinct min(f1), max(f1) from minmaxtest;
!                                           QUERY PLAN                                          
! ----------------------------------------------------------------------------------------------
!  HashAggregate
!    Group Key: $0, $1
!    InitPlan 1 (returns $0)
!      ->  Limit
!            ->  Merge Append
!                  Sort Key: minmaxtest.f1
!                  ->  Index Only Scan using minmaxtesti on minmaxtest
!                        Index Cond: (f1 IS NOT NULL)
!                  ->  Index Only Scan using minmaxtest1i on minmaxtest1
!                        Index Cond: (f1 IS NOT NULL)
!                  ->  Index Only Scan Backward using minmaxtest2i on minmaxtest2
!                        Index Cond: (f1 IS NOT NULL)
!                  ->  Index Only Scan using minmaxtest3i on minmaxtest3
!                        Index Cond: (f1 IS NOT NULL)
!    InitPlan 2 (returns $1)
!      ->  Limit
!            ->  Merge Append
!                  Sort Key: minmaxtest_1.f1
!                  ->  Index Only Scan Backward using minmaxtesti on minmaxtest minmaxtest_1
!                        Index Cond: (f1 IS NOT NULL)
!                  ->  Index Only Scan Backward using minmaxtest1i on minmaxtest1 minmaxtest1_1
!                        Index Cond: (f1 IS NOT NULL)
!                  ->  Index Only Scan using minmaxtest2i on minmaxtest2 minmaxtest2_1
!                        Index Cond: (f1 IS NOT NULL)
!                  ->  Index Only Scan Backward using minmaxtest3i on minmaxtest3 minmaxtest3_1
!                        Index Cond: (f1 IS NOT NULL)
!    ->  Result
! (27 rows)
! 
! select distinct min(f1), max(f1) from minmaxtest;
!  min | max 
! -----+-----
!   11 |  18
! (1 row)
! 
! drop table minmaxtest cascade;
! NOTICE:  drop cascades to 3 other objects
! DETAIL:  drop cascades to table minmaxtest1
! drop cascades to table minmaxtest2
! drop cascades to table minmaxtest3
! -- check for correct detection of nested-aggregate errors
! select max(min(unique1)) from tenk1;
! ERROR:  aggregate function calls cannot be nested
! LINE 1: select max(min(unique1)) from tenk1;
!                    ^
! select (select max(min(unique1)) from int8_tbl) from tenk1;
! ERROR:  aggregate function calls cannot be nested
! LINE 1: select (select max(min(unique1)) from int8_tbl) from tenk1;
!                            ^
! --
! -- Test combinations of DISTINCT and/or ORDER BY
! --
! select array_agg(a order by b)
!   from (values (1,4),(2,3),(3,1),(4,2)) v(a,b);
!  array_agg 
! -----------
!  {3,4,2,1}
! (1 row)
! 
! select array_agg(a order by a)
!   from (values (1,4),(2,3),(3,1),(4,2)) v(a,b);
!  array_agg 
! -----------
!  {1,2,3,4}
! (1 row)
! 
! select array_agg(a order by a desc)
!   from (values (1,4),(2,3),(3,1),(4,2)) v(a,b);
!  array_agg 
! -----------
!  {4,3,2,1}
! (1 row)
! 
! select array_agg(b order by a desc)
!   from (values (1,4),(2,3),(3,1),(4,2)) v(a,b);
!  array_agg 
! -----------
!  {2,1,3,4}
! (1 row)
! 
! select array_agg(distinct a)
!   from (values (1),(2),(1),(3),(null),(2)) v(a);
!   array_agg   
! --------------
!  {1,2,3,NULL}
! (1 row)
! 
! select array_agg(distinct a order by a)
!   from (values (1),(2),(1),(3),(null),(2)) v(a);
!   array_agg   
! --------------
!  {1,2,3,NULL}
! (1 row)
! 
! select array_agg(distinct a order by a desc)
!   from (values (1),(2),(1),(3),(null),(2)) v(a);
!   array_agg   
! --------------
!  {NULL,3,2,1}
! (1 row)
! 
! select array_agg(distinct a order by a desc nulls last)
!   from (values (1),(2),(1),(3),(null),(2)) v(a);
!   array_agg   
! --------------
!  {3,2,1,NULL}
! (1 row)
! 
! -- multi-arg aggs, strict/nonstrict, distinct/order by
! select aggfstr(a,b,c)
!   from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c);
!                 aggfstr                
! ---------------------------------------
!  {"(1,3,foo)","(2,2,bar)","(3,1,baz)"}
! (1 row)
! 
! select aggfns(a,b,c)
!   from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c);
!                     aggfns                     
! -----------------------------------------------
!  {"(1,3,foo)","(0,,)","(2,2,bar)","(3,1,baz)"}
! (1 row)
! 
! select aggfstr(distinct a,b,c)
!   from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
!        generate_series(1,3) i;
!                 aggfstr                
! ---------------------------------------
!  {"(1,3,foo)","(2,2,bar)","(3,1,baz)"}
! (1 row)
! 
! select aggfns(distinct a,b,c)
!   from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
!        generate_series(1,3) i;
!                     aggfns                     
! -----------------------------------------------
!  {"(0,,)","(1,3,foo)","(2,2,bar)","(3,1,baz)"}
! (1 row)
! 
! select aggfstr(distinct a,b,c order by b)
!   from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
!        generate_series(1,3) i;
!                 aggfstr                
! ---------------------------------------
!  {"(3,1,baz)","(2,2,bar)","(1,3,foo)"}
! (1 row)
! 
! select aggfns(distinct a,b,c order by b)
!   from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
!        generate_series(1,3) i;
!                     aggfns                     
! -----------------------------------------------
!  {"(3,1,baz)","(2,2,bar)","(1,3,foo)","(0,,)"}
! (1 row)
! 
! -- test specific code paths
! select aggfns(distinct a,a,c order by c using ~<~,a)
!   from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
!        generate_series(1,2) i;
!                      aggfns                     
! ------------------------------------------------
!  {"(2,2,bar)","(3,3,baz)","(1,1,foo)","(0,0,)"}
! (1 row)
! 
! select aggfns(distinct a,a,c order by c using ~<~)
!   from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
!        generate_series(1,2) i;
!                      aggfns                     
! ------------------------------------------------
!  {"(2,2,bar)","(3,3,baz)","(1,1,foo)","(0,0,)"}
! (1 row)
! 
! select aggfns(distinct a,a,c order by a)
!   from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
!        generate_series(1,2) i;
!                      aggfns                     
! ------------------------------------------------
!  {"(0,0,)","(1,1,foo)","(2,2,bar)","(3,3,baz)"}
! (1 row)
! 
! select aggfns(distinct a,b,c order by a,c using ~<~,b)
!   from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
!        generate_series(1,2) i;
!                     aggfns                     
! -----------------------------------------------
!  {"(0,,)","(1,3,foo)","(2,2,bar)","(3,1,baz)"}
! (1 row)
! 
! -- check node I/O via view creation and usage, also deparsing logic
! create view agg_view1 as
!   select aggfns(a,b,c)
!     from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c);
! select * from agg_view1;
!                     aggfns                     
! -----------------------------------------------
!  {"(1,3,foo)","(0,,)","(2,2,bar)","(3,1,baz)"}
! (1 row)
! 
! select pg_get_viewdef('agg_view1'::regclass);
!                                                    pg_get_viewdef                                                    
! ---------------------------------------------------------------------------------------------------------------------
!   SELECT aggfns(v.a, v.b, v.c) AS aggfns                                                                            +
!     FROM ( VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c);
! (1 row)
! 
! create or replace view agg_view1 as
!   select aggfns(distinct a,b,c)
!     from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
!          generate_series(1,3) i;
! select * from agg_view1;
!                     aggfns                     
! -----------------------------------------------
!  {"(0,,)","(1,3,foo)","(2,2,bar)","(3,1,baz)"}
! (1 row)
! 
! select pg_get_viewdef('agg_view1'::regclass);
!                                                    pg_get_viewdef                                                    
! ---------------------------------------------------------------------------------------------------------------------
!   SELECT aggfns(DISTINCT v.a, v.b, v.c) AS aggfns                                                                   +
!     FROM ( VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c),+
!      generate_series(1, 3) i(i);
! (1 row)
! 
! create or replace view agg_view1 as
!   select aggfns(distinct a,b,c order by b)
!     from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
!          generate_series(1,3) i;
! select * from agg_view1;
!                     aggfns                     
! -----------------------------------------------
!  {"(3,1,baz)","(2,2,bar)","(1,3,foo)","(0,,)"}
! (1 row)
! 
! select pg_get_viewdef('agg_view1'::regclass);
!                                                    pg_get_viewdef                                                    
! ---------------------------------------------------------------------------------------------------------------------
!   SELECT aggfns(DISTINCT v.a, v.b, v.c ORDER BY v.b) AS aggfns                                                      +
!     FROM ( VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c),+
!      generate_series(1, 3) i(i);
! (1 row)
! 
! create or replace view agg_view1 as
!   select aggfns(a,b,c order by b+1)
!     from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c);
! select * from agg_view1;
!                     aggfns                     
! -----------------------------------------------
!  {"(3,1,baz)","(2,2,bar)","(1,3,foo)","(0,,)"}
! (1 row)
! 
! select pg_get_viewdef('agg_view1'::regclass);
!                                                    pg_get_viewdef                                                    
! ---------------------------------------------------------------------------------------------------------------------
!   SELECT aggfns(v.a, v.b, v.c ORDER BY (v.b + 1)) AS aggfns                                                         +
!     FROM ( VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c);
! (1 row)
! 
! create or replace view agg_view1 as
!   select aggfns(a,a,c order by b)
!     from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c);
! select * from agg_view1;
!                      aggfns                     
! ------------------------------------------------
!  {"(3,3,baz)","(2,2,bar)","(1,1,foo)","(0,0,)"}
! (1 row)
! 
! select pg_get_viewdef('agg_view1'::regclass);
!                                                    pg_get_viewdef                                                    
! ---------------------------------------------------------------------------------------------------------------------
!   SELECT aggfns(v.a, v.a, v.c ORDER BY v.b) AS aggfns                                                               +
!     FROM ( VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c);
! (1 row)
! 
! create or replace view agg_view1 as
!   select aggfns(a,b,c order by c using ~<~)
!     from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c);
! select * from agg_view1;
!                     aggfns                     
! -----------------------------------------------
!  {"(2,2,bar)","(3,1,baz)","(1,3,foo)","(0,,)"}
! (1 row)
! 
! select pg_get_viewdef('agg_view1'::regclass);
!                                                    pg_get_viewdef                                                    
! ---------------------------------------------------------------------------------------------------------------------
!   SELECT aggfns(v.a, v.b, v.c ORDER BY v.c USING ~<~ NULLS LAST) AS aggfns                                          +
!     FROM ( VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c);
! (1 row)
! 
! create or replace view agg_view1 as
!   select aggfns(distinct a,b,c order by a,c using ~<~,b)
!     from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
!          generate_series(1,2) i;
! select * from agg_view1;
!                     aggfns                     
! -----------------------------------------------
!  {"(0,,)","(1,3,foo)","(2,2,bar)","(3,1,baz)"}
! (1 row)
! 
! select pg_get_viewdef('agg_view1'::regclass);
!                                                    pg_get_viewdef                                                    
! ---------------------------------------------------------------------------------------------------------------------
!   SELECT aggfns(DISTINCT v.a, v.b, v.c ORDER BY v.a, v.c USING ~<~ NULLS LAST, v.b) AS aggfns                       +
!     FROM ( VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c),+
!      generate_series(1, 2) i(i);
! (1 row)
! 
! drop view agg_view1;
! -- incorrect DISTINCT usage errors
! select aggfns(distinct a,b,c order by i)
!   from (values (1,1,'foo')) v(a,b,c), generate_series(1,2) i;
! ERROR:  in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
! LINE 1: select aggfns(distinct a,b,c order by i)
!                                               ^
! select aggfns(distinct a,b,c order by a,b+1)
!   from (values (1,1,'foo')) v(a,b,c), generate_series(1,2) i;
! ERROR:  in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
! LINE 1: select aggfns(distinct a,b,c order by a,b+1)
!                                                 ^
! select aggfns(distinct a,b,c order by a,b,i,c)
!   from (values (1,1,'foo')) v(a,b,c), generate_series(1,2) i;
! ERROR:  in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
! LINE 1: select aggfns(distinct a,b,c order by a,b,i,c)
!                                                   ^
! select aggfns(distinct a,a,c order by a,b)
!   from (values (1,1,'foo')) v(a,b,c), generate_series(1,2) i;
! ERROR:  in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
! LINE 1: select aggfns(distinct a,a,c order by a,b)
!                                                 ^
! -- string_agg tests
! select string_agg(a,',') from (values('aaaa'),('bbbb'),('cccc')) g(a);
!    string_agg   
! ----------------
!  aaaa,bbbb,cccc
! (1 row)
! 
! select string_agg(a,',') from (values('aaaa'),(null),('bbbb'),('cccc')) g(a);
!    string_agg   
! ----------------
!  aaaa,bbbb,cccc
! (1 row)
! 
! select string_agg(a,'AB') from (values(null),(null),('bbbb'),('cccc')) g(a);
!  string_agg 
! ------------
!  bbbbABcccc
! (1 row)
! 
! select string_agg(a,',') from (values(null),(null)) g(a);
!  string_agg 
! ------------
!  
! (1 row)
! 
! -- check some implicit casting cases, as per bug #5564
! select string_agg(distinct f1, ',' order by f1) from varchar_tbl;  -- ok
!  string_agg 
! ------------
!  a,ab,abcd
! (1 row)
! 
! select string_agg(distinct f1::text, ',' order by f1) from varchar_tbl;  -- not ok
! ERROR:  in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
! LINE 1: select string_agg(distinct f1::text, ',' order by f1) from v...
!                                                           ^
! select string_agg(distinct f1, ',' order by f1::text) from varchar_tbl;  -- not ok
! ERROR:  in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
! LINE 1: select string_agg(distinct f1, ',' order by f1::text) from v...
!                                                     ^
! select string_agg(distinct f1::text, ',' order by f1::text) from varchar_tbl;  -- ok
!  string_agg 
! ------------
!  a,ab,abcd
! (1 row)
! 
! -- string_agg bytea tests
! create table bytea_test_table(v bytea);
! select string_agg(v, '') from bytea_test_table;
!  string_agg 
! ------------
!  
! (1 row)
! 
! insert into bytea_test_table values(decode('ff','hex'));
! select string_agg(v, '') from bytea_test_table;
!  string_agg 
! ------------
!  \xff
! (1 row)
! 
! insert into bytea_test_table values(decode('aa','hex'));
! select string_agg(v, '') from bytea_test_table;
!  string_agg 
! ------------
!  \xffaa
! (1 row)
! 
! select string_agg(v, NULL) from bytea_test_table;
!  string_agg 
! ------------
!  \xffaa
! (1 row)
! 
! select string_agg(v, decode('ee', 'hex')) from bytea_test_table;
!  string_agg 
! ------------
!  \xffeeaa
! (1 row)
! 
! drop table bytea_test_table;
! -- FILTER tests
! select min(unique1) filter (where unique1 > 100) from tenk1;
!  min 
! -----
!  101
! (1 row)
! 
! select ten, sum(distinct four) filter (where four::text ~ '123') from onek a
! group by ten;
!  ten | sum 
! -----+-----
!    0 |    
!    1 |    
!    2 |    
!    3 |    
!    4 |    
!    5 |    
!    6 |    
!    7 |    
!    8 |    
!    9 |    
! (10 rows)
! 
! select ten, sum(distinct four) filter (where four > 10) from onek a
! group by ten
! having exists (select 1 from onek b where sum(distinct a.four) = b.four);
!  ten | sum 
! -----+-----
!    0 |    
!    2 |    
!    4 |    
!    6 |    
!    8 |    
! (5 rows)
! 
! select max(foo COLLATE "C") filter (where (bar collate "POSIX") > '0')
! from (values ('a', 'b')) AS v(foo,bar);
!  max 
! -----
!  a
! (1 row)
! 
! -- outer reference in FILTER (PostgreSQL extension)
! select (select count(*)
!         from (values (1)) t0(inner_c))
! from (values (2),(3)) t1(outer_c); -- inner query is aggregation query
!  count 
! -------
!      1
!      1
! (2 rows)
! 
! select (select count(*) filter (where outer_c <> 0)
!         from (values (1)) t0(inner_c))
! from (values (2),(3)) t1(outer_c); -- outer query is aggregation query
!  count 
! -------
!      2
! (1 row)
! 
! select (select count(inner_c) filter (where outer_c <> 0)
!         from (values (1)) t0(inner_c))
! from (values (2),(3)) t1(outer_c); -- inner query is aggregation query
!  count 
! -------
!      1
!      1
! (2 rows)
! 
! select
!   (select max((select i.unique2 from tenk1 i where i.unique1 = o.unique1))
!      filter (where o.unique1 < 10))
! from tenk1 o;					-- outer query is aggregation query
!  max  
! ------
!  9998
! (1 row)
! 
! -- subquery in FILTER clause (PostgreSQL extension)
! select sum(unique1) FILTER (WHERE
!   unique1 IN (SELECT unique1 FROM onek where unique1 < 100)) FROM tenk1;
!  sum  
! ------
!  4950
! (1 row)
! 
! -- exercise lots of aggregate parts with FILTER
! select aggfns(distinct a,b,c order by a,c using ~<~,b) filter (where a > 1)
!     from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
!     generate_series(1,2) i;
!           aggfns           
! ---------------------------
!  {"(2,2,bar)","(3,1,baz)"}
! (1 row)
! 
! -- ordered-set aggregates
! select p, percentile_cont(p) within group (order by x::float8)
! from generate_series(1,5) x,
!      (values (0::float8),(0.1),(0.25),(0.4),(0.5),(0.6),(0.75),(0.9),(1)) v(p)
! group by p order by p;
!   p   | percentile_cont 
! ------+-----------------
!     0 |               1
!   0.1 |             1.4
!  0.25 |               2
!   0.4 |             2.6
!   0.5 |               3
!   0.6 |             3.4
!  0.75 |               4
!   0.9 |             4.6
!     1 |               5
! (9 rows)
! 
! select p, percentile_cont(p order by p) within group (order by x)  -- error
! from generate_series(1,5) x,
!      (values (0::float8),(0.1),(0.25),(0.4),(0.5),(0.6),(0.75),(0.9),(1)) v(p)
! group by p order by p;
! ERROR:  cannot use multiple ORDER BY clauses with WITHIN GROUP
! LINE 1: select p, percentile_cont(p order by p) within group (order ...
!                                                 ^
! select p, sum() within group (order by x::float8)  -- error
! from generate_series(1,5) x,
!      (values (0::float8),(0.1),(0.25),(0.4),(0.5),(0.6),(0.75),(0.9),(1)) v(p)
! group by p order by p;
! ERROR:  sum is not an ordered-set aggregate, so it cannot have WITHIN GROUP
! LINE 1: select p, sum() within group (order by x::float8)  
!                   ^
! select p, percentile_cont(p,p)  -- error
! from generate_series(1,5) x,
!      (values (0::float8),(0.1),(0.25),(0.4),(0.5),(0.6),(0.75),(0.9),(1)) v(p)
! group by p order by p;
! ERROR:  WITHIN GROUP is required for ordered-set aggregate percentile_cont
! LINE 1: select p, percentile_cont(p,p)  
!                   ^
! select percentile_cont(0.5) within group (order by b) from aggtest;
!  percentile_cont  
! ------------------
!  53.4485001564026
! (1 row)
! 
! select percentile_cont(0.5) within group (order by b), sum(b) from aggtest;
!  percentile_cont  |   sum   
! ------------------+---------
!  53.4485001564026 | 431.773
! (1 row)
! 
! select percentile_cont(0.5) within group (order by thousand) from tenk1;
!  percentile_cont 
! -----------------
!            499.5
! (1 row)
! 
! select percentile_disc(0.5) within group (order by thousand) from tenk1;
!  percentile_disc 
! -----------------
!              499
! (1 row)
! 
! select rank(3) within group (order by x)
! from (values (1),(1),(2),(2),(3),(3),(4)) v(x);
!  rank 
! ------
!     5
! (1 row)
! 
! select cume_dist(3) within group (order by x)
! from (values (1),(1),(2),(2),(3),(3),(4)) v(x);
!  cume_dist 
! -----------
!      0.875
! (1 row)
! 
! select percent_rank(3) within group (order by x)
! from (values (1),(1),(2),(2),(3),(3),(4),(5)) v(x);
!  percent_rank 
! --------------
!           0.5
! (1 row)
! 
! select dense_rank(3) within group (order by x)
! from (values (1),(1),(2),(2),(3),(3),(4)) v(x);
!  dense_rank 
! ------------
!           3
! (1 row)
! 
! select percentile_disc(array[0,0.1,0.25,0.5,0.75,0.9,1]) within group (order by thousand)
! from tenk1;
!       percentile_disc       
! ----------------------------
!  {0,99,249,499,749,899,999}
! (1 row)
! 
! select percentile_cont(array[0,0.25,0.5,0.75,1]) within group (order by thousand)
! from tenk1;
!        percentile_cont       
! -----------------------------
!  {0,249.75,499.5,749.25,999}
! (1 row)
! 
! select percentile_disc(array[[null,1,0.5],[0.75,0.25,null]]) within group (order by thousand)
! from tenk1;
!          percentile_disc         
! ---------------------------------
!  {{NULL,999,499},{749,249,NULL}}
! (1 row)
! 
! select percentile_cont(array[0,1,0.25,0.75,0.5,1]) within group (order by x)
! from generate_series(1,6) x;
!     percentile_cont    
! -----------------------
!  {1,6,2.25,4.75,3.5,6}
! (1 row)
! 
! select ten, mode() within group (order by string4) from tenk1 group by ten;
!  ten |  mode  
! -----+--------
!    0 | HHHHxx
!    1 | OOOOxx
!    2 | VVVVxx
!    3 | OOOOxx
!    4 | HHHHxx
!    5 | HHHHxx
!    6 | OOOOxx
!    7 | AAAAxx
!    8 | VVVVxx
!    9 | VVVVxx
! (10 rows)
! 
! select percentile_disc(array[0.25,0.5,0.75]) within group (order by x)
! from unnest('{fred,jim,fred,jack,jill,fred,jill,jim,jim,sheila,jim,sheila}'::text[]) u(x);
!  percentile_disc 
! -----------------
!  {fred,jill,jim}
! (1 row)
! 
! -- check collation propagates up in suitable cases:
! select pg_collation_for(percentile_disc(1) within group (order by x collate "POSIX"))
!   from (values ('fred'),('jim')) v(x);
!  pg_collation_for 
! ------------------
!  "POSIX"
! (1 row)
! 
! -- ordered-set aggs created with CREATE AGGREGATE
! select test_rank(3) within group (order by x)
! from (values (1),(1),(2),(2),(3),(3),(4)) v(x);
!  test_rank 
! -----------
!          5
! (1 row)
! 
! select test_percentile_disc(0.5) within group (order by thousand) from tenk1;
!  test_percentile_disc 
! ----------------------
!                   499
! (1 row)
! 
! -- ordered-set aggs can't use ungrouped vars in direct args:
! select rank(x) within group (order by x) from generate_series(1,5) x;
! ERROR:  column "x.x" must appear in the GROUP BY clause or be used in an aggregate function
! LINE 1: select rank(x) within group (order by x) from generate_serie...
!                     ^
! DETAIL:  Direct arguments of an ordered-set aggregate must use only grouped columns.
! -- outer-level agg can't use a grouped arg of a lower level, either:
! select array(select percentile_disc(a) within group (order by x)
!                from (values (0.3),(0.7)) v(a) group by a)
!   from generate_series(1,5) g(x);
! ERROR:  outer-level aggregate cannot contain a lower-level variable in its direct arguments
! LINE 1: select array(select percentile_disc(a) within group (order b...
!                                             ^
! -- agg in the direct args is a grouping violation, too:
! select rank(sum(x)) within group (order by x) from generate_series(1,5) x;
! ERROR:  aggregate function calls cannot be nested
! LINE 1: select rank(sum(x)) within group (order by x) from generate_...
!                     ^
! -- hypothetical-set type unification and argument-count failures:
! select rank(3) within group (order by x) from (values ('fred'),('jim')) v(x);
! ERROR:  WITHIN GROUP types text and integer cannot be matched
! LINE 1: select rank(3) within group (order by x) from (values ('fred...
!                     ^
! select rank(3) within group (order by stringu1,stringu2) from tenk1;
! ERROR:  function rank(integer, name, name) does not exist
! LINE 1: select rank(3) within group (order by stringu1,stringu2) fro...
!                ^
! HINT:  To use the hypothetical-set aggregate rank, the number of hypothetical direct arguments (here 1) must match the number of ordering columns (here 2).
! select rank('fred') within group (order by x) from generate_series(1,5) x;
! ERROR:  invalid input syntax for integer: "fred"
! LINE 1: select rank('fred') within group (order by x) from generate_...
!                     ^
! select rank('adam'::text collate "C") within group (order by x collate "POSIX")
!   from (values ('fred'),('jim')) v(x);
! ERROR:  collation mismatch between explicit collations "C" and "POSIX"
! LINE 1: ...adam'::text collate "C") within group (order by x collate "P...
!                                                              ^
! -- hypothetical-set type unification successes:
! select rank('adam'::varchar) within group (order by x) from (values ('fred'),('jim')) v(x);
!  rank 
! ------
!     1
! (1 row)
! 
! select rank('3') within group (order by x) from generate_series(1,5) x;
!  rank 
! ------
!     3
! (1 row)
! 
! -- divide by zero check
! select percent_rank(0) within group (order by x) from generate_series(1,0) x;
!  percent_rank 
! --------------
!             0
! (1 row)
! 
! -- deparse and multiple features:
! create view aggordview1 as
! select ten,
!        percentile_disc(0.5) within group (order by thousand) as p50,
!        percentile_disc(0.5) within group (order by thousand) filter (where hundred=1) as px,
!        rank(5,'AZZZZ',50) within group (order by hundred, string4 desc, hundred)
!   from tenk1
!  group by ten order by ten;
! select pg_get_viewdef('aggordview1');
!                                                         pg_get_viewdef                                                         
! -------------------------------------------------------------------------------------------------------------------------------
!   SELECT tenk1.ten,                                                                                                           +
!      percentile_disc((0.5)::double precision) WITHIN GROUP (ORDER BY tenk1.thousand) AS p50,                                  +
!      percentile_disc((0.5)::double precision) WITHIN GROUP (ORDER BY tenk1.thousand) FILTER (WHERE (tenk1.hundred = 1)) AS px,+
!      rank(5, 'AZZZZ'::name, 50) WITHIN GROUP (ORDER BY tenk1.hundred, tenk1.string4 DESC, tenk1.hundred) AS rank              +
!     FROM tenk1                                                                                                                +
!    GROUP BY tenk1.ten                                                                                                         +
!    ORDER BY tenk1.ten;
! (1 row)
! 
! select * from aggordview1 order by ten;
!  ten | p50 | px  | rank 
! -----+-----+-----+------
!    0 | 490 |     |  101
!    1 | 491 | 401 |  101
!    2 | 492 |     |  101
!    3 | 493 |     |  101
!    4 | 494 |     |  101
!    5 | 495 |     |   67
!    6 | 496 |     |    1
!    7 | 497 |     |    1
!    8 | 498 |     |    1
!    9 | 499 |     |    1
! (10 rows)
! 
! drop view aggordview1;
! -- variadic aggregates
! select least_agg(q1,q2) from int8_tbl;
!      least_agg     
! -------------------
!  -4567890123456789
! (1 row)
! 
! select least_agg(variadic array[q1,q2]) from int8_tbl;
!      least_agg     
! -------------------
!  -4567890123456789
! (1 row)
! 
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/transactions.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/transactions.out	Tue Oct 28 15:53:05 2014
***************
*** 1,623 ****
! --
! -- TRANSACTIONS
! --
! BEGIN;
! SELECT *
!    INTO TABLE xacttest
!    FROM aggtest;
! INSERT INTO xacttest (a, b) VALUES (777, 777.777);
! END;
! -- should retrieve one value--
! SELECT a FROM xacttest WHERE a > 100;
!   a  
! -----
!  777
! (1 row)
! 
! BEGIN;
! CREATE TABLE disappear (a int4);
! DELETE FROM aggtest;
! -- should be empty
! SELECT * FROM aggtest;
!  a | b 
! ---+---
! (0 rows)
! 
! ABORT;
! -- should not exist
! SELECT oid FROM pg_class WHERE relname = 'disappear';
!  oid 
! -----
! (0 rows)
! 
! -- should have members again
! SELECT * FROM aggtest;
!   a  |    b    
! -----+---------
!   56 |     7.8
!  100 |  99.097
!    0 | 0.09561
!   42 |  324.78
! (4 rows)
! 
! -- Read-only tests
! CREATE TABLE writetest (a int);
! CREATE TEMPORARY TABLE temptest (a int);
! BEGIN;
! SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY, DEFERRABLE; -- ok
! SELECT * FROM writetest; -- ok
!  a 
! ---
! (0 rows)
! 
! SET TRANSACTION READ WRITE; --fail
! ERROR:  transaction read-write mode must be set before any query
! COMMIT;
! BEGIN;
! SET TRANSACTION READ ONLY; -- ok
! SET TRANSACTION READ WRITE; -- ok
! SET TRANSACTION READ ONLY; -- ok
! SELECT * FROM writetest; -- ok
!  a 
! ---
! (0 rows)
! 
! SAVEPOINT x;
! SET TRANSACTION READ ONLY; -- ok
! SELECT * FROM writetest; -- ok
!  a 
! ---
! (0 rows)
! 
! SET TRANSACTION READ ONLY; -- ok
! SET TRANSACTION READ WRITE; --fail
! ERROR:  cannot set transaction read-write mode inside a read-only transaction
! COMMIT;
! BEGIN;
! SET TRANSACTION READ WRITE; -- ok
! SAVEPOINT x;
! SET TRANSACTION READ WRITE; -- ok
! SET TRANSACTION READ ONLY; -- ok
! SELECT * FROM writetest; -- ok
!  a 
! ---
! (0 rows)
! 
! SET TRANSACTION READ ONLY; -- ok
! SET TRANSACTION READ WRITE; --fail
! ERROR:  cannot set transaction read-write mode inside a read-only transaction
! COMMIT;
! BEGIN;
! SET TRANSACTION READ WRITE; -- ok
! SAVEPOINT x;
! SET TRANSACTION READ ONLY; -- ok
! SELECT * FROM writetest; -- ok
!  a 
! ---
! (0 rows)
! 
! ROLLBACK TO SAVEPOINT x;
! SHOW transaction_read_only;  -- off
!  transaction_read_only 
! -----------------------
!  off
! (1 row)
! 
! SAVEPOINT y;
! SET TRANSACTION READ ONLY; -- ok
! SELECT * FROM writetest; -- ok
!  a 
! ---
! (0 rows)
! 
! RELEASE SAVEPOINT y;
! SHOW transaction_read_only;  -- off
!  transaction_read_only 
! -----------------------
!  off
! (1 row)
! 
! COMMIT;
! SET SESSION CHARACTERISTICS AS TRANSACTION READ ONLY;
! DROP TABLE writetest; -- fail
! ERROR:  cannot execute DROP TABLE in a read-only transaction
! INSERT INTO writetest VALUES (1); -- fail
! ERROR:  cannot execute INSERT in a read-only transaction
! SELECT * FROM writetest; -- ok
!  a 
! ---
! (0 rows)
! 
! DELETE FROM temptest; -- ok
! UPDATE temptest SET a = 0 FROM writetest WHERE temptest.a = 1 AND writetest.a = temptest.a; -- ok
! PREPARE test AS UPDATE writetest SET a = 0; -- ok
! EXECUTE test; -- fail
! ERROR:  cannot execute UPDATE in a read-only transaction
! SELECT * FROM writetest, temptest; -- ok
!  a | a 
! ---+---
! (0 rows)
! 
! CREATE TABLE test AS SELECT * FROM writetest; -- fail
! ERROR:  cannot execute CREATE TABLE AS in a read-only transaction
! START TRANSACTION READ WRITE;
! DROP TABLE writetest; -- ok
! COMMIT;
! -- Subtransactions, basic tests
! -- create & drop tables
! SET SESSION CHARACTERISTICS AS TRANSACTION READ WRITE;
! CREATE TABLE foobar (a int);
! BEGIN;
! 	CREATE TABLE foo (a int);
! 	SAVEPOINT one;
! 		DROP TABLE foo;
! 		CREATE TABLE bar (a int);
! 	ROLLBACK TO SAVEPOINT one;
! 	RELEASE SAVEPOINT one;
! 	SAVEPOINT two;
! 		CREATE TABLE baz (a int);
! 	RELEASE SAVEPOINT two;
! 	drop TABLE foobar;
! 	CREATE TABLE barbaz (a int);
! COMMIT;
! -- should exist: barbaz, baz, foo
! SELECT * FROM foo;		-- should be empty
!  a 
! ---
! (0 rows)
! 
! SELECT * FROM bar;		-- shouldn't exist
! ERROR:  relation "bar" does not exist
! LINE 1: SELECT * FROM bar;
!                       ^
! SELECT * FROM barbaz;	-- should be empty
!  a 
! ---
! (0 rows)
! 
! SELECT * FROM baz;		-- should be empty
!  a 
! ---
! (0 rows)
! 
! -- inserts
! BEGIN;
! 	INSERT INTO foo VALUES (1);
! 	SAVEPOINT one;
! 		INSERT into bar VALUES (1);
! ERROR:  relation "bar" does not exist
! LINE 1: INSERT into bar VALUES (1);
!                     ^
! 	ROLLBACK TO one;
! 	RELEASE SAVEPOINT one;
! 	SAVEPOINT two;
! 		INSERT into barbaz VALUES (1);
! 	RELEASE two;
! 	SAVEPOINT three;
! 		SAVEPOINT four;
! 			INSERT INTO foo VALUES (2);
! 		RELEASE SAVEPOINT four;
! 	ROLLBACK TO SAVEPOINT three;
! 	RELEASE SAVEPOINT three;
! 	INSERT INTO foo VALUES (3);
! COMMIT;
! SELECT * FROM foo;		-- should have 1 and 3
!  a 
! ---
!  1
!  3
! (2 rows)
! 
! SELECT * FROM barbaz;	-- should have 1
!  a 
! ---
!  1
! (1 row)
! 
! -- test whole-tree commit
! BEGIN;
! 	SAVEPOINT one;
! 		SELECT foo;
! ERROR:  column "foo" does not exist
! LINE 1: SELECT foo;
!                ^
! 	ROLLBACK TO SAVEPOINT one;
! 	RELEASE SAVEPOINT one;
! 	SAVEPOINT two;
! 		CREATE TABLE savepoints (a int);
! 		SAVEPOINT three;
! 			INSERT INTO savepoints VALUES (1);
! 			SAVEPOINT four;
! 				INSERT INTO savepoints VALUES (2);
! 				SAVEPOINT five;
! 					INSERT INTO savepoints VALUES (3);
! 				ROLLBACK TO SAVEPOINT five;
! COMMIT;
! COMMIT;		-- should not be in a transaction block
! WARNING:  there is no transaction in progress
! SELECT * FROM savepoints;
!  a 
! ---
!  1
!  2
! (2 rows)
! 
! -- test whole-tree rollback
! BEGIN;
! 	SAVEPOINT one;
! 		DELETE FROM savepoints WHERE a=1;
! 	RELEASE SAVEPOINT one;
! 	SAVEPOINT two;
! 		DELETE FROM savepoints WHERE a=1;
! 		SAVEPOINT three;
! 			DELETE FROM savepoints WHERE a=2;
! ROLLBACK;
! COMMIT;		-- should not be in a transaction block
! WARNING:  there is no transaction in progress
! SELECT * FROM savepoints;
!  a 
! ---
!  1
!  2
! (2 rows)
! 
! -- test whole-tree commit on an aborted subtransaction
! BEGIN;
! 	INSERT INTO savepoints VALUES (4);
! 	SAVEPOINT one;
! 		INSERT INTO savepoints VALUES (5);
! 		SELECT foo;
! ERROR:  column "foo" does not exist
! LINE 1: SELECT foo;
!                ^
! COMMIT;
! SELECT * FROM savepoints;
!  a 
! ---
!  1
!  2
! (2 rows)
! 
! BEGIN;
! 	INSERT INTO savepoints VALUES (6);
! 	SAVEPOINT one;
! 		INSERT INTO savepoints VALUES (7);
! 	RELEASE SAVEPOINT one;
! 	INSERT INTO savepoints VALUES (8);
! COMMIT;
! -- rows 6 and 8 should have been created by the same xact
! SELECT a.xmin = b.xmin FROM savepoints a, savepoints b WHERE a.a=6 AND b.a=8;
!  ?column? 
! ----------
!  t
! (1 row)
! 
! -- rows 6 and 7 should have been created by different xacts
! SELECT a.xmin = b.xmin FROM savepoints a, savepoints b WHERE a.a=6 AND b.a=7;
!  ?column? 
! ----------
!  f
! (1 row)
! 
! BEGIN;
! 	INSERT INTO savepoints VALUES (9);
! 	SAVEPOINT one;
! 		INSERT INTO savepoints VALUES (10);
! 	ROLLBACK TO SAVEPOINT one;
! 		INSERT INTO savepoints VALUES (11);
! COMMIT;
! SELECT a FROM savepoints WHERE a in (9, 10, 11);
!  a  
! ----
!   9
!  11
! (2 rows)
! 
! -- rows 9 and 11 should have been created by different xacts
! SELECT a.xmin = b.xmin FROM savepoints a, savepoints b WHERE a.a=9 AND b.a=11;
!  ?column? 
! ----------
!  f
! (1 row)
! 
! BEGIN;
! 	INSERT INTO savepoints VALUES (12);
! 	SAVEPOINT one;
! 		INSERT INTO savepoints VALUES (13);
! 		SAVEPOINT two;
! 			INSERT INTO savepoints VALUES (14);
! 	ROLLBACK TO SAVEPOINT one;
! 		INSERT INTO savepoints VALUES (15);
! 		SAVEPOINT two;
! 			INSERT INTO savepoints VALUES (16);
! 			SAVEPOINT three;
! 				INSERT INTO savepoints VALUES (17);
! COMMIT;
! SELECT a FROM savepoints WHERE a BETWEEN 12 AND 17;
!  a  
! ----
!  12
!  15
!  16
!  17
! (4 rows)
! 
! BEGIN;
! 	INSERT INTO savepoints VALUES (18);
! 	SAVEPOINT one;
! 		INSERT INTO savepoints VALUES (19);
! 		SAVEPOINT two;
! 			INSERT INTO savepoints VALUES (20);
! 	ROLLBACK TO SAVEPOINT one;
! 		INSERT INTO savepoints VALUES (21);
! 	ROLLBACK TO SAVEPOINT one;
! 		INSERT INTO savepoints VALUES (22);
! COMMIT;
! SELECT a FROM savepoints WHERE a BETWEEN 18 AND 22;
!  a  
! ----
!  18
!  22
! (2 rows)
! 
! DROP TABLE savepoints;
! -- only in a transaction block:
! SAVEPOINT one;
! ERROR:  SAVEPOINT can only be used in transaction blocks
! ROLLBACK TO SAVEPOINT one;
! ERROR:  ROLLBACK TO SAVEPOINT can only be used in transaction blocks
! RELEASE SAVEPOINT one;
! ERROR:  RELEASE SAVEPOINT can only be used in transaction blocks
! -- Only "rollback to" allowed in aborted state
! BEGIN;
!   SAVEPOINT one;
!   SELECT 0/0;
! ERROR:  division by zero
!   SAVEPOINT two;    -- ignored till the end of ...
! ERROR:  current transaction is aborted, commands ignored until end of transaction block
!   RELEASE SAVEPOINT one;      -- ignored till the end of ...
! ERROR:  current transaction is aborted, commands ignored until end of transaction block
!   ROLLBACK TO SAVEPOINT one;
!   SELECT 1;
!  ?column? 
! ----------
!         1
! (1 row)
! 
! COMMIT;
! SELECT 1;			-- this should work
!  ?column? 
! ----------
!         1
! (1 row)
! 
! -- check non-transactional behavior of cursors
! BEGIN;
! 	DECLARE c CURSOR FOR SELECT unique2 FROM tenk1 ORDER BY unique2;
! 	SAVEPOINT one;
! 		FETCH 10 FROM c;
!  unique2 
! ---------
!        0
!        1
!        2
!        3
!        4
!        5
!        6
!        7
!        8
!        9
! (10 rows)
! 
! 	ROLLBACK TO SAVEPOINT one;
! 		FETCH 10 FROM c;
!  unique2 
! ---------
!       10
!       11
!       12
!       13
!       14
!       15
!       16
!       17
!       18
!       19
! (10 rows)
! 
! 	RELEASE SAVEPOINT one;
! 	FETCH 10 FROM c;
!  unique2 
! ---------
!       20
!       21
!       22
!       23
!       24
!       25
!       26
!       27
!       28
!       29
! (10 rows)
! 
! 	CLOSE c;
! 	DECLARE c CURSOR FOR SELECT unique2/0 FROM tenk1 ORDER BY unique2;
! 	SAVEPOINT two;
! 		FETCH 10 FROM c;
! ERROR:  division by zero
! 	ROLLBACK TO SAVEPOINT two;
! 	-- c is now dead to the world ...
! 		FETCH 10 FROM c;
! ERROR:  portal "c" cannot be run
! 	ROLLBACK TO SAVEPOINT two;
! 	RELEASE SAVEPOINT two;
! 	FETCH 10 FROM c;
! ERROR:  portal "c" cannot be run
! COMMIT;
! --
! -- Check that "stable" functions are really stable.  They should not be
! -- able to see the partial results of the calling query.  (Ideally we would
! -- also check that they don't see commits of concurrent transactions, but
! -- that's a mite hard to do within the limitations of pg_regress.)
! --
! select * from xacttest;
!   a  |    b    
! -----+---------
!   56 |     7.8
!  100 |  99.097
!    0 | 0.09561
!   42 |  324.78
!  777 | 777.777
! (5 rows)
! 
! create or replace function max_xacttest() returns smallint language sql as
! 'select max(a) from xacttest' stable;
! begin;
! update xacttest set a = max_xacttest() + 10 where a > 0;
! select * from xacttest;
!   a  |    b    
! -----+---------
!    0 | 0.09561
!  787 |     7.8
!  787 |  99.097
!  787 |  324.78
!  787 | 777.777
! (5 rows)
! 
! rollback;
! -- But a volatile function can see the partial results of the calling query
! create or replace function max_xacttest() returns smallint language sql as
! 'select max(a) from xacttest' volatile;
! begin;
! update xacttest set a = max_xacttest() + 10 where a > 0;
! select * from xacttest;
!   a  |    b    
! -----+---------
!    0 | 0.09561
!  787 |     7.8
!  797 |  99.097
!  807 |  324.78
!  817 | 777.777
! (5 rows)
! 
! rollback;
! -- Now the same test with plpgsql (since it depends on SPI which is different)
! create or replace function max_xacttest() returns smallint language plpgsql as
! 'begin return max(a) from xacttest; end' stable;
! begin;
! update xacttest set a = max_xacttest() + 10 where a > 0;
! select * from xacttest;
!   a  |    b    
! -----+---------
!    0 | 0.09561
!  787 |     7.8
!  787 |  99.097
!  787 |  324.78
!  787 | 777.777
! (5 rows)
! 
! rollback;
! create or replace function max_xacttest() returns smallint language plpgsql as
! 'begin return max(a) from xacttest; end' volatile;
! begin;
! update xacttest set a = max_xacttest() + 10 where a > 0;
! select * from xacttest;
!   a  |    b    
! -----+---------
!    0 | 0.09561
!  787 |     7.8
!  797 |  99.097
!  807 |  324.78
!  817 | 777.777
! (5 rows)
! 
! rollback;
! -- test case for problems with dropping an open relation during abort
! BEGIN;
! 	savepoint x;
! 		CREATE TABLE koju (a INT UNIQUE);
! 		INSERT INTO koju VALUES (1);
! 		INSERT INTO koju VALUES (1);
! ERROR:  duplicate key value violates unique constraint "koju_a_key"
! DETAIL:  Key (a)=(1) already exists.
! 	rollback to x;
! 	CREATE TABLE koju (a INT UNIQUE);
! 	INSERT INTO koju VALUES (1);
! 	INSERT INTO koju VALUES (1);
! ERROR:  duplicate key value violates unique constraint "koju_a_key"
! DETAIL:  Key (a)=(1) already exists.
! ROLLBACK;
! DROP TABLE foo;
! DROP TABLE baz;
! DROP TABLE barbaz;
! -- test case for problems with revalidating an open relation during abort
! create function inverse(int) returns float8 as
! $$
! begin
!   analyze revalidate_bug;
!   return 1::float8/$1;
! exception
!   when division_by_zero then return 0;
! end$$ language plpgsql volatile;
! create table revalidate_bug (c float8 unique);
! insert into revalidate_bug values (1);
! insert into revalidate_bug values (inverse(0));
! drop table revalidate_bug;
! drop function inverse(int);
! -- verify that cursors created during an aborted subtransaction are
! -- closed, but that we do not rollback the effect of any FETCHs
! -- performed in the aborted subtransaction
! begin;
! savepoint x;
! create table abc (a int);
! insert into abc values (5);
! insert into abc values (10);
! declare foo cursor for select * from abc;
! fetch from foo;
!  a 
! ---
!  5
! (1 row)
! 
! rollback to x;
! -- should fail
! fetch from foo;
! ERROR:  cursor "foo" does not exist
! commit;
! begin;
! create table abc (a int);
! insert into abc values (5);
! insert into abc values (10);
! insert into abc values (15);
! declare foo cursor for select * from abc;
! fetch from foo;
!  a 
! ---
!  5
! (1 row)
! 
! savepoint x;
! fetch from foo;
!  a  
! ----
!  10
! (1 row)
! 
! rollback to x;
! fetch from foo;
!  a  
! ----
!  15
! (1 row)
! 
! abort;
! -- Test for successful cleanup of an aborted transaction at session exit.
! -- THIS MUST BE THE LAST TEST IN THIS FILE.
! begin;
! select 1/0;
! ERROR:  division by zero
! rollback to X;
! ERROR:  no such savepoint
! -- DO NOT ADD ANYTHING HERE.
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/random.out	Sun Oct  3 21:26:00 2010
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/random.out	Tue Oct 28 15:53:05 2014
***************
*** 1,52 ****
! --
! -- RANDOM
! -- Test the random function
! --
! -- count the number of tuples originally, should be 1000
! SELECT count(*) FROM onek;
!  count 
! -------
!   1000
! (1 row)
! 
! -- pick three random rows, they shouldn't match
! (SELECT unique1 AS random
!   FROM onek ORDER BY random() LIMIT 1)
! INTERSECT
! (SELECT unique1 AS random
!   FROM onek ORDER BY random() LIMIT 1)
! INTERSECT
! (SELECT unique1 AS random
!   FROM onek ORDER BY random() LIMIT 1);
!  random 
! --------
! (0 rows)
! 
! -- count roughly 1/10 of the tuples
! SELECT count(*) AS random INTO RANDOM_TBL
!   FROM onek WHERE random() < 1.0/10;
! -- select again, the count should be different
! INSERT INTO RANDOM_TBL (random)
!   SELECT count(*)
!   FROM onek WHERE random() < 1.0/10;
! -- select again, the count should be different
! INSERT INTO RANDOM_TBL (random)
!   SELECT count(*)
!   FROM onek WHERE random() < 1.0/10;
! -- select again, the count should be different
! INSERT INTO RANDOM_TBL (random)
!   SELECT count(*)
!   FROM onek WHERE random() < 1.0/10;
! -- now test that they are different counts
! SELECT random, count(random) FROM RANDOM_TBL
!   GROUP BY random HAVING count(random) > 3;
!  random | count 
! --------+-------
! (0 rows)
! 
! SELECT AVG(random) FROM RANDOM_TBL
!   HAVING AVG(random) NOT BETWEEN 80 AND 120;
!  avg 
! -----
! (0 rows)
! 
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/portals.out	Thu Oct 16 14:31:37 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/portals.out	Tue Oct 28 15:53:05 2014
***************
*** 1,1287 ****
! --
! -- Cursor regression tests
! --
! BEGIN;
! DECLARE foo1 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! DECLARE foo2 SCROLL CURSOR FOR SELECT * FROM tenk2;
! DECLARE foo3 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! DECLARE foo4 SCROLL CURSOR FOR SELECT * FROM tenk2;
! DECLARE foo5 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! DECLARE foo6 SCROLL CURSOR FOR SELECT * FROM tenk2;
! DECLARE foo7 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! DECLARE foo8 SCROLL CURSOR FOR SELECT * FROM tenk2;
! DECLARE foo9 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! DECLARE foo10 SCROLL CURSOR FOR SELECT * FROM tenk2;
! DECLARE foo11 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! DECLARE foo12 SCROLL CURSOR FOR SELECT * FROM tenk2;
! DECLARE foo13 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! DECLARE foo14 SCROLL CURSOR FOR SELECT * FROM tenk2;
! DECLARE foo15 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! DECLARE foo16 SCROLL CURSOR FOR SELECT * FROM tenk2;
! DECLARE foo17 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! DECLARE foo18 SCROLL CURSOR FOR SELECT * FROM tenk2;
! DECLARE foo19 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! DECLARE foo20 SCROLL CURSOR FOR SELECT * FROM tenk2;
! DECLARE foo21 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! DECLARE foo22 SCROLL CURSOR FOR SELECT * FROM tenk2;
! DECLARE foo23 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! FETCH 1 in foo1;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (1 row)
! 
! FETCH 2 in foo2;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
! (2 rows)
! 
! FETCH 3 in foo3;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
! (3 rows)
! 
! FETCH 4 in foo4;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
! (4 rows)
! 
! FETCH 5 in foo5;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
! (5 rows)
! 
! FETCH 6 in foo6;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
! (6 rows)
! 
! FETCH 7 in foo7;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
! (7 rows)
! 
! FETCH 8 in foo8;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
! (8 rows)
! 
! FETCH 9 in foo9;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
! (9 rows)
! 
! FETCH 10 in foo10;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
! (10 rows)
! 
! FETCH 11 in foo11;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
! (11 rows)
! 
! FETCH 12 in foo12;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
! (12 rows)
! 
! FETCH 13 in foo13;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
! (13 rows)
! 
! FETCH 14 in foo14;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
! (14 rows)
! 
! FETCH 15 in foo15;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
! (15 rows)
! 
! FETCH 16 in foo16;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
!     5006 |      15 |   0 |    2 |   6 |      6 |       6 |        6 |        1006 |         6 |     5006 |  12 |   13 | OKAAAA   | PAAAAA   | VVVVxx
! (16 rows)
! 
! FETCH 17 in foo17;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
!     5006 |      15 |   0 |    2 |   6 |      6 |       6 |        6 |        1006 |         6 |     5006 |  12 |   13 | OKAAAA   | PAAAAA   | VVVVxx
!     5387 |      16 |   1 |    3 |   7 |      7 |      87 |      387 |        1387 |       387 |     5387 | 174 |  175 | FZAAAA   | QAAAAA   | AAAAxx
! (17 rows)
! 
! FETCH 18 in foo18;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
!     5006 |      15 |   0 |    2 |   6 |      6 |       6 |        6 |        1006 |         6 |     5006 |  12 |   13 | OKAAAA   | PAAAAA   | VVVVxx
!     5387 |      16 |   1 |    3 |   7 |      7 |      87 |      387 |        1387 |       387 |     5387 | 174 |  175 | FZAAAA   | QAAAAA   | AAAAxx
!     5785 |      17 |   1 |    1 |   5 |      5 |      85 |      785 |        1785 |       785 |     5785 | 170 |  171 | NOAAAA   | RAAAAA   | HHHHxx
! (18 rows)
! 
! FETCH 19 in foo19;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
!     5006 |      15 |   0 |    2 |   6 |      6 |       6 |        6 |        1006 |         6 |     5006 |  12 |   13 | OKAAAA   | PAAAAA   | VVVVxx
!     5387 |      16 |   1 |    3 |   7 |      7 |      87 |      387 |        1387 |       387 |     5387 | 174 |  175 | FZAAAA   | QAAAAA   | AAAAxx
!     5785 |      17 |   1 |    1 |   5 |      5 |      85 |      785 |        1785 |       785 |     5785 | 170 |  171 | NOAAAA   | RAAAAA   | HHHHxx
!     6621 |      18 |   1 |    1 |   1 |      1 |      21 |      621 |         621 |      1621 |     6621 |  42 |   43 | RUAAAA   | SAAAAA   | OOOOxx
! (19 rows)
! 
! FETCH 20 in foo20;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
!     5006 |      15 |   0 |    2 |   6 |      6 |       6 |        6 |        1006 |         6 |     5006 |  12 |   13 | OKAAAA   | PAAAAA   | VVVVxx
!     5387 |      16 |   1 |    3 |   7 |      7 |      87 |      387 |        1387 |       387 |     5387 | 174 |  175 | FZAAAA   | QAAAAA   | AAAAxx
!     5785 |      17 |   1 |    1 |   5 |      5 |      85 |      785 |        1785 |       785 |     5785 | 170 |  171 | NOAAAA   | RAAAAA   | HHHHxx
!     6621 |      18 |   1 |    1 |   1 |      1 |      21 |      621 |         621 |      1621 |     6621 |  42 |   43 | RUAAAA   | SAAAAA   | OOOOxx
!     6969 |      19 |   1 |    1 |   9 |      9 |      69 |      969 |         969 |      1969 |     6969 | 138 |  139 | BIAAAA   | TAAAAA   | VVVVxx
! (20 rows)
! 
! FETCH 21 in foo21;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
!     5006 |      15 |   0 |    2 |   6 |      6 |       6 |        6 |        1006 |         6 |     5006 |  12 |   13 | OKAAAA   | PAAAAA   | VVVVxx
!     5387 |      16 |   1 |    3 |   7 |      7 |      87 |      387 |        1387 |       387 |     5387 | 174 |  175 | FZAAAA   | QAAAAA   | AAAAxx
!     5785 |      17 |   1 |    1 |   5 |      5 |      85 |      785 |        1785 |       785 |     5785 | 170 |  171 | NOAAAA   | RAAAAA   | HHHHxx
!     6621 |      18 |   1 |    1 |   1 |      1 |      21 |      621 |         621 |      1621 |     6621 |  42 |   43 | RUAAAA   | SAAAAA   | OOOOxx
!     6969 |      19 |   1 |    1 |   9 |      9 |      69 |      969 |         969 |      1969 |     6969 | 138 |  139 | BIAAAA   | TAAAAA   | VVVVxx
!     9460 |      20 |   0 |    0 |   0 |      0 |      60 |      460 |        1460 |      4460 |     9460 | 120 |  121 | WZAAAA   | UAAAAA   | AAAAxx
! (21 rows)
! 
! FETCH 22 in foo22;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
!     5006 |      15 |   0 |    2 |   6 |      6 |       6 |        6 |        1006 |         6 |     5006 |  12 |   13 | OKAAAA   | PAAAAA   | VVVVxx
!     5387 |      16 |   1 |    3 |   7 |      7 |      87 |      387 |        1387 |       387 |     5387 | 174 |  175 | FZAAAA   | QAAAAA   | AAAAxx
!     5785 |      17 |   1 |    1 |   5 |      5 |      85 |      785 |        1785 |       785 |     5785 | 170 |  171 | NOAAAA   | RAAAAA   | HHHHxx
!     6621 |      18 |   1 |    1 |   1 |      1 |      21 |      621 |         621 |      1621 |     6621 |  42 |   43 | RUAAAA   | SAAAAA   | OOOOxx
!     6969 |      19 |   1 |    1 |   9 |      9 |      69 |      969 |         969 |      1969 |     6969 | 138 |  139 | BIAAAA   | TAAAAA   | VVVVxx
!     9460 |      20 |   0 |    0 |   0 |      0 |      60 |      460 |        1460 |      4460 |     9460 | 120 |  121 | WZAAAA   | UAAAAA   | AAAAxx
!       59 |      21 |   1 |    3 |   9 |     19 |      59 |       59 |          59 |        59 |       59 | 118 |  119 | HCAAAA   | VAAAAA   | HHHHxx
! (22 rows)
! 
! FETCH 23 in foo23;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
!     5006 |      15 |   0 |    2 |   6 |      6 |       6 |        6 |        1006 |         6 |     5006 |  12 |   13 | OKAAAA   | PAAAAA   | VVVVxx
!     5387 |      16 |   1 |    3 |   7 |      7 |      87 |      387 |        1387 |       387 |     5387 | 174 |  175 | FZAAAA   | QAAAAA   | AAAAxx
!     5785 |      17 |   1 |    1 |   5 |      5 |      85 |      785 |        1785 |       785 |     5785 | 170 |  171 | NOAAAA   | RAAAAA   | HHHHxx
!     6621 |      18 |   1 |    1 |   1 |      1 |      21 |      621 |         621 |      1621 |     6621 |  42 |   43 | RUAAAA   | SAAAAA   | OOOOxx
!     6969 |      19 |   1 |    1 |   9 |      9 |      69 |      969 |         969 |      1969 |     6969 | 138 |  139 | BIAAAA   | TAAAAA   | VVVVxx
!     9460 |      20 |   0 |    0 |   0 |      0 |      60 |      460 |        1460 |      4460 |     9460 | 120 |  121 | WZAAAA   | UAAAAA   | AAAAxx
!       59 |      21 |   1 |    3 |   9 |     19 |      59 |       59 |          59 |        59 |       59 | 118 |  119 | HCAAAA   | VAAAAA   | HHHHxx
!     8020 |      22 |   0 |    0 |   0 |      0 |      20 |       20 |          20 |      3020 |     8020 |  40 |   41 | MWAAAA   | WAAAAA   | OOOOxx
! (23 rows)
! 
! FETCH backward 1 in foo23;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!       59 |      21 |   1 |    3 |   9 |     19 |      59 |       59 |          59 |        59 |       59 | 118 |  119 | HCAAAA   | VAAAAA   | HHHHxx
! (1 row)
! 
! FETCH backward 2 in foo22;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     9460 |      20 |   0 |    0 |   0 |      0 |      60 |      460 |        1460 |      4460 |     9460 | 120 |  121 | WZAAAA   | UAAAAA   | AAAAxx
!     6969 |      19 |   1 |    1 |   9 |      9 |      69 |      969 |         969 |      1969 |     6969 | 138 |  139 | BIAAAA   | TAAAAA   | VVVVxx
! (2 rows)
! 
! FETCH backward 3 in foo21;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     6969 |      19 |   1 |    1 |   9 |      9 |      69 |      969 |         969 |      1969 |     6969 | 138 |  139 | BIAAAA   | TAAAAA   | VVVVxx
!     6621 |      18 |   1 |    1 |   1 |      1 |      21 |      621 |         621 |      1621 |     6621 |  42 |   43 | RUAAAA   | SAAAAA   | OOOOxx
!     5785 |      17 |   1 |    1 |   5 |      5 |      85 |      785 |        1785 |       785 |     5785 | 170 |  171 | NOAAAA   | RAAAAA   | HHHHxx
! (3 rows)
! 
! FETCH backward 4 in foo20;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     6621 |      18 |   1 |    1 |   1 |      1 |      21 |      621 |         621 |      1621 |     6621 |  42 |   43 | RUAAAA   | SAAAAA   | OOOOxx
!     5785 |      17 |   1 |    1 |   5 |      5 |      85 |      785 |        1785 |       785 |     5785 | 170 |  171 | NOAAAA   | RAAAAA   | HHHHxx
!     5387 |      16 |   1 |    3 |   7 |      7 |      87 |      387 |        1387 |       387 |     5387 | 174 |  175 | FZAAAA   | QAAAAA   | AAAAxx
!     5006 |      15 |   0 |    2 |   6 |      6 |       6 |        6 |        1006 |         6 |     5006 |  12 |   13 | OKAAAA   | PAAAAA   | VVVVxx
! (4 rows)
! 
! FETCH backward 5 in foo19;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     5785 |      17 |   1 |    1 |   5 |      5 |      85 |      785 |        1785 |       785 |     5785 | 170 |  171 | NOAAAA   | RAAAAA   | HHHHxx
!     5387 |      16 |   1 |    3 |   7 |      7 |      87 |      387 |        1387 |       387 |     5387 | 174 |  175 | FZAAAA   | QAAAAA   | AAAAxx
!     5006 |      15 |   0 |    2 |   6 |      6 |       6 |        6 |        1006 |         6 |     5006 |  12 |   13 | OKAAAA   | PAAAAA   | VVVVxx
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
! (5 rows)
! 
! FETCH backward 6 in foo18;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     5387 |      16 |   1 |    3 |   7 |      7 |      87 |      387 |        1387 |       387 |     5387 | 174 |  175 | FZAAAA   | QAAAAA   | AAAAxx
!     5006 |      15 |   0 |    2 |   6 |      6 |       6 |        6 |        1006 |         6 |     5006 |  12 |   13 | OKAAAA   | PAAAAA   | VVVVxx
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
! (6 rows)
! 
! FETCH backward 7 in foo17;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     5006 |      15 |   0 |    2 |   6 |      6 |       6 |        6 |        1006 |         6 |     5006 |  12 |   13 | OKAAAA   | PAAAAA   | VVVVxx
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
! (7 rows)
! 
! FETCH backward 8 in foo16;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     5471 |      14 |   1 |    3 |   1 |     11 |      71 |      471 |        1471 |       471 |     5471 | 142 |  143 | LCAAAA   | OAAAAA   | OOOOxx
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
! (8 rows)
! 
! FETCH backward 9 in foo15;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     6243 |      13 |   1 |    3 |   3 |      3 |      43 |      243 |         243 |      1243 |     6243 |  86 |   87 | DGAAAA   | NAAAAA   | HHHHxx
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
! (9 rows)
! 
! FETCH backward 10 in foo14;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     5222 |      12 |   0 |    2 |   2 |      2 |      22 |      222 |        1222 |       222 |     5222 |  44 |   45 | WSAAAA   | MAAAAA   | AAAAxx
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
! (10 rows)
! 
! FETCH backward 11 in foo13;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     1504 |      11 |   0 |    0 |   4 |      4 |       4 |      504 |        1504 |      1504 |     1504 |   8 |    9 | WFAAAA   | LAAAAA   | VVVVxx
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
! (11 rows)
! 
! FETCH backward 12 in foo12;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     1314 |      10 |   0 |    2 |   4 |     14 |      14 |      314 |        1314 |      1314 |     1314 |  28 |   29 | OYAAAA   | KAAAAA   | OOOOxx
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (11 rows)
! 
! FETCH backward 13 in foo11;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     3043 |       9 |   1 |    3 |   3 |      3 |      43 |       43 |        1043 |      3043 |     3043 |  86 |   87 | BNAAAA   | JAAAAA   | HHHHxx
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (10 rows)
! 
! FETCH backward 14 in foo10;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     4321 |       8 |   1 |    1 |   1 |      1 |      21 |      321 |         321 |      4321 |     4321 |  42 |   43 | FKAAAA   | IAAAAA   | AAAAxx
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (9 rows)
! 
! FETCH backward 15 in foo9;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     6701 |       7 |   1 |    1 |   1 |      1 |       1 |      701 |         701 |      1701 |     6701 |   2 |    3 | TXAAAA   | HAAAAA   | VVVVxx
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (8 rows)
! 
! FETCH backward 16 in foo8;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     5057 |       6 |   1 |    1 |   7 |     17 |      57 |       57 |        1057 |        57 |     5057 | 114 |  115 | NMAAAA   | GAAAAA   | OOOOxx
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (7 rows)
! 
! FETCH backward 17 in foo7;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8009 |       5 |   1 |    1 |   9 |      9 |       9 |        9 |           9 |      3009 |     8009 |  18 |   19 | BWAAAA   | FAAAAA   | HHHHxx
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (6 rows)
! 
! FETCH backward 18 in foo6;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     7164 |       4 |   0 |    0 |   4 |      4 |      64 |      164 |        1164 |      2164 |     7164 | 128 |  129 | OPAAAA   | EAAAAA   | AAAAxx
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (5 rows)
! 
! FETCH backward 19 in foo5;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     9850 |       3 |   0 |    2 |   0 |     10 |      50 |      850 |        1850 |      4850 |     9850 | 100 |  101 | WOAAAA   | DAAAAA   | VVVVxx
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (4 rows)
! 
! FETCH backward 20 in foo4;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (3 rows)
! 
! FETCH backward 21 in foo3;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (2 rows)
! 
! FETCH backward 22 in foo2;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (1 row)
! 
! FETCH backward 23 in foo1;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
! (0 rows)
! 
! CLOSE foo1;
! CLOSE foo2;
! CLOSE foo3;
! CLOSE foo4;
! CLOSE foo5;
! CLOSE foo6;
! CLOSE foo7;
! CLOSE foo8;
! CLOSE foo9;
! CLOSE foo10;
! CLOSE foo11;
! CLOSE foo12;
! -- leave some cursors open, to test that auto-close works.
! -- record this in the system view as well (don't query the time field there
! -- however)
! SELECT name, statement, is_holdable, is_binary, is_scrollable FROM pg_cursors ORDER BY 1;
!  name  |                               statement                               | is_holdable | is_binary | is_scrollable 
! -------+-----------------------------------------------------------------------+-------------+-----------+---------------
!  foo13 | DECLARE foo13 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2; | f           | f         | t
!  foo14 | DECLARE foo14 SCROLL CURSOR FOR SELECT * FROM tenk2;                  | f           | f         | t
!  foo15 | DECLARE foo15 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2; | f           | f         | t
!  foo16 | DECLARE foo16 SCROLL CURSOR FOR SELECT * FROM tenk2;                  | f           | f         | t
!  foo17 | DECLARE foo17 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2; | f           | f         | t
!  foo18 | DECLARE foo18 SCROLL CURSOR FOR SELECT * FROM tenk2;                  | f           | f         | t
!  foo19 | DECLARE foo19 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2; | f           | f         | t
!  foo20 | DECLARE foo20 SCROLL CURSOR FOR SELECT * FROM tenk2;                  | f           | f         | t
!  foo21 | DECLARE foo21 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2; | f           | f         | t
!  foo22 | DECLARE foo22 SCROLL CURSOR FOR SELECT * FROM tenk2;                  | f           | f         | t
!  foo23 | DECLARE foo23 SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2; | f           | f         | t
! (11 rows)
! 
! END;
! SELECT name, statement, is_holdable, is_binary, is_scrollable FROM pg_cursors;
!  name | statement | is_holdable | is_binary | is_scrollable 
! ------+-----------+-------------+-----------+---------------
! (0 rows)
! 
! --
! -- NO SCROLL disallows backward fetching
! --
! BEGIN;
! DECLARE foo24 NO SCROLL CURSOR FOR SELECT * FROM tenk1 ORDER BY unique2;
! FETCH 1 FROM foo24;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (1 row)
! 
! FETCH BACKWARD 1 FROM foo24; -- should fail
! ERROR:  cursor can only scan forward
! HINT:  Declare it with SCROLL option to enable backward scan.
! END;
! --
! -- Cursors outside transaction blocks
! --
! SELECT name, statement, is_holdable, is_binary, is_scrollable FROM pg_cursors;
!  name | statement | is_holdable | is_binary | is_scrollable 
! ------+-----------+-------------+-----------+---------------
! (0 rows)
! 
! BEGIN;
! DECLARE foo25 SCROLL CURSOR WITH HOLD FOR SELECT * FROM tenk2;
! FETCH FROM foo25;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     8800 |       0 |   0 |    0 |   0 |      0 |       0 |      800 |         800 |      3800 |     8800 |   0 |    1 | MAAAAA   | AAAAAA   | AAAAxx
! (1 row)
! 
! FETCH FROM foo25;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
! (1 row)
! 
! COMMIT;
! FETCH FROM foo25;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     3420 |       2 |   0 |    0 |   0 |      0 |      20 |      420 |        1420 |      3420 |     3420 |  40 |   41 | OBAAAA   | CAAAAA   | OOOOxx
! (1 row)
! 
! FETCH BACKWARD FROM foo25;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     1891 |       1 |   1 |    3 |   1 |     11 |      91 |      891 |        1891 |      1891 |     1891 | 182 |  183 | TUAAAA   | BAAAAA   | HHHHxx
! (1 row)
! 
! FETCH ABSOLUTE -1 FROM foo25;
!  unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4 
! ---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
!     2968 |    9999 |   0 |    0 |   8 |      8 |      68 |      968 |         968 |      2968 |     2968 | 136 |  137 | EKAAAA   | PUOAAA   | VVVVxx
! (1 row)
! 
! SELECT name, statement, is_holdable, is_binary, is_scrollable FROM pg_cursors;
!  name  |                           statement                            | is_holdable | is_binary | is_scrollable 
! -------+----------------------------------------------------------------+-------------+-----------+---------------
!  foo25 | DECLARE foo25 SCROLL CURSOR WITH HOLD FOR SELECT * FROM tenk2; | t           | f         | t
! (1 row)
! 
! CLOSE foo25;
! --
! -- ROLLBACK should close holdable cursors
! --
! BEGIN;
! DECLARE foo26 CURSOR WITH HOLD FOR SELECT * FROM tenk1 ORDER BY unique2;
! ROLLBACK;
! -- should fail
! FETCH FROM foo26;
! ERROR:  cursor "foo26" does not exist
! --
! -- Parameterized DECLARE needs to insert param values into the cursor portal
! --
! BEGIN;
! CREATE FUNCTION declares_cursor(text)
!    RETURNS void
!    AS 'DECLARE c CURSOR FOR SELECT stringu1 FROM tenk1 WHERE stringu1 LIKE $1;'
!    LANGUAGE SQL;
! SELECT declares_cursor('AB%');
!  declares_cursor 
! -----------------
!  
! (1 row)
! 
! FETCH ALL FROM c;
!  stringu1 
! ----------
!  ABAAAA
!  ABAAAA
!  ABAAAA
!  ABAAAA
!  ABAAAA
!  ABAAAA
!  ABAAAA
!  ABAAAA
!  ABAAAA
!  ABAAAA
!  ABAAAA
!  ABAAAA
!  ABAAAA
!  ABAAAA
!  ABAAAA
! (15 rows)
! 
! ROLLBACK;
! --
! -- Test behavior of both volatile and stable functions inside a cursor;
! -- in particular we want to see what happens during commit of a holdable
! -- cursor
! --
! create temp table tt1(f1 int);
! create function count_tt1_v() returns int8 as
! 'select count(*) from tt1' language sql volatile;
! create function count_tt1_s() returns int8 as
! 'select count(*) from tt1' language sql stable;
! begin;
! insert into tt1 values(1);
! declare c1 cursor for select count_tt1_v(), count_tt1_s();
! insert into tt1 values(2);
! fetch all from c1;
!  count_tt1_v | count_tt1_s 
! -------------+-------------
!            2 |           1
! (1 row)
! 
! rollback;
! begin;
! insert into tt1 values(1);
! declare c2 cursor with hold for select count_tt1_v(), count_tt1_s();
! insert into tt1 values(2);
! commit;
! delete from tt1;
! fetch all from c2;
!  count_tt1_v | count_tt1_s 
! -------------+-------------
!            2 |           1
! (1 row)
! 
! drop function count_tt1_v();
! drop function count_tt1_s();
! -- Create a cursor with the BINARY option and check the pg_cursors view
! BEGIN;
! SELECT name, statement, is_holdable, is_binary, is_scrollable FROM pg_cursors;
!  name |                              statement                               | is_holdable | is_binary | is_scrollable 
! ------+----------------------------------------------------------------------+-------------+-----------+---------------
!  c2   | declare c2 cursor with hold for select count_tt1_v(), count_tt1_s(); | t           | f         | f
! (1 row)
! 
! DECLARE bc BINARY CURSOR FOR SELECT * FROM tenk1;
! SELECT name, statement, is_holdable, is_binary, is_scrollable FROM pg_cursors ORDER BY 1;
!  name |                              statement                               | is_holdable | is_binary | is_scrollable 
! ------+----------------------------------------------------------------------+-------------+-----------+---------------
!  bc   | DECLARE bc BINARY CURSOR FOR SELECT * FROM tenk1;                    | f           | t         | t
!  c2   | declare c2 cursor with hold for select count_tt1_v(), count_tt1_s(); | t           | f         | f
! (2 rows)
! 
! ROLLBACK;
! -- We should not see the portal that is created internally to
! -- implement EXECUTE in pg_cursors
! PREPARE cprep AS
!   SELECT name, statement, is_holdable, is_binary, is_scrollable FROM pg_cursors;
! EXECUTE cprep;
!  name |                              statement                               | is_holdable | is_binary | is_scrollable 
! ------+----------------------------------------------------------------------+-------------+-----------+---------------
!  c2   | declare c2 cursor with hold for select count_tt1_v(), count_tt1_s(); | t           | f         | f
! (1 row)
! 
! -- test CLOSE ALL;
! SELECT name FROM pg_cursors ORDER BY 1;
!  name 
! ------
!  c2
! (1 row)
! 
! CLOSE ALL;
! SELECT name FROM pg_cursors ORDER BY 1;
!  name 
! ------
! (0 rows)
! 
! BEGIN;
! DECLARE foo1 CURSOR WITH HOLD FOR SELECT 1;
! DECLARE foo2 CURSOR WITHOUT HOLD FOR SELECT 1;
! SELECT name FROM pg_cursors ORDER BY 1;
!  name 
! ------
!  foo1
!  foo2
! (2 rows)
! 
! CLOSE ALL;
! SELECT name FROM pg_cursors ORDER BY 1;
!  name 
! ------
! (0 rows)
! 
! COMMIT;
! --
! -- Tests for updatable cursors
! --
! CREATE TEMP TABLE uctest(f1 int, f2 text);
! INSERT INTO uctest VALUES (1, 'one'), (2, 'two'), (3, 'three');
! SELECT * FROM uctest;
!  f1 |  f2   
! ----+-------
!   1 | one
!   2 | two
!   3 | three
! (3 rows)
! 
! -- Check DELETE WHERE CURRENT
! BEGIN;
! DECLARE c1 CURSOR FOR SELECT * FROM uctest;
! FETCH 2 FROM c1;
!  f1 | f2  
! ----+-----
!   1 | one
!   2 | two
! (2 rows)
! 
! DELETE FROM uctest WHERE CURRENT OF c1;
! -- should show deletion
! SELECT * FROM uctest;
!  f1 |  f2   
! ----+-------
!   1 | one
!   3 | three
! (2 rows)
! 
! -- cursor did not move
! FETCH ALL FROM c1;
!  f1 |  f2   
! ----+-------
!   3 | three
! (1 row)
! 
! -- cursor is insensitive
! MOVE BACKWARD ALL IN c1;
! FETCH ALL FROM c1;
!  f1 |  f2   
! ----+-------
!   1 | one
!   2 | two
!   3 | three
! (3 rows)
! 
! COMMIT;
! -- should still see deletion
! SELECT * FROM uctest;
!  f1 |  f2   
! ----+-------
!   1 | one
!   3 | three
! (2 rows)
! 
! -- Check UPDATE WHERE CURRENT; this time use FOR UPDATE
! BEGIN;
! DECLARE c1 CURSOR FOR SELECT * FROM uctest FOR UPDATE;
! FETCH c1;
!  f1 | f2  
! ----+-----
!   1 | one
! (1 row)
! 
! UPDATE uctest SET f1 = 8 WHERE CURRENT OF c1;
! SELECT * FROM uctest;
!  f1 |  f2   
! ----+-------
!   3 | three
!   8 | one
! (2 rows)
! 
! COMMIT;
! SELECT * FROM uctest;
!  f1 |  f2   
! ----+-------
!   3 | three
!   8 | one
! (2 rows)
! 
! -- Check repeated-update and update-then-delete cases
! BEGIN;
! DECLARE c1 CURSOR FOR SELECT * FROM uctest;
! FETCH c1;
!  f1 |  f2   
! ----+-------
!   3 | three
! (1 row)
! 
! UPDATE uctest SET f1 = f1 + 10 WHERE CURRENT OF c1;
! SELECT * FROM uctest;
!  f1 |  f2   
! ----+-------
!   8 | one
!  13 | three
! (2 rows)
! 
! UPDATE uctest SET f1 = f1 + 10 WHERE CURRENT OF c1;
! SELECT * FROM uctest;
!  f1 |  f2   
! ----+-------
!   8 | one
!  23 | three
! (2 rows)
! 
! -- insensitive cursor should not show effects of updates or deletes
! FETCH RELATIVE 0 FROM c1;
!  f1 |  f2   
! ----+-------
!   3 | three
! (1 row)
! 
! DELETE FROM uctest WHERE CURRENT OF c1;
! SELECT * FROM uctest;
!  f1 | f2  
! ----+-----
!   8 | one
! (1 row)
! 
! DELETE FROM uctest WHERE CURRENT OF c1; -- no-op
! SELECT * FROM uctest;
!  f1 | f2  
! ----+-----
!   8 | one
! (1 row)
! 
! UPDATE uctest SET f1 = f1 + 10 WHERE CURRENT OF c1; -- no-op
! SELECT * FROM uctest;
!  f1 | f2  
! ----+-----
!   8 | one
! (1 row)
! 
! FETCH RELATIVE 0 FROM c1;
!  f1 |  f2   
! ----+-------
!   3 | three
! (1 row)
! 
! ROLLBACK;
! SELECT * FROM uctest;
!  f1 |  f2   
! ----+-------
!   3 | three
!   8 | one
! (2 rows)
! 
! BEGIN;
! DECLARE c1 CURSOR FOR SELECT * FROM uctest FOR UPDATE;
! FETCH c1;
!  f1 |  f2   
! ----+-------
!   3 | three
! (1 row)
! 
! UPDATE uctest SET f1 = f1 + 10 WHERE CURRENT OF c1;
! SELECT * FROM uctest;
!  f1 |  f2   
! ----+-------
!   8 | one
!  13 | three
! (2 rows)
! 
! UPDATE uctest SET f1 = f1 + 10 WHERE CURRENT OF c1;
! SELECT * FROM uctest;
!  f1 |  f2   
! ----+-------
!   8 | one
!  23 | three
! (2 rows)
! 
! DELETE FROM uctest WHERE CURRENT OF c1;
! SELECT * FROM uctest;
!  f1 | f2  
! ----+-----
!   8 | one
! (1 row)
! 
! DELETE FROM uctest WHERE CURRENT OF c1; -- no-op
! SELECT * FROM uctest;
!  f1 | f2  
! ----+-----
!   8 | one
! (1 row)
! 
! UPDATE uctest SET f1 = f1 + 10 WHERE CURRENT OF c1; -- no-op
! SELECT * FROM uctest;
!  f1 | f2  
! ----+-----
!   8 | one
! (1 row)
! 
! --- sensitive cursors can't currently scroll back, so this is an error:
! FETCH RELATIVE 0 FROM c1;
! ERROR:  cursor can only scan forward
! HINT:  Declare it with SCROLL option to enable backward scan.
! ROLLBACK;
! SELECT * FROM uctest;
!  f1 |  f2   
! ----+-------
!   3 | three
!   8 | one
! (2 rows)
! 
! -- Check inheritance cases
! CREATE TEMP TABLE ucchild () inherits (uctest);
! INSERT INTO ucchild values(100, 'hundred');
! SELECT * FROM uctest;
!  f1  |   f2    
! -----+---------
!    3 | three
!    8 | one
!  100 | hundred
! (3 rows)
! 
! BEGIN;
! DECLARE c1 CURSOR FOR SELECT * FROM uctest FOR UPDATE;
! FETCH 1 FROM c1;
!  f1 |  f2   
! ----+-------
!   3 | three
! (1 row)
! 
! UPDATE uctest SET f1 = f1 + 10 WHERE CURRENT OF c1;
! FETCH 1 FROM c1;
!  f1 | f2  
! ----+-----
!   8 | one
! (1 row)
! 
! UPDATE uctest SET f1 = f1 + 10 WHERE CURRENT OF c1;
! FETCH 1 FROM c1;
!  f1  |   f2    
! -----+---------
!  100 | hundred
! (1 row)
! 
! UPDATE uctest SET f1 = f1 + 10 WHERE CURRENT OF c1;
! FETCH 1 FROM c1;
!  f1 | f2 
! ----+----
! (0 rows)
! 
! COMMIT;
! SELECT * FROM uctest;
!  f1  |   f2    
! -----+---------
!   13 | three
!   18 | one
!  110 | hundred
! (3 rows)
! 
! -- Can update from a self-join, but only if FOR UPDATE says which to use
! BEGIN;
! DECLARE c1 CURSOR FOR SELECT * FROM uctest a, uctest b WHERE a.f1 = b.f1 + 5;
! FETCH 1 FROM c1;
!  f1 | f2  | f1 |  f2   
! ----+-----+----+-------
!  18 | one | 13 | three
! (1 row)
! 
! UPDATE uctest SET f1 = f1 + 10 WHERE CURRENT OF c1;  -- fail
! ERROR:  cursor "c1" is not a simply updatable scan of table "uctest"
! ROLLBACK;
! BEGIN;
! DECLARE c1 CURSOR FOR SELECT * FROM uctest a, uctest b WHERE a.f1 = b.f1 + 5 FOR UPDATE;
! FETCH 1 FROM c1;
!  f1 | f2  | f1 |  f2   
! ----+-----+----+-------
!  18 | one | 13 | three
! (1 row)
! 
! UPDATE uctest SET f1 = f1 + 10 WHERE CURRENT OF c1;  -- fail
! ERROR:  cursor "c1" has multiple FOR UPDATE/SHARE references to table "uctest"
! ROLLBACK;
! BEGIN;
! DECLARE c1 CURSOR FOR SELECT * FROM uctest a, uctest b WHERE a.f1 = b.f1 + 5 FOR SHARE OF a;
! FETCH 1 FROM c1;
!  f1 | f2  | f1 |  f2   
! ----+-----+----+-------
!  18 | one | 13 | three
! (1 row)
! 
! UPDATE uctest SET f1 = f1 + 10 WHERE CURRENT OF c1;
! SELECT * FROM uctest;
!  f1  |   f2    
! -----+---------
!   13 | three
!   28 | one
!  110 | hundred
! (3 rows)
! 
! ROLLBACK;
! -- Check various error cases
! DELETE FROM uctest WHERE CURRENT OF c1;  -- fail, no such cursor
! ERROR:  cursor "c1" does not exist
! DECLARE cx CURSOR WITH HOLD FOR SELECT * FROM uctest;
! DELETE FROM uctest WHERE CURRENT OF cx;  -- fail, can't use held cursor
! ERROR:  cursor "cx" is held from a previous transaction
! BEGIN;
! DECLARE c CURSOR FOR SELECT * FROM tenk2;
! DELETE FROM uctest WHERE CURRENT OF c;  -- fail, cursor on wrong table
! ERROR:  cursor "c" is not a simply updatable scan of table "uctest"
! ROLLBACK;
! BEGIN;
! DECLARE c CURSOR FOR SELECT * FROM tenk2 FOR SHARE;
! DELETE FROM uctest WHERE CURRENT OF c;  -- fail, cursor on wrong table
! ERROR:  cursor "c" does not have a FOR UPDATE/SHARE reference to table "uctest"
! ROLLBACK;
! BEGIN;
! DECLARE c CURSOR FOR SELECT * FROM tenk1 JOIN tenk2 USING (unique1);
! DELETE FROM tenk1 WHERE CURRENT OF c;  -- fail, cursor is on a join
! ERROR:  cursor "c" is not a simply updatable scan of table "tenk1"
! ROLLBACK;
! BEGIN;
! DECLARE c CURSOR FOR SELECT f1,count(*) FROM uctest GROUP BY f1;
! DELETE FROM uctest WHERE CURRENT OF c;  -- fail, cursor is on aggregation
! ERROR:  cursor "c" is not a simply updatable scan of table "uctest"
! ROLLBACK;
! BEGIN;
! DECLARE c1 CURSOR FOR SELECT * FROM uctest;
! DELETE FROM uctest WHERE CURRENT OF c1; -- fail, no current row
! ERROR:  cursor "c1" is not positioned on a row
! ROLLBACK;
! BEGIN;
! DECLARE c1 CURSOR FOR SELECT MIN(f1) FROM uctest FOR UPDATE;
! ERROR:  FOR UPDATE is not allowed with aggregate functions
! ROLLBACK;
! -- WHERE CURRENT OF may someday work with views, but today is not that day.
! -- For now, just make sure it errors out cleanly.
! CREATE TEMP VIEW ucview AS SELECT * FROM uctest;
! CREATE RULE ucrule AS ON DELETE TO ucview DO INSTEAD
!   DELETE FROM uctest WHERE f1 = OLD.f1;
! BEGIN;
! DECLARE c1 CURSOR FOR SELECT * FROM ucview;
! FETCH FROM c1;
!  f1 |  f2   
! ----+-------
!  13 | three
! (1 row)
! 
! DELETE FROM ucview WHERE CURRENT OF c1; -- fail, views not supported
! ERROR:  WHERE CURRENT OF on a view is not implemented
! ROLLBACK;
! -- Make sure snapshot management works okay, per bug report in
! -- 235395b90909301035v7228ce63q392931f15aa74b31@mail.gmail.com
! BEGIN;
! SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
! CREATE TABLE cursor (a int);
! INSERT INTO cursor VALUES (1);
! DECLARE c1 NO SCROLL CURSOR FOR SELECT * FROM cursor FOR UPDATE;
! UPDATE cursor SET a = 2;
! FETCH ALL FROM c1;
!  a 
! ---
! (0 rows)
! 
! COMMIT;
! DROP TABLE cursor;
! -- Check rewinding a cursor containing a stable function in LIMIT,
! -- per bug report in 8336843.9833.1399385291498.JavaMail.root@quick
! begin;
! create function nochange(int) returns int
!   as 'select $1 limit 1' language sql stable;
! declare c cursor for select * from int8_tbl limit nochange(3);
! fetch all from c;
!         q1        |        q2        
! ------------------+------------------
!               123 |              456
!               123 | 4567890123456789
!  4567890123456789 |              123
! (3 rows)
! 
! move backward all in c;
! fetch all from c;
!         q1        |        q2        
! ------------------+------------------
!               123 |              456
!               123 | 4567890123456789
!  4567890123456789 |              123
! (3 rows)
! 
! rollback;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/arrays.out	Thu Oct 16 14:31:37 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/arrays.out	Tue Oct 28 15:53:05 2014
***************
*** 1,1831 ****
! --
! -- ARRAYS
! --
! CREATE TABLE arrtest (
! 	a 			int2[],
! 	b 			int4[][][],
! 	c 			name[],
! 	d			text[][],
! 	e 			float8[],
! 	f			char(5)[],
! 	g			varchar(5)[]
! );
! --
! -- only the 'e' array is 0-based, the others are 1-based.
! --
! INSERT INTO arrtest (a[1:5], b[1:1][1:2][1:2], c, d, f, g)
!    VALUES ('{1,2,3,4,5}', '{{{0,0},{1,2}}}', '{}', '{}', '{}', '{}');
! UPDATE arrtest SET e[0] = '1.1';
! UPDATE arrtest SET e[1] = '2.2';
! INSERT INTO arrtest (f)
!    VALUES ('{"too long"}');
! ERROR:  value too long for type character(5)
! INSERT INTO arrtest (a, b[1:2][1:2], c, d, e, f, g)
!    VALUES ('{11,12,23}', '{{3,4},{4,5}}', '{"foobar"}',
!            '{{"elt1", "elt2"}}', '{"3.4", "6.7"}',
!            '{"abc","abcde"}', '{"abc","abcde"}');
! INSERT INTO arrtest (a, b[1:2], c, d[1:2])
!    VALUES ('{}', '{3,4}', '{foo,bar}', '{bar,foo}');
! SELECT * FROM arrtest;
!       a      |        b        |     c     |       d       |        e        |        f        |      g      
! -------------+-----------------+-----------+---------------+-----------------+-----------------+-------------
!  {1,2,3,4,5} | {{{0,0},{1,2}}} | {}        | {}            | [0:1]={1.1,2.2} | {}              | {}
!  {11,12,23}  | {{3,4},{4,5}}   | {foobar}  | {{elt1,elt2}} | {3.4,6.7}       | {"abc  ",abcde} | {abc,abcde}
!  {}          | {3,4}           | {foo,bar} | {bar,foo}     |                 |                 | 
! (3 rows)
! 
! SELECT arrtest.a[1],
!           arrtest.b[1][1][1],
!           arrtest.c[1],
!           arrtest.d[1][1],
!           arrtest.e[0]
!    FROM arrtest;
!  a  | b |   c    |  d   |  e  
! ----+---+--------+------+-----
!   1 | 0 |        |      | 1.1
!  11 |   | foobar | elt1 |    
!     |   | foo    |      |    
! (3 rows)
! 
! SELECT a[1], b[1][1][1], c[1], d[1][1], e[0]
!    FROM arrtest;
!  a  | b |   c    |  d   |  e  
! ----+---+--------+------+-----
!   1 | 0 |        |      | 1.1
!  11 |   | foobar | elt1 |    
!     |   | foo    |      |    
! (3 rows)
! 
! SELECT a[1:3],
!           b[1:1][1:2][1:2],
!           c[1:2],
!           d[1:1][1:2]
!    FROM arrtest;
!      a      |        b        |     c     |       d       
! ------------+-----------------+-----------+---------------
!  {1,2,3}    | {{{0,0},{1,2}}} | {}        | {}
!  {11,12,23} | {}              | {foobar}  | {{elt1,elt2}}
!  {}         | {}              | {foo,bar} | {}
! (3 rows)
! 
! SELECT array_ndims(a) AS a,array_ndims(b) AS b,array_ndims(c) AS c
!    FROM arrtest;
!  a | b | c 
! ---+---+---
!  1 | 3 |  
!  1 | 2 | 1
!    | 1 | 1
! (3 rows)
! 
! SELECT array_dims(a) AS a,array_dims(b) AS b,array_dims(c) AS c
!    FROM arrtest;
!    a   |        b        |   c   
! -------+-----------------+-------
!  [1:5] | [1:1][1:2][1:2] | 
!  [1:3] | [1:2][1:2]      | [1:1]
!        | [1:2]           | [1:2]
! (3 rows)
! 
! -- returns nothing
! SELECT *
!    FROM arrtest
!    WHERE a[1] < 5 and
!          c = '{"foobar"}'::_name;
!  a | b | c | d | e | f | g 
! ---+---+---+---+---+---+---
! (0 rows)
! 
! UPDATE arrtest
!   SET a[1:2] = '{16,25}'
!   WHERE NOT a = '{}'::_int2;
! UPDATE arrtest
!   SET b[1:1][1:1][1:2] = '{113, 117}',
!       b[1:1][1:2][2:2] = '{142, 147}'
!   WHERE array_dims(b) = '[1:1][1:2][1:2]';
! UPDATE arrtest
!   SET c[2:2] = '{"new_word"}'
!   WHERE array_dims(c) is not null;
! SELECT a,b,c FROM arrtest;
!        a       |           b           |         c         
! ---------------+-----------------------+-------------------
!  {16,25,3,4,5} | {{{113,142},{1,147}}} | {}
!  {}            | {3,4}                 | {foo,new_word}
!  {16,25,23}    | {{3,4},{4,5}}         | {foobar,new_word}
! (3 rows)
! 
! SELECT a[1:3],
!           b[1:1][1:2][1:2],
!           c[1:2],
!           d[1:1][2:2]
!    FROM arrtest;
!      a      |           b           |         c         |    d     
! ------------+-----------------------+-------------------+----------
!  {16,25,3}  | {{{113,142},{1,147}}} | {}                | {}
!  {}         | {}                    | {foo,new_word}    | {}
!  {16,25,23} | {}                    | {foobar,new_word} | {{elt2}}
! (3 rows)
! 
! INSERT INTO arrtest(a) VALUES('{1,null,3}');
! SELECT a FROM arrtest;
!        a       
! ---------------
!  {16,25,3,4,5}
!  {}
!  {16,25,23}
!  {1,NULL,3}
! (4 rows)
! 
! UPDATE arrtest SET a[4] = NULL WHERE a[2] IS NULL;
! SELECT a FROM arrtest WHERE a[2] IS NULL;
!         a        
! -----------------
!  [4:4]={NULL}
!  {1,NULL,3,NULL}
! (2 rows)
! 
! DELETE FROM arrtest WHERE a[2] IS NULL AND b IS NULL;
! SELECT a,b,c FROM arrtest;
!        a       |           b           |         c         
! ---------------+-----------------------+-------------------
!  {16,25,3,4,5} | {{{113,142},{1,147}}} | {}
!  {16,25,23}    | {{3,4},{4,5}}         | {foobar,new_word}
!  [4:4]={NULL}  | {3,4}                 | {foo,new_word}
! (3 rows)
! 
! --
! -- test array extension
! --
! CREATE TEMP TABLE arrtest1 (i int[], t text[]);
! insert into arrtest1 values(array[1,2,null,4], array['one','two',null,'four']);
! select * from arrtest1;
!       i       |          t          
! --------------+---------------------
!  {1,2,NULL,4} | {one,two,NULL,four}
! (1 row)
! 
! update arrtest1 set i[2] = 22, t[2] = 'twenty-two';
! select * from arrtest1;
!        i       |             t              
! ---------------+----------------------------
!  {1,22,NULL,4} | {one,twenty-two,NULL,four}
! (1 row)
! 
! update arrtest1 set i[5] = 5, t[5] = 'five';
! select * from arrtest1;
!         i        |                t                
! -----------------+---------------------------------
!  {1,22,NULL,4,5} | {one,twenty-two,NULL,four,five}
! (1 row)
! 
! update arrtest1 set i[8] = 8, t[8] = 'eight';
! select * from arrtest1;
!               i              |                        t                        
! -----------------------------+-------------------------------------------------
!  {1,22,NULL,4,5,NULL,NULL,8} | {one,twenty-two,NULL,four,five,NULL,NULL,eight}
! (1 row)
! 
! update arrtest1 set i[0] = 0, t[0] = 'zero';
! select * from arrtest1;
!                   i                  |                             t                              
! -------------------------------------+------------------------------------------------------------
!  [0:8]={0,1,22,NULL,4,5,NULL,NULL,8} | [0:8]={zero,one,twenty-two,NULL,four,five,NULL,NULL,eight}
! (1 row)
! 
! update arrtest1 set i[-3] = -3, t[-3] = 'minus-three';
! select * from arrtest1;
!                          i                         |                                         t                                         
! ---------------------------------------------------+-----------------------------------------------------------------------------------
!  [-3:8]={-3,NULL,NULL,0,1,22,NULL,4,5,NULL,NULL,8} | [-3:8]={minus-three,NULL,NULL,zero,one,twenty-two,NULL,four,five,NULL,NULL,eight}
! (1 row)
! 
! update arrtest1 set i[0:2] = array[10,11,12], t[0:2] = array['ten','eleven','twelve'];
! select * from arrtest1;
!                           i                          |                                        t                                        
! -----------------------------------------------------+---------------------------------------------------------------------------------
!  [-3:8]={-3,NULL,NULL,10,11,12,NULL,4,5,NULL,NULL,8} | [-3:8]={minus-three,NULL,NULL,ten,eleven,twelve,NULL,four,five,NULL,NULL,eight}
! (1 row)
! 
! update arrtest1 set i[8:10] = array[18,null,20], t[8:10] = array['p18',null,'p20'];
! select * from arrtest1;
!                                i                               |                                            t                                            
! ---------------------------------------------------------------+-----------------------------------------------------------------------------------------
!  [-3:10]={-3,NULL,NULL,10,11,12,NULL,4,5,NULL,NULL,18,NULL,20} | [-3:10]={minus-three,NULL,NULL,ten,eleven,twelve,NULL,four,five,NULL,NULL,p18,NULL,p20}
! (1 row)
! 
! update arrtest1 set i[11:12] = array[null,22], t[11:12] = array[null,'p22'];
! select * from arrtest1;
!                                    i                                   |                                                t                                                 
! -----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------
!  [-3:12]={-3,NULL,NULL,10,11,12,NULL,4,5,NULL,NULL,18,NULL,20,NULL,22} | [-3:12]={minus-three,NULL,NULL,ten,eleven,twelve,NULL,four,five,NULL,NULL,p18,NULL,p20,NULL,p22}
! (1 row)
! 
! update arrtest1 set i[15:16] = array[null,26], t[15:16] = array[null,'p26'];
! select * from arrtest1;
!                                             i                                            |                                                          t                                                          
! -----------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------
!  [-3:16]={-3,NULL,NULL,10,11,12,NULL,4,5,NULL,NULL,18,NULL,20,NULL,22,NULL,NULL,NULL,26} | [-3:16]={minus-three,NULL,NULL,ten,eleven,twelve,NULL,four,five,NULL,NULL,p18,NULL,p20,NULL,p22,NULL,NULL,NULL,p26}
! (1 row)
! 
! update arrtest1 set i[-5:-3] = array[-15,-14,-13], t[-5:-3] = array['m15','m14','m13'];
! select * from arrtest1;
!                                                 i                                                 |                                                          t                                                          
! --------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------
!  [-5:16]={-15,-14,-13,NULL,NULL,10,11,12,NULL,4,5,NULL,NULL,18,NULL,20,NULL,22,NULL,NULL,NULL,26} | [-5:16]={m15,m14,m13,NULL,NULL,ten,eleven,twelve,NULL,four,five,NULL,NULL,p18,NULL,p20,NULL,p22,NULL,NULL,NULL,p26}
! (1 row)
! 
! update arrtest1 set i[-7:-6] = array[-17,null], t[-7:-6] = array['m17',null];
! select * from arrtest1;
!                                                      i                                                     |                                                              t                                                               
! -----------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------
!  [-7:16]={-17,NULL,-15,-14,-13,NULL,NULL,10,11,12,NULL,4,5,NULL,NULL,18,NULL,20,NULL,22,NULL,NULL,NULL,26} | [-7:16]={m17,NULL,m15,m14,m13,NULL,NULL,ten,eleven,twelve,NULL,four,five,NULL,NULL,p18,NULL,p20,NULL,p22,NULL,NULL,NULL,p26}
! (1 row)
! 
! update arrtest1 set i[-12:-10] = array[-22,null,-20], t[-12:-10] = array['m22',null,'m20'];
! select * from arrtest1;
!                                                                  i                                                                 |                                                                          t                                                                           
! -----------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------
!  [-12:16]={-22,NULL,-20,NULL,NULL,-17,NULL,-15,-14,-13,NULL,NULL,10,11,12,NULL,4,5,NULL,NULL,18,NULL,20,NULL,22,NULL,NULL,NULL,26} | [-12:16]={m22,NULL,m20,NULL,NULL,m17,NULL,m15,m14,m13,NULL,NULL,ten,eleven,twelve,NULL,four,five,NULL,NULL,p18,NULL,p20,NULL,p22,NULL,NULL,NULL,p26}
! (1 row)
! 
! delete from arrtest1;
! insert into arrtest1 values(array[1,2,null,4], array['one','two',null,'four']);
! select * from arrtest1;
!       i       |          t          
! --------------+---------------------
!  {1,2,NULL,4} | {one,two,NULL,four}
! (1 row)
! 
! update arrtest1 set i[0:5] = array[0,1,2,null,4,5], t[0:5] = array['z','p1','p2',null,'p4','p5'];
! select * from arrtest1;
!            i            |             t              
! ------------------------+----------------------------
!  [0:5]={0,1,2,NULL,4,5} | [0:5]={z,p1,p2,NULL,p4,p5}
! (1 row)
! 
! --
! -- array expressions and operators
! --
! -- table creation and INSERTs
! CREATE TEMP TABLE arrtest2 (i integer ARRAY[4], f float8[], n numeric[], t text[], d timestamp[]);
! INSERT INTO arrtest2 VALUES(
!   ARRAY[[[113,142],[1,147]]],
!   ARRAY[1.1,1.2,1.3]::float8[],
!   ARRAY[1.1,1.2,1.3],
!   ARRAY[[['aaa','aab'],['aba','abb'],['aca','acb']],[['baa','bab'],['bba','bbb'],['bca','bcb']]],
!   ARRAY['19620326','19931223','19970117']::timestamp[]
! );
! -- some more test data
! CREATE TEMP TABLE arrtest_f (f0 int, f1 text, f2 float8);
! insert into arrtest_f values(1,'cat1',1.21);
! insert into arrtest_f values(2,'cat1',1.24);
! insert into arrtest_f values(3,'cat1',1.18);
! insert into arrtest_f values(4,'cat1',1.26);
! insert into arrtest_f values(5,'cat1',1.15);
! insert into arrtest_f values(6,'cat2',1.15);
! insert into arrtest_f values(7,'cat2',1.26);
! insert into arrtest_f values(8,'cat2',1.32);
! insert into arrtest_f values(9,'cat2',1.30);
! CREATE TEMP TABLE arrtest_i (f0 int, f1 text, f2 int);
! insert into arrtest_i values(1,'cat1',21);
! insert into arrtest_i values(2,'cat1',24);
! insert into arrtest_i values(3,'cat1',18);
! insert into arrtest_i values(4,'cat1',26);
! insert into arrtest_i values(5,'cat1',15);
! insert into arrtest_i values(6,'cat2',15);
! insert into arrtest_i values(7,'cat2',26);
! insert into arrtest_i values(8,'cat2',32);
! insert into arrtest_i values(9,'cat2',30);
! -- expressions
! SELECT t.f[1][3][1] AS "131", t.f[2][2][1] AS "221" FROM (
!   SELECT ARRAY[[[111,112],[121,122],[131,132]],[[211,212],[221,122],[231,232]]] AS f
! ) AS t;
!  131 | 221 
! -----+-----
!  131 | 221
! (1 row)
! 
! SELECT ARRAY[[[[[['hello'],['world']]]]]];
!            array           
! ---------------------------
!  {{{{{{hello},{world}}}}}}
! (1 row)
! 
! SELECT ARRAY[ARRAY['hello'],ARRAY['world']];
!        array       
! -------------------
!  {{hello},{world}}
! (1 row)
! 
! SELECT ARRAY(select f2 from arrtest_f order by f2) AS "ARRAY";
!                      ARRAY                     
! -----------------------------------------------
!  {1.15,1.15,1.18,1.21,1.24,1.26,1.26,1.3,1.32}
! (1 row)
! 
! -- with nulls
! SELECT '{1,null,3}'::int[];
!     int4    
! ------------
!  {1,NULL,3}
! (1 row)
! 
! SELECT ARRAY[1,NULL,3];
!    array    
! ------------
!  {1,NULL,3}
! (1 row)
! 
! -- functions
! SELECT array_append(array[42], 6) AS "{42,6}";
!  {42,6} 
! --------
!  {42,6}
! (1 row)
! 
! SELECT array_prepend(6, array[42]) AS "{6,42}";
!  {6,42} 
! --------
!  {6,42}
! (1 row)
! 
! SELECT array_cat(ARRAY[1,2], ARRAY[3,4]) AS "{1,2,3,4}";
!  {1,2,3,4} 
! -----------
!  {1,2,3,4}
! (1 row)
! 
! SELECT array_cat(ARRAY[1,2], ARRAY[[3,4],[5,6]]) AS "{{1,2},{3,4},{5,6}}";
!  {{1,2},{3,4},{5,6}} 
! ---------------------
!  {{1,2},{3,4},{5,6}}
! (1 row)
! 
! SELECT array_cat(ARRAY[[3,4],[5,6]], ARRAY[1,2]) AS "{{3,4},{5,6},{1,2}}";
!  {{3,4},{5,6},{1,2}} 
! ---------------------
!  {{3,4},{5,6},{1,2}}
! (1 row)
! 
! -- operators
! SELECT a FROM arrtest WHERE b = ARRAY[[[113,142],[1,147]]];
!        a       
! ---------------
!  {16,25,3,4,5}
! (1 row)
! 
! SELECT NOT ARRAY[1.1,1.2,1.3] = ARRAY[1.1,1.2,1.3] AS "FALSE";
!  FALSE 
! -------
!  f
! (1 row)
! 
! SELECT ARRAY[1,2] || 3 AS "{1,2,3}";
!  {1,2,3} 
! ---------
!  {1,2,3}
! (1 row)
! 
! SELECT 0 || ARRAY[1,2] AS "{0,1,2}";
!  {0,1,2} 
! ---------
!  {0,1,2}
! (1 row)
! 
! SELECT ARRAY[1,2] || ARRAY[3,4] AS "{1,2,3,4}";
!  {1,2,3,4} 
! -----------
!  {1,2,3,4}
! (1 row)
! 
! SELECT ARRAY[[['hello','world']]] || ARRAY[[['happy','birthday']]] AS "ARRAY";
!                 ARRAY                 
! --------------------------------------
!  {{{hello,world}},{{happy,birthday}}}
! (1 row)
! 
! SELECT ARRAY[[1,2],[3,4]] || ARRAY[5,6] AS "{{1,2},{3,4},{5,6}}";
!  {{1,2},{3,4},{5,6}} 
! ---------------------
!  {{1,2},{3,4},{5,6}}
! (1 row)
! 
! SELECT ARRAY[0,0] || ARRAY[1,1] || ARRAY[2,2] AS "{0,0,1,1,2,2}";
!  {0,0,1,1,2,2} 
! ---------------
!  {0,0,1,1,2,2}
! (1 row)
! 
! SELECT 0 || ARRAY[1,2] || 3 AS "{0,1,2,3}";
!  {0,1,2,3} 
! -----------
!  {0,1,2,3}
! (1 row)
! 
! SELECT * FROM array_op_test WHERE i @> '{32}' ORDER BY seqno;
!  seqno |                i                |                                                                 t                                                                  
! -------+---------------------------------+------------------------------------------------------------------------------------------------------------------------------------
!      6 | {39,35,5,94,17,92,60,32}        | {AAAAAAAAAAAAAAA35875,AAAAAAAAAAAAAAAA23657}
!     74 | {32}                            | {AAAAAAAAAAAAAAAA1729,AAAAAAAAAAAAA22860,AAAAAA99807,AAAAA17383,AAAAAAAAAAAAAAA67062,AAAAAAAAAAA15165,AAAAAAAAAAA50956}
!     77 | {97,15,32,17,55,59,18,37,50,39} | {AAAAAAAAAAAA67946,AAAAAA54032,AAAAAAAA81587,55847,AAAAAAAAAAAAAA28620,AAAAAAAAAAAAAAAAA43052,AAAAAA75463,AAAA49534,AAAAAAAA44066}
!     89 | {40,32,17,6,30,88}              | {AA44673,AAAAAAAAAAA6119,AAAAAAAAAAAAAAAA23657,AAAAAAAAAAAAAAAAAA47955,AAAAAAAAAAAAAAAA33598,AAAAAAAAAAA33576,AA44673}
!     98 | {38,34,32,89}                   | {AAAAAAAAAAAAAAAAAA71621,AAAA8857,AAAAAAAAAAAAAAAAAAA65037,AAAAAAAAAAAAAAAA31334,AAAAAAAAAA48845}
!    100 | {85,32,57,39,49,84,32,3,30}     | {AAAAAAA80240,AAAAAAAAAAAAAAAA1729,AAAAA60038,AAAAAAAAAAA92631,AAAAAAAA9523}
! (6 rows)
! 
! SELECT * FROM array_op_test WHERE i && '{32}' ORDER BY seqno;
!  seqno |                i                |                                                                 t                                                                  
! -------+---------------------------------+------------------------------------------------------------------------------------------------------------------------------------
!      6 | {39,35,5,94,17,92,60,32}        | {AAAAAAAAAAAAAAA35875,AAAAAAAAAAAAAAAA23657}
!     74 | {32}                            | {AAAAAAAAAAAAAAAA1729,AAAAAAAAAAAAA22860,AAAAAA99807,AAAAA17383,AAAAAAAAAAAAAAA67062,AAAAAAAAAAA15165,AAAAAAAAAAA50956}
!     77 | {97,15,32,17,55,59,18,37,50,39} | {AAAAAAAAAAAA67946,AAAAAA54032,AAAAAAAA81587,55847,AAAAAAAAAAAAAA28620,AAAAAAAAAAAAAAAAA43052,AAAAAA75463,AAAA49534,AAAAAAAA44066}
!     89 | {40,32,17,6,30,88}              | {AA44673,AAAAAAAAAAA6119,AAAAAAAAAAAAAAAA23657,AAAAAAAAAAAAAAAAAA47955,AAAAAAAAAAAAAAAA33598,AAAAAAAAAAA33576,AA44673}
!     98 | {38,34,32,89}                   | {AAAAAAAAAAAAAAAAAA71621,AAAA8857,AAAAAAAAAAAAAAAAAAA65037,AAAAAAAAAAAAAAAA31334,AAAAAAAAAA48845}
!    100 | {85,32,57,39,49,84,32,3,30}     | {AAAAAAA80240,AAAAAAAAAAAAAAAA1729,AAAAA60038,AAAAAAAAAAA92631,AAAAAAAA9523}
! (6 rows)
! 
! SELECT * FROM array_op_test WHERE i @> '{17}' ORDER BY seqno;
!  seqno |                i                |                                                                 t                                                                  
! -------+---------------------------------+------------------------------------------------------------------------------------------------------------------------------------
!      6 | {39,35,5,94,17,92,60,32}        | {AAAAAAAAAAAAAAA35875,AAAAAAAAAAAAAAAA23657}
!     12 | {17,99,18,52,91,72,0,43,96,23}  | {AAAAA33250,AAAAAAAAAAAAAAAAAAA85420,AAAAAAAAAAA33576}
!     15 | {17,14,16,63,67}                | {AA6416,AAAAAAAAAA646,AAAAA95309}
!     19 | {52,82,17,74,23,46,69,51,75}    | {AAAAAAAAAAAAA73084,AAAAA75968,AAAAAAAAAAAAAAAA14047,AAAAAAA80240,AAAAAAAAAAAAAAAAAAA1205,A68938}
!     53 | {38,17}                         | {AAAAAAAAAAA21658}
!     65 | {61,5,76,59,17}                 | {AAAAAA99807,AAAAA64741,AAAAAAAAAAA53908,AA21643,AAAAAAAAA10012}
!     77 | {97,15,32,17,55,59,18,37,50,39} | {AAAAAAAAAAAA67946,AAAAAA54032,AAAAAAAA81587,55847,AAAAAAAAAAAAAA28620,AAAAAAAAAAAAAAAAA43052,AAAAAA75463,AAAA49534,AAAAAAAA44066}
!     89 | {40,32,17,6,30,88}              | {AA44673,AAAAAAAAAAA6119,AAAAAAAAAAAAAAAA23657,AAAAAAAAAAAAAAAAAA47955,AAAAAAAAAAAAAAAA33598,AAAAAAAAAAA33576,AA44673}
! (8 rows)
! 
! SELECT * FROM array_op_test WHERE i && '{17}' ORDER BY seqno;
!  seqno |                i                |                                                                 t                                                                  
! -------+---------------------------------+------------------------------------------------------------------------------------------------------------------------------------
!      6 | {39,35,5,94,17,92,60,32}        | {AAAAAAAAAAAAAAA35875,AAAAAAAAAAAAAAAA23657}
!     12 | {17,99,18,52,91,72,0,43,96,23}  | {AAAAA33250,AAAAAAAAAAAAAAAAAAA85420,AAAAAAAAAAA33576}
!     15 | {17,14,16,63,67}                | {AA6416,AAAAAAAAAA646,AAAAA95309}
!     19 | {52,82,17,74,23,46,69,51,75}    | {AAAAAAAAAAAAA73084,AAAAA75968,AAAAAAAAAAAAAAAA14047,AAAAAAA80240,AAAAAAAAAAAAAAAAAAA1205,A68938}
!     53 | {38,17}                         | {AAAAAAAAAAA21658}
!     65 | {61,5,76,59,17}                 | {AAAAAA99807,AAAAA64741,AAAAAAAAAAA53908,AA21643,AAAAAAAAA10012}
!     77 | {97,15,32,17,55,59,18,37,50,39} | {AAAAAAAAAAAA67946,AAAAAA54032,AAAAAAAA81587,55847,AAAAAAAAAAAAAA28620,AAAAAAAAAAAAAAAAA43052,AAAAAA75463,AAAA49534,AAAAAAAA44066}
!     89 | {40,32,17,6,30,88}              | {AA44673,AAAAAAAAAAA6119,AAAAAAAAAAAAAAAA23657,AAAAAAAAAAAAAAAAAA47955,AAAAAAAAAAAAAAAA33598,AAAAAAAAAAA33576,AA44673}
! (8 rows)
! 
! SELECT * FROM array_op_test WHERE i @> '{32,17}' ORDER BY seqno;
!  seqno |                i                |                                                                 t                                                                  
! -------+---------------------------------+------------------------------------------------------------------------------------------------------------------------------------
!      6 | {39,35,5,94,17,92,60,32}        | {AAAAAAAAAAAAAAA35875,AAAAAAAAAAAAAAAA23657}
!     77 | {97,15,32,17,55,59,18,37,50,39} | {AAAAAAAAAAAA67946,AAAAAA54032,AAAAAAAA81587,55847,AAAAAAAAAAAAAA28620,AAAAAAAAAAAAAAAAA43052,AAAAAA75463,AAAA49534,AAAAAAAA44066}
!     89 | {40,32,17,6,30,88}              | {AA44673,AAAAAAAAAAA6119,AAAAAAAAAAAAAAAA23657,AAAAAAAAAAAAAAAAAA47955,AAAAAAAAAAAAAAAA33598,AAAAAAAAAAA33576,AA44673}
! (3 rows)
! 
! SELECT * FROM array_op_test WHERE i && '{32,17}' ORDER BY seqno;
!  seqno |                i                |                                                                 t                                                                  
! -------+---------------------------------+------------------------------------------------------------------------------------------------------------------------------------
!      6 | {39,35,5,94,17,92,60,32}        | {AAAAAAAAAAAAAAA35875,AAAAAAAAAAAAAAAA23657}
!     12 | {17,99,18,52,91,72,0,43,96,23}  | {AAAAA33250,AAAAAAAAAAAAAAAAAAA85420,AAAAAAAAAAA33576}
!     15 | {17,14,16,63,67}                | {AA6416,AAAAAAAAAA646,AAAAA95309}
!     19 | {52,82,17,74,23,46,69,51,75}    | {AAAAAAAAAAAAA73084,AAAAA75968,AAAAAAAAAAAAAAAA14047,AAAAAAA80240,AAAAAAAAAAAAAAAAAAA1205,A68938}
!     53 | {38,17}                         | {AAAAAAAAAAA21658}
!     65 | {61,5,76,59,17}                 | {AAAAAA99807,AAAAA64741,AAAAAAAAAAA53908,AA21643,AAAAAAAAA10012}
!     74 | {32}                            | {AAAAAAAAAAAAAAAA1729,AAAAAAAAAAAAA22860,AAAAAA99807,AAAAA17383,AAAAAAAAAAAAAAA67062,AAAAAAAAAAA15165,AAAAAAAAAAA50956}
!     77 | {97,15,32,17,55,59,18,37,50,39} | {AAAAAAAAAAAA67946,AAAAAA54032,AAAAAAAA81587,55847,AAAAAAAAAAAAAA28620,AAAAAAAAAAAAAAAAA43052,AAAAAA75463,AAAA49534,AAAAAAAA44066}
!     89 | {40,32,17,6,30,88}              | {AA44673,AAAAAAAAAAA6119,AAAAAAAAAAAAAAAA23657,AAAAAAAAAAAAAAAAAA47955,AAAAAAAAAAAAAAAA33598,AAAAAAAAAAA33576,AA44673}
!     98 | {38,34,32,89}                   | {AAAAAAAAAAAAAAAAAA71621,AAAA8857,AAAAAAAAAAAAAAAAAAA65037,AAAAAAAAAAAAAAAA31334,AAAAAAAAAA48845}
!    100 | {85,32,57,39,49,84,32,3,30}     | {AAAAAAA80240,AAAAAAAAAAAAAAAA1729,AAAAA60038,AAAAAAAAAAA92631,AAAAAAAA9523}
! (11 rows)
! 
! SELECT * FROM array_op_test WHERE i <@ '{38,34,32,89}' ORDER BY seqno;
!  seqno |       i       |                                                             t                                                              
! -------+---------------+----------------------------------------------------------------------------------------------------------------------------
!     40 | {34}          | {AAAAAAAAAAAAAA10611,AAAAAAAAAAAAAAAAAAA1205,AAAAAAAAAAA50956,AAAAAAAAAAAAAAAA31334,AAAAA70466,AAAAAAAA81587,AAAAAAA74623}
!     74 | {32}          | {AAAAAAAAAAAAAAAA1729,AAAAAAAAAAAAA22860,AAAAAA99807,AAAAA17383,AAAAAAAAAAAAAAA67062,AAAAAAAAAAA15165,AAAAAAAAAAA50956}
!     98 | {38,34,32,89} | {AAAAAAAAAAAAAAAAAA71621,AAAA8857,AAAAAAAAAAAAAAAAAAA65037,AAAAAAAAAAAAAAAA31334,AAAAAAAAAA48845}
!    101 | {}            | {}
! (4 rows)
! 
! SELECT * FROM array_op_test WHERE i = '{}' ORDER BY seqno;
!  seqno | i  | t  
! -------+----+----
!    101 | {} | {}
! (1 row)
! 
! SELECT * FROM array_op_test WHERE i @> '{}' ORDER BY seqno;
!  seqno |                i                |                                                                                                       t                                                                                                        
! -------+---------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
!      1 | {92,75,71,52,64,83}             | {AAAAAAAA44066,AAAAAA1059,AAAAAAAAAAA176,AAAAAAA48038}
!      2 | {3,6}                           | {AAAAAA98232,AAAAAAAA79710,AAAAAAAAAAAAAAAAA69675,AAAAAAAAAAAAAAAA55798,AAAAAAAAA12793}
!      3 | {37,64,95,43,3,41,13,30,11,43}  | {AAAAAAAAAA48845,AAAAA75968,AAAAA95309,AAA54451,AAAAAAAAAA22292,AAAAAAA99836,A96617,AA17009,AAAAAAAAAAAAAA95246}
!      4 | {71,39,99,55,33,75,45}          | {AAAAAAAAA53663,AAAAAAAAAAAAAAA67062,AAAAAAAAAA64777,AAA99043,AAAAAAAAAAAAAAAAAAA91804,39557}
!      5 | {50,42,77,50,4}                 | {AAAAAAAAAAAAAAAAA26540,AAAAAAA79710,AAAAAAAAAAAAAAAAAAA1205,AAAAAAAAAAA176,AAAAA95309,AAAAAAAAAAA46154,AAAAAA66777,AAAAAAAAA27249,AAAAAAAAAA64777,AAAAAAAAAAAAAAAAAAA70104}
!      6 | {39,35,5,94,17,92,60,32}        | {AAAAAAAAAAAAAAA35875,AAAAAAAAAAAAAAAA23657}
!      7 | {12,51,88,64,8}                 | {AAAAAAAAAAAAAAAAAA12591,AAAAAAAAAAAAAAAAA50407,AAAAAAAAAAAA67946}
!      8 | {60,84}                         | {AAAAAAA81898,AAAAAA1059,AAAAAAAAAAAA81511,AAAAA961,AAAAAAAAAAAAAAAA31334,AAAAA64741,AA6416,AAAAAAAAAAAAAAAAAA32918,AAAAAAAAAAAAAAAAA50407}
!      9 | {56,52,35,27,80,44,81,22}       | {AAAAAAAAAAAAAAA73034,AAAAAAAAAAAAA7929,AAAAAAA66161,AA88409,39557,A27153,AAAAAAAA9523,AAAAAAAAAAA99000}
!     10 | {71,5,45}                       | {AAAAAAAAAAA21658,AAAAAAAAAAAA21089,AAA54451,AAAAAAAAAAAAAAAAAA54141,AAAAAAAAAAAAAA28620,AAAAAAAAAAA21658,AAAAAAAAAAA74076,AAAAAAAAA27249}
!     11 | {41,86,74,48,22,74,47,50}       | {AAAAAAAA9523,AAAAAAAAAAAA37562,AAAAAAAAAAAAAAAA14047,AAAAAAAAAAA46154,AAAA41702,AAAAAAAAAAAAAAAAA764,AAAAA62737,39557}
!     12 | {17,99,18,52,91,72,0,43,96,23}  | {AAAAA33250,AAAAAAAAAAAAAAAAAAA85420,AAAAAAAAAAA33576}
!     13 | {3,52,34,23}                    | {AAAAAA98232,AAAA49534,AAAAAAAAAAA21658}
!     14 | {78,57,19}                      | {AAAA8857,AAAAAAAAAAAAAAA73034,AAAAAAAA81587,AAAAAAAAAAAAAAA68526,AAAAA75968,AAAAAAAAAAAAAA65909,AAAAAAAAA10012,AAAAAAAAAAAAAA65909}
!     15 | {17,14,16,63,67}                | {AA6416,AAAAAAAAAA646,AAAAA95309}
!     16 | {14,63,85,11}                   | {AAAAAA66777}
!     17 | {7,10,81,85}                    | {AAAAAA43678,AAAAAAA12144,AAAAAAAAAAA50956,AAAAAAAAAAAAAAAAAAA15356}
!     18 | {1}                             | {AAAAAAAAAAA33576,AAAAA95309,64261,AAA59323,AAAAAAAAAAAAAA95246,55847,AAAAAAAAAAAA67946,AAAAAAAAAAAAAAAAAA64374}
!     19 | {52,82,17,74,23,46,69,51,75}    | {AAAAAAAAAAAAA73084,AAAAA75968,AAAAAAAAAAAAAAAA14047,AAAAAAA80240,AAAAAAAAAAAAAAAAAAA1205,A68938}
!     20 | {72,89,70,51,54,37,8,49,79}     | {AAAAAA58494}
!     21 | {2,8,65,10,5,79,43}             | {AAAAAAAAAAAAAAAAA88852,AAAAAAAAAAAAAAAAAAA91804,AAAAA64669,AAAAAAAAAAAAAAAA1443,AAAAAAAAAAAAAAAA23657,AAAAA12179,AAAAAAAAAAAAAAAAA88852,AAAAAAAAAAAAAAAA31334,AAAAAAAAAAAAAAAA41303,AAAAAAAAAAAAAAAAAAA85420}
!     22 | {11,6,56,62,53,30}              | {AAAAAAAA72908}
!     23 | {40,90,5,38,72,40,30,10,43,55}  | {A6053,AAAAAAAAAAA6119,AA44673,AAAAAAAAAAAAAAAAA764,AA17009,AAAAA17383,AAAAA70514,AAAAA33250,AAAAA95309,AAAAAAAAAAAA37562}
!     24 | {94,61,99,35,48}                | {AAAAAAAAAAA50956,AAAAAAAAAAA15165,AAAA85070,AAAAAAAAAAAAAAA36627,AAAAA961,AAAAAAAAAA55219}
!     25 | {31,1,10,11,27,79,38}           | {AAAAAAAAAAAAAAAAAA59334,45449}
!     26 | {71,10,9,69,75}                 | {47735,AAAAAAA21462,AAAAAAAAAAAAAAAAA6897,AAAAAAAAAAAAAAAAAAA91804,AAAAAAAAA72121,AAAAAAAAAAAAAAAAAAA1205,AAAAA41597,AAAA8857,AAAAAAAAAAAAAAAAAAA15356,AA17009}
!     27 | {94}                            | {AA6416,A6053,AAAAAAA21462,AAAAAAA57334,AAAAAAAAAAAAAAAAAA12591,AA88409,AAAAAAAAAAAAA70254}
!     28 | {14,33,6,34,14}                 | {AAAAAAAAAAAAAAA13198,AAAAAAAA69452,AAAAAAAAAAA82945,AAAAAAA12144,AAAAAAAAA72121,AAAAAAAAAA18601}
!     29 | {39,21}                         | {AAAAAAAAAAAAAAAAA6897,AAAAAAAAAAAAAAAAAAA38885,AAAA85070,AAAAAAAAAAAAAAAAAAA70104,AAAAA66674,AAAAAAAAAAAAA62007,AAAAAAAA69452,AAAAAAA1242,AAAAAAAAAAAAAAAA1729,AAAA35194}
!     30 | {26,81,47,91,34}                | {AAAAAAAAAAAAAAAAAAA70104,AAAAAAA80240}
!     31 | {80,24,18,21,54}                | {AAAAAAAAAAAAAAA13198,AAAAAAAAAAAAAAAAAAA70415,A27153,AAAAAAAAA53663,AAAAAAAAAAAAAAAAA50407,A68938}
!     32 | {58,79,82,80,67,75,98,10,41}    | {AAAAAAAAAAAAAAAAAA61286,AAA54451,AAAAAAAAAAAAAAAAAAA87527,A96617,51533}
!     33 | {74,73}                         | {A85417,AAAAAAA56483,AAAAA17383,AAAAAAAAAAAAA62159,AAAAAAAAAAAA52814,AAAAAAAAAAAAA85723,AAAAAAAAAAAAAAAAAA55796}
!     34 | {70,45}                         | {AAAAAAAAAAAAAAAAAA71621,AAAAAAAAAAAAAA28620,AAAAAAAAAA55219,AAAAAAAA23648,AAAAAAAAAA22292,AAAAAAA1242}
!     35 | {23,40}                         | {AAAAAAAAAAAA52814,AAAA48949,AAAAAAAAA34727,AAAA8857,AAAAAAAAAAAAAAAAAAA62179,AAAAAAAAAAAAAAA68526,AAAAAAA99836,AAAAAAAA50094,AAAA91194,AAAAAAAAAAAAA73084}
!     36 | {79,82,14,52,30,5,79}           | {AAAAAAAAA53663,AAAAAAAAAAAAAAAA55798,AAAAAAAAAAAAAAAAAAA89194,AA88409,AAAAAAAAAAAAAAA81326,AAAAAAAAAAAAAAAAA63050,AAAAAAAAAAAAAAAA33598}
!     37 | {53,11,81,39,3,78,58,64,74}     | {AAAAAAAAAAAAAAAAAAA17075,AAAAAAA66161,AAAAAAAA23648,AAAAAAAAAAAAAA10611}
!     38 | {59,5,4,95,28}                  | {AAAAAAAAAAA82945,A96617,47735,AAAAA12179,AAAAA64669,AAAAAA99807,AA74433,AAAAAAAAAAAAAAAAA59387}
!     39 | {82,43,99,16,74}                | {AAAAAAAAAAAAAAA67062,AAAAAAA57334,AAAAAAAAAAAAAA65909,A27153,AAAAAAAAAAAAAAAAAAA17075,AAAAAAAAAAAAAAAAA43052,AAAAAAAAAA64777,AAAAAAAAAAAA81511,AAAAAAAAAAAAAA65909,AAAAAAAAAAAAAA28620}
!     40 | {34}                            | {AAAAAAAAAAAAAA10611,AAAAAAAAAAAAAAAAAAA1205,AAAAAAAAAAA50956,AAAAAAAAAAAAAAAA31334,AAAAA70466,AAAAAAAA81587,AAAAAAA74623}
!     41 | {19,26,63,12,93,73,27,94}       | {AAAAAAA79710,AAAAAAAAAA55219,AAAA41702,AAAAAAAAAAAAAAAAAAA17075,AAAAAAAAAAAAAAAAAA71621,AAAAAAAAAAAAAAAAA63050,AAAAAAA99836,AAAAAAAAAAAAAA8666}
!     42 | {15,76,82,75,8,91}              | {AAAAAAAAAAA176,AAAAAA38063,45449,AAAAAA54032,AAAAAAA81898,AA6416,AAAAAAAAAAAAAAAAAAA62179,45449,AAAAA60038,AAAAAAAA81587}
!     43 | {39,87,91,97,79,28}             | {AAAAAAAAAAA74076,A96617,AAAAAAAAAAAAAAAAAAA89194,AAAAAAAAAAAAAAAAAA55796,AAAAAAAAAAAAAAAA23657,AAAAAAAAAAAA67946}
!     44 | {40,58,68,29,54}                | {AAAAAAA81898,AAAAAA66777,AAAAAA98232}
!     45 | {99,45}                         | {AAAAAAAA72908,AAAAAAAAAAAAAAAAAAA17075,AA88409,AAAAAAAAAAAAAAAAAA36842,AAAAAAA48038,AAAAAAAAAAAAAA10611}
!     46 | {53,24}                         | {AAAAAAAAAAA53908,AAAAAA54032,AAAAA17383,AAAA48949,AAAAAAAAAA18601,AAAAA64669,45449,AAAAAAAAAAA98051,AAAAAAAAAAAAAAAAAA71621}
!     47 | {98,23,64,12,75,61}             | {AAA59323,AAAAA95309,AAAAAAAAAAAAAAAA31334,AAAAAAAAA27249,AAAAA17383,AAAAAAAAAAAA37562,AAAAAA1059,A84822,55847,AAAAA70466}
!     48 | {76,14}                         | {AAAAAAAAAAAAA59671,AAAAAAAAAAAAAAAAAAA91804,AAAAAA66777,AAAAAAAAAAAAAAAAAAA89194,AAAAAAAAAAAAAAA36627,AAAAAAAAAAAAAAAAAAA17075,AAAAAAAAAAAAA73084,AAAAAAA79710,AAAAAAAAAAAAAAA40402,AAAAAAAAAAAAAAAAAAA65037}
!     49 | {56,5,54,37,49}                 | {AA21643,AAAAAAAAAAA92631,AAAAAAAA81587}
!     50 | {20,12,37,64,93}                | {AAAAAAAAAA5483,AAAAAAAAAAAAAAAAAAA1205,AA6416,AAAAAAAAAAAAAAAAA63050,AAAAAAAAAAAAAAAAAA47955}
!     51 | {47}                            | {AAAAAAAAAAAAAA96505,AAAAAAAAAAAAAAAAAA36842,AAAAA95309,AAAAAAAA81587,AA6416,AAAA91194,AAAAAA58494,AAAAAA1059,AAAAAAAA69452}
!     52 | {89,0}                          | {AAAAAAAAAAAAAAAAAA47955,AAAAAAA48038,AAAAAAAAAAAAAAAAA43052,AAAAAAAAAAAAA73084,AAAAA70466,AAAAAAAAAAAAAAAAA764,AAAAAAAAAAA46154,AA66862}
!     53 | {38,17}                         | {AAAAAAAAAAA21658}
!     54 | {70,47}                         | {AAAAAAAAAAAAAAAAAA54141,AAAAA40681,AAAAAAA48038,AAAAAAAAAAAAAAAA29150,AAAAA41597,AAAAAAAAAAAAAAAAAA59334,AA15322}
!     55 | {47,79,47,64,72,25,71,24,93}    | {AAAAAAAAAAAAAAAAAA55796,AAAAA62737}
!     56 | {33,7,60,54,93,90,77,85,39}     | {AAAAAAAAAAAAAAAAAA32918,AA42406}
!     57 | {23,45,10,42,36,21,9,96}        | {AAAAAAAAAAAAAAAAAAA70415}
!     58 | {92}                            | {AAAAAAAAAAAAAAAA98414,AAAAAAAA23648,AAAAAAAAAAAAAAAAAA55796,AA25381,AAAAAAAAAAA6119}
!     59 | {9,69,46,77}                    | {39557,AAAAAAA89932,AAAAAAAAAAAAAAAAA43052,AAAAAAAAAAAAAAAAA26540,AAA20874,AA6416,AAAAAAAAAAAAAAAAAA47955}
!     60 | {62,2,59,38,89}                 | {AAAAAAA89932,AAAAAAAAAAAAAAAAAAA15356,AA99927,AA17009,AAAAAAAAAAAAAAA35875}
!     61 | {72,2,44,95,54,54,13}           | {AAAAAAAAAAAAAAAAAAA91804}
!     62 | {83,72,29,73}                   | {AAAAAAAAAAAAA15097,AAAA8857,AAAAAAAAAAAA35809,AAAAAAAAAAAA52814,AAAAAAAAAAAAAAAAAAA38885,AAAAAAAAAAAAAAAAAA24183,AAAAAA43678,A96617}
!     63 | {11,4,61,87}                    | {AAAAAAAAA27249,AAAAAAAAAAAAAAAAAA32918,AAAAAAAAAAAAAAA13198,AAA20874,39557,51533,AAAAAAAAAAA53908,AAAAAAAAAAAAAA96505,AAAAAAAA78938}
!     64 | {26,19,34,24,81,78}             | {A96617,AAAAAAAAAAAAAAAAAAA70104,A68938,AAAAAAAAAAA53908,AAAAAAAAAAAAAAA453,AA17009,AAAAAAA80240}
!     65 | {61,5,76,59,17}                 | {AAAAAA99807,AAAAA64741,AAAAAAAAAAA53908,AA21643,AAAAAAAAA10012}
!     66 | {31,23,70,52,4,33,48,25}        | {AAAAAAAAAAAAAAAAA69675,AAAAAAAA50094,AAAAAAAAAAA92631,AAAA35194,39557,AAAAAAA99836}
!     67 | {31,94,7,10}                    | {AAAAAA38063,A96617,AAAA35194,AAAAAAAAAAAA67946}
!     68 | {90,43,38}                      | {AA75092,AAAAAAAAAAAAAAAAA69675,AAAAAAAAAAA92631,AAAAAAAAA10012,AAAAAAAAAAAAA7929,AA21643}
!     69 | {67,35,99,85,72,86,44}          | {AAAAAAAAAAAAAAAAAAA1205,AAAAAAAA50094,AAAAAAAAAAAAAAAA1729,AAAAAAAAAAAAAAAAAA47955}
!     70 | {56,70,83}                      | {AAAA41702,AAAAAAAAAAA82945,AA21643,AAAAAAAAAAA99000,A27153,AA25381,AAAAAAAAAAAAAA96505,AAAAAAA1242}
!     71 | {74,26}                         | {AAAAAAAAAAA50956,AA74433,AAAAAAA21462,AAAAAAAAAAAAAAAAAAA17075,AAAAAAAAAAAAAAA36627,AAAAAAAAAAAAA70254,AAAAAAAAAA43419,39557}
!     72 | {22,1,16,78,20,91,83}           | {47735,AAAAAAA56483,AAAAAAAAAAAAA93788,AA42406,AAAAAAAAAAAAA73084,AAAAAAAA72908,AAAAAAAAAAAAAAAAAA61286,AAAAA66674,AAAAAAAAAAAAAAAAA50407}
!     73 | {88,25,96,78,65,15,29,19}       | {AAA54451,AAAAAAAAA27249,AAAAAAA9228,AAAAAAAAAAAAAAA67062,AAAAAAAAAAAAAAAAAAA70415,AAAAA17383,AAAAAAAAAAAAAAAA33598}
!     74 | {32}                            | {AAAAAAAAAAAAAAAA1729,AAAAAAAAAAAAA22860,AAAAAA99807,AAAAA17383,AAAAAAAAAAAAAAA67062,AAAAAAAAAAA15165,AAAAAAAAAAA50956}
!     75 | {12,96,83,24,71,89,55}          | {AAAA48949,AAAAAAAA29716,AAAAAAAAAAAAAAAAAAA1205,AAAAAAAAAAAA67946,AAAAAAAAAAAAAAAA29150,AAA28075,AAAAAAAAAAAAAAAAA43052}
!     76 | {92,55,10,7}                    | {AAAAAAAAAAAAAAA67062}
!     77 | {97,15,32,17,55,59,18,37,50,39} | {AAAAAAAAAAAA67946,AAAAAA54032,AAAAAAAA81587,55847,AAAAAAAAAAAAAA28620,AAAAAAAAAAAAAAAAA43052,AAAAAA75463,AAAA49534,AAAAAAAA44066}
!     78 | {55,89,44,84,34}                | {AAAAAAAAAAA6119,AAAAAAAAAAAAAA8666,AA99927,AA42406,AAAAAAA81898,AAAAAAA9228,AAAAAAAAAAA92631,AA21643,AAAAAAAAAAAAAA28620}
!     79 | {45}                            | {AAAAAAAAAA646,AAAAAAAAAAAAAAAAAAA70415,AAAAAA43678,AAAAAAAA72908}
!     80 | {74,89,44,80,0}                 | {AAAA35194,AAAAAAAA79710,AAA20874,AAAAAAAAAAAAAAAAAAA70104,AAAAAAAAAAAAA73084,AAAAAAA57334,AAAAAAA9228,AAAAAAAAAAAAA62007}
!     81 | {63,77,54,48,61,53,97}          | {AAAAAAAAAAAAAAA81326,AAAAAAAAAA22292,AA25381,AAAAAAAAAAA74076,AAAAAAA81898,AAAAAAAAA72121}
!     82 | {34,60,4,79,78,16,86,89,42,50}  | {AAAAA40681,AAAAAAAAAAAAAAAAAA12591,AAAAAAA80240,AAAAAAAAAAAAAAAA55798,AAAAAAAAAAAAAAAAAAA70104}
!     83 | {14,10}                         | {AAAAAAAAAA22292,AAAAAAAAAAAAA70254,AAAAAAAAAAA6119}
!     84 | {11,83,35,13,96,94}             | {AAAAA95309,AAAAAAAAAAAAAAAAAA32918,AAAAAAAAAAAAAAAAAA24183}
!     85 | {39,60}                         | {AAAAAAAAAAAAAAAA55798,AAAAAAAAAA22292,AAAAAAA66161,AAAAAAA21462,AAAAAAAAAAAAAAAAAA12591,55847,AAAAAA98232,AAAAAAAAAAA46154}
!     86 | {33,81,72,74,45,36,82}          | {AAAAAAAA81587,AAAAAAAAAAAAAA96505,45449,AAAA80176}
!     87 | {57,27,50,12,97,68}             | {AAAAAAAAAAAAAAAAA26540,AAAAAAAAA10012,AAAAAAAAAAAA35809,AAAAAAAAAAAAAAAA29150,AAAAAAAAAAA82945,AAAAAA66777,31228,AAAAAAAAAAAAAAAA23657,AAAAAAAAAAAAAA28620,AAAAAAAAAAAAAA96505}
!     88 | {41,90,77,24,6,24}              | {AAAA35194,AAAA35194,AAAAAAA80240,AAAAAAAAAAA46154,AAAAAA58494,AAAAAAAAAAAAAAAAAAA17075,AAAAAAAAAAAAAAAAAA59334,AAAAAAAAAAAAAAAAAAA91804,AA74433}
!     89 | {40,32,17,6,30,88}              | {AA44673,AAAAAAAAAAA6119,AAAAAAAAAAAAAAAA23657,AAAAAAAAAAAAAAAAAA47955,AAAAAAAAAAAAAAAA33598,AAAAAAAAAAA33576,AA44673}
!     90 | {88,75}                         | {AAAAA60038,AAAAAAAA23648,AAAAAAAAAAA99000,AAAA41702,AAAAAAAAAAAAA22860,AAAAAAAAAAAAAAA68526}
!     91 | {78}                            | {AAAAAAAAAAAAA62007,AAA99043}
!     92 | {85,63,49,45}                   | {AAAAAAA89932,AAAAAAAAAAAAA22860,AAAAAAAAAAAAAAAAAAA1205,AAAAAAAAAAAA21089}
!     93 | {11}                            | {AAAAAAAAAAA176,AAAAAAAAAAAAAA8666,AAAAAAAAAAAAAAA453,AAAAAAAAAAAAA85723,A68938,AAAAAAAAAAAAA9821,AAAAAAA48038,AAAAAAAAAAAAAAAAA59387,AA99927,AAAAA17383}
!     94 | {98,9,85,62,88,91,60,61,38,86}  | {AAAAAAAA81587,AAAAA17383,AAAAAAAA81587}
!     95 | {47,77}                         | {AAAAAAAAAAAAAAAAA764,AAAAAAAAAAA74076,AAAAAAAAAA18107,AAAAA40681,AAAAAAAAAAAAAAA35875,AAAAA60038,AAAAAAA56483}
!     96 | {23,97,43}                      | {AAAAAAAAAA646,A87088}
!     97 | {54,2,86,65}                    | {47735,AAAAAAA99836,AAAAAAAAAAAAAAAAA6897,AAAAAAAAAAAAAAAA29150,AAAAAAA80240,AAAAAAAAAAAAAAAA98414,AAAAAAA56483,AAAAAAAAAAAAAAAA29150,AAAAAAA39692,AA21643}
!     98 | {38,34,32,89}                   | {AAAAAAAAAAAAAAAAAA71621,AAAA8857,AAAAAAAAAAAAAAAAAAA65037,AAAAAAAAAAAAAAAA31334,AAAAAAAAAA48845}
!     99 | {37,86}                         | {AAAAAAAAAAAAAAAAAA32918,AAAAA70514,AAAAAAAAA10012,AAAAAAAAAAAAAAAAA59387,AAAAAAAAAA64777,AAAAAAAAAAAAAAAAAAA15356}
!    100 | {85,32,57,39,49,84,32,3,30}     | {AAAAAAA80240,AAAAAAAAAAAAAAAA1729,AAAAA60038,AAAAAAAAAAA92631,AAAAAAAA9523}
!    101 | {}                              | {}
!    102 | {NULL}                          | {NULL}
! (102 rows)
! 
! SELECT * FROM array_op_test WHERE i && '{}' ORDER BY seqno;
!  seqno | i | t 
! -------+---+---
! (0 rows)
! 
! SELECT * FROM array_op_test WHERE i <@ '{}' ORDER BY seqno;
!  seqno | i  | t  
! -------+----+----
!    101 | {} | {}
! (1 row)
! 
! SELECT * FROM array_op_test WHERE i = '{NULL}' ORDER BY seqno;
!  seqno |   i    |   t    
! -------+--------+--------
!    102 | {NULL} | {NULL}
! (1 row)
! 
! SELECT * FROM array_op_test WHERE i @> '{NULL}' ORDER BY seqno;
!  seqno | i | t 
! -------+---+---
! (0 rows)
! 
! SELECT * FROM array_op_test WHERE i && '{NULL}' ORDER BY seqno;
!  seqno | i | t 
! -------+---+---
! (0 rows)
! 
! SELECT * FROM array_op_test WHERE i <@ '{NULL}' ORDER BY seqno;
!  seqno | i  | t  
! -------+----+----
!    101 | {} | {}
! (1 row)
! 
! SELECT * FROM array_op_test WHERE t @> '{AAAAAAAA72908}' ORDER BY seqno;
!  seqno |           i           |                                                                     t                                                                      
! -------+-----------------------+--------------------------------------------------------------------------------------------------------------------------------------------
!     22 | {11,6,56,62,53,30}    | {AAAAAAAA72908}
!     45 | {99,45}               | {AAAAAAAA72908,AAAAAAAAAAAAAAAAAAA17075,AA88409,AAAAAAAAAAAAAAAAAA36842,AAAAAAA48038,AAAAAAAAAAAAAA10611}
!     72 | {22,1,16,78,20,91,83} | {47735,AAAAAAA56483,AAAAAAAAAAAAA93788,AA42406,AAAAAAAAAAAAA73084,AAAAAAAA72908,AAAAAAAAAAAAAAAAAA61286,AAAAA66674,AAAAAAAAAAAAAAAAA50407}
!     79 | {45}                  | {AAAAAAAAAA646,AAAAAAAAAAAAAAAAAAA70415,AAAAAA43678,AAAAAAAA72908}
! (4 rows)
! 
! SELECT * FROM array_op_test WHERE t && '{AAAAAAAA72908}' ORDER BY seqno;
!  seqno |           i           |                                                                     t                                                                      
! -------+-----------------------+--------------------------------------------------------------------------------------------------------------------------------------------
!     22 | {11,6,56,62,53,30}    | {AAAAAAAA72908}
!     45 | {99,45}               | {AAAAAAAA72908,AAAAAAAAAAAAAAAAAAA17075,AA88409,AAAAAAAAAAAAAAAAAA36842,AAAAAAA48038,AAAAAAAAAAAAAA10611}
!     72 | {22,1,16,78,20,91,83} | {47735,AAAAAAA56483,AAAAAAAAAAAAA93788,AA42406,AAAAAAAAAAAAA73084,AAAAAAAA72908,AAAAAAAAAAAAAAAAAA61286,AAAAA66674,AAAAAAAAAAAAAAAAA50407}
!     79 | {45}                  | {AAAAAAAAAA646,AAAAAAAAAAAAAAAAAAA70415,AAAAAA43678,AAAAAAAA72908}
! (4 rows)
! 
! SELECT * FROM array_op_test WHERE t @> '{AAAAAAAAAA646}' ORDER BY seqno;
!  seqno |        i         |                                 t                                  
! -------+------------------+--------------------------------------------------------------------
!     15 | {17,14,16,63,67} | {AA6416,AAAAAAAAAA646,AAAAA95309}
!     79 | {45}             | {AAAAAAAAAA646,AAAAAAAAAAAAAAAAAAA70415,AAAAAA43678,AAAAAAAA72908}
!     96 | {23,97,43}       | {AAAAAAAAAA646,A87088}
! (3 rows)
! 
! SELECT * FROM array_op_test WHERE t && '{AAAAAAAAAA646}' ORDER BY seqno;
!  seqno |        i         |                                 t                                  
! -------+------------------+--------------------------------------------------------------------
!     15 | {17,14,16,63,67} | {AA6416,AAAAAAAAAA646,AAAAA95309}
!     79 | {45}             | {AAAAAAAAAA646,AAAAAAAAAAAAAAAAAAA70415,AAAAAA43678,AAAAAAAA72908}
!     96 | {23,97,43}       | {AAAAAAAAAA646,A87088}
! (3 rows)
! 
! SELECT * FROM array_op_test WHERE t @> '{AAAAAAAA72908,AAAAAAAAAA646}' ORDER BY seqno;
!  seqno |  i   |                                 t                                  
! -------+------+--------------------------------------------------------------------
!     79 | {45} | {AAAAAAAAAA646,AAAAAAAAAAAAAAAAAAA70415,AAAAAA43678,AAAAAAAA72908}
! (1 row)
! 
! SELECT * FROM array_op_test WHERE t && '{AAAAAAAA72908,AAAAAAAAAA646}' ORDER BY seqno;
!  seqno |           i           |                                                                     t                                                                      
! -------+-----------------------+--------------------------------------------------------------------------------------------------------------------------------------------
!     15 | {17,14,16,63,67}      | {AA6416,AAAAAAAAAA646,AAAAA95309}
!     22 | {11,6,56,62,53,30}    | {AAAAAAAA72908}
!     45 | {99,45}               | {AAAAAAAA72908,AAAAAAAAAAAAAAAAAAA17075,AA88409,AAAAAAAAAAAAAAAAAA36842,AAAAAAA48038,AAAAAAAAAAAAAA10611}
!     72 | {22,1,16,78,20,91,83} | {47735,AAAAAAA56483,AAAAAAAAAAAAA93788,AA42406,AAAAAAAAAAAAA73084,AAAAAAAA72908,AAAAAAAAAAAAAAAAAA61286,AAAAA66674,AAAAAAAAAAAAAAAAA50407}
!     79 | {45}                  | {AAAAAAAAAA646,AAAAAAAAAAAAAAAAAAA70415,AAAAAA43678,AAAAAAAA72908}
!     96 | {23,97,43}            | {AAAAAAAAAA646,A87088}
! (6 rows)
! 
! SELECT * FROM array_op_test WHERE t <@ '{AAAAAAAA72908,AAAAAAAAAAAAAAAAAAA17075,AA88409,AAAAAAAAAAAAAAAAAA36842,AAAAAAA48038,AAAAAAAAAAAAAA10611}' ORDER BY seqno;
!  seqno |         i          |                                                     t                                                     
! -------+--------------------+-----------------------------------------------------------------------------------------------------------
!     22 | {11,6,56,62,53,30} | {AAAAAAAA72908}
!     45 | {99,45}            | {AAAAAAAA72908,AAAAAAAAAAAAAAAAAAA17075,AA88409,AAAAAAAAAAAAAAAAAA36842,AAAAAAA48038,AAAAAAAAAAAAAA10611}
!    101 | {}                 | {}
! (3 rows)
! 
! SELECT * FROM array_op_test WHERE t = '{}' ORDER BY seqno;
!  seqno | i  | t  
! -------+----+----
!    101 | {} | {}
! (1 row)
! 
! SELECT * FROM array_op_test WHERE t @> '{}' ORDER BY seqno;
!  seqno |                i                |                                                                                                       t                                                                                                        
! -------+---------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
!      1 | {92,75,71,52,64,83}             | {AAAAAAAA44066,AAAAAA1059,AAAAAAAAAAA176,AAAAAAA48038}
!      2 | {3,6}                           | {AAAAAA98232,AAAAAAAA79710,AAAAAAAAAAAAAAAAA69675,AAAAAAAAAAAAAAAA55798,AAAAAAAAA12793}
!      3 | {37,64,95,43,3,41,13,30,11,43}  | {AAAAAAAAAA48845,AAAAA75968,AAAAA95309,AAA54451,AAAAAAAAAA22292,AAAAAAA99836,A96617,AA17009,AAAAAAAAAAAAAA95246}
!      4 | {71,39,99,55,33,75,45}          | {AAAAAAAAA53663,AAAAAAAAAAAAAAA67062,AAAAAAAAAA64777,AAA99043,AAAAAAAAAAAAAAAAAAA91804,39557}
!      5 | {50,42,77,50,4}                 | {AAAAAAAAAAAAAAAAA26540,AAAAAAA79710,AAAAAAAAAAAAAAAAAAA1205,AAAAAAAAAAA176,AAAAA95309,AAAAAAAAAAA46154,AAAAAA66777,AAAAAAAAA27249,AAAAAAAAAA64777,AAAAAAAAAAAAAAAAAAA70104}
!      6 | {39,35,5,94,17,92,60,32}        | {AAAAAAAAAAAAAAA35875,AAAAAAAAAAAAAAAA23657}
!      7 | {12,51,88,64,8}                 | {AAAAAAAAAAAAAAAAAA12591,AAAAAAAAAAAAAAAAA50407,AAAAAAAAAAAA67946}
!      8 | {60,84}                         | {AAAAAAA81898,AAAAAA1059,AAAAAAAAAAAA81511,AAAAA961,AAAAAAAAAAAAAAAA31334,AAAAA64741,AA6416,AAAAAAAAAAAAAAAAAA32918,AAAAAAAAAAAAAAAAA50407}
!      9 | {56,52,35,27,80,44,81,22}       | {AAAAAAAAAAAAAAA73034,AAAAAAAAAAAAA7929,AAAAAAA66161,AA88409,39557,A27153,AAAAAAAA9523,AAAAAAAAAAA99000}
!     10 | {71,5,45}                       | {AAAAAAAAAAA21658,AAAAAAAAAAAA21089,AAA54451,AAAAAAAAAAAAAAAAAA54141,AAAAAAAAAAAAAA28620,AAAAAAAAAAA21658,AAAAAAAAAAA74076,AAAAAAAAA27249}
!     11 | {41,86,74,48,22,74,47,50}       | {AAAAAAAA9523,AAAAAAAAAAAA37562,AAAAAAAAAAAAAAAA14047,AAAAAAAAAAA46154,AAAA41702,AAAAAAAAAAAAAAAAA764,AAAAA62737,39557}
!     12 | {17,99,18,52,91,72,0,43,96,23}  | {AAAAA33250,AAAAAAAAAAAAAAAAAAA85420,AAAAAAAAAAA33576}
!     13 | {3,52,34,23}                    | {AAAAAA98232,AAAA49534,AAAAAAAAAAA21658}
!     14 | {78,57,19}                      | {AAAA8857,AAAAAAAAAAAAAAA73034,AAAAAAAA81587,AAAAAAAAAAAAAAA68526,AAAAA75968,AAAAAAAAAAAAAA65909,AAAAAAAAA10012,AAAAAAAAAAAAAA65909}
!     15 | {17,14,16,63,67}                | {AA6416,AAAAAAAAAA646,AAAAA95309}
!     16 | {14,63,85,11}                   | {AAAAAA66777}
!     17 | {7,10,81,85}                    | {AAAAAA43678,AAAAAAA12144,AAAAAAAAAAA50956,AAAAAAAAAAAAAAAAAAA15356}
!     18 | {1}                             | {AAAAAAAAAAA33576,AAAAA95309,64261,AAA59323,AAAAAAAAAAAAAA95246,55847,AAAAAAAAAAAA67946,AAAAAAAAAAAAAAAAAA64374}
!     19 | {52,82,17,74,23,46,69,51,75}    | {AAAAAAAAAAAAA73084,AAAAA75968,AAAAAAAAAAAAAAAA14047,AAAAAAA80240,AAAAAAAAAAAAAAAAAAA1205,A68938}
!     20 | {72,89,70,51,54,37,8,49,79}     | {AAAAAA58494}
!     21 | {2,8,65,10,5,79,43}             | {AAAAAAAAAAAAAAAAA88852,AAAAAAAAAAAAAAAAAAA91804,AAAAA64669,AAAAAAAAAAAAAAAA1443,AAAAAAAAAAAAAAAA23657,AAAAA12179,AAAAAAAAAAAAAAAAA88852,AAAAAAAAAAAAAAAA31334,AAAAAAAAAAAAAAAA41303,AAAAAAAAAAAAAAAAAAA85420}
!     22 | {11,6,56,62,53,30}              | {AAAAAAAA72908}
!     23 | {40,90,5,38,72,40,30,10,43,55}  | {A6053,AAAAAAAAAAA6119,AA44673,AAAAAAAAAAAAAAAAA764,AA17009,AAAAA17383,AAAAA70514,AAAAA33250,AAAAA95309,AAAAAAAAAAAA37562}
!     24 | {94,61,99,35,48}                | {AAAAAAAAAAA50956,AAAAAAAAAAA15165,AAAA85070,AAAAAAAAAAAAAAA36627,AAAAA961,AAAAAAAAAA55219}
!     25 | {31,1,10,11,27,79,38}           | {AAAAAAAAAAAAAAAAAA59334,45449}
!     26 | {71,10,9,69,75}                 | {47735,AAAAAAA21462,AAAAAAAAAAAAAAAAA6897,AAAAAAAAAAAAAAAAAAA91804,AAAAAAAAA72121,AAAAAAAAAAAAAAAAAAA1205,AAAAA41597,AAAA8857,AAAAAAAAAAAAAAAAAAA15356,AA17009}
!     27 | {94}                            | {AA6416,A6053,AAAAAAA21462,AAAAAAA57334,AAAAAAAAAAAAAAAAAA12591,AA88409,AAAAAAAAAAAAA70254}
!     28 | {14,33,6,34,14}                 | {AAAAAAAAAAAAAAA13198,AAAAAAAA69452,AAAAAAAAAAA82945,AAAAAAA12144,AAAAAAAAA72121,AAAAAAAAAA18601}
!     29 | {39,21}                         | {AAAAAAAAAAAAAAAAA6897,AAAAAAAAAAAAAAAAAAA38885,AAAA85070,AAAAAAAAAAAAAAAAAAA70104,AAAAA66674,AAAAAAAAAAAAA62007,AAAAAAAA69452,AAAAAAA1242,AAAAAAAAAAAAAAAA1729,AAAA35194}
!     30 | {26,81,47,91,34}                | {AAAAAAAAAAAAAAAAAAA70104,AAAAAAA80240}
!     31 | {80,24,18,21,54}                | {AAAAAAAAAAAAAAA13198,AAAAAAAAAAAAAAAAAAA70415,A27153,AAAAAAAAA53663,AAAAAAAAAAAAAAAAA50407,A68938}
!     32 | {58,79,82,80,67,75,98,10,41}    | {AAAAAAAAAAAAAAAAAA61286,AAA54451,AAAAAAAAAAAAAAAAAAA87527,A96617,51533}
!     33 | {74,73}                         | {A85417,AAAAAAA56483,AAAAA17383,AAAAAAAAAAAAA62159,AAAAAAAAAAAA52814,AAAAAAAAAAAAA85723,AAAAAAAAAAAAAAAAAA55796}
!     34 | {70,45}                         | {AAAAAAAAAAAAAAAAAA71621,AAAAAAAAAAAAAA28620,AAAAAAAAAA55219,AAAAAAAA23648,AAAAAAAAAA22292,AAAAAAA1242}
!     35 | {23,40}                         | {AAAAAAAAAAAA52814,AAAA48949,AAAAAAAAA34727,AAAA8857,AAAAAAAAAAAAAAAAAAA62179,AAAAAAAAAAAAAAA68526,AAAAAAA99836,AAAAAAAA50094,AAAA91194,AAAAAAAAAAAAA73084}
!     36 | {79,82,14,52,30,5,79}           | {AAAAAAAAA53663,AAAAAAAAAAAAAAAA55798,AAAAAAAAAAAAAAAAAAA89194,AA88409,AAAAAAAAAAAAAAA81326,AAAAAAAAAAAAAAAAA63050,AAAAAAAAAAAAAAAA33598}
!     37 | {53,11,81,39,3,78,58,64,74}     | {AAAAAAAAAAAAAAAAAAA17075,AAAAAAA66161,AAAAAAAA23648,AAAAAAAAAAAAAA10611}
!     38 | {59,5,4,95,28}                  | {AAAAAAAAAAA82945,A96617,47735,AAAAA12179,AAAAA64669,AAAAAA99807,AA74433,AAAAAAAAAAAAAAAAA59387}
!     39 | {82,43,99,16,74}                | {AAAAAAAAAAAAAAA67062,AAAAAAA57334,AAAAAAAAAAAAAA65909,A27153,AAAAAAAAAAAAAAAAAAA17075,AAAAAAAAAAAAAAAAA43052,AAAAAAAAAA64777,AAAAAAAAAAAA81511,AAAAAAAAAAAAAA65909,AAAAAAAAAAAAAA28620}
!     40 | {34}                            | {AAAAAAAAAAAAAA10611,AAAAAAAAAAAAAAAAAAA1205,AAAAAAAAAAA50956,AAAAAAAAAAAAAAAA31334,AAAAA70466,AAAAAAAA81587,AAAAAAA74623}
!     41 | {19,26,63,12,93,73,27,94}       | {AAAAAAA79710,AAAAAAAAAA55219,AAAA41702,AAAAAAAAAAAAAAAAAAA17075,AAAAAAAAAAAAAAAAAA71621,AAAAAAAAAAAAAAAAA63050,AAAAAAA99836,AAAAAAAAAAAAAA8666}
!     42 | {15,76,82,75,8,91}              | {AAAAAAAAAAA176,AAAAAA38063,45449,AAAAAA54032,AAAAAAA81898,AA6416,AAAAAAAAAAAAAAAAAAA62179,45449,AAAAA60038,AAAAAAAA81587}
!     43 | {39,87,91,97,79,28}             | {AAAAAAAAAAA74076,A96617,AAAAAAAAAAAAAAAAAAA89194,AAAAAAAAAAAAAAAAAA55796,AAAAAAAAAAAAAAAA23657,AAAAAAAAAAAA67946}
!     44 | {40,58,68,29,54}                | {AAAAAAA81898,AAAAAA66777,AAAAAA98232}
!     45 | {99,45}                         | {AAAAAAAA72908,AAAAAAAAAAAAAAAAAAA17075,AA88409,AAAAAAAAAAAAAAAAAA36842,AAAAAAA48038,AAAAAAAAAAAAAA10611}
!     46 | {53,24}                         | {AAAAAAAAAAA53908,AAAAAA54032,AAAAA17383,AAAA48949,AAAAAAAAAA18601,AAAAA64669,45449,AAAAAAAAAAA98051,AAAAAAAAAAAAAAAAAA71621}
!     47 | {98,23,64,12,75,61}             | {AAA59323,AAAAA95309,AAAAAAAAAAAAAAAA31334,AAAAAAAAA27249,AAAAA17383,AAAAAAAAAAAA37562,AAAAAA1059,A84822,55847,AAAAA70466}
!     48 | {76,14}                         | {AAAAAAAAAAAAA59671,AAAAAAAAAAAAAAAAAAA91804,AAAAAA66777,AAAAAAAAAAAAAAAAAAA89194,AAAAAAAAAAAAAAA36627,AAAAAAAAAAAAAAAAAAA17075,AAAAAAAAAAAAA73084,AAAAAAA79710,AAAAAAAAAAAAAAA40402,AAAAAAAAAAAAAAAAAAA65037}
!     49 | {56,5,54,37,49}                 | {AA21643,AAAAAAAAAAA92631,AAAAAAAA81587}
!     50 | {20,12,37,64,93}                | {AAAAAAAAAA5483,AAAAAAAAAAAAAAAAAAA1205,AA6416,AAAAAAAAAAAAAAAAA63050,AAAAAAAAAAAAAAAAAA47955}
!     51 | {47}                            | {AAAAAAAAAAAAAA96505,AAAAAAAAAAAAAAAAAA36842,AAAAA95309,AAAAAAAA81587,AA6416,AAAA91194,AAAAAA58494,AAAAAA1059,AAAAAAAA69452}
!     52 | {89,0}                          | {AAAAAAAAAAAAAAAAAA47955,AAAAAAA48038,AAAAAAAAAAAAAAAAA43052,AAAAAAAAAAAAA73084,AAAAA70466,AAAAAAAAAAAAAAAAA764,AAAAAAAAAAA46154,AA66862}
!     53 | {38,17}                         | {AAAAAAAAAAA21658}
!     54 | {70,47}                         | {AAAAAAAAAAAAAAAAAA54141,AAAAA40681,AAAAAAA48038,AAAAAAAAAAAAAAAA29150,AAAAA41597,AAAAAAAAAAAAAAAAAA59334,AA15322}
!     55 | {47,79,47,64,72,25,71,24,93}    | {AAAAAAAAAAAAAAAAAA55796,AAAAA62737}
!     56 | {33,7,60,54,93,90,77,85,39}     | {AAAAAAAAAAAAAAAAAA32918,AA42406}
!     57 | {23,45,10,42,36,21,9,96}        | {AAAAAAAAAAAAAAAAAAA70415}
!     58 | {92}                            | {AAAAAAAAAAAAAAAA98414,AAAAAAAA23648,AAAAAAAAAAAAAAAAAA55796,AA25381,AAAAAAAAAAA6119}
!     59 | {9,69,46,77}                    | {39557,AAAAAAA89932,AAAAAAAAAAAAAAAAA43052,AAAAAAAAAAAAAAAAA26540,AAA20874,AA6416,AAAAAAAAAAAAAAAAAA47955}
!     60 | {62,2,59,38,89}                 | {AAAAAAA89932,AAAAAAAAAAAAAAAAAAA15356,AA99927,AA17009,AAAAAAAAAAAAAAA35875}
!     61 | {72,2,44,95,54,54,13}           | {AAAAAAAAAAAAAAAAAAA91804}
!     62 | {83,72,29,73}                   | {AAAAAAAAAAAAA15097,AAAA8857,AAAAAAAAAAAA35809,AAAAAAAAAAAA52814,AAAAAAAAAAAAAAAAAAA38885,AAAAAAAAAAAAAAAAAA24183,AAAAAA43678,A96617}
!     63 | {11,4,61,87}                    | {AAAAAAAAA27249,AAAAAAAAAAAAAAAAAA32918,AAAAAAAAAAAAAAA13198,AAA20874,39557,51533,AAAAAAAAAAA53908,AAAAAAAAAAAAAA96505,AAAAAAAA78938}
!     64 | {26,19,34,24,81,78}             | {A96617,AAAAAAAAAAAAAAAAAAA70104,A68938,AAAAAAAAAAA53908,AAAAAAAAAAAAAAA453,AA17009,AAAAAAA80240}
!     65 | {61,5,76,59,17}                 | {AAAAAA99807,AAAAA64741,AAAAAAAAAAA53908,AA21643,AAAAAAAAA10012}
!     66 | {31,23,70,52,4,33,48,25}        | {AAAAAAAAAAAAAAAAA69675,AAAAAAAA50094,AAAAAAAAAAA92631,AAAA35194,39557,AAAAAAA99836}
!     67 | {31,94,7,10}                    | {AAAAAA38063,A96617,AAAA35194,AAAAAAAAAAAA67946}
!     68 | {90,43,38}                      | {AA75092,AAAAAAAAAAAAAAAAA69675,AAAAAAAAAAA92631,AAAAAAAAA10012,AAAAAAAAAAAAA7929,AA21643}
!     69 | {67,35,99,85,72,86,44}          | {AAAAAAAAAAAAAAAAAAA1205,AAAAAAAA50094,AAAAAAAAAAAAAAAA1729,AAAAAAAAAAAAAAAAAA47955}
!     70 | {56,70,83}                      | {AAAA41702,AAAAAAAAAAA82945,AA21643,AAAAAAAAAAA99000,A27153,AA25381,AAAAAAAAAAAAAA96505,AAAAAAA1242}
!     71 | {74,26}                         | {AAAAAAAAAAA50956,AA74433,AAAAAAA21462,AAAAAAAAAAAAAAAAAAA17075,AAAAAAAAAAAAAAA36627,AAAAAAAAAAAAA70254,AAAAAAAAAA43419,39557}
!     72 | {22,1,16,78,20,91,83}           | {47735,AAAAAAA56483,AAAAAAAAAAAAA93788,AA42406,AAAAAAAAAAAAA73084,AAAAAAAA72908,AAAAAAAAAAAAAAAAAA61286,AAAAA66674,AAAAAAAAAAAAAAAAA50407}
!     73 | {88,25,96,78,65,15,29,19}       | {AAA54451,AAAAAAAAA27249,AAAAAAA9228,AAAAAAAAAAAAAAA67062,AAAAAAAAAAAAAAAAAAA70415,AAAAA17383,AAAAAAAAAAAAAAAA33598}
!     74 | {32}                            | {AAAAAAAAAAAAAAAA1729,AAAAAAAAAAAAA22860,AAAAAA99807,AAAAA17383,AAAAAAAAAAAAAAA67062,AAAAAAAAAAA15165,AAAAAAAAAAA50956}
!     75 | {12,96,83,24,71,89,55}          | {AAAA48949,AAAAAAAA29716,AAAAAAAAAAAAAAAAAAA1205,AAAAAAAAAAAA67946,AAAAAAAAAAAAAAAA29150,AAA28075,AAAAAAAAAAAAAAAAA43052}
!     76 | {92,55,10,7}                    | {AAAAAAAAAAAAAAA67062}
!     77 | {97,15,32,17,55,59,18,37,50,39} | {AAAAAAAAAAAA67946,AAAAAA54032,AAAAAAAA81587,55847,AAAAAAAAAAAAAA28620,AAAAAAAAAAAAAAAAA43052,AAAAAA75463,AAAA49534,AAAAAAAA44066}
!     78 | {55,89,44,84,34}                | {AAAAAAAAAAA6119,AAAAAAAAAAAAAA8666,AA99927,AA42406,AAAAAAA81898,AAAAAAA9228,AAAAAAAAAAA92631,AA21643,AAAAAAAAAAAAAA28620}
!     79 | {45}                            | {AAAAAAAAAA646,AAAAAAAAAAAAAAAAAAA70415,AAAAAA43678,AAAAAAAA72908}
!     80 | {74,89,44,80,0}                 | {AAAA35194,AAAAAAAA79710,AAA20874,AAAAAAAAAAAAAAAAAAA70104,AAAAAAAAAAAAA73084,AAAAAAA57334,AAAAAAA9228,AAAAAAAAAAAAA62007}
!     81 | {63,77,54,48,61,53,97}          | {AAAAAAAAAAAAAAA81326,AAAAAAAAAA22292,AA25381,AAAAAAAAAAA74076,AAAAAAA81898,AAAAAAAAA72121}
!     82 | {34,60,4,79,78,16,86,89,42,50}  | {AAAAA40681,AAAAAAAAAAAAAAAAAA12591,AAAAAAA80240,AAAAAAAAAAAAAAAA55798,AAAAAAAAAAAAAAAAAAA70104}
!     83 | {14,10}                         | {AAAAAAAAAA22292,AAAAAAAAAAAAA70254,AAAAAAAAAAA6119}
!     84 | {11,83,35,13,96,94}             | {AAAAA95309,AAAAAAAAAAAAAAAAAA32918,AAAAAAAAAAAAAAAAAA24183}
!     85 | {39,60}                         | {AAAAAAAAAAAAAAAA55798,AAAAAAAAAA22292,AAAAAAA66161,AAAAAAA21462,AAAAAAAAAAAAAAAAAA12591,55847,AAAAAA98232,AAAAAAAAAAA46154}
!     86 | {33,81,72,74,45,36,82}          | {AAAAAAAA81587,AAAAAAAAAAAAAA96505,45449,AAAA80176}
!     87 | {57,27,50,12,97,68}             | {AAAAAAAAAAAAAAAAA26540,AAAAAAAAA10012,AAAAAAAAAAAA35809,AAAAAAAAAAAAAAAA29150,AAAAAAAAAAA82945,AAAAAA66777,31228,AAAAAAAAAAAAAAAA23657,AAAAAAAAAAAAAA28620,AAAAAAAAAAAAAA96505}
!     88 | {41,90,77,24,6,24}              | {AAAA35194,AAAA35194,AAAAAAA80240,AAAAAAAAAAA46154,AAAAAA58494,AAAAAAAAAAAAAAAAAAA17075,AAAAAAAAAAAAAAAAAA59334,AAAAAAAAAAAAAAAAAAA91804,AA74433}
!     89 | {40,32,17,6,30,88}              | {AA44673,AAAAAAAAAAA6119,AAAAAAAAAAAAAAAA23657,AAAAAAAAAAAAAAAAAA47955,AAAAAAAAAAAAAAAA33598,AAAAAAAAAAA33576,AA44673}
!     90 | {88,75}                         | {AAAAA60038,AAAAAAAA23648,AAAAAAAAAAA99000,AAAA41702,AAAAAAAAAAAAA22860,AAAAAAAAAAAAAAA68526}
!     91 | {78}                            | {AAAAAAAAAAAAA62007,AAA99043}
!     92 | {85,63,49,45}                   | {AAAAAAA89932,AAAAAAAAAAAAA22860,AAAAAAAAAAAAAAAAAAA1205,AAAAAAAAAAAA21089}
!     93 | {11}                            | {AAAAAAAAAAA176,AAAAAAAAAAAAAA8666,AAAAAAAAAAAAAAA453,AAAAAAAAAAAAA85723,A68938,AAAAAAAAAAAAA9821,AAAAAAA48038,AAAAAAAAAAAAAAAAA59387,AA99927,AAAAA17383}
!     94 | {98,9,85,62,88,91,60,61,38,86}  | {AAAAAAAA81587,AAAAA17383,AAAAAAAA81587}
!     95 | {47,77}                         | {AAAAAAAAAAAAAAAAA764,AAAAAAAAAAA74076,AAAAAAAAAA18107,AAAAA40681,AAAAAAAAAAAAAAA35875,AAAAA60038,AAAAAAA56483}
!     96 | {23,97,43}                      | {AAAAAAAAAA646,A87088}
!     97 | {54,2,86,65}                    | {47735,AAAAAAA99836,AAAAAAAAAAAAAAAAA6897,AAAAAAAAAAAAAAAA29150,AAAAAAA80240,AAAAAAAAAAAAAAAA98414,AAAAAAA56483,AAAAAAAAAAAAAAAA29150,AAAAAAA39692,AA21643}
!     98 | {38,34,32,89}                   | {AAAAAAAAAAAAAAAAAA71621,AAAA8857,AAAAAAAAAAAAAAAAAAA65037,AAAAAAAAAAAAAAAA31334,AAAAAAAAAA48845}
!     99 | {37,86}                         | {AAAAAAAAAAAAAAAAAA32918,AAAAA70514,AAAAAAAAA10012,AAAAAAAAAAAAAAAAA59387,AAAAAAAAAA64777,AAAAAAAAAAAAAAAAAAA15356}
!    100 | {85,32,57,39,49,84,32,3,30}     | {AAAAAAA80240,AAAAAAAAAAAAAAAA1729,AAAAA60038,AAAAAAAAAAA92631,AAAAAAAA9523}
!    101 | {}                              | {}
!    102 | {NULL}                          | {NULL}
! (102 rows)
! 
! SELECT * FROM array_op_test WHERE t && '{}' ORDER BY seqno;
!  seqno | i | t 
! -------+---+---
! (0 rows)
! 
! SELECT * FROM array_op_test WHERE t <@ '{}' ORDER BY seqno;
!  seqno | i  | t  
! -------+----+----
!    101 | {} | {}
! (1 row)
! 
! -- array casts
! SELECT ARRAY[1,2,3]::text[]::int[]::float8[] AS "{1,2,3}";
!  {1,2,3} 
! ---------
!  {1,2,3}
! (1 row)
! 
! SELECT ARRAY[1,2,3]::text[]::int[]::float8[] is of (float8[]) as "TRUE";
!  TRUE 
! ------
!  t
! (1 row)
! 
! SELECT ARRAY[['a','bc'],['def','hijk']]::text[]::varchar[] AS "{{a,bc},{def,hijk}}";
!  {{a,bc},{def,hijk}} 
! ---------------------
!  {{a,bc},{def,hijk}}
! (1 row)
! 
! SELECT ARRAY[['a','bc'],['def','hijk']]::text[]::varchar[] is of (varchar[]) as "TRUE";
!  TRUE 
! ------
!  t
! (1 row)
! 
! SELECT CAST(ARRAY[[[[[['a','bb','ccc']]]]]] as text[]) as "{{{{{{a,bb,ccc}}}}}}";
!  {{{{{{a,bb,ccc}}}}}} 
! ----------------------
!  {{{{{{a,bb,ccc}}}}}}
! (1 row)
! 
! -- scalar op any/all (array)
! select 33 = any ('{1,2,3}');
!  ?column? 
! ----------
!  f
! (1 row)
! 
! select 33 = any ('{1,2,33}');
!  ?column? 
! ----------
!  t
! (1 row)
! 
! select 33 = all ('{1,2,33}');
!  ?column? 
! ----------
!  f
! (1 row)
! 
! select 33 >= all ('{1,2,33}');
!  ?column? 
! ----------
!  t
! (1 row)
! 
! -- boundary cases
! select null::int >= all ('{1,2,33}');
!  ?column? 
! ----------
!  
! (1 row)
! 
! select null::int >= all ('{}');
!  ?column? 
! ----------
!  t
! (1 row)
! 
! select null::int >= any ('{}');
!  ?column? 
! ----------
!  f
! (1 row)
! 
! -- cross-datatype
! select 33.4 = any (array[1,2,3]);
!  ?column? 
! ----------
!  f
! (1 row)
! 
! select 33.4 > all (array[1,2,3]);
!  ?column? 
! ----------
!  t
! (1 row)
! 
! -- errors
! select 33 * any ('{1,2,3}');
! ERROR:  op ANY/ALL (array) requires operator to yield boolean
! LINE 1: select 33 * any ('{1,2,3}');
!                   ^
! select 33 * any (44);
! ERROR:  op ANY/ALL (array) requires array on right side
! LINE 1: select 33 * any (44);
!                   ^
! -- nulls
! select 33 = any (null::int[]);
!  ?column? 
! ----------
!  
! (1 row)
! 
! select null::int = any ('{1,2,3}');
!  ?column? 
! ----------
!  
! (1 row)
! 
! select 33 = any ('{1,null,3}');
!  ?column? 
! ----------
!  
! (1 row)
! 
! select 33 = any ('{1,null,33}');
!  ?column? 
! ----------
!  t
! (1 row)
! 
! select 33 = all (null::int[]);
!  ?column? 
! ----------
!  
! (1 row)
! 
! select null::int = all ('{1,2,3}');
!  ?column? 
! ----------
!  
! (1 row)
! 
! select 33 = all ('{1,null,3}');
!  ?column? 
! ----------
!  f
! (1 row)
! 
! select 33 = all ('{33,null,33}');
!  ?column? 
! ----------
!  
! (1 row)
! 
! -- test indexes on arrays
! create temp table arr_tbl (f1 int[] unique);
! insert into arr_tbl values ('{1,2,3}');
! insert into arr_tbl values ('{1,2}');
! -- failure expected:
! insert into arr_tbl values ('{1,2,3}');
! ERROR:  duplicate key value violates unique constraint "arr_tbl_f1_key"
! DETAIL:  Key (f1)=({1,2,3}) already exists.
! insert into arr_tbl values ('{2,3,4}');
! insert into arr_tbl values ('{1,5,3}');
! insert into arr_tbl values ('{1,2,10}');
! set enable_seqscan to off;
! set enable_bitmapscan to off;
! select * from arr_tbl where f1 > '{1,2,3}' and f1 <= '{1,5,3}';
!     f1    
! ----------
!  {1,2,10}
!  {1,5,3}
! (2 rows)
! 
! select * from arr_tbl where f1 >= '{1,2,3}' and f1 < '{1,5,3}';
!     f1    
! ----------
!  {1,2,3}
!  {1,2,10}
! (2 rows)
! 
! -- note: if above selects don't produce the expected tuple order,
! -- then you didn't get an indexscan plan, and something is busted.
! reset enable_seqscan;
! reset enable_bitmapscan;
! -- test [not] (like|ilike) (any|all) (...)
! select 'foo' like any (array['%a', '%o']); -- t
!  ?column? 
! ----------
!  t
! (1 row)
! 
! select 'foo' like any (array['%a', '%b']); -- f
!  ?column? 
! ----------
!  f
! (1 row)
! 
! select 'foo' like all (array['f%', '%o']); -- t
!  ?column? 
! ----------
!  t
! (1 row)
! 
! select 'foo' like all (array['f%', '%b']); -- f
!  ?column? 
! ----------
!  f
! (1 row)
! 
! select 'foo' not like any (array['%a', '%b']); -- t
!  ?column? 
! ----------
!  t
! (1 row)
! 
! select 'foo' not like all (array['%a', '%o']); -- f
!  ?column? 
! ----------
!  f
! (1 row)
! 
! select 'foo' ilike any (array['%A', '%O']); -- t
!  ?column? 
! ----------
!  t
! (1 row)
! 
! select 'foo' ilike all (array['F%', '%O']); -- t
!  ?column? 
! ----------
!  t
! (1 row)
! 
! --
! -- General array parser tests
! --
! -- none of the following should be accepted
! select '{{1,{2}},{2,3}}'::text[];
! ERROR:  malformed array literal: "{{1,{2}},{2,3}}"
! LINE 1: select '{{1,{2}},{2,3}}'::text[];
!                ^
! select '{{},{}}'::text[];
! ERROR:  malformed array literal: "{{},{}}"
! LINE 1: select '{{},{}}'::text[];
!                ^
! select E'{{1,2},\\{2,3}}'::text[];
! ERROR:  malformed array literal: "{{1,2},\{2,3}}"
! LINE 1: select E'{{1,2},\\{2,3}}'::text[];
!                ^
! select '{{"1 2" x},{3}}'::text[];
! ERROR:  malformed array literal: "{{"1 2" x},{3}}"
! LINE 1: select '{{"1 2" x},{3}}'::text[];
!                ^
! select '{}}'::text[];
! ERROR:  malformed array literal: "{}}"
! LINE 1: select '{}}'::text[];
!                ^
! select '{ }}'::text[];
! ERROR:  malformed array literal: "{ }}"
! LINE 1: select '{ }}'::text[];
!                ^
! select array[];
! ERROR:  cannot determine type of empty array
! LINE 1: select array[];
!                ^
! HINT:  Explicitly cast to the desired type, for example ARRAY[]::integer[].
! -- none of the above should be accepted
! -- all of the following should be accepted
! select '{}'::text[];
!  text 
! ------
!  {}
! (1 row)
! 
! select '{{{1,2,3,4},{2,3,4,5}},{{3,4,5,6},{4,5,6,7}}}'::text[];
!                      text                      
! -----------------------------------------------
!  {{{1,2,3,4},{2,3,4,5}},{{3,4,5,6},{4,5,6,7}}}
! (1 row)
! 
! select '{0 second  ,0 second}'::interval[];
!    interval    
! ---------------
!  {"@ 0","@ 0"}
! (1 row)
! 
! select '{ { "," } , { 3 } }'::text[];
!     text     
! -------------
!  {{","},{3}}
! (1 row)
! 
! select '  {   {  "  0 second  "   ,  0 second  }   }'::text[];
!              text              
! -------------------------------
!  {{"  0 second  ","0 second"}}
! (1 row)
! 
! select '{
!            0 second,
!            @ 1 hour @ 42 minutes @ 20 seconds
!          }'::interval[];
!               interval              
! ------------------------------------
!  {"@ 0","@ 1 hour 42 mins 20 secs"}
! (1 row)
! 
! select array[]::text[];
!  array 
! -------
!  {}
! (1 row)
! 
! select '[0:1]={1.1,2.2}'::float8[];
!      float8      
! -----------------
!  [0:1]={1.1,2.2}
! (1 row)
! 
! -- all of the above should be accepted
! -- tests for array aggregates
! CREATE TEMP TABLE arraggtest ( f1 INT[], f2 TEXT[][], f3 FLOAT[]);
! INSERT INTO arraggtest (f1, f2, f3) VALUES
! ('{1,2,3,4}','{{grey,red},{blue,blue}}','{1.6, 0.0}');
! INSERT INTO arraggtest (f1, f2, f3) VALUES
! ('{1,2,3}','{{grey,red},{grey,blue}}','{1.6}');
! SELECT max(f1), min(f1), max(f2), min(f2), max(f3), min(f3) FROM arraggtest;
!     max    |   min   |           max            |           min            |   max   |  min  
! -----------+---------+--------------------------+--------------------------+---------+-------
!  {1,2,3,4} | {1,2,3} | {{grey,red},{grey,blue}} | {{grey,red},{blue,blue}} | {1.6,0} | {1.6}
! (1 row)
! 
! INSERT INTO arraggtest (f1, f2, f3) VALUES
! ('{3,3,2,4,5,6}','{{white,yellow},{pink,orange}}','{2.1,3.3,1.8,1.7,1.6}');
! SELECT max(f1), min(f1), max(f2), min(f2), max(f3), min(f3) FROM arraggtest;
!       max      |   min   |              max               |           min            |          max          |  min  
! ---------------+---------+--------------------------------+--------------------------+-----------------------+-------
!  {3,3,2,4,5,6} | {1,2,3} | {{white,yellow},{pink,orange}} | {{grey,red},{blue,blue}} | {2.1,3.3,1.8,1.7,1.6} | {1.6}
! (1 row)
! 
! INSERT INTO arraggtest (f1, f2, f3) VALUES
! ('{2}','{{black,red},{green,orange}}','{1.6,2.2,2.6,0.4}');
! SELECT max(f1), min(f1), max(f2), min(f2), max(f3), min(f3) FROM arraggtest;
!       max      |   min   |              max               |             min              |          max          |  min  
! ---------------+---------+--------------------------------+------------------------------+-----------------------+-------
!  {3,3,2,4,5,6} | {1,2,3} | {{white,yellow},{pink,orange}} | {{black,red},{green,orange}} | {2.1,3.3,1.8,1.7,1.6} | {1.6}
! (1 row)
! 
! INSERT INTO arraggtest (f1, f2, f3) VALUES
! ('{4,2,6,7,8,1}','{{red},{black},{purple},{blue},{blue}}',NULL);
! SELECT max(f1), min(f1), max(f2), min(f2), max(f3), min(f3) FROM arraggtest;
!       max      |   min   |              max               |             min              |          max          |  min  
! ---------------+---------+--------------------------------+------------------------------+-----------------------+-------
!  {4,2,6,7,8,1} | {1,2,3} | {{white,yellow},{pink,orange}} | {{black,red},{green,orange}} | {2.1,3.3,1.8,1.7,1.6} | {1.6}
! (1 row)
! 
! INSERT INTO arraggtest (f1, f2, f3) VALUES
! ('{}','{{pink,white,blue,red,grey,orange}}','{2.1,1.87,1.4,2.2}');
! SELECT max(f1), min(f1), max(f2), min(f2), max(f3), min(f3) FROM arraggtest;
!       max      | min |              max               |             min              |          max          |  min  
! ---------------+-----+--------------------------------+------------------------------+-----------------------+-------
!  {4,2,6,7,8,1} | {}  | {{white,yellow},{pink,orange}} | {{black,red},{green,orange}} | {2.1,3.3,1.8,1.7,1.6} | {1.6}
! (1 row)
! 
! -- A few simple tests for arrays of composite types
! create type comptype as (f1 int, f2 text);
! create table comptable (c1 comptype, c2 comptype[]);
! -- XXX would like to not have to specify row() construct types here ...
! insert into comptable
!   values (row(1,'foo'), array[row(2,'bar')::comptype, row(3,'baz')::comptype]);
! -- check that implicitly named array type _comptype isn't a problem
! create type _comptype as enum('fooey');
! select * from comptable;
!    c1    |          c2           
! ---------+-----------------------
!  (1,foo) | {"(2,bar)","(3,baz)"}
! (1 row)
! 
! select c2[2].f2 from comptable;
!  f2  
! -----
!  baz
! (1 row)
! 
! drop type _comptype;
! drop table comptable;
! drop type comptype;
! create or replace function unnest1(anyarray)
! returns setof anyelement as $$
! select $1[s] from generate_subscripts($1,1) g(s);
! $$ language sql immutable;
! create or replace function unnest2(anyarray)
! returns setof anyelement as $$
! select $1[s1][s2] from generate_subscripts($1,1) g1(s1),
!                    generate_subscripts($1,2) g2(s2);
! $$ language sql immutable;
! select * from unnest1(array[1,2,3]);
!  unnest1 
! ---------
!        1
!        2
!        3
! (3 rows)
! 
! select * from unnest2(array[[1,2,3],[4,5,6]]);
!  unnest2 
! ---------
!        1
!        2
!        3
!        4
!        5
!        6
! (6 rows)
! 
! drop function unnest1(anyarray);
! drop function unnest2(anyarray);
! select array_fill(null::integer, array[3,3],array[2,2]);
!                            array_fill                            
! -----------------------------------------------------------------
!  [2:4][2:4]={{NULL,NULL,NULL},{NULL,NULL,NULL},{NULL,NULL,NULL}}
! (1 row)
! 
! select array_fill(null::integer, array[3,3]);
!                       array_fill                      
! ------------------------------------------------------
!  {{NULL,NULL,NULL},{NULL,NULL,NULL},{NULL,NULL,NULL}}
! (1 row)
! 
! select array_fill(null::text, array[3,3],array[2,2]);
!                            array_fill                            
! -----------------------------------------------------------------
!  [2:4][2:4]={{NULL,NULL,NULL},{NULL,NULL,NULL},{NULL,NULL,NULL}}
! (1 row)
! 
! select array_fill(null::text, array[3,3]);
!                       array_fill                      
! ------------------------------------------------------
!  {{NULL,NULL,NULL},{NULL,NULL,NULL},{NULL,NULL,NULL}}
! (1 row)
! 
! select array_fill(7, array[3,3],array[2,2]);
!               array_fill              
! --------------------------------------
!  [2:4][2:4]={{7,7,7},{7,7,7},{7,7,7}}
! (1 row)
! 
! select array_fill(7, array[3,3]);
!         array_fill         
! ---------------------------
!  {{7,7,7},{7,7,7},{7,7,7}}
! (1 row)
! 
! select array_fill('juhu'::text, array[3,3],array[2,2]);
!                            array_fill                            
! -----------------------------------------------------------------
!  [2:4][2:4]={{juhu,juhu,juhu},{juhu,juhu,juhu},{juhu,juhu,juhu}}
! (1 row)
! 
! select array_fill('juhu'::text, array[3,3]);
!                       array_fill                      
! ------------------------------------------------------
!  {{juhu,juhu,juhu},{juhu,juhu,juhu},{juhu,juhu,juhu}}
! (1 row)
! 
! -- raise exception
! select array_fill(1, null, array[2,2]);
! ERROR:  dimension array or low bound array cannot be null
! select array_fill(1, array[2,2], null);
! ERROR:  dimension array or low bound array cannot be null
! select array_fill(1, array[3,3], array[1,1,1]);
! ERROR:  wrong number of array subscripts
! DETAIL:  Low bound array has different size than dimensions array.
! select array_fill(1, array[1,2,null]);
! ERROR:  dimension values cannot be null
! select string_to_array('1|2|3', '|');
!  string_to_array 
! -----------------
!  {1,2,3}
! (1 row)
! 
! select string_to_array('1|2|3|', '|');
!  string_to_array 
! -----------------
!  {1,2,3,""}
! (1 row)
! 
! select string_to_array('1||2|3||', '||');
!  string_to_array 
! -----------------
!  {1,2|3,""}
! (1 row)
! 
! select string_to_array('1|2|3', '');
!  string_to_array 
! -----------------
!  {1|2|3}
! (1 row)
! 
! select string_to_array('', '|');
!  string_to_array 
! -----------------
!  {}
! (1 row)
! 
! select string_to_array('1|2|3', NULL);
!  string_to_array 
! -----------------
!  {1,|,2,|,3}
! (1 row)
! 
! select string_to_array(NULL, '|') IS NULL;
!  ?column? 
! ----------
!  t
! (1 row)
! 
! select string_to_array('abc', '');
!  string_to_array 
! -----------------
!  {abc}
! (1 row)
! 
! select string_to_array('abc', '', 'abc');
!  string_to_array 
! -----------------
!  {NULL}
! (1 row)
! 
! select string_to_array('abc', ',');
!  string_to_array 
! -----------------
!  {abc}
! (1 row)
! 
! select string_to_array('abc', ',', 'abc');
!  string_to_array 
! -----------------
!  {NULL}
! (1 row)
! 
! select string_to_array('1,2,3,4,,6', ',');
!  string_to_array 
! -----------------
!  {1,2,3,4,"",6}
! (1 row)
! 
! select string_to_array('1,2,3,4,,6', ',', '');
!  string_to_array  
! ------------------
!  {1,2,3,4,NULL,6}
! (1 row)
! 
! select string_to_array('1,2,3,4,*,6', ',', '*');
!  string_to_array  
! ------------------
!  {1,2,3,4,NULL,6}
! (1 row)
! 
! select array_to_string(NULL::int4[], ',') IS NULL;
!  ?column? 
! ----------
!  t
! (1 row)
! 
! select array_to_string('{}'::int4[], ',');
!  array_to_string 
! -----------------
!  
! (1 row)
! 
! select array_to_string(array[1,2,3,4,NULL,6], ',');
!  array_to_string 
! -----------------
!  1,2,3,4,6
! (1 row)
! 
! select array_to_string(array[1,2,3,4,NULL,6], ',', '*');
!  array_to_string 
! -----------------
!  1,2,3,4,*,6
! (1 row)
! 
! select array_to_string(array[1,2,3,4,NULL,6], NULL);
!  array_to_string 
! -----------------
!  
! (1 row)
! 
! select array_to_string(array[1,2,3,4,NULL,6], ',', NULL);
!  array_to_string 
! -----------------
!  1,2,3,4,6
! (1 row)
! 
! select array_to_string(string_to_array('1|2|3', '|'), '|');
!  array_to_string 
! -----------------
!  1|2|3
! (1 row)
! 
! select array_length(array[1,2,3], 1);
!  array_length 
! --------------
!             3
! (1 row)
! 
! select array_length(array[[1,2,3], [4,5,6]], 0);
!  array_length 
! --------------
!              
! (1 row)
! 
! select array_length(array[[1,2,3], [4,5,6]], 1);
!  array_length 
! --------------
!             2
! (1 row)
! 
! select array_length(array[[1,2,3], [4,5,6]], 2);
!  array_length 
! --------------
!             3
! (1 row)
! 
! select array_length(array[[1,2,3], [4,5,6]], 3);
!  array_length 
! --------------
!              
! (1 row)
! 
! select cardinality(NULL::int[]);
!  cardinality 
! -------------
!             
! (1 row)
! 
! select cardinality('{}'::int[]);
!  cardinality 
! -------------
!            0
! (1 row)
! 
! select cardinality(array[1,2,3]);
!  cardinality 
! -------------
!            3
! (1 row)
! 
! select cardinality('[2:4]={5,6,7}'::int[]);
!  cardinality 
! -------------
!            3
! (1 row)
! 
! select cardinality('{{1,2}}'::int[]);
!  cardinality 
! -------------
!            2
! (1 row)
! 
! select cardinality('{{1,2},{3,4},{5,6}}'::int[]);
!  cardinality 
! -------------
!            6
! (1 row)
! 
! select cardinality('{{{1,9},{5,6}},{{2,3},{3,4}}}'::int[]);
!  cardinality 
! -------------
!            8
! (1 row)
! 
! select array_agg(unique1) from (select unique1 from tenk1 where unique1 < 15 order by unique1) ss;
!               array_agg               
! --------------------------------------
!  {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}
! (1 row)
! 
! select array_agg(ten) from (select ten from tenk1 where unique1 < 15 order by unique1) ss;
!             array_agg            
! ---------------------------------
!  {0,1,2,3,4,5,6,7,8,9,0,1,2,3,4}
! (1 row)
! 
! select array_agg(nullif(ten, 4)) from (select ten from tenk1 where unique1 < 15 order by unique1) ss;
!                array_agg               
! ---------------------------------------
!  {0,1,2,3,NULL,5,6,7,8,9,0,1,2,3,NULL}
! (1 row)
! 
! select array_agg(unique1) from tenk1 where unique1 < -15;
!  array_agg 
! -----------
!  
! (1 row)
! 
! select unnest(array[1,2,3]);
!  unnest 
! --------
!       1
!       2
!       3
! (3 rows)
! 
! select * from unnest(array[1,2,3]);
!  unnest 
! --------
!       1
!       2
!       3
! (3 rows)
! 
! select unnest(array[1,2,3,4.5]::float8[]);
!  unnest 
! --------
!       1
!       2
!       3
!     4.5
! (4 rows)
! 
! select unnest(array[1,2,3,4.5]::numeric[]);
!  unnest 
! --------
!       1
!       2
!       3
!     4.5
! (4 rows)
! 
! select unnest(array[1,2,3,null,4,null,null,5,6]);
!  unnest 
! --------
!       1
!       2
!       3
!        
!       4
!        
!        
!       5
!       6
! (9 rows)
! 
! select unnest(array[1,2,3,null,4,null,null,5,6]::text[]);
!  unnest 
! --------
!  1
!  2
!  3
!  
!  4
!  
!  
!  5
!  6
! (9 rows)
! 
! select abs(unnest(array[1,2,null,-3]));
!  abs 
! -----
!    1
!    2
!     
!    3
! (4 rows)
! 
! select array_remove(array[1,2,2,3], 2);
!  array_remove 
! --------------
!  {1,3}
! (1 row)
! 
! select array_remove(array[1,2,2,3], 5);
!  array_remove 
! --------------
!  {1,2,2,3}
! (1 row)
! 
! select array_remove(array[1,NULL,NULL,3], NULL);
!  array_remove 
! --------------
!  {1,3}
! (1 row)
! 
! select array_remove(array['A','CC','D','C','RR'], 'RR');
!  array_remove 
! --------------
!  {A,CC,D,C}
! (1 row)
! 
! select array_remove('{{1,2,2},{1,4,3}}', 2); -- not allowed
! ERROR:  removing elements from multidimensional arrays is not supported
! select array_remove(array['X','X','X'], 'X') = '{}';
!  ?column? 
! ----------
!  t
! (1 row)
! 
! select array_replace(array[1,2,5,4],5,3);
!  array_replace 
! ---------------
!  {1,2,3,4}
! (1 row)
! 
! select array_replace(array[1,2,5,4],5,NULL);
!  array_replace 
! ---------------
!  {1,2,NULL,4}
! (1 row)
! 
! select array_replace(array[1,2,NULL,4,NULL],NULL,5);
!  array_replace 
! ---------------
!  {1,2,5,4,5}
! (1 row)
! 
! select array_replace(array['A','B','DD','B'],'B','CC');
!  array_replace 
! ---------------
!  {A,CC,DD,CC}
! (1 row)
! 
! select array_replace(array[1,NULL,3],NULL,NULL);
!  array_replace 
! ---------------
!  {1,NULL,3}
! (1 row)
! 
! select array_replace(array['AB',NULL,'CDE'],NULL,'12');
!  array_replace 
! ---------------
!  {AB,12,CDE}
! (1 row)
! 
! -- Insert/update on a column that is array of composite
! create temp table t1 (f1 int8_tbl[]);
! insert into t1 (f1[5].q1) values(42);
! select * from t1;
!        f1        
! -----------------
!  [5:5]={"(42,)"}
! (1 row)
! 
! update t1 set f1[5].q2 = 43;
! select * from t1;
!         f1         
! -------------------
!  [5:5]={"(42,43)"}
! (1 row)
! 
! -- Check that arrays of composites are safely detoasted when needed
! create temp table src (f1 text);
! insert into src
!   select string_agg(random()::text,'') from generate_series(1,10000);
! create type textandtext as (c1 text, c2 text);
! create temp table dest (f1 textandtext[]);
! insert into dest select array[row(f1,f1)::textandtext] from src;
! select length(md5((f1[1]).c2)) from dest;
!  length 
! --------
!      32
! (1 row)
! 
! delete from src;
! select length(md5((f1[1]).c2)) from dest;
!  length 
! --------
!      32
! (1 row)
! 
! truncate table src;
! drop table src;
! select length(md5((f1[1]).c2)) from dest;
!  length 
! --------
!      32
! (1 row)
! 
! drop table dest;
! drop type textandtext;
! -- Tests for polymorphic-array form of width_bucket()
! -- this exercises the varwidth and float8 code paths
! SELECT
!     op,
!     width_bucket(op::numeric, ARRAY[1, 3, 5, 10.0]::numeric[]) AS wb_n1,
!     width_bucket(op::numeric, ARRAY[0, 5.5, 9.99]::numeric[]) AS wb_n2,
!     width_bucket(op::numeric, ARRAY[-6, -5, 2.0]::numeric[]) AS wb_n3,
!     width_bucket(op::float8, ARRAY[1, 3, 5, 10.0]::float8[]) AS wb_f1,
!     width_bucket(op::float8, ARRAY[0, 5.5, 9.99]::float8[]) AS wb_f2,
!     width_bucket(op::float8, ARRAY[-6, -5, 2.0]::float8[]) AS wb_f3
! FROM (VALUES
!   (-5.2),
!   (-0.0000000001),
!   (0.000000000001),
!   (1),
!   (1.99999999999999),
!   (2),
!   (2.00000000000001),
!   (3),
!   (4),
!   (4.5),
!   (5),
!   (5.5),
!   (6),
!   (7),
!   (8),
!   (9),
!   (9.99999999999999),
!   (10),
!   (10.0000000000001)
! ) v(op);
!         op        | wb_n1 | wb_n2 | wb_n3 | wb_f1 | wb_f2 | wb_f3 
! ------------------+-------+-------+-------+-------+-------+-------
!              -5.2 |     0 |     0 |     1 |     0 |     0 |     1
!     -0.0000000001 |     0 |     0 |     2 |     0 |     0 |     2
!    0.000000000001 |     0 |     1 |     2 |     0 |     1 |     2
!                 1 |     1 |     1 |     2 |     1 |     1 |     2
!  1.99999999999999 |     1 |     1 |     2 |     1 |     1 |     2
!                 2 |     1 |     1 |     3 |     1 |     1 |     3
!  2.00000000000001 |     1 |     1 |     3 |     1 |     1 |     3
!                 3 |     2 |     1 |     3 |     2 |     1 |     3
!                 4 |     2 |     1 |     3 |     2 |     1 |     3
!               4.5 |     2 |     1 |     3 |     2 |     1 |     3
!                 5 |     3 |     1 |     3 |     3 |     1 |     3
!               5.5 |     3 |     2 |     3 |     3 |     2 |     3
!                 6 |     3 |     2 |     3 |     3 |     2 |     3
!                 7 |     3 |     2 |     3 |     3 |     2 |     3
!                 8 |     3 |     2 |     3 |     3 |     2 |     3
!                 9 |     3 |     2 |     3 |     3 |     2 |     3
!  9.99999999999999 |     3 |     3 |     3 |     3 |     3 |     3
!                10 |     4 |     3 |     3 |     4 |     3 |     3
!  10.0000000000001 |     4 |     3 |     3 |     4 |     3 |     3
! (19 rows)
! 
! -- ensure float8 path handles NaN properly
! SELECT
!     op,
!     width_bucket(op, ARRAY[1, 3, 9, 'NaN', 'NaN']::float8[]) AS wb
! FROM (VALUES
!   (-5.2::float8),
!   (4::float8),
!   (77::float8),
!   ('NaN'::float8)
! ) v(op);
!   op  | wb 
! ------+----
!  -5.2 |  0
!     4 |  2
!    77 |  3
!   NaN |  5
! (4 rows)
! 
! -- these exercise the generic fixed-width code path
! SELECT
!     op,
!     width_bucket(op, ARRAY[1, 3, 5, 10]) AS wb_1
! FROM generate_series(0,11) as op;
!  op | wb_1 
! ----+------
!   0 |    0
!   1 |    1
!   2 |    1
!   3 |    2
!   4 |    2
!   5 |    3
!   6 |    3
!   7 |    3
!   8 |    3
!   9 |    3
!  10 |    4
!  11 |    4
! (12 rows)
! 
! SELECT width_bucket(now(),
!                     array['yesterday', 'today', 'tomorrow']::timestamptz[]);
!  width_bucket 
! --------------
!             2
! (1 row)
! 
! -- corner cases
! SELECT width_bucket(5, ARRAY[3]);
!  width_bucket 
! --------------
!             1
! (1 row)
! 
! SELECT width_bucket(5, '{}');
!  width_bucket 
! --------------
!             0
! (1 row)
! 
! -- error cases
! SELECT width_bucket('5'::text, ARRAY[3, 4]::integer[]);
! ERROR:  function width_bucket(text, integer[]) does not exist
! LINE 1: SELECT width_bucket('5'::text, ARRAY[3, 4]::integer[]);
!                ^
! HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
! SELECT width_bucket(5, ARRAY[3, 4, NULL]);
! ERROR:  thresholds array must not contain NULLs
! SELECT width_bucket(5, ARRAY[ARRAY[1, 2], ARRAY[3, 4]]);
! ERROR:  thresholds must be one-dimensional array
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/btree_index.out	Sun Oct  3 21:26:00 2010
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/btree_index.out	Tue Oct 28 15:53:05 2014
***************
*** 1,129 ****
! --
! -- BTREE_INDEX
! -- test retrieval of min/max keys for each index
! --
! SELECT b.*
!    FROM bt_i4_heap b
!    WHERE b.seqno < 1;
!  seqno |   random   
! -------+------------
!      0 | 1935401906
! (1 row)
! 
! SELECT b.*
!    FROM bt_i4_heap b
!    WHERE b.seqno >= 9999;
!  seqno |   random   
! -------+------------
!   9999 | 1227676208
! (1 row)
! 
! SELECT b.*
!    FROM bt_i4_heap b
!    WHERE b.seqno = 4500;
!  seqno |   random   
! -------+------------
!   4500 | 2080851358
! (1 row)
! 
! SELECT b.*
!    FROM bt_name_heap b
!    WHERE b.seqno < '1'::name;
!  seqno |   random   
! -------+------------
!  0     | 1935401906
! (1 row)
! 
! SELECT b.*
!    FROM bt_name_heap b
!    WHERE b.seqno >= '9999'::name;
!  seqno |   random   
! -------+------------
!  9999  | 1227676208
! (1 row)
! 
! SELECT b.*
!    FROM bt_name_heap b
!    WHERE b.seqno = '4500'::name;
!  seqno |   random   
! -------+------------
!  4500  | 2080851358
! (1 row)
! 
! SELECT b.*
!    FROM bt_txt_heap b
!    WHERE b.seqno < '1'::text;
!  seqno |   random   
! -------+------------
!  0     | 1935401906
! (1 row)
! 
! SELECT b.*
!    FROM bt_txt_heap b
!    WHERE b.seqno >= '9999'::text;
!  seqno |   random   
! -------+------------
!  9999  | 1227676208
! (1 row)
! 
! SELECT b.*
!    FROM bt_txt_heap b
!    WHERE b.seqno = '4500'::text;
!  seqno |   random   
! -------+------------
!  4500  | 2080851358
! (1 row)
! 
! SELECT b.*
!    FROM bt_f8_heap b
!    WHERE b.seqno < '1'::float8;
!  seqno |   random   
! -------+------------
!      0 | 1935401906
! (1 row)
! 
! SELECT b.*
!    FROM bt_f8_heap b
!    WHERE b.seqno >= '9999'::float8;
!  seqno |   random   
! -------+------------
!   9999 | 1227676208
! (1 row)
! 
! SELECT b.*
!    FROM bt_f8_heap b
!    WHERE b.seqno = '4500'::float8;
!  seqno |   random   
! -------+------------
!   4500 | 2080851358
! (1 row)
! 
! --
! -- Check correct optimization of LIKE (special index operator support)
! -- for both indexscan and bitmapscan cases
! --
! set enable_seqscan to false;
! set enable_indexscan to true;
! set enable_bitmapscan to false;
! select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
!         proname         
! ------------------------
!  RI_FKey_cascade_del
!  RI_FKey_noaction_del
!  RI_FKey_restrict_del
!  RI_FKey_setdefault_del
!  RI_FKey_setnull_del
! (5 rows)
! 
! set enable_indexscan to false;
! set enable_bitmapscan to true;
! select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
!         proname         
! ------------------------
!  RI_FKey_cascade_del
!  RI_FKey_noaction_del
!  RI_FKey_restrict_del
!  RI_FKey_setdefault_del
!  RI_FKey_setnull_del
! (5 rows)
! 
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/hash_index.out	Sun Dec 12 20:21:38 2010
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/hash_index.out	Tue Oct 28 15:53:05 2014
***************
*** 1,198 ****
! --
! -- HASH_INDEX
! -- grep 843938989 hash.data
! --
! SELECT * FROM hash_i4_heap
!    WHERE hash_i4_heap.random = 843938989;
!  seqno |  random   
! -------+-----------
!     15 | 843938989
! (1 row)
! 
! --
! -- hash index
! -- grep 66766766 hash.data
! --
! SELECT * FROM hash_i4_heap
!    WHERE hash_i4_heap.random = 66766766;
!  seqno | random 
! -------+--------
! (0 rows)
! 
! --
! -- hash index
! -- grep 1505703298 hash.data
! --
! SELECT * FROM hash_name_heap
!    WHERE hash_name_heap.random = '1505703298'::name;
!  seqno |   random   
! -------+------------
!   9838 | 1505703298
! (1 row)
! 
! --
! -- hash index
! -- grep 7777777 hash.data
! --
! SELECT * FROM hash_name_heap
!    WHERE hash_name_heap.random = '7777777'::name;
!  seqno | random 
! -------+--------
! (0 rows)
! 
! --
! -- hash index
! -- grep 1351610853 hash.data
! --
! SELECT * FROM hash_txt_heap
!    WHERE hash_txt_heap.random = '1351610853'::text;
!  seqno |   random   
! -------+------------
!   5677 | 1351610853
! (1 row)
! 
! --
! -- hash index
! -- grep 111111112222222233333333 hash.data
! --
! SELECT * FROM hash_txt_heap
!    WHERE hash_txt_heap.random = '111111112222222233333333'::text;
!  seqno | random 
! -------+--------
! (0 rows)
! 
! --
! -- hash index
! -- grep 444705537 hash.data
! --
! SELECT * FROM hash_f8_heap
!    WHERE hash_f8_heap.random = '444705537'::float8;
!  seqno |  random   
! -------+-----------
!   7853 | 444705537
! (1 row)
! 
! --
! -- hash index
! -- grep 88888888 hash.data
! --
! SELECT * FROM hash_f8_heap
!    WHERE hash_f8_heap.random = '88888888'::float8;
!  seqno | random 
! -------+--------
! (0 rows)
! 
! --
! -- hash index
! -- grep '^90[^0-9]' hashovfl.data
! --
! -- SELECT count(*) AS i988 FROM hash_ovfl_heap
! --    WHERE x = 90;
! --
! -- hash index
! -- grep '^1000[^0-9]' hashovfl.data
! --
! -- SELECT count(*) AS i0 FROM hash_ovfl_heap
! --    WHERE x = 1000;
! --
! -- HASH
! --
! UPDATE hash_i4_heap
!    SET random = 1
!    WHERE hash_i4_heap.seqno = 1492;
! SELECT h.seqno AS i1492, h.random AS i1
!    FROM hash_i4_heap h
!    WHERE h.random = 1;
!  i1492 | i1 
! -------+----
!   1492 |  1
! (1 row)
! 
! UPDATE hash_i4_heap
!    SET seqno = 20000
!    WHERE hash_i4_heap.random = 1492795354;
! SELECT h.seqno AS i20000
!    FROM hash_i4_heap h
!    WHERE h.random = 1492795354;
!  i20000 
! --------
!   20000
! (1 row)
! 
! UPDATE hash_name_heap
!    SET random = '0123456789abcdef'::name
!    WHERE hash_name_heap.seqno = 6543;
! SELECT h.seqno AS i6543, h.random AS c0_to_f
!    FROM hash_name_heap h
!    WHERE h.random = '0123456789abcdef'::name;
!  i6543 |     c0_to_f      
! -------+------------------
!   6543 | 0123456789abcdef
! (1 row)
! 
! UPDATE hash_name_heap
!    SET seqno = 20000
!    WHERE hash_name_heap.random = '76652222'::name;
! --
! -- this is the row we just replaced; index scan should return zero rows
! --
! SELECT h.seqno AS emptyset
!    FROM hash_name_heap h
!    WHERE h.random = '76652222'::name;
!  emptyset 
! ----------
! (0 rows)
! 
! UPDATE hash_txt_heap
!    SET random = '0123456789abcdefghijklmnop'::text
!    WHERE hash_txt_heap.seqno = 4002;
! SELECT h.seqno AS i4002, h.random AS c0_to_p
!    FROM hash_txt_heap h
!    WHERE h.random = '0123456789abcdefghijklmnop'::text;
!  i4002 |          c0_to_p           
! -------+----------------------------
!   4002 | 0123456789abcdefghijklmnop
! (1 row)
! 
! UPDATE hash_txt_heap
!    SET seqno = 20000
!    WHERE hash_txt_heap.random = '959363399'::text;
! SELECT h.seqno AS t20000
!    FROM hash_txt_heap h
!    WHERE h.random = '959363399'::text;
!  t20000 
! --------
!   20000
! (1 row)
! 
! UPDATE hash_f8_heap
!    SET random = '-1234.1234'::float8
!    WHERE hash_f8_heap.seqno = 8906;
! SELECT h.seqno AS i8096, h.random AS f1234_1234
!    FROM hash_f8_heap h
!    WHERE h.random = '-1234.1234'::float8;
!  i8096 | f1234_1234 
! -------+------------
!   8906 | -1234.1234
! (1 row)
! 
! UPDATE hash_f8_heap
!    SET seqno = 20000
!    WHERE hash_f8_heap.random = '488912369'::float8;
! SELECT h.seqno AS f20000
!    FROM hash_f8_heap h
!    WHERE h.random = '488912369'::float8;
!  f20000 
! --------
!   20000
! (1 row)
! 
! -- UPDATE hash_ovfl_heap
! --    SET x = 1000
! --   WHERE x = 90;
! -- this vacuums the index as well
! -- VACUUM hash_ovfl_heap;
! -- SELECT count(*) AS i0 FROM hash_ovfl_heap
! --   WHERE x = 90;
! -- SELECT count(*) AS i988 FROM hash_ovfl_heap
! --  WHERE x = 1000;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/update.out	Thu Oct 16 14:31:37 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/update.out	Tue Oct 28 15:53:05 2014
***************
*** 1,150 ****
! --
! -- UPDATE syntax tests
! --
! CREATE TABLE update_test (
!     a   INT DEFAULT 10,
!     b   INT,
!     c   TEXT
! );
! INSERT INTO update_test VALUES (5, 10, 'foo');
! INSERT INTO update_test(b, a) VALUES (15, 10);
! SELECT * FROM update_test;
!  a  | b  |  c  
! ----+----+-----
!   5 | 10 | foo
!  10 | 15 | 
! (2 rows)
! 
! UPDATE update_test SET a = DEFAULT, b = DEFAULT;
! SELECT * FROM update_test;
!  a  | b |  c  
! ----+---+-----
!  10 |   | foo
!  10 |   | 
! (2 rows)
! 
! -- aliases for the UPDATE target table
! UPDATE update_test AS t SET b = 10 WHERE t.a = 10;
! SELECT * FROM update_test;
!  a  | b  |  c  
! ----+----+-----
!  10 | 10 | foo
!  10 | 10 | 
! (2 rows)
! 
! UPDATE update_test t SET b = t.b + 10 WHERE t.a = 10;
! SELECT * FROM update_test;
!  a  | b  |  c  
! ----+----+-----
!  10 | 20 | foo
!  10 | 20 | 
! (2 rows)
! 
! --
! -- Test VALUES in FROM
! --
! UPDATE update_test SET a=v.i FROM (VALUES(100, 20)) AS v(i, j)
!   WHERE update_test.b = v.j;
! SELECT * FROM update_test;
!   a  | b  |  c  
! -----+----+-----
!  100 | 20 | foo
!  100 | 20 | 
! (2 rows)
! 
! --
! -- Test multiple-set-clause syntax
! --
! INSERT INTO update_test SELECT a,b+1,c FROM update_test;
! SELECT * FROM update_test;
!   a  | b  |  c  
! -----+----+-----
!  100 | 20 | foo
!  100 | 20 | 
!  100 | 21 | foo
!  100 | 21 | 
! (4 rows)
! 
! UPDATE update_test SET (c,b,a) = ('bugle', b+11, DEFAULT) WHERE c = 'foo';
! SELECT * FROM update_test;
!   a  | b  |   c   
! -----+----+-------
!  100 | 20 | 
!  100 | 21 | 
!   10 | 31 | bugle
!   10 | 32 | bugle
! (4 rows)
! 
! UPDATE update_test SET (c,b) = ('car', a+b), a = a + 1 WHERE a = 10;
! SELECT * FROM update_test;
!   a  | b  |  c  
! -----+----+-----
!  100 | 20 | 
!  100 | 21 | 
!   11 | 41 | car
!   11 | 42 | car
! (4 rows)
! 
! -- fail, multi assignment to same column:
! UPDATE update_test SET (c,b) = ('car', a+b), b = a + 1 WHERE a = 10;
! ERROR:  multiple assignments to same column "b"
! -- uncorrelated sub-select:
! UPDATE update_test
!   SET (b,a) = (select a,b from update_test where b = 41 and c = 'car')
!   WHERE a = 100 AND b = 20;
! SELECT * FROM update_test;
!   a  | b  |  c  
! -----+----+-----
!  100 | 21 | 
!   11 | 41 | car
!   11 | 42 | car
!   41 | 11 | 
! (4 rows)
! 
! -- correlated sub-select:
! UPDATE update_test o
!   SET (b,a) = (select a+1,b from update_test i
!                where i.a=o.a and i.b=o.b and i.c is not distinct from o.c);
! SELECT * FROM update_test;
!  a  |  b  |  c  
! ----+-----+-----
!  21 | 101 | 
!  41 |  12 | car
!  42 |  12 | car
!  11 |  42 | 
! (4 rows)
! 
! -- fail, multiple rows supplied:
! UPDATE update_test SET (b,a) = (select a+1,b from update_test);
! ERROR:  more than one row returned by a subquery used as an expression
! -- set to null if no rows supplied:
! UPDATE update_test SET (b,a) = (select a+1,b from update_test where a = 1000)
!   WHERE a = 11;
! SELECT * FROM update_test;
!  a  |  b  |  c  
! ----+-----+-----
!  21 | 101 | 
!  41 |  12 | car
!  42 |  12 | car
!     |     | 
! (4 rows)
! 
! -- if an alias for the target table is specified, don't allow references
! -- to the original table name
! UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a = 10;
! ERROR:  invalid reference to FROM-clause entry for table "update_test"
! LINE 1: UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a...
!                                         ^
! HINT:  Perhaps you meant to reference the table alias "t".
! -- Make sure that we can update to a TOASTed value.
! UPDATE update_test SET c = repeat('x', 10000) WHERE c = 'car';
! SELECT a, b, char_length(c) FROM update_test;
!  a  |  b  | char_length 
! ----+-----+-------------
!  21 | 101 |            
!     |     |            
!  41 |  12 |       10000
!  42 |  12 |       10000
! (4 rows)
! 
! DROP TABLE update_test;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/namespace.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/namespace.out	Tue Oct 28 15:53:05 2014
***************
*** 1,71 ****
! --
! -- Regression tests for schemas (namespaces)
! --
! CREATE SCHEMA test_schema_1
!        CREATE UNIQUE INDEX abc_a_idx ON abc (a)
!        CREATE VIEW abc_view AS
!               SELECT a+1 AS a, b+1 AS b FROM abc
!        CREATE TABLE abc (
!               a serial,
!               b int UNIQUE
!        );
! -- verify that the objects were created
! SELECT COUNT(*) FROM pg_class WHERE relnamespace =
!     (SELECT oid FROM pg_namespace WHERE nspname = 'test_schema_1');
!  count 
! -------
!      5
! (1 row)
! 
! INSERT INTO test_schema_1.abc DEFAULT VALUES;
! INSERT INTO test_schema_1.abc DEFAULT VALUES;
! INSERT INTO test_schema_1.abc DEFAULT VALUES;
! SELECT * FROM test_schema_1.abc;
!  a | b 
! ---+---
!  1 |  
!  2 |  
!  3 |  
! (3 rows)
! 
! SELECT * FROM test_schema_1.abc_view;
!  a | b 
! ---+---
!  2 |  
!  3 |  
!  4 |  
! (3 rows)
! 
! ALTER SCHEMA test_schema_1 RENAME TO test_schema_renamed;
! SELECT COUNT(*) FROM pg_class WHERE relnamespace =
!     (SELECT oid FROM pg_namespace WHERE nspname = 'test_schema_1');
!  count 
! -------
!      0
! (1 row)
! 
! -- test IF NOT EXISTS cases
! CREATE SCHEMA test_schema_renamed; -- fail, already exists
! ERROR:  schema "test_schema_renamed" already exists
! CREATE SCHEMA IF NOT EXISTS test_schema_renamed; -- ok with notice
! NOTICE:  schema "test_schema_renamed" already exists, skipping
! CREATE SCHEMA IF NOT EXISTS test_schema_renamed -- fail, disallowed
!        CREATE TABLE abc (
!               a serial,
!               b int UNIQUE
!        );
! ERROR:  CREATE SCHEMA IF NOT EXISTS cannot include schema elements
! LINE 2:        CREATE TABLE abc (
!                ^
! DROP SCHEMA test_schema_renamed CASCADE;
! NOTICE:  drop cascades to 2 other objects
! DETAIL:  drop cascades to table test_schema_renamed.abc
! drop cascades to view test_schema_renamed.abc_view
! -- verify that the objects were dropped
! SELECT COUNT(*) FROM pg_class WHERE relnamespace =
!     (SELECT oid FROM pg_namespace WHERE nspname = 'test_schema_renamed');
!  count 
! -------
!      0
! (1 row)
! 
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/prepared_xacts.out	Tue Sep 27 16:30:52 2011
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/prepared_xacts.out	Tue Oct 28 15:53:05 2014
***************
*** 1,254 ****
! --
! -- PREPARED TRANSACTIONS (two-phase commit)
! --
! -- We can't readily test persistence of prepared xacts within the
! -- regression script framework, unfortunately.  Note that a crash
! -- isn't really needed ... stopping and starting the postmaster would
! -- be enough, but we can't even do that here.
! -- create a simple table that we'll use in the tests
! CREATE TABLE pxtest1 (foobar VARCHAR(10));
! INSERT INTO pxtest1 VALUES ('aaa');
! -- Test PREPARE TRANSACTION
! BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
! UPDATE pxtest1 SET foobar = 'bbb' WHERE foobar = 'aaa';
! SELECT * FROM pxtest1;
!  foobar 
! --------
!  bbb
! (1 row)
! 
! PREPARE TRANSACTION 'foo1';
! SELECT * FROM pxtest1;
!  foobar 
! --------
!  aaa
! (1 row)
! 
! -- Test pg_prepared_xacts system view
! SELECT gid FROM pg_prepared_xacts;
!  gid  
! ------
!  foo1
! (1 row)
! 
! -- Test ROLLBACK PREPARED
! ROLLBACK PREPARED 'foo1';
! SELECT * FROM pxtest1;
!  foobar 
! --------
!  aaa
! (1 row)
! 
! SELECT gid FROM pg_prepared_xacts;
!  gid 
! -----
! (0 rows)
! 
! -- Test COMMIT PREPARED
! BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
! INSERT INTO pxtest1 VALUES ('ddd');
! SELECT * FROM pxtest1;
!  foobar 
! --------
!  aaa
!  ddd
! (2 rows)
! 
! PREPARE TRANSACTION 'foo2';
! SELECT * FROM pxtest1;
!  foobar 
! --------
!  aaa
! (1 row)
! 
! COMMIT PREPARED 'foo2';
! SELECT * FROM pxtest1;
!  foobar 
! --------
!  aaa
!  ddd
! (2 rows)
! 
! -- Test duplicate gids
! BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
! UPDATE pxtest1 SET foobar = 'eee' WHERE foobar = 'ddd';
! SELECT * FROM pxtest1;
!  foobar 
! --------
!  aaa
!  eee
! (2 rows)
! 
! PREPARE TRANSACTION 'foo3';
! SELECT gid FROM pg_prepared_xacts;
!  gid  
! ------
!  foo3
! (1 row)
! 
! BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
! INSERT INTO pxtest1 VALUES ('fff');
! -- This should fail, because the gid foo3 is already in use
! PREPARE TRANSACTION 'foo3';
! ERROR:  transaction identifier "foo3" is already in use
! SELECT * FROM pxtest1;
!  foobar 
! --------
!  aaa
!  ddd
! (2 rows)
! 
! ROLLBACK PREPARED 'foo3';
! SELECT * FROM pxtest1;
!  foobar 
! --------
!  aaa
!  ddd
! (2 rows)
! 
! -- Test serialization failure (SSI)
! BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
! UPDATE pxtest1 SET foobar = 'eee' WHERE foobar = 'ddd';
! SELECT * FROM pxtest1;
!  foobar 
! --------
!  aaa
!  eee
! (2 rows)
! 
! PREPARE TRANSACTION 'foo4';
! SELECT gid FROM pg_prepared_xacts;
!  gid  
! ------
!  foo4
! (1 row)
! 
! BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
! SELECT * FROM pxtest1;
!  foobar 
! --------
!  aaa
!  ddd
! (2 rows)
! 
! -- This should fail, because the two transactions have a write-skew anomaly
! INSERT INTO pxtest1 VALUES ('fff');
! ERROR:  could not serialize access due to read/write dependencies among transactions
! DETAIL:  Reason code: Canceled on identification as a pivot, during write.
! HINT:  The transaction might succeed if retried.
! PREPARE TRANSACTION 'foo5';
! SELECT gid FROM pg_prepared_xacts;
!  gid  
! ------
!  foo4
! (1 row)
! 
! ROLLBACK PREPARED 'foo4';
! SELECT gid FROM pg_prepared_xacts;
!  gid 
! -----
! (0 rows)
! 
! -- Clean up
! DROP TABLE pxtest1;
! -- Test subtransactions
! BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
!   CREATE TABLE pxtest2 (a int);
!   INSERT INTO pxtest2 VALUES (1);
!   SAVEPOINT a;
!     INSERT INTO pxtest2 VALUES (2);
!   ROLLBACK TO a;
!   SAVEPOINT b;
!   INSERT INTO pxtest2 VALUES (3);
! PREPARE TRANSACTION 'regress-one';
! CREATE TABLE pxtest3(fff int);
! -- Test shared invalidation
! BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
!   DROP TABLE pxtest3;
!   CREATE TABLE pxtest4 (a int);
!   INSERT INTO pxtest4 VALUES (1);
!   INSERT INTO pxtest4 VALUES (2);
!   DECLARE foo CURSOR FOR SELECT * FROM pxtest4;
!   -- Fetch 1 tuple, keeping the cursor open
!   FETCH 1 FROM foo;
!  a 
! ---
!  1
! (1 row)
! 
! PREPARE TRANSACTION 'regress-two';
! -- No such cursor
! FETCH 1 FROM foo;
! ERROR:  cursor "foo" does not exist
! -- Table doesn't exist, the creation hasn't been committed yet
! SELECT * FROM pxtest2;
! ERROR:  relation "pxtest2" does not exist
! LINE 1: SELECT * FROM pxtest2;
!                       ^
! -- There should be two prepared transactions
! SELECT gid FROM pg_prepared_xacts;
!      gid     
! -------------
!  regress-one
!  regress-two
! (2 rows)
! 
! -- pxtest3 should be locked because of the pending DROP
! set statement_timeout to 2000;
! SELECT * FROM pxtest3;
! ERROR:  canceling statement due to statement timeout
! reset statement_timeout;
! -- Disconnect, we will continue testing in a different backend
! \c -
! -- There should still be two prepared transactions
! SELECT gid FROM pg_prepared_xacts;
!      gid     
! -------------
!  regress-one
!  regress-two
! (2 rows)
! 
! -- pxtest3 should still be locked because of the pending DROP
! set statement_timeout to 2000;
! SELECT * FROM pxtest3;
! ERROR:  canceling statement due to statement timeout
! reset statement_timeout;
! -- Commit table creation
! COMMIT PREPARED 'regress-one';
! \d pxtest2
!     Table "public.pxtest2"
!  Column |  Type   | Modifiers 
! --------+---------+-----------
!  a      | integer | 
! 
! SELECT * FROM pxtest2;
!  a 
! ---
!  1
!  3
! (2 rows)
! 
! -- There should be one prepared transaction
! SELECT gid FROM pg_prepared_xacts;
!      gid     
! -------------
!  regress-two
! (1 row)
! 
! -- Commit table drop
! COMMIT PREPARED 'regress-two';
! SELECT * FROM pxtest3;
! ERROR:  relation "pxtest3" does not exist
! LINE 1: SELECT * FROM pxtest3;
!                       ^
! -- There should be no prepared transactions
! SELECT gid FROM pg_prepared_xacts;
!  gid 
! -----
! (0 rows)
! 
! -- Clean up
! DROP TABLE pxtest2;
! DROP TABLE pxtest3;  -- will still be there if prepared xacts are disabled
! ERROR:  table "pxtest3" does not exist
! DROP TABLE pxtest4;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/delete.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/delete.out	Tue Oct 28 15:53:05 2014
***************
*** 1,33 ****
! CREATE TABLE delete_test (
!     id SERIAL PRIMARY KEY,
!     a INT,
!     b text
! );
! INSERT INTO delete_test (a) VALUES (10);
! INSERT INTO delete_test (a, b) VALUES (50, repeat('x', 10000));
! INSERT INTO delete_test (a) VALUES (100);
! -- allow an alias to be specified for DELETE's target table
! DELETE FROM delete_test AS dt WHERE dt.a > 75;
! -- if an alias is specified, don't allow the original table name
! -- to be referenced
! DELETE FROM delete_test dt WHERE delete_test.a > 25;
! ERROR:  invalid reference to FROM-clause entry for table "delete_test"
! LINE 1: DELETE FROM delete_test dt WHERE delete_test.a > 25;
!                                          ^
! HINT:  Perhaps you meant to reference the table alias "dt".
! SELECT id, a, char_length(b) FROM delete_test;
!  id | a  | char_length 
! ----+----+-------------
!   1 | 10 |            
!   2 | 50 |       10000
! (2 rows)
! 
! -- delete a row with a TOASTed value
! DELETE FROM delete_test WHERE a > 25;
! SELECT id, a, char_length(b) FROM delete_test;
!  id | a  | char_length 
! ----+----+-------------
!   1 | 10 |            
! (1 row)
! 
! DROP TABLE delete_test;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/misc.out	Tue Oct 28 15:52:48 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/misc.out	Tue Oct 28 15:53:07 2014
***************
*** 27,38 ****
--- 27,45 ----
     FROM onek
     WHERE onek.stringu1 = 'JBAAAA' and
  	  onek.stringu1 = tmp.stringu1;
+ ERROR:  relation "tmp" does not exist
+ LINE 1: UPDATE tmp
+                ^
  UPDATE tmp
     SET stringu1 = reverse_name(onek2.stringu1)
     FROM onek2
     WHERE onek2.stringu1 = 'JCAAAA' and
  	  onek2.stringu1 = tmp.stringu1;
+ ERROR:  relation "tmp" does not exist
+ LINE 1: UPDATE tmp
+                ^
  DROP TABLE tmp;
+ ERROR:  table "tmp" does not exist
  --UPDATE person*
  --   SET age = age + 1;
  --UPDATE person*
***************
*** 576,592 ****
  
  SELECT user_relns() AS user_relns
     ORDER BY user_relns;
!      user_relns      
! ---------------------
!  a
   a_star
   abstime_tbl
   aggtest
-  aggtype
   array_index_op_test
   array_op_test
-  arrtest
-  b
   b_star
   bb
   box_tbl
--- 583,595 ----
  
  SELECT user_relns() AS user_relns
     ORDER BY user_relns;
!        user_relns       
! ------------------------
   a_star
   abstime_tbl
   aggtest
   array_index_op_test
   array_op_test
   b_star
   bb
   box_tbl
***************
*** 595,615 ****
   bt_i4_heap
   bt_name_heap
   bt_txt_heap
-  c
   c_star
   char_tbl
-  check2_tbl
-  check_seq
-  check_tbl
   circle_tbl
   city
!  copy_tbl
!  d
   d_star
   date_tbl
-  default_seq
-  default_tbl
-  defaultexpr_tbl
   dept
   dupindexcols
   e_star
--- 598,612 ----
   bt_i4_heap
   bt_name_heap
   bt_txt_heap
   c_star
   char_tbl
   circle_tbl
   city
!  concur_reindex_matview
!  concur_reindex_tab
!  concur_reindex_tab2
   d_star
   date_tbl
   dept
   dupindexcols
   e_star
***************
*** 628,637 ****
   iexit
   ihighway
   inet_tbl
-  inhf
-  inhx
-  insert_seq
-  insert_tbl
   int2_tbl
   int4_tbl
   int8_tbl
--- 625,630 ----
***************
*** 639,647 ****
   iportaltest
   kd_point_tbl
   line_tbl
-  log_table
   lseg_tbl
-  main_table
   money_data
   num_data
   num_exp_add
--- 632,638 ----
***************
*** 663,669 ****
   quad_point_tbl
   radix_text_tbl
   ramp
-  random_tbl
   real_city
   reltime_tbl
   road
--- 654,659 ----
***************
*** 672,678 ****
   street
   stud_emp
   student
-  subselect_tbl
   t
   tenk1
   tenk2
--- 662,667 ----
***************
*** 697,704 ****
   tvvm
   tvvmv
   varchar_tbl
!  xacttest
! (120 rows)
  
  SELECT name(equipment(hobby_construct(text 'skywalking', text 'mer')));
   name 
--- 686,692 ----
   tvvm
   tvvmv
   varchar_tbl
! (101 rows)
  
  SELECT name(equipment(hobby_construct(text 'skywalking', text 'mer')));
   name 

======================================================================

*** /Users/decibel/pgsql/HEAD/src/test/regress/expected/with.out	Mon May  5 19:06:09 2014
--- /Users/decibel/pgsql/HEAD/src/test/regress/results/with.out	Tue Oct 28 15:53:09 2014
***************
*** 2083,2126 ****
  EXPLAIN (VERBOSE, COSTS OFF)
  WITH wcte AS ( INSERT INTO int8_tbl VALUES ( 42, 47 ) RETURNING q2 )
  DELETE FROM a USING wcte WHERE aa = q2;
!                    QUERY PLAN                   
! ------------------------------------------------
!  Delete on public.a
!    CTE wcte
!      ->  Insert on public.int8_tbl
!            Output: int8_tbl.q2
!            ->  Result
!                  Output: 42::bigint, 47::bigint
!    ->  Nested Loop
!          Output: a.ctid, wcte.*
!          Join Filter: (a.aa = wcte.q2)
!          ->  Seq Scan on public.a
!                Output: a.ctid, a.aa
!          ->  CTE Scan on wcte
!                Output: wcte.*, wcte.q2
!    ->  Nested Loop
!          Output: b.ctid, wcte.*
!          Join Filter: (b.aa = wcte.q2)
!          ->  Seq Scan on public.b
!                Output: b.ctid, b.aa
!          ->  CTE Scan on wcte
!                Output: wcte.*, wcte.q2
!    ->  Nested Loop
!          Output: c.ctid, wcte.*
!          Join Filter: (c.aa = wcte.q2)
!          ->  Seq Scan on public.c
!                Output: c.ctid, c.aa
!          ->  CTE Scan on wcte
!                Output: wcte.*, wcte.q2
!    ->  Nested Loop
!          Output: d.ctid, wcte.*
!          Join Filter: (d.aa = wcte.q2)
!          ->  Seq Scan on public.d
!                Output: d.ctid, d.aa
!          ->  CTE Scan on wcte
!                Output: wcte.*, wcte.q2
! (34 rows)
! 
  -- error cases
  -- data-modifying WITH tries to use its own output
  WITH RECURSIVE t AS (
--- 2083,2091 ----
  EXPLAIN (VERBOSE, COSTS OFF)
  WITH wcte AS ( INSERT INTO int8_tbl VALUES ( 42, 47 ) RETURNING q2 )
  DELETE FROM a USING wcte WHERE aa = q2;
! ERROR:  relation "a" does not exist
! LINE 3: DELETE FROM a USING wcte WHERE aa = q2;
!                     ^
  -- error cases
  -- data-modifying WITH tries to use its own output
  WITH RECURSIVE t AS (

======================================================================

regression.out (text/plain)
#14Michael Paquier
michael.paquier@gmail.com
In reply to: Jim Nasby (#13)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

Thanks for your input, Jim!

On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

Patch applies against current HEAD and builds, but I'm getting 37 failed
tests (mostly parallel, but also misc and WITH; results attached). Is that
expected?

This is caused by the recent commit 7b1c2a0 (that I actually participated
in reviewing :p) because of a missing inclusion of ruleutils.h in index.c.

The "mark the concurrent index" bit is rather confusing; it sounds like

it's

referring to the new index instead of the old. Now that I've read the

code I

understand what's going on here between the concurrent index *entry* and

the

filenode swap, but I don't think the docs make this sufficiently clear to
users.

How about something like this:

The following steps occur in a concurrent index build, each in a separate
transaction. Note that if there are multiple indexes to be rebuilt then each
step loops through all the indexes we're rebuilding, using a separate
transaction for each one.

1. [blah]

Definitely a good idea! I took your text and made it more precise, listing
the actions done for each step, the pg_index flags switched, using
<orderedlist> to make the list of steps described in a separate paragraph
more exhaustive for the user. At the same time I reworked the docs removing
a part that was somewhat duplicated about dealing with the constraints
having invalid index entries and how to drop them.

+ * index_concurrent_create
+ *
+ * Create an index based on the given one that will be used for concurrent
+ * operations. The index is inserted into catalogs and needs to be built later
+ * on. This is called during concurrent index processing. The heap relation
+ * on which is based the index needs to be closed by the caller.

Last bit presumably should be "on which the index is based".

What about "Create a concurrent index based on the definition of the one
provided by caller"?

+ /* Build the list of column names, necessary for index_create */
Instead of all this work wouldn't it be easier to create a version of
index_create/ConstructTupleDescriptor that will use the IndexInfo for the
old index? ISTM index_concurrent_create() is doing a heck of a lot of work
to marshal data into one form just to have it get marshaled yet again. Worst
case, if we do have to play this game, there should be a stand-alone
function to get the columns/expressions for an existing index; you're
duplicating a lot of code from pg_get_indexdef_worker().

Yes, this definitely sucks, and the approach of creating a function to get all
the column names is not productive either. Then let's define an additional
argument in index_create to pass a potential TupleDesc instead of this
whole wart. I noticed as well that we need to actually reset attcacheoff to
be able to use a fresh version of the tuple descriptor of the old index. I
added a small API for this purpose in tupdesc.h called ResetTupleDescCache.
Would it make sense instead to extend CreateTupleDescCopyConstr or
CreateTupleDescCopy with a boolean flag?
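
The point of resetting the cache can be sketched in a toy model (not
PostgreSQL code; the field name mirrors `attcacheoff`, everything else is
invented for illustration): a copied descriptor must not carry over offsets
that were cached for the old index's tuple layout, so the copy routine gets
a flag much like the proposed ResetTupleDescCache / CreateTupleDescCopy
variants.

```python
from dataclasses import dataclass, replace
from typing import List

@dataclass
class Attr:
    name: str
    length: int
    attcacheoff: int = -1   # -1 means "offset not cached yet"

def copy_tupdesc(attrs: List[Attr], reset_cache: bool = True) -> List[Attr]:
    """Copy a descriptor; optionally clear per-attribute cached offsets,
    analogous to the boolean-flag variant discussed above."""
    copies = [replace(a) for a in attrs]
    if reset_cache:
        for a in copies:
            a.attcacheoff = -1
    return copies

# Descriptor whose offsets were cached during earlier use of the old index.
desc = [Attr("id", 4, attcacheoff=0), Attr("val", 8, attcacheoff=4)]

fresh = copy_tupdesc(desc)                   # offsets reset for the new index
stale = copy_tupdesc(desc, reset_cache=False)

print([a.attcacheoff for a in fresh])   # [-1, -1]
print([a.attcacheoff for a in stale])   # [0, 4]
```

The stale copy would make a reader trust offsets computed for the old
storage, which is exactly what resetting avoids.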

index_concurrent_swap(): Perhaps it'd be better to create
index_concurrent_swap_setup() and index_concurrent_swap_cleanup() and
refactor the duplicated code out... the actual function would then become:

This sentence is not finished :) IMO, index_concurrent_swap looks good as
is, taking as arguments the index and its concurrent entry, and swapping
their relfilenode after taking AccessExclusiveLock, which will be held until
the end of this transaction.
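
For readers following along, the phase ordering around that swap can be
sketched as a deliberately simplified state machine (this is not PostgreSQL
source; all field and function names here are invented, and each phase really
runs in its own transaction with the appropriate locks):

```python
def reindex_concurrently(old_index):
    """Toy model of the REINDEX CONCURRENTLY phases discussed in this thread."""
    # Phase 1: temporary catalog entry, invisible to queries, not yet ready
    tmp = {"relfilenode": "data_new", "isready": False}
    # Phase 2: first build pass, after which the entry accepts new inserts
    tmp["isready"] = True
    # Phase 3: validation pass picks up tuples inserted during phase 2
    # Phase 4: swap the on-disk storage under AccessExclusiveLock; the
    # long-lived entry now points at the freshly built data
    old_index["relfilenode"], tmp["relfilenode"] = \
        tmp["relfilenode"], old_index["relfilenode"]
    # Phase 5: stop inserts into the temporary entry
    tmp["isready"] = False
    # Phase 6: drop the temporary entry (and with it the old index data)
    return old_index

idx = {"relfilenode": "data_old", "isready": True}
print(reindex_concurrently(idx))
# {'relfilenode': 'data_new', 'isready': True}
```

The key property the swap preserves is that the surviving catalog entry (with
its OID and dependencies) ends up pointing at the new data, while the
throwaway entry carries the old data to its grave.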

ReindexRelationConcurrently()

+ * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each step of REINDEX
+ * CONCURRENTLY is done in parallel with all the table's indexes as well as
+ * its dependent toast indexes.
This comment is a bit misleading; we're not actually doing anything in
parallel, right? AFAICT index_concurrent_build is going to block while each
index is built the first time.

Yes, parallel may be misleading. What is meant here is that each step of
the process is done one by one on all the valid indexes a table may have.

+ * relkind is an index, this index itself will be rebuilt. The locks taken
+        * parent relations and involved indexes are kept until this transaction
+        * is committed to protect against schema changes that might occur until
+        * the session lock is taken on each relation.

This comment is a bit unclear to me... at minimum I think it should be "* on
parent relations" instead of "* parent relations", but I think it needs to
elaborate on why/when we're also taking session level locks.

Hum, done as follows:
@@ -896,9 +896,11 @@ ReindexRelationConcurrently(Oid relationOid)
         * If the relkind of given relation Oid is a table, all its valid indexes
         * will be rebuilt, including its associated toast table indexes. If
         * relkind is an index, this index itself will be rebuilt. The locks taken
-        * parent relations and involved indexes are kept until this transaction
+        * on parent relations and involved indexes are kept until this transaction
         * is committed to protect against schema changes that might occur until
-        * the session lock is taken on each relation.
+        * the session lock is taken on each relation, session lock used to
+        * similarly protect from any schema change that could happen within the
+        * multiple transactions that are used during this process.
         */

I also wordsmithed this comment a bit...
* Here begins the process for concurrently rebuilding the index
and this one...
* During this phase the concurrent indexes catch up with any new

Slight differences indeed. Thanks and included.

I'd change that to "explosion in the number of indexes a parent relation
could have if this operation fails."

Well, implosion was more... I don't recall my state of mind when writing
that. So changed the way you recommend.

Phase 4, 5 and 6 are rather confusing if you don't understand that each
"concurrent index" entry is meant to be thrown away. I think the Phase 4
comment should elaborate on that.

OK, done.

The comment in check_exclusion_constraint() is good; shouldn't the related
comment on this line in index_create() mention that
check_exclusion_constraint() needs to be changed if we ever support
concurrent builds of exclusion indexes?

if (concurrent && is_exclusion && !is_reindex)

OK, what about that then:
        /*
-        * This case is currently not supported, but there's no way to ask for it
-        * in the grammar anyway, so it can't happen.
+        * This case is currently only supported during a concurrent index
+        * rebuild, but there is no way to ask for it in the grammar otherwise
+        * anyway. If support for exclusion constraints is added in the future,
+        * the check similar to this one in check_exclusion_constraint should as
+        * well be changed accordingly.

Updated patch is attached.
Thanks again.
Regards,
--
Michael

Attachments:

20141030_reindex_concurrently_3_v2.patch (text/x-diff)
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index cd55be8..653b120 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -864,7 +864,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
-         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
+         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
+         <command>REINDEX CONCURRENTLY</>,
          <command>ALTER TABLE VALIDATE</command> and other
          <command>ALTER TABLE</command> variants (for full details see
          <xref linkend="SQL-ALTERTABLE">).
@@ -1143,7 +1144,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
    <sect2 id="locking-pages">
     <title>Page-level Locks</title>
-  
+
     <para>
      In addition to table and row locks, page-level share/exclusive locks are
      used to control read/write access to table pages in the shared buffer
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index cabae19..285f3ff 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -68,9 +68,12 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
       An index build with the <literal>CONCURRENTLY</> option failed, leaving
       an <quote>invalid</> index. Such indexes are useless but it can be
       convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      <command>REINDEX</> will perform a concurrent build if <literal>
+      CONCURRENTLY</> is specified. To build the index without interfering
+      with production you should drop the index and reissue either the
+      <command>CREATE INDEX CONCURRENTLY</> or <command>REINDEX CONCURRENTLY</>
+      command. Indexes of toast relations can be rebuilt with <command>REINDEX
+      CONCURRENTLY</>.
      </para>
     </listitem>
 
@@ -139,6 +142,21 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    </varlistentry>
 
    <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>FORCE</literal></term>
     <listitem>
      <para>
@@ -218,6 +236,194 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    reindex anything.
   </para>
 
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</> option of <command>REINDEX</>. When this option
+    is used, <productname>PostgreSQL</> must perform two scans of the table
+    for each index that needs to be rebuilt and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created, where all
+    the concurrent entries are created using only one transaction. Note that
+    if there are multiple indexes to be rebuilt then each step loops through
+    all the indexes we're rebuilding, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition is only used to build the
+       new index, and will be removed at the completion of the process. This
+       step is done as a single transaction for all the indexes involved in
+       this process, meaning that if <command>REINDEX CONCURRENTLY</> is
+       run on a table with multiple indexes, all the catalog entries of the
+       temporary indexes are created within a single transaction. A
+       <literal>SHARE UPDATE EXCLUSIVE</literal> lock at session level is taken
+       on the indexes being reindexed as well as their parent table to prevent
+       any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each temporary entry.
+       Once the index is built, its flag <literal>pg_index.indisready</> is
+       switched to <quote>true</> to make it ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while
+       the first pass build was running. Once the validation of the index
+       related to the temporary entry is done, a cache invalidation is
+       performed so that all sessions that referenced this index in any
+       cached plans will refresh them. This step is performed within a
+       single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       <literal>pg_class.relfilenode</> for the existing index definition
+       and the temporary definition are swapped. This means that the existing
+       index definition now uses the index data that we stored during the
+       build, and the temporary definition is using the old index data. Again
+       a cache invalidation is performed to refresh any sessions that may refer
+       to the previous index definition. Note that at this point
+       <literal>pg_class.indisvalid</> is not switched to <quote>true</>,
+       making the temporary index definition ignored by any read query, for
+       the sake of toast indexes that can only use one single index in ready
+       state at the same time. During the swap an exclusive lock is taken
+       on the index and its temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Temporary entries have <literal>pg_index.indisready</> switched to
+       <quote>false</> to prevent any new tuple insertions. This step
+       is done within a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The temporary index definition and its data (which is now the
+       data for the old index) are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the concurrent
+    index and try again to perform <command>REINDEX CONCURRENTLY</>.
+    The concurrent index created during the processing has a name ending in
+    the suffix <literal>cct</>. If an invalid concurrent index is based on a
+    <literal>PRIMARY KEY</> or an exclusion constraint, it can be dropped
+    with <literal>ALTER TABLE DROP CONSTRAINT</>. The same applies to
+    <literal>UNIQUE</> indexes backed by constraints. Other indexes,
+    including invalid toast indexes, can be dropped using
+    <literal>DROP INDEX</>.
+   </para>
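+
+   <para>
+    For example, if <literal>idx_cct</> above is such an invalid index left
+    behind by a failed rebuild, recovery consists of dropping it and
+    retrying:
+
+<programlisting>
+DROP INDEX idx_cct;
+REINDEX TABLE CONCURRENTLY tab;
+</programlisting>
+   </para>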
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot. <command>REINDEX DATABASE</> is
+    by default not allowed to run inside a transaction block, so in this case
+    <command>CONCURRENTLY</> is not supported.
+   </para>
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>. The valid index of a toast
+    relation, of which there is exactly one, cannot be dropped.
+   </para>
+
+   <para>
+    <command>REINDEX DATABASE</command> used with
+    <command>CONCURRENTLY</command> rebuilds only the non-system relations
+    concurrently. System relations are rebuilt non-concurrently. Toast
+    indexes are rebuilt concurrently if the relation they depend on is a
+    non-system relation.
+   </para>
+
+   <para>
+    <command>REINDEX</command> takes an <literal>ACCESS EXCLUSIVE</literal>
+    lock on all the relations involved during the operation. When
+    <command>CONCURRENTLY</command> is specified, the operation is done with
+    <literal>SHARE UPDATE EXCLUSIVE</literal> locks, except when an index
+    and its concurrent entry are swapped, where an
+    <literal>ACCESS EXCLUSIVE</literal> lock is taken on both of them.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command>.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -249,7 +455,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild the indexes of a table while allowing read and write operations
+   on the relations involved:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
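+
+  <para>
+   Rebuild a single index concurrently:
+
+<programlisting>
+REINDEX INDEX CONCURRENTLY my_broken_index;
+</programlisting>
+  </para>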
+
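+  <para>
+   Rebuild all the non-system indexes of a database concurrently:
+
+<programlisting>
+REINDEX DATABASE CONCURRENTLY my_database;
+</programlisting>
+  </para>
+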
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index f3b3689..138be1c 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -259,6 +259,18 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 }
 
 /*
+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+	int i;
+
+	for (i = 0; i < tupdesc->natts; i++)
+		tupdesc->attrs[i]->attcacheoff = -1;
+}
+
+/*
  * Free a TupleDesc including all substructure
  */
 void
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index ee10594..a31ecc7 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -44,9 +44,11 @@
 #include "catalog/pg_trigger.h"
 #include "catalog/pg_type.h"
 #include "catalog/storage.h"
+#include "commands/defrem.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -63,6 +65,7 @@
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/ruleutils.h"
 #include "utils/syscache.h"
 #include "utils/tuplesort.h"
 #include "utils/snapmgr.h"
@@ -663,6 +666,7 @@ UpdateIndexRelation(Oid indexoid,
  * classObjectId: array of index opclass OIDs, one per index column
  * coloptions: array of per-index-column indoption settings
  * reloptions: AM-specific options
+ * tupdesc: Tuple descriptor used for the index if defined
  * isprimary: index is a PRIMARY KEY
  * isconstraint: index is owned by PRIMARY KEY, UNIQUE, or EXCLUSION constraint
  * deferrable: constraint is DEFERRABLE
@@ -674,6 +678,10 @@ UpdateIndexRelation(Oid indexoid,
  *		will be marked "invalid" and the caller must take additional steps
  *		to fix it up.
  * is_internal: if true, post creation hook for new index
+ * is_reindex: if true, create an index used as the duplicate of an existing
+ *		index during a concurrent operation. This index can also be a toast
+ *		index. The caller is expected to already hold sufficient locks on the
+ *		related relations when this is used during a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -690,6 +698,7 @@ index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -697,7 +706,8 @@ index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal)
+			 bool is_internal,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -740,19 +750,24 @@ index_create(Relation heapRelation,
 
 	/*
 	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * release locks before committing in catalogs. If the index is created during
+	 * a REINDEX CONCURRENTLY operation, sufficient locks are already taken.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemRelation(heapRelation) &&
+		!is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
 
 	/*
-	 * This case is currently not supported, but there's no way to ask for it
-	 * in the grammar anyway, so it can't happen.
+	 * This case is currently only supported during a concurrent index
+	 * rebuild, but there is no way to ask for it in the grammar otherwise
+	 * anyway. If support for exclusion constraints is added in the future,
+	 * the similar check in check_exclusion_constraint should be changed
+	 * accordingly as well.
 	 */
-	if (concurrent && is_exclusion)
+	if (concurrent && is_exclusion && !is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -779,14 +794,21 @@ index_create(Relation heapRelation,
 						indexRelationName)));
 
 	/*
-	 * construct tuple descriptor for index tuples
+	 * construct tuple descriptor for index tuples if nothing is passed
+	 * by caller.
 	 */
-	indexTupDesc = ConstructTupleDescriptor(heapRelation,
-											indexInfo,
-											indexColNames,
-											accessMethodObjectId,
-											collationObjectId,
-											classObjectId);
+	if (tupdesc == NULL)
+		indexTupDesc = ConstructTupleDescriptor(heapRelation,
+												indexInfo,
+												indexColNames,
+												accessMethodObjectId,
+												collationObjectId,
+												classObjectId);
+	else
+	{
+		Assert(indexColNames == NIL);
+		indexTupDesc = tupdesc;
+	}
 
 	/*
 	 * Allocate an OID for the index, unless we were told what to use.
@@ -1093,6 +1115,340 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+
+/*
+ * index_concurrent_create
+ *
+ * Create a concurrent index based on the definition of the index provided
+ * by the caller, to be used for concurrent operations. The index is
+ * inserted into the catalogs and needs to be built later on. This is called
+ * during concurrent index processing. The heap relation on which the index
+ * is based needs to be closed by the caller.
+ */
+Oid
+index_concurrent_create(Relation heapRelation, Oid indOid, char *concurrentName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	HeapTuple	indexTuple, classTuple;
+	Datum		indclassDatum, colOptionDatum, optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	bool		initdeferred = false;
+	Oid			constraintOid = get_index_constraint(indOid);
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* Concurrent index uses the same index information as former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/*
+	 * Determine whether the index is initially deferred; this depends on
+	 * its associated constraint.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		/* Look for the correct value */
+		HeapTuple			constraintTuple;
+		Form_pg_constraint	constraintForm;
+
+		constraintTuple = SearchSysCache1(CONSTROID,
+									 ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "cache lookup failed for constraint %u",
+				 constraintOid);
+		constraintForm = (Form_pg_constraint) GETSTRUCT(constraintTuple);
+		initdeferred = constraintForm->condeferred;
+
+		ReleaseSysCache(constraintTuple);
+	}
+
+	/*
+	 * Create a copy of the tuple descriptor to be used for the concurrent
+	 * entry and reset any cache counters on it to have a fresh version.
+	 */
+	indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
+	ResetTupleDescCache(indexTupDesc);
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, indOid);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 (const char *) concurrentName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 NIL,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexTupDesc,
+								 indexRelation->rd_index->indisprimary,
+								 OidIsValid(constraintOid),	/* is constraint? */
+								 !indexRelation->rd_index->indimmediate,	/* is deferrable? */
+								 initdeferred,	/* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false,	/* is_internal */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
+
+
+/*
+ * index_concurrent_build
+ *
+ * Build an index for a concurrent operation. Low-level locks are taken when
+ * this operation is performed to prevent schema changes only, and they need
+ * to be kept until the end of the transaction performing this operation.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	heapRel, indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in
+	 * commit of transaction where this concurrent index was created
+	 * at the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both relations, and keep the locks */
+	heap_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap old index and new index in a concurrent context. An exclusive lock
+ * is taken on those two relations during the swap of their relfilenode.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid)
+{
+	Relation		oldIndexRel, newIndexRel, pg_class;
+	HeapTuple		oldIndexTuple, newIndexTuple;
+	Form_pg_class	oldIndexForm, newIndexForm;
+	Oid				tmpnode;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldIndexRel = relation_open(oldIndexOid, AccessExclusiveLock);
+	newIndexRel = relation_open(newIndexOid, AccessExclusiveLock);
+
+	/* Now swap relfilenode of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+	oldIndexForm = (Form_pg_class) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_class) GETSTRUCT(newIndexTuple);
+
+	/* Here is where the actual swap happens */
+	tmpnode = oldIndexForm->relfilenode;
+	oldIndexForm->relfilenode = newIndexForm->relfilenode;
+	newIndexForm->relfilenode = tmpnode;
+
+	/* Then update the tuples for each relation */
+	simple_heap_update(pg_class, &oldIndexTuple->t_self, oldIndexTuple);
+	simple_heap_update(pg_class, &newIndexTuple->t_self, newIndexTuple);
+	CatalogUpdateIndexes(pg_class, oldIndexTuple);
+	CatalogUpdateIndexes(pg_class, newIndexTuple);
+
+	/* Close relations and clean up */
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+	heap_close(pg_class, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldIndexRel, NoLock);
+	relation_close(newIndexRel, NoLock);
+}
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all the backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction calling this
+ * function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid, LOCKTAG locktag)
+{
+	Relation	heapRelation, indexRelation;
+
+	/*
+	 * Now we must wait until no running transaction could be using the
+	 * index for a query.  Use AccessExclusiveLock here to check for
+	 * running transactions that hold locks of any kind on the table. Note
+	 * we do not need to worry about xacts that open the table for reading
+	 * after this point; they will see the index as invalid when they open
+	 * the relation.
+	 *
+	 * Note: the reason we use actual lock acquisition here, rather than
+	 * just checking the ProcArray and sleeping, is that deadlock is
+	 * possible if one of the transactions in question is blocked trying
+	 * to acquire an exclusive lock on our table. The lock code will
+	 * detect deadlock and error out properly.
+	 */
+	WaitForLockers(locktag, AccessExclusiveLock);
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're
+	 * about to stop doing inserts into the index which could show
+	 * conflicts with existing predicate locks, so now is the time to move
+	 * them to the heap relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just
+	 * might have it open for updating it.  So now we can unset indisready
+	 * and indislive, then wait till nobody could be using it at all
+	 * anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit
+	 * all sessions will refresh the table's index list.  Forgetting just
+	 * the index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index concurrently as the last step of a concurrent index
+ * process. Deletion is done through performDeletion, or dependencies of the
+ * index would not get dropped. At this point all the indexes are already
+ * considered invalid and dead, so they can be dropped without any concurrent
+ * options since it is certain that they will not interact with other
+ * server sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid				constraintOid = get_index_constraint(indexOid);
+	ObjectAddress	object;
+	Form_pg_index	indexForm;
+	Relation		pg_index;
+	HeapTuple		indexTuple;
+
+	/*
+	 * Check that the index being dropped here is not alive; if it were, it
+	 * might still be used by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check to avoid dropping live indexes.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process.
+	 * Register constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object,
+					DROP_RESTRICT,
+					0);
+}
+
+
 /*
  * index_constraint_create
  *
@@ -1441,52 +1797,8 @@ index_drop(Oid indexId, bool concurrent)
 		CommitTransactionCommand();
 		StartTransactionCommand();
 
-		/*
-		 * Now we must wait until no running transaction could be using the
-		 * index for a query.  Use AccessExclusiveLock here to check for
-		 * running transactions that hold locks of any kind on the table. Note
-		 * we do not need to worry about xacts that open the table for reading
-		 * after this point; they will see the index as invalid when they open
-		 * the relation.
-		 *
-		 * Note: the reason we use actual lock acquisition here, rather than
-		 * just checking the ProcArray and sleeping, is that deadlock is
-		 * possible if one of the transactions in question is blocked trying
-		 * to acquire an exclusive lock on our table.  The lock code will
-		 * detect deadlock and error out properly.
-		 */
-		WaitForLockers(heaplocktag, AccessExclusiveLock);
-
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId, heaplocktag);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 160f006..1a6ee5a 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -341,8 +341,9 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 BTREE_AM_OID,
 				 rel->rd_rel->reltablespace,
 				 collationObjectId, classObjectId, coloptions, (Datum) 0,
+				 NULL,
 				 true, false, false, false,
-				 true, false, false, true);
+				 true, false, false, false, false);
 
 	heap_close(toast_rel, NoLock);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 3c1e90e..662dd46 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -68,8 +68,9 @@ static void ComputeIndexAttrs(IndexInfo *indexInfo,
 static Oid GetIndexOpClass(List *opclass, Oid attrType,
 				char *accessMethodName, Oid accessMethodId);
 static char *ChooseIndexName(const char *tabname, Oid namespaceId,
-				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint);
+							 List *colnames, List *exclusionOpNames,
+							 bool primary, bool isconstraint,
+							 bool concurrent);
 static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
@@ -276,6 +277,86 @@ CheckIndexCompatible(Oid oldId,
 }
 
 /*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given
+ * xmin limit, because the index might not contain tuples deleted just
+ * before the reference snapshot was taken. Obtain a list of VXIDs of such
+ * transactions, and wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int i, n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue; /* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int n_newer_snapshots, j, k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue; /* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
+/*
  * DefineIndex
  *		Creates a new index.
  *
@@ -312,7 +393,6 @@ DefineIndex(Oid relationId,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	bool		amcanorder;
@@ -322,13 +402,10 @@ DefineIndex(Oid relationId,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
 	Snapshot	snapshot;
-	int			i;
 
 	/*
 	 * count attributes in index
@@ -459,7 +536,8 @@ DefineIndex(Oid relationId,
 											indexColNames,
 											stmt->excludeOpNames,
 											stmt->primary,
-											stmt->isconstraint);
+											stmt->isconstraint,
+											false);
 
 	/*
 	 * look up the access method, verify it can handle the requested features
@@ -606,11 +684,11 @@ DefineIndex(Oid relationId,
 					 indexInfo, indexColNames,
 					 accessMethodId, tablespaceId,
 					 collationObjectId, classObjectId,
-					 coloptions, reloptions, stmt->primary,
+					 coloptions, reloptions, NULL, stmt->primary,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
-					 stmt->concurrent, !check_rights);
+					 stmt->concurrent, !check_rights, false);
 
 	/* Add any requested comment */
 	if (stmt->idxcomment != NULL)
@@ -692,27 +770,15 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/*
 	 * Update the pg_index row to mark the index as ready for inserts. Once we
@@ -777,74 +843,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-										 PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots)		/* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -871,6 +872,544 @@ DefineIndex(Oid relationId,
 
 
 /*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for the given relation Oid. The relation can
+ * be either an index or a table. If a table is specified, each phase is
+ * processed one by one for each of the table's indexes, as well as for the
+ * indexes of its toast relation if the table has one.
+ */
+bool
+ReindexRelationConcurrently(Oid relationOid)
+{
+	List	   *concurrentIndexIds = NIL,
+			   *indexIds = NIL,
+			   *parentRelationIds = NIL,
+			   *lockTags = NIL,
+			   *relationLocks = NIL;
+	ListCell   *lc, *lc2;
+	Snapshot	snapshot;
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given Oid is a
+	 * table, all its valid indexes are rebuilt, including the indexes of its
+	 * associated toast table. If the relkind is an index, that index itself
+	 * is rebuilt. The locks taken on the parent relations and the involved
+	 * indexes are kept until this transaction is committed, to protect
+	 * against schema changes that might occur before a session lock is
+	 * taken on each relation; the session locks then similarly protect
+	 * against schema changes happening across the multiple transactions
+	 * used during this process.
+	 */
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes
+				 * including toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc2, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc2);
+					Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+						indexIds = lappend_oid(indexIds, cellOid);
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+												ShareUpdateExclusiveLock);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+							indexIds = lappend_oid(indexIds, cellOid);
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index simply add its Oid to list. Invalid indexes
+				 * cannot be included in list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(IndexGetRelation(relationOid, false));
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+					indexIds = list_make1_oid(relationOid);
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Error out if the relation type is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We need first to create an index which is based on the same data
+	 * as the former index except that it will be only registered in catalogs
+	 * and will be built later. It is possible to perform all the operations
+	 * on all the indexes at the same time for a parent relation including
+	 * indexes for its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId	lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the parent relation of the index; it might be a toast table */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a relation name for concurrent index */
+		concurrentName = ChooseIndexName(get_rel_name(indOid),
+										 get_rel_namespace(indexRel->rd_index->indrelid),
+										 NULL,
+										 NULL,
+										 false,
+										 false,
+										 true);
+
+		/* Create concurrent index based on given index */
+		concurrentOid = index_concurrent_create(indexParentRel,
+												indOid,
+												concurrentName);
+
+		/*
+		 * Now open the relation of the concurrent index; a lock is needed
+		 * on it as well.
+		 */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the concurrent index Oid */
+		concurrentIndexIds = lappend_oid(concurrentIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelid of each index to protect the concurrent
+		 * relations from being dropped, then close the relations. A palloc'd
+		 * copy of each lockrelid is appended, since the list outlives this
+		 * loop iteration. The lockrelid of the parent relation is not saved
+		 * here to avoid tracking the same relation multiple times; instead
+		 * we rely on parentRelationIds built earlier.
+		 */
+		lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+		lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the subsequent wait phases, where visibility
+	 * checks by other backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId	lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		LOCKTAG	   *heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Add a palloc'd copy of the parent's lockrelid to the list */
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid.dbId, lockrelid.relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transaction will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on the parent relation,
+	 * each old index and its concurrent copy, to ensure that none of them
+	 * are dropped until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build concurrent indexes in a separate transaction for each index to
+	 * avoid having open transactions for an unnecessary long time. A
+	 * concurrent build is done for each concurrent index that will replace
+	 * the old indexes. Before doing that, we need to wait until no running
+	 * transaction could still have the parent table of an index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		bool		primary;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * The index relation has been closed by the previous commit, so
+		 * reopen it to determine if it is used as a primary key. Keep it
+		 * open until after the build so that indexRel->rd_index remains
+		 * valid when passed below.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		primary = indexRel->rd_index->indisprimary;
+
+		/* Perform concurrent build of new index */
+		index_concurrent_build(indexRel->rd_index->indrelid,
+							   concurrentOid,
+							   primary);
+
+		index_close(indexRel, NoLock);
+
+		/*
+		 * Update the pg_index row of the concurrent index as ready for inserts.
+		 * Once we commit this transaction, any new transactions that open the
+		 * table must insert new entries into the index for insertions and
+		 * non-HOT updates.
+		 */
+		index_set_state_flags(concurrentOid, INDEX_CREATE_SET_READY);
+
+		/* we can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update of the
+		 * concurrent index visible.
+		 */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the concurrent indexes catch up with any new tuples
+	 * that were created during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Perform a scan of each concurrent index with the heap, then insert
+	 * any missing index entries.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid				indOid = lfirst_oid(lc);
+		Oid				relOid;
+		TransactionId	limitXmin;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for the validation
+		 * of this concurrent index.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate the index, which might be a toast index */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save its xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * This concurrent index is now valid, as it contains all the
+		 * necessary tuples. However, it might not have taken into account
+		 * tuples deleted before the reference snapshot was taken, so we need
+		 * to wait for the transactions that might have older snapshots than
+		 * ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction to make the concurrent index valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated, it is necessary
+	 * to swap each concurrent index with its corresponding old index. Note
+	 * that the concurrent index used for the swap is not marked as valid:
+	 * keeping the former index and the concurrent index with different
+	 * valid statuses avoids an explosion in the number of indexes a parent
+	 * relation could accumulate if this step failed multiple times in a row
+	 * for one reason or another. Once this phase is done, each concurrent
+	 * index is thrown away in the following steps.
+	 */
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/*
+		 * Each index needs to be swapped in a separate transaction, so start
+		 * a new one.
+		 */
+		StartTransactionCommand();
+
+		/* Swap old index and its concurrent entry */
+		index_concurrent_swap(concurrentOid, indOid);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		relOid = IndexGetRelation(indOid, false);
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/* Commit this transaction and make old index invalidation visible */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * The old indexes now hold the fresh relfilenodes of their respective
+	 * concurrent entries. It is time to mark the now-useless concurrent
+	 * entries as not ready, so that they can safely be discarded from write
+	 * operations that may occur on them. One transaction is used for each
+	 * single index entry.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		LOCKTAG	   *heapLockTag = NULL;
+		ListCell   *cell;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Find the locktag of the parent table for this index; we need to
+		 * wait for locks on it.
+		 */
+		foreach(cell, lockTags)
+		{
+			LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+			if (relOid == localTag->locktag_field2)
+				heapLockTag = localTag;
+		}
+		Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+		/*
+		 * Finish the index invalidation and set it as dead. Note that it is
+		 * necessary to wait for virtual locks on the parent relation
+		 * before setting the index as dead.
+		 */
+		index_concurrent_set_dead(relOid, indOid, *heapLockTag);
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the concurrent indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe, as all the concurrent entries are already
+	 * considered invalid and not ready, so they will not be used by other
+	 * backends for any read or write operations.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid indexOid = lfirst_oid(lc);
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start transaction to drop this index */
+		StartTransactionCommand();
+
+		/* Get fresh snapshot for next step */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/* Drop this concurrent index; a transaction was started above */
+		index_concurrent_drop(indexOid);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * The last thing to do is release the session-level locks on the parent
+	 * tables and their indexes.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Start a new transaction to finish process properly */
+	StartTransactionCommand();
+
+	/* Get fresh snapshot for the end of process */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	return true;
+}
+
+
+/*
  * CheckMutability
  *		Test whether given expression is mutable
  */
@@ -1533,7 +2072,8 @@ ChooseRelationName(const char *name1, const char *name2,
 static char *
 ChooseIndexName(const char *tabname, Oid namespaceId,
 				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint)
+				bool primary, bool isconstraint,
+				bool concurrent)
 {
 	char	   *indexname;
 
@@ -1559,6 +2099,13 @@ ChooseIndexName(const char *tabname, Oid namespaceId,
 									   "key",
 									   namespaceId);
 	}
+	else if (concurrent)
+	{
+		indexname = ChooseRelationName(tabname,
+									   NULL,
+									   "cct",
+									   namespaceId);
+	}
 	else
 	{
 		indexname = ChooseRelationName(tabname,
@@ -1671,18 +2218,22 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation)
+ReindexIndex(RangeVar *indexRelation, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
 
-	/* lock level used here should match index lock reindex_index() */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
-									  false, false,
-									  RangeVarCallbackForReindexIndex,
-									  (void *) &heapOid);
+	indOid = RangeVarGetRelidExtended(indexRelation,
+				concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+				concurrent, concurrent,
+				RangeVarCallbackForReindexIndex,
+				(void *) &heapOid);
 
-	reindex_index(indOid, false);
+	/* Continue process for concurrent or non-concurrent case */
+	if (!concurrent)
+		reindex_index(indOid, false);
+	else
+		ReindexRelationConcurrently(indOid);
 
 	return indOid;
 }
@@ -1751,17 +2302,27 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation)
+ReindexTable(RangeVar *relation, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
-									   RangeVarCallbackOwnsTable, NULL);
-
-	if (!reindex_relation(heapOid,
+	heapOid = RangeVarGetRelidExtended(relation,
+		concurrent ? ShareUpdateExclusiveLock : ShareLock,
+		concurrent, concurrent,
+		RangeVarCallbackOwnsTable, NULL);
+
+	/* Run the concurrent process if necessary */
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid);
+	else
+		result = reindex_relation(heapOid,
 						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS))
+							 REINDEX_REL_CHECK_CONSTRAINTS);
+
+	/* Let the user know if the operation did nothing */
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -1778,7 +2339,10 @@ ReindexTable(RangeVar *relation)
  * That means this must not be called within a user transaction block!
  */
 Oid
-ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
+ReindexDatabase(const char *databaseName,
+				bool do_system,
+				bool do_user,
+				bool concurrent)
 {
 	Relation	relationRelation;
 	HeapScanDesc scan;
@@ -1790,6 +2354,15 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 
 	AssertArg(databaseName);
 
+	/*
+	 * A CONCURRENTLY operation is not allowed on system catalogs, but it is
+	 * on the user relations of a database.
+	 */
+	if (concurrent && !do_user)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot reindex system catalogs concurrently")));
+
 	if (strcmp(databaseName, get_database_name(MyDatabaseId)) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1874,17 +2447,42 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result = false;
+		bool		process_concurrent;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS))
+
+		/* Determine if relation needs to be processed concurrently */
+		process_concurrent = concurrent &&
+			!IsSystemNamespace(get_rel_namespace(relid));
+
+		/*
+		 * Reindex the relation with a concurrent or non-concurrent process.
+		 * System relations, including pg_class, cannot be reindexed
+		 * concurrently, but they still need to be reindexed with a normal
+		 * process, as they could be corrupted and the concurrent process
+		 * itself uses them. This does not include toast relations, which
+		 * are reindexed when their parent relation is processed.
+		 */
+		if (process_concurrent)
+		{
+			old = MemoryContextSwitchTo(private_context);
+			result = ReindexRelationConcurrently(relid);
+			MemoryContextSwitchTo(old);
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS);
+
+		if (result)
 			ereport(NOTICE,
-					(errmsg("table \"%s.%s\" was reindexed",
+					(errmsg("table \"%s.%s\" was reindexed%s",
 							get_namespace_name(get_rel_namespace(relid)),
-							get_rel_name(relid))));
+							get_rel_name(relid),
+							process_concurrent ? " concurrently" : "")));
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 	}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index ecdff1e..e212da0 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -907,6 +907,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	char		relkind;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -942,7 +943,37 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Handle the case of a system index that might have been invalidated by
+	 * a failed concurrent operation, and allow it to be dropped. For the
+	 * time being, this only concerns indexes of toast relations that became
+	 * invalid during a REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) &&
+		relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index d5e1273..8690eeb 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1201,6 +1201,20 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
 	}
 
 	/*
+	 * An invalid index can only exist if it was created in a concurrent
+	 * context. Since CREATE INDEX CONCURRENTLY does not support exclusion
+	 * constraints, this code path can only be reached through REINDEX
+	 * CONCURRENTLY. In that case an equivalent index exists in parallel to
+	 * this one, so we can bypass this check: it has already been done on the
+	 * other index. If exclusion constraints are ever supported by CREATE
+	 * INDEX CONCURRENTLY, this will have to be removed or reworked
+	 * accordingly.
+	 */
+	if (!index->rd_index->indisvalid)
+		return true;
+
+	/*
 	 * Search the tuples that are in the index for any violations, including
 	 * tuples that aren't visible yet.
 	 */
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 21b070a..eaf9dd3 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3724,6 +3724,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(do_system);
 	COPY_SCALAR_FIELD(do_user);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 358395f..c49ee7a 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1901,6 +1901,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(do_system);
 	COMPARE_SCALAR_FIELD(do_user);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0de9584..4b8067d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7157,35 +7157,38 @@ opt_if_exists: IF_P EXISTS						{ $$ = TRUE; }
  *
  *		QUERY:
  *
- *		REINDEX type <name> [FORCE]
+ *		REINDEX type [CONCURRENTLY] <name> [FORCE]
  *
  * FORCE no longer does anything, but we accept it for backwards compatibility
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_type qualified_name opt_force
+			REINDEX reindex_type opt_concurrently qualified_name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					$$ = (Node *)n;
 				}
-			| REINDEX SYSTEM_P name opt_force
+			| REINDEX SYSTEM_P opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = false;
 					$$ = (Node *)n;
 				}
-			| REINDEX DATABASE name opt_force
+			| REINDEX DATABASE opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = true;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 4a2a339..5339676 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -744,16 +744,20 @@ standard_ProcessUtility(Node *parsetree,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				switch (stmt->kind)
 				{
 					case OBJECT_INDEX:
-						ReindexIndex(stmt->relation);
+						ReindexIndex(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_TABLE:
 					case OBJECT_MATVIEW:
-						ReindexTable(stmt->relation);
+						ReindexTable(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_DATABASE:
 
@@ -765,8 +769,8 @@ standard_ProcessUtility(Node *parsetree,
 						 */
 						PreventTransactionChain(isTopLevel,
 												"REINDEX DATABASE");
-						ReindexDatabase(stmt->name,
-										stmt->do_system, stmt->do_user);
+						ReindexDatabase(stmt->name, stmt->do_system,
+										stmt->do_user, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 66d80b5..4f2376f 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1643,6 +1643,23 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY is not allowed inside
+			 * a transaction block.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
+
 		return false;
 	}
 
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 083f4bd..6b3df50 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -95,6 +95,8 @@ extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
+extern void ResetTupleDescCache(TupleDesc tupdesc);
+
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
 extern void DecrTupleDescRefCount(TupleDesc tupdesc);
 
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 006b180..7d8c376 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -53,6 +53,7 @@ extern Oid index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -60,7 +61,24 @@ extern Oid index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal);
+			 bool is_internal,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create(Relation heapRelation,
+								   Oid indOid,
+								   char *concurrentName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid,
+									  LOCKTAG locktag);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern void index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 0ebdbc1..b988555 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -28,10 +28,11 @@ extern Oid DefineIndex(Oid relationId,
 			bool check_rights,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation);
-extern Oid	ReindexTable(RangeVar *relation);
+extern Oid	ReindexIndex(RangeVar *indexRelation, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, bool concurrent);
 extern Oid ReindexDatabase(const char *databaseName,
-				bool do_system, bool do_user);
+							bool do_system, bool do_user, bool concurrent);
+extern bool ReindexRelationConcurrently(Oid relOid);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index cef9544..854551e 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2728,6 +2728,7 @@ typedef struct ReindexStmt
 	const char *name;			/* name of database to reindex */
 	bool		do_system;		/* include system tables in database case */
 	bool		do_user;		/* include user tables in database case */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000..9e04169
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000..eb59fe0
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 8326e94..dba06c6 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2786,3 +2786,60 @@ explain (costs off)
    Index Cond: ((thousand = 1) AND (tenthous = 1001))
 (2 rows)
 
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  cannot reindex system concurrently
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+Table "public.concur_reindex_tab"
+ Column |  Type   | Modifiers 
+--------+---------+-----------
+ c1     | integer | not null
+ c2     | text    | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index d4d24ef..93321c0 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -938,3 +938,45 @@ ORDER BY thousand;
 
 explain (costs off)
   select * from tenk1 where (thousand, tenthous) in ((1,1001), (null,null));
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
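The psql-side hunk above (in command_no_begin) scans the word after REINDEX's object keyword to decide whether the command may run inside a transaction block. As a rough standalone sketch of that logic, assuming single-byte ASCII input and substituting POSIX strncasecmp() for pg_strncasecmp()/PQmblen(); the function name is hypothetical, not part of the patch:

```c
#include <ctype.h>
#include <stddef.h>
#include <strings.h>            /* strncasecmp (POSIX) */

/*
 * Given the text following the word REINDEX, return 1 if it names
 * INDEX or TABLE followed by CONCURRENTLY, i.e. a command that must
 * not run inside a transaction block.
 */
int
is_reindex_concurrently_tail(const char *query)
{
    size_t wordlen = 0;

    /* scan the object keyword: must be exactly INDEX or TABLE */
    while (isalpha((unsigned char) query[wordlen]))
        wordlen++;
    if (wordlen != 5 ||
        (strncasecmp(query, "index", 5) != 0 &&
         strncasecmp(query, "table", 5) != 0))
        return 0;

    /* skip whitespace, the skip_white_space() analogue */
    query += wordlen;
    while (isspace((unsigned char) *query))
        query++;

    /* check for the CONCURRENTLY keyword */
    wordlen = 0;
    while (isalpha((unsigned char) query[wordlen]))
        wordlen++;
    return wordlen == 12 &&
        strncasecmp(query, "concurrently", 12) == 0;
}
```

Unlike the real code, this sketch does not handle multibyte encodings, which is why the patch computes word lengths with PQmblen() against pset.encoding.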
#15Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Michael Paquier (#14)
Re: REINDEX CONCURRENTLY 2.0

On 10/30/14, 3:19 AM, Michael Paquier wrote:

Thanks for your input, Jim!

On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

Patch applies against current HEAD and builds, but I'm getting 37 failed
tests (mostly parallel, but also misc and WITH; results attached). Is that
expected?

This is caused by the recent commit 7b1c2a0 (that I actually participated in reviewing :p) because of a missing inclusion of ruleutils.h in index.c.

The "mark the concurrent index" bit is rather confusing; it sounds like it's
referring to the new index instead of the old. Now that I've read the code I
understand what's going on here between the concurrent index *entry* and the
filenode swap, but I don't think the docs make this sufficiently clear to
users.

How about something like this:

The following steps occur in a concurrent index build, each in a separate
transaction. Note that if there are multiple indexes to be rebuilt then each
step loops through all the indexes we're rebuilding, using a separate
transaction for each one.

1. [blah]

Definitely a good idea! I took your text and made it more precise, listing the actions done for each step, the pg_index flags switched, using <orderedlist> to make the list of steps described in a separate paragraph more exhaustive for the user. At the same time I reworked the docs removing a part that was somewhat duplicated about dealing with the constraints having invalid index entries and how to drop them.

Awesome! Just a few items here:

+       Then a second pass is performed to add tuples that were added while
+       the first pass build was running. One the validation of the index

s/One the/Once the/

+ * index_concurrent_create
+ *
+ * Create an index based on the given one that will be used for concurrent
+ * operations. The index is inserted into catalogs and needs to be built
later
+ * on. This is called during concurrent index processing. The heap relation
+ * on which is based the index needs to be closed by the caller.

Last bit presumably should be "on which the index is based".

What about "Create a concurrent index based on the definition of the one provided by caller"?

That's good too, but my comment was on the last sentence, not the first.

+ /* Build the list of column names, necessary for index_create */
Instead of all this work wouldn't it be easier to create a version of
index_create/ConstructTupleDescriptor that will use the IndexInfo for the
old index? ISTM index_concurrent_create() is doing a heck of a lot of work
to marshal data into one form just to have it get marshaled yet again. Worst
case, if we do have to play this game, there should be a stand-alone
function to get the columns/expressions for an existing index; you're
duplicating a lot of code from pg_get_indexdef_worker().

Yes, this definitely sucks, and the approach of creating a function to get all the column names is not productive either. Then let's define an additional argument in index_create to pass a potential TupleDesc instead of this whole wart. I noticed as well that we need to actually reset attcacheoff to be able to use a fresh version of the tuple descriptor of the old index. I added a small API for this purpose in tupdesc.h called ResetTupleDescCache. Would it make sense instead to extend CreateTupleDescCopyConstr or CreateTupleDescCopy with a boolean flag?

Perhaps there'd be other places that would want to reset the stats, so I lean slightly that direction.
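As a loose illustration of the cache-reset idea being discussed (not the actual PostgreSQL API), the point is that a descriptor copied from the old index may carry cached attribute offsets that are only valid for the old index's tuple layout, so they must be reset to the "not computed" value before reuse. The struct and function names below are stand-ins for TupleDesc/ResetTupleDescCache:

```c
/* Stand-in for Form_pg_attribute: attcacheoff is a cached byte
 * offset within a tuple, or -1 when it has not been computed. */
typedef struct FakeAttribute
{
    int attcacheoff;
} FakeAttribute;

/* Stand-in for TupleDesc: a count plus an array of attributes. */
typedef struct FakeTupleDesc
{
    int            natts;
    FakeAttribute *attrs;
} FakeTupleDesc;

/* Invalidate all cached offsets so a descriptor copied from the old
 * index can be reused safely when creating the concurrent entry. */
void
reset_tupledesc_cache(FakeTupleDesc *tupdesc)
{
    for (int i = 0; i < tupdesc->natts; i++)
        tupdesc->attrs[i].attcacheoff = -1;
}
```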

The comment at the beginning of index_create needs to be modified to mention tupdesc and especially that setting tupdesc over-rides indexColNames.

index_concurrent_swap(): Perhaps it'd be better to create
index_concurrent_swap_setup() and index_concurrent_swap_cleanup() and
refactor the duplicated code out... the actual function would then become:

This sentence is not finished :) IMO, index_concurrent_swap looks good as is, taking as arguments the index and its concurrent entry, and swapping their relfilenode after taking AccessExclusiveLock that will be hold until the end of this transaction.

Fair enough.
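Stripped of the locking and catalog machinery, the swap discussed above amounts to exchanging two relfilenode values so that the existing index definition points at the freshly built storage and the temporary entry inherits the old storage. A toy sketch with stand-in types (in the real code, AccessExclusiveLock is held on both entries until end of transaction):

```c
typedef unsigned int FakeOid;

/* Stand-in for a pg_class entry: only the storage pointer matters. */
typedef struct FakeIndexEntry
{
    FakeOid relfilenode;        /* on-disk storage this entry points at */
} FakeIndexEntry;

/*
 * Exchange the storage of the old index and its concurrent entry.
 * In PostgreSQL this runs with exclusive locks held on both
 * relations, so no backend can observe the half-swapped state.
 */
void
swap_relfilenode(FakeIndexEntry *old_entry, FakeIndexEntry *new_entry)
{
    FakeOid     tmp = old_entry->relfilenode;

    old_entry->relfilenode = new_entry->relfilenode;
    new_entry->relfilenode = tmp;
}
```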

ReindexRelationConcurrently()

+ * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each step of
REINDEX
+ * CONCURRENTLY is done in parallel with all the table's indexes as well as
+ * its dependent toast indexes.
This comment is a bit misleading; we're not actually doing anything in
parallel, right? AFAICT index_concurrent_build is going to block while each
index is built the first time.

Yes, parallel may be misleading. What is meant here is that each step of the process is done one by one on all the valid indexes a table may have.

New comment looks good.

+        * relkind is an index, this index itself will be rebuilt. The locks
taken
+        * parent relations and involved indexes are kept until this
transaction
+        * is committed to protect against schema changes that might occur
until
+        * the session lock is taken on each relation.

This comment is a bit unclear to me... at minimum I think it should be "* on
parent relations" instead of "* parent relations", but I think it needs to
elaborate on why/when we're also taking session level locks.

Hum, done as follows:
@@ -896,9 +896,11 @@ ReindexRelationConcurrently(Oid relationOid)
* If the relkind of given relation Oid is a table, all its valid indexes
* will be rebuilt, including its associated toast table indexes. If
* relkind is an index, this index itself will be rebuilt. The locks taken
-        * parent relations and involved indexes are kept until this transaction
+        * on parent relations and involved indexes are kept until this transaction
* is committed to protect against schema changes that might occur until
-        * the session lock is taken on each relation.
+        * the session lock is taken on each relation, session lock used to
+        * similarly protect from any schema change that could happen within the
+        * multiple transactions that are used during this process.
*/

Cool.

OK, what about that then:
/*
-        * This case is currently not supported, but there's no way to ask for it
-        * in the grammar anyway, so it can't happen.
+        * This case is currently only supported during a concurrent index
+        * rebuild, but there is no way to ask for it in the grammar otherwise
+        * anyway. If support for exclusion constraints is added in the future,
+        * the check similar to this one in check_exclusion_constraint should as
+        * well be changed accordingly.

Updated patch is attached.

Works for me.

Keep in mind I'm not super familiar with the guts of index creation, so it'd be good for someone else to look at that bit (especially index_concurrent_create and ReindexRelationConcurrently).
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Michael Paquier (#14)
Re: REINDEX CONCURRENTLY 2.0

On 10/30/14, 3:19 AM, Michael Paquier wrote:

On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

Patch applies against current HEAD and builds, but I'm getting 37 failed
tests (mostly parallel, but also misc and WITH; results attached). Is that
expected?

This is caused by the recent commit 7b1c2a0 (that I actually participated in reviewing :p) because of a missing inclusion of ruleutils.h in index.c.

Sorry, forgot to mention patch now passes make check cleanly.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


#17Michael Paquier
michael.paquier@gmail.com
In reply to: Michael Paquier (#14)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Updated patch is attached.

Please find attached an updated patch with the following things changed:
- Addition of tab completion in psql for all new commands
- Addition of a call to WaitForLockers in index_concurrent_swap to
ensure that there are no transactions still running on the parent
table before exclusive locks are taken on the index and its
concurrent entry. Previous patch versions created deadlocks because
of that, an issue spotted by the isolation tests included in the patch.
- Isolation tests for reindex concurrently are re-enabled by default.
Regards,
--
Michael

Attachments:

20141106_reindex_concurrently_3_v3.patch (text/x-diff; charset=US-ASCII)
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index cd55be8..653b120 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -864,7 +864,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
-         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
+         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
+         <command>REINDEX CONCURRENTLY</>,
          <command>ALTER TABLE VALIDATE</command> and other
          <command>ALTER TABLE</command> variants (for full details see
          <xref linkend="SQL-ALTERTABLE">).
@@ -1143,7 +1144,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
    <sect2 id="locking-pages">
     <title>Page-level Locks</title>
-  
+
     <para>
      In addition to table and row locks, page-level share/exclusive locks are
      used to control read/write access to table pages in the shared buffer
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index cabae19..285f3ff 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -68,9 +68,12 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
       An index build with the <literal>CONCURRENTLY</> option failed, leaving
       an <quote>invalid</> index. Such indexes are useless but it can be
       convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      <command>REINDEX</> will perform a concurrent build if <literal>
+      CONCURRENTLY</> is specified. To build the index without interfering
+      with production you should drop the index and reissue either the
+      <command>CREATE INDEX CONCURRENTLY</> or <command>REINDEX CONCURRENTLY</>
+      command. Indexes of toast relations can be rebuilt with <command>REINDEX
+      CONCURRENTLY</>.
      </para>
     </listitem>
 
@@ -139,6 +142,21 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    </varlistentry>
 
    <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>FORCE</literal></term>
     <listitem>
      <para>
@@ -218,6 +236,194 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    reindex anything.
   </para>
 
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</> option of <command>REINDEX</>. When this option
+    is used, <productname>PostgreSQL</> must perform two scans of the table
+    for each index that needs to be rebuilt, and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created, where all
+    the concurrent entries are created using only one transaction. Note that
+    if there are multiple indexes to be rebuilt then each step loops through
+    all the indexes we're rebuilding, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition is only used to build the
+       new index, and will be removed at the completion of the process. This
+       step is done as a single transaction for all the indexes involved in
+       this process, meaning that if <command>REINDEX CONCURRENTLY</> is
+       run on a table with multiple indexes, all the catalog entries of the
+       temporary indexes are created within a single transaction. A
+       <literal>SHARE UPDATE EXCLUSIVE</literal> lock at session level is taken
+       on the indexes being reindexed as well as their parent table to prevent
+       any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each temporary entry.
+       Once the index is built, its flag <literal>pg_index.indisready</> is
+       switched to <quote>true</> to make it ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while
+       the first pass build was running. Once the validation of the index
+       related to the temporary entry is done, a cache invalidation is
+       performed so that all sessions that referenced this index in any
+       cached plans will refresh them. This step is performed within a
+       single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       <literal>pg_class.relfilenode</> for the existing index definition
+       and the temporary definition are swapped. This means that the existing
+       index definition now uses the index data that we stored during the
+       build, and the temporary definition is using the old index data. Again
+       a cache invalidation is performed to refresh any sessions that may refer
+       to the previous index definition. Note that at this point
+       <literal>pg_index.indisvalid</> is not switched to <quote>true</>,
+       so the temporary index definition is ignored by any read query, for
+       the sake of toast indexes, which can only use one single index in
+       the ready state at a time. During the swap an exclusive lock is taken
+       on the index and its temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Temporary entries have <literal>pg_index.indisready</> switched to
+       <quote>false</> to prevent any new tuple insertions. This step
+       is done within a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The temporary index definition and its data (which is now the
+       data for the old index) are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the concurrent
+    index and try again to perform <command>REINDEX CONCURRENTLY</>.
+    The concurrent index created during the processing has a name ending in
+    the suffix cct. If a concurrent index based on a <literal>PRIMARY KEY</>
+    or an exclusion constraint is marked as invalid, it can be dropped with
+    <literal>ALTER TABLE DROP CONSTRAINT</>. The same applies to
+    <literal>UNIQUE</> indexes backed by constraints. Other indexes can be
+    dropped using <literal>DROP INDEX</>, including invalid toast indexes.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot. <command>REINDEX DATABASE</> is
+    by default not allowed to run inside a transaction block, so in this case
+    <command>CONCURRENTLY</> is not supported.
+   </para>
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>. A valid toast index cannot be
+    dropped, as each toast relation relies on a single valid index.
+   </para>
+
+   <para>
+    <command>REINDEX DATABASE</command> used with
+    <command>CONCURRENTLY</command> rebuilds only non-system relations
+    concurrently; system relations are rebuilt non-concurrently. Toast
+    indexes are rebuilt concurrently if the relation they depend on is a
+    non-system relation.
+   </para>
+
+   <para>
+    <command>REINDEX</command> takes an <literal>ACCESS EXCLUSIVE</literal>
+    lock on all the relations involved in the operation. When
+    <command>CONCURRENTLY</command> is specified, the operation is done with
+    <literal>SHARE UPDATE EXCLUSIVE</literal>, except when an index and its
+    concurrent entry are swapped, at which point an
+    <literal>ACCESS EXCLUSIVE</literal> lock is taken on the parent relation.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command>.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -249,7 +455,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild the indexes of a table while allowing read and write operations
+   on the involved relations:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index f3b3689..138be1c 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -259,6 +259,18 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 }
 
 /*
+ * Reset the cached offsets (attcacheoff) of all attributes of a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+	int i;
+
+	for (i = 0; i < tupdesc->natts; i++)
+		tupdesc->attrs[i]->attcacheoff = -1;
+}
+
+/*
  * Free a TupleDesc including all substructure
  */
 void
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 01ed880..5f5acec 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -44,9 +44,11 @@
 #include "catalog/pg_trigger.h"
 #include "catalog/pg_type.h"
 #include "catalog/storage.h"
+#include "commands/defrem.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -63,6 +65,7 @@
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/ruleutils.h"
 #include "utils/syscache.h"
 #include "utils/tuplesort.h"
 #include "utils/snapmgr.h"
@@ -663,6 +666,7 @@ UpdateIndexRelation(Oid indexoid,
  * classObjectId: array of index opclass OIDs, one per index column
  * coloptions: array of per-index-column indoption settings
  * reloptions: AM-specific options
+ * tupdesc: Tuple descriptor used for the index if defined
  * isprimary: index is a PRIMARY KEY
  * isconstraint: index is owned by PRIMARY KEY, UNIQUE, or EXCLUSION constraint
  * deferrable: constraint is DEFERRABLE
@@ -674,6 +678,10 @@ UpdateIndexRelation(Oid indexoid,
  *		will be marked "invalid" and the caller must take additional steps
  *		to fix it up.
  * is_internal: if true, post creation hook for new index
+ * is_reindex: if true, create an index used as the duplicate of an existing
+ *		index during a concurrent operation. Such an index can also be a
+ *		toast index. Sufficient locks are assumed to already be held on the
+ *		related relations when this is called during a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -690,6 +698,7 @@ index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -697,7 +706,8 @@ index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal)
+			 bool is_internal,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -740,19 +750,24 @@ index_create(Relation heapRelation,
 
 	/*
 	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * release locks before committing in catalogs. If the index is created during
+	 * a REINDEX CONCURRENTLY operation, sufficient locks are already taken.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemRelation(heapRelation) &&
+		!is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
 
 	/*
-	 * This case is currently not supported, but there's no way to ask for it
-	 * in the grammar anyway, so it can't happen.
+	 * This case is currently only supported during a concurrent index
+	 * rebuild, and there is no other way to ask for it in the grammar
+	 * anyway. If support for concurrent builds of exclusion constraints
+	 * is added in the future, the similar check in
+	 * check_exclusion_constraint should be changed accordingly as well.
 	 */
-	if (concurrent && is_exclusion)
+	if (concurrent && is_exclusion && !is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -779,14 +794,21 @@ index_create(Relation heapRelation,
 						indexRelationName)));
 
 	/*
-	 * construct tuple descriptor for index tuples
+	 * construct tuple descriptor for index tuples if none was passed
+	 * in by the caller.
 	 */
-	indexTupDesc = ConstructTupleDescriptor(heapRelation,
-											indexInfo,
-											indexColNames,
-											accessMethodObjectId,
-											collationObjectId,
-											classObjectId);
+	if (tupdesc == NULL)
+		indexTupDesc = ConstructTupleDescriptor(heapRelation,
+												indexInfo,
+												indexColNames,
+												accessMethodObjectId,
+												collationObjectId,
+												classObjectId);
+	else
+	{
+		Assert(indexColNames == NIL);
+		indexTupDesc = tupdesc;
+	}
 
 	/*
 	 * Allocate an OID for the index, unless we were told what to use.
@@ -1093,6 +1115,349 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+
+/*
+ * index_concurrent_create
+ *
+ * Create a concurrent index based on the definition of the one provided by
+ * the caller. The index is inserted into the catalogs and needs to be built
+ * later on. This is called during concurrent index processing. The heap
+ * relation on which the index is based needs to be closed by the caller.
+ */
+Oid
+index_concurrent_create(Relation heapRelation, Oid indOid, char *concurrentName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	HeapTuple	indexTuple, classTuple;
+	Datum		indclassDatum, colOptionDatum, optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	bool		initdeferred = false;
+	Oid			constraintOid = get_index_constraint(indOid);
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* Concurrent index uses the same index information as former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/*
+	 * Determine if index is initdeferred, this depends on its dependent
+	 * constraint.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		/* Look for the correct value */
+		HeapTuple			constraintTuple;
+		Form_pg_constraint	constraintForm;
+
+		constraintTuple = SearchSysCache1(CONSTROID,
+									 ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "cache lookup failed for constraint %u",
+				 constraintOid);
+		constraintForm = (Form_pg_constraint) GETSTRUCT(constraintTuple);
+		initdeferred = constraintForm->condeferred;
+
+		ReleaseSysCache(constraintTuple);
+	}
+
+	/*
+	 * Create a copy of the tuple descriptor to be used for the concurrent
+	 * entry, and reset its attcacheoff values to get a fresh version.
+	 */
+	indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
+	ResetTupleDescCache(indexTupDesc);
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, indOid);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 (const char *) concurrentName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 NIL,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexTupDesc,
+								 indexRelation->rd_index->indisprimary,
+								 OidIsValid(constraintOid),	/* is constraint? */
+								 !indexRelation->rd_index->indimmediate,	/* is deferrable? */
+								 initdeferred,	/* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false,	/* is_internal */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
+
+
+/*
+ * index_concurrent_build
+ *
+ * Build an index for a concurrent operation. Low-level locks are taken
+ * during this operation so that only schema changes are blocked; they need
+ * to be kept until the end of the transaction performing this operation.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	heapRel, indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost at the
+	 * commit of the transaction that created this concurrent index at
+	 * the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both relations, and keep the locks */
+	heap_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap the old and new indexes in a concurrent context. An exclusive lock
+ * is taken on both relations while their relfilenodes are swapped.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid, LOCKTAG locktag)
+{
+	Relation		oldIndexRel, newIndexRel, pg_class;
+	HeapTuple		oldIndexTuple, newIndexTuple;
+	Form_pg_class	oldIndexForm, newIndexForm;
+	Oid				tmpnode;
+
+	/*
+	 * Before doing any operation, we need to wait until no running
+	 * transaction could still be using any of the indexes for a query;
+	 * a deadlock could occur if another running transaction tried to
+	 * take the same level of locking as this operation. Hence use
+	 * AccessExclusiveLock to ensure that there is nothing nasty waiting.
+	 */
+	WaitForLockers(locktag, AccessExclusiveLock);
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldIndexRel = relation_open(oldIndexOid, AccessExclusiveLock);
+	newIndexRel = relation_open(newIndexOid, AccessExclusiveLock);
+
+	/* Now swap relfilenode of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+	oldIndexForm = (Form_pg_class) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_class) GETSTRUCT(newIndexTuple);
+
+	/* Here is where the actual swap happens */
+	tmpnode = oldIndexForm->relfilenode;
+	oldIndexForm->relfilenode = newIndexForm->relfilenode;
+	newIndexForm->relfilenode = tmpnode;
+
+	/* Then update the tuples for each relation */
+	simple_heap_update(pg_class, &oldIndexTuple->t_self, oldIndexTuple);
+	simple_heap_update(pg_class, &newIndexTuple->t_self, newIndexTuple);
+	CatalogUpdateIndexes(pg_class, oldIndexTuple);
+	CatalogUpdateIndexes(pg_class, newIndexTuple);
+
+	/* Close relations and clean up */
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+	heap_close(pg_class, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldIndexRel, NoLock);
+	relation_close(newIndexRel, NoLock);
+}
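
The swap performed above can be modeled as a plain exchange of the storage
pointers of two toy pg_class rows: only the relfilenode fields move, while
the OIDs and names stay put, which is why dependencies on the old index
remain valid until it is dropped. This is an illustrative sketch only, not
server code; the server does this under AccessExclusiveLock.

```python
def swap_relfilenodes(pg_class, old_oid, new_oid):
    """Exchange the storage of two toy pg_class rows in place."""
    old_row, new_row = pg_class[old_oid], pg_class[new_oid]
    old_row["relfilenode"], new_row["relfilenode"] = (
        new_row["relfilenode"], old_row["relfilenode"])
```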
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all the backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction doing calling this
+ * function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid, LOCKTAG locktag)
+{
+	Relation	heapRelation, indexRelation;
+
+	/*
+	 * Now we must wait until no running transaction could be using the
+	 * index for a query.  Use AccessExclusiveLock here to check for
+	 * running transactions that hold locks of any kind on the table. Note
+	 * we do not need to worry about xacts that open the table for reading
+	 * after this point; they will see the index as invalid when they open
+	 * the relation.
+	 *
+	 * Note: the reason we use actual lock acquisition here, rather than
+	 * just checking the ProcArray and sleeping, is that deadlock is
+	 * possible if one of the transactions in question is blocked trying
+	 * to acquire an exclusive lock on our table. The lock code will
+	 * detect deadlock and error out properly.
+	 */
+	WaitForLockers(locktag, AccessExclusiveLock);
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're
+	 * about to stop doing inserts into the index which could show
+	 * conflicts with existing predicate locks, so now is the time to move
+	 * them to the heap relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just
+	 * might have it open for updating it.  So now we can unset indisready
+	 * and indislive, then wait till nobody could be using it at all
+	 * anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit
+	 * all sessions will refresh the table's index list.  Forgetting just
+	 * the index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index concurrently as the last step of a concurrent index
+ * process. Deletion is done through performDeletion, or the dependencies
+ * of the index would not get dropped. At this point all the indexes are
+ * already considered invalid and dead, so they can be dropped without
+ * using any concurrent option, as it is certain that they will not
+ * interact with other server sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid				constraintOid = get_index_constraint(indexOid);
+	ObjectAddress	object;
+	Form_pg_index	indexForm;
+	Relation		pg_index;
+	HeapTuple		indexTuple;
+
+	/*
+	 * Check that the index being dropped is not live; if it were, it
+	 * might still be used by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, to avoid dropping a live index.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process.
+	 * Register constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object,
+					DROP_RESTRICT,
+					0);
+}
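
The object-address choice made above can be reduced to a small rule: when
the index backs a constraint, the constraint is registered for deletion so
that the index is removed through its dependency; otherwise the index
itself is. The sketch below is illustrative only; `index_constraints`
stands in for get_index_constraint and is an assumption of this model.

```python
def drop_target(index_oid, index_constraints):
    """Return the toy object address to register for deletion."""
    constraint_oid = index_constraints.get(index_oid)  # InvalidOid -> None
    if constraint_oid is not None:
        return ("pg_constraint", constraint_oid)
    return ("pg_class", index_oid)
```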
+
+
 /*
  * index_constraint_create
  *
@@ -1441,52 +1806,8 @@ index_drop(Oid indexId, bool concurrent)
 		CommitTransactionCommand();
 		StartTransactionCommand();
 
-		/*
-		 * Now we must wait until no running transaction could be using the
-		 * index for a query.  Use AccessExclusiveLock here to check for
-		 * running transactions that hold locks of any kind on the table. Note
-		 * we do not need to worry about xacts that open the table for reading
-		 * after this point; they will see the index as invalid when they open
-		 * the relation.
-		 *
-		 * Note: the reason we use actual lock acquisition here, rather than
-		 * just checking the ProcArray and sleeping, is that deadlock is
-		 * possible if one of the transactions in question is blocked trying
-		 * to acquire an exclusive lock on our table.  The lock code will
-		 * detect deadlock and error out properly.
-		 */
-		WaitForLockers(heaplocktag, AccessExclusiveLock);
-
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId, heaplocktag);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 160f006..1a6ee5a 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -341,8 +341,9 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 BTREE_AM_OID,
 				 rel->rd_rel->reltablespace,
 				 collationObjectId, classObjectId, coloptions, (Datum) 0,
+				 NULL,
 				 true, false, false, false,
-				 true, false, false, true);
+				 true, false, false, false, false);
 
 	heap_close(toast_rel, NoLock);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 3c1e90e..fd3812b 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -68,8 +68,9 @@ static void ComputeIndexAttrs(IndexInfo *indexInfo,
 static Oid GetIndexOpClass(List *opclass, Oid attrType,
 				char *accessMethodName, Oid accessMethodId);
 static char *ChooseIndexName(const char *tabname, Oid namespaceId,
-				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint);
+							 List *colnames, List *exclusionOpNames,
+							 bool primary, bool isconstraint,
+							 bool concurrent);
 static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
@@ -276,6 +277,86 @@ CheckIndexCompatible(Oid oldId,
 }
 
 /*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given
+ * xmin limit, because the index under consideration might not contain
+ * tuples deleted just before the reference snapshot was taken. Obtain a
+ * list of VXIDs of such transactions, and wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int i, n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue; /* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int n_newer_snapshots, j, k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue; /* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
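
The recheck loop in WaitForOlderSnapshots above can be modeled compactly:
before blocking on each remaining vxid, re-fetch the list of transactions
still holding an old-enough snapshot and forget any that have disappeared.
In this sketch, `get_current_vxids` and `wait_for` are stand-ins for
GetCurrentVirtualXIDs and VirtualXactLock and are assumptions made for
illustration, not the server's API.

```python
def wait_for_older_snapshots(old_snapshots, get_current_vxids, wait_for):
    """Wait on each vxid, skipping any that vanish between iterations."""
    for i in range(len(old_snapshots)):
        if old_snapshots[i] is None:
            continue  # found uninteresting in a previous cycle
        if i > 0:
            # See if anything has changed: a vxid missing from the fresh
            # list no longer holds a snapshot we need to wait out.
            newer = set(get_current_vxids())
            for j in range(i, len(old_snapshots)):
                if old_snapshots[j] is not None and old_snapshots[j] not in newer:
                    old_snapshots[j] = None
        if old_snapshots[i] is not None:
            wait_for(old_snapshots[i])
```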
+
+
+/*
  * DefineIndex
  *		Creates a new index.
  *
@@ -312,7 +393,6 @@ DefineIndex(Oid relationId,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	bool		amcanorder;
@@ -322,13 +402,10 @@ DefineIndex(Oid relationId,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
 	Snapshot	snapshot;
-	int			i;
 
 	/*
 	 * count attributes in index
@@ -459,7 +536,8 @@ DefineIndex(Oid relationId,
 											indexColNames,
 											stmt->excludeOpNames,
 											stmt->primary,
-											stmt->isconstraint);
+											stmt->isconstraint,
+											false);
 
 	/*
 	 * look up the access method, verify it can handle the requested features
@@ -606,11 +684,11 @@ DefineIndex(Oid relationId,
 					 indexInfo, indexColNames,
 					 accessMethodId, tablespaceId,
 					 collationObjectId, classObjectId,
-					 coloptions, reloptions, stmt->primary,
+					 coloptions, reloptions, NULL, stmt->primary,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
-					 stmt->concurrent, !check_rights);
+					 stmt->concurrent, !check_rights, false);
 
 	/* Add any requested comment */
 	if (stmt->idxcomment != NULL)
@@ -692,27 +770,15 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/*
 	 * Update the pg_index row to mark the index as ready for inserts. Once we
@@ -777,74 +843,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-										 PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots)		/* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -871,6 +872,559 @@ DefineIndex(Oid relationId,
 
 
 /*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for the given relation Oid. The relation
+ * can be either an index or a table. If a table is specified, each phase
+ * is processed one by one for all of the table's indexes, as well as its
+ * dependent toast indexes if the table has a toast relation defined.
+ */
+bool
+ReindexRelationConcurrently(Oid relationOid)
+{
+	List	   *concurrentIndexIds = NIL,
+			   *indexIds = NIL,
+			   *parentRelationIds = NIL,
+			   *lockTags = NIL,
+			   *relationLocks = NIL;
+	ListCell   *lc, *lc2;
+	Snapshot	snapshot;
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given relation
+	 * Oid is a table, all its valid indexes will be rebuilt, including its
+	 * associated toast table indexes. If the relkind is an index, this index
+	 * itself will be rebuilt. The locks taken on the parent relations and
+	 * the involved indexes are kept until this transaction is committed, to
+	 * protect against schema changes that might occur before a session lock
+	 * is taken on each relation; the session locks similarly protect against
+	 * schema changes during the multiple transactions used by this process.
+	 */
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes
+				 * including toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc2, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc2);
+					Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+						indexIds = lappend_oid(indexIds, cellOid);
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+												ShareUpdateExclusiveLock);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+							indexIds = lappend_oid(indexIds, cellOid);
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(IndexGetRelation(relationOid, false));
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+					indexIds = list_make1_oid(relationOid);
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process of concurrently rebuilding the index entries.
+	 * We first need to create an index which is based on the same data as
+	 * the former index, except that it will only be registered in the
+	 * catalogs and will be built later. This step can be performed for all
+	 * the indexes of a parent relation at once, including the indexes of
+	 * its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId  *lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the parent relation, which might be a toast table */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a relation name for the concurrent index */
+		concurrentName = ChooseIndexName(get_rel_name(indOid),
+										 get_rel_namespace(indexRel->rd_index->indrelid),
+										 NULL,
+										 NULL,
+										 false,
+										 false,
+										 true);
+
+		/* Create the concurrent index based on the given index */
+		concurrentOid = index_concurrent_create(indexParentRel,
+												indOid,
+												concurrentName);
+
+		/*
+		 * Now open the concurrent index relation; a lock is needed on it
+		 * as well.
+		 */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the concurrent index Oid */
+		concurrentIndexIds = lappend_oid(concurrentIndexIds, concurrentOid);
+
+		/*
+		 * Append palloc'd copies of the lockrelids to protect the indexes
+		 * from being dropped; parentRelationIds already covers the parent.
+		 */
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the following visibility checks, as other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId  *lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		LOCKTAG	   *heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Add a copy of the parent relation's lockrelid to the list */
+		*lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid->dbId, lockrelid->relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transaction will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on the parent relation,
+	 * the old index and its concurrent copy, to ensure that none of them
+	 * are dropped until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the concurrent indexes in a separate transaction for each index
+	 * to avoid having open transactions for an unnecessarily long time. A
+	 * concurrent build is done for each concurrent index that will replace
+	 * an old index. Before doing that, we need to wait on the parent
+	 * relations until no running transaction could still have the parent
+	 * table of the index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		bool		primary;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Index relation has been closed by previous commit, so reopen it
+		 * to determine if it is used as a primary key.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		primary = indexRel->rd_index->indisprimary;
+
+		/* Perform the concurrent build of the new index */
+		index_concurrent_build(indexRel->rd_index->indrelid,
+							   concurrentOid,
+							   primary);
+		index_close(indexRel, NoLock);
+
+		/*
+		 * Update the pg_index row of the concurrent index as ready for inserts.
+		 * Once we commit this transaction, any new transactions that open the
+		 * table must insert new entries into the index for insertions and
+		 * non-HOT updates.
+		 */
+		index_set_state_flags(concurrentOid, INDEX_CREATE_SET_READY);
+
+		/* we can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update visible for
+		 * concurrent index.
+		 */
+		CommitTransactionCommand();
+	}
+
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the concurrent indexes catch up with any new tuples
+	 * that were created during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Perform a scan of each concurrent index with the heap, then insert
+	 * any missing index entries.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid				indOid = lfirst_oid(lc);
+		Oid				relOid;
+		TransactionId	limitXmin;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for validating the
+		 * concurrent index.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate the index, which might be a toast index */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save its xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * The index is now valid in the sense that it contains all currently
+		 * interesting tuples. But since it might not contain tuples deleted
+		 * just before the reference snapshot was taken, we have to wait out
+		 * any transactions that might have older snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction to make the concurrent index valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated, it is necessary
+	 * to swap each concurrent index with its corresponding old index. Note
+	 * that the concurrent index used for swapping is not marked as valid,
+	 * because the former index and its concurrent copy need to keep
+	 * different validity statuses: otherwise, if this step failed several
+	 * times in a row for one reason or another, the number of indexes on
+	 * the parent relation could explode. Once this phase is done, each
+	 * concurrent index will be thrown away by the following steps.
+	 */
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		LOCKTAG	   *heapLockTag = NULL;
+		ListCell   *cell;
+		Oid			relOid;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/*
+		 * Each index needs to be swapped in a separate transaction, so start
+		 * a new one.
+		 */
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Find the locktag of the parent table for this index, as we need
+		 * to wait for locks on it before the swap.
+		 */
+		foreach(cell, lockTags)
+		{
+			LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+			if (relOid == localTag->locktag_field2)
+				heapLockTag = localTag;
+		}
+		Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+		/* Swap old index and its concurrent entry */
+		index_concurrent_swap(concurrentOid, indOid, *heapLockTag);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		relOid = IndexGetRelation(indOid, false);
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/* Commit this transaction and make old index invalidation visible */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * The old indexes now hold the fresh relfilenodes of their respective
+	 * concurrent entries. It is time to mark the now-useless concurrent
+	 * entries as not ready, so that they can be safely discarded from write
+	 * operations that may occur on them. One transaction is used for each
+	 * index entry.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		LOCKTAG	   *heapLockTag = NULL;
+		ListCell   *cell;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Find the locktag of the parent table for this index, as we need
+		 * to wait for locks on it.
+		 */
+		foreach(cell, lockTags)
+		{
+			LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+			if (relOid == localTag->locktag_field2)
+				heapLockTag = localTag;
+		}
+		Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+		/*
+		 * Finish the index invalidation and set it as dead. Note that it is
+		 * necessary to wait for virtual locks on the parent relation before
+		 * setting the index as dead.
+		 */
+		index_concurrent_set_dead(relOid, indOid, *heapLockTag);
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the concurrent indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe because all the concurrent entries are
+	 * already invalid and not ready, so they will not be used by other
+	 * backends for any read or write operations.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid indexOid = lfirst_oid(lc);
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start transaction to drop this index */
+		StartTransactionCommand();
+
+		/* Get fresh snapshot for next step */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * The transaction for this drop has already been started above,
+		 * so simply drop the concurrent entry here.
+		 */
+		index_concurrent_drop(indexOid);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * The last thing to do is to release the session-level locks on the
+	 * parent tables and the indexes involved.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Start a new transaction to finish the process properly */
+	StartTransactionCommand();
+
+	/* Get a fresh snapshot for the end of the process */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	return true;
+}
+
+
+/*
  * CheckMutability
  *		Test whether given expression is mutable
  */
@@ -1533,7 +2087,8 @@ ChooseRelationName(const char *name1, const char *name2,
 static char *
 ChooseIndexName(const char *tabname, Oid namespaceId,
 				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint)
+				bool primary, bool isconstraint,
+				bool concurrent)
 {
 	char	   *indexname;
 
@@ -1559,6 +2114,13 @@ ChooseIndexName(const char *tabname, Oid namespaceId,
 									   "key",
 									   namespaceId);
 	}
+	else if (concurrent)
+	{
+		indexname = ChooseRelationName(tabname,
+									   NULL,
+									   "cct",
+									   namespaceId);
+	}
 	else
 	{
 		indexname = ChooseRelationName(tabname,
@@ -1671,18 +2233,22 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation)
+ReindexIndex(RangeVar *indexRelation, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
 
-	/* lock level used here should match index lock reindex_index() */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
-									  false, false,
-									  RangeVarCallbackForReindexIndex,
-									  (void *) &heapOid);
+	indOid = RangeVarGetRelidExtended(indexRelation,
+				concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+				concurrent, concurrent,
+				RangeVarCallbackForReindexIndex,
+				(void *) &heapOid);
 
-	reindex_index(indOid, false);
+	/* Continue process for concurrent or non-concurrent case */
+	if (!concurrent)
+		reindex_index(indOid, false);
+	else
+		ReindexRelationConcurrently(indOid);
 
 	return indOid;
 }
@@ -1751,17 +2317,27 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation)
+ReindexTable(RangeVar *relation, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
-									   RangeVarCallbackOwnsTable, NULL);
-
-	if (!reindex_relation(heapOid,
+	heapOid = RangeVarGetRelidExtended(relation,
+		concurrent ? ShareUpdateExclusiveLock : ShareLock,
+		concurrent, concurrent,
+		RangeVarCallbackOwnsTable, NULL);
+
+	/* Run the concurrent process if necessary */
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid);
+	else
+		result = reindex_relation(heapOid,
 						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS))
+							 REINDEX_REL_CHECK_CONSTRAINTS);
+
+	/* Let the user know if the operation was a no-op */
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -1778,7 +2354,10 @@ ReindexTable(RangeVar *relation)
  * That means this must not be called within a user transaction block!
  */
 Oid
-ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
+ReindexDatabase(const char *databaseName,
+				bool do_system,
+				bool do_user,
+				bool concurrent)
 {
 	Relation	relationRelation;
 	HeapScanDesc scan;
@@ -1790,6 +2369,15 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 
 	AssertArg(databaseName);
 
+	/*
+	 * The CONCURRENTLY option is not allowed for REINDEX SYSTEM, but it
+	 * is allowed for REINDEX DATABASE.
+	 */
+	if (concurrent && !do_user)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot reindex system concurrently")));
+
 	if (strcmp(databaseName, get_database_name(MyDatabaseId)) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1874,17 +2462,42 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result = false;
+		bool		process_concurrent;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS))
+
+		/* Determine if relation needs to be processed concurrently */
+		process_concurrent = concurrent &&
+			!IsSystemNamespace(get_rel_namespace(relid));
+
+		/*
+		 * Reindex the relation with a concurrent or non-concurrent
+		 * process. System relations cannot be reindexed concurrently,
+		 * but they still need to be reindexed with a normal process,
+		 * including pg_class, as they could be corrupted and the
+		 * concurrent process itself relies on them. This does not include
+		 * toast relations, which are reindexed along with their parent.
+		 */
+		if (process_concurrent)
+		{
+			old = MemoryContextSwitchTo(private_context);
+			result = ReindexRelationConcurrently(relid);
+			MemoryContextSwitchTo(old);
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS);
+
+		if (result)
 			ereport(NOTICE,
-					(errmsg("table \"%s.%s\" was reindexed",
+					(errmsg("table \"%s.%s\" was reindexed%s",
 							get_namespace_name(get_rel_namespace(relid)),
-							get_rel_name(relid))));
+							get_rel_name(relid),
+							process_concurrent ? " concurrently" : "")));
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 	}
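To make the flag dance of ReindexRelationConcurrently() easier to review, here is a purely illustrative sketch of the six phases as transitions on a toy pg_index entry. This is reviewer-side pseudocode in Python, not PostgreSQL code; the class and function names are invented here, and only the indisvalid/indisready/relfilenode analogues are modeled:

```python
from itertools import count

_relfilenode = count(1)  # stand-in for on-disk file identifiers

class Index:
    """Toy model of a pg_index entry: only the fields the phases touch."""
    def __init__(self, name, isvalid=True, isready=True):
        self.name = name
        self.isvalid = isvalid            # usable by SELECT
        self.isready = isready            # must receive INSERT/UPDATE entries
        self.relfilenode = next(_relfilenode)

def reindex_concurrently(old):
    # Phase 1: register a catalog-only "cct" twin, invalid and not ready
    new = Index(old.name + "_cct", isvalid=False, isready=False)
    # Phase 2: build it, then mark it ready so writers maintain it
    new.isready = True
    # Phase 3: validate_index() + WaitForOlderSnapshots() complete its
    # contents (no flag change in this simplified model)
    # Phase 4: swap relfilenodes; the old entry keeps its name and
    # validity, while the twin stays invalid and now points at old data
    old.relfilenode, new.relfilenode = new.relfilenode, old.relfilenode
    # Phase 5: mark the leftover twin as not ready (dead to writers)
    new.isready = False
    # Phase 6: drop the twin, as DROP INDEX CONCURRENTLY would
    return old
```

After the call, the surviving entry has the original name, remains valid and ready, and holds a fresh relfilenode, which is the net effect the patch aims for.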
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index ecdff1e..e212da0 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -907,6 +907,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	char		relkind;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -942,7 +943,37 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check for the case of a system index that might have been invalidated
+	 * by a failed concurrent operation, and allow its drop. For the time
+	 * being, this only concerns indexes of toast relations that became
+	 * invalid during a REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) &&
+		relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index d5e1273..8690eeb 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1201,6 +1201,20 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
 	}
 
 	/*
+	 * An invalid index only exists when created in a concurrent context,
+	 * and this code path cannot be reached by CREATE INDEX CONCURRENTLY,
+	 * since that feature is not available for exclusion constraints;
+	 * hence this code path can only be taken by REINDEX CONCURRENTLY.
+	 * In that case a twin of this index exists in parallel, so we can
+	 * bypass this check, as it has already been done on the other index.
+	 * If exclusion constraints are ever supported by CREATE INDEX
+	 * CONCURRENTLY, this will need to be removed or completed especially
+	 * for that purpose.
+	 */
+	if (!index->rd_index->indisvalid)
+		return true;
+
+	/*
 	 * Search the tuples that are in the index for any violations, including
 	 * tuples that aren't visible yet.
 	 */
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 21b070a..eaf9dd3 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3724,6 +3724,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(do_system);
 	COPY_SCALAR_FIELD(do_user);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 358395f..c49ee7a 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1901,6 +1901,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(do_system);
 	COMPARE_SCALAR_FIELD(do_user);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0de9584..4b8067d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7157,35 +7157,38 @@ opt_if_exists: IF_P EXISTS						{ $$ = TRUE; }
  *
  *		QUERY:
  *
- *		REINDEX type <name> [FORCE]
+ *		REINDEX type [CONCURRENTLY] <name> [FORCE]
  *
  * FORCE no longer does anything, but we accept it for backwards compatibility
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_type qualified_name opt_force
+			REINDEX reindex_type opt_concurrently qualified_name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					$$ = (Node *)n;
 				}
-			| REINDEX SYSTEM_P name opt_force
+			| REINDEX SYSTEM_P opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = false;
 					$$ = (Node *)n;
 				}
-			| REINDEX DATABASE name opt_force
+			| REINDEX DATABASE opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = true;
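The three gram.y productions above all accept the shape `REINDEX type [CONCURRENTLY] name [FORCE]`. A toy Python acceptor (illustrative only; this is not the bison grammar, and names here are invented) makes the optional CONCURRENTLY placement explicit:

```python
REINDEX_TYPES = {"INDEX", "TABLE", "SYSTEM", "DATABASE"}

def accepts(stmt):
    """Toy acceptor for: REINDEX type [CONCURRENTLY] name [FORCE]."""
    words = stmt.upper().split()
    if len(words) < 3 or words[0] != "REINDEX" or words[1] not in REINDEX_TYPES:
        return False
    rest = words[2:]
    if rest[0] == "CONCURRENTLY":
        rest = rest[1:]
    # exactly one name, optionally followed by the no-op FORCE keyword
    return len(rest) == 1 or (len(rest) == 2 and rest[1] == "FORCE")
```

Note that CONCURRENTLY goes after the object type, never directly after REINDEX; whether REINDEX SYSTEM CONCURRENTLY is sensible is decided later in ReindexDatabase(), not in the grammar.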
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 4a2a339..5339676 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -744,16 +744,20 @@ standard_ProcessUtility(Node *parsetree,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				switch (stmt->kind)
 				{
 					case OBJECT_INDEX:
-						ReindexIndex(stmt->relation);
+						ReindexIndex(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_TABLE:
 					case OBJECT_MATVIEW:
-						ReindexTable(stmt->relation);
+						ReindexTable(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_DATABASE:
 
@@ -765,8 +769,8 @@ standard_ProcessUtility(Node *parsetree,
 						 */
 						PreventTransactionChain(isTopLevel,
 												"REINDEX DATABASE");
-						ReindexDatabase(stmt->name,
-										stmt->do_system, stmt->do_user);
+						ReindexDatabase(stmt->name, stmt->do_system,
+										stmt->do_user, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 66d80b5..4f2376f 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1643,6 +1643,23 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY is not allowed
+			 * inside a transaction block.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
+
 		return false;
 	}
 
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 886188c..b8b9851 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -144,7 +144,7 @@ static bool completion_case_sensitive;	/* completion is case sensitive */
  * 5) The list of attributes of the given table (possibly schema-qualified).
  * 6/ The list of arguments to the given function (possibly schema-qualified).
  */
-#define COMPLETE_WITH_QUERY(query) \
+#define COMPLETE_WITH_QUERY(query)				\
 do { \
 	completion_charp = query; \
 	matches = completion_matches(text, complete_from_query); \
@@ -2261,7 +2261,9 @@ psql_completion(const char *text, int start, int end)
 			 pg_strcasecmp(prev_wd, "ON") == 0)
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
 	/* If we have CREATE|UNIQUE INDEX <sth> CONCURRENTLY, then add "ON" */
-	else if ((pg_strcasecmp(prev3_wd, "INDEX") == 0 ||
+	else if ((pg_strcasecmp(prev4_wd, "CREATE") == 0 ||
+			  pg_strcasecmp(prev3_wd, "CREATE") == 0) &&
+			 (pg_strcasecmp(prev3_wd, "INDEX") == 0 ||
 			  pg_strcasecmp(prev2_wd, "INDEX") == 0) &&
 			 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
 		COMPLETE_WITH_CONST("ON");
@@ -3334,14 +3336,35 @@ psql_completion(const char *text, int start, int end)
 
 		COMPLETE_WITH_LIST(list_REINDEX);
 	}
-	else if (pg_strcasecmp(prev2_wd, "REINDEX") == 0)
+	else if (pg_strcasecmp(prev2_wd, "REINDEX") == 0 ||
+			 pg_strcasecmp(prev3_wd, "REINDEX") == 0)
 	{
+		/* Complete REINDEX TABLE with a list of tables, and CONCURRENTLY */
 		if (pg_strcasecmp(prev_wd, "TABLE") == 0)
+			COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm,
+				" UNION SELECT 'CONCURRENTLY'");
+		/* Complete REINDEX TABLE CONCURRENTLY with a list of tables */
+		else if (pg_strcasecmp(prev2_wd, "TABLE") == 0 &&
+				 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
 			COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+		/* Complete REINDEX INDEX with a list of indexes, and CONCURRENTLY */
 		else if (pg_strcasecmp(prev_wd, "INDEX") == 0)
+			COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+									   " UNION SELECT 'CONCURRENTLY'");
+		/* Complete REINDEX INDEX CONCURRENTLY with a list of indexes */
+		else if (pg_strcasecmp(prev2_wd, "INDEX") == 0 &&
+				 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
 			COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
-		else if (pg_strcasecmp(prev_wd, "SYSTEM") == 0 ||
-				 pg_strcasecmp(prev_wd, "DATABASE") == 0)
+		/* Complete REINDEX DATABASE with a list of databases, and CONCURRENTLY */
+		else if (pg_strcasecmp(prev_wd, "DATABASE") == 0)
+			COMPLETE_WITH_QUERY(Query_for_list_of_databases
+								" UNION SELECT 'CONCURRENTLY'");
+		/* Complete REINDEX DATABASE CONCURRENTLY with a list of databases */
+		else if (pg_strcasecmp(prev2_wd, "DATABASE") == 0 &&
+				 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
+			COMPLETE_WITH_QUERY(Query_for_list_of_databases);
+		/* Complete REINDEX SYSTEM with a list of databases */
+		else if (pg_strcasecmp(prev_wd, "SYSTEM") == 0)
 			COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 	}
 
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 083f4bd..6b3df50 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -95,6 +95,8 @@ extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
+extern void ResetTupleDescCache(TupleDesc tupdesc);
+
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
 extern void DecrTupleDescRefCount(TupleDesc tupdesc);
 
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 006b180..be42201 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -53,6 +53,7 @@ extern Oid index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -60,7 +61,26 @@ extern Oid index_create(Relation heapRelation,
 			 bool allow_system_table_mods,
 			 bool skip_build,
 			 bool concurrent,
-			 bool is_internal);
+			 bool is_internal,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create(Relation heapRelation,
+								   Oid indOid,
+								   char *concurrentName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid,
+								  Oid oldIndexOid,
+								  LOCKTAG locktag);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid,
+									  LOCKTAG locktag);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern void index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 0ebdbc1..b988555 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -28,10 +28,11 @@ extern Oid DefineIndex(Oid relationId,
 			bool check_rights,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation);
-extern Oid	ReindexTable(RangeVar *relation);
+extern Oid	ReindexIndex(RangeVar *indexRelation, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, bool concurrent);
 extern Oid ReindexDatabase(const char *databaseName,
-				bool do_system, bool do_user);
+							bool do_system, bool do_user, bool concurrent);
+extern bool ReindexRelationConcurrently(Oid relOid);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index cef9544..854551e 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2728,6 +2728,7 @@ typedef struct ReindexStmt
 	const char *name;			/* name of database to reindex */
 	bool		do_system;		/* include system tables in database case */
 	bool		do_user;		/* include user tables in database case */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000..9e04169
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 79a7956..451a415 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -21,6 +21,7 @@ test: delete-abort-savept-2
 test: aborted-keyrevoke
 test: multixact-no-deadlock
 test: multixact-no-forget
+test: reindex-concurrently
 test: propagate-lock-delete
 test: nowait
 test: nowait-2
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000..eb59fe0
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index d903c4b..ef73504 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2808,3 +2808,60 @@ explain (costs off)
    Index Cond: ((thousand = 1) AND (tenthous = 1001))
 (2 rows)
 
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  cannot reindex system concurrently
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+Table "public.concur_reindex_tab"
+ Column |  Type   | Modifiers 
+--------+---------+-----------
+ c1     | integer | not null
+ c2     | text    | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 989fc97..8e99752 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -951,3 +951,45 @@ RESET enable_indexscan;
 
 explain (costs off)
   select * from tenk1 where (thousand, tenthous) in ((1,1001), (null,null));
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
#18Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#12)
Re: REINDEX CONCURRENTLY 2.0

On 10/1/14 3:00 AM, Michael Paquier wrote:

- Use of AccessExclusiveLock when swapping relfilenodes of an index and
its concurrent entry instead of ShareUpdateExclusiveLock for safety. At
the limit of my understanding, that's the consensus reached until now.

I'm very curious about this point. I looked through all the previous
discussions, and the only time I saw this mentioned was at the very
beginning when it was said that we could review the patch while ignoring
this issue and fix it later with MVCC catalog access. Then it got very
technical, but it was never explicitly concluded whether it was possible
to fix this or not.

Also, in the thread "Concurrently option for reindexdb" you pointed out
that requiring an exclusive lock isn't really concurrent and proposed an
option like --minimum-locks.

I will point out again that we specifically invented DROP INDEX
CONCURRENTLY because holding an exclusive lock even briefly isn't good
enough.

If REINDEX cannot work without an exclusive lock, we should invent some
other qualifier, like WITH FEWER LOCKS. It's still useful, but we
shouldn't give people the idea that they can throw away their custom
CREATE INDEX CONCURRENTLY + DROP INDEX CONCURRENTLY scripts.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
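[For context, the custom scripts mentioned above follow a well-known manual rebuild pattern; a minimal sketch, assuming an index idx on column col of table tab (all names illustrative, not taken from the thread):

```sql
-- Fully concurrent manual rebuild of index "idx" (hypothetical names):
CREATE INDEX CONCURRENTLY idx_new ON tab (col);
DROP INDEX CONCURRENTLY idx;
ALTER INDEX idx_new RENAME TO idx;
```

This pattern cannot rebuild constraint-backed indexes (PRIMARY KEY, UNIQUE constraints) in place, which is part of why a true REINDEX CONCURRENTLY is desirable.]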

#19Jeff Janes
jeff.janes@gmail.com
In reply to: Michael Paquier (#17)
Re: REINDEX CONCURRENTLY 2.0

On Wed, Nov 5, 2014 at 8:49 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:

On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Updated patch is attached.

Please find attached an updated patch with the following things changed:
- Addition of tab completion in psql for all new commands
- Addition of a call to WaitForLockers in index_concurrent_swap to
ensure that there are no transactions still running on the parent table
before exclusive locks are taken on the index and its
concurrent entry. Previous patch versions created deadlocks because of
that, issue spotted by the isolation tests integrated in the patch.
- Isolation tests for reindex concurrently are re-enabled by default.
Regards,

It looks like this needs another rebase, I get failures
on index.c, toasting.c, indexcmds.c, and index.h

Thanks,

Jeff

#20Michael Paquier
michael.paquier@gmail.com
In reply to: Jeff Janes (#19)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On Tue, Nov 11, 2014 at 3:24 AM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Wed, Nov 5, 2014 at 8:49 PM, Michael Paquier <michael.paquier@gmail.com> wrote:

On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Updated patch is attached.

Please find attached an updated patch with the following things changed:
- Addition of tab completion in psql for all new commands
- Addition of a call to WaitForLockers in index_concurrent_swap to
ensure that there are no transactions still running on the parent table
before exclusive locks are taken on the index and its
concurrent entry. Previous patch versions created deadlocks because of
that, issue spotted by the isolation tests integrated in the patch.
- Isolation tests for reindex concurrently are re-enabled by default.
Regards,

It looks like this needs another rebase, I get failures on index.c, toasting.c, indexcmds.c, and index.h

Indeed. There are some conflicts created by the recent modification of
index_create. Here is a rebased patch.
--
Michael

Attachments:

20141110_reindex_concurrently_3_v4.patchapplication/x-patch; name=20141110_reindex_concurrently_3_v4.patchDownload
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index cd55be8..653b120 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -864,7 +864,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
-         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
+         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
+         <command>REINDEX CONCURRENTLY</>,
          <command>ALTER TABLE VALIDATE</command> and other
          <command>ALTER TABLE</command> variants (for full details see
          <xref linkend="SQL-ALTERTABLE">).
@@ -1143,7 +1144,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
    <sect2 id="locking-pages">
     <title>Page-level Locks</title>
-  
+
     <para>
      In addition to table and row locks, page-level share/exclusive locks are
      used to control read/write access to table pages in the shared buffer
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index cabae19..285f3ff 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -68,9 +68,12 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
       An index build with the <literal>CONCURRENTLY</> option failed, leaving
       an <quote>invalid</> index. Such indexes are useless but it can be
       convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      <command>REINDEX</> will perform a concurrent build if <literal>
+      CONCURRENTLY</> is specified. To build the index without interfering
+      with production you should drop the index and reissue either the
+      <command>CREATE INDEX CONCURRENTLY</> or <command>REINDEX CONCURRENTLY</>
+      command. Indexes of toast relations can be rebuilt with <command>REINDEX
+      CONCURRENTLY</>.
      </para>
     </listitem>
 
@@ -139,6 +142,21 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    </varlistentry>
 
    <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>FORCE</literal></term>
     <listitem>
      <para>
@@ -218,6 +236,194 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
    reindex anything.
   </para>
 
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</> option of <command>REINDEX</>. When this option
+    is used, <productname>PostgreSQL</> must perform two scans of the table
+    for each index that needs to be rebuild and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created, where all
+    the concurrent entries are created using only one transaction. Note that
+    if there are multiple indexes to be rebuilt then each step loops through
+    all the indexes we're rebuilding, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition is only used to build the
+       new index, and will be removed at the completion of the process. This
+       step is done as a single transaction for all the indexes involved in
+       this process, meaning that if <command>REINDEX CONCURRENTLY</> is
+       run on a table with multiple indexes, all the catalog entries of the
+       temporary indexes are created within a single transaction. A
+       <literal>SHARE UPDATE EXCLUSIVE</literal> lock at session level is taken
+       on the indexes being reindexed as well as their parent tables to prevent
+       any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each temporary entry.
+       Once the index is built, its flag <literal>pg_index.indisready</> is
+       switched to <quote>true</> to make it ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while
+       the first pass was running. Once the validation of the index
+       related to the temporary entry is done, a cache invalidation is
+       performed so that all the sessions that referenced this index in
+       any cached plans will refresh them. This step is performed within
+       a single transaction
+       for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       <literal>pg_class.relfilenode</> for the existing index definition
+       and the temporary definition are swapped. This means that the existing
+       index definition now uses the index data that we stored during the
+       build, and the temporary definition is using the old index data. Again
+       a cache invalidation is performed to refresh any sessions that may refer
+       to the previous index definition. Note that at this point
+       <literal>pg_index.indisvalid</> is not switched to <quote>true</>,
+       making the temporary index definition ignored by any read query;
+       this is needed because a toast relation can have only a single index
+       in ready state at a time. During the swap an exclusive lock is taken
+       on the index and its temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Temporary entries have <literal>pg_index.indisready</> switched to
+       <quote>false</> to prevent any new tuple insertions. This step
+       is done within a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The temporary index definition and its data (which is now the
+       data for the old index) are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the concurrent
+    index and try again to perform <command>REINDEX CONCURRENTLY</>.
+    The concurrent index created during the processing has a name ending in
+    the suffix <literal>cct</>. If an invalid concurrent index is based on a
+    <literal>PRIMARY KEY</> or an exclusion constraint, it can be dropped
+    with <literal>ALTER TABLE DROP CONSTRAINT</>. The same applies to
+    <literal>UNIQUE</> indexes backed by constraints. Other indexes,
+    including invalid toast indexes, can be dropped using
+    <literal>DROP INDEX</>.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot. <command>REINDEX DATABASE</> is
+    by default not allowed to run inside a transaction block, so in this case
+    <command>CONCURRENTLY</> is not supported.
+   </para>
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>. Valid toast indexes cannot be
+    dropped, as each toast relation must keep exactly one valid index.
+   </para>
+
+   <para>
+    <command>REINDEX DATABASE</command> used with <command>CONCURRENTLY
+    </command> rebuilds concurrently only the non-system relations. System
+    relations are rebuilt non-concurrently. Toast indexes are
+    rebuilt concurrently if the relation they depend on is a non-system
+    relation.
+   </para>
+
+   <para>
+    <command>REINDEX</command> takes an <literal>ACCESS EXCLUSIVE</literal>
+    lock on all the relations involved in the operation. When
+    <command>CONCURRENTLY</command> is specified, the operation is done with
+    <literal>SHARE UPDATE EXCLUSIVE</literal> locks, except when an index and
+    its concurrent entry are swapped, at which point an
+    <literal>ACCESS EXCLUSIVE</literal> lock is taken on the parent relation.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command>.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -249,7 +455,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild the indexes of a table while allowing read and write operations
+   on the relations involved:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index f3b3689..138be1c 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -259,6 +259,18 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 }
 
 /*
+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+	int i;
+
+	for (i = 0; i < tupdesc->natts; i++)
+		tupdesc->attrs[i]->attcacheoff = -1;
+}
+
+/*
  * Free a TupleDesc including all substructure
  */
 void
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 912038a..66020ba 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -44,9 +44,11 @@
 #include "catalog/pg_trigger.h"
 #include "catalog/pg_type.h"
 #include "catalog/storage.h"
+#include "commands/defrem.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -63,6 +65,7 @@
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/ruleutils.h"
 #include "utils/syscache.h"
 #include "utils/tuplesort.h"
 #include "utils/snapmgr.h"
@@ -663,6 +666,7 @@ UpdateIndexRelation(Oid indexoid,
  * classObjectId: array of index opclass OIDs, one per index column
  * coloptions: array of per-index-column indoption settings
  * reloptions: AM-specific options
+ * tupdesc: tuple descriptor to use for the index, or NULL to construct one
  * isprimary: index is a PRIMARY KEY
  * isconstraint: index is owned by PRIMARY KEY, UNIQUE, or EXCLUSION constraint
  * deferrable: constraint is DEFERRABLE
@@ -676,6 +680,10 @@ UpdateIndexRelation(Oid indexoid,
  * is_internal: if true, post creation hook for new index
  * if_not_exists: if true, do not throw an error if a relation with
  * 	the same name already exists.
+ * is_reindex: if true, create an index as a duplicate of an existing index
+ *		during a concurrent operation. Such an index can also be a toast
+ *		index. Sufficient locks are assumed to be already taken on the
+ *		related relations when this is called during a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -692,6 +700,7 @@ index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -700,7 +709,8 @@ index_create(Relation heapRelation,
 			 bool skip_build,
 			 bool concurrent,
 			 bool is_internal,
-			 bool if_not_exists)
+			 bool if_not_exists,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -743,19 +753,24 @@ index_create(Relation heapRelation,
 
 	/*
 	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * release locks before committing in catalogs. If the index is created
+	 * during REINDEX CONCURRENTLY, sufficient locks are already taken.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemRelation(heapRelation) &&
+		!is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
 
 	/*
-	 * This case is currently not supported, but there's no way to ask for it
-	 * in the grammar anyway, so it can't happen.
+	 * This case is currently supported only during a concurrent index
+	 * rebuild; there is no other way to ask for it in the grammar anyway.
+	 * If support for exclusion constraints is added in the future, the
+	 * similar check in check_exclusion_constraint should be changed
+	 * accordingly.
 	 */
-	if (concurrent && is_exclusion)
+	if (concurrent && is_exclusion && !is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -794,14 +809,21 @@ index_create(Relation heapRelation,
 	}
 
 	/*
-	 * construct tuple descriptor for index tuples
+	 * construct a tuple descriptor for index tuples, unless the caller
+	 * passed one in.
 	 */
-	indexTupDesc = ConstructTupleDescriptor(heapRelation,
-											indexInfo,
-											indexColNames,
-											accessMethodObjectId,
-											collationObjectId,
-											classObjectId);
+	if (tupdesc == NULL)
+		indexTupDesc = ConstructTupleDescriptor(heapRelation,
+												indexInfo,
+												indexColNames,
+												accessMethodObjectId,
+												collationObjectId,
+												classObjectId);
+	else
+	{
+		Assert(indexColNames == NIL);
+		indexTupDesc = tupdesc;
+	}
 
 	/*
 	 * Allocate an OID for the index, unless we were told what to use.
@@ -1108,6 +1130,350 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+
+/*
+ * index_concurrent_create
+ *
+ * Create a concurrent index based on the definition of the index provided
+ * by the caller. The new index is inserted into the catalogs and needs to
+ * be built later on. This is called during concurrent index processing.
+ * The heap relation on which the index is based needs to be closed by the
+ * caller.
+ */
+Oid
+index_concurrent_create(Relation heapRelation, Oid indOid, char *concurrentName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	HeapTuple	indexTuple, classTuple;
+	Datum		indclassDatum, colOptionDatum, optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	bool		initdeferred = false;
+	Oid			constraintOid = get_index_constraint(indOid);
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* Concurrent index uses the same index information as former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/*
+	 * Determine if index is initdeferred, this depends on its dependent
+	 * constraint.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		/* Look for the correct value */
+		HeapTuple			constraintTuple;
+		Form_pg_constraint	constraintForm;
+
+		constraintTuple = SearchSysCache1(CONSTROID,
+									 ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "cache lookup failed for constraint %u",
+				 constraintOid);
+		constraintForm = (Form_pg_constraint) GETSTRUCT(constraintTuple);
+		initdeferred = constraintForm->condeferred;
+
+		ReleaseSysCache(constraintTuple);
+	}
+
+	/*
+	 * Create a copy of the tuple descriptor to be used for the concurrent
+	 * entry, and reset any cached attribute offsets in it to get a fresh
+	 * version.
+	indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
+	ResetTupleDescCache(indexTupDesc);
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, indOid);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 (const char *) concurrentName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 NIL,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexTupDesc,
+								 indexRelation->rd_index->indisprimary,
+								 OidIsValid(constraintOid),	/* is constraint? */
+								 !indexRelation->rd_index->indimmediate,	/* is deferrable? */
+								 initdeferred,	/* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false,	/* is_internal? */
+								 false, /* if_not_exists? */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
+
+
+/*
+ * index_concurrent_build
+ *
+ * Build an index for a concurrent operation. The low-level locks taken
+ * while this operation is performed only prevent schema changes, but they
+ * need to be kept until the end of the transaction performing it.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	heapRel, indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost at the
+	 * commit of the transaction in which this concurrent index was
+	 * created at the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both relations, and keep the locks */
+	heap_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap the old and new indexes in a concurrent context. An exclusive lock
+ * is taken on both relations while their relfilenodes are swapped.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid, LOCKTAG locktag)
+{
+	Relation		oldIndexRel, newIndexRel, pg_class;
+	HeapTuple		oldIndexTuple, newIndexTuple;
+	Form_pg_class	oldIndexForm, newIndexForm;
+	Oid				tmpnode;
+
+	/*
+	 * Before doing any operation, we need to wait until no running
+	 * transaction could be using any of the indexes for a query, as a
+	 * deadlock could occur if another running transaction tried to take
+	 * a lock at the same level as this operation. Hence wait with
+	 * AccessExclusiveLock to ensure that nothing nasty is pending.
+	 */
+	WaitForLockers(locktag, AccessExclusiveLock);
+
+	/*
+	 * Take the necessary locks on the old and new indexes before swapping them.
+	 */
+	oldIndexRel = relation_open(oldIndexOid, AccessExclusiveLock);
+	newIndexRel = relation_open(newIndexOid, AccessExclusiveLock);
+
+	/* Now swap relfilenode of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+	oldIndexForm = (Form_pg_class) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_class) GETSTRUCT(newIndexTuple);
+
+	/* Here is where the actual swap happens */
+	tmpnode = oldIndexForm->relfilenode;
+	oldIndexForm->relfilenode = newIndexForm->relfilenode;
+	newIndexForm->relfilenode = tmpnode;
+
+	/* Then update the tuples for each relation */
+	simple_heap_update(pg_class, &oldIndexTuple->t_self, oldIndexTuple);
+	simple_heap_update(pg_class, &newIndexTuple->t_self, newIndexTuple);
+	CatalogUpdateIndexes(pg_class, oldIndexTuple);
+	CatalogUpdateIndexes(pg_class, newIndexTuple);
+
+	/* Close relations and clean up */
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+	heap_close(pg_class, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldIndexRel, NoLock);
+	relation_close(newIndexRel, NoLock);
+}
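To make the core of the swap easier to review, here is a minimal, self-contained sketch of the relfilenode exchange (the `FakeClassForm` type is hypothetical, not the real pg_class machinery): under AccessExclusiveLock the old and new index entries simply trade their on-disk file identifiers, so the original index OID ends up pointing at the freshly built storage.

```c
#include <assert.h>

/* Hypothetical stand-in for the pg_class row of an index. */
typedef struct FakeClassForm
{
	unsigned int relfilenode;	/* on-disk file identifier */
} FakeClassForm;

/* Exchange the relfilenodes of two index entries via a temporary. */
static void
swap_relfilenodes(FakeClassForm *oldidx, FakeClassForm *newidx)
{
	unsigned int tmpnode = oldidx->relfilenode;

	oldidx->relfilenode = newidx->relfilenode;
	newidx->relfilenode = tmpnode;
}
```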
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all backends as dead. The low-level locks
+ * taken here are kept until the end of the transaction calling this
+ * function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid, LOCKTAG locktag)
+{
+	Relation	heapRelation, indexRelation;
+
+	/*
+	 * Now we must wait until no running transaction could be using the
+	 * index for a query.  Use AccessExclusiveLock here to check for
+	 * running transactions that hold locks of any kind on the table. Note
+	 * we do not need to worry about xacts that open the table for reading
+	 * after this point; they will see the index as invalid when they open
+	 * the relation.
+	 *
+	 * Note: the reason we use actual lock acquisition here, rather than
+	 * just checking the ProcArray and sleeping, is that deadlock is
+	 * possible if one of the transactions in question is blocked trying
+	 * to acquire an exclusive lock on our table. The lock code will
+	 * detect deadlock and error out properly.
+	 */
+	WaitForLockers(locktag, AccessExclusiveLock);
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're
+	 * about to stop doing inserts into the index which could show
+	 * conflicts with existing predicate locks, so now is the time to move
+	 * them to the heap relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just
+	 * might have it open for updating it.  So now we can unset indisready
+	 * and indislive, then wait till nobody could be using it at all
+	 * anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit
+	 * all sessions will refresh the table's index list.  Forgetting just
+	 * the index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index concurrently as the last step of concurrent index
+ * processing. Deletion is done through performDeletion, or the dependencies
+ * of the index would not get dropped. At this point the indexes are already
+ * considered invalid and dead, so they can be dropped without any
+ * concurrent option, as it is certain that they will not interact with
+ * other server sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid				constraintOid = get_index_constraint(indexOid);
+	ObjectAddress	object;
+	Form_pg_index	indexForm;
+	Relation		pg_index;
+	HeapTuple		indexTuple;
+
+	/*
+	 * Check that the index being dropped is not alive; if it were, it
+	 * might still be in use by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process.
+	 * Register constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object,
+					DROP_RESTRICT,
+					0);
+}
+
+
 /*
  * index_constraint_create
  *
@@ -1456,52 +1822,8 @@ index_drop(Oid indexId, bool concurrent)
 		CommitTransactionCommand();
 		StartTransactionCommand();
 
-		/*
-		 * Now we must wait until no running transaction could be using the
-		 * index for a query.  Use AccessExclusiveLock here to check for
-		 * running transactions that hold locks of any kind on the table. Note
-		 * we do not need to worry about xacts that open the table for reading
-		 * after this point; they will see the index as invalid when they open
-		 * the relation.
-		 *
-		 * Note: the reason we use actual lock acquisition here, rather than
-		 * just checking the ProcArray and sleeping, is that deadlock is
-		 * possible if one of the transactions in question is blocked trying
-		 * to acquire an exclusive lock on our table.  The lock code will
-		 * detect deadlock and error out properly.
-		 */
-		WaitForLockers(heaplocktag, AccessExclusiveLock);
-
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId, heaplocktag);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 5ef6dcc..e1992ad 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -341,8 +341,9 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 BTREE_AM_OID,
 				 rel->rd_rel->reltablespace,
 				 collationObjectId, classObjectId, coloptions, (Datum) 0,
+				 NULL,
 				 true, false, false, false,
-				 true, false, false, true, false);
+				 true, false, false, true, false, false);
 
 	heap_close(toast_rel, NoLock);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 0205595..b3c1db5 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -68,8 +68,9 @@ static void ComputeIndexAttrs(IndexInfo *indexInfo,
 static Oid GetIndexOpClass(List *opclass, Oid attrType,
 				char *accessMethodName, Oid accessMethodId);
 static char *ChooseIndexName(const char *tabname, Oid namespaceId,
-				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint);
+							 List *colnames, List *exclusionOpNames,
+							 bool primary, bool isconstraint,
+							 bool concurrent);
 static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
@@ -276,6 +277,86 @@ CheckIndexCompatible(Oid oldId,
 }
 
 /*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given
+ * xmin limit, because the index built under that limit might not contain
+ * tuples deleted just before the reference snapshot was taken. Obtain a
+ * list of the VXIDs of such transactions, and wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int i, n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue; /* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int n_newer_snapshots, j, k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue; /* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
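The rechecking loop in the refactored function can be illustrated with a simplified, self-contained model (plain ints instead of `VirtualTransactionId`; the helper name is hypothetical): on each pass, any old vxid that no longer shows up in a fresh listing is marked invalid, here as 0, so we never sleep on a transaction that has already gone idle with xmin zero.

```c
#include <assert.h>

/* Simplified sketch of the "forget vxids that disappeared" pass:
 * old_vxids entries set to 0 are treated as already invalidated. */
static void
forget_missing_vxids(int *old_vxids, int n_old,
					 const int *newer, int n_newer)
{
	int			j,
				k;

	for (j = 0; j < n_old; j++)
	{
		if (old_vxids[j] == 0)
			continue;			/* found uninteresting in previous cycle */
		for (k = 0; k < n_newer; k++)
		{
			if (old_vxids[j] == newer[k])
				break;
		}
		if (k >= n_newer)		/* not there anymore */
			old_vxids[j] = 0;
	}
}
```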
+
+
+/*
  * DefineIndex
  *		Creates a new index.
  *
@@ -312,7 +393,6 @@ DefineIndex(Oid relationId,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	bool		amcanorder;
@@ -322,13 +402,10 @@ DefineIndex(Oid relationId,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
 	Snapshot	snapshot;
-	int			i;
 
 	/*
 	 * count attributes in index
@@ -459,7 +536,8 @@ DefineIndex(Oid relationId,
 											indexColNames,
 											stmt->excludeOpNames,
 											stmt->primary,
-											stmt->isconstraint);
+											stmt->isconstraint,
+											false);
 
 	/*
 	 * look up the access method, verify it can handle the requested features
@@ -606,12 +684,12 @@ DefineIndex(Oid relationId,
 					 indexInfo, indexColNames,
 					 accessMethodId, tablespaceId,
 					 collationObjectId, classObjectId,
-					 coloptions, reloptions, stmt->primary,
+					 coloptions, reloptions, NULL, stmt->primary,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
 					 stmt->concurrent, !check_rights,
-					 stmt->if_not_exists);
+					 stmt->if_not_exists, false);
 
 	if (!OidIsValid(indexRelationId))
 	{
@@ -699,27 +777,15 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/*
 	 * Update the pg_index row to mark the index as ready for inserts. Once we
@@ -784,74 +850,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-										 PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots)		/* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -878,6 +879,559 @@ DefineIndex(Oid relationId,
 
 
 /*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for a given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each phase is
+ * processed one by one for each of the table's indexes, as well as the
+ * indexes of its toast table if the table has a toast relation defined.
+ */
+bool
+ReindexRelationConcurrently(Oid relationOid)
+{
+	List	   *concurrentIndexIds = NIL,
+			   *indexIds = NIL,
+			   *parentRelationIds = NIL,
+			   *lockTags = NIL,
+			   *relationLocks = NIL;
+	ListCell   *lc, *lc2;
+	Snapshot	snapshot;
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given relation
+	 * Oid is a table, all its valid indexes will be rebuilt, including the
+	 * indexes of its associated toast table, if any. If the relkind is an
+	 * index, this index itself will be rebuilt. The locks taken on the
+	 * parent relations and the involved indexes are kept until this
+	 * transaction is committed, to protect against schema changes that
+	 * might occur before a session lock is taken on each relation; the
+	 * session locks similarly protect against any schema change that could
+	 * happen within the multiple transactions that are used during this
+	 * process.
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes
+				 * including toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc2, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc2);
+					Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+						indexIds = lappend_oid(indexIds, cellOid);
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+												ShareUpdateExclusiveLock);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+							indexIds = lappend_oid(indexIds, cellOid);
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(IndexGetRelation(relationOid, false));
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+					indexIds = list_make1_oid(relationOid);
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We first need to create an index based on the same definition as the
+	 * former index, except that it is only registered in the catalogs and
+	 * will be built later. These operations are performed at once for all
+	 * the indexes of a parent relation, including the indexes of its toast
+	 * relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId	lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent relation; it might be a toast relation */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a relation name for concurrent index */
+		concurrentName = ChooseIndexName(get_rel_name(indOid),
+										 get_rel_namespace(indexRel->rd_index->indrelid),
+										 NULL,
+										 NULL,
+										 false,
+										 false,
+										 true);
+
+		/* Create concurrent index based on given index */
+		concurrentOid = index_concurrent_create(indexParentRel,
+												indOid,
+												concurrentName);
+
+		/*
+		 * Now open the relation of the concurrent index; a lock is also
+		 * needed on it.
+		 */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the concurrent index Oid */
+		concurrentIndexIds = lappend_oid(concurrentIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelids (in palloc'd copies, as the stack variable
+		 * does not survive this loop) to protect each concurrent relation
+		 * from being dropped before the relations are closed. The lockrelid
+		 * of the parent relation is not saved here to avoid taking multiple
+		 * locks on the same relation; we rely instead on parentRelationIds
+		 * built earlier.
+		 */
+		lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+		lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the following visibility checks; other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId	lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		LOCKTAG	   *heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/*
+		 * Add a palloc'd copy of the lockrelid of the parent relation to
+		 * the list of locked relations; the stack copy would not survive
+		 * this loop.
+		 */
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid.dbId, lockrelid.relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transaction will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on the parent relation,
+	 * the old index and its concurrent copy to ensure that none of them are
+	 * dropped until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the concurrent indexes in a separate transaction for each index
+	 * to avoid having open transactions for an unnecessarily long time. A
+	 * concurrent build is done for each concurrent index that will replace
+	 * an old index. Before doing that, we need to wait on the parent
+	 * relations until no running transaction could have the parent table
+	 * of an index open.
+	 */
+
+	/* Wait on all transactions holding locks on the parent relations */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		bool		primary;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * The index relation has been closed by the previous commit, so
+		 * reopen it to determine if it is used as a primary key. Note that
+		 * the parent relation Oid is looked up fresh rather than read from
+		 * the already-closed relcache entry.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		primary = indexRel->rd_index->indisprimary;
+		index_close(indexRel, NoLock);
+
+		/* Perform concurrent build of new index */
+		index_concurrent_build(IndexGetRelation(concurrentOid, false),
+							   concurrentOid,
+							   primary);
+
+		/*
+		 * Update the pg_index row of the concurrent index as ready for inserts.
+		 * Once we commit this transaction, any new transactions that open the
+		 * table must insert new entries into the index for insertions and
+		 * non-HOT updates.
+		 */
+		index_set_state_flags(concurrentOid, INDEX_CREATE_SET_READY);
+
+		/* we can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update visible
+		 * for the concurrent index.
+		 */
+		CommitTransactionCommand();
+	}
+
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the concurrent indexes catch up with any new tuples
+	 * that were created during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Wait on all transactions holding locks on the parent relations */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Perform a scan of each concurrent index with the heap, then insert
+	 * any missing index entries.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid				indOid = lfirst_oid(lc);
+		Oid				relOid;
+		TransactionId	limitXmin;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take a reference snapshot that will be used for the validation of
+		 * this concurrent index.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate index, which might be a toast */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * This concurrent index is now valid as it contains all the tuples
+		 * necessary. However, it might not have taken into account tuples
+		 * deleted before the reference snapshot was taken, so we need to
+		 * wait for the transactions that might have older snapshots than
+		 * ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction to make the concurrent index valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated, swap each
+	 * concurrent index with its corresponding old index. Note that the
+	 * concurrent index used for the swap is not marked as valid: keeping the
+	 * former index and the concurrent index with different validity statuses
+	 * avoids an explosion in the number of indexes a parent relation could
+	 * accumulate if this step fails repeatedly for one reason or another.
+	 * Once this phase is done, each concurrent index will be dropped in the
+	 * remaining steps.
+	 */
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		LOCKTAG	   *heapLockTag = NULL;
+		ListCell   *cell;
+		Oid			relOid;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/*
+		 * Each index needs to be swapped in a separate transaction, so start
+		 * a new one.
+		 */
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Find the locktag of the parent table for this index; we need to
+		 * wait for locks on it before the swap.
+		 */
+		foreach(cell, lockTags)
+		{
+			LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+			if (relOid == localTag->locktag_field2)
+				heapLockTag = localTag;
+		}
+		Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+		/* Swap old index and its concurrent entry */
+		index_concurrent_swap(concurrentOid, indOid, *heapLockTag);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		relOid = IndexGetRelation(indOid, false);
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/* Commit this transaction and make old index invalidation visible */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * The old indexes now hold the fresh relfilenodes of their respective
+	 * concurrent entries. It is time to mark the now-useless concurrent
+	 * entries as not ready so that they can be safely discarded from write
+	 * operations that may occur on them. One transaction is used for each
+	 * single index entry.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		LOCKTAG	   *heapLockTag = NULL;
+		ListCell   *cell;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Find the locktag of the parent table for this index; we need to
+		 * wait for locks on it.
+		 */
+		foreach(cell, lockTags)
+		{
+			LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+			if (relOid == localTag->locktag_field2)
+				heapLockTag = localTag;
+		}
+		Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+		/*
+		 * Finish the index invalidation and set it as dead. Note that it is
+		 * necessary to wait for virtual locks on the parent relation before
+		 * setting the index as dead.
+		 */
+		index_concurrent_set_dead(relOid, indOid, *heapLockTag);
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the concurrent indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe as all the concurrent entries are already
+	 * considered invalid and not ready, so they will not be used by other
+	 * backends for any read or write operation.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid indexOid = lfirst_oid(lc);
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start transaction to drop this index */
+		StartTransactionCommand();
+
+		/* Get fresh snapshot for next step */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Drop this concurrent index, with the same handling as DROP INDEX
+		 * CONCURRENTLY.
+		 */
+		index_concurrent_drop(indexOid);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * The last thing to do is release the session-level locks on the parent
+	 * tables and the indexes of the tables.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Start a new transaction to finish the process properly */
+	StartTransactionCommand();
+
+	/* Get a fresh snapshot for the end of the process */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	return true;
+}
+
+
+/*
  * CheckMutability
  *		Test whether given expression is mutable
  */
@@ -1540,7 +2094,8 @@ ChooseRelationName(const char *name1, const char *name2,
 static char *
 ChooseIndexName(const char *tabname, Oid namespaceId,
 				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint)
+				bool primary, bool isconstraint,
+				bool concurrent)
 {
 	char	   *indexname;
 
@@ -1566,6 +2121,13 @@ ChooseIndexName(const char *tabname, Oid namespaceId,
 									   "key",
 									   namespaceId);
 	}
+	else if (concurrent)
+	{
+		indexname = ChooseRelationName(tabname,
+									   NULL,
+									   "cct",
+									   namespaceId);
+	}
 	else
 	{
 		indexname = ChooseRelationName(tabname,
@@ -1678,18 +2240,22 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation)
+ReindexIndex(RangeVar *indexRelation, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
 
-	/* lock level used here should match index lock reindex_index() */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
-									  false, false,
-									  RangeVarCallbackForReindexIndex,
-									  (void *) &heapOid);
+	indOid = RangeVarGetRelidExtended(indexRelation,
+				concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+				concurrent, concurrent,
+				RangeVarCallbackForReindexIndex,
+				(void *) &heapOid);
 
-	reindex_index(indOid, false);
+	/* Continue process for concurrent or non-concurrent case */
+	if (!concurrent)
+		reindex_index(indOid, false);
+	else
+		ReindexRelationConcurrently(indOid);
 
 	return indOid;
 }
@@ -1758,17 +2324,27 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation)
+ReindexTable(RangeVar *relation, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
-									   RangeVarCallbackOwnsTable, NULL);
-
-	if (!reindex_relation(heapOid,
+	heapOid = RangeVarGetRelidExtended(relation,
+		concurrent ? ShareUpdateExclusiveLock : ShareLock,
+		concurrent, concurrent,
+		RangeVarCallbackOwnsTable, NULL);
+
+	/* Run the concurrent process if necessary */
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid);
+	else
+		result = reindex_relation(heapOid,
 						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS))
+							 REINDEX_REL_CHECK_CONSTRAINTS);
+
+	/* Let the user know if the operation was a no-op */
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -1785,7 +2361,10 @@ ReindexTable(RangeVar *relation)
  * That means this must not be called within a user transaction block!
  */
 Oid
-ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
+ReindexDatabase(const char *databaseName,
+				bool do_system,
+				bool do_user,
+				bool concurrent)
 {
 	Relation	relationRelation;
 	HeapScanDesc scan;
@@ -1797,6 +2376,15 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 
 	AssertArg(databaseName);
 
+	/*
+	 * The CONCURRENTLY option is not allowed for REINDEX SYSTEM, but it is
+	 * for REINDEX DATABASE.
+	 */
+	if (concurrent && !do_user)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot reindex system catalogs concurrently")));
+
 	if (strcmp(databaseName, get_database_name(MyDatabaseId)) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1881,17 +2469,42 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result = false;
+		bool		process_concurrent;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS))
+
+		/* Determine if relation needs to be processed concurrently */
+		process_concurrent = concurrent &&
+			!IsSystemNamespace(get_rel_namespace(relid));
+
+		/*
+		 * Reindex the relation with a concurrent or non-concurrent process.
+		 * System relations cannot be reindexed concurrently, but they still
+		 * need to be reindexed (including pg_class) with the normal process,
+		 * as they could be corrupted and the concurrent process itself uses
+		 * them. This does not include toast relations, which are reindexed
+		 * when their parent relation is processed.
+		 */
+		if (process_concurrent)
+		{
+			old = MemoryContextSwitchTo(private_context);
+			result = ReindexRelationConcurrently(relid);
+			MemoryContextSwitchTo(old);
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS);
+
+		if (result)
 			ereport(NOTICE,
-					(errmsg("table \"%s.%s\" was reindexed",
+					(errmsg("table \"%s.%s\" was reindexed%s",
 							get_namespace_name(get_rel_namespace(relid)),
-							get_rel_name(relid))));
+							get_rel_name(relid),
+							process_concurrent ? " concurrently" : "")));
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 	}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 714a9f1..9875f1a 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -908,6 +908,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	char		relkind;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -943,7 +944,37 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check the case of a system index that might have been invalidated by a
+	 * failed concurrent process and allow its drop. For the time being, this
+	 * only concerns indexes of toast relations that became invalid during a
+	 * REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) &&
+		relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index d5e1273..8690eeb 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1201,6 +1201,20 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
 	}
 
 	/*
+	 * An invalid index can only exist when created in a concurrent context.
+	 * Since this code path cannot be taken by CREATE INDEX CONCURRENTLY,
+	 * which does not support exclusion constraints, it can only be reached
+	 * by REINDEX CONCURRENTLY. In this case the same index exists in
+	 * parallel to this one, so we can bypass this check as it has already
+	 * been done on the other index existing in parallel. If exclusion
+	 * constraints are supported in the future for CREATE INDEX
+	 * CONCURRENTLY, this will need to be removed or extended for that
+	 * purpose.
+	 */
+	if (!index->rd_index->indisvalid)
+		return true;
+
+	/*
 	 * Search the tuples that are in the index for any violations, including
 	 * tuples that aren't visible yet.
 	 */
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e76b5b3..6675d85 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3748,6 +3748,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(do_system);
 	COPY_SCALAR_FIELD(do_user);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index d5db71d..de35a08 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1902,6 +1902,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(do_system);
 	COMPARE_SCALAR_FIELD(do_user);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index bd180e7..f31519d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7183,35 +7183,38 @@ opt_if_exists: IF_P EXISTS						{ $$ = TRUE; }
  *
  *		QUERY:
  *
- *		REINDEX type <name> [FORCE]
+ *		REINDEX type [CONCURRENTLY] <name> [FORCE]
  *
  * FORCE no longer does anything, but we accept it for backwards compatibility
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_type qualified_name opt_force
+			REINDEX reindex_type opt_concurrently qualified_name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					$$ = (Node *)n;
 				}
-			| REINDEX SYSTEM_P name opt_force
+			| REINDEX SYSTEM_P opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = false;
 					$$ = (Node *)n;
 				}
-			| REINDEX DATABASE name opt_force
+			| REINDEX DATABASE opt_concurrently name opt_force
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = OBJECT_DATABASE;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->do_system = true;
 					n->do_user = true;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 422911c..4da9191 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -745,16 +745,20 @@ standard_ProcessUtility(Node *parsetree,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				switch (stmt->kind)
 				{
 					case OBJECT_INDEX:
-						ReindexIndex(stmt->relation);
+						ReindexIndex(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_TABLE:
 					case OBJECT_MATVIEW:
-						ReindexTable(stmt->relation);
+						ReindexTable(stmt->relation, stmt->concurrent);
 						break;
 					case OBJECT_DATABASE:
 
@@ -766,8 +770,8 @@ standard_ProcessUtility(Node *parsetree,
 						 */
 						PreventTransactionChain(isTopLevel,
 												"REINDEX DATABASE");
-						ReindexDatabase(stmt->name,
-										stmt->do_system, stmt->do_user);
+						ReindexDatabase(stmt->name, stmt->do_system,
+										stmt->do_user, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 66d80b5..4f2376f 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1643,6 +1643,23 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY are not allowed in
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY is not allowed inside
+			 * a transaction block.
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
+
 		return false;
 	}
 
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 56dc688..7339bcd 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -144,7 +144,7 @@ static bool completion_case_sensitive;	/* completion is case sensitive */
  * 5) The list of attributes of the given table (possibly schema-qualified).
  * 6/ The list of arguments to the given function (possibly schema-qualified).
  */
-#define COMPLETE_WITH_QUERY(query) \
+#define COMPLETE_WITH_QUERY(query)				\
 do { \
 	completion_charp = query; \
 	matches = completion_matches(text, complete_from_query); \
@@ -2261,7 +2261,9 @@ psql_completion(const char *text, int start, int end)
 			 pg_strcasecmp(prev_wd, "ON") == 0)
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
 	/* If we have CREATE|UNIQUE INDEX <sth> CONCURRENTLY, then add "ON" */
-	else if ((pg_strcasecmp(prev3_wd, "INDEX") == 0 ||
+	else if ((pg_strcasecmp(prev4_wd, "CREATE") == 0 ||
+			  pg_strcasecmp(prev3_wd, "CREATE") == 0) &&
+			 (pg_strcasecmp(prev3_wd, "INDEX") == 0 ||
 			  pg_strcasecmp(prev2_wd, "INDEX") == 0) &&
 			 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
 		COMPLETE_WITH_CONST("ON");
@@ -3334,14 +3336,35 @@ psql_completion(const char *text, int start, int end)
 
 		COMPLETE_WITH_LIST(list_REINDEX);
 	}
-	else if (pg_strcasecmp(prev2_wd, "REINDEX") == 0)
+	else if (pg_strcasecmp(prev2_wd, "REINDEX") == 0 ||
+			 pg_strcasecmp(prev3_wd, "REINDEX") == 0)
 	{
+		/* Complete REINDEX TABLE with a list of tables, and CONCURRENTLY */
 		if (pg_strcasecmp(prev_wd, "TABLE") == 0)
+			COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm,
+				" UNION SELECT 'CONCURRENTLY'");
+		/* Complete REINDEX TABLE CONCURRENTLY with a list of tables */
+		else if (pg_strcasecmp(prev2_wd, "TABLE") == 0 &&
+				 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
 			COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+		/* Complete REINDEX INDEX with a list of indexes, and CONCURRENTLY */
 		else if (pg_strcasecmp(prev_wd, "INDEX") == 0)
+			COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+									   " UNION SELECT 'CONCURRENTLY'");
+		/* Complete REINDEX INDEX CONCURRENTLY with a list of indexes */
+		else if (pg_strcasecmp(prev2_wd, "INDEX") == 0 &&
+				 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
 			COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
-		else if (pg_strcasecmp(prev_wd, "SYSTEM") == 0 ||
-				 pg_strcasecmp(prev_wd, "DATABASE") == 0)
+		/* Complete REINDEX DATABASE with a list of databases, and CONCURRENTLY */
+		else if (pg_strcasecmp(prev_wd, "DATABASE") == 0)
+			COMPLETE_WITH_QUERY(Query_for_list_of_databases
+								" UNION SELECT 'CONCURRENTLY'");
+		/* Complete REINDEX DATABASE CONCURRENTLY with a list of databases */
+		else if (pg_strcasecmp(prev2_wd, "DATABASE") == 0 &&
+				 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
+			COMPLETE_WITH_QUERY(Query_for_list_of_databases);
+		/* Complete REINDEX SYSTEM with a list of databases */
+		else if (pg_strcasecmp(prev_wd, "SYSTEM") == 0)
 			COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 	}
 
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 083f4bd..6b3df50 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -95,6 +95,8 @@ extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
+extern void ResetTupleDescCache(TupleDesc tupdesc);
+
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
 extern void DecrTupleDescRefCount(TupleDesc tupdesc);
 
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index c36a729..97f9d83 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -53,6 +53,7 @@ extern Oid index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -61,7 +62,26 @@ extern Oid index_create(Relation heapRelation,
 			 bool skip_build,
 			 bool concurrent,
 			 bool is_internal,
-			 bool if_not_exists);
+			 bool if_not_exists,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create(Relation heapRelation,
+								   Oid indOid,
+								   char *concurrentName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid,
+								  Oid oldIndexOid,
+								  LOCKTAG locktag);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid,
+									  LOCKTAG locktag);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern void index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 0ebdbc1..b988555 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -28,10 +28,11 @@ extern Oid DefineIndex(Oid relationId,
 			bool check_rights,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation);
-extern Oid	ReindexTable(RangeVar *relation);
+extern Oid	ReindexIndex(RangeVar *indexRelation, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, bool concurrent);
 extern Oid ReindexDatabase(const char *databaseName,
-				bool do_system, bool do_user);
+							bool do_system, bool do_user, bool concurrent);
+extern bool ReindexRelationConcurrently(Oid relOid);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3e4f815..b9484a0 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2729,6 +2729,7 @@ typedef struct ReindexStmt
 	const char *name;			/* name of database to reindex */
 	bool		do_system;		/* include system tables in database case */
 	bool		do_user;		/* include user tables in database case */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000..9e04169
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 79a7956..451a415 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -21,6 +21,7 @@ test: delete-abort-savept-2
 test: aborted-keyrevoke
 test: multixact-no-deadlock
 test: multixact-no-forget
+test: reindex-concurrently
 test: propagate-lock-delete
 test: nowait
 test: nowait-2
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000..eb59fe0
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 26d883c..192eb14 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2818,3 +2818,60 @@ explain (costs off)
    Index Cond: ((thousand = 1) AND (tenthous = 1001))
 (2 rows)
 
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  cannot reindex system concurrently
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+Table "public.concur_reindex_tab"
+ Column |  Type   | Modifiers 
+--------+---------+-----------
+ c1     | integer | not null
+ c2     | text    | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index e08f35e..34e1c42 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -957,3 +957,45 @@ RESET enable_indexscan;
 
 explain (costs off)
   select * from tenk1 where (thousand, tenthous) in ((1,1001), (null,null));
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
#21Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#18)
Re: REINDEX CONCURRENTLY 2.0

On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

If REINDEX cannot work without an exclusive lock, we should invent some
other qualifier, like WITH FEWER LOCKS.

What he said.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#21)
Re: REINDEX CONCURRENTLY 2.0

On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

If REINDEX cannot work without an exclusive lock, we should invent some
other qualifier, like WITH FEWER LOCKS.

What he said.

But more to the point .... why, precisely, can't this work without an
AccessExclusiveLock? And can't we fix that instead of setting for
something clearly inferior?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#23Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#22)
Re: REINDEX CONCURRENTLY 2.0

On 2014-11-12 16:11:58 -0500, Robert Haas wrote:

On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

If REINDEX cannot work without an exclusive lock, we should invent some
other qualifier, like WITH FEWER LOCKS.

What he said.

I'm unconvinced. A *short* exclusive lock (just to update two pg_class
rows), still gives most of the benefits of CONCURRENTLY. Also, I do think
we can get rid of that period in the not too far away future.

But more to the point .... why, precisely, can't this work without an
AccessExclusiveLock? And can't we fix that instead of setting for
something clearly inferior?

It's nontrivial to fix, but I think we can fix it at some point. I just
think we should get the *major* part of the feature before investing
lots of time making it even better. There's *very* frequent questions
about having this. And people do really dangerous stuff (like manually
updating pg_class.relfilenode and such) to cope.

The problem is that it's very hard to avoid the wrong index's
relfilenode being used when swapping the relfilenodes between two
indexes.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#24Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#23)
Re: REINDEX CONCURRENTLY 2.0

On Wed, Nov 12, 2014 at 4:39 PM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-11-12 16:11:58 -0500, Robert Haas wrote:

On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

If REINDEX cannot work without an exclusive lock, we should invent some
other qualifier, like WITH FEWER LOCKS.

What he said.

I'm unconvinced. A *short* exclusive lock (just to update two pg_class
row), still gives most of the benefits of CONCURRENTLY.

I am pretty doubtful about that. It's still going to require you to
wait for all transactions to drain out of the table while new ones are
blocked from entering. Which sucks. Unless all of your transactions
are very short, but that's not necessarily typical.

The problem is that it's very hard to avoid the wrong index's
relfilenode being used when swapping the relfilenodes between two
indexes.

How about storing both the old and new relfilenodes in the same pg_class entry?

1. Take a snapshot.
2. Index all the tuples in that snapshot.
3. Publish the new relfilenode to an additional pg_class column,
relnewfilenode or similar.
4. Wait until everyone can see step #3.
5. Rescan the table and add any missing tuples to the index.
6. Set some flag in pg_class to mark the relnewfilenode as active and
relfilenode as not to be used for queries.
7. Wait until everyone can see step #6.
8. Set some flag in pg_class to mark relfilenode as not even to be opened.
9. Wait until everyone can see step #8.
10. Drop old relfilenode.
11. Clean up by setting relfilenode = relnewfilenode, relnewfilenode = 0.

This is basically CREATE INDEX CONCURRENTLY (without the first step
where we out-wait people who might create now-invalid HOT chains,
because that can't arise with a REINDEX of an existing index) plus
DROP INDEX CONCURRENTLY.
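[Editor's note: the phase progression in the numbered steps above can be sketched as a toy model. This is an illustration only, not PostgreSQL source; the phase names and filenode numbers are invented, and the "wait until everyone can see" steps are elided since the model has a single actor.]

```python
# Toy model of Robert's proposal: at each phase, which relfilenode serves
# queries and which set of relfilenodes must be maintained (receive all
# concurrent inserts/updates).  Names and numbers are invented.

def rebuild_phases(old, new):
    """Yield (phase, serves_queries, maintained_filenodes) per stage."""
    yield ("initial",    old, {old})       # steps 1-2: new storage built
    yield ("published",  old, {old, new})  # steps 3-5: both kept current
    yield ("swapped",    new, {old, new})  # steps 6-7: queries move over
    yield ("old-dead",   new, {new})       # steps 8-9: old no longer opened
    yield ("cleaned-up", new, {new})       # steps 10-11: old dropped

phases = list(rebuild_phases(16384, 16400))
# Invariant: the storage serving queries is always fully maintained, so
# no query ever sees an index missing concurrent changes.
assert all(serves in maintained for _, serves, maintained in phases)
```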

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#25Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#24)
Re: REINDEX CONCURRENTLY 2.0

On 2014-11-12 18:23:38 -0500, Robert Haas wrote:

On Wed, Nov 12, 2014 at 4:39 PM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-11-12 16:11:58 -0500, Robert Haas wrote:

On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

If REINDEX cannot work without an exclusive lock, we should invent some
other qualifier, like WITH FEWER LOCKS.

What he said.

I'm unconvinced. A *short* exclusive lock (just to update two pg_class
row), still gives most of the benefits of CONCURRENTLY.

I am pretty doubtful about that. It's still going to require you to
wait for all transactions to drain out of the table while new ones are
blocked from entering. Which sucks. Unless all of your transactions
are very short, but that's not necessarily typical.

Yes, it sucks. But it beats not being able to reindex a relation with a
primary key (referenced by a fkey) without waiting several hours by a
couple magnitudes. And that's the current situation.

The problem is that it's very hard to avoid the wrong index's
relfilenode being used when swapping the relfilenodes between two
indexes.

How about storing both the old and new relfilenodes in the same pg_class entry?

That's quite a cool idea

[think a bit]

But I think it won't work realistically. We have a *lot* of
infrastructure that refers to indexes using its primary key. I don't
think we want to touch all those places to also disambiguate on some
other factor. All the relevant APIs are either just passing around oids
or relcache entries.

There's also the problem that we'd really need two different pg_index
rows to make things work. Alternatively we can duplicate the three
relevant columns (indisvalid, indisready, indislive) in there for the
different filenodes. But that's not entirely pretty.

1. Take a snapshot.
2. Index all the tuples in that snapshot.
3. Publish the new relfilenode to an additional pg_class column,
relnewfilenode or similar.
4. Wait until everyone can see step #3.

Here all backends need to update both indexes, right? And all the
indexing infrastructure can't deal with that without having separate
oids & relcache entries.

5. Rescan the table and add any missing tuples to the index.
6. Set some flag in pg_class to mark the relnewfilenode as active and
relfilenode as not to be used for queries.
7. Wait until everyone can see step #6.
8. Set some flag in pg_class to mark relfilenode as not even to be opened.
9. Wait until everyone can see step #8.
10. Drop old relfilenode.
11. Clean up by setting relfilenode = relnewfilenode, relnewfilenode = 0.

Even that one isn't trivial - how do you deal with the fact that
somebody looking at updating newrelfilenode might, in the midst of
processing, see newrelfilenode = 0?

I've earlier come up with a couple possible solutions, but I
unfortunately found holes in all of them. And if I can find holes in
them, there surely are more :(.

I don't recall what the problem with just swapping the names was - but
I'm pretty sure there was one... Hm. The index relation oids are
referred to by constraints and dependencies. That's somewhat
solvable. But I think there was something else as well...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#26Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#25)
Re: REINDEX CONCURRENTLY 2.0

Andres Freund wrote:

On 2014-11-12 18:23:38 -0500, Robert Haas wrote:

The problem is that it's very hard to avoid the wrong index's
relfilenode being used when swapping the relfilenodes between two
indexes.

How about storing both the old and new relfilenodes in the same pg_class entry?

That's quite a cool idea

[think a bit]

But I think it won't work realistically. We have a *lot* of
infrastructure that refers to indexes using it's primary key.

Hmm, can we make the relmapper do this job instead of having another
pg_class column? Essentially the same sketch Robert proposed, instead
we would initially set relfilenode=0 and have all onlookers use the
relmapper to obtain the correct relfilenode; switching to the new
relfilenode can be done atomically, and un-relmap the index once the
process is complete.

The difference from what Robert proposes is that the transient state is
known to cause failures for anyone not prepared to deal with it, so it
should be easy to spot what places need adjustment.
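[Editor's note: a toy model of the relmapper indirection described above — not PostgreSQL source; the oids, filenode numbers, and function name are invented. It shows the two properties Alvaro relies on: the mapping can be switched atomically, and code unaware of the transient state fails loudly rather than misbehaving.]

```python
# Stand-in for the real relmapper: oid -> relfilenode.
relmapper = {}

def resolve_filenode(oid, pg_class_filenode):
    """Resolve storage for a relation; relfilenode = 0 means 'mapped'."""
    if pg_class_filenode != 0:
        return pg_class_filenode           # normal, unmapped relation
    if oid not in relmapper:
        # Transient state is deliberately unforgiving: any caller not
        # prepared to deal with it errors out instead of reading the
        # wrong storage.
        raise LookupError("no mapping for relation %d" % oid)
    return relmapper[oid]

relmapper[16384] = 16400                   # index points at old storage
assert resolve_filenode(16384, 0) == 16400
relmapper[16384] = 16500                   # atomic switch to new storage
assert resolve_filenode(16384, 0) == 16500
assert resolve_filenode(999, 24576) == 24576  # unmapped relations unaffected
```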

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#27Michael Paquier
michael.paquier@gmail.com
In reply to: Andres Freund (#25)
Re: REINDEX CONCURRENTLY 2.0

On Thu, Nov 13, 2014 at 9:31 AM, Andres Freund <andres@2ndquadrant.com> wrote:

I don't recall what the problem with just swapping the names was - but
I'm pretty sure there was one... Hm. The index relation oids are
referred to by constraints and dependencies. That's somewhat
solvable. But I think there was something else as well...

The reason given 2 years ago for not using relname was the fact that
the oid of the index changes, and to it be refered by some pg_depend
entries:
/messages/by-id/20121208133730.GA6422@awork2.anarazel.de
/messages/by-id/12742.1354977643@sss.pgh.pa.us
Regards,
--
Michael

#28Michael Paquier
michael.paquier@gmail.com
In reply to: Michael Paquier (#27)
Re: REINDEX CONCURRENTLY 2.0

On Thu, Nov 13, 2014 at 10:26 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Thu, Nov 13, 2014 at 9:31 AM, Andres Freund <andres@2ndquadrant.com> wrote:

I don't recall what the problem with just swapping the names was - but
I'm pretty sure there was one... Hm. The index relation oids are
referred to by constraints and dependencies. That's somewhat
solvable. But I think there was something else as well...

The reason given 2 years ago for not using relname was the fast that
the oid of the index changes, and to it be refered by some pg_depend
entries:

Feel free to correct: "and that it could be referred".
--
Michael

#29Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#25)
Re: REINDEX CONCURRENTLY 2.0

On Wed, Nov 12, 2014 at 7:31 PM, Andres Freund <andres@2ndquadrant.com> wrote:

But I think it won't work realistically. We have a *lot* of
infrastructure that refers to indexes using it's primary key. I don't
think we want to touch all those places to also disambiguate on some
other factor. All the relevant APIs are either just passing around oids
or relcache entries.

I'm not quite following this. The whole point is to AVOID having two
indexes. You have one index which may at times have two sets of
physical storage.

There's also the problem that we'd really need two different pg_index
rows to make things work. Alternatively we can duplicate the three
relevant columns (indisready, indislive, indislive) in there for the
different filenodes. But that's not entirely pretty.

I think what you would probably end up with is a single "char" or int2
column that defines the state of the index. Certain states would be
valid only when relnewfilenode != 0.

1. Take a snapshot.
2. Index all the tuples in that snapshot.
3. Publish the new relfilenode to an additional pg_class column,
relnewfilenode or similar.
4. Wait until everyone can see step #3.

Here all backends need to update both indexes, right?

Yes.

And all the
indexing infrastructure can't deal with that without having separate
oids & relcache entries.

Why not?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#30Peter Eisentraut
peter_e@gmx.net
In reply to: Andres Freund (#25)
Re: REINDEX CONCURRENTLY 2.0

On 11/12/14 7:31 PM, Andres Freund wrote:

Yes, it sucks. But it beats not being able to reindex a relation with a
primary key (referenced by a fkey) without waiting several hours by a
couple magnitudes. And that's the current situation.

That's fine, but we have, for better or worse, defined CONCURRENTLY :=
does not take exclusive locks. Use a different adverb for an in-between
facility.

#31Andres Freund
andres@2ndquadrant.com
In reply to: Peter Eisentraut (#30)
Re: REINDEX CONCURRENTLY 2.0

On November 13, 2014 10:23:41 PM CET, Peter Eisentraut <peter_e@gmx.net> wrote:

On 11/12/14 7:31 PM, Andres Freund wrote:

Yes, it sucks. But it beats not being able to reindex a relation with

a

primary key (referenced by a fkey) without waiting several hours by a
couple magnitudes. And that's the current situation.

That's fine, but we have, for better or worse, defined CONCURRENTLY :=
does not take exclusive locks. Use a different adverb for an
in-between
facility.

I think that's not actually a service to our users. They'll have to adapt their scripts and knowledge when we get around to the more concurrent version. What exactly CONCURRENTLY means is already not strictly defined and differs between the actions.

I'll note that DROP INDEX CONCURRENTLY actually already internally acquires an AEL lock. Although it's a bit harder to see the consequences of that.

--
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#32Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Andres Freund (#31)
Re: REINDEX CONCURRENTLY 2.0

On 11/13/14, 3:50 PM, Andres Freund wrote:

On November 13, 2014 10:23:41 PM CET, Peter Eisentraut <peter_e@gmx.net> wrote:

On 11/12/14 7:31 PM, Andres Freund wrote:

Yes, it sucks. But it beats not being able to reindex a relation with

a

primary key (referenced by a fkey) without waiting several hours by a
couple magnitudes. And that's the current situation.

That's fine, but we have, for better or worse, defined CONCURRENTLY :=
does not take exclusive locks. Use a different adverb for an
in-between
facility.

I think that's not actually a service to our users. They'll have to adapt their scripts and knowledge when we get around to the more concurrent version. What exactly CONCURRENTLY means is already not strictly defined and differs between the actions.

It also means that if we ever found a way to get rid of the exclusive lock we'd then have an inconsistency anyway. Or we'd also create REINDEX CONCURRENT at that time, and then have 2 command syntaxes to support.

I'll note that DROP INDEX CONCURRENTLY actually already internally acquires an AEL lock. Although it's a bit harder to see the consequences of that.

Having been responsible for a site where downtime was a 6 figure dollar amount per hour, I've spent a LOT of time worrying about lock problems. The really big issue here isn't grabbing an exclusive lock; it's grabbing one at some random time when no one is there to actively monitor what's happening. (If you can't handle *any* exclusive locks, that also means you can never do an ALTER TABLE ADD COLUMN either.)

With that in mind, would it be possible to set this up so that the time-consuming process of building the new index file happens first, and then (optionally) some sort of DBA action is required to actually do the relfilenode swap? I realize that's not the most elegant solution, but it's WAY better than this feature not hitting 9.5 and people having to hand-code a solution.

Possible syntax:
REINDEX CONCURRENTLY -- Does what current patch does
REINDEX CONCURRENT BUILD -- Builds new files
REINDEX CONCURRENT SWAP -- Swaps new files in

This suffers from the syntax problems I mentioned above, but at least this way it's all limited to one command, and it probably allows a lot more people to use it.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

#33Andres Freund
andres@2ndquadrant.com
In reply to: Jim Nasby (#32)
Re: REINDEX CONCURRENTLY 2.0

On 2014-11-14 02:04:00 -0600, Jim Nasby wrote:

On 11/13/14, 3:50 PM, Andres Freund wrote:
Having been responsible for a site where downtime was a 6 figure
dollar amount per hour, I've spent a LOT of time worrying about lock
problems. The really big issue here isn't grabbing an exclusive lock;
it's grabbing one at some random time when no one is there to actively
monitor what's happening. (If you can't handle *any* exclusive locks,
that also means you can never do an ALTER TABLE ADD COLUMN either.)

With that in mind, would it be possible to set this up so that the
time-consuming process of building the new index file happens first,
and then (optionally) some sort of DBA action is required to actually
do the relfilenode swap? I realize that's not the most elegant
solution, but it's WAY better than this feature not hitting 9.5 and
people having to hand-code a solution.

I don't think having a multi step version of the feature and it not
making into 9.5 are synonymous. And I really don't want to make it even
more complex before we have the basic version in.

I think a split like your:

Possible syntax:
REINDEX CONCURRENTLY -- Does what current patch does
REINDEX CONCURRENT BUILD -- Builds new files
REINDEX CONCURRENT SWAP -- Swaps new files in

could make sense, but it's really an additional feature ontop.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#34Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#29)
Re: REINDEX CONCURRENTLY 2.0

On 2014-11-13 11:41:18 -0500, Robert Haas wrote:

On Wed, Nov 12, 2014 at 7:31 PM, Andres Freund <andres@2ndquadrant.com> wrote:

But I think it won't work realistically. We have a *lot* of
infrastructure that refers to indexes using it's primary key. I don't
think we want to touch all those places to also disambiguate on some
other factor. All the relevant APIs are either just passing around oids
or relcache entries.

I'm not quite following this. The whole point is to AVOID having two
indexes. You have one index which may at times have two sets of
physical storage.

Right. But how are we going to refer to these different relfilenodes?
All the indexing infrastructure just uses oids and/or Relation pointers
to refer to the index. How would you hand down the knowledge which of
the relfilenodes is supposed to be used in some callchain?

There's ugly solutions like having a flag like 'bool
rd_useotherfilenode' inside struct RelationData, but even ignoring the
uglyness I don't think that'd work well - what if some function called
inside the index code again starts a index lookup?

I think I might just not getting your idea here?

And all the
indexing infrastructure can't deal with that without having separate
oids & relcache entries.

Hopefully explained above?

Greetings,

Andres Freund


#35Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#34)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Nov 14, 2014 at 11:47 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-11-13 11:41:18 -0500, Robert Haas wrote:

On Wed, Nov 12, 2014 at 7:31 PM, Andres Freund <andres@2ndquadrant.com> wrote:

But I think it won't work realistically. We have a *lot* of
infrastructure that refers to indexes using its primary key. I don't
think we want to touch all those places to also disambiguate on some
other factor. All the relevant APIs are either just passing around oids
or relcache entries.

I'm not quite following this. The whole point is to AVOID having two
indexes. You have one index which may at times have two sets of
physical storage.

Right. But how are we going to refer to these different relfilenodes?
All the indexing infrastructure just uses oids and/or Relation pointers
to refer to the index. How would you hand down the knowledge which of
the relfilenodes is supposed to be used in some callchain?

If you've got a Relation, you don't need someone to tell you which
physical storage to use; you can figure that out for yourself by
looking at the Relation. If you've got an OID, you're probably going
to go conjure up a Relation, and then you can do the same thing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#36Michael Paquier
michael.paquier@gmail.com
In reply to: Alvaro Herrera (#26)
Re: REINDEX CONCURRENTLY 2.0

On Thu, Nov 13, 2014 at 10:25 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Andres Freund wrote:

On 2014-11-12 18:23:38 -0500, Robert Haas wrote:

The problem is that it's very hard to avoid the wrong index's
relfilenode being used when swapping the relfilenodes between two
indexes.

How about storing both the old and new relfilenodes in the same pg_class entry?

That's quite a cool idea

[think a bit]

But I think it won't work realistically. We have a *lot* of
infrastructure that refers to indexes using its primary key.

Hmm, can we make the relmapper do this job instead of having another
pg_class column? Essentially the same sketch Robert proposed, instead
we would initially set relfilenode=0 and have all onlookers use the
relmapper to obtain the correct relfilenode; switching to the new
relfilenode can be done atomically, and un-relmap the index once the
process is complete.
The difference from what Robert proposes is that the transient state is
known to cause failures for anyone not prepared to deal with it, so it
should be easy to spot what places need adjustment.

How would the failure handling actually work? Would we need some extra
process to remove the extra relfilenodes? Note that in the current
patch the temporary concurrent entry is kept as INVALID all the time,
giving the user a path to remove them with DROP INDEX even in the case
of invalid toast indexes in catalog pg_toast.

Note that I am on the side of using the exclusive lock when swapping
relfilenodes for now in any case, that's what pg_repack does by
renaming the indexes, and people use it.
--
Michael


#37Oskari Saarenmaa
os@ohmu.fi
In reply to: Andres Freund (#31)
Re: REINDEX CONCURRENTLY 2.0

13.11.2014, 23:50, Andres Freund wrote:

On November 13, 2014 10:23:41 PM CET, Peter Eisentraut <peter_e@gmx.net> wrote:

On 11/12/14 7:31 PM, Andres Freund wrote:

Yes, it sucks. But it beats not being able to reindex a relation with

a

primary key (referenced by a fkey) without waiting several hours by a
couple magnitudes. And that's the current situation.

That's fine, but we have, for better or worse, defined CONCURRENTLY :=
does not take exclusive locks. Use a different adverb for an
in-between
facility.

I think that's not actually a service to our users. They'll have to adapt their scripts and knowledge when we get around to the more concurrent version. What exactly CONCURRENTLY means is already not strictly defined and differs between the actions.

I'll note that DROP INDEX CONCURRENTLY actually already internally acquires an AEL lock. Although it's a bit harder to see the consequences of that.

If the short-lived lock is the only blocker for this feature at the
moment could we just require an additional qualifier for CONCURRENTLY
(FORCE?) until the lock can be removed, something like:

tmp=# REINDEX INDEX CONCURRENTLY tmp_pkey;
ERROR: REINDEX INDEX CONCURRENTLY is not fully concurrent; use REINDEX
INDEX CONCURRENTLY FORCE to perform reindex with a short-lived lock.

tmp=# REINDEX INDEX CONCURRENTLY FORCE tmp_pkey;
REINDEX

It's not optimal, but currently there's no way to reindex a primary key
anywhere close to concurrently and a short lock would be a huge
improvement over the current situation.

/ Oskari


#38Michael Paquier
michael.paquier@gmail.com
In reply to: Oskari Saarenmaa (#37)
Re: REINDEX CONCURRENTLY 2.0

On Tue, Dec 23, 2014 at 5:54 PM, Oskari Saarenmaa <os@ohmu.fi> wrote:

If the short-lived lock is the only blocker for this feature at the
moment could we just require an additional qualifier for CONCURRENTLY
(FORCE?) until the lock can be removed, something like:
=# [blah]

FWIW, I'd just keep only CONCURRENTLY with no fancy additional
keywords even if we cheat on it, as long as the documentation makes
clear that an exclusive lock is taken for a very short time, far
shorter than what a normal REINDEX would hold it, btw.

It's not optimal, but currently there's no way to reindex a primary key
anywhere close to concurrently and a short lock would be a huge
improvement over the current situation.

Yep.
--
Michael


#39Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#22)
Re: REINDEX CONCURRENTLY 2.0

On 2014-11-12 16:11:58 -0500, Robert Haas wrote:

On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

If REINDEX cannot work without an exclusive lock, we should invent some
other qualifier, like WITH FEWER LOCKS.

What he said.

But more to the point .... why, precisely, can't this work without an
AccessExclusiveLock? And can't we fix that instead of settling for
something clearly inferior?

So, here's an alternative approach of how to get rid of the AEL
locks. They're required because we want to switch the relfilenodes
around. I've pretty much no confidence in any of the schemes anybody has
come up to avoid that.

So, let's not switch relfilenodes around.

I think we should instead just use the new index, repoint the
dependencies onto the new oid, and then afterwards, when dropping,
rename the new index onto the old one. That means the oid of the
index will change and some less than pretty grovelling around
dependencies, but it still seems preferable to what we're discussing
here otherwise.

Does anybody see a fundamental problem with that approach?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#40Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#39)
Re: REINDEX CONCURRENTLY 2.0

On Mon, Feb 2, 2015 at 9:10 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-11-12 16:11:58 -0500, Robert Haas wrote:

On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

If REINDEX cannot work without an exclusive lock, we should invent some
other qualifier, like WITH FEWER LOCKS.

What he said.

But more to the point .... why, precisely, can't this work without an
AccessExclusiveLock? And can't we fix that instead of settling for
something clearly inferior?

So, here's an alternative approach of how to get rid of the AEL
locks. They're required because we want to switch the relfilenodes
around. I've pretty much no confidence in any of the schemes anybody has
come up to avoid that.

So, let's not switch relfilenodes around.

I think we should instead just use the new index, repoint the
dependencies onto the new oid, and then afterwards, when dropping,
rename the new index onto the old one. That means the oid of the
index will change and some less than pretty grovelling around
dependencies, but it still seems preferable to what we're discussing
here otherwise.

Does anybody see a fundamental problem with that approach?

I'm not sure whether that will work out, but it seems worth a try.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#41Andreas Karlsson
andreas@proxel.se
In reply to: Andres Freund (#39)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On 02/02/2015 03:10 PM, Andres Freund wrote:

I think we should instead just use the new index, repoint the
dependencies onto the new oid, and then afterwards, when dropping,
rename the new index onto the old one. That means the oid of the
index will change and some less than pretty grovelling around
dependencies, but it still seems preferable to what we're discussing
here otherwise.

I think that sounds like a good plan. The oid change does not seem like
too big a deal to me, especially since that is what users will get now
too. Do you still think this is the right way to solve this?

I have attached my work-in-progress patch, which implements this and is
very heavily based on Michael's previous work. There are some things left
to do, but I think I should have a patch ready for the next commitfest if
people still like this type of solution.

I also changed index_set_state_flags() to be transactional since I
wanted the old index to become invalid at exactly the same time as the
new becomes valid. From reviewing the code that seems like a safe change.

A couple of bike shedding questions:

- Is the syntax "REINDEX <type> CONCURRENTLY <object>" ok?

- What should we do with REINDEX DATABASE CONCURRENTLY and the system
catalog? I do not think we can reindex the system catalog concurrently
safely, so what should REINDEX DATABASE do with the catalog indexes?
Skip them, reindex them while taking locks, or just error out?

- What level of information should be output in VERBOSE mode?

What remains to be implemented:

- Support for exclusion constraints
- Look more into how I handle constraints (currently even the temporary
index seems to have the PRIMARY KEY flag)
- Support for the VERBOSE flag
- More testing to catch bugs

Andreas

Attachments:

reindex-concurrently-wip.patchtext/x-patch; name=reindex-concurrently-wip.patchDownload
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 306def4a15..ca1aeca65f 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -923,7 +923,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
-         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
+         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
+         <command>REINDEX CONCURRENTLY</>,
          <command>ALTER TABLE VALIDATE</command> and other
          <command>ALTER TABLE</command> variants (for full details see
          <xref linkend="SQL-ALTERTABLE">).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..24464020cd 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
       An index build with the <literal>CONCURRENTLY</> option failed, leaving
       an <quote>invalid</> index. Such indexes are useless but it can be
       convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      <command>REINDEX</> will perform a concurrent build if <literal>
+      CONCURRENTLY</> is specified. To build the index without interfering
+      with production you should drop the index and reissue either the
+      <command>CREATE INDEX CONCURRENTLY</> or <command>REINDEX CONCURRENTLY</>
+      command. Indexes of toast relations can be rebuilt with <command>REINDEX
+      CONCURRENTLY</>.
      </para>
     </listitem>
 
@@ -152,6 +155,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
    </varlistentry>
 
    <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
      <para>
@@ -231,6 +249,172 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
    reindex anything.
   </para>
 
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</> option of <command>REINDEX</>. When this option
+    is used, <productname>PostgreSQL</> must perform two scans of the table
+    for each index that needs to be rebuilt, and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created, where all
+    the concurrent entries are created using only one transaction. Note that
+    if there are multiple indexes to be rebuilt then each step loops through
+    all the indexes we're rebuilding, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition will be used to replace the
+       old index. This step is done as a single transaction for all the indexes
+       involved in this process, meaning that if
+       <command>REINDEX CONCURRENTLY</> is run on a table with multiple
+       indexes, all the catalog entries of the new indexes are created within a
+       single transaction. A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as its
+       parent table to prevent any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_index.indisready</> is
+       switched to <quote>true</> to make it ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while
+       the first pass build was running. This step is performed within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       All the constraints and foreign keys which refer to the index are swapped
+       to refer to the new index definition, and the names of the indexes are
+       changed. At this point <literal>pg_index.indisvalid</> is switched to
+       <quote>true</> for the new index and to <quote>false</> for the old, and
+       a cache invalidation is done so that all sessions that referenced the
+       old index are invalidated. This step is done within a single transaction
+       for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Old indexes have <literal>pg_index.indisready</> switched to <quote>false</>
+       to prevent any new tuple insertions, after waiting for running queries
+       that might reference the old index to complete. This step is done within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The old index definition and its data are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid
+    index and try again to perform <command>REINDEX CONCURRENTLY</>.
+    The concurrent index created during the processing has a name ending in
+    the suffix cct, or ccto if it is an old index definition which we failed to
+    drop. Invalid indexes can be dropped using <literal>DROP INDEX</> including
+    invalid toast indexes.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot.
+   </para>
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>. Valid indexes, being unique
+    for a given toast relation, cannot be dropped.
+   </para>
+
+   <para>
+    <command>REINDEX</command> uses <literal>ACCESS EXCLUSIVE</literal> lock
+    on all the relations involved during operation. When
+    <command>CONCURRENTLY</command> is specified, the operation is done with
+    <literal>SHARE UPDATE EXCLUSIVE</literal>.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command>.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -262,7 +446,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild a table while authorizing read and write operations on involved
+   relations when performed:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 4e2ebe1ae7..2f93d3e954 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -260,6 +260,18 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 }
 
 /*
+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+	int i;
+
+	for (i = 0; i < tupdesc->natts; i++)
+		tupdesc->attrs[i]->attcacheoff = -1;
+}
+
+/*
  * Free a TupleDesc including all substructure
  */
 void
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f8d92145e8..cc183aa62e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -677,6 +677,7 @@ UpdateIndexRelation(Oid indexoid,
  * classObjectId: array of index opclass OIDs, one per index column
  * coloptions: array of per-index-column indoption settings
  * reloptions: AM-specific options
+ * tupdesc: Tuple descriptor used for the index if defined
  * isprimary: index is a PRIMARY KEY
  * isconstraint: index is owned by PRIMARY KEY, UNIQUE, or EXCLUSION constraint
  * deferrable: constraint is DEFERRABLE
@@ -690,6 +691,10 @@ UpdateIndexRelation(Oid indexoid,
  * is_internal: if true, post creation hook for new index
  * if_not_exists: if true, do not throw an error if a relation with
  *		the same name already exists.
+ * is_reindex: if true, create an index that is used as a duplicate of an
+ *		existing index created during a concurrent operation. This index can
+ *		also be a toast relation. Sufficient locks are normally taken on
+ *		the related relations once this is called during a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -706,6 +711,7 @@ index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -714,7 +720,8 @@ index_create(Relation heapRelation,
 			 bool skip_build,
 			 bool concurrent,
 			 bool is_internal,
-			 bool if_not_exists)
+			 bool if_not_exists,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -760,7 +767,7 @@ index_create(Relation heapRelation,
 	 * release locks before committing in catalogs
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemNamespace(get_rel_namespace(heapRelationId)))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
@@ -808,14 +815,21 @@ index_create(Relation heapRelation,
 	}
 
 	/*
-	 * construct tuple descriptor for index tuples
+	 * construct tuple descriptor for index tuples if nothing is passed
+	 * by caller.
 	 */
-	indexTupDesc = ConstructTupleDescriptor(heapRelation,
-											indexInfo,
-											indexColNames,
-											accessMethodObjectId,
-											collationObjectId,
-											classObjectId);
+	if (tupdesc == NULL)
+		indexTupDesc = ConstructTupleDescriptor(heapRelation,
+												indexInfo,
+												indexColNames,
+												accessMethodObjectId,
+												collationObjectId,
+												classObjectId);
+	else
+	{
+		Assert(indexColNames == NIL);
+		indexTupDesc = tupdesc;
+	}
 
 	/*
 	 * Allocate an OID for the index, unless we were told what to use.
@@ -1123,6 +1137,380 @@ index_create(Relation heapRelation,
 }
 
 /*
+ * index_concurrent_create_copy
+ *
+ * Create a concurrent index based on the definition of the one provided by
+ * caller that will be used for concurrent operations. The index is inserted
+ * into catalogs and needs to be built later on. This is called during
+ * concurrent reindex processing. The heap relation on which the index is
+ * based needs to be closed by the caller.
+ */
+Oid
+index_concurrent_create_copy(Relation heapRelation, Oid indOid, const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	HeapTuple	indexTuple, classTuple;
+	Datum		indclassDatum, colOptionDatum, optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	bool		initdeferred = false;
+	Oid			constraintOid = get_index_constraint(indOid);
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* Concurrent index uses the same index information as former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/*
+	 * Determine if index is initdeferred, this depends on its dependent
+	 * constraint.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		/* Look for the correct value */
+		HeapTuple			constraintTuple;
+		Form_pg_constraint	constraintForm;
+
+		constraintTuple = SearchSysCache1(CONSTROID,
+									 ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "cache lookup failed for constraint %u",
+				 constraintOid);
+		constraintForm = (Form_pg_constraint) GETSTRUCT(constraintTuple);
+		initdeferred = constraintForm->condeferred;
+
+		ReleaseSysCache(constraintTuple);
+	}
+
+	/*
+	 * Create a copy of the tuple descriptor to be used for the concurrent
+	 * entry and reset any cache counters on it to have a fresh version.
+	 */
+	indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
+	ResetTupleDescCache(indexTupDesc);
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, indOid);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 newName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 NIL,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexTupDesc,
+								 indexRelation->rd_index->indisprimary,
+								 false,	/* is constraint? */
+								 !indexRelation->rd_index->indimmediate,	/* is deferrable? */
+								 initdeferred,	/* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false,	/* is_internal? */
+								 false, /* if_not_exists? */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
+
+/*
+ * index_concurrent_build
+ *
+ * Build index for a concurrent operation. Low-level locks, sufficient only
+ * to prevent schema changes, are taken when this operation is performed, but
+ * they need to be kept until the end of the transaction performing it.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	heapRel, indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in
+	 * commit of transaction where this concurrent index was created
+	 * at the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both relations, and keep the locks */
+	heap_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts. Once we
+	 * commit this transaction, any new transactions that open the table must
+	 * insert new entries into the index for insertions and non-HOT updates.
+	 */
+	index_set_state_flags(indexOid, INDEX_CREATE_SET_READY);
+}
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap the name, dependencies and constraints of the old index over to the
+ * new index.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid, const char *oldName)
+{
+	Relation		oldIndexRel, newIndexRel, pg_class;
+	HeapTuple		oldIndexTuple, newIndexTuple;
+	Form_pg_class	oldIndexForm, newIndexForm;
+	Oid				constraintOid = get_index_constraint(oldIndexOid);
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldIndexRel = relation_open(oldIndexOid, ShareUpdateExclusiveLock);
+	newIndexRel = relation_open(newIndexOid, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+
+	oldIndexForm = (Form_pg_class) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_class) GETSTRUCT(newIndexTuple);
+
+	/* Swap the names */
+	namestrcpy(&newIndexForm->relname, NameStr(oldIndexForm->relname));
+	namestrcpy(&oldIndexForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_class, &newIndexTuple->t_self, newIndexTuple);
+
+	if (OidIsValid(constraintOid))
+	{
+		ObjectAddress	myself, referenced;
+		Relation		pg_constraint;
+		HeapTuple		constraintTuple;
+
+		pg_constraint = heap_open(ConstraintRelationId, RowExclusiveLock);
+
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		((Form_pg_constraint) GETSTRUCT(constraintTuple))->conindid = newIndexOid;
+
+		CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+
+		heap_freetuple(constraintTuple);
+		heap_close(pg_constraint, RowExclusiveLock);
+
+		deleteDependencyRecordsForClass(RelationRelationId, newIndexOid,
+										RelationRelationId, DEPENDENCY_AUTO);
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		/* TODO: pg_depend for old index? */
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexOid;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = constraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependencyForAll(RelationRelationId, oldIndexOid, newIndexOid);
+
+	/* Close relations and clean up */
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+	heap_close(pg_class, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldIndexRel, NoLock);
+	relation_close(newIndexRel, NoLock);
+}
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen as dead by all backends.  Low-level locks taken
+ * here are kept until the end of the transaction calling this function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid, LOCKTAG locktag)
+{
+	Relation	heapRelation, indexRelation;
+
+	/*
+	 * Now we must wait until no running transaction could be using the
+	 * index for a query.  Use AccessExclusiveLock here to check for
+	 * running transactions that hold locks of any kind on the table. Note
+	 * we do not need to worry about xacts that open the table for reading
+	 * after this point; they will see the index as invalid when they open
+	 * the relation.
+	 *
+	 * Note: the reason we use actual lock acquisition here, rather than
+	 * just checking the ProcArray and sleeping, is that deadlock is
+	 * possible if one of the transactions in question is blocked trying
+	 * to acquire an exclusive lock on our table. The lock code will
+	 * detect deadlock and error out properly.
+	 */
+	WaitForLockers(locktag, AccessExclusiveLock);
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're
+	 * about to stop doing inserts into the index which could show
+	 * conflicts with existing predicate locks, so now is the time to move
+	 * them to the heap relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just
+	 * might have it open for updating it.  So now we can unset indisready
+	 * and indislive, then wait till nobody could be using it at all
+	 * anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit
+	 * all sessions will refresh the table's index list.  Forgetting just
+	 * the index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index concurrently as the last step of a concurrent index
+ * process.  Deletion has to go through performDeletion, or the dependencies
+ * of the index would not get dropped.  At this point the index is already
+ * considered invalid and dead, so it can be dropped without any concurrent
+ * option, as it is certain not to interact with other server sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid				constraintOid = get_index_constraint(indexOid);
+	ObjectAddress	object;
+	Form_pg_index	indexForm;
+	Relation		pg_index;
+	HeapTuple		indexTuple;
+
+	/*
+	 * Check that the index to be dropped is not alive; if it were, it might
+	 * still be used by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check to prevent live indexes from being dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process.
+	 * Register constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object,
+					DROP_RESTRICT,
+					0);
+}
+
+/*
  * index_constraint_create
  *
  * Set up a constraint associated with an index.  Return the new constraint's
@@ -1472,52 +1860,8 @@ index_drop(Oid indexId, bool concurrent)
 		CommitTransactionCommand();
 		StartTransactionCommand();
 
-		/*
-		 * Now we must wait until no running transaction could be using the
-		 * index for a query.  Use AccessExclusiveLock here to check for
-		 * running transactions that hold locks of any kind on the table. Note
-		 * we do not need to worry about xacts that open the table for reading
-		 * after this point; they will see the index as invalid when they open
-		 * the relation.
-		 *
-		 * Note: the reason we use actual lock acquisition here, rather than
-		 * just checking the ProcArray and sleeping, is that deadlock is
-		 * possible if one of the transactions in question is blocked trying
-		 * to acquire an exclusive lock on our table.  The lock code will
-		 * detect deadlock and error out properly.
-		 */
-		WaitForLockers(heaplocktag, AccessExclusiveLock);
-
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId, heaplocktag);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
@@ -3185,18 +3529,7 @@ validate_index_heapscan(Relation heapRelation,
  * index_set_state_flags - adjust pg_index state flags
  *
  * This is used during CREATE/DROP INDEX CONCURRENTLY to adjust the pg_index
- * flags that denote the index's state.  Because the update is not
- * transactional and will not roll back on error, this must only be used as
- * the last step in a transaction that has not made any transactional catalog
- * updates!
- *
- * Note that heap_inplace_update does send a cache inval message for the
- * tuple, so other sessions will hear about the update as soon as we commit.
- *
- * NB: In releases prior to PostgreSQL 9.4, the use of a non-transactional
- * update here would have been unsafe; now that MVCC rules apply even for
- * system catalog scans, we could potentially use a transactional update here
- * instead.
+ * flags that denote the index's state.
  */
 void
 index_set_state_flags(Oid indexId, IndexStateFlagsAction action)
@@ -3205,9 +3538,6 @@ index_set_state_flags(Oid indexId, IndexStateFlagsAction action)
 	HeapTuple	indexTuple;
 	Form_pg_index indexForm;
 
-	/* Assert that current xact hasn't done any transactional updates */
-	Assert(GetTopTransactionIdIfAny() == InvalidTransactionId);
-
 	/* Open pg_index and fetch a writable copy of the index's tuple */
 	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
 
@@ -3266,8 +3596,7 @@ index_set_state_flags(Oid indexId, IndexStateFlagsAction action)
 			break;
 	}
 
-	/* ... and write it back in-place */
-	heap_inplace_update(pg_index, indexTuple);
+	CatalogTupleUpdate(pg_index, &indexTuple->t_self, indexTuple);
 
 	heap_close(pg_index, RowExclusiveLock);
 }
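As a side note for reviewers: the pg_index flag transitions that the patch drives through index_set_state_flags can be modeled with a small standalone sketch. This is a toy model, not the patch's code; the enum values mirror IndexStateFlagsAction and the flag names mirror pg_index columns, but the struct and assertions are illustrative only:

```c
#include <assert.h>
#include <stdbool.h>

typedef struct
{
	bool	indislive;		/* false once a drop is in progress */
	bool	indisready;		/* index accepts inserts */
	bool	indisvalid;		/* index is usable by queries */
} IndexFlags;

typedef enum
{
	CREATE_SET_READY,		/* after index_concurrent_build */
	CREATE_SET_VALID,		/* after validation against old snapshots */
	DROP_CLEAR_VALID,		/* old index after index_concurrent_swap */
	DROP_SET_DEAD			/* index_concurrent_set_dead */
} FlagAction;

/* Toy index_set_state_flags: each call stands for one committed transaction */
static void
set_state_flags(IndexFlags *f, FlagAction action)
{
	switch (action)
	{
		case CREATE_SET_READY:
			assert(f->indislive && !f->indisready);
			f->indisready = true;
			break;
		case CREATE_SET_VALID:
			assert(f->indislive && f->indisready);
			f->indisvalid = true;
			break;
		case DROP_CLEAR_VALID:
			f->indisvalid = false;
			break;
		case DROP_SET_DEAD:
			assert(!f->indisvalid);	/* must be invalid before going dead */
			f->indisready = false;
			f->indislive = false;
			break;
	}
}
```

The ordering constraints encoded in the asserts are why each phase must commit before the next starts: other backends must observe the intermediate states.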
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index d0ee851215..e294e7e313 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -377,6 +377,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 }
 
 /*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependencyForAll(Oid refClassId, Oid oldRefObjectId,
+					   Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = heap_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't rename stuff out
+	 * of pg_catalog, which would likely be a bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+		errmsg("cannot remove dependency on %s because it is a system object",
+			   getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	heap_close(depRel, RowExclusiveLock);
+
+	return count;
+}
+
+/*
  * isObjectPinned()
  *
  * Test if an object is required for basic database functionality.
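For reviewers, the behavior of the new changeDependencyForAll can be sketched with a self-contained toy that uses a flat array instead of a pg_depend scan. The function name and the pinned-object shortcut follow the patch; the DepRecord struct and everything else here are illustrative assumptions:

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;

/* Toy stand-in for a pg_depend row's (refclassid, refobjid) pair */
typedef struct
{
	Oid		refclassid;
	Oid		refobjid;
	bool	deleted;
} DepRecord;

/*
 * Toy changeDependencyForAll: repoint every record referencing
 * (refClassId, oldRefObjectId) at newRefObjectId.  If the new target is
 * pinned (always required by the system), just delete the record, since
 * pinned objects carry no explicit dependency rows.  Returns records touched.
 */
static long
change_dependency_for_all(DepRecord *deps, int ndeps,
						  Oid refClassId, Oid oldRefObjectId,
						  Oid newRefObjectId, bool newIsPinned)
{
	long	count = 0;

	for (int i = 0; i < ndeps; i++)
	{
		if (deps[i].deleted ||
			deps[i].refclassid != refClassId ||
			deps[i].refobjid != oldRefObjectId)
			continue;
		if (newIsPinned)
			deps[i].deleted = true;		/* CatalogTupleDelete in the patch */
		else
			deps[i].refobjid = newRefObjectId;	/* CatalogTupleUpdate */
		count++;
	}
	return count;
}
```

This is what lets index_concurrent_swap move every dependency of the old index over to the new one in a single pass.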
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 0e4231668d..96044663e9 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -332,9 +332,9 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 list_make2("chunk_id", "chunk_seq"),
 				 BTREE_AM_OID,
 				 rel->rd_rel->reltablespace,
-				 collationObjectId, classObjectId, coloptions, (Datum) 0,
+				 collationObjectId, classObjectId, coloptions, (Datum) 0, NULL,
 				 true, false, false, false,
-				 true, false, false, true, false);
+				 true, false, false, true, false, false);
 
 	heap_close(toast_rel, NoLock);
 
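The recheck loop inside the refactored WaitForOlderSnapshots can likewise be modeled standalone. One simplification to note: this toy rechecks against a single fresh snapshot list instead of re-fetching one per wait iteration, and it counts the waits instead of calling VirtualXactLock:

```c
#include <assert.h>

#define INVALID_VXID 0			/* stand-in for an invalid vxid */

static int
vxid_in(const int *set, int n, int vxid)
{
	for (int k = 0; k < n; k++)
		if (set[k] == vxid)
			return 1;
	return 0;
}

/*
 * Toy WaitForOlderSnapshots recheck: before blocking on each remaining old
 * vxid, consult the freshly fetched live set and forget any old vxid no
 * longer reported there (it went idle with xmin zero, or exited).  Returns
 * how many vxids we would actually have to wait on.
 */
static int
count_waits(int *old_snapshots, int n_old,
			const int *newer_snapshots, int n_newer)
{
	int		waits = 0;

	for (int i = 0; i < n_old; i++)
	{
		if (old_snapshots[i] == INVALID_VXID)
			continue;			/* found uninteresting in a previous cycle */

		/* see if anything's changed: forget vxids that disappeared */
		for (int j = i; j < n_old; j++)
			if (old_snapshots[j] != INVALID_VXID &&
				!vxid_in(newer_snapshots, n_newer, old_snapshots[j]))
				old_snapshots[j] = INVALID_VXID;

		if (old_snapshots[i] != INVALID_VXID)
			waits++;			/* VirtualXactLock(..., true) in the patch */
	}
	return waits;
}
```

The point of the recheck is purely to avoid needless sleeping; correctness only requires waiting out every vxid still present.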
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 265e9b33f7..f23e4a1c27 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -72,11 +72,13 @@ static void ComputeIndexAttrs(IndexInfo *indexInfo,
 				  bool isconstraint);
 static char *ChooseIndexName(const char *tabname, Oid namespaceId,
 				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint);
+				bool primary, bool isconstraint,
+				bool concurrent, bool concurrentold);
 static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 
 /*
  * CheckIndexCompatible
@@ -283,6 +285,87 @@ CheckIndexCompatible(Oid oldId,
 	return ret;
 }
 
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given xmin
+ * limit, because the new index might not contain tuples deleted just before
+ * the reference snapshot was taken.  Obtain a list of VXIDs of such
+ * transactions, and wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int i, n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue; /* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int n_newer_snapshots, j, k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue; /* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
 /*
  * DefineIndex
  *		Creates a new index.
@@ -320,7 +403,6 @@ DefineIndex(Oid relationId,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -331,9 +413,7 @@ DefineIndex(Oid relationId,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -474,7 +554,9 @@ DefineIndex(Oid relationId,
 											indexColNames,
 											stmt->excludeOpNames,
 											stmt->primary,
-											stmt->isconstraint);
+											stmt->isconstraint,
+											false,
+											false);
 
 	/*
 	 * look up the access method, verify it can handle the requested features
@@ -661,12 +743,12 @@ DefineIndex(Oid relationId,
 					 indexInfo, indexColNames,
 					 accessMethodId, tablespaceId,
 					 collationObjectId, classObjectId,
-					 coloptions, reloptions, stmt->primary,
+					 coloptions, reloptions, NULL, stmt->primary,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
 					 stmt->concurrent, !check_rights,
-					 stmt->if_not_exists);
+					 stmt->if_not_exists, false);
 
 	ObjectAddressSet(address, RelationRelationId, indexRelationId);
 
@@ -756,34 +838,15 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -841,74 +904,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-										 PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots)		/* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -1600,7 +1598,8 @@ ChooseRelationName(const char *name1, const char *name2,
 static char *
 ChooseIndexName(const char *tabname, Oid namespaceId,
 				List *colnames, List *exclusionOpNames,
-				bool primary, bool isconstraint)
+				bool primary, bool isconstraint,
+				bool concurrent, bool concurrentold)
 {
 	char	   *indexname;
 
@@ -1626,6 +1625,20 @@ ChooseIndexName(const char *tabname, Oid namespaceId,
 									   "key",
 									   namespaceId);
 	}
+	else if (concurrent)
+	{
+		indexname = ChooseRelationName(tabname,
+									   NULL,
+									   "cct",
+									   namespaceId);
+	}
+	else if (concurrentold)
+	{
+		indexname = ChooseRelationName(tabname,
+									   NULL,
+									   "ccto",
+									   namespaceId);
+	}
 	else
 	{
 		indexname = ChooseRelationName(tabname,
@@ -1738,7 +1751,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -1750,8 +1763,9 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
-									  false, false,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+									  concurrent, concurrent,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
 
@@ -1763,7 +1777,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 
 	return indOid;
 }
@@ -1832,18 +1849,27 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   concurrent, concurrent,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -1861,7 +1887,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -1989,19 +2015,24 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
-
-			if (options & REINDEXOPT_VERBOSE)
-				ereport(INFO,
-						(errmsg("table \"%s.%s\" was reindexed",
-								get_namespace_name(get_rel_namespace(relid)),
+
+		if (concurrent)
+			result = ReindexRelationConcurrently(relid, options);
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+		if (result && (options & REINDEXOPT_VERBOSE))
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							get_namespace_name(get_rel_namespace(relid)),
 								get_rel_name(relid))));
 		PopActiveSnapshot();
 		CommitTransactionCommand();
@@ -2010,3 +2041,581 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 
 	MemoryContextDelete(private_context);
 }
+
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for a given relation Oid.  The relation can be
+ * either an index or a table.  If a table is specified, each phase is
+ * processed one by one for all of the table's indexes, as well as its
+ * dependent toast indexes if the table has a toast relation defined.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *parentRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *concurrentIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc, *lc2;
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller.  If the relkind of the given Oid is
+	 * a table, all its valid indexes will be rebuilt, including the indexes
+	 * of its associated toast table.  If the relkind is an index, that index
+	 * itself will be rebuilt.  The locks taken on the parent relations and
+	 * the involved indexes are kept until this transaction is committed, to
+	 * protect against schema changes that might occur before a session lock
+	 * is taken on each relation; the session locks similarly protect against
+	 * schema changes during the multiple transactions used by this process.
+	 */
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes
+				 * including toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+						indexIds = lappend_oid(indexIds, cellOid);
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+												ShareUpdateExclusiveLock);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+							indexIds = lappend_oid(indexIds, cellOid);
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			parentOid = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(parentOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(parentOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(parentOid);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+					indexIds = list_make1_oid(relationOid);
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Report an error if the relation type is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We first need to create a new index which is based on the same data
+	 * as the former index, except that it will only be registered in the
+	 * catalogs and built later. It is possible to perform all these
+	 * operations at once for all the indexes of a parent relation, including
+	 * the indexes of its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId  *lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent relation, which might be a toast relation */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a relation name for the concurrent index */
+		concurrentName = ChooseIndexName(get_rel_name(indOid),
+										 get_rel_namespace(indexRel->rd_index->indrelid),
+										 NULL,
+										 NULL,
+										 false,
+										 false,
+										 true,
+										 false);
+
+		/* Create the concurrent index based on the given index */
+		concurrentOid = index_concurrent_create_copy(indexParentRel,
+													 indOid,
+													 concurrentName);
+
+		/*
+		 * Now open the relation of the concurrent index; a lock is also
+		 * needed on it.
+		 */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the concurrent index Oid */
+		concurrentIndexIds = lappend_oid(concurrentIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelid of each index relation to protect it from being
+		 * dropped, then close the relations. Each lockrelid is palloc'd so
+		 * that it survives this loop iteration. The lockrelid of the parent
+		 * relation is not saved here to avoid taking multiple locks on the
+		 * same relation; instead we rely on parentRelationIds built earlier.
+		 */
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the following visibility checks, as other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId  *lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		LOCKTAG		*heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Add the lockrelid of the parent relation to the list of locked relations */
+		*lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid->dbId, lockrelid->relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		/* Close the heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the indexes.
+	 * This will prevent them from making incompatible HOT updates. The new
+	 * indexes are marked as not ready and invalid so that no other
+	 * transaction will try to use them for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on each parent relation,
+	 * each old index and its concurrent copy, to ensure that none of them
+	 * are dropped until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the concurrent indexes in a separate transaction for each index
+	 * to avoid having open transactions for an unnecessarily long time. A
+	 * concurrent build is done for each concurrent index that will replace
+	 * an old index. Before doing that, we need to wait on the parent
+	 * relations until no running transaction could still have the parent
+	 * table of an index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		bool		primary;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * The index relation has been closed by the previous commit, so
+		 * reopen it to determine whether it is used as a primary key.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		primary = indexRel->rd_index->indisprimary;
+
+		/* Perform concurrent build of new index */
+		index_concurrent_build(indexRel->rd_index->indrelid,
+							   concurrentOid,
+							   primary);
+
+		/*
+		 * Close the index only now; its parent relation Oid was still
+		 * needed for the concurrent build above.
+		 */
+		index_close(indexRel, NoLock);
+
+		/* we can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update of the
+		 * concurrent index visible.
+		 */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the concurrent indexes catch up with any new tuples
+	 * that were created during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Perform a scan of each concurrent index with the heap, then insert
+	 * any missing index entries.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid				indOid = lfirst_oid(lc);
+		Oid				relOid;
+		TransactionId	limitXmin;
+		Snapshot		snapshot;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for the concurrent
+		 * index validation.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate the index, which might be a toast index */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * The concurrent index is now valid as it contains all the necessary
+		 * tuples. However, it might not have taken into account tuples
+		 * deleted before the reference snapshot was taken, so we need to
+		 * wait for the transactions that might have older snapshots than
+		 * ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction now that the concurrent index is valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated, it is necessary
+	 * to swap each concurrent index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes as invalid at
+	 * the same time to make sure we only get constraint violations from the
+	 * indexes with the correct names.
+	 */
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		char	   *oldName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/*
+		 * Each index needs to be swapped in a separate transaction, so start
+		 * a new one.
+		 */
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseIndexName(get_rel_name(indOid),
+								  get_rel_namespace(relOid),
+								  NULL,
+								  NULL,
+								  false,
+								  false,
+								  false,
+								  true);
+
+		/* Swap old index and its concurrent entry */
+		index_concurrent_swap(concurrentOid, indOid, oldName);
+
+		/* Swap which index is valid */
+		index_set_state_flags(indOid, INDEX_DROP_CLEAR_VALID);
+		index_set_state_flags(concurrentOid, INDEX_CREATE_SET_VALID);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/* Commit this transaction and make old index invalidation visible */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * The indexes now hold fresh relfilenodes taken from their respective
+	 * concurrent entries. It is time to mark the now-useless concurrent
+	 * entries as not ready so that they can be safely discarded from write
+	 * operations that may occur on them. One transaction is used for each
+	 * index entry.
+	 */
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		LOCKTAG	   *heapLockTag = NULL;
+		ListCell   *cell;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Find the locktag of the parent table for this index; we need to
+		 * wait for locks on it.
+		 */
+		foreach(cell, lockTags)
+		{
+			LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+			if (relOid == localTag->locktag_field2)
+				heapLockTag = localTag;
+		}
+		Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+		/*
+		 * Finish the index invalidation and set it as dead. Note that it is
+		 * necessary to wait for virtual locks on the parent relation before
+		 * setting the index as dead.
+		 */
+		index_concurrent_set_dead(relOid, indOid, *heapLockTag);
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the concurrent entries, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe as all the concurrent entries are already
+	 * considered invalid and not ready, so they will not be used by other
+	 * backends for any read or write operation.
+	 */
+	foreach(lc, indexIds)
+	{
+		Oid 		indOid = lfirst_oid(lc);
+		Oid			relOid;
+		LOCKTAG	   *heapLockTag = NULL;
+		ListCell   *cell;
+
+		/* Check for any process interruption */
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start transaction to drop this index */
+		StartTransactionCommand();
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Get fresh snapshot for next step */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Find the locktag of the parent table for this index; we need to
+		 * wait for locks on it.
+		 */
+		foreach(cell, lockTags)
+		{
+			LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+			if (relOid == localTag->locktag_field2)
+				heapLockTag = localTag;
+		}
+		Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+		/*
+		 * Wait till every transaction that saw the old index state has
+		 * finished.
+		 */
+		WaitForLockers(*heapLockTag, AccessExclusiveLock);
+
+		/* Drop the concurrent entry, nobody can be using it anymore */
+		index_concurrent_drop(indOid);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/* Commit this transaction to make the update visible. */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * The last thing to do is release the session-level locks on the parent
+	 * tables and their indexes.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Start a new transaction to finish the process properly */
+	StartTransactionCommand();
+
+	/* Get a fresh snapshot for the end of the process */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	return true;
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 37a4c4a3d6..8cf2c9ae1d 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1084,6 +1084,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	char		expected_relkind;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1131,7 +1132,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check the case of a system index that might have been invalidated by a
+	 * failed concurrent process and allow its drop. For the time being, this
+	 * only concerns indexes of toast relations that became invalid during a
+	 * REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 30d733e57a..d814b96c7c 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4066,6 +4066,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 55c73b7292..7601593e7b 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2026,6 +2026,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 174773bdf3..06d8daeef1 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7922,42 +7922,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 5d3be38bf5..9b0b950dca 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -773,16 +773,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -798,7 +802,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												(stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												(stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												"REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 5349c39411..553b25a499 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1897,6 +1897,23 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY isn't allowed in
+			 * xacts.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
+
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6e759d0b76..2617a48c0b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2932,12 +2932,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches1("REINDEX"))
 		COMPLETE_WITH_LIST5("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches2("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches3("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+	else if (Matches3("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches3("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches3("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index b48f839028..b7120c6702 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -93,6 +93,8 @@ extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 				   TupleDesc src, AttrNumber srcAttno);
 
+extern void ResetTupleDescCache(TupleDesc tupdesc);
+
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 10759c7c58..60d5c7c9ee 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -235,6 +235,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependencyForAll(Oid refClassId, Oid oldRefObjectId,
+								   Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, Oid *tableId, int32 *colId);
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 20bec90b9d..19d42e000b 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -16,6 +16,7 @@
 
 #include "catalog/objectaddress.h"
 #include "nodes/execnodes.h"
+#include "storage/lock.h"
 
 
 #define DEFAULT_INDEX_TYPE	"btree"
@@ -54,6 +55,7 @@ extern Oid index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -62,7 +64,26 @@ extern Oid index_create(Relation heapRelation,
 			 bool skip_build,
 			 bool concurrent,
 			 bool is_internal,
-			 bool if_not_exists);
+			 bool if_not_exists,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create_copy(Relation heapRelation,
+										Oid indOid,
+										const char *newName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid,
+								  Oid oldIndexOid,
+								  const char *oldName);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid,
+									  LOCKTAG locktag);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern ObjectAddress index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 8740cee944..e73432900a 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -29,10 +29,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_rights,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern Oid	ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 07a8436143..5861587826 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3111,6 +3111,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 2606a27624..7a11eca488 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -38,6 +38,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index e519fdb0f6..83a800618a 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3072,3 +3072,61 @@ DROP ROLE regress_reindexuser;
 SET client_min_messages TO 'warning';
 DROP SCHEMA schema_to_reindex CASCADE;
 RESET client_min_messages;
+RESET search_path;
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  can only reindex the currently open database
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+         Table "public.concur_reindex_tab"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ c1     | integer |           | not null | 
+ c2     | text    |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 1648072568..89bb3974d6 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1086,3 +1086,46 @@ DROP ROLE regress_reindexuser;
 SET client_min_messages TO 'warning';
 DROP SCHEMA schema_to_reindex CASCADE;
 RESET client_min_messages;
+RESET search_path;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
#42Michael Paquier
michael.paquier@gmail.com
In reply to: Andreas Karlsson (#41)
Re: REINDEX CONCURRENTLY 2.0

On Sun, Feb 12, 2017 at 6:44 AM, Andreas Karlsson <andreas@proxel.se> wrote:

On 02/02/2015 03:10 PM, Andres Freund wrote:

I think we should instead just use the new index, repoint the
dependencies onto the new oid, and then afterwards, when dropping,
rename the new index onto the old one. That means the oid of the
index will change and requires some less than pretty grovelling around
dependencies, but it still seems preferable to what we're discussing
here otherwise.

I think that sounds like a good plan. The oid change does not seem like a
too big deal to me, especially since that is what users will get now too. Do
you still think this is the right way to solve this?

That hurts mainly system indexes. Perhaps users with broken system
indexes are not going to care about concurrency anyway. Thinking about
it now, I don't see how that would not work, but I have not thought
deeply about this problem lately.

I have attached my work in progress patch which implements and is very
heavily based on Michael's previous work. There are some things left to do
but I think I should have a patch ready for the next commitfest if people
still like this type of solution.

Cool to see a rebase of this patch. It's been a long time...

I also changed index_set_state_flags() to be transactional since I wanted
the old index to become invalid at exactly the same time as the new becomes
valid. From reviewing the code that seems like a safe change.

A couple of bike shedding questions:

- Is the syntax "REINDEX <type> CONCURRENTLY <object>" ok?

Yeah, that's fine. At least that's what has been concluded in previous threads.

- What should we do with REINDEX DATABASE CONCURRENTLY and the system
catalog? I do not think we can reindex the system catalog concurrently
safely, so what should REINDEX DATABASE do with the catalog indexes? Skip
them, reindex them while taking locks, or just error out?

System indexes cannot have their OIDs changed as they are used in
syscache lookups. So just logging a warning looks fine to me, and the
price to pay to avoid taking an exclusive lock even for a short amount
of time.

- What level of information should be output in VERBOSE mode?

Er, something like that as well, no?
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

What remains to be implemented:
- Support for exclusion constraints
- Look more into how I handle constraints (currently the temporary index also
seems to have the PRIMARY KEY flag)
- Support for the VERBOSE flag
- More testing to catch bugs

This is a crasher:
create table aa (a int primary key);
reindex (verbose) schema concurrently public ;

For invalid indexes sometimes snapshots are still active (after
issuing the previous crash for example):
=# reindex (verbose) table concurrently aa;
WARNING: XX002: cannot reindex concurrently invalid index
"public.aa_pkey_cct", skipping
LOCATION: ReindexRelationConcurrently, indexcmds.c:2119
WARNING: 01000: snapshot 0x7fde12003038 still active
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#43Andreas Karlsson
andreas@proxel.se
In reply to: Michael Paquier (#42)
Re: REINDEX CONCURRENTLY 2.0

On 02/13/2017 06:31 AM, Michael Paquier wrote:

- What should we do with REINDEX DATABASE CONCURRENTLY and the system
catalog? I so not think we can reindex the system catalog concurrently
safely, so what should REINDEX DATABASE do with the catalog indexes? Skip
them, reindex them while taking locks, or just error out?

System indexes cannot have their OIDs changed as they are used in
syscache lookups. So just logging a warning looks fine to me, and the
price to pay to avoid taking an exclusive lock even for a short amount
of time.

Good idea, I think I will add one line of warning if it finds any system
index in the schema.

- What level of information should be output in VERBOSE mode?

Er, something like that as well, no?
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

REINDEX (VERBOSE) currently prints one such line per index, which does
not really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all
indexes on a relation at the same time. It is not immediately obvious
how this should work. Maybe one such detail line per table?

This is a crasher:
create table aa (a int primary key);
reindex (verbose) schema concurrently public ;

For invalid indexes sometimes snapshots are still active (after
issuing the previous crash for example):
=# reindex (verbose) table concurrently aa;
WARNING: XX002: cannot reindex concurrently invalid index
"public.aa_pkey_cct", skipping
LOCATION: ReindexRelationConcurrently, indexcmds.c:2119
WARNING: 01000: snapshot 0x7fde12003038 still active

Thanks for testing the patch! The crash was caused by things being
allocated in the wrong memory context when reindexing multiple tables
and therefore freed on the first intermediate commit. I have created a
new memory context to handle this in which I only allocate the lists
which need to survive between transactions.

Hm, when writing the above I just realized why ReindexTable/ReindexIndex
did not suffer from the same bug. It is because the first transaction
there allocated in the PortalHeapMemory context which survives commit. I
really need to look at if there is a clean way to handle memory contexts
in my patch.
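
The pattern would be roughly this (a sketch against the backend's
MemoryContext API; the context name and the indexIds variable are made
up for illustration):

```c
/*
 * Sketch: create a private context parented on PortalContext so that
 * allocations survive the intermediate commits. "ReindexConcurrently"
 * and indexIds are hypothetical names.
 */
MemoryContext reindex_context = AllocSetContextCreate(PortalContext,
                                                      "ReindexConcurrently",
                                                      ALLOCSET_DEFAULT_SIZES);
MemoryContext oldcontext = MemoryContextSwitchTo(reindex_context);

/* lists built here live until the context is deleted, not until commit */
indexIds = lappend_oid(indexIds, indexOid);

MemoryContextSwitchTo(oldcontext);
```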

I also found the snapshot still active bug, it seems to have been caused
by REINDEX TABLE CONCURRENTLY leaving an open snapshot which cannot be
popped by PortalRunUtility().

Thanks again!
Andreas


#44Michael Paquier
michael.paquier@gmail.com
In reply to: Andreas Karlsson (#43)
Re: REINDEX CONCURRENTLY 2.0

On Tue, Feb 14, 2017 at 11:32 AM, Andreas Karlsson <andreas@proxel.se> wrote:

On 02/13/2017 06:31 AM, Michael Paquier wrote:

Er, something like that as well, no?
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

REINDEX (VERBOSE) currently prints one such line per index, which does not
really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes
on a relation at the same time. It is not immediately obvious how this
should work. Maybe one such detail line per table?

Hard to recall the details after all this time, given that a
relation is reindexed by processing all the indexes once at each step.
Hm... What if ReindexRelationConcurrently() actually is refactored in
such a way that it processes all the steps for each index
individually? This way you can monitor the time it takes to build
completely each index, including its . This operation would consume
more transactions but in the event of a failure the amount of things
to clean up is really reduced particularly for relations with many
indexes. This would as well reduce VERBOSE to print one line per index
rebuilt.
--
Michael


#45Michael Paquier
michael.paquier@gmail.com
In reply to: Michael Paquier (#44)
Re: REINDEX CONCURRENTLY 2.0

On Tue, Feb 14, 2017 at 12:56 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

This way you can monitor the time it takes to build
completely each index, including its .

You can ignore the terms "including its" here.
--
Michael


#46Andreas Karlsson
andreas@proxel.se
In reply to: Michael Paquier (#44)
Re: REINDEX CONCURRENTLY 2.0

On 02/14/2017 04:56 AM, Michael Paquier wrote:

On Tue, Feb 14, 2017 at 11:32 AM, Andreas Karlsson <andreas@proxel.se> wrote:

On 02/13/2017 06:31 AM, Michael Paquier wrote:

Er, something like that as well, no?
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

REINDEX (VERBOSE) currently prints one such line per index, which does not
really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes
on a relation at the same time. It is not immediately obvious how this
should work. Maybe one such detail line per table?

Hard to recall this thing in details with the time and the fact that a
relation is reindexed by processing all the indexes once at each step.
Hm... What if ReindexRelationConcurrently() actually is refactored in
such a way that it processes all the steps for each index
individually? This way you can monitor the time it takes to build
completely each index, including its . This operation would consume
more transactions but in the event of a failure the amount of things
to clean up is really reduced particularly for relations with many
indexes. This would as well reduce VERBOSE to print one line per index
rebuilt.

I am actually thinking about going the opposite direction (by reducing
the number of times we call WaitForLockers), because it is not just
about consuming transaction IDs, we also do not want to wait too many
times for transactions to commit. I am leaning towards only calling
WaitForLockersMultiple three times per table.

1. Between building and validating the new indexes.
2. Between setting the old indexes to invalid and setting them to dead
3. Between setting the old indexes to dead and dropping them

Right now my patch loops over the indexes in step 2 and 3 and waits for
lockers once per index. This seems rather wasteful.

I have thought about that the code might be cleaner if we just looped
over all indexes (and as a bonus the VERBOSE output would be more
obvious), but I do not think it is worth waiting for lockers all those
extra times.
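
In code terms that would be something like the following (sketch only;
heapLockTags would be the list of LOCKTAGs for the parent tables, and
the lock modes shown are illustrative, not final):

```
/* 1. between building and validating the new indexes */
WaitForLockersMultiple(heapLockTags, ShareLock);

/* 2. between marking the old indexes invalid and marking them dead */
WaitForLockersMultiple(heapLockTags, AccessExclusiveLock);

/* 3. between marking the old indexes dead and dropping them */
WaitForLockersMultiple(heapLockTags, AccessExclusiveLock);
```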

Andreas


#47Andreas Karlsson
andreas@proxel.se
In reply to: Andreas Karlsson (#46)
Re: REINDEX CONCURRENTLY 2.0

On 02/17/2017 01:53 PM, Andreas Karlsson wrote:

I am actually thinking about going the opposite direction (by reducing
the number of times we call WaitForLockers), because it is not just
about consuming transaction IDs, we also do not want to wait too many
times for transactions to commit. I am leaning towards only calling
WaitForLockersMultiple three times per table.

1. Between building and validating the new indexes.
2. Between setting the old indexes to invalid and setting them to dead
3. Between setting the old indexes to dead and dropping them

Right now my patch loops over the indexes in step 2 and 3 and waits for
lockers once per index. This seems rather wasteful.

I have thought about that the code might be cleaner if we just looped
over all indexes (and as a bonus the VERBOSE output would be more
obvious), but I do not think it is worth waiting for lockers all those
extra times.

Thinking about this makes me wonder about why you decided to use a
transaction per index in many of the steps rather than a transaction per
step. Most steps should be quick. The only steps where I think it makes
sense to have a transaction per index are.

1) When building indexes to avoid long running transactions.

2) When validating the new indexes, also to avoid long running transactions.

But when swapping the indexes or when dropping the old indexes I do not
see any reason to not just use one transaction per step since we do not
even have to wait for any locks (other than WaitForLockers which we just
want to call once anyway since all indexes relate to the same table).

Andreas


#48Michael Paquier
michael.paquier@gmail.com
In reply to: Andreas Karlsson (#47)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Feb 17, 2017 at 10:43 PM, Andreas Karlsson <andreas@proxel.se> wrote:

Thinking about this makes me wonder about why you decided to use a
transaction per index in many of the steps rather than a transaction per
step. Most steps should be quick. The only steps where I think it makes
sense to have a transaction per index are.

I don't recall all the details to be honest :)

1) When building indexes to avoid long running transactions.
2) When validating the new indexes, also to avoid long running transactions.

But when swapping the indexes or when dropping the old indexes I do not see
any reason to not just use one transaction per step since we do not even
have to wait for any locks (other than WaitForLockers which we just want to
call once anyway since all indexes relate to the same table).

Perhaps, this really needs a careful lookup.

By the way, as this patch is showing up for the first time in this
development cycle, would it be allowed in the last commit fest? That's
not a patch in the easy category, far from that, but it does not
present a new concept.
--
Michael


#49Bruce Momjian
bruce@momjian.us
In reply to: Michael Paquier (#48)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Feb 17, 2017 at 11:05:31PM +0900, Michael Paquier wrote:

On Fri, Feb 17, 2017 at 10:43 PM, Andreas Karlsson <andreas@proxel.se> wrote:

Thinking about this makes me wonder about why you decided to use a
transaction per index in many of the steps rather than a transaction per
step. Most steps should be quick. The only steps where I think it makes
sense to have a transaction per index are.

I don't recall all the details to be honest :)

1) When building indexes to avoid long running transactions.
2) When validating the new indexes, also to avoid long running transactions.

But when swapping the indexes or when dropping the old indexes I do not see
any reason to not just use one transaction per step since we do not even
have to wait for any locks (other than WaitForLockers which we just want to
call once anyway since all indexes relate to the same table).

Perhaps, this really needs a careful lookup.

By the way, as this patch is showing up for the first time in this
development cycle, would it be allowed in the last commit fest? That's
not a patch in the easy category, far from that, but it does not
present a new concept.

FYI, the thread started on 2013-11-15.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


#50Michael Paquier
michael.paquier@gmail.com
In reply to: Bruce Momjian (#49)
Re: REINDEX CONCURRENTLY 2.0

On Tue, Feb 28, 2017 at 5:29 AM, Bruce Momjian <bruce@momjian.us> wrote:

On Fri, Feb 17, 2017 at 11:05:31PM +0900, Michael Paquier wrote:

On Fri, Feb 17, 2017 at 10:43 PM, Andreas Karlsson <andreas@proxel.se> wrote:

Thinking about this makes me wonder about why you decided to use a
transaction per index in many of the steps rather than a transaction per
step. Most steps should be quick. The only steps where I think it makes
sense to have a transaction per index are.

I don't recall all the details to be honest :)

1) When building indexes to avoid long running transactions.
2) When validating the new indexes, also to avoid long running transactions.

But when swapping the indexes or when dropping the old indexes I do not see
any reason to not just use one transaction per step since we do not even
have to wait for any locks (other than WaitForLockers which we just want to
call once anyway since all indexes relate to the same table).

Perhaps, this really needs a careful lookup.

By the way, as this patch is showing up for the first time in this
development cycle, would it be allowed in the last commit fest? That's
not a patch in the easy category, far from that, but it does not
present a new concept.

FYI, the thread started on 2013-11-15.

I don't object to the addition of this patch in the next CF as it
presents no new concept. Still, because it is a complicated patch, I
was wondering if people are fine with including it in this last CF.
--
Michael


#51Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#50)
Re: REINDEX CONCURRENTLY 2.0

Michael Paquier <michael.paquier@gmail.com> writes:

I don't object to the addition of this patch in the next CF as it
presents no new concept. Still, because it is a complicated patch, I
was wondering if people are fine with including it in this last CF.

The March CF is already looking pretty daunting. We can try to include
this but I won't be too surprised if it gets punted to a future CF.

regards, tom lane


#52Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#51)
Re: REINDEX CONCURRENTLY 2.0

On Mon, Feb 27, 2017 at 05:31:21PM -0500, Tom Lane wrote:

Michael Paquier <michael.paquier@gmail.com> writes:

I don't object to the addition of this patch in next CF as this
presents no new concept. Still per the complications this patch and
because it is a complicated patch I was wondering if people are fine
to include it in this last CF.

The March CF is already looking pretty daunting. We can try to include
this but I won't be too surprised if it gets punted to a future CF.

Yeah, that was my reaction too.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


#53Andreas Karlsson
andreas@proxel.se
In reply to: Michael Paquier (#1)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

Hi,

Here is a third take on this feature, heavily based on Michael Paquier's
2.0 patch. This time the patch does not attempt to preserve the index
oids, but instead creates new indexes and moves all dependencies from
the old indexes to the new before dropping the old ones. The only
downside I can see to this approach is that we will no longer be able to
reindex catalog tables concurrently, but in return it should be easier
to confirm that this approach can be made to work.

This patch relies on that we can change the indisvalid flag of indexes
transactionally, and as far as I can tell this is the case now that we
have MVCC for the catalog updates.

The code does some extra intermediate commits when building the indexes
to avoid long running transactions.

How REINDEX CONCURRENTLY operates:

For each table:

1. Create new indexes without populating them, and lock the tables and
indexes for the session.

2. After waiting for all running transactions populate each index in a
separate transaction and set them to ready.

3. After waiting again for all running transactions validate each index
in a separate transaction (but not setting them to valid just yet).

4. Swap all dependencies over from each old index to the new index and
rename the old and the new indexes (from <name> to <name>_ccold and from
<name>_new to <name>), and set the isprimary and isexclusion flags. Here we
also mark the new indexes as valid and the old indexes as invalid.

5. After waiting for all running transactions we change each index from
invalid to dead.

6. After waiting for all running transactions we drop each index.

7. Drop all session locks.
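
For comparison, the closest manual equivalent today for a plain index
(hypothetical names; this does not work for indexes backing constraints,
which the steps above do handle) is:

```sql
CREATE INDEX CONCURRENTLY reind_con_ind_new ON reind_con_tab (data);
DROP INDEX CONCURRENTLY reind_con_ind;
ALTER INDEX reind_con_ind_new RENAME TO reind_con_ind;
```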

Andreas

Attachments:

reindex-concurrenctly-v1.patchtext/x-diff; name=reindex-concurrenctly-v1.patchDownload
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 306def4a15..ca1aeca65f 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -923,7 +923,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
-         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
+         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
+         <command>REINDEX CONCURRENTLY</>,
          <command>ALTER TABLE VALIDATE</command> and other
          <command>ALTER TABLE</command> variants (for full details see
          <xref linkend="SQL-ALTERTABLE">).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..3449c0af73 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
       An index build with the <literal>CONCURRENTLY</> option failed, leaving
       an <quote>invalid</> index. Such indexes are useless but it can be
       convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      <command>REINDEX</> will perform a concurrent build if <literal>
+      CONCURRENTLY</> is specified. To build the index without interfering
+      with production you should drop the index and reissue either the
+      <command>CREATE INDEX CONCURRENTLY</> or <command>REINDEX CONCURRENTLY</>
+      command. Indexes of toast relations can be rebuilt with <command>REINDEX
+      CONCURRENTLY</>.
      </para>
     </listitem>
 
@@ -152,6 +155,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
    </varlistentry>
 
    <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
      <para>
@@ -231,6 +249,172 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
    reindex anything.
   </para>
 
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</> option of <command>REINDEX</>. When this option
+    is used, <productname>PostgreSQL</> must perform two scans of the table
+      for each index that needs to be rebuilt, and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index rebuild, each in a
+    separate transaction, except for the creation of the new index
+    definitions, which are all done within a single transaction. Note that
+    if there are multiple indexes to be rebuilt, then each step loops through
+    all the indexes being rebuilt, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition will be used to replace the
+       old index. This step is done as a single transaction for all the indexes
+       involved in this process, meaning that if
+       <command>REINDEX CONCURRENTLY</> is run on a table with multiple
+       indexes, all the catalog entries of the new indexes are created within a
+       single transaction. A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as their
+       parent table, to prevent any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_index.indisready</> is
+       switched to <quote>true</> to make it ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add entries for tuples that were
+       inserted while the first pass was running. This step is performed
+       within a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       All the constraints and foreign keys which refer to the index are
+       swapped to refer to the new index definition, and the names of the
+       indexes are changed. At this point <literal>pg_index.indisvalid</> is
+       switched to <quote>true</> for the new index and to <quote>false</> for
+       the old one, and a cache invalidation is done so that all the sessions
+       that referenced the old index refresh their relation caches. This step
+       is done within a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       After waiting for running queries that might reference the old index
+       to complete, the old indexes have <literal>pg_index.indisready</>
+       switched to <quote>false</> to prevent any new tuple insertions. This
+       step is done within a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The old index definition and its data are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
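+   <para>
+    The intermediate states of the steps above can be observed from another
+    session while the rebuild is in progress, for instance with a query on
+    <structname>pg_index</> like the following sketch (the table name
+    <literal>tab</> is illustrative only):
+
+<programlisting>
+-- Illustrative only: while REINDEX CONCURRENTLY runs in another session,
+-- show the flags of both the old and the transient index entries of "tab".
+SELECT indexrelid::regclass AS index_name,
+       indisready, indisvalid, indislive
+  FROM pg_index
+ WHERE indrelid = 'tab'::regclass;
+</programlisting>
+   </para>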
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid
+    index and try again to perform <command>REINDEX CONCURRENTLY</>.
+    The concurrent index created during the processing has a name ending in
+    the suffix <literal>ccnew</>, or <literal>ccold</> if it is an old index
+    definition which we failed to drop. Invalid indexes can be dropped using
+    <literal>DROP INDEX</>, including invalid toast indexes.
+   </para>
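+   <para>
+    For example, assuming the invalid index shown above, recovery could look
+    like this (the names are those of the example, not fixed identifiers):
+
+<programlisting>
+-- Drop the invalid leftover index, then retry the concurrent rebuild.
+DROP INDEX idx_cct;
+REINDEX TABLE CONCURRENTLY tab;
+</programlisting>
+   </para>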
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other type of schema
+    modification on the table is allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot.
+   </para>
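+   <para>
+    As a sketch of the last point, attempting the concurrent form inside a
+    transaction block is expected to fail (the exact error text may differ):
+
+<programlisting>
+BEGIN;
+REINDEX TABLE CONCURRENTLY tab;  -- fails: cannot run inside a transaction block
+ROLLBACK;
+</programlisting>
+   </para>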
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>. The valid index of a toast
+    relation, being the only one for that relation, cannot be dropped.
+   </para>
+
+   <para>
+    <command>REINDEX</command> takes an <literal>ACCESS EXCLUSIVE</literal>
+    lock on all the relations involved during the operation. When
+    <command>CONCURRENTLY</command> is specified, the operation is done with
+    only a <literal>SHARE UPDATE EXCLUSIVE</literal> lock.
+   </para>
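+   <para>
+    The reduced lock level can be checked from another session through
+    <structname>pg_locks</>; for instance (table name illustrative):
+
+<programlisting>
+-- Illustrative only: list the locks taken on "tab" while a concurrent
+-- rebuild is running in another session.
+SELECT locktype, relation::regclass, mode, granted
+  FROM pg_locks
+ WHERE relation = 'tab'::regclass;
+</programlisting>
+   </para>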
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command>.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -262,7 +446,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild the indexes of a table, while allowing read and write operations
+   on the relations involved while the rebuild runs:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 4e2ebe1ae7..2f93d3e954 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -260,6 +260,18 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 }
 
 /*
+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+	int i;
+
+	for (i = 0; i < tupdesc->natts; i++)
+		tupdesc->attrs[i]->attcacheoff = -1;
+}
+
+/*
  * Free a TupleDesc including all substructure
  */
 void
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f8d92145e8..7fc3344121 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -677,6 +677,7 @@ UpdateIndexRelation(Oid indexoid,
  * classObjectId: array of index opclass OIDs, one per index column
  * coloptions: array of per-index-column indoption settings
  * reloptions: AM-specific options
+ * tupdesc: Tuple descriptor used for the index if defined
  * isprimary: index is a PRIMARY KEY
  * isconstraint: index is owned by PRIMARY KEY, UNIQUE, or EXCLUSION constraint
  * deferrable: constraint is DEFERRABLE
@@ -690,6 +691,10 @@ UpdateIndexRelation(Oid indexoid,
  * is_internal: if true, post creation hook for new index
  * if_not_exists: if true, do not throw an error if a relation with
  *		the same name already exists.
+ * is_reindex: if true, create the index as a duplicate of an existing
+ *		index, as done during a concurrent reindex operation. The index can
+ *		also be a toast index. Sufficient locks are normally already taken on
+ *		the related relations when this is called during a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -706,6 +711,7 @@ index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -714,7 +720,8 @@ index_create(Relation heapRelation,
 			 bool skip_build,
 			 bool concurrent,
 			 bool is_internal,
-			 bool if_not_exists)
+			 bool if_not_exists,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -760,16 +767,19 @@ index_create(Relation heapRelation,
 	 * release locks before committing in catalogs
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemNamespace(get_rel_namespace(heapRelationId)))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
 
 	/*
-	 * This case is currently not supported, but there's no way to ask for it
-	 * in the grammar anyway, so it can't happen.
+	 * This case is currently only supported during a concurrent index
+	 * rebuild; there is no way to ask for it in the grammar otherwise
+	 * anyway. If support for exclusion constraints is added in the future,
+	 * the similar check in check_exclusion_constraint should be changed
+	 * accordingly as well.
 	 */
-	if (concurrent && is_exclusion)
+	if (concurrent && is_exclusion && !is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -808,14 +818,21 @@ index_create(Relation heapRelation,
 	}
 
 	/*
-	 * construct tuple descriptor for index tuples
+	 * Construct a tuple descriptor for the index tuples, unless one was
+	 * passed by the caller.
 	 */
-	indexTupDesc = ConstructTupleDescriptor(heapRelation,
-											indexInfo,
-											indexColNames,
-											accessMethodObjectId,
-											collationObjectId,
-											classObjectId);
+	if (tupdesc == NULL)
+		indexTupDesc = ConstructTupleDescriptor(heapRelation,
+												indexInfo,
+												indexColNames,
+												accessMethodObjectId,
+												collationObjectId,
+												classObjectId);
+	else
+	{
+		Assert(indexColNames == NIL);
+		indexTupDesc = tupdesc;
+	}
 
 	/*
 	 * Allocate an OID for the index, unless we were told what to use.
@@ -1123,6 +1140,404 @@ index_create(Relation heapRelation,
 }
 
 /*
+ * index_concurrent_create_copy
+ *
+ * Create a new index based on the definition of the index provided by the
+ * caller, to be used for concurrent operations. The index is inserted into
+ * the catalogs and needs to be built later on. This is called during
+ * concurrent reindex processing. The heap relation on which the index is
+ * based needs to be closed by the caller.
+ */
+Oid
+index_concurrent_create_copy(Relation heapRelation, Oid indOid, const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	HeapTuple	indexTuple, classTuple;
+	Datum		indclassDatum, colOptionDatum, optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	bool		initdeferred = false;
+	Oid			constraintOid = get_index_constraint(indOid);
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* Concurrent index uses the same index information as former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/* Do not copy exclusion constraint */
+	indexInfo->ii_ExclusionOps = NULL;
+	indexInfo->ii_ExclusionProcs = NULL;
+	indexInfo->ii_ExclusionStrats = NULL;
+
+	/*
+	 * Determine if index is initdeferred, this depends on its dependent
+	 * constraint.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		/* Look for the correct value */
+		HeapTuple			constraintTuple;
+		Form_pg_constraint	constraintForm;
+
+		constraintTuple = SearchSysCache1(CONSTROID,
+									 ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "cache lookup failed for constraint %u",
+				 constraintOid);
+		constraintForm = (Form_pg_constraint) GETSTRUCT(constraintTuple);
+		initdeferred = constraintForm->condeferred;
+
+		ReleaseSysCache(constraintTuple);
+	}
+
+	/*
+	 * Create a copy of the tuple descriptor to be used for the concurrent
+	 * entry and reset any cache counters on it to have a fresh version.
+	 */
+	indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
+	ResetTupleDescCache(indexTupDesc);
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, indOid);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 newName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 NIL,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexTupDesc,
+								 false, /* do not copy primary flag */
+								 false,	/* is constraint? */
+								 !indexRelation->rd_index->indimmediate,	/* is deferrable? */
+								 initdeferred,	/* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false,	/* is_internal? */
+								 false, /* if_not_exists? */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
+
+/*
+ * index_concurrent_build
+ *
+ * Build an index for a concurrent operation. The low-level locks taken here
+ * prevent only schema changes, and they need to be kept until the end of the
+ * transaction performing this operation.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	heapRel, indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost at the
+	 * commit of the transaction in which this concurrent index was created
+	 * at the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both relations, and keep the locks */
+	heap_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts. Once we
+	 * commit this transaction, any new transactions that open the table must
+	 * insert new entries into the index for insertions and non-HOT updates.
+	 */
+	index_set_state_flags(indexOid, INDEX_CREATE_SET_READY);
+}
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap name, dependencies and constraints of the old index over to the new
+ * index.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid, const char *oldName)
+{
+	Relation		pg_class, pg_index, oldClassRel, newClassRel;
+	HeapTuple		oldClassTuple, newClassTuple;
+	Form_pg_class	oldClassForm, newClassForm;
+	HeapTuple		oldIndexTuple, newIndexTuple;
+	Form_pg_index	oldIndexForm, newIndexForm;
+	Oid				constraintOid = get_index_constraint(oldIndexOid);
+
+	/*
+	 * Take the necessary locks on the old and new indexes before swapping them.
+	 */
+	oldClassRel = relation_open(oldIndexOid, ShareUpdateExclusiveLock);
+	newClassRel = relation_open(newIndexOid, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+
+	oldClassForm = (Form_pg_class) GETSTRUCT(oldClassTuple);
+	newClassForm = (Form_pg_class) GETSTRUCT(newClassTuple);
+
+	/* Swap the names */
+	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
+	namestrcpy(&oldClassForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
+	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
+
+	/* Now swap index info */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+
+	oldIndexForm = (Form_pg_index) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
+
+	/*
+	 * Copy constraint flags from the old index. This is safe because the old
+	 * index guaranteed uniqueness.
+	 */
+	newIndexForm->indisprimary = oldIndexForm->indisprimary;
+	oldIndexForm->indisprimary = false;
+	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+	oldIndexForm->indisexclusion = false;
+
+	/* Mark the new index as valid and the old one as invalid, as index_set_state_flags would */
+	newIndexForm->indisvalid = true;
+	oldIndexForm->indisvalid = false;
+	oldIndexForm->indisclustered = false;
+
+	CatalogTupleUpdate(pg_index, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_index, &newIndexTuple->t_self, newIndexTuple);
+
+	if (OidIsValid(constraintOid))
+	{
+		ObjectAddress	myself, referenced;
+		Relation		pg_constraint;
+		HeapTuple		constraintTuple;
+
+		pg_constraint = heap_open(ConstraintRelationId, RowExclusiveLock);
+
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		((Form_pg_constraint) GETSTRUCT(constraintTuple))->conindid = newIndexOid;
+
+		CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+
+		heap_freetuple(constraintTuple);
+		heap_close(pg_constraint, RowExclusiveLock);
+
+		deleteDependencyRecordsForClass(RelationRelationId, newIndexOid,
+										RelationRelationId, DEPENDENCY_AUTO);
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		/* TODO: handle the pg_depend entries of the old index? */
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexOid;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = constraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependencyForAll(RelationRelationId, oldIndexOid, newIndexOid);
+
+	/* Close relations and clean up */
+	heap_freetuple(oldClassTuple);
+	heap_freetuple(newClassTuple);
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+	heap_close(pg_class, RowExclusiveLock);
+	heap_close(pg_index, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldClassRel, NoLock);
+	relation_close(newClassRel, NoLock);
+}
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all the backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction doing calling this
+ * function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRelation, indexRelation;
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're
+	 * about to stop doing inserts into the index which could show
+	 * conflicts with existing predicate locks, so now is the time to move
+	 * them to the heap relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just
+	 * might have it open for updating it.  So now we can unset indisready
+	 * and indislive, then wait till nobody could be using it at all
+	 * anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit
+	 * all sessions will refresh the table's index list.  Forgetting just
+	 * the index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index concurrently, as the last step of a concurrent index
+ * process. Deletion is done through performDeletion, or the dependencies of
+ * the index would not get dropped. At this point the index is already
+ * considered invalid and dead, so it can be dropped without using any
+ * concurrent option, as it is certain that it will not interact with other
+ * server sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid				constraintOid = get_index_constraint(indexOid);
+	ObjectAddress	object;
+	Form_pg_index	indexForm;
+	Relation		pg_index;
+	HeapTuple		indexTuple;
+
+	/*
+	 * Check that the index dropped here is not alive; if it were, it might
+	 * still be used by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, just to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process.
+	 * Register constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object, DROP_RESTRICT, 0);
+}
+
+/*
  * index_constraint_create
  *
  * Set up a constraint associated with an index.  Return the new constraint's
@@ -1483,41 +1898,13 @@ index_drop(Oid indexId, bool concurrent)
 		 * Note: the reason we use actual lock acquisition here, rather than
 		 * just checking the ProcArray and sleeping, is that deadlock is
 		 * possible if one of the transactions in question is blocked trying
-		 * to acquire an exclusive lock on our table.  The lock code will
+		 * to acquire an exclusive lock on our table. The lock code will
 		 * detect deadlock and error out properly.
 		 */
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index d0ee851215..e294e7e313 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -377,6 +377,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 }
 
 /*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependencyForAll(Oid refClassId, Oid oldRefObjectId,
+					   Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = heap_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't rename stuff out
+	 * of pg_catalog, which would likely be a bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+		errmsg("cannot remove dependency on %s because it is a system object",
+			   getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	heap_close(depRel, RowExclusiveLock);
+
+	return count;
+}
+
+/*
  * isObjectPinned()
  *
  * Test if an object is required for basic database functionality.
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 0e4231668d..96044663e9 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -332,9 +332,9 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 list_make2("chunk_id", "chunk_seq"),
 				 BTREE_AM_OID,
 				 rel->rd_rel->reltablespace,
-				 collationObjectId, classObjectId, coloptions, (Datum) 0,
+				 collationObjectId, classObjectId, coloptions, (Datum) 0, NULL,
 				 true, false, false, false,
-				 true, false, false, true, false);
+				 true, false, false, true, false, false);
 
 	heap_close(toast_rel, NoLock);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 72bb06c760..7a51c25d98 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -51,6 +51,7 @@
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/pg_rusage.h"
 #include "utils/regproc.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -77,6 +78,7 @@ static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 
 /*
  * CheckIndexCompatible
@@ -283,6 +285,87 @@ CheckIndexCompatible(Oid oldId,
 	return ret;
 }
 
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given
+ * xmin limit, because such a snapshot might not contain tuples deleted just
+ * before it was taken. Obtain a list of VXIDs of such transactions, and wait
+ * for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int			i;
+	int			n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue; /* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int			n_newer_snapshots;
+			int			j;
+			int			k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue; /* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
 /*
  * DefineIndex
  *		Creates a new index.
@@ -320,7 +403,6 @@ DefineIndex(Oid relationId,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -331,9 +413,7 @@ DefineIndex(Oid relationId,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -662,12 +742,12 @@ DefineIndex(Oid relationId,
 					 indexInfo, indexColNames,
 					 accessMethodId, tablespaceId,
 					 collationObjectId, classObjectId,
-					 coloptions, reloptions, stmt->primary,
+					 coloptions, reloptions, NULL, stmt->primary,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
 					 stmt->concurrent, !check_rights,
-					 stmt->if_not_exists);
+					 stmt->if_not_exists, false);
 
 	ObjectAddressSet(address, RelationRelationId, indexRelationId);
 
@@ -757,34 +837,15 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -842,74 +903,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-										 PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots)		/* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -1739,7 +1735,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -1751,8 +1747,9 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
-									  false, false,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+									  concurrent, concurrent,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
 
@@ -1764,7 +1761,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 
 	return indOid;
 }
@@ -1833,18 +1833,26 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   concurrent, concurrent,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -1862,7 +1870,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -1874,6 +1882,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	List	   *relids = NIL;
 	ListCell   *l;
 	int			num_keys;
+	bool		concurrent_warning = false;
 
 	AssertArg(objectName);
 	Assert(objectKind == REINDEX_OBJECT_SCHEMA ||
@@ -1964,6 +1973,17 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 			!IsSystemClass(relid, classtuple))
 			continue;
 
+		/* A system catalog cannot be reindexed concurrently */
+		if (concurrent && IsSystemNamespace(get_rel_namespace(relid)))
+		{
+			if (!concurrent_warning)
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("concurrent reindex is not supported for catalog relations, skipping all")));
+			concurrent_warning = true;
+			continue;
+		}
+
 		/* Save the list of relation OIDs in private context */
 		old = MemoryContextSwitchTo(private_context);
 
@@ -1990,19 +2010,28 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
-
-			if (options & REINDEXOPT_VERBOSE)
-				ereport(INFO,
-						(errmsg("table \"%s.%s\" was reindexed",
-								get_namespace_name(get_rel_namespace(relid)),
+
+		if (concurrent)
+		{
+			result = ReindexRelationConcurrently(relid, options);
+
+			PushActiveSnapshot(GetTransactionSnapshot());
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+		if (result && (options & REINDEXOPT_VERBOSE))
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							get_namespace_name(get_rel_namespace(relid)),
 								get_rel_name(relid))));
 		PopActiveSnapshot();
 		CommitTransactionCommand();
@@ -2011,3 +2040,597 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 
 	MemoryContextDelete(private_context);
 }
+
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for the given relation Oid. The relation can
+ * be either an index or a table. If a table is specified, each phase is
+ * processed one by one for all of the table's indexes, as well as the
+ * indexes of its dependent toast table, if the table has one.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *parentRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *concurrentIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc, *lc2;
+	MemoryContext private_context;
+	MemoryContext old;
+	char	   *relationName = NULL;
+	PGRUsage	ru0;
+
+	/*
+	 * Create a memory context that will survive forced transaction commits we
+	 * do below.  Since it is a child of PortalContext, it will go away
+	 * eventually even if we suffer an error; there's no need for special
+	 * abort cleanup logic.
+	 */
+	private_context = AllocSetContextCreate(PortalContext,
+											"ReindexConcurrent",
+											ALLOCSET_SMALL_SIZES);
+
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		/* Save data needed by REINDEX VERBOSE in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		relationName = get_rel_name(relationOid);
+
+		pg_rusage_init(&ru0);
+
+		MemoryContextSwitchTo(old);
+	}
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given relation
+	 * Oid is a table, all its valid indexes will be rebuilt, including the
+	 * indexes of its associated toast table. If the relkind is an index, the
+	 * index itself will be rebuilt. The locks taken on the parent relations
+	 * and the involved indexes are kept until this transaction is committed,
+	 * to protect against schema changes that might occur before a session
+	 * lock is taken on each relation. The session locks then similarly
+	 * protect against any schema change that could happen within the
+	 * multiple transactions used during this process.
+	 */
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes
+				 * including toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				MemoryContextSwitchTo(old);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+					{
+						/* Save the list of relation OIDs in private context */
+						old = MemoryContextSwitchTo(private_context);
+
+						indexIds = lappend_oid(indexIds, cellOid);
+
+						MemoryContextSwitchTo(old);
+					}
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+												ShareUpdateExclusiveLock);
+
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					MemoryContextSwitchTo(old);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+						{
+							/* Save the list of relation OIDs in private context */
+							old = MemoryContextSwitchTo(private_context);
+
+							indexIds = lappend_oid(indexIds, cellOid);
+
+							MemoryContextSwitchTo(old);
+						}
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			parentOid = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(parentOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(parentOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(parentOid);
+
+				MemoryContextSwitchTo(old);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					indexIds = lappend_oid(indexIds, relationOid);
+
+					MemoryContextSwitchTo(old);
+				}
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We need first to create an index which is based on the same data
+	 * as the former index except that it will be only registered in catalogs
+	 * and will be built later. It is possible to perform all the operations
+	 * on all the indexes at the same time for a parent relation including
+	 * indexes for its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId  *lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent relation, which might be a toast or plain relation */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a relation name for the concurrent index */
+		concurrentName = ChooseRelationName(get_rel_name(indOid),
+											NULL,
+											"ccnew",
+											get_rel_namespace(indexRel->rd_index->indrelid));
+
+		/* Create the concurrent index based on the given index */
+		concurrentOid = index_concurrent_create_copy(indexParentRel,
+													 indOid,
+													 concurrentName);
+
+		/*
+		 * Now open the relation of the concurrent index; a lock is also
+		 * needed on it.
+		 */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the list of oids and locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Save the concurrent index Oid */
+		concurrentIndexIds = lappend_oid(concurrentIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelids to protect each relation from being dropped,
+		 * then close the relations. Copies are palloc'd in the private
+		 * context so that the list survives the transaction commits done
+		 * below. The lockrelid of the parent relation is not taken here, to
+		 * avoid taking multiple locks on the same relation; instead we rely
+		 * on parentRelationIds built earlier.
+		 */
+		lockrelid = palloc(sizeof(LockRelId));
+		*lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+		lockrelid = palloc(sizeof(LockRelId));
+		*lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		MemoryContextSwitchTo(old);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the following visibility checks, as other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId  *lockrelid;
+		LOCKTAG	   *heaplocktag;
+
+		/* Save the list of locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/*
+		 * Add the lockrelid of the parent relation to the list of locked
+		 * relations; a copy is palloc'd in the private context so that it
+		 * survives the transaction commits done below.
+		 */
+		lockrelid = palloc(sizeof(LockRelId));
+		*lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid->dbId, lockrelid->relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		MemoryContextSwitchTo(old);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transactions will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on the relation, the
+	 * concurrent index and its copy, to ensure that none of them are dropped
+	 * until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build each concurrent index in a separate transaction, to avoid having
+	 * open transactions for an unnecessarily long time. A concurrent build
+	 * is done for each concurrent index that will replace an old index.
+	 * Before doing that, we need to wait until no running transaction could
+	 * have the parent table of the index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			heapId;
+		bool		primary;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start a new transaction for this concurrent index build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * The index relation has been closed by the previous commit, so
+		 * reopen it to fetch its parent table and whether it is used as a
+		 * primary key, then close it again before the build.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		heapId = indexRel->rd_index->indrelid;
+		primary = indexRel->rd_index->indisprimary;
+		index_close(indexRel, NoLock);
+
+		/* Perform concurrent build of the new index */
+		index_concurrent_build(heapId, concurrentOid, primary);
+
+		/* we can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update visible for
+		 * concurrent index.
+		 */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the concurrent indexes catch up with any new tuples
+	 * that were created during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Perform a scan of each concurrent index with the heap, then insert
+	 * any missing index entries.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid				indOid = lfirst_oid(lc);
+		Oid				relOid;
+		TransactionId	limitXmin;
+		Snapshot		snapshot;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used to validate the
+		 * concurrent index.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate index, which might be a toast */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save its xmin limit so we can wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * The concurrent index is now valid, as it contains all the necessary
+		 * tuples. However, it might not contain tuples deleted just before
+		 * the reference snapshot was taken, so we need to wait for the
+		 * transactions that might have older snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction now that the concurrent index is valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated, it is necessary
+	 * to swap each concurrent index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes as dead at the
+	 * same time, to make sure we only get constraint violations from the
+	 * indexes with the correct names.
+	 */
+
+	StartTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		char	   *oldName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseRelationName(get_rel_name(indOid),
+									 NULL,
+									 "ccold",
+									 get_rel_namespace(relOid));
+
+		/* Swap old index and its concurrent entry */
+		index_concurrent_swap(concurrentOid, indOid, oldName);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+	}
+
+	/* Commit this transaction and make index swaps visible */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * The old indexes now hold the fresh relfilenodes swapped in from their
+	 * respective concurrent entries. It is time to mark the now-useless
+	 * concurrent entries as not ready, so that they can be safely ignored by
+	 * write operations that may occur on them.
+	 *
+	 * Note that it is necessary to wait for virtual locks on the parent
+	 * relation before setting each index as dead.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Finish the index invalidation and set it as dead. */
+		index_concurrent_set_dead(relOid, indOid);
+	}
+
+	/* Commit this transaction to make the updates visible. */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the concurrent indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe because all the old entries are already
+	 * marked as invalid and not ready, so they will not be used by other
+	 * backends for any read or write operations.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	/* Get fresh snapshot for next step */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	foreach(lc, indexIds)
+	{
+		Oid 		indOid = lfirst_oid(lc);
+
+		CHECK_FOR_INTERRUPTS();
+
+		index_concurrent_drop(indOid);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * The last thing to do is to release the session-level locks on the
+	 * parent tables and their indexes.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Log what we did */
+	if (options & REINDEXOPT_VERBOSE)
+		ereport(INFO,
+				(errmsg("relation \"%s\" was reindexed",
+						relationName),
+				 errdetail("%s.",
+						   pg_rusage_show(&ru0))));
+
+	/* Start a new transaction to finish process properly */
+	StartTransactionCommand();
+
+	MemoryContextDelete(private_context);
+
+	return true;
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3cea220421..fa33242ca2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1079,6 +1079,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	char		expected_relkind;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1126,7 +1127,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check for the case of a system index that might have been invalidated
+	 * by a failed concurrent operation, and allow it to be dropped. For the
+	 * time being, this
+	 * only concerns indexes of toast relations that became invalid during a
+	 * REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 05d8538717..1afe54dad3 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4068,6 +4068,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index d595cd7481..519b8126dd 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2028,6 +2028,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e833b2eba5..8c68150eb1 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7930,42 +7930,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 3bc0ae5e7e..c6da772a7d 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -773,16 +773,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -798,7 +802,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												(stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												(stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												"REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 5349c39411..553b25a499 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1897,6 +1897,23 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY is not allowed in
+			 * xacts.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
+
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index ddad71a10f..476a6a5b54 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2988,12 +2988,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches1("REINDEX"))
 		COMPLETE_WITH_LIST5("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches2("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches3("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+	else if (Matches3("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches3("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches3("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index b48f839028..b7120c6702 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -93,6 +93,8 @@ extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 				   TupleDesc src, AttrNumber srcAttno);
 
+extern void ResetTupleDescCache(TupleDesc tupdesc);
+
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 10759c7c58..60d5c7c9ee 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -235,6 +235,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependencyForAll(Oid refClassId, Oid oldRefObjectId,
+								   Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, Oid *tableId, int32 *colId);
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 20bec90b9d..c41a4ea098 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -54,6 +54,7 @@ extern Oid index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -62,7 +63,25 @@ extern Oid index_create(Relation heapRelation,
 			 bool skip_build,
 			 bool concurrent,
 			 bool is_internal,
-			 bool if_not_exists);
+			 bool if_not_exists,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create_copy(Relation heapRelation,
+										Oid indOid,
+										const char *newName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid,
+								  Oid oldIndexOid,
+								  const char *oldName);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern ObjectAddress index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 8740cee944..e73432900a 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -29,10 +29,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_rights,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern Oid	ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 5afc3ebea0..e2000b812c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3112,6 +3112,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 2606a27624..7a11eca488 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -38,6 +38,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index e519fdb0f6..5d8e922483 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3072,3 +3072,72 @@ DROP ROLE regress_reindexuser;
 SET client_min_messages TO 'warning';
 DROP SCHEMA schema_to_reindex CASCADE;
 RESET client_min_messages;
+RESET search_path;
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+ERROR:  conflicting key value violates exclusion constraint "concur_reindex_tab_c3_excl"
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  can only reindex the currently open database
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+WARNING:  concurrent reindex is not supported for catalog relations, skipping all
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+          Table "public.concur_reindex_tab"
+ Column |   Type    | Collation | Nullable | Default 
+--------+-----------+-----------+----------+---------
+ c1     | integer   |           | not null | 
+ c2     | text      |           |          | 
+ c3     | int4range |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+    "concur_reindex_tab_c3_excl" EXCLUDE USING gist (c3 WITH &&)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 1648072568..3bd825ee02 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1086,3 +1086,53 @@ DROP ROLE regress_reindexuser;
 SET client_min_messages TO 'warning';
 DROP SCHEMA schema_to_reindex CASCADE;
 RESET client_min_messages;
+RESET search_path;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
#54Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Andreas Karlsson (#53)
Re: REINDEX CONCURRENTLY 2.0

On 2/28/17 11:21 AM, Andreas Karlsson wrote:

The only downside I can see to this approach is that we no longer will
be able to reindex catalog tables concurrently, but in return it should
be easier to confirm that this approach can be made to work.

Another downside is any stored regclass fields will become invalid.
Admittedly that's a pretty unusual use case, but it'd be nice if there
was at least a way to let users fix things during the rename phase
(perhaps via an event trigger).
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#55Michael Paquier
michael.paquier@gmail.com
In reply to: Andreas Karlsson (#53)
Re: REINDEX CONCURRENTLY 2.0

On Wed, Mar 1, 2017 at 2:21 AM, Andreas Karlsson <andreas@proxel.se> wrote:

For each table:

1. Create new indexes without populating them, and lock the tables and
indexes for the session.

+    /*
+     * Copy constraint flags for the old index. This is safe because the old
+     * index guaranteed uniqueness.
+     */
+    newIndexForm->indisprimary = oldIndexForm->indisprimary;
+    oldIndexForm->indisprimary = false;
+    newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+    oldIndexForm->indisexclusion = false;
[...]
+        deleteDependencyRecordsForClass(RelationRelationId, newIndexOid,
+                                        RelationRelationId, DEPENDENCY_AUTO);
+        deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
+                                        ConstraintRelationId,
DEPENDENCY_INTERNAL);
+
+        // TODO: pg_depend for old index?
There is a lot of mumbo-jumbo in the patch to create the exact same
index definition as the original one being reindexed, and that's a
huge maintenance burden for the future. You can blame me for that in
the current patch. I am wondering if it would not just be better to
generate a CREATE INDEX query string and then use the SPI to create
the index, and also do the following extensions at SQL level:
- Add a sort of WITH NO DATA clause where the index is created, so the
index is created empty, and is marked invalid and not ready.
- Extend pg_get_indexdef_string() with an optional parameter to force
the index name to something else; most likely it should also be
extended with the WITH NO DATA/INVALID clause, which should just be a
storage parameter by the way.
With something like that in place, the only thing the REINDEX
CONCURRENTLY code path would need to be careful about is choosing an
index name that avoids any conflicts.
-- 
Michael


#56Andres Freund
andres@anarazel.de
In reply to: Jim Nasby (#54)
Re: REINDEX CONCURRENTLY 2.0

On 2017-03-01 19:25:23 -0600, Jim Nasby wrote:

On 2/28/17 11:21 AM, Andreas Karlsson wrote:

The only downside I can see to this approach is that we no longer will
be able to reindex catalog tables concurrently, but in return it should
be easier to confirm that this approach can be made to work.

Another downside is any stored regclass fields will become invalid.
Admittedly that's a pretty unusual use case, but it'd be nice if there was
at least a way to let users fix things during the rename phase (perhaps via
an event trigger).

I'm fairly confident that we don't want to invoke event triggers inside
the CIC code... I'm also fairly confident that between index oids
stored somewhere being invalidated - what'd be a realistic use case of
that - and not having reindex concurrently, just about everyone will
choose the former.

Regards,

Andres


#57Andreas Karlsson
andreas@proxel.se
In reply to: Jim Nasby (#54)
Re: REINDEX CONCURRENTLY 2.0

On 03/02/2017 02:25 AM, Jim Nasby wrote:

On 2/28/17 11:21 AM, Andreas Karlsson wrote:

The only downside I can see to this approach is that we no longer will
be able to reindex catalog tables concurrently, but in return it should
be easier to confirm that this approach can be made to work.

Another downside is any stored regclass fields will become invalid.
Admittedly that's a pretty unusual use case, but it'd be nice if there
was at least a way to let users fix things during the rename phase
(perhaps via an event trigger).

Good point, but I agree with Andres here. Having REINDEX CONCURRENTLY
issue event triggers seems strange to me. While it does create and drop
indexes as part of its implementation, it is actually just an index
maintenance job.

Andreas


#58Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#56)
Re: REINDEX CONCURRENTLY 2.0

On Thu, Mar 2, 2017 at 11:48 AM, Andres Freund <andres@anarazel.de> wrote:

On 2017-03-01 19:25:23 -0600, Jim Nasby wrote:

On 2/28/17 11:21 AM, Andreas Karlsson wrote:

The only downside I can see to this approach is that we no longer will
be able to reindex catalog tables concurrently, but in return it should
be easier to confirm that this approach can be made to work.

Another downside is any stored regclass fields will become invalid.
Admittedly that's a pretty unusual use case, but it'd be nice if there was
at least a way to let users fix things during the rename phase (perhaps via
an event trigger).

I'm fairly confident that we don't want to invoke event triggers inside
the CIC code... I'm also fairly confident that between index oids
stored somewhere being invalidated - what'd be a realistic use case of
that - and not having reindex concurrently, just about everyone will
choose the former.

Maybe. But it looks to me like this patch is going to have
considerably more than its share of user-visible warts, and I'm not
very excited about that. I feel like what we ought to be doing is
keeping the index OID the same and changing the relfilenode to point
to a newly-created one, and I attribute our failure to make that
design work thus far to insufficiently aggressive hacking.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#59Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#58)
Re: REINDEX CONCURRENTLY 2.0

On March 4, 2017 1:16:56 AM PST, Robert Haas <robertmhaas@gmail.com> wrote:

Maybe. But it looks to me like this patch is going to have
considerably more than its share of user-visible warts, and I'm not
very excited about that. I feel like what we ought to be doing is
keeping the index OID the same and changing the relfilenode to point
to a newly-created one, and I attribute our failure to make that
design work thus far to insufficiently aggressive hacking.

We literally spent years and a lot of emails waiting for that to happen. Users now hack up solutions like this in userspace, obviously a bad solution.

I agree that it'd be nicer not to have this, but not having the feature at all is a lot worse than this wart.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


#60Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#59)
Re: REINDEX CONCURRENTLY 2.0

On Sat, Mar 4, 2017 at 12:34 PM, Andres Freund <andres@anarazel.de> wrote:

I agree that it'd be nicer not to have this, but not having the feature at all is a lot worse than this wart.

I, again, give that a firm "maybe". If the warts end up annoying 1%
of the users who try to use this feature, then you're right. If they
end up making a substantial percentage of people who try to use this
feature give up on it, then we've added a bunch of complexity and
future code maintenance for little real gain. I'm not ruling out the
possibility that you're 100% correct, but I'm not nearly as convinced
of that as you are.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#61Peter Geoghegan
In reply to: Andres Freund (#59)
Re: REINDEX CONCURRENTLY 2.0

On Sat, Mar 4, 2017 at 9:34 AM, Andres Freund <andres@anarazel.de> wrote:

On March 4, 2017 1:16:56 AM PST, Robert Haas <robertmhaas@gmail.com> wrote:

Maybe. But it looks to me like this patch is going to have
considerably more than its share of user-visible warts, and I'm not
very excited about that. I feel like what we ought to be doing is
keeping the index OID the same and changing the relfilenode to point
to a newly-created one, and I attribute our failure to make that
design work thus far to insufficiently aggressive hacking.

We literally spent years and a lot of emails waiting for that to happen. Users now hack up solutions like this in userspace, obviously a bad solution.

I agree that it'd be nicer not to have this, but not having the feature at all is a lot worse than this wart.

IMHO, REINDEX itself is implemented in a way that is conceptually
pure, and yet quite user-hostile.

I tend to tell colleagues that ask about REINDEX something along the
lines of: Just assume that REINDEX is going to block out even SELECT
statements referencing the underlying table. It might not be that bad
for you in practice, but the details are arcane such that it might as
well be that simple most of the time. Even if you have time to listen
to me explain it all, which you clearly don't, you're still probably
not going to be able to apply what you've learned in a way that helps
you.
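
To make that concrete, a minimal sketch of the advice (illustrative table and index names, two psql sessions):

```sql
-- Session 1: a plain REINDEX holds an ACCESS EXCLUSIVE lock on the
-- index for the whole rebuild.
BEGIN;
REINDEX INDEX accounts_pkey;
-- ... rebuild runs, lock held until COMMIT ...

-- Session 2: even a plain SELECT that would use that index blocks
-- until session 1 commits.
SELECT balance FROM accounts WHERE id = 42;
```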

--
Peter Geoghegan

#62Andreas Karlsson
andreas@proxel.se
In reply to: Robert Haas (#60)
Re: REINDEX CONCURRENTLY 2.0

On 03/05/2017 07:56 PM, Robert Haas wrote:

On Sat, Mar 4, 2017 at 12:34 PM, Andres Freund <andres@anarazel.de> wrote:

I agree that it'd be nicer not to have this, but not having the feature at all is a lot worse than this wart.

I, again, give that a firm "maybe". If the warts end up annoying 1%
of the users who try to use this feature, then you're right. If they
end up making a substantial percentage of people who try to use this
feature give up on it, then we've added a bunch of complexity and
future code maintenance for little real gain. I'm not ruling out the
possibility that you're 100% correct, but I'm not nearly as convinced
of that as you are.

I agree that these warts are annoying, but I also think the limitations
can be explained pretty easily to users (e.g. by explaining in the
manual how REINDEX CONCURRENTLY creates a new index to replace the old
one). And I think that is the important question when deciding whether a
useful feature with warts should be merged or not: does it make things
substantially harder for the average user to grok?

And I would argue that this feature is useful to quite a few people, based on my
experience running a semi-large database. Index bloat happens and
without REINDEX CONCURRENTLY it can be really annoying to solve,
especially for primary keys. Certainly more people have problems with
index bloat than the number of people who store index oids in their
database.

Andreas

#63Robert Haas
robertmhaas@gmail.com
In reply to: Andreas Karlsson (#62)
Re: REINDEX CONCURRENTLY 2.0

On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andreas@proxel.se> wrote:

And I would argue that this feature is useful for quite many, based on my
experience running a semi-large database. Index bloat happens and without
REINDEX CONCURRENTLY it can be really annoying to solve, especially for
primary keys. Certainly more people have problems with index bloat than the
number of people who store index oids in their database.

Yeah, but that's not the only wart, I think. For example, I believe
(haven't looked at this patch series in a while) that the patch takes
a lock and later escalates the lock level. If so, that could lead to
doing a lot of work to build the index and then getting killed by the
deadlock detector. Also, if by any chance you think (or use any
software that thinks) that OIDs for system objects are a stable
identifier, this will be the first case where that ceases to be true.
If the system is shut down or crashes or the session is killed, you'll
be left with stray objects with names that you've never typed into the
system. I'm sure you're going to say "don't worry, none of that is
any big deal" and maybe you're right.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#64Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#63)
Re: REINDEX CONCURRENTLY 2.0

On 2017-03-07 21:48:23 -0500, Robert Haas wrote:

On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andreas@proxel.se> wrote:

And I would argue that this feature is useful for quite many, based on my
experience running a semi-large database. Index bloat happens and without
REINDEX CONCURRENTLY it can be really annoying to solve, especially for
primary keys. Certainly more people have problems with index bloat than the
number of people who store index oids in their database.

Yeah, but that's not the only wart, I think.

I don't really see any other warts that don't correspond to CREATE/DROP
INDEX CONCURRENTLY.

For example, I believe (haven't looked at this patch series in a
while) that the patch takes a lock and later escalates the lock level.

It shouldn't* - that was required precisely because we had to switch the
relfilenodes when the oid stayed the same. Otherwise in-progress index
lookups could end up using the wrong relfilenodes and/or switch in the
middle of a lookup.

* excepting the exclusive lock DROP INDEX CONCURRENTLY style dropping
uses after marking the index as dead - but that shouldn't be much of a
concern?

Also, if by any chance you think (or use any software that thinks)
that OIDs for system objects are a stable identifier, this will be the
first case where that ceases to be true.

Can you come up with a halfway realistic scenario where an index oid, not
a table, constraint, or sequence oid, would be relied upon?

If the system is shut down or crashes or the session is killed, you'll
be left with stray objects with names that you've never typed into the
system.

Given how relatively few complaints we have about CIC's possibility of
ending up with invalid indexes - not that there are none - and its
widespread usage, I'm not too concerned about this.

Greetings,

Andres Freund

#65Andreas Karlsson
andreas@proxel.se
In reply to: Robert Haas (#63)
Re: REINDEX CONCURRENTLY 2.0

On 03/08/2017 03:48 AM, Robert Haas wrote:

On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andreas@proxel.se> wrote:

And I would argue that this feature is useful for quite many, based on my
experience running a semi-large database. Index bloat happens and without
REINDEX CONCURRENTLY it can be really annoying to solve, especially for
primary keys. Certainly more people have problems with index bloat than the
number of people who store index oids in their database.

Yeah, but that's not the only wart, I think.

The only two potential issues I see with the patch are:

1) That the index oid changes visibly to external users.

2) That the code for moving the dependencies will need to be updated
when adding new things which refer to an index oid.

Given how useful I find REINDEX CONCURRENTLY I think these warts are
worth it given that the impact is quite limited. I am of course biased
since if I did not believe this I would not pursue this solution in the
first place.

For example, I believe
(haven't looked at this patch series in a while) that the patch takes
a lock and later escalates the lock level. If so, that could lead to
doing a lot of work to build the index and then getting killed by the
deadlock detector.

This version of the patch no longer does that. For my use case
escalating the lock would make this patch much less interesting. The
highest lock level taken is the same one as the initial one (SHARE
UPDATE EXCLUSIVE). At a high level, the current patch does (very
simplified) this:

1. CREATE INDEX CONCURRENTLY ind_new;
2. Atomically move all dependencies from ind to ind_new, rename ind to
ind_old, and rename ind_new to ind.
3. DROP INDEX CONCURRENTLY ind_old;

The actual implementation is a bit more complicated in reality, but no
part escalates the lock level over what would be required by the steps
for creating and dropping indexes concurrently.
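
In SQL terms, the closest manual approximation of those steps today
looks roughly like this (illustrative names; note that unlike the
patch, the renames here take a brief ACCESS EXCLUSIVE lock, and an
index backing a constraint cannot be swapped this way):

```sql
-- 1. Build a replacement index without blocking writes.
CREATE INDEX CONCURRENTLY ind_new ON tab (col);

-- 2. Swap the names; the patch instead moves the dependencies and
--    renames both indexes atomically in the catalogs.
BEGIN;
ALTER INDEX ind RENAME TO ind_old;
ALTER INDEX ind_new RENAME TO ind;
COMMIT;

-- 3. Drop the bloated index without blocking readers.
DROP INDEX CONCURRENTLY ind_old;
```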

Also, if by any chance you think (or use any
software that thinks) that OIDs for system objects are a stable
identifier, this will be the first case where that ceases to be true.
If the system is shut down or crashes or the session is killed, you'll
be left with stray objects with names that you've never typed into the
system. I'm sure you're going to say "don't worry, none of that is
any big deal" and maybe you're right.

Hm, I cannot think of any real life scenario where this will be an issue
based on my personal experience with PostgreSQL, but if you can think of
one please provide it. I will try to ponder some more on this myself.

Andreas

#66Jim Nasby
jim.nasby@openscg.com
In reply to: Andreas Karlsson (#65)
Re: REINDEX CONCURRENTLY 2.0

On 3/8/17 9:34 AM, Andreas Karlsson wrote:

Also, if by any chance you think (or use any
software that thinks) that OIDs for system objects are a stable
identifier, this will be the first case where that ceases to be true.
If the system is shut down or crashes or the session is killed, you'll
be left with stray objects with names that you've never typed into the
system. I'm sure you're going to say "don't worry, none of that is
any big deal" and maybe you're right.

Hm, I cannot think of any real life scenario where this will be an issue
based on my personal experience with PostgreSQL, but if you can think of
one please provide it. I will try to ponder some more on this myself.

The case I currently have is to allow tracking database objects similarly
to (but not the same as) how we track the objects that belong to an
extension[1]. That currently depends on event triggers to keep names
updated if they're changed, as well as making use of the reg* types. If
an event trigger fired as part of the index rename (essentially treating
it like an ALTER INDEX) then I should be able to work around that.

The ultimate reason for doing this is to provide something similar to
extensions (create a bunch of database objects that are all bound
together), but also similar to classes in OO languages (so you can have
multiple instances).[2]

Admittedly, this is pretty off the beaten path and I certainly wouldn't
hold up the patch because of it. I am hoping that it'd be fairly easy to
fire an event trigger as if someone had just renamed the index.
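
For what it is worth, a minimal sketch of the kind of event trigger I
have in mind, assuming the rename were exposed as a command tagged
ALTER INDEX (names here are hypothetical):

```sql
CREATE FUNCTION track_index_changes() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    r record;
BEGIN
    -- Report each object touched so name-based tracking can resync.
    FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        RAISE NOTICE 'tag: %, object: %', r.command_tag, r.object_identity;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER track_index_rename
    ON ddl_command_end
    WHEN TAG IN ('ALTER INDEX')
    EXECUTE PROCEDURE track_index_changes();
```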

1: https://github.com/decibel/object_reference
2: https://github.com/decibel/pg_classy
--
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com

#67Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Andres Freund (#64)
Re: REINDEX CONCURRENTLY 2.0

On Wed, Mar 8, 2017 at 4:12 PM, Andres Freund <andres@anarazel.de> wrote:

Can you come up with a halfway realistic scenario where an index oid, not
a table, constraint, or sequence oid, would be relied upon?

Is there an implication for SIREAD locks? Predicate locks on index
pages include the index OID in the tag.

--
Thomas Munro
http://www.enterprisedb.com

#68Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Thomas Munro (#67)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Mar 10, 2017 at 9:36 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

On Wed, Mar 8, 2017 at 4:12 PM, Andres Freund <andres@anarazel.de> wrote:

Can you come up with a halfway realistic scenario where an index oid, not
a table, constraint, or sequence oid, would be relied upon?

Is there an implication for SIREAD locks? Predicate locks on index
pages include the index OID in the tag.

Ah, yes, but that is covered by a call to
TransferPredicateLocksToHeapRelation() in index_concurrent_set_dead().

--
Thomas Munro
http://www.enterprisedb.com

#69Andreas Karlsson
andreas@proxel.se
In reply to: Michael Paquier (#55)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On 03/02/2017 03:10 AM, Michael Paquier wrote:

On Wed, Mar 1, 2017 at 2:21 AM, Andreas Karlsson <andreas@proxel.se> wrote:
+    /*
+     * Copy constraint flags from the old index. This is safe because the old
+     * index guaranteed uniqueness.
+     */
+    newIndexForm->indisprimary = oldIndexForm->indisprimary;
+    oldIndexForm->indisprimary = false;
+    newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+    oldIndexForm->indisexclusion = false;
[...]
+        deleteDependencyRecordsForClass(RelationRelationId, newIndexOid,
+                                        RelationRelationId, DEPENDENCY_AUTO);
+        deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
+                                        ConstraintRelationId,
DEPENDENCY_INTERNAL);
+
+        // TODO: pg_depend for old index?

Spotted one of my TODO comments there so I have attached a patch where I
have cleaned up that function. I also fixed the code to properly
support triggers.

There is a lot of mumbo-jumbo in the patch to create the exact same
index definition as the original one being reindexed, and that's a
huge maintenance burden for the future. You can blame me for that in
the current patch. I am wondering if it would not just be better to
generate a CREATE INDEX query string and then use the SPI to create
the index, and also do the following extensions at SQL level:
- Add a sort of WITH NO DATA clause where the index is created, so the
index is created empty, and is marked invalid and not ready.
- Extend pg_get_indexdef_string() with an optional parameter to
enforce the index name to something else, most likely it should be
extended with the WITH NO DATA/INVALID clause, which should just be a
storage parameter by the way.
By doing something like that what the REINDEX CONCURRENTLY code path
should just be careful about is that it chooses an index name that
avoids any conflicts.
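
For reference, pg_get_indexdef() already reconstructs a full CREATE
INDEX statement that such an approach could build on (assumed example
index tab_col_idx on tab(col)):

```sql
SELECT pg_get_indexdef('tab_col_idx'::regclass);
-- e.g. CREATE INDEX tab_col_idx ON public.tab USING btree (col)
```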

Hm, I am not sure how much that would help since a lot of the mumbo-jumbo
is by necessity in the step where we move the constraints over from the
old index to the new.

Andreas

Attachments:

reindex-concurrently-v2.patch (text/x-patch)
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 306def4a15..ca1aeca65f 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -923,7 +923,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
-         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
+         <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
+         <command>REINDEX CONCURRENTLY</>,
          <command>ALTER TABLE VALIDATE</command> and other
          <command>ALTER TABLE</command> variants (for full details see
          <xref linkend="SQL-ALTERTABLE">).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..3449c0af73 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
       An index build with the <literal>CONCURRENTLY</> option failed, leaving
       an <quote>invalid</> index. Such indexes are useless but it can be
       convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      <command>REINDEX</> will perform a concurrent build if <literal>
+      CONCURRENTLY</> is specified. To build the index without interfering
+      with production you should either drop the index and reissue the
+      <command>CREATE INDEX CONCURRENTLY</> command, or rebuild it in place
+      with <command>REINDEX CONCURRENTLY</>. Indexes of toast relations can
+      be rebuilt with <command>REINDEX CONCURRENTLY</>.
      </para>
     </listitem>
 
@@ -152,6 +155,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
    </varlistentry>
 
    <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
      <para>
@@ -231,6 +249,172 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
    reindex anything.
   </para>
 
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</> option of <command>REINDEX</>. When this option
+    is used, <productname>PostgreSQL</> must perform two scans of the table
+    for each index that needs to be rebuilt, and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created, where all
+    the concurrent entries are created using only one transaction. Note that
+    if there are multiple indexes to be rebuilt then each step loops through
+    all the indexes we're rebuilding, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition will be used to replace the
+       old index. This step is done as a single transaction for all the indexes
+       involved in this process, meaning that if
+       <command>REINDEX CONCURRENTLY</> is run on a table with multiple
+       indexes, all the catalog entries of the new indexes are created within a
+       single transaction. A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as their
+       parent tables to prevent any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_index.indisready</> is
+       switched to <quote>true</> to make it ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while
+       the first pass build was running. This step is performed within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       All the constraints and foreign keys which refer to the index are swapped
+       to refer to the new index definition, and the names of the indexes are
+       changed. At this point <literal>pg_index.indisvalid</> is switched to
+       <quote>true</> for the new index and to <quote>false</> for the old, and
+       a cache invalidation is done so that all the sessions that referenced the
+       old index are invalidated. This step is done within a single transaction
+       for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Old indexes have <literal>pg_index.indisready</> switched to <quote>false</>
+       to prevent any new tuple insertions, after waiting for running queries
+       which may reference the old index to complete. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The old index definition and its data are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid
+    index and try again to perform <command>REINDEX CONCURRENTLY</>.
+    The concurrent index created during the processing has a name ending in
+    the suffix ccnew, or ccold if it is an old index definition which we failed
+    to drop. Invalid indexes can be dropped using <literal>DROP INDEX</>
+    including invalid toast indexes.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot.
+   </para>
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>. Valid indexes, being unique
+    for a given toast relation, cannot be dropped.
+   </para>
+
+   <para>
+    <command>REINDEX</command> uses <literal>ACCESS EXCLUSIVE</literal> lock
+    on all the relations involved during operation. When
+    <command>CONCURRENTLY</command> is specified, the operation is done with
+    <literal>SHARE UPDATE EXCLUSIVE</literal>.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command>.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -262,7 +446,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild all the indexes of a table while allowing read and write
+   operations on the involved relations:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 4e2ebe1ae7..2f93d3e954 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -260,6 +260,18 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 }
 
 /*
+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+	int i;
+
+	for (i = 0; i < tupdesc->natts; i++)
+		tupdesc->attrs[i]->attcacheoff = -1;
+}
+
+/*
  * Free a TupleDesc including all substructure
  */
 void
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8d42a347ea..c40ac0b154 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -677,6 +677,7 @@ UpdateIndexRelation(Oid indexoid,
  * classObjectId: array of index opclass OIDs, one per index column
  * coloptions: array of per-index-column indoption settings
  * reloptions: AM-specific options
+ * tupdesc: Tuple descriptor used for the index if defined
  * isprimary: index is a PRIMARY KEY
  * isconstraint: index is owned by PRIMARY KEY, UNIQUE, or EXCLUSION constraint
  * deferrable: constraint is DEFERRABLE
@@ -690,6 +691,10 @@ UpdateIndexRelation(Oid indexoid,
  * is_internal: if true, post creation hook for new index
  * if_not_exists: if true, do not throw an error if a relation with
  *		the same name already exists.
+ * is_reindex: if true, create an index that is used as a duplicate of an
+ *		existing index created during a concurrent operation. This index can
+ *		also be a toast relation. Sufficient locks are normally taken on
+ *		the related relations once this is called during a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -706,6 +711,7 @@ index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -714,7 +720,8 @@ index_create(Relation heapRelation,
 			 bool skip_build,
 			 bool concurrent,
 			 bool is_internal,
-			 bool if_not_exists)
+			 bool if_not_exists,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -760,16 +767,19 @@ index_create(Relation heapRelation,
 	 * release locks before committing in catalogs
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemNamespace(get_rel_namespace(heapRelationId)))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
 
 	/*
-	 * This case is currently not supported, but there's no way to ask for it
-	 * in the grammar anyway, so it can't happen.
+	 * This case is currently only supported during a concurrent index
+	 * rebuild, but there is no way to ask for it in the grammar otherwise
+	 * anyway. If support for exclusion constraints is added in the future,
+	 * the similar check in check_exclusion_constraint should be changed
+	 * accordingly as well.
 	 */
-	if (concurrent && is_exclusion)
+	if (concurrent && is_exclusion && !is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -808,14 +818,21 @@ index_create(Relation heapRelation,
 	}
 
 	/*
-	 * construct tuple descriptor for index tuples
+	 * construct tuple descriptor for index tuples if nothing is passed
+	 * by caller.
 	 */
-	indexTupDesc = ConstructTupleDescriptor(heapRelation,
-											indexInfo,
-											indexColNames,
-											accessMethodObjectId,
-											collationObjectId,
-											classObjectId);
+	if (tupdesc == NULL)
+		indexTupDesc = ConstructTupleDescriptor(heapRelation,
+												indexInfo,
+												indexColNames,
+												accessMethodObjectId,
+												collationObjectId,
+												classObjectId);
+	else
+	{
+		Assert(indexColNames == NIL);
+		indexTupDesc = tupdesc;
+	}
 
 	/*
 	 * Allocate an OID for the index, unless we were told what to use.
@@ -1123,6 +1140,445 @@ index_create(Relation heapRelation,
 }
 
 /*
+ * index_concurrent_create_copy
+ *
+ * Create a concurrent index based on the definition of the one provided by
+ * caller that will be used for concurrent operations. The index is inserted
+ * into catalogs and needs to be built later on. This is called during
+ * concurrent reindex processing. The heap relation on which the index is based
+ * needs to be closed by the caller.
+ */
+Oid
+index_concurrent_create_copy(Relation heapRelation, Oid indOid, const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	HeapTuple	indexTuple, classTuple;
+	Datum		indclassDatum, colOptionDatum, optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* Concurrent index uses the same index information as former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/* Do not copy exclusion constraint */
+	indexInfo->ii_ExclusionOps = NULL;
+	indexInfo->ii_ExclusionProcs = NULL;
+	indexInfo->ii_ExclusionStrats = NULL;
+
+	/*
+	 * Create a copy of the tuple descriptor to be used for the concurrent
+	 * entry and reset any cache counters on it to have a fresh version.
+	 */
+	indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
+	ResetTupleDescCache(indexTupDesc);
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, indOid);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 newName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 NIL,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexTupDesc,
+								 false, /* do not copy primary flag */
+								 false,	/* is constraint? */
+								 false,	/* is deferrable? */
+								 false,	/* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false,	/* is_internal? */
+								 false, /* if_not_exists? */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
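As a side note for reviewers, the essence of this copy step — reuse the old index's definition under a new name, skip the build, and leave the constraint flags to be moved at swap time — can be sketched in Python. All names below are made-up toy stand-ins, not anything in the patch:

```python
def concurrent_create_copy(old_index, new_name):
    """Toy sketch of index_concurrent_create_copy's catalog-level effect.

    The copy inherits the original's definition (access method, tablespace,
    collations, opclasses, options), but starts out unbuilt and flagged as
    concurrent, and drops any exclusion-constraint machinery.
    """
    new_index = dict(old_index)          # same definition...
    new_index["name"] = new_name         # ...under a new name
    new_index["exclusion_ops"] = None    # do not copy exclusion constraint
    new_index["skip_build"] = True       # built later by index_concurrent_build
    new_index["concurrent"] = True
    new_index["is_primary"] = False      # constraint flags move at swap time
    return new_index
```

The primary/exclusion/immediate flags only migrate to the new entry during the swap phase, once the new index is fully built and validated.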
+
+/*
+ * index_concurrent_build
+ *
+ * Build an index for a concurrent operation. The low-level locks taken while
+ * performing this operation only prevent schema changes, but they need to be
+ * kept until the end of the transaction performing this operation.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	heapRel, indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in
+	 * commit of transaction where this concurrent index was created
+	 * at the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both relations, and keep the locks */
+	heap_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts. Once we
+	 * commit this transaction, any new transactions that open the table must
+	 * insert new entries into the index for insertions and non-HOT updates.
+	 */
+	index_set_state_flags(indexOid, INDEX_CREATE_SET_READY);
+}
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap name, dependencies and constraints of the old index over to the new
+ * index, while marking the old index as invalid and the new as valid.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid, const char *oldName)
+{
+	Relation		pg_class, pg_index, pg_constraint, pg_trigger;
+	Relation		oldClassRel, newClassRel;
+	HeapTuple		oldClassTuple, newClassTuple;
+	Form_pg_class	oldClassForm, newClassForm;
+	HeapTuple		oldIndexTuple, newIndexTuple;
+	Form_pg_index	oldIndexForm, newIndexForm;
+	Oid				indexConstraintOid;
+	List		   *constraintOids = NIL;
+	ListCell	   *lc;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldClassRel = relation_open(oldIndexOid, ShareUpdateExclusiveLock);
+	newClassRel = relation_open(newIndexOid, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+
+	oldClassForm = (Form_pg_class) GETSTRUCT(oldClassTuple);
+	newClassForm = (Form_pg_class) GETSTRUCT(newClassTuple);
+
+	/* Swap the names */
+	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
+	namestrcpy(&oldClassForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
+	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
+
+	heap_freetuple(oldClassTuple);
+	heap_freetuple(newClassTuple);
+
+	/* Now swap index info */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+
+	oldIndexForm = (Form_pg_index) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
+
+	/*
+	 * Copy constraint flags from the old index. This is safe because the old
+	 * index guaranteed uniqueness.
+	 */
+	newIndexForm->indisprimary = oldIndexForm->indisprimary;
+	oldIndexForm->indisprimary = false;
+	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+	oldIndexForm->indisexclusion = false;
+	newIndexForm->indimmediate = oldIndexForm->indimmediate;
+	oldIndexForm->indimmediate = true;
+
+	/* Mark the new index as valid and the old as invalid, as in index_set_state_flags */
+	newIndexForm->indisvalid = true;
+	oldIndexForm->indisvalid = false;
+	oldIndexForm->indisclustered = false;
+
+	CatalogTupleUpdate(pg_index, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_index, &newIndexTuple->t_self, newIndexTuple);
+
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+
+	/*
+	 * Move constraints and triggers over to the new index
+	 */
+
+	constraintOids = get_index_ref_constraints(oldIndexOid);
+
+	indexConstraintOid = get_index_constraint(oldIndexOid);
+
+	if (OidIsValid(indexConstraintOid))
+		constraintOids = lappend_oid(constraintOids, indexConstraintOid);
+
+	pg_constraint = heap_open(ConstraintRelationId, RowExclusiveLock);
+	pg_trigger = heap_open(TriggerRelationId, RowExclusiveLock);
+
+	foreach(lc, constraintOids)
+	{
+		HeapTuple			constraintTuple, triggerTuple;
+		Form_pg_constraint	conForm;
+		ScanKeyData 		key[1];
+		SysScanDesc 		scan;
+		Oid					constraintOid = lfirst_oid(lc);
+
+		/* Move the constraint from the old to the new index */
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		conForm = ((Form_pg_constraint) GETSTRUCT(constraintTuple));
+
+		if (conForm->conindid == oldIndexOid)
+		{
+			conForm->conindid = newIndexOid;
+
+			CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+		}
+
+		heap_freetuple(constraintTuple);
+
+		/* Search for trigger records */
+		ScanKeyInit(&key[0],
+					Anum_pg_trigger_tgconstraint,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(constraintOid));
+
+		scan = systable_beginscan(pg_trigger, TriggerConstraintIndexId, true,
+								  NULL, 1, key);
+
+		while (HeapTupleIsValid((triggerTuple = systable_getnext(scan))))
+		{
+			Form_pg_trigger tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			if (tgForm->tgconstrindid != oldIndexOid)
+				continue;
+
+			/* make a modifiable copy */
+			triggerTuple = heap_copytuple(triggerTuple);
+			tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			tgForm->tgconstrindid = newIndexOid;
+
+			CatalogTupleUpdate(pg_trigger, &triggerTuple->t_self, triggerTuple);
+
+			heap_freetuple(triggerTuple);
+		}
+
+		systable_endscan(scan);
+	}
+
+	/*
+	 * Move all dependencies on the old index to the new
+	 */
+
+	if (OidIsValid(indexConstraintOid))
+	{
+		ObjectAddress	myself, referenced;
+
+		/* Change to having the new index depend on the constraint */
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexOid;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = indexConstraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependencyForAll(RelationRelationId, oldIndexOid, newIndexOid);
+
+	/* Close relations */
+	heap_close(pg_class, RowExclusiveLock);
+	heap_close(pg_index, RowExclusiveLock);
+	heap_close(pg_constraint, RowExclusiveLock);
+	heap_close(pg_trigger, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldClassRel, NoLock);
+	relation_close(newClassRel, NoLock);
+}
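For clarity, the swap semantics above can be condensed into a toy Python sketch (names are illustrative stand-ins for the pg_class/pg_index fields touched by the function, nothing more):

```python
def concurrent_swap(old_idx, new_idx, old_name):
    """Toy sketch of index_concurrent_swap's flag and name exchange.

    The new index takes over the old index's name and constraint flags,
    while the old one becomes invalid under a throwaway name.
    """
    # Swap the names: new gets the original name, old gets old_name.
    new_idx["name"], old_idx["name"] = old_idx["name"], old_name

    # Constraint flags move to the new index; safe because the old
    # index guaranteed uniqueness while the new one was being built.
    for flag in ("indisprimary", "indisexclusion", "indimmediate"):
        new_idx[flag] = old_idx[flag]
    old_idx["indisprimary"] = False
    old_idx["indisexclusion"] = False
    old_idx["indimmediate"] = True

    # Validity flips: the new index becomes the live one.
    new_idx["indisvalid"] = True
    old_idx["indisvalid"] = False
    old_idx["indisclustered"] = False
```

The real function additionally repoints constraints, triggers, and pg_depend entries, which the sketch omits.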
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function, the index is seen by all backends as dead. The low-level locks
+ * taken here are kept until the end of the transaction calling this
+ * function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRelation, indexRelation;
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're
+	 * about to stop doing inserts into the index which could show
+	 * conflicts with existing predicate locks, so now is the time to move
+	 * them to the heap relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just
+	 * might have it open for updating it.  So now we can unset indisready
+	 * and indislive, then wait till nobody could be using it at all
+	 * anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit
+	 * all sessions will refresh the table's index list.  Forgetting just
+	 * the index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index concurrently as the last step of a concurrent index
+ * process. Deletion is done through performDeletion, or dependencies of the
+ * index would not get dropped. At this point all the indexes are already
+ * considered invalid and dead, so they can be dropped without using any
+ * concurrent options, as it is certain that they will not interact with
+ * other server sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid				constraintOid = get_index_constraint(indexOid);
+	ObjectAddress	object;
+	Form_pg_index	indexForm;
+	Relation		pg_index;
+	HeapTuple		indexTuple;
+
+	/*
+	 * Check that the index being dropped is not alive; if it were, it might
+	 * still be used by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, just to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process.
+	 * Register constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object, DROP_RESTRICT, 0);
+}
+
+/*
  * index_constraint_create
  *
  * Set up a constraint associated with an index.  Return the new constraint's
@@ -1483,41 +1939,13 @@ index_drop(Oid indexId, bool concurrent)
 		 * Note: the reason we use actual lock acquisition here, rather than
 		 * just checking the ProcArray and sleeping, is that deadlock is
 		 * possible if one of the transactions in question is blocked trying
-		 * to acquire an exclusive lock on our table.  The lock code will
+		 * to acquire an exclusive lock on our table. The lock code will
 		 * detect deadlock and error out properly.
 		 */
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index d0ee851215..9dce6420df 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -377,6 +377,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 }
 
 /*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependencyForAll(Oid refClassId, Oid oldRefObjectId,
+					   Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = heap_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't rename stuff out
+	 * of pg_catalog, which would likely be a bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+		errmsg("cannot remove dependency on %s because it is a system object",
+			   getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	heap_close(depRel, RowExclusiveLock);
+
+	return count;
+}
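To make the review easier, the repointing logic above reduces to a small loop: records referencing the old object are updated in place, unless the new referenced object is pinned, in which case they are deleted (pinned objects carry no dependency entries). A toy Python sketch, with made-up record shapes standing in for pg_depend rows:

```python
def change_dependency_for_all(deps, old_ref, new_ref, pinned):
    """Toy sketch of changeDependencyForAll over a list of dicts.

    deps: list of {"refobjid": ...} records (stand-ins for pg_depend rows);
    pinned: set of object ids treated as pinned system objects.
    Returns the number of records updated or deleted.
    """
    if old_ref in pinned:
        raise ValueError("cannot remove dependency on a system object")
    count = 0
    kept = []
    for rec in deps:
        if rec["refobjid"] != old_ref:
            kept.append(rec)
            continue
        count += 1
        if new_ref not in pinned:
            rec["refobjid"] = new_ref  # repoint to the new object
            kept.append(rec)
        # else: the new object is pinned, so drop the record entirely
    deps[:] = kept
    return count
```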
+
+/*
  * isObjectPinned()
  *
  * Test if an object is required for basic database functionality.
@@ -722,3 +810,58 @@ get_index_constraint(Oid indexId)
 
 	return constraintId;
 }
+
+/*
+ * get_index_ref_constraints
+ *		Given the OID of an index, return the OID of all foreign key
+ *		constraints which reference the index.
+ */
+List *
+get_index_ref_constraints(Oid indexId)
+{
+	List	   *result = NIL;
+	Relation	depRel;
+	ScanKeyData key[3];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	/* Search the dependency table for the index */
+	depRel = heap_open(DependRelationId, AccessShareLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexId));
+	ScanKeyInit(&key[2],
+				Anum_pg_depend_refobjsubid,
+				BTEqualStrategyNumber, F_INT4EQ,
+				Int32GetDatum(0));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 3, key);
+
+	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+	{
+		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+
+		/*
+		 * We assume any normal dependency from a constraint must be what we
+		 * are looking for.
+		 */
+		if (deprec->classid == ConstraintRelationId &&
+			deprec->objsubid == 0 &&
+			deprec->deptype == DEPENDENCY_NORMAL)
+		{
+			result = lappend_oid(result, deprec->objid);
+		}
+	}
+
+	systable_endscan(scan);
+	heap_close(depRel, AccessShareLock);
+
+	return result;
+}
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 0e4231668d..96044663e9 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -332,9 +332,9 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 list_make2("chunk_id", "chunk_seq"),
 				 BTREE_AM_OID,
 				 rel->rd_rel->reltablespace,
-				 collationObjectId, classObjectId, coloptions, (Datum) 0,
+				 collationObjectId, classObjectId, coloptions, (Datum) 0, NULL,
 				 true, false, false, false,
-				 true, false, false, true, false);
+				 true, false, false, true, false, false);
 
 	heap_close(toast_rel, NoLock);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 72bb06c760..7a51c25d98 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -51,6 +51,7 @@
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/pg_rusage.h"
 #include "utils/regproc.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -77,6 +78,7 @@ static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 
 /*
  * CheckIndexCompatible
@@ -283,6 +285,87 @@ CheckIndexCompatible(Oid oldId,
 	return ret;
 }
 
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given xmin
+ * limit, because the index might not contain tuples deleted just before the
+ * reference snapshot was taken. Obtain a list of VXIDs of such transactions,
+ * and wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int i, n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue; /* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int n_newer_snapshots, j, k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue; /* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
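Since the rechecking loop is the subtle part of this refactored function, here is a toy Python sketch of just that control flow (the callables are hypothetical stand-ins for GetCurrentVirtualXIDs and VirtualXactLock):

```python
def wait_for_older_snapshots(limit_xmin, get_current_vxids, wait_for):
    """Toy sketch of WaitForOlderSnapshots' recheck loop.

    get_current_vxids(limit_xmin) returns the vxids currently holding an
    older snapshot; wait_for(vxid) blocks until that vxid finishes.
    Before each wait (after the first), the list is re-polled: a vxid
    that no longer shows up has gone idle with xmin zero or finished,
    so it is forgotten instead of waited on.
    """
    old = list(get_current_vxids(limit_xmin))
    for i in range(len(old)):
        if old[i] is None:
            continue  # found uninteresting in a previous cycle
        if i > 0:
            # see if anything's changed ...
            newer = set(get_current_vxids(limit_xmin))
            for j in range(i, len(old)):
                if old[j] is not None and old[j] not in newer:
                    old[j] = None  # not there anymore, forget it
        if old[i] is not None:
            wait_for(old[i])
```

The point of the second poll is only an optimization: it avoids waiting on transactions that have already become uninteresting, without needing any infrastructure to stop waiting mid-sleep.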
+
+
 /*
  * DefineIndex
  *		Creates a new index.
@@ -320,7 +403,6 @@ DefineIndex(Oid relationId,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -331,9 +413,7 @@ DefineIndex(Oid relationId,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -662,12 +742,12 @@ DefineIndex(Oid relationId,
 					 indexInfo, indexColNames,
 					 accessMethodId, tablespaceId,
 					 collationObjectId, classObjectId,
-					 coloptions, reloptions, stmt->primary,
+					 coloptions, reloptions, NULL, stmt->primary,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
 					 stmt->concurrent, !check_rights,
-					 stmt->if_not_exists);
+					 stmt->if_not_exists, false);
 
 	ObjectAddressSet(address, RelationRelationId, indexRelationId);
 
@@ -757,34 +837,15 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -842,74 +903,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-										 PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots)		/* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -1739,7 +1735,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -1751,8 +1747,9 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
-									  false, false,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+									  concurrent, concurrent,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
 
@@ -1764,7 +1761,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 
 	return indOid;
 }
@@ -1833,18 +1833,26 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   concurrent, concurrent,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -1862,7 +1870,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -1874,6 +1882,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	List	   *relids = NIL;
 	ListCell   *l;
 	int			num_keys;
+	bool		concurrent_warning = false;
 
 	AssertArg(objectName);
 	Assert(objectKind == REINDEX_OBJECT_SCHEMA ||
@@ -1964,6 +1973,17 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 			!IsSystemClass(relid, classtuple))
 			continue;
 
+		/* A system catalog cannot be reindexed concurrently */
+		if (concurrent && IsSystemNamespace(get_rel_namespace(relid)))
+		{
+			if (!concurrent_warning)
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("concurrent reindex is not supported for catalog relations, skipping all")));
+			concurrent_warning = true;
+			continue;
+		}
+
 		/* Save the list of relation OIDs in private context */
 		old = MemoryContextSwitchTo(private_context);
 
@@ -1990,19 +2010,28 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
-
-			if (options & REINDEXOPT_VERBOSE)
-				ereport(INFO,
-						(errmsg("table \"%s.%s\" was reindexed",
-								get_namespace_name(get_rel_namespace(relid)),
+
+		if (concurrent)
+		{
+			result = ReindexRelationConcurrently(relid, options);
+
+			PushActiveSnapshot(GetTransactionSnapshot());
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+		if (result && (options & REINDEXOPT_VERBOSE))
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							get_namespace_name(get_rel_namespace(relid)),
 								get_rel_name(relid))));
 		PopActiveSnapshot();
 		CommitTransactionCommand();
@@ -2011,3 +2040,597 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 
 	MemoryContextDelete(private_context);
 }
+
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for the given relation Oid. The relation can
+ * be either an index or a table. If a table is specified, each phase is
+ * processed one by one for each of the table's indexes, as well as the
+ * indexes of its toast table if the table has one.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *parentRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *concurrentIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc, *lc2;
+	MemoryContext private_context;
+	MemoryContext old;
+	char	   *relationName = NULL;
+	PGRUsage	ru0;
+
+	/*
+	 * Create a memory context that will survive forced transaction commits we
+	 * do below.  Since it is a child of PortalContext, it will go away
+	 * eventually even if we suffer an error; there's no need for special
+	 * abort cleanup logic.
+	 */
+	private_context = AllocSetContextCreate(PortalContext,
+											"ReindexConcurrent",
+											ALLOCSET_SMALL_SIZES);
+
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		/* Save data needed by REINDEX VERBOSE in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		relationName = get_rel_name(relationOid);
+
+		pg_rusage_init(&ru0);
+
+		MemoryContextSwitchTo(old);
+	}
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given relation
+	 * Oid is a table, all its valid indexes will be rebuilt, including the
+	 * indexes of its toast table if there is one. If the relkind is an
+	 * index, this index itself will be rebuilt. The locks taken on the
+	 * parent relations and the involved indexes are kept until this
+	 * transaction is committed, to protect against schema changes that might
+	 * occur before a session lock is taken on each relation; the session
+	 * locks similarly protect against schema changes during the multiple
+	 * transactions used by this process.
+	 */
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes
+				 * including toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				MemoryContextSwitchTo(old);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+					{
+						/* Save the list of relation OIDs in private context */
+						old = MemoryContextSwitchTo(private_context);
+
+						indexIds = lappend_oid(indexIds, cellOid);
+
+						MemoryContextSwitchTo(old);
+					}
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+												ShareUpdateExclusiveLock);
+
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					MemoryContextSwitchTo(old);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+													ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+						{
+							/* Save the list of relation OIDs in private context */
+							old = MemoryContextSwitchTo(private_context);
+
+							indexIds = lappend_oid(indexIds, cellOid);
+
+							MemoryContextSwitchTo(old);
+						}
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			parentOid = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(parentOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(parentOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(parentOid);
+
+				MemoryContextSwitchTo(old);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					indexIds = lappend_oid(indexIds, relationOid);
+
+					MemoryContextSwitchTo(old);
+				}
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We first need to create an index which is based on the same data as
+	 * the former index, except that it will only be registered in the
+	 * catalogs and built later. It is possible to perform all the operations
+	 * on all the indexes of a parent relation at the same time, including
+	 * the indexes of its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId  *lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent relation; it might be a plain or toast table */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a temporary relation name for the concurrent index */
+		concurrentName = ChooseRelationName(get_rel_name(indOid),
+											NULL,
+											"ccnew",
+											get_rel_namespace(indexRel->rd_index->indrelid));
+
+		/* Create concurrent index based on given index */
+		concurrentOid = index_concurrent_create_copy(indexParentRel,
+													 indOid,
+													 concurrentName);
+
+		/*
+		 * Now open the concurrent index relation; a lock is needed on it as
+		 * well.
+		 */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the list of Oids and locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Save the concurrent index Oid */
+		concurrentIndexIds = lappend_oid(concurrentIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelid of each index so that the session locks taken
+		 * below protect them from being dropped, then close the relations.
+		 * These entries must be palloc'd copies, as the list outlives this
+		 * loop iteration. The lockrelid of the parent relation is not saved
+		 * here, to avoid taking multiple locks on the same relation; we rely
+		 * on parentRelationIds built earlier instead.
+		 */
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		MemoryContextSwitchTo(old);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the following visibility checks, as other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId  *lockrelid;
+		LOCKTAG    *heaplocktag;
+
+		/* Save the list of locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/*
+		 * Add the lockrelid of the parent relation to the list of locked
+		 * relations, as a palloc'd copy since the list outlives this loop.
+		 */
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid->dbId, lockrelid->relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		MemoryContextSwitchTo(old);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transaction will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on the parent relation,
+	 * the index and its concurrent copy, to ensure that none of them is
+	 * dropped until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build each concurrent index in a separate transaction to avoid having
+	 * open transactions for an unnecessarily long time. A concurrent build
+	 * is done for each concurrent index that will replace an old index.
+	 * Before doing that, we need to wait until no running transaction could
+	 * have the parent table of an index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			heapOid;
+		bool		primary;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index's concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * The index relation has been closed by the previous commit, so
+		 * reopen it to fetch its parent relation and determine whether it is
+		 * used as a primary key. Both values must be saved before closing
+		 * the relation, as indexRel cannot be dereferenced afterwards.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		heapOid = indexRel->rd_index->indrelid;
+		primary = indexRel->rd_index->indisprimary;
+		index_close(indexRel, NoLock);
+
+		/* Perform concurrent build of the new index */
+		index_concurrent_build(heapOid,
+							   concurrentOid,
+							   primary);
+
+		/* we can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update visible for
+		 * concurrent index.
+		 */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the concurrent indexes catch up with any new tuples
+	 * that were created during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Perform a scan of each concurrent index with the heap, then insert
+	 * any missing index entries.
+	 */
+	foreach(lc, concurrentIndexIds)
+	{
+		Oid				indOid = lfirst_oid(lc);
+		Oid				relOid;
+		TransactionId	limitXmin;
+		Snapshot		snapshot;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for the validation
+		 * of this concurrent index.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate index, which might be a toast */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * This concurrent index is now valid, as it contains all the
+		 * necessary tuples. However, it might not have taken into account
+		 * tuples deleted before the reference snapshot was taken, so we need
+		 * to wait for the transactions that might have snapshots older than
+		 * ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction now that the concurrent index is valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated, it is necessary
+	 * to swap each concurrent index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes as dead at the
+	 * same time to make sure we only get constraint violations from the
+	 * indexes with the correct names.
+	 */
+
+	StartTransactionCommand();
+
+	forboth(lc, indexIds, lc2, concurrentIndexIds)
+	{
+		char	   *oldName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseRelationName(get_rel_name(indOid),
+									 NULL,
+									 "ccold",
+									 get_rel_namespace(relOid));
+
+		/* Swap old index and its concurrent entry */
+		index_concurrent_swap(concurrentOid, indOid, oldName);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+	}
+
+	/* Commit this transaction and make index swaps visible */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * The indexes now hold a fresh relfilenode of their respective
+	 * concurrent entries. It is time to mark the now-useless concurrent
+	 * entries as not ready so that they can be safely discarded from write
+	 * operations that may occur on them.
+	 *
+	 * Note that it is necessary to wait for virtual locks on the parent
+	 * relation before setting an index as dead.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Finish the index invalidation and set it as dead. */
+		index_concurrent_set_dead(relOid, indOid);
+	}
+
+	/* Commit this transaction to make the updates visible. */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the concurrent indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe because all the old entries are already
+	 * marked as invalid and not ready, so they will not be used by other
+	 * backends for any read or write operations.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	/* Get fresh snapshot for next step */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	foreach(lc, indexIds)
+	{
+		Oid 		indOid = lfirst_oid(lc);
+
+		CHECK_FOR_INTERRUPTS();
+
+		index_concurrent_drop(indOid);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * The last thing to do is to release the session-level locks on the
+	 * parent tables and their indexes.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId lockRel = *((LockRelId *) lfirst(lc));
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Log what we did */
+	if (options & REINDEXOPT_VERBOSE)
+		ereport(INFO,
+				(errmsg("relation \"%s\" was reindexed",
+						relationName),
+				 errdetail("%s.",
+						   pg_rusage_show(&ru0))));
+
+	/* Start a new transaction to finish the process properly */
+	StartTransactionCommand();
+
+	MemoryContextDelete(private_context);
+
+	return true;
+}
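For reference alongside the phase walkthrough above, here is a minimal usage sketch of the new command. The table and index names are hypothetical and the statements assume the patch is applied; as with CREATE INDEX CONCURRENTLY, a failure partway through can leave behind an invalid transient index that then has to be dropped manually.

```sql
-- Hypothetical objects, for illustration only.
CREATE TABLE items (id int PRIMARY KEY, payload text);
CREATE INDEX items_payload_idx ON items (payload);

-- Rebuild a single index without taking locks that block writes:
REINDEX INDEX CONCURRENTLY items_payload_idx;

-- Rebuild all indexes of a table, including its toast table's indexes:
REINDEX TABLE CONCURRENTLY items;
```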
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1ddb72d164..2e65371fe5 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1081,6 +1081,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	char		expected_relkind;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1128,7 +1129,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check the case of a system index that might have been invalidated by a
+	 * failed concurrent process and allow its drop. For the time being, this
+	 * only concerns indexes of toast relations that became invalid during a
+	 * REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index bfc2ac1716..cb9db29bb2 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4164,6 +4164,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 54e9c983a0..e2f18ae1ae 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2038,6 +2038,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e7acc2d9a2..4ce0e6f6ba 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7861,42 +7861,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 20b5273405..c76dacc44a 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -773,16 +773,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -798,7 +802,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												(stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												(stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												"REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
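Because PreventTransactionChain() is called whenever stmt->concurrent is set, the command is rejected inside an explicit transaction block, mirroring DROP INDEX CONCURRENTLY. A sketch with a hypothetical table:

```sql
BEGIN;
REINDEX TABLE CONCURRENTLY some_table;
-- expected to fail with something like:
-- ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
ROLLBACK;
```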
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 1aa56ab3a2..70e56a254e 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1904,6 +1904,23 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY is not allowed in
+			 * xacts.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
+
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e8458e939e..42fc4bd8ff 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3029,12 +3029,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches1("REINDEX"))
 		COMPLETE_WITH_LIST5("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches2("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches3("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+	else if (Matches3("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches3("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches3("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index b48f839028..b7120c6702 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -93,6 +93,8 @@ extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 				   TupleDesc src, AttrNumber srcAttno);
 
+extern void ResetTupleDescCache(TupleDesc tupdesc);
+
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 10759c7c58..0c962a6e26 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -235,6 +235,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependencyForAll(Oid refClassId, Oid oldRefObjectId,
+								   Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, Oid *tableId, int32 *colId);
@@ -247,6 +250,8 @@ extern Oid	get_constraint_index(Oid constraintId);
 
 extern Oid	get_index_constraint(Oid indexId);
 
+extern List *get_index_ref_constraints(Oid indexId);
+
 /* in pg_shdepend.c */
 
 extern void recordSharedDependencyOn(ObjectAddress *depender,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 20bec90b9d..c41a4ea098 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -54,6 +54,7 @@ extern Oid index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -62,7 +63,25 @@ extern Oid index_create(Relation heapRelation,
 			 bool skip_build,
 			 bool concurrent,
 			 bool is_internal,
-			 bool if_not_exists);
+			 bool if_not_exists,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create_copy(Relation heapRelation,
+										Oid indOid,
+										const char *newName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid,
+								  Oid oldIndexOid,
+								  const char *oldName);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern ObjectAddress index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 8740cee944..e73432900a 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -29,10 +29,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_rights,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern Oid	ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index a44d2178e1..5f27b10691 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3141,6 +3141,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 2606a27624..7a11eca488 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -38,6 +38,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index e519fdb0f6..5d8e922483 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3072,3 +3072,72 @@ DROP ROLE regress_reindexuser;
 SET client_min_messages TO 'warning';
 DROP SCHEMA schema_to_reindex CASCADE;
 RESET client_min_messages;
+RESET search_path;
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+ERROR:  conflicting key value violates exclusion constraint "concur_reindex_tab_c3_excl"
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  can only reindex the currently open database
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+WARNING:  concurrent reindex is not supported for catalog relations, skipping all
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+          Table "public.concur_reindex_tab"
+ Column |   Type    | Collation | Nullable | Default 
+--------+-----------+-----------+----------+---------
+ c1     | integer   |           | not null | 
+ c2     | text      |           |          | 
+ c3     | int4range |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+    "concur_reindex_tab_c3_excl" EXCLUDE USING gist (c3 WITH &&)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 1648072568..3bd825ee02 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1086,3 +1086,53 @@ DROP ROLE regress_reindexuser;
 SET client_min_messages TO 'warning';
 DROP SCHEMA schema_to_reindex CASCADE;
 RESET client_min_messages;
+RESET search_path;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
#70Andreas Karlsson
andreas@proxel.se
In reply to: Andreas Karlsson (#69)
Re: REINDEX CONCURRENTLY 2.0

On 03/13/2017 03:11 AM, Andreas Karlsson wrote:

I also fixed the code to properly support triggers.

And by "support triggers" I actually meant fixing the support for moving
the foreign keys to the new index.

Andreas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#71Michael Banck
michael.banck@credativ.de
In reply to: Andreas Karlsson (#69)
Re: REINDEX CONCURRENTLY 2.0

Hi,

I had a look at this.

On Mon, Mar 13, 2017 at 03:11:50AM +0100, Andreas Karlsson wrote:

Spotted one of my TODO comments there so I have attached a patch where I
have cleaned up that function. I also fixed the code to properly support
triggers.

The patch applies with quite a few offsets on top of current (2fd8685)
master, I have not verified that those are all ok.

Regression tests pass, also the included isolation tests.

I hope that Michael will post a full review as he worked on the code
extensively, but here are some some code comments, mostly on the
comments (note that I'm not a native speaker, so I might be wrong on
some of them as well):

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..3449c0af73 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
An index build with the <literal>CONCURRENTLY</> option failed, leaving
an <quote>invalid</> index. Such indexes are useless but it can be
convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      <command>REINDEX</> will perform a concurrent build if <literal>
+      CONCURRENTLY</> is specified. To build the index without interfering
+      with production you should drop the index and reissue either the
+      <command>CREATE INDEX CONCURRENTLY</> or <command>REINDEX CONCURRENTLY</>
+      command. Indexes of toast relations can be rebuilt with <command>REINDEX
+      CONCURRENTLY</>.

I think the "To build the index[...]" part should be rephrased, the
current diff makes it sound like you should drop the index first even if
you reindex concurrently. What about "Note that <command>REINDEX</> will
only perform a concurrent build if <literal> CONCURRENTLY</> is
specified"?

Anyway, this part is only about reindexing invalid indexes, so
mentioning that reindex is not concurrently or the part about create-
index-concurrently-then-rename only for this case is a bit weird, but
this is a pre-existing condition.
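For readers of the archive, the two rebuild paths this doc paragraph distinguishes would look roughly like the following (the index and table names here are invented for illustration):

```sql
-- Pre-existing approach: drop the invalid index, then rebuild it
-- without blocking concurrent writes.
DROP INDEX concur_index;
CREATE INDEX CONCURRENTLY concur_index ON concur_tab (c1);

-- With this patch: rebuild it in place, also without blocking writes.
REINDEX INDEX CONCURRENTLY concur_index;
```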

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8d42a347ea..c40ac0b154 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
/*
+ * index_concurrent_create_copy
+ *
+ * Create a concurrent index based on the definition of the one provided by
+ * caller that will be used for concurrent operations. The index is inserted
+ * into catalogs and needs to be built later on. This is called during
+ * concurrent reindex processing. The heap relation on which is based the index
+ * needs to be closed by the caller.
+ */

That should be "The heap relation on which the index is based ..." I
think.

+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in
+	 * commit of transaction where this concurrent index was created
+	 * at the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);

Looks like indexInfo starts with lowercase, but the comment above has
upper case `IndexInfo'.

+/*
+ * index_concurrent_swap
+ *
+ * Swap name, dependencies and constraints of the old index over to the new
+ * index, while marking the old index as invalid and the new as valid.
+ */

The `while' looks slightly odd to me, ISTM this is just another
operation this function performs, whereas "while" makes it sound like
the marking happens concurrently; so maybe ". Also, mark the old index
as invalid[...]"?

+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid, const char *oldName)
+{

[...]

+	/*
+	 * Copy contraint flags for old index. This is safe because the old index
+	 * guaranteed uniquness.
+	 */

"uniqueness".

+ /* Mark old index as valid and new is invalid as index_set_state_flags */

"new as invalid". Also, this comment style is different to this one:

+	/*
+	 * Move contstraints and triggers over to the new index
+	 */

I guess the latter could be changed to a one-line comment as the former,
but maybe there is a deeper sense (locality of comment?) in this.

+ /* make a modifiable copy */

I think comments should start capitalized?

+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index concurrently as the last step of an index concurrent
+ * process. Deletion is done through performDeletion or dependencies of the
+ * index would not get dropped. At this point all the indexes are already
+ * considered as invalid and dead so they can be dropped without using any
+ * concurrent options as it is sure that they will not interact with other
+ * server sessions.
+ */

I'd write "as it is certain" instead of "as it is sure", but I can't
explain why. Maybe persons are sure, but situations are certain?

@@ -1483,41 +1939,13 @@ index_drop(Oid indexId, bool concurrent)
* Note: the reason we use actual lock acquisition here, rather than
* just checking the ProcArray and sleeping, is that deadlock is
* possible if one of the transactions in question is blocked trying
-		 * to acquire an exclusive lock on our table.  The lock code will
+		 * to acquire an exclusive lock on our table. The lock code will

Gratuitous whitespace change, seeing that other comments added in this
patch have the extra whitespace after full stops as well.

diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index d0ee851215..9dce6420df 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -377,6 +377,94 @@ changeDependencyFor(Oid classId, Oid objectId,
}
/*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependencyForAll(Oid refClassId, Oid oldRefObjectId,
+					   Oid newRefObjectId)

This one is mostly a copy-paste of changeDependencyFor(), did you
consider refactoring that into handling the All case as well?

@@ -722,3 +810,58 @@ get_index_constraint(Oid indexId)

return constraintId;
}
+
+/*
+ * get_index_ref_constraints
+ *		Given the OID of an index, return the OID of all foreign key
+ *		constraints which reference the index.
+ */
+List *
+get_index_ref_constraints(Oid indexId)

Same for this one, but there's two similar functions
(get_constraint_index() and get_index_constraint()) already so I guess
it's fine?
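Conceptually, changeDependencyForAll() as quoted above repoints every pg_depend record that references the old object so that it references the new one. At the SQL level that is roughly equivalent to the sketch below (placeholder values, not runnable as-is; the C code of course goes through the catalog access routines rather than plain SQL):

```sql
-- Conceptual sketch only; <...> are placeholders.
UPDATE pg_depend
   SET refobjid = <newRefObjectId>
 WHERE refclassid = <refClassId>
   AND refobjid = <oldRefObjectId>;
```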

diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 72bb06c760..7a51c25d98 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -283,6 +285,87 @@ CheckIndexCompatible(Oid oldId,
return ret;
}
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have older snapshot than the given xmin

"an older snapshot" maybe?

+ * limit, because it might not contain tuples deleted just before it has
+ * been taken. Obtain a list of VXIDs of such transactions, and wait for them
+ * individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they

Probably an empty line between the two paragraphs is in order, or just
keep it one paragraph.

+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  

Please rephrase what's in the parentheses; I am not quite sure what
it means, maybe "(and any they are currently taking is..."?

+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)

Weird punctuation: I think the full stop before the parenthesis should
be moved after it, the full stop at the end of the parenthesis dropped,
and the beginning of the parenthesis not capitalized.

Oh hrm, I now see that the patch just moves the comments, so maybe don't
bother.

@@ -1739,7 +1735,7 @@ ChooseIndexColumnNames(List *indexElems)
*		Recreate a specific index.
*/
Oid
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
{
Oid			indOid;
Oid			heapOid = InvalidOid;
@@ -1751,8 +1747,9 @@ ReindexIndex(RangeVar *indexRelation, int options)
* obtain lock on table first, to avoid deadlock hazard.  The lock level
* used here must match the index lock obtained in reindex_index().
*/
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
-									  false, false,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+									  concurrent, concurrent,

I find the way the bool is passed for the third and fourth argument a
bit weird, but ok. I would still suggest to explain in the comment
above why the two other arguments to RangeVarGetRelidExtended()
(`missing_ok' and `nowait') are dependent on concurrent reindexing; it's
not super-obvious to me.

/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   concurrent, concurrent,
RangeVarCallbackOwnsTable, NULL);

Same here.

+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);

This somewhat begs the question why those two cases are so different
(one is implemented in src/backend/catalog/index.c, the other in
src/backend/commands/indexcmds.c, and their naming scheme is different).
I guess that's ok, but it might also be a hint that
ReindexRelationConcurrently() is implemented at the wrong level.

@@ -2011,3 +2040,597 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,

MemoryContextDelete(private_context);
}
+
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each phase is processed
+ * one by done for each table's indexes as well as its dependent toast indexes

"one by one".

+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{

[...]

+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * list of relation Oids given by caller. For each element in given list,
+	 * If the relkind of given relation Oid is a table, all its valid indexes

capitalization, "If" is in the middle of a sentence.

+	 * will be rebuilt, including its associated toast table indexes. If
+	 * relkind is an index, this index itself will be rebuilt. 

Maybe mention here that system catalogs and shared relations cannot be
reindexed concurrently?

+		/* Create concurrent index based on given index */
+		concurrentOid = index_concurrent_create_copy(indexParentRel,
+													 indOid,
+													 concurrentName);

AIUI, this creates/copies some meta-data for the concurrent index, but
does not yet create the index itself, right? If so, the comment is
somewhat misleading.

+		/*
+		 * Now open the relation of concurrent index, a lock is also needed on
+		 * it
+		 */

Multi-line comments should end with a full-stop I think?

+ /*
+ * Phase 3 of REINDEX CONCURRENTLY

[...]

+		/*
+		 * This concurrent index is now valid as they contain all the tuples
+		 * necessary. However, it might not have taken into account deleted tuples

"as they contain" should be "as it contains" I guess, since the rest of
the comment is talking about a singular index.

+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the concurrent indexes have been validated, it is necessary
+	 * to swap each concurrent index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes dead at the same
+	 * time to make sure we get only get constraint violations from the

"we only get"

+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * The indexes hold now a fresh relfilenode of their respective concurrent

I'd write "now hold" instead of "hold now".

+	 * entries indexes. It is time to mark the now-useless concurrent entries
+	 * as not ready so as they can be safely discarded from write operations
+	 * that may occur on them.

So the "concurrent entries" is the original index, as that one should be
now-useless? If so, that's a bit confusing terminology to me and it was
called "old index" in the previous phases.

+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the concurrent indexes, with actually the same code path as

Again, I'd have written "Drop the old indexes". Also, "with actually the
same" sounds a bit awkward, maybe "actually using the same" would be
better.

+	/*
+	 * Last thing to do is to release the session-level lock on the parent table
+	 * and the indexes of table.

"and on the indexes of the table"? Or what exactly is meant with the
last bit?

diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 20b5273405..c76dacc44a 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -773,16 +773,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);

Those lines are now in excess of 80 chars.

Cheers,

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer


#72Michael Paquier
michael.paquier@gmail.com
In reply to: Andreas Karlsson (#69)
Re: REINDEX CONCURRENTLY 2.0

On Mon, Mar 13, 2017 at 11:11 AM, Andreas Karlsson <andreas@proxel.se> wrote:

On 03/02/2017 03:10 AM, Michael Paquier wrote:

There is a lot of mumbo-jumbo in the patch to create the exact same
index definition as the original one being reindexed, and that's a
huge maintenance burden for the future. You can blame me for that in
the current patch. I am wondering if it would not just be better to
generate a CREATE INDEX query string and then use the SPI to create
the index, and also do the following extensions at SQL level:
- Add a sort of WITH NO DATA clause where the index is created, so the
index is created empty, and is marked invalid and not ready.
- Extend pg_get_indexdef_string() with an optional parameter to
enforce the index name to something else, most likely it should be
extended with the WITH NO DATA/INVALID clause, which should just be a
storage parameter by the way.
By doing something like that what the REINDEX CONCURRENTLY code path
should just be careful about is that it chooses an index name that
avoids any conflicts.

Hm, I am not sure how much that would help since a lot of the mumbo-jumbo is
by necessity in the step where we move the constraints over from the old
index to the new.

Well, the idea is really to get rid of that as there are already
facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
TABLE when rewriting a relation. It is not really attractive to have a
3rd method in the backend code to do the same kind of things, for a
method that is even harder to maintain than the other two.
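As a point of reference for the generate-a-query-string idea, pg_get_indexdef() already reconstructs a complete CREATE INDEX statement from the catalogs; the proposal above would essentially extend that output with a different index name and a not-yet-built marker. For example, using an index from the regression tests (output shown approximately):

```sql
SELECT pg_get_indexdef('concur_reindex_ind1'::regclass);
-- returns something like:
-- CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab USING btree (c1)
```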
--
Michael


#73Michael Paquier
michael.paquier@gmail.com
In reply to: Michael Banck (#71)
Re: REINDEX CONCURRENTLY 2.0

On Thu, Mar 30, 2017 at 5:13 AM, Michael Banck
<michael.banck@credativ.de> wrote:

On Mon, Mar 13, 2017 at 03:11:50AM +0100, Andreas Karlsson wrote:

Spotted one of my TODO comments there so I have attached a patch where I
have cleaned up that function. I also fixed the code to properly support
triggers.

I hope that Michael will post a full review as he worked on the code
extensively, but here are some some code comments, mostly on the
comments (note that I'm not a native speaker, so I might be wrong on
some of them as well):

Thanks, Michael. I have done a pass on it.

[review comments]

Here are more comments:

+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command>.
+   </para>
It would be nice to mention that REINDEX SCHEMA pg_catalog is not
supported, or just to state that concurrent reindexing of system
catalog indexes is not supported in general.

When running REINDEX SCHEMA CONCURRENTLY public on the regression
database I am bumping into a bunch of these warnings:
WARNING: 01000: snapshot 0x7fa5e6000040 still active
LOCATION: AtEOXact_Snapshot, snapmgr.c:1123
WARNING: 01000: snapshot 0x7fa5e6000040 still active
LOCATION: AtEOXact_Snapshot, snapmgr.c:1123

+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+   int i;
+
+   for (i = 0; i < tupdesc->natts; i++)
+       tupdesc->attrs[i]->attcacheoff = -1;
+}
I think that it would be better to merge that with TupleDescInitEntry
to be sure that the initialization of a TupleDesc's attribute goes
through only one code path.
+   /*
+    * Copy contraint flags for old index. This is safe because the old index
+    * guaranteed uniquness.
+    */
s/uniquness/uniqueness/ and s/contraint/constraint/.
+   /*
+    * Move contstraints and triggers over to the new index
+    */
s/contstraints/constraints/.
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM
} <replaceable class="PARAMETER">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM
} [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
I am taking the war path with such a sentence... But what about adding
CONCURRENTLY to the list of options in parentheses instead?
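For comparison, the two syntax styles under discussion would look like this (the parenthesized form is only a proposal at this point, not something the patch implements; the table name is a placeholder):

```sql
-- Keyword form, as in the posted patch:
REINDEX TABLE CONCURRENTLY my_table;

-- Parenthesized-option form, as suggested above (hypothetical):
REINDEX (VERBOSE, CONCURRENTLY) TABLE my_table;
```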

With this patch, we are reaching the 9th boolean argument for
index_create(). Any opinions about refactoring that into a set of
bitwise flags? Fairly unrelated to this patch.

+   /*
+    * Move all dependencies on the old index to the new
+    */
Sentence unfinished.

It has been years since I looked at this code (I wrote it in
majority), but here is what I would explore if I were to work on that
for the next release cycle:
- Explore the use of SQL-level interfaces to mark an index as inactive
at creation.
- Remove work done in changeDependencyForAll, and replace it with
something similar to what tablecmds.c does. There is, I think, some
room for refactoring here if that's not done with CREATE TABLE LIKE.
This requires the same work of creating, renaming and dropping the old
triggers and constraints.
- Do a per-index rebuild and not a per-relation rebuild for concurrent
indexing. Doing a per-relation reindex has the disadvantage that many
objects need to be created at the same time, and in the case of
REINDEX CONCURRENTLY the duration of the operation is not what matters,
it is how intrusive the operation is. Relations with many indexes would
also result in many object locks taken at each step.
The first and second points require a bit of thought for sure, but in
the long term that would pay off in maintenance if we don't reinvent
the wheel, or at least try not to.
--
Michael


#74Andreas Karlsson
andreas@proxel.se
In reply to: Michael Paquier (#73)
Re: REINDEX CONCURRENTLY 2.0

Thanks for the feedback. I will look at it when I get the time.

On 03/31/2017 08:27 AM, Michael Paquier wrote:

- Do a per-index rebuild and not a per-relation rebuild for concurrent
indexing. Doing a per-relation reindex has the disadvantage that many
objects need to be created at the same time, and in the case of
REINDEX CONCURRENTLY the duration of the operation is not what matters,
it is how intrusive the operation is. Relations with many indexes would
also result in many object locks taken at each step.

I am personally worried about the amount of time spent waiting for long
running transactions if you reindex per index rather than per relation.
When you wait on long running transactions for one index, nothing
prevents a new long transaction from starting, which we will then have
to wait for while reindexing the next index. If your database has many
long running transactions, more time will be spent waiting than working.

Is the number of locks really a big deal compared to the other costs
involved here? REINDEX does a lot of expensive things like starting
transactions, taking snapshots, scanning large tables, building a new
index, etc. The trade-off I see is between temporary disk usage and time
spent waiting for transactions, and doing the REINDEX per relation
allows for flexibility since people can still explicitly reindex per
index if they want to.
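The flexibility argument can be illustrated like this (table and index names are placeholders): with a per-relation implementation, both styles remain available to the user:

```sql
-- One pass over the whole table: waits for long-running
-- transactions once per step, for all indexes together.
REINDEX TABLE CONCURRENTLY my_table;

-- Per-index invocations: fewer transient objects exist at any one
-- time, but each command waits for concurrent transactions again.
REINDEX INDEX CONCURRENTLY my_table_pkey;
REINDEX INDEX CONCURRENTLY my_table_col_idx;
```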

Andreas


#75Michael Paquier
michael.paquier@gmail.com
In reply to: Andreas Karlsson (#74)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Mar 31, 2017 at 5:12 PM, Andreas Karlsson <andreas@proxel.se> wrote:

Thanks for the feedback. I will look at it when I get the time.

On 03/31/2017 08:27 AM, Michael Paquier wrote:

- Do a per-index rebuild and not a per-relation rebuild for concurrent
indexing. Doing a per-relation reindex has the disadvantage that many
objects need to be created at the same time, and in the case of
REINDEX CONCURRENTLY the duration of the operation is not what matters,
it is how intrusive the operation is. Relations with many indexes would
also result in many object locks taken at each step.

I am personally worried about the amount of time spent waiting for long
running transactions if you reindex per index rather than per relation.
When you wait on long running transactions for one index, nothing prevents
a new long transaction from starting, which we will then have to wait for
while reindexing the next index. If your database has many long running
transactions, more time will be spent waiting than working.

Yup, I am not saying that one approach or the other is bad; both are
worth considering. It's a trade-off between waiting and potential
manual cleanup in the event of a failure.

and doing the REINDEX per relation allows for flexibility
since people can still explicitly reindex per index if they want to.

You have a point here.

I am marking this patch as returned with feedback, this won't get in
PG10. If I am freed from the SCRAM-related open items I'll try to give
another shot at implementing this feature before the first CF of PG11.
--
Michael


#76Andreas Karlsson
andreas@proxel.se
In reply to: Michael Paquier (#75)
Re: REINDEX CONCURRENTLY 2.0

On 04/03/2017 07:57 AM, Michael Paquier wrote:

On Fri, Mar 31, 2017 at 5:12 PM, Andreas Karlsson <andreas@proxel.se> wrote:

On 03/31/2017 08:27 AM, Michael Paquier wrote:

- Do a per-index rebuild and not a per-relation rebuild for concurrent
indexing. Doing a per-relation reindex has the disadvantage that many
objects need to be created at the same time, and in the case of
REINDEX CONCURRENTLY the duration of the operation is not what matters,
it is how intrusive the operation is. Relations with many indexes would
also result in many object locks taken at each step.

I am personally worried about the amount of time spent waiting for long
running transactions if you reindex per index rather than per relation.
When you wait on long running transactions for one index, nothing prevents
a new long transaction from starting, which we will then have to wait for
while reindexing the next index. If your database has many long running
transactions, more time will be spent waiting than working.

Yup, I am not saying that one approach or the other is bad; both are
worth considering. It's a trade-off between waiting and potential
manual cleanup in the event of a failure.

Agreed, and which is worse probably depends heavily on your schema and
workload.

I am marking this patch as returned with feedback, this won't get in
PG10. If I am freed from the SCRAM-related open items I'll try to give
another shot at implementing this feature before the first CF of PG11.

Thanks! I also think I will have time to work on this before the first CF.

Andreas


#77Andreas Karlsson
andreas@proxel.se
In reply to: Michael Paquier (#73)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

I have attached a new, rebased version of the patch with most of Banck's
and some of your feedback incorporated. Thanks for the good feedback!

On 03/31/2017 08:27 AM, Michael Paquier wrote:

When running REINDEX SCHEMA CONCURRENTLY public on the regression
database I am bumping into a bunch of these warnings:
WARNING: 01000: snapshot 0x7fa5e6000040 still active
LOCATION: AtEOXact_Snapshot, snapmgr.c:1123
WARNING: 01000: snapshot 0x7fa5e6000040 still active
LOCATION: AtEOXact_Snapshot, snapmgr.c:1123

I failed to reproduce this. Do you have a reproducible test case?

+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+   int i;
+
+   for (i = 0; i < tupdesc->natts; i++)
+       tupdesc->attrs[i]->attcacheoff = -1;
+}
I think that it would be better to merge that with TupleDescInitEntry
to be sure that the initialization of a TupleDesc's attribute goes
through only one code path.

Sorry, but I am not sure I understand your suggestion. I do not like the
ResetTupleDescCache function, so all suggestions are welcome.

-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM
} <replaceable class="PARAMETER">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM
} [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
I am taking the war path with such a sentence... But what about adding
CONCURRENTLY to the list of options in parentheses instead?

I have thought some about this myself and I do not care strongly either way.

- Explore the use of SQL-level interfaces to mark an index as inactive
at creation.
- Remove work done in changeDependencyForAll, and replace it by
something similar to what tablecmds.c does. There is I think here some
place for refactoring if that's not with CREATE TABLE LIKE. This
requires to the same work of creation, renaming and drop of the old
triggers and constraints.

I am no fan of the current code duplication and how fragile it is, but I
think these cases are sufficiently different to prevent meaningful code
reuse. But it could just be me who is unfamiliar with that part of the code.

- Do a per-index rebuild and not a per-relation rebuild for concurrent
indexing. Doing a per-relation reindex has the disadvantage that many
objects need to be created at the same time, and in the case of
REINDEX CONCURRENTLY the duration of the operation is not what matters,
it is how intrusive the operation is. Relations with many indexes would
also result in many object locks taken at each step.

I am still leaning towards my current trade-off: waiting for all
queries to stop using an index can take a lot of time, so only having
to do that once per table would be a huge benefit under some workloads,
and you can still reindex each index separately if you need to.

Andreas

Attachments:

reindex-concurrently-v3.patch (text/x-patch)
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index dda0170886..c97944b2c9 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,7 +926,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
          <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
-         <command>CREATE STATISTICS</> and
+         <command>REINDEX CONCURRENTLY</>, <command>CREATE STATISTICS</> and
          <command>ALTER TABLE VALIDATE</command> and other
          <command>ALTER TABLE</command> variants (for full details see
          <xref linkend="SQL-ALTERTABLE">).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..4ef3a89a29 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
      <para>
       An index build with the <literal>CONCURRENTLY</> option failed, leaving
       an <quote>invalid</> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</> to rebuild them. Note that
-      <command>REINDEX</> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+      convenient to use <command>REINDEX</> to rebuild them.
      </para>
     </listitem>
 
@@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
@@ -231,6 +243,173 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
    reindex anything.
   </para>
 
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</> option of <command>REINDEX</>. When this option
+    is used, <productname>PostgreSQL</> must perform two scans of the table
+    for each index that needs to be rebuilt and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created, where all
+    the concurrent entries are created using only one transaction. Note that
+    if there are multiple indexes to be rebuilt then each step loops through
+    all the indexes we're rebuilding, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition will be used to replace the
+       old index. This step is done as a single transaction for all the indexes
+       involved in this process, meaning that if
+       <command>REINDEX CONCURRENTLY</> is run on a table with multiple
+       indexes, all the catalog entries of the new indexes are created within a
+       single transaction. A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as their
+       parent tables to prevent any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_index.indisready</> is
+       switched to <quote>true</> to make it ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while
+       the first pass build was running. This step is performed within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       All the constraints and foreign keys which refer to the index are swapped
+       to refer to the new index definition, and the names of the indexes are
+       changed. At this point <literal>pg_index.indisvalid</> is switched to
+       <quote>true</> for the new index and to <quote>false</> for the old, and
+       a cache invalidation is done so that all sessions that referenced the
+       old index are invalidated. This step is done within a single transaction
+       for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Old indexes have <literal>pg_index.indisready</> switched to <quote>false</>
+       to prevent any new tuple insertions, after waiting for running queries
+       that might reference the old index to complete. This step is done within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The old index definition and its data are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid
+    index and try again to perform <command>REINDEX CONCURRENTLY</>.
+    The concurrent index created during the processing has a name ending in
+    the suffix <literal>ccnew</>, or <literal>ccold</> if it is an old index
+    definition which we failed to drop. Invalid indexes can be dropped using
+    <literal>DROP INDEX</>, including invalid toast indexes.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot.
+   </para>
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>. Valid indexes, being unique
+    for a given toast relation, cannot be dropped.
+   </para>
+
+   <para>
+    <command>REINDEX</command> uses <literal>ACCESS EXCLUSIVE</literal> lock
+    on all the relations involved during operation. When
+    <command>CONCURRENTLY</command> is specified, the operation is done with
+    <literal>SHARE UPDATE EXCLUSIVE</literal>.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command> since system catalogs cannot be reindexed
+    concurrently.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -262,7 +441,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild all the indexes of a table while allowing read and write
+   operations on the involved relations:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 4436c86361..509b8d4cb5 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -228,6 +228,18 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	dstAtt->attidentity = '\0';
 }
 
+/*
+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+	int i;
+
+	for (i = 0; i < tupdesc->natts; i++)
+		tupdesc->attrs[i].attcacheoff = -1;
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index c7b2f031f0..3f4661d644 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -680,6 +680,7 @@ UpdateIndexRelation(Oid indexoid,
  * classObjectId: array of index opclass OIDs, one per index column
  * coloptions: array of per-index-column indoption settings
  * reloptions: AM-specific options
+ * tupdesc: Tuple descriptor used for the index if defined
  * isprimary: index is a PRIMARY KEY
  * isconstraint: index is owned by PRIMARY KEY, UNIQUE, or EXCLUSION constraint
  * deferrable: constraint is DEFERRABLE
@@ -693,6 +694,10 @@ UpdateIndexRelation(Oid indexoid,
  * is_internal: if true, post creation hook for new index
  * if_not_exists: if true, do not throw an error if a relation with
  *		the same name already exists.
+ * is_reindex: if true, create an index used as a duplicate of an existing
+ *		index during a concurrent operation. This index can also be on a
+ *		toast relation. Sufficient locks are normally already taken on the
+ *		related relations by the time this is called during a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -709,6 +714,7 @@ index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -717,7 +723,8 @@ index_create(Relation heapRelation,
 			 bool skip_build,
 			 bool concurrent,
 			 bool is_internal,
-			 bool if_not_exists)
+			 bool if_not_exists,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -763,16 +770,19 @@ index_create(Relation heapRelation,
 	 * release locks before committing in catalogs
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemNamespace(get_rel_namespace(heapRelationId)))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
 
 	/*
-	 * This case is currently not supported, but there's no way to ask for it
-	 * in the grammar anyway, so it can't happen.
+	 * This case is currently only supported during a concurrent index
+	 * rebuild, but there is no way to ask for it in the grammar otherwise
+	 * anyway. If support for exclusion constraints is added in the future,
+	 * the similar check in check_exclusion_constraint should be changed
+	 * accordingly as well.
 	 */
-	if (concurrent && is_exclusion)
+	if (concurrent && is_exclusion && !is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -811,14 +821,21 @@ index_create(Relation heapRelation,
 	}
 
 	/*
-	 * construct tuple descriptor for index tuples
+	 * construct tuple descriptor for index tuples if nothing is passed by
+	 * caller.
 	 */
-	indexTupDesc = ConstructTupleDescriptor(heapRelation,
-											indexInfo,
-											indexColNames,
-											accessMethodObjectId,
-											collationObjectId,
-											classObjectId);
+	if (tupdesc == NULL)
+		indexTupDesc = ConstructTupleDescriptor(heapRelation,
+												indexInfo,
+												indexColNames,
+												accessMethodObjectId,
+												collationObjectId,
+												classObjectId);
+	else
+	{
+		Assert(indexColNames == NIL);
+		indexTupDesc = tupdesc;
+	}
 
 	/*
 	 * Allocate an OID for the index, unless we were told what to use.
@@ -1125,6 +1142,459 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+/*
+ * index_concurrent_create_copy
+ *
+ * Create a concurrent index based on the definition of the one provided by
+ * caller that will be used for concurrent operations. The index is inserted
+ * into catalogs and needs to be built later on. This is called during
+ * concurrent reindex processing. The heap relation on which the index is based
+ * needs to be closed by the caller.
+ */
+Oid
+index_concurrent_create_copy(Relation heapRelation, Oid indOid, const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	HeapTuple	indexTuple,
+				classTuple;
+	Datum		indclassDatum,
+				colOptionDatum,
+				optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* Concurrent index uses the same index information as former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/* Do not copy exclusion constraint */
+	indexInfo->ii_ExclusionOps = NULL;
+	indexInfo->ii_ExclusionProcs = NULL;
+	indexInfo->ii_ExclusionStrats = NULL;
+
+	/*
+	 * Create a copy of the tuple descriptor to be used for the concurrent
+	 * entry and reset any cached attribute offsets on it for a fresh version.
+	 */
+	indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
+	ResetTupleDescCache(indexTupDesc);
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, indOid);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 newName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 NIL,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexTupDesc,
+								 false, /* do not copy primary flag */
+								 false, /* is constraint? */
+								 false, /* is deferrable? */
+								 false, /* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false, /* is_internal? */
+								 false, /* if_not_exists? */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
+
+/*
+ * index_concurrent_build
+ *
+ * Build an index for a concurrent operation. Low-level locks are taken when
+ * this operation is performed, only to prevent schema changes, but they need
+ * to be kept until the end of the transaction performing this operation.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	heapRel,
+				indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in the
+	 * commit of the transaction where this concurrent index was created at
+	 * the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both relations, and keep the locks */
+	heap_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts. Once we
+	 * commit this transaction, any new transactions that open the table must
+	 * insert new entries into the index for insertions and non-HOT updates.
+	 */
+	index_set_state_flags(indexOid, INDEX_CREATE_SET_READY);
+}
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap name, dependencies and constraints of the old index over to the new
+ * index, while marking the old index as invalid and the new as valid.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid, const char *oldName)
+{
+	Relation	pg_class,
+				pg_index,
+				pg_constraint,
+				pg_trigger;
+	Relation	oldClassRel,
+				newClassRel;
+	HeapTuple	oldClassTuple,
+				newClassTuple;
+	Form_pg_class oldClassForm,
+				newClassForm;
+	HeapTuple	oldIndexTuple,
+				newIndexTuple;
+	Form_pg_index oldIndexForm,
+				newIndexForm;
+	Oid			indexConstraintOid;
+	List	   *constraintOids = NIL;
+	ListCell   *lc;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldClassRel = relation_open(oldIndexOid, ShareUpdateExclusiveLock);
+	newClassRel = relation_open(newIndexOid, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+
+	oldClassForm = (Form_pg_class) GETSTRUCT(oldClassTuple);
+	newClassForm = (Form_pg_class) GETSTRUCT(newClassTuple);
+
+	/* Swap the names */
+	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
+	namestrcpy(&oldClassForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
+	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
+
+	heap_freetuple(oldClassTuple);
+	heap_freetuple(newClassTuple);
+
+	/* Now swap index info */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+
+	oldIndexForm = (Form_pg_index) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
+
+	/*
+	 * Copy constraint flags from the old index. This is safe because the
+	 * old index guaranteed uniqueness.
+	 */
+	newIndexForm->indisprimary = oldIndexForm->indisprimary;
+	oldIndexForm->indisprimary = false;
+	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+	oldIndexForm->indisexclusion = false;
+	newIndexForm->indimmediate = oldIndexForm->indimmediate;
+	oldIndexForm->indimmediate = true;
+
+	/* Mark new index as valid and old as invalid, as index_set_state_flags() does */
+	newIndexForm->indisvalid = true;
+	oldIndexForm->indisvalid = false;
+	oldIndexForm->indisclustered = false;
+
+	CatalogTupleUpdate(pg_index, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_index, &newIndexTuple->t_self, newIndexTuple);
+
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+
+	/*
+	 * Move constraints and triggers over to the new index
+	 */
+
+	constraintOids = get_index_ref_constraints(oldIndexOid);
+
+	indexConstraintOid = get_index_constraint(oldIndexOid);
+
+	if (OidIsValid(indexConstraintOid))
+		constraintOids = lappend_oid(constraintOids, indexConstraintOid);
+
+	pg_constraint = heap_open(ConstraintRelationId, RowExclusiveLock);
+	pg_trigger = heap_open(TriggerRelationId, RowExclusiveLock);
+
+	foreach(lc, constraintOids)
+	{
+		HeapTuple	constraintTuple,
+					triggerTuple;
+		Form_pg_constraint conForm;
+		ScanKeyData key[1];
+		SysScanDesc scan;
+		Oid			constraintOid = lfirst_oid(lc);
+
+		/* Move the constraint from the old to the new index */
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		conForm = ((Form_pg_constraint) GETSTRUCT(constraintTuple));
+
+		if (conForm->conindid == oldIndexOid)
+		{
+			conForm->conindid = newIndexOid;
+
+			CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+		}
+
+		heap_freetuple(constraintTuple);
+
+		/* Search for trigger records */
+		ScanKeyInit(&key[0],
+					Anum_pg_trigger_tgconstraint,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(constraintOid));
+
+		scan = systable_beginscan(pg_trigger, TriggerConstraintIndexId, true,
+								  NULL, 1, key);
+
+		while (HeapTupleIsValid((triggerTuple = systable_getnext(scan))))
+		{
+			Form_pg_trigger tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			if (tgForm->tgconstrindid != oldIndexOid)
+				continue;
+
+			/* Make a modifiable copy */
+			triggerTuple = heap_copytuple(triggerTuple);
+			tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			tgForm->tgconstrindid = newIndexOid;
+
+			CatalogTupleUpdate(pg_trigger, &triggerTuple->t_self, triggerTuple);
+
+			heap_freetuple(triggerTuple);
+		}
+
+		systable_endscan(scan);
+	}
+
+	/*
+	 * Move all dependencies on the old index to the new
+	 */
+
+	if (OidIsValid(indexConstraintOid))
+	{
+		ObjectAddress myself,
+					referenced;
+
+		/* Change to having the new index depend on the constraint */
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexOid;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = indexConstraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependencyForAll(RelationRelationId, oldIndexOid, newIndexOid);
+
+	/* Close relations */
+	heap_close(pg_class, RowExclusiveLock);
+	heap_close(pg_index, RowExclusiveLock);
+	heap_close(pg_constraint, RowExclusiveLock);
+	heap_close(pg_trigger, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldClassRel, NoLock);
+	relation_close(newClassRel, NoLock);
+}
+
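To make the catalog dance above easier to follow, here is a minimal standalone sketch of the row-level swap performed by index_concurrent_swap, with a toy struct standing in for the pg_class/pg_index tuples (field and function names here are illustrative, not the real catalog forms):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for the pg_class/pg_index rows of one index. */
typedef struct
{
	char		relname[64];
	bool		indisvalid;
	bool		indisprimary;
} IdxRow;

/*
 * Mimics the core of the swap: the new index takes over the old index's
 * name and constraint-backing flags, while the old index is renamed out
 * of the way and marked invalid.
 */
static void
swap_index_rows(IdxRow *oldidx, IdxRow *newidx, const char *old_name)
{
	strcpy(newidx->relname, oldidx->relname);	/* new index inherits name */
	strcpy(oldidx->relname, old_name);	/* old index renamed away */

	newidx->indisprimary = oldidx->indisprimary;
	oldidx->indisprimary = false;

	newidx->indisvalid = true;
	oldidx->indisvalid = false;
}
```

In the patch both tuple updates happen in the same transaction, so other sessions see either the pre-swap or post-swap state, never a mix.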
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function, the index is seen as dead by all backends. The low-level locks
+ * taken here are kept until the end of the transaction calling this
+ * function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRelation,
+				indexRelation;
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're about
+	 * to stop doing inserts into the index which could show conflicts with
+	 * existing predicate locks, so now is the time to move them to the heap
+	 * relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just might
+	 * have it open for updating it.  So now we can unset indisready and
+	 * indislive, then wait till nobody could be using it at all anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit all
+	 * sessions will refresh the table's index list.  Forgetting just the
+	 * index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index as the last step of a concurrent reindex or drop
+ * process. Deletion is done through performDeletion, or the dependencies of
+ * the index would not get dropped. At this point the index is already
+ * considered invalid and dead, so it can be dropped without any concurrent
+ * option, as it is certain that it will not interact with other server
+ * sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid			constraintOid = get_index_constraint(indexOid);
+	ObjectAddress object;
+	Form_pg_index indexForm;
+	Relation	pg_index;
+	HeapTuple	indexTuple;
+
+	/*
+	 * Check that the index being dropped is not alive; if it were, it might
+	 * still be used by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check to prevent a live index from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process. Register
+	 * constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object, DROP_RESTRICT, 0);
+}
+
 /*
  * index_constraint_create
  *
@@ -1491,36 +1961,8 @@ index_drop(Oid indexId, bool concurrent)
 		 */
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index dd6ca3e8f7..981c849296 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -376,6 +376,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 	return count;
 }
 
+/*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependencyForAll(Oid refClassId, Oid oldRefObjectId,
+					   Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = heap_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't rename stuff out
+	 * of pg_catalog, which would likely be a bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+		errmsg("cannot remove dependency on %s because it is a system object",
+			   getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	heap_close(depRel, RowExclusiveLock);
+
+	return count;
+}
+
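The redirect-or-delete rule in changeDependencyForAll can be sketched outside the catalog machinery like this — a toy array stands in for the pg_depend scan, and names here are illustrative rather than the real APIs:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy pg_depend row: refobjid 0 means the row has been deleted. */
typedef struct
{
	int			refobjid;
} DepRow;

/*
 * Repoint every record referencing old_id to new_id, or delete the
 * record outright when the new referenced object is pinned (pinned
 * objects carry no dependency entries). Returns the number of records
 * touched, matching the count returned by changeDependencyForAll.
 */
static long
repoint_deps(DepRow *rows, int nrows, int old_id, int new_id,
			 bool new_is_pinned)
{
	long		count = 0;

	for (int i = 0; i < nrows; i++)
	{
		if (rows[i].refobjid != old_id)
			continue;
		rows[i].refobjid = new_is_pinned ? 0 : new_id;
		count++;
	}
	return count;
}
```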
 /*
  * isObjectPinned()
  *
@@ -730,3 +818,58 @@ get_index_constraint(Oid indexId)
 
 	return constraintId;
 }
+
+/*
+ * get_index_ref_constraints
+ *		Given the OID of an index, return the OID of all foreign key
+ *		constraints which reference the index.
+ */
+List *
+get_index_ref_constraints(Oid indexId)
+{
+	List	   *result = NIL;
+	Relation	depRel;
+	ScanKeyData key[3];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	/* Search the dependency table for the index */
+	depRel = heap_open(DependRelationId, AccessShareLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexId));
+	ScanKeyInit(&key[2],
+				Anum_pg_depend_refobjsubid,
+				BTEqualStrategyNumber, F_INT4EQ,
+				Int32GetDatum(0));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 3, key);
+
+	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+	{
+		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+
+		/*
+		 * We assume any normal dependency from a constraint must be what we
+		 * are looking for.
+		 */
+		if (deprec->classid == ConstraintRelationId &&
+			deprec->objsubid == 0 &&
+			deprec->deptype == DEPENDENCY_NORMAL)
+		{
+			result = lappend_oid(result, deprec->objid);
+		}
+	}
+
+	systable_endscan(scan);
+	heap_close(depRel, AccessShareLock);
+
+	return result;
+}
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 6f517bbcda..8476fff4e3 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -332,9 +332,9 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 list_make2("chunk_id", "chunk_seq"),
 				 BTREE_AM_OID,
 				 rel->rd_rel->reltablespace,
-				 collationObjectId, classObjectId, coloptions, (Datum) 0,
+				 collationObjectId, classObjectId, coloptions, (Datum) 0, NULL,
 				 true, false, false, false,
-				 true, false, false, true, false);
+				 true, false, false, true, false, false);
 
 	heap_close(toast_rel, NoLock);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index b61aaac284..0c46ac6d7e 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -51,6 +51,7 @@
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/pg_rusage.h"
 #include "utils/regproc.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -77,6 +78,7 @@ static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 
 /*
  * CheckIndexCompatible
@@ -283,6 +285,90 @@ CheckIndexCompatible(Oid oldId,
 	return ret;
 }
 
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given xmin
+ * limit, because the index built may not contain tuples deleted just before
+ * the reference snapshot was taken. Obtain a list of VXIDs of such
+ * transactions, and wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int			i,
+				n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue;			/* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int			n_newer_snapshots;
+			int			j;
+			int			k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue;	/* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
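The recheck trick described in the header comment — forgetting a vxid as soon as it stops showing up in a fresh GetCurrentVirtualXIDs call — can be isolated into a small standalone sketch, with integers standing in for VirtualTransactionId and 0 playing the role of the invalid vxid (the names here are mine, not from the patch):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for VirtualTransactionId: 0 marks "invalid"/forgotten. */
typedef int Vxid;

/*
 * Before waiting on old[i], drop every remaining old vxid that no
 * longer appears in a fresh snapshot of running transactions (newer[]),
 * so we never block on a transaction that has already gone idle or
 * finished.
 */
static void
forget_finished(Vxid *old, int n_old, const Vxid *newer, int n_newer, int i)
{
	for (int j = i; j < n_old; j++)
	{
		bool		still_running = false;

		if (old[j] == 0)
			continue;			/* already found uninteresting */

		for (int k = 0; k < n_newer; k++)
		{
			if (old[j] == newer[k])
			{
				still_running = true;
				break;
			}
		}

		if (!still_running)
			old[j] = 0;			/* no need to wait for it anymore */
	}
}
```

In WaitForOlderSnapshots this pruning runs once per iteration (for i > 0), so each VirtualXactLock wait is only taken on vxids still present in the latest snapshot.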
+
 /*
  * DefineIndex
  *		Creates a new index.
@@ -324,7 +410,6 @@ DefineIndex(Oid relationId,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -335,9 +420,7 @@ DefineIndex(Oid relationId,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -670,12 +753,12 @@ DefineIndex(Oid relationId,
 					 indexInfo, indexColNames,
 					 accessMethodId, tablespaceId,
 					 collationObjectId, classObjectId,
-					 coloptions, reloptions, stmt->primary,
+					 coloptions, reloptions, NULL, stmt->primary,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
 					 stmt->concurrent, !check_rights,
-					 stmt->if_not_exists);
+					 stmt->if_not_exists, false);
 
 	ObjectAddressSet(address, RelationRelationId, indexRelationId);
 
@@ -765,34 +848,15 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -850,74 +914,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots) /* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -1747,7 +1746,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -1759,7 +1758,8 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
 									  false, false,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
@@ -1772,7 +1772,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 
 	return indOid;
 }
@@ -1841,18 +1844,26 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   false, false,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -1870,7 +1881,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -1882,6 +1893,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	List	   *relids = NIL;
 	ListCell   *l;
 	int			num_keys;
+	bool		concurrent_warning = false;
 
 	AssertArg(objectName);
 	Assert(objectKind == REINDEX_OBJECT_SCHEMA ||
@@ -1972,6 +1984,17 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 			!IsSystemClass(relid, classtuple))
 			continue;
 
+		/* A system catalog cannot be reindexed concurrently */
+		if (concurrent && IsSystemNamespace(get_rel_namespace(relid)))
+		{
+			if (!concurrent_warning)
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("concurrent reindex is not supported for catalog relations, skipping all")));
+			concurrent_warning = true;
+			continue;
+		}
+
 		/* Save the list of relation OIDs in private context */
 		old = MemoryContextSwitchTo(private_context);
 
@@ -1998,24 +2021,625 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
-
-			if (options & REINDEXOPT_VERBOSE)
-				ereport(INFO,
-						(errmsg("table \"%s.%s\" was reindexed",
-								get_namespace_name(get_rel_namespace(relid)),
-								get_rel_name(relid))));
+
+		if (concurrent)
+		{
+			result = ReindexRelationConcurrently(relid, options);
+
+			PushActiveSnapshot(GetTransactionSnapshot());
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+		if (result && (options & REINDEXOPT_VERBOSE))
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							get_namespace_name(get_rel_namespace(relid)),
+							get_rel_name(relid))));
+		PopActiveSnapshot();
+		CommitTransactionCommand();
+	}
+	StartTransactionCommand();
+
+	MemoryContextDelete(private_context);
+}
+
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for a given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each phase is
+ * processed one by one for each of the table's indexes, as well as the
+ * indexes of its toast table if it has one.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *parentRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *newIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc,
+			   *lc2;
+	MemoryContext private_context;
+	MemoryContext old;
+	char	   *relationName = NULL;
+	PGRUsage	ru0;
+
+	/*
+	 * Create a memory context that will survive forced transaction commits we
+	 * do below.  Since it is a child of PortalContext, it will go away
+	 * eventually even if we suffer an error; there's no need for special
+	 * abort cleanup logic.
+	 */
+	private_context = AllocSetContextCreate(PortalContext,
+											"ReindexConcurrent",
+											ALLOCSET_SMALL_SIZES);
+
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		/* Save data needed by REINDEX VERBOSE in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		relationName = get_rel_name(relationOid);
+
+		pg_rusage_init(&ru0);
+
+		MemoryContextSwitchTo(old);
+	}
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given relation
+	 * Oid is a table, all its valid indexes will be rebuilt, including its
+	 * associated toast table indexes. If the relkind is an index, this index
+	 * itself will be rebuilt. The locks taken on the parent relations and
+	 * involved indexes are kept until this transaction is committed, to
+	 * protect against schema changes that might occur before the session
+	 * lock is taken on each relation; that session lock similarly protects
+	 * against any schema change that could happen within the multiple
+	 * transactions used during this process.
+	 */
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes including
+				 * toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				MemoryContextSwitchTo(old);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+														   ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+					{
+						/* Save the list of relation OIDs in private context */
+						old = MemoryContextSwitchTo(private_context);
+
+						indexIds = lappend_oid(indexIds, cellOid);
+
+						MemoryContextSwitchTo(old);
+					}
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+														  ShareUpdateExclusiveLock);
+
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					MemoryContextSwitchTo(old);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+															   ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+						{
+							/*
+							 * Save the list of relation OIDs in private
+							 * context
+							 */
+							old = MemoryContextSwitchTo(private_context);
+
+							indexIds = lappend_oid(indexIds, cellOid);
+
+							MemoryContextSwitchTo(old);
+						}
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be rebuilt concurrently.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			parentOid = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(parentOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(parentOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(parentOid);
+
+				MemoryContextSwitchTo(old);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					indexIds = lappend_oid(indexIds, relationOid);
+
+					MemoryContextSwitchTo(old);
+				}
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We first need to create an index which is based on the same data as
+	 * the former index, except that it will only be registered in the
+	 * catalogs and will be built later. These operations can be performed at
+	 * once for all the indexes of a parent relation, including the indexes
+	 * of its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId  *lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent relation, which might be a toast table */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a temporary relation name for the new index */
+		concurrentName = ChooseRelationName(get_rel_name(indOid),
+											NULL,
+											"ccnew",
+											get_rel_namespace(indexRel->rd_index->indrelid));
+
+		/* Create new index definition based on given index */
+		concurrentOid = index_concurrent_create_copy(indexParentRel,
+													 indOid,
+													 concurrentName);
+
+		/* Now open the relation of the new index, a lock is also needed on it */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the list of oids and locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Save the new index Oid */
+		newIndexIds = lappend_oid(newIndexIds, concurrentOid);
+
+		/*
+		 * Save copies of the lockrelids to protect the old and new indexes
+		 * from being dropped, then close the relations. A pointer to a local
+		 * variable must not be stored in the list, as it would be clobbered
+		 * and go out of scope before the list is used. The lockrelid of the
+		 * parent relation is not taken here to avoid taking multiple locks
+		 * on the same relation; instead we rely on parentRelationIds built
+		 * earlier.
+		 */
+		lockrelid = palloc(sizeof(LockRelId));
+		*lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+		lockrelid = palloc(sizeof(LockRelId));
+		*lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		MemoryContextSwitchTo(old);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks and lock tags for the following wait phases, where
+	 * other backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId  *lockrelid;
+		LOCKTAG    *heaplocktag;
+
+		/* Save the list of locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Add a copy of the parent relation's lockrelid to the list */
+		lockrelid = palloc(sizeof(LockRelId));
+		*lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid->dbId, lockrelid->relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		MemoryContextSwitchTo(old);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transaction will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session level lock on the relation, the
+	 * concurrent index and its copy to ensure that none of them are dropped
+	 * until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the new indexes in a separate transaction for each index to
+	 * avoid having open transactions for an unnecessarily long time. A
+	 * concurrent build is done for each new index that will replace an old
+	 * one. Before doing that, we need to wait until no running transaction
+	 * could still have the parent table of an index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		bool		primary;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index's concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Index relation has been closed by the previous commit, so reopen
+		 * it to determine whether it is used as a primary key. Keep it open
+		 * until the build is done, as its cached state is used below.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		primary = indexRel->rd_index->indisprimary;
+
+		/* Perform concurrent build of new index */
+		index_concurrent_build(indexRel->rd_index->indrelid,
+							   concurrentOid,
+							   primary);
+
+		index_close(indexRel, NoLock);
+
+		/* We can do away with our snapshot */
 		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update of the
+		 * concurrent index visible to other transactions.
+		 */
 		CommitTransactionCommand();
 	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the new indexes catch up with any tuples that were
+	 * inserted during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Scan the heap for each new index, then insert any missing index
+	 * entries.
+	 */
+	foreach(lc, newIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		TransactionId limitXmin;
+		Snapshot	snapshot;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for the new index's
+		 * validation.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate the index, which might be a toast index */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save its xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * The new index is now valid in the sense that it contains all the
+		 * tuples necessary. However, it might not contain tuples deleted
+		 * just before the reference snapshot was taken, so we need to wait
+		 * out the transactions that might have older snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction now that the new index is valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the new indexes have been validated, it is necessary to swap
+	 * each new index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes as invalid at the
+	 * same time to make sure we only get constraint violations from the
+	 * indexes with the correct names.
+	 */
+
+	StartTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		char	   *oldName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseRelationName(get_rel_name(indOid),
+									 NULL,
+									 "ccold",
+									 get_rel_namespace(relOid));
+
+		/* Swap old index with the new one */
+		index_concurrent_swap(concurrentOid, indOid, oldName);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+	}
+
+	/* Commit this transaction and make index swaps visible */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * Mark the old indexes as dead so they can later be dropped.
+	 *
+	 * Note that it is necessary to wait for virtual locks on the parent
+	 * relation before setting the index as dead.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Finish the index invalidation and set it as dead. */
+		index_concurrent_set_dead(relOid, indOid);
+	}
+
+	/* Commit this transaction to make the updates visible. */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the old indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe because all the old indexes are already
+	 * marked as invalid and not ready, so they will not be used by other
+	 * backends for any read or write operations.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	/* Get fresh snapshot for next step */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+
+		CHECK_FOR_INTERRUPTS();
+
+		index_concurrent_drop(indOid);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Finally, release the session-level locks on the relations involved.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Log what we did */
+	if (options & REINDEXOPT_VERBOSE)
+		ereport(INFO,
+				(errmsg("relation \"%s\" was reindexed",
+						relationName),
+				 errdetail("%s.",
+						   pg_rusage_show(&ru0))));
+
+	/* Start a new transaction to finish the process properly */
 	StartTransactionCommand();
 
 	MemoryContextDelete(private_context);
+
+	return true;
 }
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 0f08245a67..2d731e4c59 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1092,6 +1092,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	bool		is_partition;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1151,7 +1152,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check the case of a system index that might have been invalidated by a
+	 * failed concurrent process and allow its drop. For the time being, this
+	 * only concerns indexes of toast relations that became invalid during a
+	 * REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index f9ddf4ed76..80e4419c24 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4260,6 +4260,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 8d92c03633..1df08b9296 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2083,6 +2083,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7d0de99baf..91382d84fb 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -8026,42 +8026,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 775477c6cf..df6372e9c0 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -779,16 +779,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -804,7 +808,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												(stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												(stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												"REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index a41932ff27..9491b157c1 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1951,6 +1951,23 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY are not allowed in
+			 * xacts.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
+
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 1583cfa998..1b32c3b572 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3139,12 +3139,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches1("REINDEX"))
 		COMPLETE_WITH_LIST5("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches2("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches3("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+	else if (Matches3("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches3("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches3("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 989fe738bb..c2a805fbc3 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -95,6 +95,8 @@ extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 				   TupleDesc src, AttrNumber srcAttno);
 
+extern void ResetTupleDescCache(TupleDesc tupdesc);
+
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index b9f98423cc..7d8cea6e53 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -236,6 +236,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependencyForAll(Oid refClassId, Oid oldRefObjectId,
+								   Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, char deptype, Oid *tableId, int32 *colId);
@@ -246,6 +249,8 @@ extern Oid	get_constraint_index(Oid constraintId);
 
 extern Oid	get_index_constraint(Oid indexId);
 
+extern List *get_index_ref_constraints(Oid indexId);
+
 /* in pg_shdepend.c */
 
 extern void recordSharedDependencyOn(ObjectAddress *depender,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 1d4ec09f8f..7d66248363 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -54,6 +54,7 @@ extern Oid index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -62,7 +63,25 @@ extern Oid index_create(Relation heapRelation,
 			 bool skip_build,
 			 bool concurrent,
 			 bool is_internal,
-			 bool if_not_exists);
+			 bool if_not_exists,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create_copy(Relation heapRelation,
+										Oid indOid,
+										const char *newName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid,
+								  Oid oldIndexOid,
+								  const char *oldName);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern ObjectAddress index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index f7bb4a54f7..755cccd7e4 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -30,10 +30,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_not_in_use,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern Oid	ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 5f2a4a75da..e25b860292 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3224,6 +3224,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 32c965b2a0..1499cb6c1f 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -41,6 +41,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 064adb4640..74fb614170 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3067,3 +3067,73 @@ DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
 NOTICE:  drop cascades to 6 other objects
+RESET client_min_messages;
+RESET search_path;
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+ERROR:  conflicting key value violates exclusion constraint "concur_reindex_tab_c3_excl"
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  can only reindex the currently open database
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+WARNING:  concurrent reindex is not supported for catalog relations, skipping all
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+          Table "public.concur_reindex_tab"
+ Column |   Type    | Collation | Nullable | Default 
+--------+-----------+-----------+----------+---------
+ c1     | integer   |           | not null | 
+ c2     | text      |           |          | 
+ c3     | int4range |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+    "concur_reindex_tab_c3_excl" EXCLUDE USING gist (c3 WITH &&)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 67470db918..1aaca37fd9 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1085,3 +1085,54 @@ RESET ROLE;
 DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
+RESET client_min_messages;
+RESET search_path;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
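The regression tests above end by checking that no invalid indexes remain. Outside the test suite, any leftover from a failed concurrent rebuild can be spotted with a plain catalog query; this is a minimal sketch using only the standard pg_index columns, nothing added by the patch:

```sql
-- List indexes that are not valid, e.g. ccnew/ccold leftovers after a
-- failed REINDEX CONCURRENTLY, together with their parent tables.
SELECT indexrelid::regclass AS index_name,
       indrelid::regclass   AS table_name
FROM pg_index
WHERE NOT indisvalid;
```

Dropping the reported indexes with DROP INDEX and re-running REINDEX CONCURRENTLY is the recovery path the patched documentation recommends.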
#78Andreas Karlsson
andreas@proxel.se
In reply to: Andreas Karlsson (#77)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

Here is a rebased version of the patch.

Andreas

Attachments:

reindex-concurrently-v4.patchtext/x-patch; name=reindex-concurrently-v4.patchDownload
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index a0ca2851e5..f8c59ea127 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,6 +926,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
          <command>ANALYZE</command>, <command>CREATE INDEX CONCURRENTLY</command>,
+         <command>REINDEX CONCURRENTLY</command>,
          <command>CREATE STATISTICS</command> and
          <command>ALTER TABLE VALIDATE</command> and other
          <command>ALTER TABLE</command> variants (for full details see
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 2e053c4c24..4019bad4c2 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="parameter">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="parameter">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
      <para>
       An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
       an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.
      </para>
     </listitem>
 
@@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+      endterm="SQL-REINDEX-CONCURRENTLY-title">.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
@@ -231,6 +243,173 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
    reindex anything.
   </para>
 
+  <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+   <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</> option of <command>REINDEX</>. When this option
+    is used, <productname>PostgreSQL</> must perform two scans of the table
+    for each index that needs to be rebuilt, and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created, where all
+    the concurrent entries are created using only one transaction. Note that
+    if there are multiple indexes to be rebuilt then each step loops through
+    all the indexes we're rebuilding, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition will be used to replace the
+       old index. This step is done as a single transaction for all the indexes
+       involved in this process, meaning that if
+       <command>REINDEX CONCURRENTLY</> is run on a table with multiple
+       indexes, all the catalog entries of the new indexes are created within a
+       single transaction. A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as their
+       parent table to prevent any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_index.indisready</> is
+       switched to <quote>true</> to mark it as ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add the tuples that were inserted
+       while the first pass was running. This step is performed within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       All the constraints and foreign keys which refer to the index are swapped
+       to refer to the new index definition, and the names of the indexes are
+       changed. At this point <literal>pg_index.indisvalid</> is switched to
+       <quote>true</> for the new index and to <quote>false</> for the old, and
+       a cache invalidation is done so that all sessions that referenced the
+       old index discard it. This step is done within a single transaction
+       for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Old indexes have <literal>pg_index.indisready</> switched to <quote>false</>
+       to prevent any new tuple insertions, after waiting for running queries that
+       might reference the old index to complete. This step is done within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The old index definition and its data are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</>
+    command will fail but leave behind an <quote>invalid</> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</> <command>\d</> command will report
+    such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid
+    index and try again to perform <command>REINDEX CONCURRENTLY</>.
+    The concurrent index created during the processing has a name ending in
+    the suffix <literal>ccnew</>, or <literal>ccold</> if it is an old index
+    definition that could not be dropped. Invalid indexes, including invalid
+    toast indexes, can be dropped using <literal>DROP INDEX</>.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</> cannot.
+   </para>
+
+   <para>
+    Invalid indexes of toast relations can be dropped if a failure occurred
+    during <command>REINDEX CONCURRENTLY</>. Valid toast indexes cannot be
+    dropped, as each toast relation has exactly one.
+   </para>
+
+   <para>
+    <command>REINDEX</command> takes an <literal>ACCESS EXCLUSIVE</literal> lock
+    on all the relations involved during the operation. When
+    <command>CONCURRENTLY</command> is specified, the operation is done with
+    <literal>SHARE UPDATE EXCLUSIVE</literal>.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command> since system catalogs cannot be reindexed
+    concurrently.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -262,7 +441,18 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
-</programlisting></para>
+</programlisting>
+  </para>
+
+  <para>
+   Rebuild all the indexes on a table while allowing read and write operations
+   on the involved relations:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 9e37ca73a8..8e2e253f24 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -244,6 +244,18 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	dstAtt->attidentity = '\0';
 }
 
+/*
+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+	int i;
+
+	for (i = 0; i < tupdesc->natts; i++)
+		tupdesc->attrs[i].attcacheoff = -1;
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index c7b2f031f0..3f4661d644 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -680,6 +680,7 @@ UpdateIndexRelation(Oid indexoid,
  * classObjectId: array of index opclass OIDs, one per index column
  * coloptions: array of per-index-column indoption settings
  * reloptions: AM-specific options
+ * tupdesc: Tuple descriptor used for the index if defined
  * isprimary: index is a PRIMARY KEY
  * isconstraint: index is owned by PRIMARY KEY, UNIQUE, or EXCLUSION constraint
  * deferrable: constraint is DEFERRABLE
@@ -693,6 +694,10 @@ UpdateIndexRelation(Oid indexoid,
  * is_internal: if true, post creation hook for new index
  * if_not_exists: if true, do not throw an error if a relation with
  *		the same name already exists.
+ * is_reindex: if true, create the index as a duplicate of an existing index,
+ *		as done during a concurrent reindex. This index can also be a toast
+ *		index. The caller is expected to already hold sufficient locks on
+ *		the related relations when this is used in a concurrent operation.
  *
  * Returns the OID of the created index.
  */
@@ -709,6 +714,7 @@ index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -717,7 +723,8 @@ index_create(Relation heapRelation,
 			 bool skip_build,
 			 bool concurrent,
 			 bool is_internal,
-			 bool if_not_exists)
+			 bool if_not_exists,
+			 bool is_reindex)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
 	Relation	pg_class;
@@ -763,16 +770,19 @@ index_create(Relation heapRelation,
 	 * release locks before committing in catalogs
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemNamespace(get_rel_namespace(heapRelationId)))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
 
 	/*
-	 * This case is currently not supported, but there's no way to ask for it
-	 * in the grammar anyway, so it can't happen.
+	 * This case is currently only supported during a concurrent index
+	 * rebuild; there is no way to ask for it in the grammar otherwise
+	 * anyway. If support for exclusion constraints is added in the future,
+	 * the similar check in check_exclusion_constraint should be changed
+	 * accordingly as well.
 	 */
-	if (concurrent && is_exclusion)
+	if (concurrent && is_exclusion && !is_reindex)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -811,14 +821,21 @@ index_create(Relation heapRelation,
 	}
 
 	/*
-	 * construct tuple descriptor for index tuples
+	 * construct tuple descriptor for index tuples if nothing is passed by
+	 * caller.
 	 */
-	indexTupDesc = ConstructTupleDescriptor(heapRelation,
-											indexInfo,
-											indexColNames,
-											accessMethodObjectId,
-											collationObjectId,
-											classObjectId);
+	if (tupdesc == NULL)
+		indexTupDesc = ConstructTupleDescriptor(heapRelation,
+												indexInfo,
+												indexColNames,
+												accessMethodObjectId,
+												collationObjectId,
+												classObjectId);
+	else
+	{
+		Assert(indexColNames == NIL);
+		indexTupDesc = tupdesc;
+	}
 
 	/*
 	 * Allocate an OID for the index, unless we were told what to use.
@@ -1125,6 +1142,459 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+/*
+ * index_concurrent_create_copy
+ *
+ * Create a concurrent index based on the definition of the one provided by
+ * caller that will be used for concurrent operations. The index is inserted
+ * into catalogs and needs to be built later on. This is called during
+ * concurrent reindex processing. The heap relation on which the index is based
+ * needs to be closed by the caller.
+ */
+Oid
+index_concurrent_create_copy(Relation heapRelation, Oid indOid, const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			concurrentOid = InvalidOid;
+	HeapTuple	indexTuple,
+				classTuple;
+	Datum		indclassDatum,
+				colOptionDatum,
+				optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+
+	indexRelation = index_open(indOid, RowExclusiveLock);
+
+	/* Concurrent index uses the same index information as former index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/* Do not copy exclusion constraint */
+	indexInfo->ii_ExclusionOps = NULL;
+	indexInfo->ii_ExclusionProcs = NULL;
+	indexInfo->ii_ExclusionStrats = NULL;
+
+	/*
+	 * Create a copy of the tuple descriptor to be used for the concurrent
+	 * entry and reset any cache counters on it to have a fresh version.
+	 */
+	indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
+	ResetTupleDescCache(indexTupDesc);
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indOid);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, indOid);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", indOid);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the concurrent index */
+	concurrentOid = index_create(heapRelation,
+								 newName,
+								 InvalidOid,
+								 InvalidOid,
+								 indexInfo,
+								 NIL,
+								 indexRelation->rd_rel->relam,
+								 indexRelation->rd_rel->reltablespace,
+								 indexRelation->rd_indcollation,
+								 indclass->values,
+								 indcoloptions->values,
+								 optionDatum,
+								 indexTupDesc,
+								 false, /* do not copy primary flag */
+								 false, /* is constraint? */
+								 false, /* is deferrable? */
+								 false, /* is initially deferred? */
+								 true,	/* allow table to be a system catalog? */
+								 true,	/* skip build? */
+								 true,	/* concurrent? */
+								 false, /* is_internal? */
+								 false, /* if_not_exists? */
+								 true); /* reindex? */
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return concurrentOid;
+}
+
+/*
+ * index_concurrent_build
+ *
+ * Build an index for a concurrent operation. Low-level locks are taken when
+ * this operation is performed so as to prevent schema changes only, but they
+ * need to be kept until the end of the transaction performing this operation.
+ */
+void
+index_concurrent_build(Oid heapOid,
+					   Oid indexOid,
+					   bool isprimary)
+{
+	Relation	heapRel,
+				indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in the
+	 * commit of the transaction where this concurrent index was created at
+	 * the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, isprimary, false);
+
+	/* Close both relations, and keep the locks */
+	heap_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts. Once we
+	 * commit this transaction, any new transactions that open the table must
+	 * insert new entries into the index for insertions and non-HOT updates.
+	 */
+	index_set_state_flags(indexOid, INDEX_CREATE_SET_READY);
+}
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap name, dependencies and constraints of the old index over to the new
+ * index, while marking the old index as invalid and the new as valid.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid, const char *oldName)
+{
+	Relation	pg_class,
+				pg_index,
+				pg_constraint,
+				pg_trigger;
+	Relation	oldClassRel,
+				newClassRel;
+	HeapTuple	oldClassTuple,
+				newClassTuple;
+	Form_pg_class oldClassForm,
+				newClassForm;
+	HeapTuple	oldIndexTuple,
+				newIndexTuple;
+	Form_pg_index oldIndexForm,
+				newIndexForm;
+	Oid			indexConstraintOid;
+	List	   *constraintOids = NIL;
+	ListCell   *lc;
+
+	/*
+	 * Take the necessary locks on the old and new indexes before swapping them.
+	 */
+	oldClassRel = relation_open(oldIndexOid, ShareUpdateExclusiveLock);
+	newClassRel = relation_open(newIndexOid, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+
+	oldClassForm = (Form_pg_class) GETSTRUCT(oldClassTuple);
+	newClassForm = (Form_pg_class) GETSTRUCT(newClassTuple);
+
+	/* Swap the names */
+	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
+	namestrcpy(&oldClassForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
+	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
+
+	heap_freetuple(oldClassTuple);
+	heap_freetuple(newClassTuple);
+
+	/* Now swap index info */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(oldIndexOid));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+	newIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(newIndexOid));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+
+	oldIndexForm = (Form_pg_index) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
+
+	/*
+	 * Copy constraint flags from the old index. This is safe because the old
+	 * index guaranteed uniqueness.
+	 */
+	newIndexForm->indisprimary = oldIndexForm->indisprimary;
+	oldIndexForm->indisprimary = false;
+	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+	oldIndexForm->indisexclusion = false;
+	newIndexForm->indimmediate = oldIndexForm->indimmediate;
+	oldIndexForm->indimmediate = true;
+
+	/* Mark the new index as valid and the old as invalid, as index_set_state_flags */
+	newIndexForm->indisvalid = true;
+	oldIndexForm->indisvalid = false;
+	oldIndexForm->indisclustered = false;
+
+	CatalogTupleUpdate(pg_index, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_index, &newIndexTuple->t_self, newIndexTuple);
+
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+
+	/*
+	 * Move constraints and triggers over to the new index
+	 */
+
+	constraintOids = get_index_ref_constraints(oldIndexOid);
+
+	indexConstraintOid = get_index_constraint(oldIndexOid);
+
+	if (OidIsValid(indexConstraintOid))
+		constraintOids = lappend_oid(constraintOids, indexConstraintOid);
+
+	pg_constraint = heap_open(ConstraintRelationId, RowExclusiveLock);
+	pg_trigger = heap_open(TriggerRelationId, RowExclusiveLock);
+
+	foreach(lc, constraintOids)
+	{
+		HeapTuple	constraintTuple,
+					triggerTuple;
+		Form_pg_constraint conForm;
+		ScanKeyData key[1];
+		SysScanDesc scan;
+		Oid			constraintOid = lfirst_oid(lc);
+
+		/* Move the constraint from the old to the new index */
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		conForm = ((Form_pg_constraint) GETSTRUCT(constraintTuple));
+
+		if (conForm->conindid == oldIndexOid)
+		{
+			conForm->conindid = newIndexOid;
+
+			CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+		}
+
+		heap_freetuple(constraintTuple);
+
+		/* Search for trigger records */
+		ScanKeyInit(&key[0],
+					Anum_pg_trigger_tgconstraint,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(constraintOid));
+
+		scan = systable_beginscan(pg_trigger, TriggerConstraintIndexId, true,
+								  NULL, 1, key);
+
+		while (HeapTupleIsValid((triggerTuple = systable_getnext(scan))))
+		{
+			Form_pg_trigger tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			if (tgForm->tgconstrindid != oldIndexOid)
+				continue;
+
+			/* Make a modifiable copy */
+			triggerTuple = heap_copytuple(triggerTuple);
+			tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			tgForm->tgconstrindid = newIndexOid;
+
+			CatalogTupleUpdate(pg_trigger, &triggerTuple->t_self, triggerTuple);
+
+			heap_freetuple(triggerTuple);
+		}
+
+		systable_endscan(scan);
+	}
+
+	/*
+	 * Move all dependencies on the old index to the new
+	 */
+
+	if (OidIsValid(indexConstraintOid))
+	{
+		ObjectAddress myself,
+					referenced;
+
+		/* Change to having the new index depend on the constraint */
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexOid;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = indexConstraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependencyForAll(RelationRelationId, oldIndexOid, newIndexOid);
+
+	/* Close relations */
+	heap_close(pg_class, RowExclusiveLock);
+	heap_close(pg_index, RowExclusiveLock);
+	heap_close(pg_constraint, RowExclusiveLock);
+	heap_close(pg_trigger, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldClassRel, NoLock);
+	relation_close(newClassRel, NoLock);
+}
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction calling this
+ * function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRelation,
+				indexRelation;
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're about
+	 * to stop doing inserts into the index which could show conflicts with
+	 * existing predicate locks, so now is the time to move them to the heap
+	 * relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just might
+	 * have it open for updating it.  So now we can unset indisready and
+	 * indislive, then wait till nobody could be using it at all anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit all
+	 * sessions will refresh the table's index list.  Forgetting just the
+	 * index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index concurrently as the last step of concurrent index
+ * processing. Deletion has to be done through performDeletion, or the
+ * dependencies of the index would not get dropped. At this point the index
+ * is already considered invalid and dead, so it can be dropped without using
+ * any concurrent option, as it is certain that it will not interact with
+ * other server sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+	Oid			constraintOid = get_index_constraint(indexOid);
+	ObjectAddress object;
+	Form_pg_index indexForm;
+	Relation	pg_index;
+	HeapTuple	indexTuple;
+
+	/*
+	 * Check that the index to be dropped is not alive; if it were, it might
+	 * still be in use by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexOid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexOid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process. Register
+	 * constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexOid;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object, DROP_RESTRICT, 0);
+}
+
 /*
  * index_constraint_create
  *
@@ -1491,36 +1961,8 @@ index_drop(Oid indexId, bool concurrent)
 		 */
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrent_set_dead(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index dd6ca3e8f7..981c849296 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -376,6 +376,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 	return count;
 }
 
+/*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependencyForAll(Oid refClassId, Oid oldRefObjectId,
+					   Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = heap_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't rename stuff out
+	 * of pg_catalog, which would likely be a bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+		errmsg("cannot remove dependency on %s because it is a system object",
+			   getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	heap_close(depRel, RowExclusiveLock);
+
+	return count;
+}
+
 /*
  * isObjectPinned()
  *
@@ -730,3 +818,58 @@ get_index_constraint(Oid indexId)
 
 	return constraintId;
 }
+
+/*
+ * get_index_ref_constraints
+ *		Given the OID of an index, return the OIDs of all foreign key
+ *		constraints that reference the index.
+ */
+ */
+List *
+get_index_ref_constraints(Oid indexId)
+{
+	List	   *result = NIL;
+	Relation	depRel;
+	ScanKeyData key[3];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	/* Search the dependency table for the index */
+	depRel = heap_open(DependRelationId, AccessShareLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexId));
+	ScanKeyInit(&key[2],
+				Anum_pg_depend_refobjsubid,
+				BTEqualStrategyNumber, F_INT4EQ,
+				Int32GetDatum(0));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 3, key);
+
+	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+	{
+		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+
+		/*
+		 * We assume any normal dependency from a constraint must be what we
+		 * are looking for.
+		 */
+		if (deprec->classid == ConstraintRelationId &&
+			deprec->objsubid == 0 &&
+			deprec->deptype == DEPENDENCY_NORMAL)
+		{
+			result = lappend_oid(result, deprec->objid);
+		}
+	}
+
+	systable_endscan(scan);
+	heap_close(depRel, AccessShareLock);
+
+	return result;
+}
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 6f517bbcda..8476fff4e3 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -332,9 +332,9 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 list_make2("chunk_id", "chunk_seq"),
 				 BTREE_AM_OID,
 				 rel->rd_rel->reltablespace,
-				 collationObjectId, classObjectId, coloptions, (Datum) 0,
+				 collationObjectId, classObjectId, coloptions, (Datum) 0, NULL,
 				 true, false, false, false,
-				 true, false, false, true, false);
+				 true, false, false, true, false, false);
 
 	heap_close(toast_rel, NoLock);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 3f615b6260..8230942218 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -51,6 +51,7 @@
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/pg_rusage.h"
 #include "utils/regproc.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -77,6 +78,7 @@ static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 
 /*
  * CheckIndexCompatible
@@ -283,6 +285,90 @@ CheckIndexCompatible(Oid oldId,
 	return ret;
 }
 
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given xmin
+ * limit, because it might not contain tuples deleted just before it has
+ * been taken. Obtain a list of VXIDs of such transactions, and wait for them
+ * individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int			i,
+				n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue;			/* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int			n_newer_snapshots;
+			int			j;
+			int			k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue;	/* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
 /*
  * DefineIndex
  *		Creates a new index.
@@ -324,7 +410,6 @@ DefineIndex(Oid relationId,
 	Oid			tablespaceId;
 	List	   *indexColNames;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -335,9 +420,7 @@ DefineIndex(Oid relationId,
 	IndexInfo  *indexInfo;
 	int			numberOfAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -669,12 +752,12 @@ DefineIndex(Oid relationId,
 					 indexInfo, indexColNames,
 					 accessMethodId, tablespaceId,
 					 collationObjectId, classObjectId,
-					 coloptions, reloptions, stmt->primary,
+					 coloptions, reloptions, NULL, stmt->primary,
 					 stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
 					 allowSystemTableMods,
 					 skip_build || stmt->concurrent,
 					 stmt->concurrent, !check_rights,
-					 stmt->if_not_exists);
+					 stmt->if_not_exists, false);
 
 	ObjectAddressSet(address, RelationRelationId, indexRelationId);
 
@@ -764,34 +847,15 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_concurrent_build(RangeVarGetRelid(stmt->relation,
+											ShareUpdateExclusiveLock,
+											false),
+						   indexRelationId,
+						   stmt->primary);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -849,74 +913,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots) /* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -1746,7 +1745,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 Oid
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -1758,7 +1757,8 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
 									  false, false,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
@@ -1771,7 +1771,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 
 	return indOid;
 }
@@ -1840,18 +1843,26 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   false, false,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -1869,7 +1880,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -1881,6 +1892,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	List	   *relids = NIL;
 	ListCell   *l;
 	int			num_keys;
+	bool		concurrent_warning = false;
 
 	AssertArg(objectName);
 	Assert(objectKind == REINDEX_OBJECT_SCHEMA ||
@@ -1971,6 +1983,17 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 			!IsSystemClass(relid, classtuple))
 			continue;
 
+		/* A system catalog cannot be reindexed concurrently */
+		if (concurrent && IsSystemNamespace(get_rel_namespace(relid)))
+		{
+			if (!concurrent_warning)
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("concurrent reindex is not supported for catalog relations, skipping all")));
+			concurrent_warning = true;
+			continue;
+		}
+
 		/* Save the list of relation OIDs in private context */
 		old = MemoryContextSwitchTo(private_context);
 
@@ -1997,20 +2020,29 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
-
-			if (options & REINDEXOPT_VERBOSE)
-				ereport(INFO,
-						(errmsg("table \"%s.%s\" was reindexed",
-								get_namespace_name(get_rel_namespace(relid)),
-								get_rel_name(relid))));
+
+		if (concurrent)
+		{
+			result = ReindexRelationConcurrently(relid, options);
+
+			PushActiveSnapshot(GetTransactionSnapshot());
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+		if (result && (options & REINDEXOPT_VERBOSE))
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							get_namespace_name(get_rel_namespace(relid)),
+							get_rel_name(relid))));
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 	}
@@ -2018,3 +2050,595 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 
 	MemoryContextDelete(private_context);
 }
+
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for the given relation OID. The relation can
+ * be either an index or a table. If a table is specified, each phase is
+ * processed one by one for all of the table's indexes, as well as the
+ * indexes of its toast relation if it has one.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *parentRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *newIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc,
+			   *lc2;
+	MemoryContext private_context;
+	MemoryContext old;
+	char	   *relationName = NULL;
+	PGRUsage	ru0;
+
+	/*
+	 * Create a memory context that will survive forced transaction commits we
+	 * do below.  Since it is a child of PortalContext, it will go away
+	 * eventually even if we suffer an error; there's no need for special
+	 * abort cleanup logic.
+	 */
+	private_context = AllocSetContextCreate(PortalContext,
+											"ReindexConcurrent",
+											ALLOCSET_SMALL_SIZES);
+
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		/* Save data needed by REINDEX VERBOSE in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		relationName = get_rel_name(relationOid);
+
+		pg_rusage_init(&ru0);
+
+		MemoryContextSwitchTo(old);
+	}
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation OID given by the caller. If the relkind of the given relation
+	 * OID is a table, all its valid indexes will be rebuilt, including its
+	 * associated toast table indexes. If the relkind is an index, this index
+	 * itself will be rebuilt. The locks taken on the parent relations and
+	 * involved indexes are kept until this transaction is committed, to
+	 * protect against schema changes that might occur before the session
+	 * locks are taken on each relation; the session locks similarly protect
+	 * against any schema change that could happen within the multiple
+	 * transactions used during this process.
+	 */
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes including
+				 * toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				MemoryContextSwitchTo(old);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+														   ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+					{
+						/* Save the list of relation OIDs in private context */
+						old = MemoryContextSwitchTo(private_context);
+
+						indexIds = lappend_oid(indexIds, cellOid);
+
+						MemoryContextSwitchTo(old);
+					}
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+														  ShareUpdateExclusiveLock);
+
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					MemoryContextSwitchTo(old);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+															   ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+						{
+							/*
+							 * Save the list of relation OIDs in private
+							 * context
+							 */
+							old = MemoryContextSwitchTo(private_context);
+
+							indexIds = lappend_oid(indexIds, cellOid);
+
+							MemoryContextSwitchTo(old);
+						}
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its OID to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			parentOid = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(parentOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(parentOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(parentOid);
+
+				MemoryContextSwitchTo(old);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					indexIds = lappend_oid(indexIds, relationOid);
+
+					MemoryContextSwitchTo(old);
+				}
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We need first to create an index which is based on the same data as the
+	 * former index except that it will be only registered in catalogs and
+	 * will be built later. It is possible to perform all the operations on
+	 * all the indexes at the same time for a parent relation including
+	 * indexes for its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId	lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent relation; it may be a plain or toast table */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a temporary relation name for the new index */
+		concurrentName = ChooseRelationName(get_rel_name(indOid),
+											NULL,
+											"ccnew",
+											get_rel_namespace(indexRel->rd_index->indrelid));
+
+		/* Create new index definition based on given index */
+		concurrentOid = index_concurrent_create_copy(indexParentRel,
+													 indOid,
+													 concurrentName);
+
+		/* Now open the new index's relation; a lock on it is also needed */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the list of oids and locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Save the new index Oid */
+		newIndexIds = lappend_oid(newIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelid of the old and new indexes to protect them from
+		 * being dropped, then close the relations. The lockrelid of the
+		 * parent relation is not taken here to avoid multiple locks taken on
+		 * the same relation; instead we rely on parentRelationIds built
+		 * earlier. The list stores palloc'd copies: appending the address of
+		 * the loop-local variable would leave every cell pointing at the
+		 * same stack slot.
+		 */
+		lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+		lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		MemoryContextSwitchTo(old);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the visibility checks that follow; other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId  *lockrelid;
+		LOCKTAG    *heaplocktag;
+
+		/* Save the list of locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/*
+		 * Add the lockrelid of the parent relation to the list of locked
+		 * relations. The entry must be palloc'd: keeping the address of a
+		 * loop-local variable would leave the list pointing at dead stack
+		 * memory once the loop iteration ends.
+		 */
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid->dbId, lockrelid->relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		MemoryContextSwitchTo(old);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so as no other transactions will try
+	 * is marked as not ready and invalid so that no other transaction will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on the relation, the
+	 * concurrent index and its copy to ensure that none of them are dropped
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the new indexes in a separate transaction for each index to avoid
+	 * having open transactions for an unnecessarily long time. A concurrent
+	 * build is done for each index that will replace the old indexes. Before
+	 * doing that, we need to wait on the parent relations until no running
+	 * transactions could have the parent table of index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		bool		primary;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index's concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * The index relation has been closed by the previous commit, so
+		 * reopen it to determine whether it is used as a primary key and to
+		 * get its parent relation.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		primary = indexRel->rd_index->indisprimary;
+
+		/* Perform concurrent build of new index */
+		index_concurrent_build(indexRel->rd_index->indrelid,
+							   concurrentOid,
+							   primary);
+
+		/*
+		 * Close the index relation only now; its rd_index field cannot be
+		 * dereferenced once the relation is closed. The lock itself is kept
+		 * until end of transaction.
+		 */
+		index_close(indexRel, NoLock);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update of the
+		 * concurrent index visible to other transactions.
+		 */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the new indexes catch up with any new tuples that
+	 * were created during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Scan the heap for each new index, then insert any missing index
+	 * entries.
+	 */
+	foreach(lc, newIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		TransactionId limitXmin;
+		Snapshot	snapshot;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for the new index
+		 * validation.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate index, which might be a toast */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * We can now do away with our active snapshot, we still need to save
+		 * the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * This new index is now valid as it contains all the necessary
+		 * tuples. However, it might not have accounted for tuples deleted
+		 * just before the reference snapshot was taken, so we need to wait
+		 * for the transactions that might have older snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction now that the new index is valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the new indexes have been validated, it is necessary to swap
+	 * each new index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes as invalid at the
+	 * same time to make sure we only get constraint violations from the
+	 * indexes with the correct names.
+	 */
+
+	StartTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		char	   *oldName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseRelationName(get_rel_name(indOid),
+									 NULL,
+									 "ccold",
+									 get_rel_namespace(relOid));
+
+		/* Swap old index with the new one */
+		index_concurrent_swap(concurrentOid, indOid, oldName);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+	}
+
+	/* Commit this transaction and make index swaps visible */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * Mark the old indexes as dead so they can later be dropped.
+	 *
+	 * Note that it is necessary to wait for virtual locks on the parent
+	 * relation before setting the index as dead.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Finish the index invalidation and set it as dead. */
+		index_concurrent_set_dead(relOid, indOid);
+	}
+
+	/* Commit this transaction to make the updates visible. */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the old indexes, using effectively the same code path as DROP
+	 * INDEX CONCURRENTLY. This is safe because all the old entries are
+	 * already marked as invalid and not ready, so they will not be used by
+	 * other backends for any read or write operations.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	/* Get fresh snapshot for next step */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+
+		CHECK_FOR_INTERRUPTS();
+
+		index_concurrent_drop(indOid);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Finally, release the session-level locks taken on the relations.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Log what we did */
+	if (options & REINDEXOPT_VERBOSE)
+		ereport(INFO,
+				(errmsg("relation \"%s\" was reindexed",
+						relationName),
+				 errdetail("%s.",
+						   pg_rusage_show(&ru0))));
+
+	/* Start a new transaction to finish process properly */
+	StartTransactionCommand();
+
+	MemoryContextDelete(private_context);
+
+	return true;
+}
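
For anyone reviewing the phases above: the function automates what currently
has to be done by hand to rebuild an index without blocking writes. A rough
SQL equivalent of the manual workaround, with illustrative object names,
would be:

    -- Manual equivalent of REINDEX INDEX CONCURRENTLY for a plain index
    -- (names are illustrative; this sketch does not cover constraint cases)
    CREATE INDEX CONCURRENTLY concur_ind_ccnew ON concur_tab (c1);
    DROP INDEX CONCURRENTLY concur_ind;
    ALTER INDEX concur_ind_ccnew RENAME TO concur_ind;

The point of the swap in phase 4, together with changeDependencyForAll() and
get_index_ref_constraints() below, is to move constraint and foreign-key
dependencies onto the rebuilt index, which the manual recipe cannot do for
constraint-backed indexes.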
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3ab808715b..dd831f56f9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1136,6 +1136,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	bool		is_partition;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1195,7 +1196,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check the case of a system index that might have been invalidated by a
+	 * failed concurrent process and allow its drop. For the time being, this
+	 * only concerns indexes of toast relations that became invalid during a
+	 * REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index c1a83ca909..b527b6f55f 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4271,6 +4271,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 7a700018e7..64d39fde5c 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2092,6 +2092,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 4c83a63f7d..10c6bcf189 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -8057,42 +8057,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
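
With the grammar changes above, CONCURRENTLY is accepted after the target
keyword in all four REINDEX forms. For quick manual testing (table, index
and schema names are illustrative; the parenthesized option list assumes
VERBOSE as on HEAD):

    REINDEX INDEX CONCURRENTLY concur_ind;
    REINDEX TABLE CONCURRENTLY concur_tab;
    REINDEX (VERBOSE) TABLE CONCURRENTLY concur_tab;
    REINDEX SCHEMA CONCURRENTLY public;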
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 82a707af7b..533b986e27 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -779,16 +779,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventTransactionChain(isTopLevel,
+											"REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -804,7 +808,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												(stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												(stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												"REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 9b59ee840b..6b9ee9972e 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -2147,6 +2147,23 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY is not allowed inside
+			 * a transaction block.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
+
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index a09c49d6cf..7077a49d3b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3142,12 +3142,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches1("REINDEX"))
 		COMPLETE_WITH_LIST5("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches2("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches2("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches3("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+	else if (Matches3("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches3("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches3("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 2be5af1d3e..0e65570635 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -107,6 +107,8 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 				   TupleDesc src, AttrNumber srcAttno);
 
+extern void ResetTupleDescCache(TupleDesc tupdesc);
+
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index b9f98423cc..7d8cea6e53 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -236,6 +236,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependencyForAll(Oid refClassId, Oid oldRefObjectId,
+								   Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, char deptype, Oid *tableId, int32 *colId);
@@ -246,6 +249,8 @@ extern Oid	get_constraint_index(Oid constraintId);
 
 extern Oid	get_index_constraint(Oid indexId);
 
+extern List *get_index_ref_constraints(Oid indexId);
+
 /* in pg_shdepend.c */
 
 extern void recordSharedDependencyOn(ObjectAddress *depender,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 1d4ec09f8f..7d66248363 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -54,6 +54,7 @@ extern Oid index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bool isprimary,
 			 bool isconstraint,
 			 bool deferrable,
@@ -62,7 +63,25 @@ extern Oid index_create(Relation heapRelation,
 			 bool skip_build,
 			 bool concurrent,
 			 bool is_internal,
-			 bool if_not_exists);
+			 bool if_not_exists,
+			 bool is_reindex);
+
+extern Oid index_concurrent_create_copy(Relation heapRelation,
+										Oid indOid,
+										const char *newName);
+
+extern void index_concurrent_build(Oid heapOid,
+								   Oid indexOid,
+								   bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid,
+								  Oid oldIndexOid,
+								  const char *oldName);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+									  Oid indexOid);
+
+extern void index_concurrent_drop(Oid indexOid);
 
 extern ObjectAddress index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index f7bb4a54f7..755cccd7e4 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -30,10 +30,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_not_in_use,
 			bool skip_build,
 			bool quiet);
-extern Oid	ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern Oid	ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 732e5d6788..1504eb83fc 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3246,6 +3246,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 7dad3c2316..21d4ca62a7 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -41,6 +41,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 031a0bcec9..0f46b94a36 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3067,3 +3067,73 @@ DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
 NOTICE:  drop cascades to 6 other objects
+RESET client_min_messages;
+RESET search_path;
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+ERROR:  conflicting key value violates exclusion constraint "concur_reindex_tab_c3_excl"
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  can only reindex the currently open database
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+WARNING:  concurrent reindex is not supported for catalog relations, skipping all
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+          Table "public.concur_reindex_tab"
+ Column |   Type    | Collation | Nullable | Default 
+--------+-----------+-----------+----------+---------
+ c1     | integer   |           | not null | 
+ c2     | text      |           |          | 
+ c3     | int4range |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+    "concur_reindex_tab_c3_excl" EXCLUDE USING gist (c3 WITH &&)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index a45e8ebeff..f267e30651 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1085,3 +1085,54 @@ RESET ROLE;
 DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
+RESET client_min_messages;
+RESET search_path;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
#79Michael Paquier
michael.paquier@gmail.com
In reply to: Andreas Karlsson (#78)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Wed, Nov 1, 2017 at 1:20 PM, Andreas Karlsson <andreas@proxel.se> wrote:

Here is a rebased version of the patch.

The patch does not apply anymore and needs a rebase. I am moving it to the
next CF with "waiting on author" as status.
--
Michael

#80Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Andreas Karlsson (#78)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Andreas Karlsson wrote:

Here is a rebased version of the patch.

Is anybody working on rebasing this patch?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#81Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Michael Paquier (#72)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Michael Paquier wrote:

Well, the idea is really to get rid of that as there are already
facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
TABLE when rewriting a relation. It is not really attractive to have a
3rd method in the backend code to do the same kind of things, for a
method that is even harder to maintain than the other two.

I dislike the backend code that uses SPI and manufactures nodes to
re-create indexes. IMO we should get rid of it. Let's not call it
"facilities", but rather "grotty hacks".

I think before suggesting to add even more code to perpetuate that idea,
we should think about going in the other direction. I have not tried to
write the code, but it should be possible to have an intermediate
function called by ProcessUtility* which transforms the IndexStmt into
an internal representation, then calls DefineIndex. This way, all this
code that wants to create indexes for backend-internal reasons can
create the internal representation directly then call DefineIndex,
instead of the horrible hacks they use today creating parse nodes by
hand.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#82Michael Paquier
michael.paquier@gmail.com
In reply to: Alvaro Herrera (#81)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Thu, Dec 21, 2017 at 11:46 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:

Michael Paquier wrote:

Well, the idea is really to get rid of that as there are already
facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
TABLE when rewriting a relation. It is not really attractive to have a
3rd method in the backend code to do the same kind of things, for a
method that is even harder to maintain than the other two.

I dislike the backend code that uses SPI and manufactures nodes to
re-create indexes. IMO we should get rid of it. Let's not call it
"facilities", but rather "grotty hacks".

Aha. You are making my day here ;)

I think before suggesting to add even more code to perpetuate that idea,
we should think about going in the other direction. I have not tried to
write the code, but it should be possible to have an intermediate
function called by ProcessUtility* which transforms the IndexStmt into
an internal representation, then calls DefineIndex. This way, all this
code that wants to create indexes for backend-internal reasons can
create the internal representation directly then call DefineIndex,
instead of the horrible hacks they use today creating parse nodes by
hand.

Yeah, that would be likely possible. I am not volunteering for that in
the short term though..
--
Michael

#83Craig Ringer
craig@2ndquadrant.com
In reply to: Michael Paquier (#82)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 21 December 2017 at 11:31, Michael Paquier <michael.paquier@gmail.com>
wrote:

On Thu, Dec 21, 2017 at 11:46 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:

Michael Paquier wrote:

Well, the idea is really to get rid of that as there are already
facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
TABLE when rewriting a relation. It is not really attractive to have a
3rd method in the backend code to do the same kind of things, for a
method that is even harder to maintain than the other two.

I dislike the backend code that uses SPI and manufactures nodes to
re-create indexes. IMO we should get rid of it. Let's not call it
"facilities", but rather "grotty hacks".

Aha. You are making my day here ;)

I think before suggesting to add even more code to perpetuate that idea,
we should think about going in the other direction. I have not tried to
write the code, but it should be possible to have an intermediate
function called by ProcessUtility* which transforms the IndexStmt into
an internal representation, then calls DefineIndex. This way, all this
code that wants to create indexes for backend-internal reasons can
create the internal representation directly then call DefineIndex,
instead of the horrible hacks they use today creating parse nodes by
hand.

Yeah, that would be likely possible. I am not volunteering for that in
the short term though..

It sounds like that'd make some of ALTER TABLE a bit less ... upsetting ...
too.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#84Stephen Frost
sfrost@snowman.net
In reply to: Craig Ringer (#83)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Craig, Michael, all,

* Craig Ringer (craig@2ndquadrant.com) wrote:

On 21 December 2017 at 11:31, Michael Paquier <michael.paquier@gmail.com>
wrote:

On Thu, Dec 21, 2017 at 11:46 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:

Michael Paquier wrote:

Well, the idea is really to get rid of that as there are already
facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
TABLE when rewriting a relation. It is not really attractive to have a
3rd method in the backend code to do the same kind of things, for a
method that is even harder to maintain than the other two.

I dislike the backend code that uses SPI and manufactures nodes to
re-create indexes. IMO we should get rid of it. Let's not call it
"facilities", but rather "grotty hacks".

Aha. You are making my day here ;)

I think before suggesting to add even more code to perpetuate that idea,
we should think about going in the other direction. I have not tried to
write the code, but it should be possible to have an intermediate
function called by ProcessUtility* which transforms the IndexStmt into
an internal representation, then calls DefineIndex. This way, all this
code that wants to create indexes for backend-internal reasons can
create the internal representation directly then call DefineIndex,
instead of the horrible hacks they use today creating parse nodes by
hand.

Yeah, that would be likely possible. I am not volunteering for that in
the short term though..

It sounds like that'd make some of ALTER TABLE a bit less ... upsetting ...
too.

I'm a big fan of this patch but it doesn't appear to have made any
progress in quite a while. Is there any chance we can get an updated
patch and perhaps get another review before the end of this CF...?

Refactoring this to have an internal representation between
ProcessUtility() and DefineIndex doesn't sound too terrible and if it
means the ability to reuse that, seems like it'd be awful nice to do
so..

Thanks!

Stephen

#85Andreas Karlsson
andreas@proxel.se
In reply to: Stephen Frost (#84)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 01/26/2018 03:28 AM, Stephen Frost wrote:

I'm a big fan of this patch but it doesn't appear to have made any
progress in quite a while. Is there any chance we can get an updated
patch and perhaps get another review before the end of this CF...?

Sorry, as you may have guessed I do not have the time right now to get
this updated during this commitfest.

Refactoring this to have an internal representation between
ProcessUtility() and DefineIndex doesn't sound too terrible and if it
means the ability to reuse that, seems like it'd be awful nice to do
so..

I too like the concept, but have not had the time to look into it.

Andreas

#86Michael Paquier
michael.paquier@gmail.com
In reply to: Andreas Karlsson (#85)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Wed, Jan 31, 2018 at 01:48:00AM +0100, Andreas Karlsson wrote:

I too like the concept, but have not had the time to look into it.

This may happen at some point, for now I am marking the patch as
returned with feedback.
--
Michael

#87Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#86)
1 attachment(s)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Here is a revival of this patch. This is Andreas Karlsson's v4 patch
(2017-11-01) with some updates for conflicts and changed APIs.

AFAICT from the discussions, there were no more conceptual concerns with
this approach. Recall that with this patch REINDEX CONCURRENTLY creates
a new index (with a new OID) and then switch the names and dependencies.

I have done a review of this patch and it looks pretty solid to me.
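For anyone trying out the behavior, the swap is easy to observe from psql. This is just an illustrative sketch (not part of the patch), reusing an index name from the regression tests:

```sql
-- The index keeps its name, but REINDEX CONCURRENTLY builds a fresh index
-- and swaps it in, so the relation's OID changes.
SELECT 'concur_reindex_ind1'::regclass::oid AS oid_before \gset
REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
SELECT 'concur_reindex_ind1'::regclass::oid <> :oid_before AS oid_changed;
-- oid_changed is expected to come back true
```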

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v5-0001-REINDEX-CONCURRENTLY.patchtext/plain; charset=UTF-8; name=v5-0001-REINDEX-CONCURRENTLY.patch; x-mac-creator=0; x-mac-type=0Download
From 250f023a16e122153c90f2b378864cf118cb50fe Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter_e@gmx.net>
Date: Fri, 7 Dec 2018 16:19:16 +0100
Subject: [PATCH v5] REINDEX CONCURRENTLY

---
 doc/src/sgml/mvcc.sgml                        |   1 +
 doc/src/sgml/ref/reindex.sgml                 | 184 +++-
 src/backend/catalog/index.c                   | 501 +++++++++-
 src/backend/catalog/pg_depend.c               | 143 +++
 src/backend/catalog/toasting.c                |   2 +-
 src/backend/commands/indexcmds.c              | 858 +++++++++++++++---
 src/backend/commands/tablecmds.c              |  32 +-
 src/backend/nodes/copyfuncs.c                 |   1 +
 src/backend/nodes/equalfuncs.c                |   1 +
 src/backend/parser/gram.y                     |  22 +-
 src/backend/tcop/utility.c                    |  10 +-
 src/bin/psql/common.c                         |  16 +
 src/bin/psql/tab-complete.c                   |  18 +-
 src/include/catalog/dependency.h              |   5 +
 src/include/catalog/index.h                   |  18 +
 src/include/commands/defrem.h                 |   6 +-
 src/include/nodes/parsenodes.h                |   1 +
 .../expected/reindex-concurrently.out         |  78 ++
 src/test/isolation/isolation_schedule         |   1 +
 .../isolation/specs/reindex-concurrently.spec |  40 +
 src/test/regress/expected/create_index.out    |  70 ++
 src/test/regress/sql/create_index.sql         |  51 ++
 22 files changed, 1880 insertions(+), 179 deletions(-)
 create mode 100644 src/test/isolation/expected/reindex-concurrently.out
 create mode 100644 src/test/isolation/specs/reindex-concurrently.spec

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index bedd9a008d..9b7ef8bf09 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,6 +926,7 @@ <title>Table-level Lock Modes</title>
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
          <command>ANALYZE</command>, <command>CREATE INDEX CONCURRENTLY</command>,
+         <command>REINDEX CONCURRENTLY</command>,
          <command>CREATE STATISTICS</command>, and certain <command>ALTER
          INDEX</command> and <command>ALTER TABLE</command> variants (for full
          details see <xref linkend="sql-alterindex"/> and <xref
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 47cef987d4..1e92c8fbfc 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="parameter">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="parameter">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -67,10 +67,7 @@ <title>Description</title>
      <para>
       An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
       an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.
      </para>
     </listitem>
 
@@ -151,6 +148,21 @@ <title>Parameters</title>
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</productname> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="sql-reindex-concurrently"
+      endterm="sql-reindex-concurrently-title"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
@@ -241,6 +253,160 @@ <title>Notes</title>
    Each individual partition can be reindexed separately instead.
   </para>
 
+  <refsect2 id="sql-reindex-concurrently">
+   <title id="sql-reindex-concurrently-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="sql-reindex-concurrently">
+    <primary>index</primary>
+    <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</productname> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</productname> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</literal> option of <command>REINDEX</command>. When this option
+    is used, <productname>PostgreSQL</productname> must perform two scans of the table
+    for each index that needs to be rebuilt, and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created, where all
+    the concurrent entries are created using only one transaction. Note that
+    if there are multiple indexes to be rebuilt then each step loops through
+    all the indexes we're rebuilding, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</command> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition will be used to replace the
+       old index. This step is done as a single transaction for all the indexes
+       involved in this process, meaning that if
+       <command>REINDEX CONCURRENTLY</command> is run on a table with multiple
+       indexes, all the catalog entries of the new indexes are created within a
+       single transaction. A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as their
+       parent tables to prevent any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_index.indisready</literal> is
+       switched to <quote>true</quote> to mark it as ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while
+       the first pass was running. This step is performed within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       All the constraints and foreign keys which refer to the index are swapped
+       to refer to the new index definition, and the names of the indexes are
+       changed. At this point <literal>pg_index.indisvalid</literal> is switched to
+       <quote>true</quote> for the new index and to <quote>false</quote> for the old, and
+       a cache invalidation is done so that all the sessions that referenced the
+       old index are invalidated. This step is done within a single transaction
+       for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Old indexes have <literal>pg_index.indisready</literal> switched to <quote>false</quote>
+       to prevent any new tuple insertions, after waiting for running queries that
+       might reference the old index to complete. This step is done within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The old index definition and its data are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</command>
+    command will fail but leave behind an <quote>invalid</quote> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</application> <command>\d</command> command will report
+    such an index as <literal>INVALID</literal>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid
+    index and try again to perform <command>REINDEX CONCURRENTLY</command>.
+    The concurrent index created during the processing has a name ending in
+    the suffix <literal>ccnew</literal>, or <literal>ccold</literal> if it is an
+    old index definition which we failed to drop. Invalid indexes can be
+    dropped using <literal>DROP INDEX</literal>, including invalid toast indexes.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</command> or <command>REINDEX INDEX</command>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</command> cannot.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command> since system catalogs cannot be reindexed
+    concurrently.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -272,6 +438,14 @@ <title>Examples</title>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
+</programlisting></para>
+
+  <para>
+   Rebuild the indexes of a table while allowing read and write operations
+   on the involved relations:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
 </programlisting></para>
  </refsect1>
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8709e8c22c..3ac85a9e0b 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -694,6 +694,7 @@ UpdateIndexRelation(Oid indexoid,
  * classObjectId: array of index opclass OIDs, one per index column
  * coloptions: array of per-index-column indoption settings
  * reloptions: AM-specific options
+ * tupdesc: Tuple descriptor used for the index if defined
  * flags: bitmask that can include any combination of these bits:
  *		INDEX_CREATE_IS_PRIMARY
  *			the index is a primary key
@@ -734,6 +735,7 @@ index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bits16 flags,
 			 bits16 constr_flags,
 			 bool allow_system_table_mods,
@@ -796,7 +798,7 @@ index_create(Relation heapRelation,
 	 * release locks before committing in catalogs
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsCatalogRelation(heapRelation))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
@@ -864,14 +866,20 @@ index_create(Relation heapRelation,
 	}
 
 	/*
-	 * construct tuple descriptor for index tuples
+	 * construct tuple descriptor for index tuples if not passed by caller
 	 */
-	indexTupDesc = ConstructTupleDescriptor(heapRelation,
-											indexInfo,
-											indexColNames,
-											accessMethodObjectId,
-											collationObjectId,
-											classObjectId);
+	if (!tupdesc)
+		indexTupDesc = ConstructTupleDescriptor(heapRelation,
+												indexInfo,
+												indexColNames,
+												accessMethodObjectId,
+												collationObjectId,
+												classObjectId);
+	else
+	{
+		Assert(indexColNames == NIL);
+		indexTupDesc = tupdesc;
+	}
 
 	/*
 	 * Allocate an OID for the index, unless we were told what to use.
@@ -1203,6 +1211,451 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+/*
+ * index_concurrently_create_copy
+ *
+ * Create concurrently an index based on the definition of the one provided by
+ * caller.  The index is inserted into catalogs and needs to be built later
+ * on.  This is called during concurrent reindex processing.
+ */
+Oid
+index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			newIndexId = InvalidOid;
+	HeapTuple	indexTuple,
+				classTuple;
+	Datum		indclassDatum,
+				colOptionDatum,
+				optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+
+	indexRelation = index_open(oldIndexId, RowExclusiveLock);
+
+	/* New index uses the same index information as old index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/* Do not copy exclusion constraint */
+	indexInfo->ii_ExclusionOps = NULL;
+	indexInfo->ii_ExclusionProcs = NULL;
+	indexInfo->ii_ExclusionStrats = NULL;
+
+	/* Create a copy of the tuple descriptor to be used for the new entry */
+	indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", oldIndexId);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, oldIndexId);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", oldIndexId);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the new index */
+	newIndexId = index_create(heapRelation,
+							  newName,
+							  InvalidOid,	/* indexRelationId */
+							  InvalidOid,	/* parentIndexRelid */
+							  InvalidOid,	/* parentConstraintId */
+							  InvalidOid,	/* relFileNode */
+							  indexInfo,
+							  NIL,
+							  indexRelation->rd_rel->relam,
+							  indexRelation->rd_rel->reltablespace,
+							  indexRelation->rd_indcollation,
+							  indclass->values,
+							  indcoloptions->values,
+							  optionDatum,
+							  indexTupDesc,
+							  INDEX_CREATE_SKIP_BUILD | INDEX_CREATE_CONCURRENT,
+							  0,
+							  true,	/* allow table to be a system catalog? */
+							  false, /* is_internal? */
+							  NULL);
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return newIndexId;
+}
+
+/*
+ * index_concurrently_build
+ *
+ * Build index for a concurrent operation.  Low-level locks are taken when
+ * this operation is performed, blocking only schema changes; they need to
+ * be kept until the end of the transaction performing this operation.
+ */
+void
+index_concurrently_build(Oid heapOid,
+						 Oid indexOid,
+						 bool isprimary)
+{
+	Relation	heapRel,
+				indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in the
+	 * commit of the transaction where this concurrent index was created at
+	 * the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, isprimary, false, true);
+
+	/* Close both relations, and keep the locks */
+	heap_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts.  Once
+	 * we commit this transaction, any new transactions that open the table
+	 * must insert new entries into the index for insertions and non-HOT
+	 * updates.
+	 */
+	index_set_state_flags(indexOid, INDEX_CREATE_SET_READY);
+}
+
+/*
+ * index_concurrently_swap
+ *
+ * Swap name, dependencies, and constraints of the old index over to the new
+ * index, while marking the old index as invalid and the new as valid.
+ */
+void
+index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
+{
+	Relation	pg_class,
+				pg_index,
+				pg_constraint,
+				pg_trigger;
+	Relation	oldClassRel,
+				newClassRel;
+	HeapTuple	oldClassTuple,
+				newClassTuple;
+	Form_pg_class oldClassForm,
+				newClassForm;
+	HeapTuple	oldIndexTuple,
+				newIndexTuple;
+	Form_pg_index oldIndexForm,
+				newIndexForm;
+	Oid			indexConstraintOid;
+	List	   *constraintOids = NIL;
+	ListCell   *lc;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldClassRel = relation_open(oldIndexId, ShareUpdateExclusiveLock);
+	newClassRel = relation_open(newIndexId, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldClassForm = (Form_pg_class) GETSTRUCT(oldClassTuple);
+	newClassForm = (Form_pg_class) GETSTRUCT(newClassTuple);
+
+	/* Swap the names */
+	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
+	namestrcpy(&oldClassForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
+	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
+
+	heap_freetuple(oldClassTuple);
+	heap_freetuple(newClassTuple);
+
+	/* Now swap index info */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldIndexForm = (Form_pg_index) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
+
+	/*
+	 * Copy constraint flags from the old index. This is safe because the
+	 * old index guaranteed uniqueness.
+	 */
+	newIndexForm->indisprimary = oldIndexForm->indisprimary;
+	oldIndexForm->indisprimary = false;
+	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+	oldIndexForm->indisexclusion = false;
+	newIndexForm->indimmediate = oldIndexForm->indimmediate;
+	oldIndexForm->indimmediate = true;
+
+	/*
+	 * Mark the new index as valid and the old one as invalid, in the same
+	 * way as index_set_state_flags() would.
+	 */
+	newIndexForm->indisvalid = true;
+	oldIndexForm->indisvalid = false;
+	oldIndexForm->indisclustered = false;
+
+	CatalogTupleUpdate(pg_index, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_index, &newIndexTuple->t_self, newIndexTuple);
+
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+
+	/*
+	 * Move constraints and triggers over to the new index
+	 */
+
+	constraintOids = get_index_ref_constraints(oldIndexId);
+
+	indexConstraintOid = get_index_constraint(oldIndexId);
+
+	if (OidIsValid(indexConstraintOid))
+		constraintOids = lappend_oid(constraintOids, indexConstraintOid);
+
+	pg_constraint = heap_open(ConstraintRelationId, RowExclusiveLock);
+	pg_trigger = heap_open(TriggerRelationId, RowExclusiveLock);
+
+	foreach(lc, constraintOids)
+	{
+		HeapTuple	constraintTuple,
+					triggerTuple;
+		Form_pg_constraint conForm;
+		ScanKeyData key[1];
+		SysScanDesc scan;
+		Oid			constraintOid = lfirst_oid(lc);
+
+		/* Move the constraint from the old to the new index */
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		conForm = ((Form_pg_constraint) GETSTRUCT(constraintTuple));
+
+		if (conForm->conindid == oldIndexId)
+		{
+			conForm->conindid = newIndexId;
+
+			CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+		}
+
+		heap_freetuple(constraintTuple);
+
+		/* Search for trigger records */
+		ScanKeyInit(&key[0],
+					Anum_pg_trigger_tgconstraint,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(constraintOid));
+
+		scan = systable_beginscan(pg_trigger, TriggerConstraintIndexId, true,
+								  NULL, 1, key);
+
+		while (HeapTupleIsValid((triggerTuple = systable_getnext(scan))))
+		{
+			Form_pg_trigger tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			if (tgForm->tgconstrindid != oldIndexId)
+				continue;
+
+			/* Make a modifiable copy */
+			triggerTuple = heap_copytuple(triggerTuple);
+			tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			tgForm->tgconstrindid = newIndexId;
+
+			CatalogTupleUpdate(pg_trigger, &triggerTuple->t_self, triggerTuple);
+
+			heap_freetuple(triggerTuple);
+		}
+
+		systable_endscan(scan);
+	}
+
+	/*
+	 * Move all dependencies on the old index to the new
+	 */
+
+	if (OidIsValid(indexConstraintOid))
+	{
+		ObjectAddress myself,
+					referenced;
+
+		/* Change to having the new index depend on the constraint */
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexId,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexId;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = indexConstraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
+
+	/* Close relations */
+	heap_close(pg_class, RowExclusiveLock);
+	heap_close(pg_index, RowExclusiveLock);
+	heap_close(pg_constraint, RowExclusiveLock);
+	heap_close(pg_trigger, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldClassRel, NoLock);
+	relation_close(newClassRel, NoLock);
+}
+
+/*
+ * index_concurrently_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all the backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction calling this
+ * function.
+ */
+void
+index_concurrently_set_dead(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRelation,
+				indexRelation;
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're about
+	 * to stop doing inserts into the index which could show conflicts with
+	 * existing predicate locks, so now is the time to move them to the heap
+	 * relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just might
+	 * have it open for updating it.  So now we can unset indisready and
+	 * indislive, then wait till nobody could be using it at all anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit all
+	 * sessions will refresh the table's index list.  Forgetting just the
+	 * index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrently_drop
+ *
+ * Drop a single index concurrently as the last step of a concurrent index
+ * operation. Deletion is done through performDeletion, as dependencies of
+ * the index would otherwise not get dropped. At this point all the indexes
+ * are already considered invalid and dead, so they can be dropped without
+ * using any concurrent options, as it is certain that they will not
+ * interact with other server sessions.
+ */
+void
+index_concurrently_drop(Oid indexId)
+{
+	Oid			constraintOid = get_index_constraint(indexId);
+	ObjectAddress object;
+	Form_pg_index indexForm;
+	Relation	pg_index;
+	HeapTuple	indexTuple;
+
+	/*
+	 * Check that the index being dropped here is not alive; if it were, it
+	 * might still be used by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexId);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexId);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process. Register
+	 * constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexId;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object, DROP_RESTRICT, 0);
+}
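Taken together, the helpers above move an index pair through the pg_index flag
transitions that the multi-transaction REINDEX CONCURRENTLY relies on: create
an invalid copy, build it and set it ready, swap validity to the new index,
mark the old one dead, then drop it. A toy model of those flag states
(illustrative Python only; the real flags live in pg_index rows and the
function names below are stand-ins, not PostgreSQL APIs):

```python
# Illustrative model of the pg_index flag transitions driven by
# index_concurrently_create/build/swap/set_dead/drop.

def create_copy():
    # catalog entry only, skipping the build: not yet ready or valid
    return {"indisready": False, "indisvalid": False, "indislive": True}

def build(idx):
    # after the concurrent build, INDEX_CREATE_SET_READY
    idx["indisready"] = True

def swap(new, old):
    # the new index becomes valid, the old one invalid
    new["indisvalid"] = True
    old["indisvalid"] = False

def set_dead(idx):
    # INDEX_DROP_SET_DEAD: no longer ready for inserts, not live
    idx["indisready"] = False
    idx["indislive"] = False

def drop(idx):
    # index_concurrently_drop refuses to drop a live index
    if idx["indislive"]:
        raise ValueError("cannot drop live index")
    return "dropped"
```

Walking the model through one rebuild shows why the final drop is safe: by the
time drop() runs, the old index is neither valid nor live, so no session can
still be using it.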
+
 /*
  * index_constraint_create
  *
@@ -1592,36 +2045,8 @@ index_drop(Oid indexId, bool concurrent)
 		 */
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrently_set_dead(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index 2ea05f350b..82fae513f3 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -376,6 +376,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 	return count;
 }
 
+/*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+					 Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = heap_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't rename stuff out
+	 * of pg_catalog, which would likely be a bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot remove dependency on %s because it is a system object",
+						getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	heap_close(depRel, RowExclusiveLock);
+
+	return count;
+}
+
 /*
  * isObjectPinned()
  *
@@ -735,3 +823,58 @@ get_index_constraint(Oid indexId)
 
 	return constraintId;
 }
+
+/*
+ * get_index_ref_constraints
+ *		Given the OID of an index, return the OID of all foreign key
+ *		constraints which reference the index.
+ */
+List *
+get_index_ref_constraints(Oid indexId)
+{
+	List	   *result = NIL;
+	Relation	depRel;
+	ScanKeyData key[3];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	/* Search the dependency table for the index */
+	depRel = heap_open(DependRelationId, AccessShareLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexId));
+	ScanKeyInit(&key[2],
+				Anum_pg_depend_refobjsubid,
+				BTEqualStrategyNumber, F_INT4EQ,
+				Int32GetDatum(0));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 3, key);
+
+	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+	{
+		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+
+		/*
+		 * We assume any normal dependency from a constraint must be what we
+		 * are looking for.
+		 */
+		if (deprec->classid == ConstraintRelationId &&
+			deprec->objsubid == 0 &&
+			deprec->deptype == DEPENDENCY_NORMAL)
+		{
+			result = lappend_oid(result, deprec->objid);
+		}
+	}
+
+	systable_endscan(scan);
+	heap_close(depRel, AccessShareLock);
+
+	return result;
+}
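The semantics of changeDependenciesOn — retarget every pg_depend record that
references the old object, except that when the new referent is pinned the
records are simply deleted, and error out if the old referent is pinned — can
be modeled on a plain list of record dicts (an illustrative sketch, not the
actual catalog layout or API):

```python
def change_dependencies_on(records, old_ref, new_ref, pinned):
    """Retarget dependency records from old_ref to new_ref.

    Records pointing at old_ref are updated in place; if new_ref is
    pinned, they are deleted instead, since a pinned object carries no
    dependency entries. Returns the number of records touched.
    """
    if old_ref in pinned:
        # mirrors the ERRCODE_FEATURE_NOT_SUPPORTED error in the patch
        raise ValueError("cannot remove dependency on a system object")
    count = 0
    for rec in list(records):       # copy: we may delete while scanning
        if rec["refobjid"] != old_ref:
            continue
        if new_ref in pinned:
            records.remove(rec)     # pinned target: delete the entry
        else:
            rec["refobjid"] = new_ref
        count += 1
    return count
```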
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 462969a838..4a967c3519 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -335,7 +335,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 list_make2("chunk_id", "chunk_seq"),
 				 BTREE_AM_OID,
 				 rel->rd_rel->reltablespace,
-				 collationObjectId, classObjectId, coloptions, (Datum) 0,
+				 collationObjectId, classObjectId, coloptions, (Datum) 0, NULL,
 				 INDEX_CREATE_IS_PRIMARY, 0, true, true, NULL);
 
 	heap_close(toast_rel, NoLock);
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 73656d8cc8..95ed7c2b2b 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -58,6 +58,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/partcache.h"
+#include "utils/pg_rusage.h"
 #include "utils/regproc.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -84,6 +85,7 @@ static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 static void ReindexPartitionedIndex(Relation parentIdx);
 
 /*
@@ -298,6 +300,90 @@ CheckIndexCompatible(Oid oldId,
 	return ret;
 }
 
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given xmin
+ * limit, because the index might not contain tuples deleted just before that
+ * snapshot was taken. Obtain a list of VXIDs of such transactions, and wait
+ * for them individually.
+ * individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int			i,
+				n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue;			/* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int			n_newer_snapshots;
+			int			j;
+			int			k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue;	/* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
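The waiting strategy factored out here — wait on each old vxid in turn, but
before each wait recheck the live list and forget vxids that are no longer
reported — can be sketched as a standalone simulation. `get_current_vxids` and
`wait_for` below are hypothetical stand-ins for GetCurrentVirtualXIDs and
VirtualXactLock, not real PostgreSQL calls:

```python
def wait_for_older_snapshots(get_current_vxids, wait_for):
    """Simulation of the WaitForOlderSnapshots loop.

    get_current_vxids() returns the collection of vxids still holding
    an older-than-limit snapshot; wait_for(vxid) blocks until that
    transaction ends. Returns the vxids actually waited on.
    """
    old = list(get_current_vxids())     # initial candidates
    waited = []
    for i, vxid in enumerate(old):
        if vxid is None:
            continue                    # found uninteresting in a previous cycle
        if i > 0:
            # Recheck: a candidate that no longer shows up has gone
            # idle with xmin zero (or exited) and need not be waited for.
            newer = set(get_current_vxids())
            for j in range(i, len(old)):
                if old[j] is not None and old[j] not in newer:
                    old[j] = None       # forget about it
        if old[i] is not None:
            wait_for(old[i])
            waited.append(old[i])
    return waited
```

In the sketch, waiting happens only for vxids that survive every recheck, which
is exactly the "avoid the folly of waiting on an idle session" behavior the
comment describes.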
+
+
 /*
  * DefineIndex
  *		Creates a new index.
@@ -346,7 +432,6 @@ DefineIndex(Oid relationId,
 	List	   *indexColNames;
 	List	   *allIndexParams;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -361,9 +446,7 @@ DefineIndex(Oid relationId,
 	int			numberOfAttributes;
 	int			numberOfKeyAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -847,7 +930,7 @@ DefineIndex(Oid relationId,
 					 stmt->oldNode, indexInfo, indexColNames,
 					 accessMethodId, tablespaceId,
 					 collationObjectId, classObjectId,
-					 coloptions, reloptions,
+					 coloptions, reloptions, NULL,
 					 flags, constr_flags,
 					 allowSystemTableMods, !check_rights,
 					 &createdConstraintId);
@@ -1143,34 +1226,15 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false, true);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_concurrently_build(RangeVarGetRelid(stmt->relation,
+											  ShareUpdateExclusiveLock,
+											  false),
+							 indexRelationId,
+							 stmt->primary);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -1242,74 +1306,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots) /* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -2195,7 +2194,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 void
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -2207,7 +2206,8 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
 									  0,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
@@ -2227,7 +2227,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 }
 
 /*
@@ -2295,18 +2298,26 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, 0,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   0,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -2324,7 +2335,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -2336,6 +2347,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	List	   *relids = NIL;
 	ListCell   *l;
 	int			num_keys;
+	bool		concurrent_warning = false;
 
 	AssertArg(objectName);
 	Assert(objectKind == REINDEX_OBJECT_SCHEMA ||
@@ -2444,6 +2456,17 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 			!pg_class_ownercheck(relid, GetUserId()))
 			continue;
 
+		/* A system catalog cannot be reindexed concurrently */
+		if (concurrent && IsSystemNamespace(get_rel_namespace(relid)))
+		{
+			if (!concurrent_warning)
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("concurrent reindex is not supported for catalog relations, skipping all")));
+			concurrent_warning = true;
+			continue;
+		}
+
 		/* Save the list of relation OIDs in private context */
 		old = MemoryContextSwitchTo(private_context);
 
@@ -2470,26 +2493,629 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
-
-			if (options & REINDEXOPT_VERBOSE)
-				ereport(INFO,
-						(errmsg("table \"%s.%s\" was reindexed",
-								get_namespace_name(get_rel_namespace(relid)),
-								get_rel_name(relid))));
+
+		if (concurrent)
+		{
+			result = ReindexRelationConcurrently(relid, options);
+
+			PushActiveSnapshot(GetTransactionSnapshot());
+		}
+		else
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+		if (result && (options & REINDEXOPT_VERBOSE))
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							get_namespace_name(get_rel_namespace(relid)),
+							get_rel_name(relid))));
+		PopActiveSnapshot();
+		CommitTransactionCommand();
+	}
+	StartTransactionCommand();
+
+	MemoryContextDelete(private_context);
+}
+
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for the given relation OID. The relation can
+ * be either an index or a table. If a table is specified, each phase is
+ * processed one by one for all of the table's indexes, as well as its
+ * dependent toast indexes if the table has a toast relation defined.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *parentRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *newIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc,
+			   *lc2;
+	MemoryContext private_context;
+	MemoryContext old;
+	char	   *relationName = NULL;
+	PGRUsage	ru0;
+
+	/*
+	 * Create a memory context that will survive forced transaction commits we
+	 * do below.  Since it is a child of PortalContext, it will go away
+	 * eventually even if we suffer an error; there's no need for special
+	 * abort cleanup logic.
+	 */
+	private_context = AllocSetContextCreate(PortalContext,
+											"ReindexConcurrent",
+											ALLOCSET_SMALL_SIZES);
+
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		/* Save data needed by REINDEX VERBOSE in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		relationName = get_rel_name(relationOid);
+
+		pg_rusage_init(&ru0);
+
+		MemoryContextSwitchTo(old);
+	}
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given relation
+	 * Oid is a table, all its valid indexes will be rebuilt, including its
+	 * associated toast table indexes. If relkind is an index, this index
+	 * itself will be rebuilt. The locks taken on the parent relations and
+	 * involved indexes are kept until this transaction is committed to
+	 * protect against schema changes that might occur before the session
+	 * locks are taken on each relation; those session locks similarly
+	 * protect against any schema change that could happen within the
+	 * multiple transactions used during this process.
+	 */
+	switch (get_rel_relkind(relationOid))
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes including
+				 * toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				MemoryContextSwitchTo(old);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+														   ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+					{
+						/* Save the list of relation OIDs in private context */
+						old = MemoryContextSwitchTo(private_context);
+
+						indexIds = lappend_oid(indexIds, cellOid);
+
+						MemoryContextSwitchTo(old);
+					}
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+														  ShareUpdateExclusiveLock);
+
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					MemoryContextSwitchTo(old);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+															   ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+						{
+							/*
+							 * Save the list of relation OIDs in private
+							 * context
+							 */
+							old = MemoryContextSwitchTo(private_context);
+
+							indexIds = lappend_oid(indexIds, cellOid);
+
+							MemoryContextSwitchTo(old);
+						}
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			parentOid = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(parentOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(parentOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(parentOid);
+
+				MemoryContextSwitchTo(old);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					indexIds = lappend_oid(indexIds, relationOid);
+
+					MemoryContextSwitchTo(old);
+				}
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so clean up and leave */
+	if (indexIds == NIL)
+	{
+		MemoryContextDelete(private_context);
+		return false;
+	}
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We first need to create an index which is based on the same data as the
+	 * former index, except that it will only be registered in the catalogs
+	 * and will be built later. It is possible to perform all these operations
+	 * at the same time for all the indexes of a parent relation, including
+	 * the indexes of its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId  *lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index parent relation, might be a toast or parent relation */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a temporary relation name for the new index */
+		concurrentName = ChooseRelationName(get_rel_name(indOid),
+											NULL,
+											"ccnew",
+											get_rel_namespace(indexRel->rd_index->indrelid),
+											false);
+
+		/* Create new index definition based on given index */
+		concurrentOid = index_concurrently_create_copy(indexParentRel,
+													   indOid,
+													   concurrentName);
+
+		/* Now open the relation of the new index, a lock is also needed on it */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the list of oids and locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Save the new index Oid */
+		newIndexIds = lappend_oid(newIndexIds, concurrentOid);
+
+		/*
+		 * Save lockrelid to protect each relation from drop then close
+		 * relations. The entries are palloc'd in the private context so that
+		 * they survive the transaction commits done below. The lockrelid on
+		 * the parent relation is not taken here to avoid multiple locks taken
+		 * on the same relation, instead we rely on parentRelationIds built
+		 * earlier.
+		 */
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		MemoryContextSwitchTo(old);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the following wait phases, where other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId  *lockrelid;
+		LOCKTAG    *heaplocktag;
+
+		/* Save the list of locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Add lockrelid of parent relation to the list of locked relations */
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid->dbId, lockrelid->relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		MemoryContextSwitchTo(old);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transactions will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on the relation, the
+	 * concurrent index and its copy to ensure that none of them are dropped
+	 * until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the new indexes in a separate transaction for each index to avoid
+	 * having open transactions for an unnecessarily long time. A concurrent
+	 * build is done for each index that will replace the old indexes. Before
+	 * doing that, we need to wait on the parent relations until no running
+	 * transaction could still have the parent table of an index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			heapId;
+		bool		primary;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index's concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Index relation has been closed by previous commit, so reopen it to
+		 * fetch its parent relation and determine if it is used as a primary
+		 * key, saving both before the relcache entry is closed again.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		heapId = indexRel->rd_index->indrelid;
+		primary = indexRel->rd_index->indisprimary;
+		index_close(indexRel, NoLock);
+
+		/* Perform concurrent build of new index */
+		index_concurrently_build(heapId, concurrentOid, primary);
+
+		/* We can do away with our snapshot */
 		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update visible for
+		 * concurrent index.
+		 */
 		CommitTransactionCommand();
 	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the new indexes catch up with any tuples that were
+	 * created during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Scan the heap for each new index, then insert any missing index
+	 * entries.
+	 */
+	foreach(lc, newIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		TransactionId limitXmin;
+		Snapshot	snapshot;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for the old indexes
+		 * validation.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate index, which might be a toast */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * This new index is now valid as it contains all the tuples
+		 * necessary. However, it might not have taken into account deleted
+		 * tuples before the reference snapshot was taken, so we need to wait
+		 * for the transactions that might have older snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction now that the new index is valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the new indexes have been validated, it is necessary to swap
+	 * each new index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes dead at the same
+	 * time to make sure we only get constraint violations from the indexes
+	 * with the correct names.
+	 */
+
+	StartTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		char	   *oldName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseRelationName(get_rel_name(indOid),
+									 NULL,
+									 "ccold",
+									 get_rel_namespace(relOid),
+									 false);
+
+		/* Swap old index with the new one */
+		index_concurrently_swap(concurrentOid, indOid, oldName);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+	}
+
+	/* Commit this transaction and make index swaps visible */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * Mark the old indexes as dead so they can later be dropped.
+	 *
+	 * Note that it is necessary to wait for virtual locks on the parent
+	 * relation before setting the index as dead.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Finish the index invalidation and set it as dead. */
+		index_concurrently_set_dead(relOid, indOid);
+	}
+
+	/* Commit this transaction to make the updates visible. */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the old indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe as all the old entries are already
+	 * considered invalid and not ready, so they will not be used by other
+	 * backends for any read or write operations.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	/* Get fresh snapshot for next step */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+
+		CHECK_FOR_INTERRUPTS();
+
+		index_concurrently_drop(indOid);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Finally, release the session-level locks on the relations.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Log what we did */
+	if (options & REINDEXOPT_VERBOSE)
+		ereport(INFO,
+				(errmsg("relation \"%s\" was reindexed",
+						relationName),
+				 errdetail("%s.",
+						   pg_rusage_show(&ru0))));
+
+	/* Start a new transaction to finish the process properly */
 	StartTransactionCommand();
 
 	MemoryContextDelete(private_context);
+
+	return true;
 }
 
 /*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 843ed48aa7..a6f9059a15 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1185,6 +1185,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	bool		is_partition;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1247,7 +1248,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(get_rel_relkind(relOid)),
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Allow dropping a system index that might have been invalidated by a
+	 * failed concurrent operation. For the time being, this only concerns
+	 * indexes of toast relations that became invalid during a REINDEX
+	 * CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index db49968409..277066baf3 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4353,6 +4353,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 3a084b4d1f..88ec6639e6 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2097,6 +2097,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 2c2208ffb7..bd55e866dc 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -8314,42 +8314,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 970c94ee80..94b478be2a 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -774,16 +774,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventInTransactionBlock(isTopLevel,
+											  "REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -799,7 +803,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												  (stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												  (stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												  "REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 62c2928e6b..7b3b21c5de 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -2149,6 +2149,22 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY are not allowed in
+			 * xacts.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index fa44b2820b..a2893d91ee 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3148,12 +3148,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("REINDEX"))
 		COMPLETE_WITH("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+	else if (Matches("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 46c271a46c..5c7c6ef4a2 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -251,6 +251,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+								 Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, char deptype, Oid *tableId, int32 *colId);
@@ -261,6 +264,8 @@ extern Oid	get_constraint_index(Oid constraintId);
 
 extern Oid	get_index_constraint(Oid indexId);
 
+extern List *get_index_ref_constraints(Oid indexId);
+
 /* in pg_shdepend.c */
 
 extern void recordSharedDependencyOn(ObjectAddress *depender,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 35a29f3498..8f004198d7 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -65,6 +65,7 @@ extern Oid index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bits16 flags,
 			 bits16 constr_flags,
 			 bool allow_system_table_mods,
@@ -77,6 +78,23 @@ extern Oid index_create(Relation heapRelation,
 #define	INDEX_CONSTR_CREATE_UPDATE_INDEX	(1 << 3)
 #define	INDEX_CONSTR_CREATE_REMOVE_OLD_DEPS	(1 << 4)
 
+extern Oid index_concurrently_create_copy(Relation heapRelation,
+										  Oid oldIndexId,
+										  const char *newName);
+
+extern void index_concurrently_build(Oid heapOid,
+									 Oid indexOid,
+									 bool isprimary);
+
+extern void index_concurrently_swap(Oid newIndexId,
+									Oid oldIndexId,
+									const char *oldName);
+
+extern void index_concurrently_set_dead(Oid heapOid,
+										Oid indexOid);
+
+extern void index_concurrently_drop(Oid indexId);
+
 extern ObjectAddress index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
 						Oid parentConstraintId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 1d05a4bcdc..5d99dff48e 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -34,10 +34,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_not_in_use,
 			bool skip_build,
 			bool quiet);
-extern void ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern void ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e5bdc1cec5..70136772a5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3304,6 +3304,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index dd57a96e78..ced699ad7a 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -41,6 +41,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 46deb55c67..a5e382bf28 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3292,3 +3292,73 @@ DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
 NOTICE:  drop cascades to 6 other objects
+RESET client_min_messages;
+RESET search_path;
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+ERROR:  conflicting key value violates exclusion constraint "concur_reindex_tab_c3_excl"
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  can only reindex the currently open database
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+WARNING:  concurrent reindex is not supported for catalog relations, skipping all
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+          Table "public.concur_reindex_tab"
+ Column |   Type    | Collation | Nullable | Default 
+--------+-----------+-----------+----------+---------
+ c1     | integer   |           | not null | 
+ c2     | text      |           |          | 
+ c3     | int4range |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+    "concur_reindex_tab_c3_excl" EXCLUDE USING gist (c3 WITH &&)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 59da6b6592..9f13e718a1 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1207,3 +1207,54 @@ CREATE ROLE regress_reindexuser NOLOGIN;
 DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
+RESET client_min_messages;
+RESET search_path;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;

base-commit: f8f6e44676ef38fee7a5bbe4f256a34ea7799ac1
-- 
2.19.2

#88Sergei Kornilov
In reply to: Peter Eisentraut (#87)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Hello

Thank you for working on this patch!

I performed some tests and think the behavior with partitioned tables is slightly inconsistent.

postgres=# reindex table measurement;
WARNING: REINDEX of partitioned tables is not yet implemented, skipping "measurement"
NOTICE: table "measurement" has no indexes
REINDEX
postgres=# reindex table CONCURRENTLY measurement;
ERROR: cannot reindex concurrently this type of relation

Maybe we need to report a warning and skip partitioned tables, similar to plain REINDEX?

This matters even more for "reindex database" or "reindex schema": as far as I can see, a concurrent reindex will stop working after the first partitioned table in the list.

regards, Sergei

#89Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Sergei Kornilov (#88)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 07/12/2018 17:40, Sergei Kornilov wrote:

I performed some tests and think the behavior with partitioned tables is slightly inconsistent.

postgres=# reindex table measurement;
WARNING: REINDEX of partitioned tables is not yet implemented, skipping "measurement"
NOTICE: table "measurement" has no indexes
REINDEX
postgres=# reindex table CONCURRENTLY measurement;
ERROR: cannot reindex concurrently this type of relation

Maybe we need to report a warning and skip partitioned tables, similar to plain REINDEX?

OK, that should be easy to fix.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#90Sergei Kornilov
In reply to: Peter Eisentraut (#89)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Hello

I reviewed the code and documentation and have a few notes. Did you register this patch in the CF app?

I found one error in phase 4. Simple reproducer:

create table test (i int);
create index this_is_very_large_exactly_maxnamelen_index_name_wink_wink_wink on test (i);
create index this_is_very_large_exactly_maxnamelen_index_name_wink_winkccold on test (i);
reindex table CONCURRENTLY test;

This fails with error

ERROR: duplicate key value violates unique constraint "pg_class_relname_nsp_index"
DETAIL: Key (relname, relnamespace)=(this_is_very_large_exactly_maxnamelen_index_name_wink_win_ccold, 2200) already exists.

CommandCounterIncrement() in (or after) index_concurrently_swap will fix this issue.
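The collision comes from PostgreSQL truncating identifiers to NAMEDATALEN - 1 (63) bytes when it appends the temporary suffix. A rough simulation of that truncation (plain Python, not the server's actual makeObjectName/ChooseRelationName logic) shows why both 63-character names above collapse to the same "_ccold" name:

```python
NAMEDATALEN = 64  # PostgreSQL default; identifiers keep at most 63 bytes

def suffixed_name(relname, suffix="_ccold"):
    # Rough model: truncate the base name so that base + suffix still
    # fits in NAMEDATALEN - 1 bytes before appending the suffix.
    keep = NAMEDATALEN - 1 - len(suffix)
    return relname[:keep] + suffix

a = "this_is_very_large_exactly_maxnamelen_index_name_wink_wink_wink"
b = "this_is_very_large_exactly_maxnamelen_index_name_wink_winkccold"

# Both 63-byte names lose their distinguishing tail and map to the
# same generated name, matching the duplicate-key DETAIL above.
print(suffixed_name(a) == suffixed_name(b))  # True
```

The generated name for both indexes is exactly the one reported in the error DETAIL.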

ReindexPartitionedIndex(Relation parentIdx)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("REINDEX is not yet implemented for partitioned indexes")));

I think we need to add errhint("you can REINDEX each partition separately") or something similar.
Also, can we omit this warning for REINDEX DATABASE? All partitions must be in the same database, and the warning in that case is useless: we emit a warning but still reindex each partition, so the partitioned table ends up reindexed correctly.

Another behavior issue I found with REINDEX (VERBOSE) SCHEMA/DATABASE: the INFO ereport is printed twice for each table.

INFO: relation "measurement_y2006m02" was reindexed
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.07 s.
INFO: table "public.measurement_y2006m02" was reindexed

One comes from ReindexMultipleTables and the other (with pg_rusage_show) from ReindexRelationConcurrently.

ReindexRelationConcurrently
if (!indexRelation->rd_index->indisvalid)

would it be better to use the IndexIsValid macro here? The same question applies to the added indexform->indisvalid check in src/backend/commands/tablecmds.c.

<para>
An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.
</para>

This documentation change seems wrong to me: REINDEX CONCURRENTLY does not rebuild invalid indexes. To fix invalid indexes we still need a plain REINDEX with a table lock, or to recreate the index concurrently.

+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_class.isready</literal> is
+       switched to <quote>true</quote>
+       At this point <literal>pg_class.indisvalid</literal> is switched to
+       <quote>true</quote> for the new index and to <quote>false</quote> for the old, and
+       Old indexes have <literal>pg_class.isready</literal> switched to <quote>false</quote>

Should be pg_index.indisvalid and pg_index.indisready, right?
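To keep the phases straight, here is a toy model (plain Python, not server code) of the pg_index flag transitions the corrected text should describe; the dict keys mirror pg_index.indisready and pg_index.indisvalid:

```python
def reindex_concurrently_flags():
    # Toy model of the pg_index flag dance during REINDEX CONCURRENTLY.
    old = {"indisready": True, "indisvalid": True}
    new = {"indisready": False, "indisvalid": False}  # new entry created

    new["indisready"] = True   # first build pass done; index now receives inserts
    new["indisvalid"] = True   # validation done, swap: new becomes valid...
    old["indisvalid"] = False  # ...and the old one invalid
    old["indisready"] = False  # old index no longer maintained, ready to drop
    return old, new
```

At the end the new index is both ready and valid, while the old one is neither and can be dropped.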

regards, Sergei

#91Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Sergei Kornilov (#90)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 09/12/2018 19:55, Sergei Kornilov wrote:

reindex table CONCURRENTLY test;

By the way, does this syntax make sense? I haven't seen a discussion on
this anywhere in the various threads. I keep thinking that

reindex concurrently table test;

would make more sense. How about in combination with (verbose)?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#92Stephen Frost
sfrost@snowman.net
In reply to: Peter Eisentraut (#91)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Greetings,

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:

On 09/12/2018 19:55, Sergei Kornilov wrote:

reindex table CONCURRENTLY test;

By the way, does this syntax make sense? I haven't seen a discussion on
this anywhere in the various threads. I keep thinking that

reindex concurrently table test;

would make more sense. How about in combination with (verbose)?

I don't think it's a mistake that we have 'create index concurrently'
and it certainly would seem odd to me for 'create index' and 'reindex
table' to be different.

Certainly, from my recollection of english, you'd say "I am going to
reindex the table concurrently", you wouldn't say "I am going to
reindex concurrently the table."

Based on at least a quick looking around, the actual grammar rule seems
to match my recollection[1], adverbs should typically go AFTER the
verb + object, and the adverb shouldn't ever be placed between the verb
and the object.

Thanks!

Stephen

[1]: http://www.grammar.cl/Notes/Adverbs.htm

#93Michael Paquier
michael@paquier.xyz
In reply to: Stephen Frost (#92)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Thu, Dec 13, 2018 at 07:14:57PM -0500, Stephen Frost wrote:

Based on at least a quick looking around, the actual grammar rule seems
to match my recollection[1], adverbs should typically go AFTER the
verb + object, and the adverb shouldn't ever be placed between the verb
and the object.

This part has been a long debate already in 2012-2013 when I sent the
first iterations of the patch, and my memories on the matter are that
the grammar you are showing here matches with the past agreement.
--
Michael

#94Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Stephen Frost (#92)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 14/12/2018 01:14, Stephen Frost wrote:

reindex table CONCURRENTLY test;

By the way, does this syntax make sense? I haven't seen a discussion on
this anywhere in the various threads. I keep thinking that

reindex concurrently table test;

would make more sense. How about in combination with (verbose)?

I don't think it's a mistake that we have 'create index concurrently'
and it certainly would seem odd to me for 'create index' and 'reindex
table' to be different.

Certainly, from my recollection of english, you'd say "I am going to
reindex the table concurrently", you wouldn't say "I am going to
reindex concurrently the table."

Based on at least a quick looking around, the actual grammar rule seems
to match my recollection[1], adverbs should typically go AFTER the
verb + object, and the adverb shouldn't ever be placed between the verb
and the object.

So it would be grammatical to say

reindex table test concurrently

or in a pinch

reindex concurrently table test

but I don't see anything grammatical about

reindex table concurrently test

(given that the object is "table test").

Where this gets really messy is stuff like this:

reindex (verbose) database concurrently postgres

Why would "concurrently" not be part of the options next to "verbose"?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#95Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#93)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 14/12/2018 01:23, Michael Paquier wrote:

On Thu, Dec 13, 2018 at 07:14:57PM -0500, Stephen Frost wrote:

Based on at least a quick looking around, the actual grammar rule seems
to match my recollection[1], adverbs should typically go AFTER the
verb + object, and the adverb shouldn't ever be placed between the verb
and the object.

This part has been a long debate already in 2012-2013 when I sent the
first iterations of the patch, and my memories on the matter are that
the grammar you are showing here matches with the past agreement.

Do you happen to have a link for that? I didn't find anything.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#96Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#95)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 2018-Dec-14, Peter Eisentraut wrote:

On 14/12/2018 01:23, Michael Paquier wrote:

On Thu, Dec 13, 2018 at 07:14:57PM -0500, Stephen Frost wrote:

Based on at least a quick looking around, the actual grammar rule seems
to match my recollection[1], adverbs should typically go AFTER the
verb + object, and the adverb shouldn't ever be placed between the verb
and the object.

This part has been a long debate already in 2012-2013 when I sent the
first iterations of the patch, and my memories on the matter are that
the grammar you are showing here matches with the past agreement.

Do you happen to have a link for that? I didn't find anything.

I think putting the CONCURRENTLY in the parenthesized list of options is
most sensible.

CREATE INDEX didn't have such an option list when we added this feature
there; see
/messages/by-id/200608011143.k71Bh9c22067@momjian.us
for some discussion about that grammar. Our options were not great ...

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#97Stephen Frost
sfrost@snowman.net
In reply to: Peter Eisentraut (#94)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Greetings,

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:

On 14/12/2018 01:14, Stephen Frost wrote:

reindex table CONCURRENTLY test;

By the way, does this syntax make sense? I haven't seen a discussion on
this anywhere in the various threads. I keep thinking that

reindex concurrently table test;

would make more sense. How about in combination with (verbose)?

I don't think it's a mistake that we have 'create index concurrently'
and it certainly would seem odd to me for 'create index' and 'reindex
table' to be different.

Certainly, from my recollection of english, you'd say "I am going to
reindex the table concurrently", you wouldn't say "I am going to
reindex concurrently the table."

Based on at least a quick looking around, the actual grammar rule seems
to match my recollection[1], adverbs should typically go AFTER the
verb + object, and the adverb shouldn't ever be placed between the verb
and the object.

So it would be grammatical to say

reindex table test concurrently

Yes, though I'm not really a fan of it.

or in a pinch

reindex concurrently table test

No, you can't put concurrently between reindex and table.

but I don't see anything grammatical about

reindex table concurrently test

I disagree, this does look reasonable to me and it's certainly much
better than 'reindex concurrently table' which looks clearly incorrect.

Where this gets really messy is stuff like this:

reindex (verbose) database concurrently postgres

Why would "concurrently" not be part of the options next to "verbose"?

That wasn't what was asked and I don't think I see a problem with having
concurrently be allowed in the parentheses. For comparison, it's not
like "explain analyze select ..." or "explain buffers select" is
terribly good grammatical form.

If you wanted to try to get to a better form for the spelled out
sentence, I would think:

concurrently reindex table test

would probably be the approach to use, though that's not what we use for
'create index' and it'd be rather out of character for us to start a
command with an adverb, making it ultimately a poor choice overall.

Going back to what we already have done and have in released versions,
we have 'create unique index concurrently test ...' and that's at least
reasonable (the adverb isn't showing up between the verb and the object,
and the adjective is between the verb and the object) and is what I
vote to go with, with the caveat that if we want to also allow it inside
the parentheses, I'm fine with that.

Thanks!

Stephen

#98Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Stephen Frost (#97)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 2018-Dec-14, Stephen Frost wrote:

That wasn't what was asked and I don't think I see a problem with having
concurrently be allowed in the parentheses. For comparison, it's not
like "explain analyze select ..." or "explain buffers select" is
terribly good grammatical form.

... and we don't allow EXPLAIN BUFFERS at all, and if we had had a
parenthesized option list in EXPLAIN when we invented EXPLAIN ANALYZE, I
bet we would have *not* made the ANALYZE keyword appear unadorned in
that command.

If you wanted to try to get to a better form for the spelled out
sentence, I would think:

concurrently reindex table test

would probably be the approach to use,

I think this is terrible from a command-completion perspective, and from
a documentation perspective (Certainly we wouldn't have a manpage about
the "concurrently" command, for starters).

My vote goes to put the keyword inside of and exclusively in the
parenthesized option list.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#99Stephen Frost
sfrost@snowman.net
In reply to: Alvaro Herrera (#98)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Greetings,

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:

On 2018-Dec-14, Stephen Frost wrote:

That wasn't what was asked and I don't think I see a problem with having
concurrently be allowed in the parentheses. For comparison, it's not
like "explain analyze select ..." or "explain buffers select" is
terribly good grammatical form.

... and we don't allow EXPLAIN BUFFERS at all, and if we had had a
parenthesized option list in EXPLAIN when we invented EXPLAIN ANALYZE, I
bet we would have *not* made the ANALYZE keyword appear unadorned in
that command.

I'm not convinced of that; there is value in being able to write full
and useful commands without having to always use parentheses.

If you wanted to try to get to a better form for the spelled out
sentence, I would think:

concurrently reindex table test

would probably be the approach to use,

I think this is terrible from a command-completion perspective, and from
a documentation perspective (Certainly we wouldn't have a manpage about
the "concurrently" command, for starters).

Right, I agreed that this had other downsides in the email you're
replying to here. Glad we agree that it's not a good option.

My vote goes to put the keyword inside of and exclusively in the
parenthesized option list.

I disagree with the idea of exclusively having concurrently be in the
parentheses. 'explain buffers' is a much less frequently used option
(though that might, in part, be because it's a bit annoying to write out
explain (analyze, buffers) select...; I wonder if we could have a way to
say "if I'm running analyze, I always want buffers"...), but
concurrently reindexing a table (or index..) is going to almost
certainly be extremely common, perhaps even more common than *not*
reindexing concurrently.

Thanks!

Stephen

#100Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#96)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Fri, Dec 14, 2018 at 09:00:58AM -0300, Alvaro Herrera wrote:

On 2018-Dec-14, Peter Eisentraut wrote:

Do you happen to have a link for that? I didn't find anything.

The message I was thinking about is close to here:
/messages/by-id/20121210152856.GC16664@awork2.anarazel.de

I think putting the CONCURRENTLY in the parenthesized list of options is
most sensible.

For new options of VACUUM and ANALYZE we tend to prefer that as well,
and this simplifies the query parsing.
--
Michael

#101Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Sergei Kornilov (#90)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 09/12/2018 19:55, Sergei Kornilov wrote:

<para>
An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.
</para>

This documentation change seems wrong to me: REINDEX CONCURRENTLY does not rebuild invalid indexes. To fix invalid indexes we still need a plain REINDEX with a table lock, or to recreate the index concurrently.

The current patch prevents REINDEX CONCURRENTLY of invalid indexes, but
I wonder why that is so. Anyone remember?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#102Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#101)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Thu, Dec 27, 2018 at 11:04:09AM +0100, Peter Eisentraut wrote:

The current patch prevents REINDEX CONCURRENTLY of invalid indexes, but
I wonder why that is so. Anyone remember?

It should be around this time:
/messages/by-id/CAB7nPqRwVtQcHWErUf9o0hrRGFyQ9xArk7K7jCLxqKLy_6CXPQ@mail.gmail.com

And if I recall correctly, the reason for not being able to reindex
invalid entries was that when working on a table, schema or database,
if a failure happens in the process, the reindex would need to process
double the number of indexes when the command is repeated.
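A quick back-of-the-envelope illustration of that doubling (a hypothetical counting helper, assuming each failed attempt leaves one invalid leftover copy per index it was rebuilding):

```python
def entries_after_failed_attempts(n_indexes, failed_attempts):
    # Each failed REINDEX CONCURRENTLY can leave one invalid leftover
    # entry per index it was rebuilding. If invalid indexes were
    # themselves eligible for reindexing, every retry would rebuild
    # the leftovers too, doubling the work each time.
    total = n_indexes
    for _ in range(failed_attempts):
        total *= 2
    return total

print(entries_after_failed_attempts(3, 1))  # 6: one failure doubles 3 indexes
```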
--
Michael

#103Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Stephen Frost (#99)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 2018-Dec-14, Stephen Frost wrote:

My vote goes to put the keyword inside of and exclusively in the
parenthesized option list.

I disagree with the idea of exclusively having concurrently be in the
parentheses. 'explain buffers' is a much less frequently used option
(though that might, in part, be because it's a bit annoying to write out
explain (analyze, buffers) select...; I wonder if we could have a way to
say "if I'm running analyze, I always want buffers"...),

I'm skeptical. I think EXPLAIN ANALYZE is more common because it has
more than one decade of advantage compared to the more detailed option
list. Yes, it's also easier, but IMO it's a brain thing (muscle
memory), not a fingers thing.

but concurrently reindexing a table (or index..) is going to almost
certainly be extremely common, perhaps even more common than *not*
reindexing concurrently.

Well, users can use the reindexdb utility and save some keystrokes.

Anyway we don't typically add redundant ways to express the same things.
Where we have them, it's just because the old way was there before, and
we added the extensible way later. Adding two in the first appearance
of a new feature seems absurd to me.

After looking at the proposed grammar again today and in danger of
repeating myself, IMO allowing the concurrency keyword to appear outside
the parens would be a mistake. Valid commands:

REINDEX (VERBOSE, CONCURRENTLY) TABLE foo;
REINDEX (CONCURRENTLY) INDEX bar;

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#104Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Alvaro Herrera (#103)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

"Alvaro" == Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Alvaro> After looking at the proposed grammar again today and in danger
Alvaro> of repeating myself, IMO allowing the concurrency keyword to
Alvaro> appear outside the parens would be a mistake. Valid commands:

Alvaro> REINDEX (VERBOSE, CONCURRENTLY) TABLE foo;
Alvaro> REINDEX (CONCURRENTLY) INDEX bar;

We burned that bridge with CREATE INDEX CONCURRENTLY; to make REINDEX
require different syntax would be too inconsistent.

If we didn't have all these existing uses of CONCURRENTLY without
parens, your argument might have more merit; but we do.

--
Andrew (irc:RhodiumToad)

#105Andres Freund
andres@anarazel.de
In reply to: Andrew Gierth (#104)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 2018-12-31 21:35:57 +0000, Andrew Gierth wrote:

"Alvaro" == Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Alvaro> After looking at the proposed grammar again today and in danger
Alvaro> of repeating myself, IMO allowing the concurrency keyword to
Alvaro> appear outside the parens would be a mistake. Valid commands:

Alvaro> REINDEX (VERBOSE, CONCURRENTLY) TABLE foo;
Alvaro> REINDEX (CONCURRENTLY) INDEX bar;

We burned that bridge with CREATE INDEX CONCURRENTLY; to make REINDEX
require different syntax would be too inconsistent.

If we didn't have all these existing uses of CONCURRENTLY without
parens, your argument might have more merit; but we do.

+1

#106Stephen Frost
sfrost@snowman.net
In reply to: Alvaro Herrera (#103)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Greetings,

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:

On 2018-Dec-14, Stephen Frost wrote:

My vote goes to put the keyword inside of and exclusively in the
parenthesized option list.

I disagree with the idea of exclusively having concurrently be in the
parentheses. 'explain buffers' is a much less frequently used option
(though that might, in part, be because it's a bit annoying to write out
explain (analyze, buffers) select...; I wonder if we could have a way to
say "if I'm running analyze, I always want buffers"...),

I'm skeptical. I think EXPLAIN ANALYZE is more common because it has
more than one decade of advantage compared to the more detailed option
list. Yes, it's also easier, but IMO it's a brain thing (muscle
memory), not a fingers thing.

I would argue that it's both.

but concurrently reindexing a table (or index..) is going to almost
certainly be extremely common, perhaps even more common than *not*
reindexing concurrently.

Well, users can use the reindexdb utility and save some keystrokes.

That's a really poor argument as those unix utilities are hardly ever
used, in my experience.

Anyway we don't typically add redundant ways to express the same things.
Where we have them, it's just because the old way was there before, and
we added the extensible way later. Adding two in the first appearance
of a new feature seems absurd to me.

SQL allows many, many different ways to express the same thing. I agree
that we haven't done that much in our utility commands, but I don't see
that as an argument against doing so, just that we haven't (previously)
really had the need - because most of the time we don't have a bunch of
different options where we want to have a list.

After looking at the proposed grammar again today and in danger of
repeating myself, IMO allowing the concurrency keyword to appear outside
the parens would be a mistake. Valid commands:

REINDEX (VERBOSE, CONCURRENTLY) TABLE foo;
REINDEX (CONCURRENTLY) INDEX bar;

This discussion hasn't changed my opinion, and, though I'm likely
repeating myself as well, I also agree with the down-thread comment that
this ship really has already sailed.

Thanks!

Stephen

#107Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Sergei Kornilov (#90)
1 attachment(s)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Updated patch for some merge conflicts and addressing most of your
comments. (I did not do anything about the syntax.)

On 09/12/2018 19:55, Sergei Kornilov wrote:

I found one error in phase 4. Simple reproducer:

create table test (i int);
create index this_is_very_large_exactly_maxnamelen_index_name_wink_wink_wink on test (i);
create index this_is_very_large_exactly_maxnamelen_index_name_wink_winkccold on test (i);
reindex table CONCURRENTLY test;

This fails with error

ERROR: duplicate key value violates unique constraint "pg_class_relname_nsp_index"
DETAIL: Key (relname, relnamespace)=(this_is_very_large_exactly_maxnamelen_index_name_wink_win_ccold, 2200) already exists.

CommandCounterIncrement() in (or after) index_concurrently_swap will fix this issue.

fixed
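The collision in that reproducer can be illustrated outside the server: both 63-character names truncate to the same 57-character base before the _ccold suffix is appended, so without the CommandCounterIncrement the second rename picks a name whose first occupant is not yet visible. Below is a toy Python model of the name choice; choose_old_index_name is a hypothetical simplification of what ChooseRelationName does (the real code works in bytes, not characters, and re-truncates so the counter still fits):

```python
# NAMEDATALEN is 64 by default; identifiers are limited to 63 bytes.
NAMEDATALEN = 64

def choose_old_index_name(index_name, existing, suffix="_ccold"):
    # Truncate the base so base + suffix fits within the limit, then
    # append a counter until the candidate is free.  (The real code
    # re-truncates so the counter itself still fits in 63 bytes.)
    base = index_name[:NAMEDATALEN - 1 - len(suffix)]
    candidate = base + suffix
    n = 0
    while candidate in existing:
        n += 1
        candidate = base + suffix + str(n)
    return candidate

idx1 = "this_is_very_large_exactly_maxnamelen_index_name_wink_wink_wink"
idx2 = "this_is_very_large_exactly_maxnamelen_index_name_wink_winkccold"

# Without CommandCounterIncrement the first rename is not yet visible,
# so both indexes resolve to the same _ccold name -> duplicate key error.
stale = {idx1, idx2}
assert choose_old_index_name(idx1, stale) == choose_old_index_name(idx2, stale)

# With the increment, the first chosen name is visible when the second
# index is processed, so the second pick gets disambiguated.
fresh = {idx1, idx2}
name1 = choose_old_index_name(idx1, fresh)
fresh.add(name1)
name2 = choose_old_index_name(idx2, fresh)
assert name1 != name2
```

Both inputs share the same 57-character prefix, which is why the truncated _ccold candidates coincide exactly as in the error message above.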

ReindexPartitionedIndex(Relation parentIdx)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("REINDEX is not yet implemented for partitioned indexes")));

I think we need add errhint("you can REINDEX each partition separately") or something similar.
Also, can we omit this warning for REINDEX DATABASE? All partitions must be in the same database, so the warning is useless in that case: we emit the warning, but by reindexing each partition we still reindex the partitioned table correctly.

fixed by skipping in ReindexRelationConcurrently(), same as other
unsupported relkinds

Another behavior issue i found with reindex (verbose) schema/database: INFO ereport is printed twice for each table.

INFO: relation "measurement_y2006m02" was reindexed
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.07 s.
INFO: table "public.measurement_y2006m02" was reindexed

One from ReindexMultipleTables and another (with pg_rusage_show) from ReindexRelationConcurrently.

fixed with some restructuring

ReindexRelationConcurrently
if (!indexRelation->rd_index->indisvalid)

would it be better to use the IndexIsValid macro here? Same question about the added indexform->indisvalid check in src/backend/commands/tablecmds.c.

IndexIsValid() has been removed in the meantime.

<para>
An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.
</para>

This documentation change seems wrong to me: REINDEX CONCURRENTLY does not rebuild invalid indexes. To fix an invalid index we still need a regular REINDEX (which locks the table) or to recreate the index concurrently.

still being discussed elsewhere in this thread

+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_class.isready</literal> is
+       switched to <quote>true</quote>
+       At this point <literal>pg_class.indisvalid</literal> is switched to
+       <quote>true</quote> for the new index and to <quote>false</quote> for the old, and
+       Old indexes have <literal>pg_class.isready</literal> switched to <quote>false</quote>

Should be pg_index.indisvalid and pg_index.indisready, right?

fixed
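For reference, the indisready/indisvalid sequence those paragraphs describe can be sketched as a tiny state model. This is a simplified Python illustration of the documented steps, not the server code; indislive transitions, waits, and session locks are ignored:

```python
def reindex_concurrently_flags():
    # Simplified model of pg_index flag transitions for the new and old
    # index during REINDEX CONCURRENTLY, following the documented steps.
    new = {"indisready": False, "indisvalid": False}
    old = {"indisready": True, "indisvalid": True}
    steps = [("catalog entry created", dict(new), dict(old))]

    # First build pass done: the new index must now receive inserts.
    new["indisready"] = True
    steps.append(("first pass built", dict(new), dict(old)))

    # Swap phase: the new index becomes the one used by queries.
    new["indisvalid"], old["indisvalid"] = True, False
    steps.append(("names and constraints swapped", dict(new), dict(old)))

    # Before the drop: the old index stops receiving inserts.
    old["indisready"] = False
    steps.append(("old index set dead", dict(new), dict(old)))
    return steps

steps = reindex_concurrently_flags()
for label, new, old in steps:
    print(f"{label:30s} new={new} old={old}")
```

Each transition above corresponds to one of the per-index transactions described in the patch's documentation.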

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v6-0001-REINDEX-CONCURRENTLY.patchtext/plain; charset=UTF-8; name=v6-0001-REINDEX-CONCURRENTLY.patch; x-mac-creator=0; x-mac-type=0Download
From 55345211e7d6026b573142b8cbe8fe24f7692285 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 3 Jan 2019 22:43:30 +0100
Subject: [PATCH v6] REINDEX CONCURRENTLY

---
 doc/src/sgml/mvcc.sgml                        |   1 +
 doc/src/sgml/ref/reindex.sgml                 | 184 +++-
 src/backend/catalog/index.c                   | 501 +++++++++-
 src/backend/catalog/pg_depend.c               | 143 +++
 src/backend/catalog/toasting.c                |   2 +-
 src/backend/commands/indexcmds.c              | 885 +++++++++++++++---
 src/backend/commands/tablecmds.c              |  32 +-
 src/backend/nodes/copyfuncs.c                 |   1 +
 src/backend/nodes/equalfuncs.c                |   1 +
 src/backend/parser/gram.y                     |  22 +-
 src/backend/tcop/utility.c                    |  10 +-
 src/bin/psql/common.c                         |  16 +
 src/bin/psql/tab-complete.c                   |  18 +-
 src/include/catalog/dependency.h              |   5 +
 src/include/catalog/index.h                   |  18 +
 src/include/commands/defrem.h                 |   6 +-
 src/include/nodes/parsenodes.h                |   1 +
 .../expected/reindex-concurrently.out         |  78 ++
 src/test/isolation/isolation_schedule         |   1 +
 .../isolation/specs/reindex-concurrently.spec |  40 +
 src/test/regress/expected/create_index.out    |  70 ++
 src/test/regress/sql/create_index.sql         |  51 +
 22 files changed, 1912 insertions(+), 174 deletions(-)
 create mode 100644 src/test/isolation/expected/reindex-concurrently.out
 create mode 100644 src/test/isolation/specs/reindex-concurrently.spec

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index bedd9a008d..9b7ef8bf09 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,6 +926,7 @@ <title>Table-level Lock Modes</title>
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
          <command>ANALYZE</command>, <command>CREATE INDEX CONCURRENTLY</command>,
+         <command>REINDEX CONCURRENTLY</command>,
          <command>CREATE STATISTICS</command>, and certain <command>ALTER
          INDEX</command> and <command>ALTER TABLE</command> variants (for full
          details see <xref linkend="sql-alterindex"/> and <xref
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 47cef987d4..91e823abe8 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="parameter">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="parameter">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -67,10 +67,7 @@ <title>Description</title>
      <para>
       An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
       an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.
      </para>
     </listitem>
 
@@ -151,6 +148,21 @@ <title>Parameters</title>
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</productname> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="sql-reindex-concurrently"
+      endterm="sql-reindex-concurrently-title"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
@@ -241,6 +253,160 @@ <title>Notes</title>
    Each individual partition can be reindexed separately instead.
   </para>
 
+  <refsect2 id="sql-reindex-concurrently">
+   <title id="sql-reindex-concurrently-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="sql-reindex-concurrently">
+    <primary>index</primary>
+    <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</productname> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</productname> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</literal> option of <command>REINDEX</command>. When this option
+    is used, <productname>PostgreSQL</productname> must perform two scans of the table
+    for each index that needs to be rebuilt, and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created, where all
+    the concurrent entries are created using only one transaction. Note that
+    if there are multiple indexes to be rebuilt then each step loops through
+    all the indexes we're rebuilding, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</command> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition will be used to replace the
+       old index. This step is done as a single transaction for all the indexes
+       involved in this process, meaning that if
+       <command>REINDEX CONCURRENTLY</command> is run on a table with multiple
+       indexes, all the catalog entries of the new indexes are created within a
+       single transaction. A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as their
+       parent table to prevent any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_index.indisready</literal> is
+       switched to <quote>true</quote> to mark it ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while
+       the first pass build was running. This step is performed within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       All the constraints and foreign keys which refer to the index are swapped
+       to refer to the new index definition, and the names of the indexes are
+       changed. At this point <literal>pg_index.indisvalid</literal> is switched to
+       <quote>true</quote> for the new index and to <quote>false</quote> for the old, and
+       a cache invalidation is done so that all the sessions that referenced the
+       old index are invalidated. This step is done within a single transaction
+       for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Old indexes have <literal>pg_index.indisready</literal> switched to <quote>false</quote>
+       to prevent any new tuple insertions, after waiting for running queries that
+       may reference the old index to complete. This step is done within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The old index definition and its data are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</command>
+    command will fail but leave behind an <quote>invalid</quote> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</application> <command>\d</command> command will report
+    such an index as <literal>INVALID</literal>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid
+    index and try again to perform <command>REINDEX CONCURRENTLY</command>.
+    The concurrent index created during the processing has a name ending in
+    the suffix ccnew, or ccold if it is an old index definition which we failed
+    to drop. Invalid indexes can be dropped using <literal>DROP INDEX</literal>,
+    including invalid toast indexes.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</command> or <command>REINDEX INDEX</command>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</command> cannot.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command> since system catalogs cannot be reindexed
+    concurrently.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -272,6 +438,14 @@ <title>Examples</title>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
+</programlisting></para>
+
+  <para>
+   Rebuild the indexes of a table while allowing read and write operations
+   on the involved relations:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
 </programlisting></para>
  </refsect1>
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index c91408046a..0c026045b8 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -694,6 +694,7 @@ UpdateIndexRelation(Oid indexoid,
  * classObjectId: array of index opclass OIDs, one per index column
  * coloptions: array of per-index-column indoption settings
  * reloptions: AM-specific options
+ * tupdesc: Tuple descriptor used for the index if defined
  * flags: bitmask that can include any combination of these bits:
  *		INDEX_CREATE_IS_PRIMARY
  *			the index is a primary key
@@ -734,6 +735,7 @@ index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bits16 flags,
 			 bits16 constr_flags,
 			 bool allow_system_table_mods,
@@ -796,7 +798,7 @@ index_create(Relation heapRelation,
 	 * release locks before committing in catalogs
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemNamespace(get_rel_namespace(heapRelationId)))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
@@ -864,14 +866,20 @@ index_create(Relation heapRelation,
 	}
 
 	/*
-	 * construct tuple descriptor for index tuples
+	 * construct tuple descriptor for index tuples if not passed by caller
 	 */
-	indexTupDesc = ConstructTupleDescriptor(heapRelation,
-											indexInfo,
-											indexColNames,
-											accessMethodObjectId,
-											collationObjectId,
-											classObjectId);
+	if (!tupdesc)
+		indexTupDesc = ConstructTupleDescriptor(heapRelation,
+												indexInfo,
+												indexColNames,
+												accessMethodObjectId,
+												collationObjectId,
+												classObjectId);
+	else
+	{
+		Assert(indexColNames == NIL);
+		indexTupDesc = tupdesc;
+	}
 
 	/*
 	 * Allocate an OID for the index, unless we were told what to use.
@@ -1203,6 +1211,451 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+/*
+ * index_concurrently_create_copy
+ *
+ * Concurrently create an index based on the definition of the one provided by
+ * the caller.  The index is inserted into catalogs and needs to be built later
+ * on.  This is called during concurrent reindex processing.
+ */
+Oid
+index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			newIndexId = InvalidOid;
+	HeapTuple	indexTuple,
+				classTuple;
+	Datum		indclassDatum,
+				colOptionDatum,
+				optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+
+	indexRelation = index_open(oldIndexId, RowExclusiveLock);
+
+	/* New index uses the same index information as old index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/* Do not copy exclusion constraint */
+	indexInfo->ii_ExclusionOps = NULL;
+	indexInfo->ii_ExclusionProcs = NULL;
+	indexInfo->ii_ExclusionStrats = NULL;
+
+	/* Create a copy of the tuple descriptor to be used for the new entry */
+	indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", oldIndexId);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, oldIndexId);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", oldIndexId);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the new index */
+	newIndexId = index_create(heapRelation,
+							  newName,
+							  InvalidOid,	/* indexRelationId */
+							  InvalidOid,	/* parentIndexRelid */
+							  InvalidOid,	/* parentConstraintId */
+							  InvalidOid,	/* relFileNode */
+							  indexInfo,
+							  NIL,
+							  indexRelation->rd_rel->relam,
+							  indexRelation->rd_rel->reltablespace,
+							  indexRelation->rd_indcollation,
+							  indclass->values,
+							  indcoloptions->values,
+							  optionDatum,
+							  indexTupDesc,
+							  INDEX_CREATE_SKIP_BUILD | INDEX_CREATE_CONCURRENT,
+							  0,
+							  true,	/* allow table to be a system catalog? */
+							  false, /* is_internal? */
+							  NULL);
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return newIndexId;
+}
+
+/*
+ * index_concurrently_build
+ *
+ * Build an index for a concurrent operation.  The low-level locks taken here
+ * prevent only schema changes, but they need to be kept until the end of the
+ * transaction performing this operation.
+ */
+void
+index_concurrently_build(Oid heapOid,
+						 Oid indexOid,
+						 bool isprimary)
+{
+	Relation	heapRel,
+				indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in the
+	 * commit of the transaction where this concurrent index was created at
+	 * the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, isprimary, false, true);
+
+	/* Close both relations, and keep the locks */
+	heap_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts.  Once
+	 * we commit this transaction, any new transactions that open the table
+	 * must insert new entries into the index for insertions and non-HOT
+	 * updates.
+	 */
+	index_set_state_flags(indexOid, INDEX_CREATE_SET_READY);
+}
+
+/*
+ * index_concurrently_swap
+ *
+ * Swap name, dependencies, and constraints of the old index over to the new
+ * index, while marking the old index as invalid and the new as valid.
+ */
+void
+index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
+{
+	Relation	pg_class,
+				pg_index,
+				pg_constraint,
+				pg_trigger;
+	Relation	oldClassRel,
+				newClassRel;
+	HeapTuple	oldClassTuple,
+				newClassTuple;
+	Form_pg_class oldClassForm,
+				newClassForm;
+	HeapTuple	oldIndexTuple,
+				newIndexTuple;
+	Form_pg_index oldIndexForm,
+				newIndexForm;
+	Oid			indexConstraintOid;
+	List	   *constraintOids = NIL;
+	ListCell   *lc;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldClassRel = relation_open(oldIndexId, ShareUpdateExclusiveLock);
+	newClassRel = relation_open(newIndexId, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldClassForm = (Form_pg_class) GETSTRUCT(oldClassTuple);
+	newClassForm = (Form_pg_class) GETSTRUCT(newClassTuple);
+
+	/* Swap the names */
+	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
+	namestrcpy(&oldClassForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
+	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
+
+	heap_freetuple(oldClassTuple);
+	heap_freetuple(newClassTuple);
+
+	/* Now swap index info */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldIndexForm = (Form_pg_index) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
+
+	/*
+	 * Copy constraint flags for old index. This is safe because the old index
+	 * guaranteed uniqueness.
+	 */
+	newIndexForm->indisprimary = oldIndexForm->indisprimary;
+	oldIndexForm->indisprimary = false;
+	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+	oldIndexForm->indisexclusion = false;
+	newIndexForm->indimmediate = oldIndexForm->indimmediate;
+	oldIndexForm->indimmediate = true;
+
+	/* Mark old index as valid and new as invalid as index_set_state_flags */
+	newIndexForm->indisvalid = true;
+	oldIndexForm->indisvalid = false;
+	oldIndexForm->indisclustered = false;
+
+	CatalogTupleUpdate(pg_index, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_index, &newIndexTuple->t_self, newIndexTuple);
+
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+
+	/*
+	 * Move constraints and triggers over to the new index
+	 */
+
+	constraintOids = get_index_ref_constraints(oldIndexId);
+
+	indexConstraintOid = get_index_constraint(oldIndexId);
+
+	if (OidIsValid(indexConstraintOid))
+		constraintOids = lappend_oid(constraintOids, indexConstraintOid);
+
+	pg_constraint = heap_open(ConstraintRelationId, RowExclusiveLock);
+	pg_trigger = heap_open(TriggerRelationId, RowExclusiveLock);
+
+	foreach(lc, constraintOids)
+	{
+		HeapTuple	constraintTuple,
+					triggerTuple;
+		Form_pg_constraint conForm;
+		ScanKeyData key[1];
+		SysScanDesc scan;
+		Oid			constraintOid = lfirst_oid(lc);
+
+		/* Move the constraint from the old to the new index */
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		conForm = ((Form_pg_constraint) GETSTRUCT(constraintTuple));
+
+		if (conForm->conindid == oldIndexId)
+		{
+			conForm->conindid = newIndexId;
+
+			CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+		}
+
+		heap_freetuple(constraintTuple);
+
+		/* Search for trigger records */
+		ScanKeyInit(&key[0],
+					Anum_pg_trigger_tgconstraint,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(constraintOid));
+
+		scan = systable_beginscan(pg_trigger, TriggerConstraintIndexId, true,
+								  NULL, 1, key);
+
+		while (HeapTupleIsValid((triggerTuple = systable_getnext(scan))))
+		{
+			Form_pg_trigger tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			if (tgForm->tgconstrindid != oldIndexId)
+				continue;
+
+			/* Make a modifiable copy */
+			triggerTuple = heap_copytuple(triggerTuple);
+			tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			tgForm->tgconstrindid = newIndexId;
+
+			CatalogTupleUpdate(pg_trigger, &triggerTuple->t_self, triggerTuple);
+
+			heap_freetuple(triggerTuple);
+		}
+
+		systable_endscan(scan);
+	}
+
+	/*
+	 * Move all dependencies on the old index to the new
+	 */
+
+	if (OidIsValid(indexConstraintOid))
+	{
+		ObjectAddress myself,
+					referenced;
+
+		/* Change to having the new index depend on the constraint */
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexId,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexId;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = indexConstraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
+
+	/* Close relations */
+	heap_close(pg_class, RowExclusiveLock);
+	heap_close(pg_index, RowExclusiveLock);
+	heap_close(pg_constraint, RowExclusiveLock);
+	heap_close(pg_trigger, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldClassRel, NoLock);
+	relation_close(newClassRel, NoLock);
+}
+
+/*
+ * index_concurrently_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all the backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction calling this
+ * function.
+ */
+void
+index_concurrently_set_dead(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRelation,
+				indexRelation;
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're about
+	 * to stop doing inserts into the index which could show conflicts with
+	 * existing predicate locks, so now is the time to move them to the heap
+	 * relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just might
+	 * have it open for updating it.  So now we can unset indisready and
+	 * indislive, then wait till nobody could be using it at all anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit all
+	 * sessions will refresh the table's index list.  Forgetting just the
+	 * index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrently_drop
+ *
+ * Drop a single index concurrently, as the last step of concurrent index
+ * processing. Deletion must go through performDeletion, or dependencies of
+ * the index would not get dropped. At this point the index is already
+ * considered invalid and dead, so it can be dropped without any concurrent
+ * option, as it is certain that it will not interact with other server
+ * sessions.
+ */
+void
+index_concurrently_drop(Oid indexId)
+{
+	Oid			constraintOid = get_index_constraint(indexId);
+	ObjectAddress object;
+	Form_pg_index indexForm;
+	Relation	pg_index;
+	HeapTuple	indexTuple;
+
+	/*
+	 * Check that the index being dropped is not live; if it were, it might
+	 * still be in use by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexId);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexId);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * The index is certainly dead, so begin the drop process. Register the
+	 * constraint or the index itself for deletion.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexId;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object, DROP_RESTRICT, 0);
+}
+
 /*
  * index_constraint_create
  *
@@ -1592,36 +2045,8 @@ index_drop(Oid indexId, bool concurrent)
 		 */
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrently_set_dead(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index fde7e170be..9abfb21e96 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -376,6 +376,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 	return count;
 }
 
+/*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+					 Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = heap_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't redirect
+	 * dependencies away from pinned system objects, which would likely be a
+	 * bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot remove dependency on %s because it is a system object",
+						getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	heap_close(depRel, RowExclusiveLock);
+
+	return count;
+}
+
 /*
  * isObjectPinned()
  *
@@ -735,3 +823,58 @@ get_index_constraint(Oid indexId)
 
 	return constraintId;
 }
+
+/*
+ * get_index_ref_constraints
+ *		Given the OID of an index, return the OID of all foreign key
+ *		constraints which reference the index.
+ */
+List *
+get_index_ref_constraints(Oid indexId)
+{
+	List	   *result = NIL;
+	Relation	depRel;
+	ScanKeyData key[3];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	/* Search the dependency table for the index */
+	depRel = heap_open(DependRelationId, AccessShareLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexId));
+	ScanKeyInit(&key[2],
+				Anum_pg_depend_refobjsubid,
+				BTEqualStrategyNumber, F_INT4EQ,
+				Int32GetDatum(0));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 3, key);
+
+	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+	{
+		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+
+		/*
+		 * We assume any normal dependency from a constraint must be what we
+		 * are looking for.
+		 */
+		if (deprec->classid == ConstraintRelationId &&
+			deprec->objsubid == 0 &&
+			deprec->deptype == DEPENDENCY_NORMAL)
+		{
+			result = lappend_oid(result, deprec->objid);
+		}
+	}
+
+	systable_endscan(scan);
+	heap_close(depRel, AccessShareLock);
+
+	return result;
+}
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 052a0a1305..99aba264fc 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -335,7 +335,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 list_make2("chunk_id", "chunk_seq"),
 				 BTREE_AM_OID,
 				 rel->rd_rel->reltablespace,
-				 collationObjectId, classObjectId, coloptions, (Datum) 0,
+				 collationObjectId, classObjectId, coloptions, (Datum) 0, NULL,
 				 INDEX_CREATE_IS_PRIMARY, 0, true, true, NULL);
 
 	heap_close(toast_rel, NoLock);
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index d263903622..fb4826d61f 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -58,6 +58,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/partcache.h"
+#include "utils/pg_rusage.h"
 #include "utils/regproc.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -84,6 +85,7 @@ static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 static void ReindexPartitionedIndex(Relation parentIdx);
 
 /*
@@ -298,6 +300,90 @@ CheckIndexCompatible(Oid oldId,
 	return ret;
 }
 
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given xmin
+ * limit, because the index built in this process might not contain tuples
+ * deleted just before the reference snapshot was taken. Obtain a list of
+ * VXIDs of such transactions, and wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int			i,
+				n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue;			/* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int			n_newer_snapshots;
+			int			j;
+			int			k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue;	/* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
 /*
  * DefineIndex
  *		Creates a new index.
@@ -346,7 +432,6 @@ DefineIndex(Oid relationId,
 	List	   *indexColNames;
 	List	   *allIndexParams;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -361,9 +446,7 @@ DefineIndex(Oid relationId,
 	int			numberOfAttributes;
 	int			numberOfKeyAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -856,7 +939,7 @@ DefineIndex(Oid relationId,
 					 stmt->oldNode, indexInfo, indexColNames,
 					 accessMethodId, tablespaceId,
 					 collationObjectId, classObjectId,
-					 coloptions, reloptions,
+					 coloptions, reloptions, NULL,
 					 flags, constr_flags,
 					 allowSystemTableMods, !check_rights,
 					 &createdConstraintId);
@@ -1152,34 +1235,15 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_open(relationId, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false, true);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_concurrently_build(RangeVarGetRelid(stmt->relation,
+											  ShareUpdateExclusiveLock,
+											  false),
+							 indexRelationId,
+							 stmt->primary);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -1251,74 +1315,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots) /* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -2204,7 +2203,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 void
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -2216,7 +2215,8 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
 									  0,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
@@ -2236,7 +2236,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 }
 
 /*
@@ -2304,18 +2307,26 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, 0,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   0,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -2333,7 +2344,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -2345,6 +2356,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	List	   *relids = NIL;
 	ListCell   *l;
 	int			num_keys;
+	bool		concurrent_warning = false;
 
 	AssertArg(objectName);
 	Assert(objectKind == REINDEX_OBJECT_SCHEMA ||
@@ -2453,6 +2465,20 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 			!pg_class_ownercheck(relid, GetUserId()))
 			continue;
 
+		/*
+		 * Skip system tables, which index_create() would refuse to index
+		 * concurrently.
+		 */
+		if (concurrent && IsSystemNamespace(get_rel_namespace(relid)))
+		{
+			if (!concurrent_warning)
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("concurrent reindex is not supported for catalog relations, skipping all")));
+			concurrent_warning = true;
+			continue;
+		}
+
 		/* Save the list of relation OIDs in private context */
 		old = MemoryContextSwitchTo(private_context);
 
@@ -2479,26 +2505,663 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
 
-			if (options & REINDEXOPT_VERBOSE)
+		if (concurrent)
+		{
+			result = ReindexRelationConcurrently(relid, options);
+			/* ReindexRelationConcurrently() does the verbose output */
+
+			PushActiveSnapshot(GetTransactionSnapshot());
+		}
+		else
+		{
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+			if (result && (options & REINDEXOPT_VERBOSE))
 				ereport(INFO,
 						(errmsg("table \"%s.%s\" was reindexed",
 								get_namespace_name(get_rel_namespace(relid)),
 								get_rel_name(relid))));
+		}
+
+		PopActiveSnapshot();
+		CommitTransactionCommand();
+	}
+	StartTransactionCommand();
+
+	MemoryContextDelete(private_context);
+}
+
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for the given relation OID. The relation can
+ * be either an index or a table. If a table is specified, each phase is
+ * processed one at a time for all of the table's indexes, as well as the
+ * indexes of its toast relation if one is defined.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *parentRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *newIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc,
+			   *lc2;
+	MemoryContext private_context;
+	MemoryContext old;
+	char		relkind;
+	char	   *relationName = NULL;
+	char	   *relationNamespace = NULL;
+	PGRUsage	ru0;
+
+	/*
+	 * Create a memory context that will survive forced transaction commits we
+	 * do below.  Since it is a child of PortalContext, it will go away
+	 * eventually even if we suffer an error; there's no need for special
+	 * abort cleanup logic.
+	 */
+	private_context = AllocSetContextCreate(PortalContext,
+											"ReindexConcurrent",
+											ALLOCSET_SMALL_SIZES);
+
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		/* Save data needed by REINDEX VERBOSE in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		relationName = get_rel_name(relationOid);
+		relationNamespace = get_namespace_name(get_rel_namespace(relationOid));
+
+		pg_rusage_init(&ru0);
+
+		MemoryContextSwitchTo(old);
+	}
+
+	relkind = get_rel_relkind(relationOid);
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation OID given by the caller. If the relkind of the given relation
+	 * OID is a table, all its valid indexes will be rebuilt, including the
+	 * indexes of its associated toast table. If the relkind is an index, the
+	 * index itself will be rebuilt. The locks taken on parent relations and
+	 * involved indexes are kept until this transaction is committed, to
+	 * protect against schema changes that might occur before the session
+	 * lock is taken on each relation; the session locks similarly protect
+	 * against schema changes across the multiple transactions used during
+	 * this process.
+	 */
+	switch (relkind)
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes including
+				 * toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				MemoryContextSwitchTo(old);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+														   ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+					{
+						/* Save the list of relation OIDs in private context */
+						old = MemoryContextSwitchTo(private_context);
+
+						indexIds = lappend_oid(indexIds, cellOid);
+
+						MemoryContextSwitchTo(old);
+					}
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+														  ShareUpdateExclusiveLock);
+
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					MemoryContextSwitchTo(old);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+															   ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+						{
+							/*
+							 * Save the list of relation OIDs in private
+							 * context
+							 */
+							old = MemoryContextSwitchTo(private_context);
+
+							indexIds = lappend_oid(indexIds, cellOid);
+
+							MemoryContextSwitchTo(old);
+						}
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its OID to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			parentOid = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(parentOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(parentOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(parentOid);
+
+				MemoryContextSwitchTo(old);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					indexIds = lappend_oid(indexIds, relationOid);
+
+					MemoryContextSwitchTo(old);
+				}
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		case RELKIND_PARTITIONED_TABLE:
+			/* see reindex_relation() */
+			ereport(WARNING,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("REINDEX of partitioned tables is not yet implemented, skipping \"%s\"",
+							get_rel_name(relationOid))));
+			return false;
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We need first to create an index which is based on the same data as the
+	 * former index except that it will be only registered in catalogs and
+	 * will be built later. It is possible to perform all the operations on
+	 * all the indexes at the same time for a parent relation including
+	 * indexes for its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId	lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent, which may be a plain or a toast relation */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a temporary relation name for the new index */
+		concurrentName = ChooseRelationName(get_rel_name(indOid),
+											NULL,
+											"ccnew",
+											get_rel_namespace(indexRel->rd_index->indrelid),
+											false);
+
+		/* Create new index definition based on given index */
+		concurrentOid = index_concurrently_create_copy(indexParentRel,
+													   indOid,
+													   concurrentName);
+
+		/* Now open the relation of the new index, a lock is also needed on it */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the list of oids and locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Save the new index Oid */
+		newIndexIds = lappend_oid(newIndexIds, concurrentOid);
+
+		/*
+		 * Save lockrelid to protect each relation from drop then close
+		 * relations. The lockrelid on parent relation is not taken here to
+		 * avoid multiple locks taken on the same relation, instead we rely on
+		 * parentRelationIds built earlier.
+		 */
+		lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, &lockrelid);
+		lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, &lockrelid);
+
+		MemoryContextSwitchTo(old);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the following visibility checks, as other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId  *lockrelid;
+		LOCKTAG    *heaplocktag;
+
+		/* Save the list of locks in the private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/*
+		 * Add the lockrelid of the parent relation to the list of locked
+		 * relations. A palloc'd copy is stored, as the list outlives this
+		 * loop iteration.
+		 */
+		lockrelid = (LockRelId *) palloc(sizeof(LockRelId));
+		*lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, lockrelid);
+
+		heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid->dbId, lockrelid->relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		MemoryContextSwitchTo(old);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transactions will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session level lock on the relation, the
+	 * concurrent index and its copy to ensure that none of them are dropped
+	 * until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the new indexes in a separate transaction for each index to avoid
+	 * having open transactions for an unnecessary long time. A concurrent
+	 * build is done for each index that will replace the old indexes. Before
+	 * doing that, we need to wait on the parent relations until no running
+	 * transactions could have the parent table of index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			heapOid;
+		bool		primary;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index's concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * The index relation has been closed by the previous commit, so
+		 * reopen it to fetch its parent heap OID and determine whether it is
+		 * a primary key. Grab both values before closing the index again, as
+		 * the relcache entry must not be dereferenced after index_close().
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		heapOid = indexRel->rd_index->indrelid;
+		primary = indexRel->rd_index->indisprimary;
+		index_close(indexRel, NoLock);
+
+		/* Perform concurrent build of new index */
+		index_concurrently_build(heapOid, concurrentOid, primary);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update visible for
+		 * concurrent index.
+		 */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the new indexes catch up with any tuples that were
+	 * created during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Scan the heap for each new index, then insert any missing index
+	 * entries.
+	 */
+	foreach(lc, newIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		TransactionId limitXmin;
+		Snapshot	snapshot;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for validating the
+		 * new index.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate the index, which might be on a toast table */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
 		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * The new index is now valid in the sense that it contains all
+		 * currently interesting tuples. However, it might not contain tuples
+		 * deleted just before the reference snapshot was taken, so we must
+		 * wait out transactions that might have older snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction now that the new index is valid */
 		CommitTransactionCommand();
 	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the new indexes have been validated, it is necessary to swap
+	 * each new index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes dead at the same
+	 * time to make sure we only get constraint violations from the indexes
+	 * with the correct names.
+	 */
+
+	StartTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		char	   *oldName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseRelationName(get_rel_name(indOid),
+									 NULL,
+									 "ccold",
+									 get_rel_namespace(relOid),
+									 false);
+
+		/* Swap old index with the new one */
+		index_concurrently_swap(concurrentOid, indOid, oldName);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * CCI here so that subsequent iterations see the oldName in the
+		 * catalog and can choose a nonconflicting name for their oldName.
+		 * Otherwise, this could lead to conflicts if a table has two indexes
+		 * whose names are equal for the first NAMEDATALEN-minus-a-few
+		 * characters.
+		 */
+		CommandCounterIncrement();
+	}
+
+	/* Commit this transaction and make index swaps visible */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * Mark the old indexes as dead so they can later be dropped.
+	 *
+	 * Note that it is necessary to wait for virtual locks on the parent
+	 * relation before setting the index as dead.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Finish the index invalidation and set it as dead. */
+		index_concurrently_set_dead(relOid, indOid);
+	}
+
+	/* Commit this transaction to make the updates visible. */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the old indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe because all the old indexes are already
+	 * marked as invalid and not ready, so no other backend will use them
+	 * for any read or write operation.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	/* Get fresh snapshot for next step */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+
+		CHECK_FOR_INTERRUPTS();
+
+		index_concurrently_drop(indOid);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Finally, release the session-level locks taken on the relations.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Log what we did */
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		if (relkind == RELKIND_INDEX)
+			ereport(INFO,
+					(errmsg("index \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+		else
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+	}
+
+	/* Start a new transaction to finish the process properly */
 	StartTransactionCommand();
 
 	MemoryContextDelete(private_context);
+
+	return true;
 }
 
 /*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index e1af2c4495..90de2848e3 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1245,6 +1245,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	bool		is_partition;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1307,7 +1308,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(get_rel_relkind(relOid)),
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check the case of a system index that might have been invalidated by a
+	 * failed concurrent process and allow its drop. For the time being, this
+	 * only concerns indexes of toast relations that became invalid during a
+	 * REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 006a3d1772..f7f8e040a3 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4352,6 +4352,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 133df1b364..880d4a0bdb 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2097,6 +2097,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c086235b25..74a87ef1fc 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -8312,42 +8312,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 27ae6be751..140a67e548 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -774,16 +774,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventInTransactionBlock(isTopLevel,
+											  "REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -799,7 +803,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												  (stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												  (stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												  "REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index b11d7ac6ce..8297bf9aa5 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -2149,6 +2149,22 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY are not allowed in
+			 * xacts.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index bca788c7a3..f0829536a2 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3197,12 +3197,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("REINDEX"))
 		COMPLETE_WITH("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+	else if (Matches("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 5dea27016e..24b47d4fc6 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -251,6 +251,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+								 Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, char deptype, Oid *tableId, int32 *colId);
@@ -261,6 +264,8 @@ extern Oid	get_constraint_index(Oid constraintId);
 
 extern Oid	get_index_constraint(Oid indexId);
 
+extern List *get_index_ref_constraints(Oid indexId);
+
 /* in pg_shdepend.c */
 
 extern void recordSharedDependencyOn(ObjectAddress *depender,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 0f1f63b38e..eeed407943 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -65,6 +65,7 @@ extern Oid index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bits16 flags,
 			 bits16 constr_flags,
 			 bool allow_system_table_mods,
@@ -77,6 +78,23 @@ extern Oid index_create(Relation heapRelation,
 #define	INDEX_CONSTR_CREATE_UPDATE_INDEX	(1 << 3)
 #define	INDEX_CONSTR_CREATE_REMOVE_OLD_DEPS	(1 << 4)
 
+extern Oid index_concurrently_create_copy(Relation heapRelation,
+										  Oid oldIndexId,
+										  const char *newName);
+
+extern void index_concurrently_build(Oid heapOid,
+									 Oid indexOid,
+									 bool isprimary);
+
+extern void index_concurrently_swap(Oid newIndexId,
+									Oid oldIndexId,
+									const char *oldName);
+
+extern void index_concurrently_set_dead(Oid heapOid,
+										Oid indexOid);
+
+extern void index_concurrently_drop(Oid indexId);
+
 extern ObjectAddress index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
 						Oid parentConstraintId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index e592a914a4..e11caf2cd1 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -34,10 +34,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_not_in_use,
 			bool skip_build,
 			bool quiet);
-extern void ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern void ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 27782fed6c..4bfe4ce05b 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3303,6 +3303,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 91d9d90135..e32886bacb 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -42,6 +42,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 46deb55c67..a5e382bf28 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3292,3 +3292,73 @@ DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
 NOTICE:  drop cascades to 6 other objects
+RESET client_min_messages;
+RESET search_path;
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+ERROR:  conflicting key value violates exclusion constraint "concur_reindex_tab_c3_excl"
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  can only reindex the currently open database
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+WARNING:  concurrent reindex is not supported for catalog relations, skipping all
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+          Table "public.concur_reindex_tab"
+ Column |   Type    | Collation | Nullable | Default 
+--------+-----------+-----------+----------+---------
+ c1     | integer   |           | not null | 
+ c2     | text      |           |          | 
+ c3     | int4range |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+    "concur_reindex_tab_c3_excl" EXCLUDE USING gist (c3 WITH &&)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 59da6b6592..9f13e718a1 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1207,3 +1207,54 @@ CREATE ROLE regress_reindexuser NOLOGIN;
 DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
+RESET client_min_messages;
+RESET search_path;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;

base-commit: 68a13f28bebc9eb70cc6988bfa2daaf4500f519f
-- 
2.20.1

#108Sergei Kornilov
In reply to: Peter Eisentraut (#107)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Hello
Thank you! I reviewed the new patch version. It applies, builds, and passes tests. The code looks good, but I noticed some new behavior:

postgres=# reindex (verbose) table CONCURRENTLY measurement ;
WARNING: REINDEX of partitioned tables is not yet implemented, skipping "measurement"
NOTICE: table "measurement" has no indexes
REINDEX
postgres=# \d measurement
       Partitioned table "public.measurement"
  Column   |  Type   | Collation | Nullable | Default
-----------+---------+-----------+----------+---------
 city_id   | integer |           | not null |
 logdate   | date    |           | not null |
 peaktemp  | integer |           |          |
 unitsales | integer |           |          |
Partition key: RANGE (logdate)
Indexes:
    "measurement_logdate_idx" btree (logdate)
Number of partitions: 0

NOTICE seems unnecessary here.

Unfortunately, concurrent reindex loses comments; reproducer:

create table testcomment (i int);
create index testcomment_idx1 on testcomment (i);
comment on index testcomment_idx1 IS 'test comment';
\di+ testcomment_idx1
reindex table testcomment ;
\di+ testcomment_idx1 # ok
reindex table CONCURRENTLY testcomment ;
\di+ testcomment_idx1 # we lose comment

Also, I think we need to change REINDEX to "<command>REINDEX</command> (without <option>CONCURRENTLY</option>)" in the ACCESS EXCLUSIVE section of the table-level lock modes documentation (to be consistent with the REFRESH MATERIALIZED VIEW and CREATE INDEX descriptions).

About reindexing invalid indexes - I found a good question in the archives [1]: what about toast indexes?
I checked this now; I am able to drop an invalid toast index, but I cannot drop a redundant valid one.
Reproduce:
session 1: begin; select from test_toast ... for update;
session 2: reindex table CONCURRENTLY test_toast ;
session 2: interrupt by ctrl+C
session 1: commit
session 2: reindex table test_toast ;
and now we have two toast indexes. DROP INDEX is able to remove only the invalid one; the valid index gives "ERROR: permission denied: "pg_toast_16426_index_ccnew" is a system catalog"
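The leftover index can be spotted from the catalogs; a query along these lines (a sketch, using the relation name from the reproducer above) lists the toast relation's indexes together with their validity flag:

```sql
-- List the indexes on test_toast's toast table and whether each is valid.
-- 'test_toast' is the table name from the reproducer above; adjust as needed.
SELECT i.indexrelid::regclass AS index_name,
       i.indisvalid
  FROM pg_index i
 WHERE i.indrelid = (SELECT reltoastrelid
                       FROM pg_class
                      WHERE relname = 'test_toast');
```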

About syntax: I vote for the current syntax "reindex table CONCURRENTLY tablename". This looks consistent with the existing CREATE INDEX CONCURRENTLY and REFRESH MATERIALIZED VIEW CONCURRENTLY.

regards, Sergei

[1]: /messages/by-id/CAB7nPqT+6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw@mail.gmail.com

#109Pavel Stehule
pavel.stehule@gmail.com
In reply to: Sergei Kornilov (#108)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

About syntax: I vote for the current syntax "reindex table CONCURRENTLY
tablename". This looks consistent with the existing CREATE INDEX CONCURRENTLY
and REFRESH MATERIALIZED VIEW CONCURRENTLY.

+1

Pavel


regards, Sergei

[1]:
/messages/by-id/CAB7nPqT+6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw@mail.gmail.com

#110Michael Paquier
michael@paquier.xyz
In reply to: Sergei Kornilov (#108)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote:

NOTICE seems unnecessary here.

Unfortunately, concurrent reindex loses comments; reproducer:

Yes, the NOTICE message makes little sense.

I am getting back in touch with this stuff. It has been some time, but
the core of the patch has not actually changed in its base concept, so
I am still very familiar with it as the original author. There are
even typos I may have introduced a couple of years back, like
"contraint". I have not yet spent much time on it, but at a quick
glance there are a bunch of things that could be retouched to get
pieces of it committable.

+    The concurrent index created during the processing has a name ending in
+    the suffix ccnew, or ccold if it is an old index definiton which we failed
+    to drop. Invalid indexes can be dropped using <literal>DROP INDEX</literal>,
+    including invalid toast indexes.
This needs <literal> markups for "ccnew" and "ccold".  "definiton" is
not correct.

index_create does not actually need its extra argument with the tuple
descriptor. I think that we had better grab the column name list from
indexInfo and just pass that down to index_create() (patched on my
local branch); it is overkill to take a full copy of the index's
TupleDesc.

The patch, standing as-is, is close to 2k lines long, so let's first cut
it into more pieces by refactoring the concurrent build code.
Here are some preliminary notes:
- WaitForOlderSnapshots() could be in its own patch.
- index_concurrently_build() and index_concurrently_set_dead() can be
in an independent patch. set_dead() had actually better be a wrapper on
top of index_set_state_flags, which is able to set any kind of
flag.
- A couple of pieces in index_create() could be cut as well.

I can send patches for those things as first steps in this commit
fest, and commit them as needed. This way, we reduce the size of the
main patch. Even if the main portion does not get into v12, we'd still
have base pieces to build on next.

Regarding the grammar, for the last couple of years we have tended to
avoid complicating the main grammar and moved to parenthesized option
lists (see VACUUM, ANALYZE, EXPLAIN, etc). So in the same vein I think
that it would make sense to only support CONCURRENTLY within
parentheses and just plug it in alongside the VERBOSE option.
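To make the two proposals concrete, here is a sketch of both forms (the parenthesized CONCURRENTLY variant is hypothetical and not implemented by the attached patches):

```sql
-- Grammar in the attached patches: CONCURRENTLY as a bare keyword,
-- mirroring CREATE INDEX CONCURRENTLY.
REINDEX (VERBOSE) TABLE CONCURRENTLY concur_reindex_tab;

-- Hypothetical parenthesized form, in the style of VACUUM/ANALYZE/EXPLAIN:
REINDEX (VERBOSE, CONCURRENTLY) TABLE concur_reindex_tab;
```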

Does somebody mind if I jump into the ship after so long? I was the
original author of the monster after all...
--
Michael

#111Andreas Karlsson
andreas@proxel.se
In reply to: Michael Paquier (#110)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 1/16/19 9:27 AM, Michael Paquier wrote:

Regarding the grammar, we tend for the last couple of years to avoid

complicating the main grammar and move on to parenthesized grammars
(see VACUUM, ANALYZE, EXPLAIN, etc). So in the same vein I think that
it would make sense to only support CONCURRENTLY within parenthesis
and just plugin that with the VERBOSE option.

Personally I do not care, but there have been a lot of voices for
keeping REINDEX CONCURRENTLY consistent with CREATE INDEX CONCURRENTLY
and DROP INDEX CONCURRENTLY.

Does somebody mind if I jump into the ship after so long? I was the
original author of the monster after all...

Fine by me. Peter?

Andreas

#112Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#110)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 2019-Jan-16, Michael Paquier wrote:

Regarding the grammar, we tend for the last couple of years to avoid
complicating the main grammar and move on to parenthesized grammars
(see VACUUM, ANALYZE, EXPLAIN, etc). So in the same vein I think that
it would make sense to only support CONCURRENTLY within parenthesis
and just plugin that with the VERBOSE option.

That's my opinion too, but I was outvoted in another subthread -- see
/messages/by-id/20181214144529.wvmjwmy7wxgmgyb3@alvherre.pgsql
Stephen Frost, Andrew Gierth and Andres Freund all voted to put
CONCURRENTLY outside the parens. It seems we now have three votes to
put it *in* the parens (you, Peter Eisentraut, me). I guess more votes
are needed to settle this issue.

My opinion is that if we had had parenthesized options lists back when
CREATE INDEX CONCURRENTLY was invented, we would have put it there.
But we were young and needed the money ...

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#113Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#112)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Wed, Jan 16, 2019 at 02:59:31PM -0300, Alvaro Herrera wrote:

That's my opinion too, but I was outvoted in another subthread -- see
/messages/by-id/20181214144529.wvmjwmy7wxgmgyb3@alvherre.pgsql
Stephen Frost, Andrew Gierth and Andres Freund all voted to put
CONCURRENTLY outside the parens. It seems we now have three votes to
put it *in* the parens (you, Peter Eisentraut, me). I guess more votes
are needed to settle this issue.

Sure, let's see. I would have been in the crowd of not using the
parenthesized grammar five years ago, but our recent dealings with
other commands worry me, as we would repeat the same mistakes.

My opinion is that if we had had parenthesized options lists back when
CREATE INDEX CONCURRENTLY was invented, we would have put it there.
But we were young and needed the money ...

:)
--
Michael

#114Michael Paquier
michael@paquier.xyz
In reply to: Andreas Karlsson (#111)
3 attachment(s)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Wed, Jan 16, 2019 at 05:56:15PM +0100, Andreas Karlsson wrote:

On 1/16/19 9:27 AM, Michael Paquier wrote:

Does somebody mind if I jump into the ship after so long? I was the
original author of the monster after all...

Fine by me. Peter?

Okay, I have begun digging into the patch, and extracted for now two
things which can be refactored first, giving a total of three patches:
- 0001, which moves WaitForOlderSnapshots into snapmgr.c. I think
that this can be useful for external extensions that need a process to
wait for snapshots older than a minimum threshold held by other
transactions.
- 0002, which moves the concurrent index build into its own routine,
index_build_concurrent(). At the same time, index_build() has an
isprimary argument which is not used, so let's remove it. This
simplifies the refactoring a bit as well.
- 0003 is the core patch, realigned with the rest, fixing some typos I
found on the way.

Here are also some notes on things I am planning to look at in a
second pass:
- The concurrent drop (phase 5) part still shares a lot with DROP
INDEX CONCURRENTLY, and I think that we had better refactor the code
further so that REINDEX CONCURRENTLY shares more with DROP INDEX. One
thing which I think is incorrect is that we do not clear the invalid
flag of the dropped index before marking it as dead. This looks like
a missing piece from another concurrent-related bug fix, lost over
the rebases this patch went through.
- set_dead could be refactored so that it is able to handle multiple
indexes as input, using WaitForLockersMultiple(). This way CREATE
INDEX CONCURRENTLY could also use it.
- There are no regression tests for partitioned tables.
- The NOTICE messages showing up when a table has no indexes should be
removed.
- index_create() does not really need a TupleDesc argument, as long as
the caller is able to provide a list of column names.
- At the end of the day, I think that it would be nice to reach a
state where we have a set of low-level routines like
index_build_concurrent and index_set_dead_concurrent which are used
by both CONCURRENTLY paths and can be called for each phase within a
given transaction. Those pieces could also be helpful to implement,
for example, an extension able to do concurrent reindexing out of
core.

I think that the refactorings in 0001 and 0002 are committable as-is,
and this shaves some code from the core patch.

Thoughts?
--
Michael

Attachments:

0001-Refactor-code-to-wait-for-older-snapshots-into-its-o.patchtext/x-diff; charset=us-asciiDownload
From 2d871385cdecd2b860f5eeb2615d8fcf9e866e54 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Thu, 17 Jan 2019 11:54:42 +0900
Subject: [PATCH 1/3] Refactor code to wait for older snapshots into its own
 routine

This is being used by CREATE INDEX CONCURRENTLY to make sure that valid
indexes are marked as such after waiting for all transactions using
snapshots older than the reference snapshot used for concurrent index
validation are gone.  This piece is useful independently, and can be
used by REINDEX CONCURRENTLY.
---
 src/backend/commands/indexcmds.c | 71 +---------------------------
 src/backend/utils/time/snapmgr.c | 81 ++++++++++++++++++++++++++++++++
 src/include/utils/snapmgr.h      |  2 +
 3 files changed, 85 insertions(+), 69 deletions(-)

diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 1959e8a82e..a81e656059 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -362,9 +362,7 @@ DefineIndex(Oid relationId,
 	int			numberOfAttributes;
 	int			numberOfKeyAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -1252,74 +1250,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots) /* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index f93b37b9c9..1dc3c162db 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -1990,6 +1990,87 @@ MaintainOldSnapshotTimeMapping(TimestampTz whenTaken, TransactionId xmin)
 	LWLockRelease(OldSnapshotTimeMapLock);
 }
 
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given xmin
+ * limit, because it might not contain tuples deleted just before it has
+ * been taken. Obtain a list of VXIDs of such transactions, and wait for them
+ * individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int			i,
+				n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue;			/* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int			n_newer_snapshots;
+			int			j;
+			int			k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue;	/* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
 
 /*
  * Setup a snapshot that replaces normal catalog snapshots that allows catalog
diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index f8308e6925..41c6b908d6 100644
--- a/src/include/utils/snapmgr.h
+++ b/src/include/utils/snapmgr.h
@@ -98,6 +98,8 @@ extern void MaintainOldSnapshotTimeMapping(TimestampTz whenTaken,
 
 extern char *ExportSnapshot(Snapshot snapshot);
 
+extern void WaitForOlderSnapshots(TransactionId limitXmin);
+
 /* Support for catalog timetravel for logical decoding */
 struct HTAB;
 extern struct HTAB *HistoricSnapshotGetTupleCids(void);
-- 
2.20.1

0002-Refactor-index-concurrent-build-into-its-own-routine.patchtext/x-diff; charset=us-asciiDownload
From b4dc21e928cd38f6e641144f08c60ccd0b2d0d4e Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Thu, 17 Jan 2019 12:25:26 +0900
Subject: [PATCH 2/3] Refactor index concurrent build into its own routine

This is used by CREATE INDEX CONCURRENTLY, and finds its uses for other
concurrent-safe operations as the index building step happens in an
independent transaction, like REINDEX CONCURRENTLY.

This simplifies at the same time index_build() which has been including
a flag to track primary indexes, but this was not used.  This is
removed, simplifying this refactoring on the way.
---
 src/backend/bootstrap/bootstrap.c |  2 +-
 src/backend/catalog/heap.c        |  2 +-
 src/backend/catalog/index.c       | 58 ++++++++++++++++++++++++++++---
 src/backend/commands/indexcmds.c  | 28 ++-------------
 src/include/catalog/index.h       |  3 +-
 5 files changed, 60 insertions(+), 33 deletions(-)

diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 63bb134949..a3242cae50 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -1131,7 +1131,7 @@ build_indices(void)
 		heap = heap_open(ILHead->il_heap, NoLock);
 		ind = index_open(ILHead->il_ind, NoLock);
 
-		index_build(heap, ind, ILHead->il_info, false, false, false);
+		index_build(heap, ind, ILHead->il_info, false, false);
 
 		index_close(ind, NoLock);
 		heap_close(heap, NoLock);
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index d7ccf2bfbe..f1364d29de 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -3063,7 +3063,7 @@ RelationTruncateIndexes(Relation heapRelation)
 
 		/* Initialize the index and rebuild */
 		/* Note: we do not need to re-establish pkey setting */
-		index_build(heapRelation, currentIndex, indexInfo, false, true, false);
+		index_build(heapRelation, currentIndex, indexInfo, true, false);
 
 		/* We're done with this index */
 		index_close(currentIndex, NoLock);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8701e3a791..2447952f59 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1190,8 +1190,7 @@ index_create(Relation heapRelation,
 	}
 	else
 	{
-		index_build(heapRelation, indexRelation, indexInfo, isprimary, false,
-					true);
+		index_build(heapRelation, indexRelation, indexInfo, false, true);
 	}
 
 	/*
@@ -2236,7 +2235,6 @@ void
 index_build(Relation heapRelation,
 			Relation indexRelation,
 			IndexInfo *indexInfo,
-			bool isprimary,
 			bool isreindex,
 			bool parallel)
 {
@@ -2398,6 +2396,58 @@ index_build(Relation heapRelation,
 }
 
 
+/*
+ * index_build_concurrent - build index for a concurrent operation.
+ *
+ * Low-level locks are taken when this operation is performed to prevent
+ * only schema changes, but they need to be kept until the end of the
+ * transaction performing this operation.  'indexOid' refers to an index
+ * relation OID already created as part of previous processing, and
+ * 'heapOid' refers to its parent heap relation.
+ */
+void
+index_build_concurrent(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRel,
+				indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* This had better make sure that a snapshot is active */
+	Assert(ActiveSnapshotSet());
+
+	/* Open and lock the parent heap relation */
+	heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in the
+	 * commit of the transaction where this concurrent index was created at
+	 * the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, false, true);
+
+	/* Close both relations, and keep the locks */
+	heap_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts.  Once
+	 * we commit this transaction, any new transactions that open the table
+	 * must insert new entries into the index for insertions and non-HOT
+	 * updates.
+	 */
+	index_set_state_flags(indexOid, INDEX_CREATE_SET_READY);
+}
+
+
 /*
  * IndexBuildHeapScan - scan the heap relation to find tuples to be indexed
  *
@@ -3703,7 +3753,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 
 		/* Initialize the index and rebuild */
 		/* Note: we do not need to re-establish pkey setting */
-		index_build(heapRelation, iRel, indexInfo, false, true, true);
+		index_build(heapRelation, iRel, indexInfo, true, true);
 	}
 	PG_CATCH();
 	{
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index a81e656059..930d4d9880 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -347,7 +347,6 @@ DefineIndex(Oid relationId,
 	List	   *indexColNames;
 	List	   *allIndexParams;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -1151,34 +1150,11 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = heap_open(relationId, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, stmt->primary, false, true);
-
-	/* Close both the relations, but keep the locks */
-	heap_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_build_concurrent(relationId, indexRelationId);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 8daac5663c..129c00fd0c 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -107,10 +107,11 @@ extern void FormIndexDatum(IndexInfo *indexInfo,
 extern void index_build(Relation heapRelation,
 			Relation indexRelation,
 			IndexInfo *indexInfo,
-			bool isprimary,
 			bool isreindex,
 			bool parallel);
 
+extern void index_build_concurrent(Oid heapOid, Oid indexOid);
+
 extern double IndexBuildHeapScan(Relation heapRelation,
 				   Relation indexRelation,
 				   IndexInfo *indexInfo,
-- 
2.20.1

0003-Core-patch-for-REINDEX-CONCURRENTLY.patchtext/x-diff; charset=us-asciiDownload
From f29fa50414c4ab972850abaf38e2d5f745b5da6e Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Thu, 17 Jan 2019 13:59:22 +0900
Subject: [PATCH 3/3] Core patch for REINDEX CONCURRENTLY

---
 doc/src/sgml/mvcc.sgml                        |   1 +
 doc/src/sgml/ref/reindex.sgml                 | 185 ++++-
 src/backend/catalog/index.c                   | 448 ++++++++++-
 src/backend/catalog/pg_depend.c               | 143 ++++
 src/backend/commands/indexcmds.c              | 694 +++++++++++++++++-
 src/backend/commands/tablecmds.c              |  32 +-
 src/backend/nodes/copyfuncs.c                 |   1 +
 src/backend/nodes/equalfuncs.c                |   1 +
 src/backend/parser/gram.y                     |  22 +-
 src/backend/tcop/utility.c                    |  10 +-
 src/bin/psql/common.c                         |  16 +
 src/bin/psql/tab-complete.c                   |  18 +-
 src/include/catalog/dependency.h              |   5 +
 src/include/catalog/index.h                   |  13 +
 src/include/commands/defrem.h                 |   6 +-
 src/include/nodes/parsenodes.h                |   1 +
 .../expected/reindex-concurrently.out         |  78 ++
 src/test/isolation/isolation_schedule         |   1 +
 .../isolation/specs/reindex-concurrently.spec |  40 +
 src/test/regress/expected/create_index.out    |  70 ++
 src/test/regress/sql/create_index.sql         |  51 ++
 21 files changed, 1765 insertions(+), 71 deletions(-)
 create mode 100644 src/test/isolation/expected/reindex-concurrently.out
 create mode 100644 src/test/isolation/specs/reindex-concurrently.spec

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index bedd9a008d..9b7ef8bf09 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,6 +926,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
          <command>ANALYZE</command>, <command>CREATE INDEX CONCURRENTLY</command>,
+         <command>REINDEX CONCURRENTLY</command>,
          <command>CREATE STATISTICS</command>, and certain <command>ALTER
          INDEX</command> and <command>ALTER TABLE</command> variants (for full
          details see <xref linkend="sql-alterindex"/> and <xref
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 47cef987d4..b7122e0e97 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="parameter">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="parameter">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
      <para>
       An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
       an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.
      </para>
     </listitem>
 
@@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</productname> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="sql-reindex-concurrently"
+      endterm="sql-reindex-concurrently-title"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
@@ -241,6 +253,161 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
    Each individual partition can be reindexed separately instead.
   </para>
 
+  <refsect2 id="sql-reindex-concurrently">
+   <title id="sql-reindex-concurrently-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="sql-reindex-concurrently">
+    <primary>index</primary>
+    <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</productname> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</productname> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</literal> option of <command>REINDEX</command>. When this option
+    is used, <productname>PostgreSQL</productname> must perform two scans of the table
+    for each index that needs to be rebuilt, and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent reindex, each in a separate
+    transaction except for the step that creates the new index definitions,
+    which inserts all the new catalog entries within a single transaction.
+    Note that if there are multiple indexes to be rebuilt, then each step
+    loops through all the indexes being rebuilt, using a separate transaction
+    for each one.  <command>REINDEX CONCURRENTLY</command> proceeds as follows
+    when rebuilding indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition will be used to replace the
+       old index. This step is done as a single transaction for all the indexes
+       involved in this process, meaning that if
+       <command>REINDEX CONCURRENTLY</command> is run on a table with multiple
+       indexes, all the catalog entries of the new indexes are created within a
+       single transaction. A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as their
+       parent tables to prevent any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_index.indisready</literal> is
+       switched to <quote>true</quote> to mark it as ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were inserted while
+       the first pass was running. This step is performed within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       All the constraints and foreign keys which refer to the index are swapped
+       to refer to the new index definition, and the names of the indexes are
+       changed. At this point <literal>pg_index.indisvalid</literal> is switched to
+       <quote>true</quote> for the new index and to <quote>false</quote> for the old one,
+       and a cache invalidation is done so that all the sessions that referenced
+       the old index are invalidated. This step is done within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       After waiting for running queries that may still reference the old
+       indexes to complete, the old indexes have
+       <literal>pg_index.indisready</literal> switched to <quote>false</quote>
+       to prevent any new tuple insertions. This step is done within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The old index definition and its data are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
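The pg_index flag transitions driven by the steps above can be sketched with a small illustrative model. This is plain Python, not PostgreSQL source; the class, function, and name handling are invented for illustration, and only the flag names (indislive, indisready, indisvalid) and the ccnew/ccold suffixes come from the patch:

```python
# Illustrative model of the pg_index flag lifecycle during
# REINDEX CONCURRENTLY for a single index.  Each numbered step
# below happens in its own transaction in the real implementation.

class IndexEntry:
    def __init__(self, name):
        self.name = name
        self.indislive = True    # entry is visible for maintenance
        self.indisready = False  # entry receives inserts
        self.indisvalid = False  # entry is usable for queries

def reindex_concurrently(old):
    # Step 1: create the new catalog entry (not ready, not valid).
    new = IndexEntry(old.name + "_ccnew")
    # Step 2: first build pass, then mark the new index ready for inserts.
    new.indisready = True
    # Step 3: second pass catches up on concurrent changes (no flag change).
    # Step 4: swap names; mark the new index valid and the old one invalid.
    new.name, old.name = old.name, old.name + "_ccold"
    new.indisvalid, old.indisvalid = True, False
    # Step 5: stop inserts into the old index, then mark it dead.
    old.indisready = False
    old.indislive = False
    # Step 6: the old entry is dropped; the new one replaces it.
    return new
```

Running the model on a healthy index ends with the replacement carrying the original name and all flags set, while the old entry is fully dead.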
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</command>
+    command will fail but leave behind an <quote>invalid</quote> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</application> <command>\d</command> command will report
+    such an index as <literal>INVALID</literal>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid
+    index and try again to perform <command>REINDEX CONCURRENTLY</command>.
+    The concurrent index created during the processing has a name ending in
+    the suffix <literal>ccnew</literal>, or <literal>ccold</literal> if it
+    is an old index definition which we failed to drop. Invalid indexes can
+    be dropped using <literal>DROP INDEX</literal>, including invalid toast
+    indexes.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</command> or
+    <command>REINDEX INDEX</command> command can be performed within a
+    transaction block, but <command>REINDEX CONCURRENTLY</command> cannot.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command> since system catalogs cannot be reindexed
+    concurrently.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -272,6 +439,14 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
+</programlisting></para>
+
+  <para>
+   Rebuild the indexes of a table, while allowing read and write operations
+   on the involved relations while it is performed:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
 </programlisting></para>
  </refsect1>
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 2447952f59..82a6764f26 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -741,6 +741,7 @@ index_create(Relation heapRelation,
 			 Oid *constraintId)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
+	Oid			heapNamespaceId = get_rel_namespace(heapRelationId);
 	Relation	pg_class;
 	Relation	indexRelation;
 	TupleDesc	indexTupDesc;
@@ -793,10 +794,12 @@ index_create(Relation heapRelation,
 
 	/*
 	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * release locks before committing in catalogs.  Toast relations are fine
+	 * though, as they are associated with a parent relation which can be
+	 * reindexed concurrently.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemNamespace(heapNamespaceId))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
@@ -1202,6 +1205,415 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+/*
+ * index_create_copy_concurrent
+ *
+ * Concurrently create an index based on the definition of the one provided
+ * by the caller.  The index is inserted into the catalogs and needs to be
+ * built later on.  This is called during concurrent reindex processing.
+ */
+Oid
+index_create_copy_concurrent(Relation heapRelation, Oid oldIndexId,
+							 const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			newIndexId = InvalidOid;
+	HeapTuple	indexTuple,
+				classTuple;
+	Datum		indclassDatum,
+				colOptionDatum,
+				optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	List	   *indexColNames = NIL;
+	int			i;
+
+	indexRelation = index_open(oldIndexId, RowExclusiveLock);
+
+	/* New index uses the same index information as old index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/* Do not copy exclusion constraint */
+	indexInfo->ii_ExclusionOps = NULL;
+	indexInfo->ii_ExclusionProcs = NULL;
+	indexInfo->ii_ExclusionStrats = NULL;
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", oldIndexId);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, oldIndexId);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", oldIndexId);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/*
+	 * Extract the list of column names to be used for the index
+	 * creation.
+	 */
+	indexTupDesc = RelationGetDescr(indexRelation);
+	for (i = 0; i < indexInfo->ii_NumIndexAttrs; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(indexTupDesc, i);
+
+		/* Grab the column name and save it to the list */
+		indexColNames = lappend(indexColNames, NameStr(att->attname));
+	}
+
+	/* Now create the new index */
+	newIndexId = index_create(heapRelation,
+							  newName,
+							  InvalidOid,	/* indexRelationId */
+							  InvalidOid,	/* parentIndexRelid */
+							  InvalidOid,	/* parentConstraintId */
+							  InvalidOid,	/* relFileNode */
+							  indexInfo,
+							  indexColNames,
+							  indexRelation->rd_rel->relam,
+							  indexRelation->rd_rel->reltablespace,
+							  indexRelation->rd_indcollation,
+							  indclass->values,
+							  indcoloptions->values,
+							  optionDatum,
+							  INDEX_CREATE_SKIP_BUILD | INDEX_CREATE_CONCURRENT,
+							  0,
+							  true,	/* allow table to be a system catalog? */
+							  false, /* is_internal? */
+							  NULL);
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return newIndexId;
+}
+
+
+/*
+ * index_swap_concurrent
+ *
+ * Swap name, dependencies, and constraints of the old index over to the new
+ * index, while marking the old index as invalid and the new as valid.
+ */
+void
+index_swap_concurrent(Oid newIndexId, Oid oldIndexId, const char *oldName)
+{
+	Relation	pg_class,
+				pg_index,
+				pg_constraint,
+				pg_trigger;
+	Relation	oldClassRel,
+				newClassRel;
+	HeapTuple	oldClassTuple,
+				newClassTuple;
+	Form_pg_class oldClassForm,
+				newClassForm;
+	HeapTuple	oldIndexTuple,
+				newIndexTuple;
+	Form_pg_index oldIndexForm,
+				newIndexForm;
+	Oid			indexConstraintOid;
+	List	   *constraintOids = NIL;
+	ListCell   *lc;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldClassRel = relation_open(oldIndexId, ShareUpdateExclusiveLock);
+	newClassRel = relation_open(newIndexId, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+	oldClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldClassForm = (Form_pg_class) GETSTRUCT(oldClassTuple);
+	newClassForm = (Form_pg_class) GETSTRUCT(newClassTuple);
+
+	/* Swap the names */
+	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
+	namestrcpy(&oldClassForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
+	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
+
+	heap_freetuple(oldClassTuple);
+	heap_freetuple(newClassTuple);
+
+	/* Now swap index info */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldIndexForm = (Form_pg_index) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
+
+	/*
	 * Copy constraint flags from the old index. This is safe because the old
+	 * index guaranteed uniqueness.
+	 */
+	newIndexForm->indisprimary = oldIndexForm->indisprimary;
+	oldIndexForm->indisprimary = false;
+	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+	oldIndexForm->indisexclusion = false;
+	newIndexForm->indimmediate = oldIndexForm->indimmediate;
+	oldIndexForm->indimmediate = true;
+
+	/* Mark the new index as valid and the old one as invalid, as index_set_state_flags() would */
+	newIndexForm->indisvalid = true;
+	oldIndexForm->indisvalid = false;
+	oldIndexForm->indisclustered = false;
+
+	CatalogTupleUpdate(pg_index, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_index, &newIndexTuple->t_self, newIndexTuple);
+
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+
+	/*
+	 * Move constraints and triggers over to the new index
+	 */
+
+	constraintOids = get_index_ref_constraints(oldIndexId);
+
+	indexConstraintOid = get_index_constraint(oldIndexId);
+
+	if (OidIsValid(indexConstraintOid))
+		constraintOids = lappend_oid(constraintOids, indexConstraintOid);
+
+	pg_constraint = heap_open(ConstraintRelationId, RowExclusiveLock);
+	pg_trigger = heap_open(TriggerRelationId, RowExclusiveLock);
+
+	foreach(lc, constraintOids)
+	{
+		HeapTuple	constraintTuple,
+					triggerTuple;
+		Form_pg_constraint conForm;
+		ScanKeyData key[1];
+		SysScanDesc scan;
+		Oid			constraintOid = lfirst_oid(lc);
+
+		/* Move the constraint from the old to the new index */
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		conForm = ((Form_pg_constraint) GETSTRUCT(constraintTuple));
+
+		if (conForm->conindid == oldIndexId)
+		{
+			conForm->conindid = newIndexId;
+
+			CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+		}
+
+		heap_freetuple(constraintTuple);
+
+		/* Search for trigger records */
+		ScanKeyInit(&key[0],
+					Anum_pg_trigger_tgconstraint,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(constraintOid));
+
+		scan = systable_beginscan(pg_trigger, TriggerConstraintIndexId, true,
+								  NULL, 1, key);
+
+		while (HeapTupleIsValid((triggerTuple = systable_getnext(scan))))
+		{
+			Form_pg_trigger tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			if (tgForm->tgconstrindid != oldIndexId)
+				continue;
+
+			/* Make a modifiable copy */
+			triggerTuple = heap_copytuple(triggerTuple);
+			tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			tgForm->tgconstrindid = newIndexId;
+
+			CatalogTupleUpdate(pg_trigger, &triggerTuple->t_self, triggerTuple);
+
+			heap_freetuple(triggerTuple);
+		}
+
+		systable_endscan(scan);
+	}
+
+	/*
+	 * Move all dependencies on the old index to the new one.
+	 */
+	if (OidIsValid(indexConstraintOid))
+	{
+		ObjectAddress myself,
+					referenced;
+
+		/* Change to having the new index depend on the constraint */
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexId,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexId;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = indexConstraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
+
+	/* Close relations */
+	heap_close(pg_class, RowExclusiveLock);
+	heap_close(pg_index, RowExclusiveLock);
+	heap_close(pg_constraint, RowExclusiveLock);
+	heap_close(pg_trigger, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldClassRel, NoLock);
+	relation_close(newClassRel, NoLock);
+}
+
+/*
+ * index_set_dead_concurrent
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all the backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction calling this
+ * function.
+ */
+void
+index_set_dead_concurrent(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRelation,
+				indexRelation;
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're about
+	 * to stop doing inserts into the index which could show conflicts with
+	 * existing predicate locks, so now is the time to move them to the heap
+	 * relation.
+	 */
+	heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just might
+	 * have it open for updating it.  So now we can unset indisready and
+	 * indislive, then wait till nobody could be using it at all anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit all
+	 * sessions will refresh the table's index list.  Forgetting just the
+	 * index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	heap_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_drop_concurrent
+ *
+ * Drop a single index as the last step of a concurrent reindex process.
+ * Deletion is done through performDeletion(), or dependencies of the index
+ * would not get dropped. At this point the index is already considered
+ * invalid and dead, so it can be dropped without any concurrent handling,
+ * as it is certain that it will not interact with other server sessions.
+void
+index_drop_concurrent(Oid indexId)
+{
+	Oid			constraintOid = get_index_constraint(indexId);
+	ObjectAddress object;
+	Form_pg_index indexForm;
+	Relation	pg_index;
+	HeapTuple	indexTuple;
+
+	/*
	 * Check that the index being dropped is not live; if it were, it might
	 * still be used by other backends.
+	 */
+	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexId);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, just to avoid live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexId);
+
+	/* Clean up */
+	heap_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process. Register
+	 * constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexId;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object, DROP_RESTRICT, 0);
+}
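The deletion-target choice made by index_drop_concurrent() above can be summarized in a short sketch. This is illustrative Python, not backend code; the function name and OID values are invented:

```python
# Sketch of index_drop_concurrent()'s logic: refuse to drop a live
# index, and when the index backs a constraint, register the
# constraint for deletion so the index is removed via its dependency.

def drop_dead_index(index_oid, indislive, constraint_oid=None):
    if indislive:
        raise RuntimeError("cannot drop live index with OID %d" % index_oid)
    if constraint_oid is not None:
        return ("constraint", constraint_oid)
    return ("relation", index_oid)
```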
+
 /*
  * index_constraint_create
  *
@@ -1591,36 +2003,8 @@ index_drop(Oid indexId, bool concurrent)
 		 */
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		heap_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_set_dead_concurrent(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index fde7e170be..9abfb21e96 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -376,6 +376,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 	return count;
 }
 
+/*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+					 Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = heap_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't rename stuff out
+	 * of pg_catalog, which would likely be a bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot remove dependency on %s because it is a system object",
+						getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	heap_close(depRel, RowExclusiveLock);
+
+	return count;
+}
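The core of changeDependenciesOn() can be modeled independently of the catalogs. This is a hedged sketch in plain Python (records are dicts rather than pg_depend tuples; the function name is invented): repoint every dependency record referencing the old object to the new one, or delete the record outright when the new referent is pinned, since pinned objects carry no dependency entries:

```python
# Minimal model of changeDependenciesOn(): returns the number of
# records updated or deleted, mirroring the C function's count.

def change_dependencies_on(records, old_ref, new_ref, pinned):
    count = 0
    for rec in list(records):
        if rec["refobjid"] != old_ref:
            continue
        if new_ref in pinned:
            records.remove(rec)   # dependency on a pinned object is implicit
        else:
            rec["refobjid"] = new_ref
        count += 1
    return count
```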
+
 /*
  * isObjectPinned()
  *
@@ -735,3 +823,58 @@ get_index_constraint(Oid indexId)
 
 	return constraintId;
 }
+
+/*
+ * get_index_ref_constraints
+ *		Given the OID of an index, return the OID of all foreign key
+ *		constraints which reference the index.
+ */
+List *
+get_index_ref_constraints(Oid indexId)
+{
+	List	   *result = NIL;
+	Relation	depRel;
+	ScanKeyData key[3];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	/* Search the dependency table for the index */
+	depRel = heap_open(DependRelationId, AccessShareLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexId));
+	ScanKeyInit(&key[2],
+				Anum_pg_depend_refobjsubid,
+				BTEqualStrategyNumber, F_INT4EQ,
+				Int32GetDatum(0));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 3, key);
+
+	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+	{
+		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+
+		/*
+		 * We assume any normal dependency from a constraint must be what we
+		 * are looking for.
+		 */
+		if (deprec->classid == ConstraintRelationId &&
+			deprec->objsubid == 0 &&
+			deprec->deptype == DEPENDENCY_NORMAL)
+		{
+			result = lappend_oid(result, deprec->objid);
+		}
+	}
+
+	systable_endscan(scan);
+	heap_close(depRel, AccessShareLock);
+
+	return result;
+}
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 930d4d9880..26d3866bb2 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -59,6 +59,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/partcache.h"
+#include "utils/pg_rusage.h"
 #include "utils/regproc.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -85,6 +86,7 @@ static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 static void ReindexPartitionedIndex(Relation parentIdx);
 
 /*
@@ -2114,7 +2116,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 void
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -2126,7 +2128,8 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
 									  0,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
@@ -2146,7 +2149,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 }
 
 /*
@@ -2214,18 +2220,26 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, 0,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   0,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -2243,7 +2257,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -2255,6 +2269,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	List	   *relids = NIL;
 	ListCell   *l;
 	int			num_keys;
+	bool		concurrent_warning = false;
 
 	AssertArg(objectName);
 	Assert(objectKind == REINDEX_OBJECT_SCHEMA ||
@@ -2363,6 +2378,20 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 			!pg_class_ownercheck(relid, GetUserId()))
 			continue;
 
+		/*
+		 * Skip system tables that index_create() would reject to index
+		 * concurrently.
+		 */
+		if (concurrent && IsSystemNamespace(get_rel_namespace(relid)))
+		{
+			if (!concurrent_warning)
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("concurrent reindex is not supported for catalog relations, skipping all")));
+			concurrent_warning = true;
+			continue;
+		}
+
 		/* Save the list of relation OIDs in private context */
 		old = MemoryContextSwitchTo(private_context);
 
@@ -2389,20 +2418,33 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
 
-			if (options & REINDEXOPT_VERBOSE)
+		if (concurrent)
+		{
+			result = ReindexRelationConcurrently(relid, options);
+			/* ReindexRelationConcurrently() does the verbose output */
+
+			/*
+			 * ReindexRelationConcurrently() ends in a fresh transaction with
+			 * no active snapshot, so push a new one for the
+			 * PopActiveSnapshot() below.
+			 */
+			PushActiveSnapshot(GetTransactionSnapshot());
+		}
+		else
+		{
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+			if (result && (options & REINDEXOPT_VERBOSE))
 				ereport(INFO,
 						(errmsg("table \"%s.%s\" was reindexed",
 								get_namespace_name(get_rel_namespace(relid)),
 								get_rel_name(relid))));
+		}
+
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 	}
@@ -2411,6 +2453,628 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	MemoryContextDelete(private_context);
 }
 
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for the given relation Oid. The relation can
+ * be either an index or a table. If a table is specified, each phase is
+ * processed one by one for all of the table's indexes, as well as the
+ * indexes of its toast table if it has one.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *parentRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *newIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc,
+			   *lc2;
+	MemoryContext private_context;
+	MemoryContext old;
+	char		relkind;
+	char	   *relationName = NULL;
+	char	   *relationNamespace = NULL;
+	PGRUsage	ru0;
+
+	/*
+	 * Create a memory context that will survive forced transaction commits we
+	 * do below.  Since it is a child of PortalContext, it will go away
+	 * eventually even if we suffer an error; there's no need for special
+	 * abort cleanup logic.
+	 */
+	private_context = AllocSetContextCreate(PortalContext,
+											"ReindexConcurrent",
+											ALLOCSET_SMALL_SIZES);
+
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		/* Save data needed by REINDEX VERBOSE in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		relationName = get_rel_name(relationOid);
+		relationNamespace = get_namespace_name(get_rel_namespace(relationOid));
+
+		pg_rusage_init(&ru0);
+
+		MemoryContextSwitchTo(old);
+	}
+
+	relkind = get_rel_relkind(relationOid);
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the relation is a
+	 * table, all its valid indexes will be rebuilt, including the indexes of
+	 * its associated toast table if any. If the relkind is an index, that
+	 * index itself will be rebuilt. The locks taken on the parent relations
+	 * and the involved indexes are kept until this transaction is committed,
+	 * to protect against schema changes that might occur before a session
+	 * lock is taken on each relation; the session locks then provide the
+	 * same protection across the multiple transactions used during this
+	 * process.
+	 */
+	switch (relkind)
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes including
+				 * toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				MemoryContextSwitchTo(old);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+														   ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+					{
+						/* Save the list of relation OIDs in private context */
+						old = MemoryContextSwitchTo(private_context);
+
+						indexIds = lappend_oid(indexIds, cellOid);
+
+						MemoryContextSwitchTo(old);
+					}
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = heap_open(toastOid,
+														  ShareUpdateExclusiveLock);
+
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					MemoryContextSwitchTo(old);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+															   ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+						{
+							/*
+							 * Save the list of relation OIDs in private
+							 * context
+							 */
+							old = MemoryContextSwitchTo(private_context);
+
+							indexIds = lappend_oid(indexIds, cellOid);
+
+							MemoryContextSwitchTo(old);
+						}
+
+						index_close(indexRelation, NoLock);
+					}
+
+					heap_close(toastRelation, NoLock);
+				}
+
+				heap_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			parentOid = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(parentOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(parentOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(parentOid);
+
+				MemoryContextSwitchTo(old);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					indexIds = lappend_oid(indexIds, relationOid);
+
+					MemoryContextSwitchTo(old);
+				}
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		case RELKIND_PARTITIONED_TABLE:
+			/* see reindex_relation() */
+			ereport(WARNING,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("REINDEX of partitioned tables is not yet implemented, skipping \"%s\"",
+							get_rel_name(relationOid))));
+			return false;
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We first need to create, for each index, a new index that has the same
+	 * definition as the old one; it is only registered in the catalogs at
+	 * this stage and will be built later. It is possible to perform these
+	 * operations on all the indexes of a parent relation at once, including
+	 * the indexes of its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId	lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent relation; it might be a plain or toast table */
+		indexParentRel = heap_open(indexRel->rd_index->indrelid,
+								   ShareUpdateExclusiveLock);
+
+		/* Choose a temporary relation name for the new index */
+		concurrentName = ChooseRelationName(get_rel_name(indOid),
+											NULL,
+											"ccnew",
+											get_rel_namespace(indexRel->rd_index->indrelid),
+											false);
+
+		/* Create new index definition based on given index */
+		concurrentOid = index_create_copy_concurrent(indexParentRel,
+													 indOid,
+													 concurrentName);
+
+		/* Now open the relation of the new index, a lock is also needed on it */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the list of oids and locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Save the new index Oid */
+		newIndexIds = lappend_oid(newIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelid of the old and new indexes to protect them from
+		 * being dropped, then close the relations. Each LockRelId is copied
+		 * into the private context: storing the address of the loop-local
+		 * variable would leave dangling pointers in the list. The lockrelid
+		 * of the parent relation is not saved here to avoid taking multiple
+		 * locks on the same relation; we rely on parentRelationIds built
+		 * earlier instead.
+		 */
+		lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+		lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		MemoryContextSwitchTo(old);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		heap_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the lock tags of the parent relations for the following wait
+	 * phases, where other backends' transactions might conflict with this
+	 * session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId	lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		LOCKTAG    *heaplocktag;
+
+		/* Save the list of locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/*
+		 * Add a copy of the parent relation's lockrelid; the loop-local
+		 * variable goes away at the end of this iteration.
+		 */
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid.dbId, lockrelid.relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		MemoryContextSwitchTo(old);
+
+		/* Close heap relation */
+		heap_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. Each new
+	 * index is marked as not ready and invalid so that no other transactions
+	 * will try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on each parent relation,
+	 * old index and new index, to ensure that none of them are dropped until
+	 * the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the new indexes in a separate transaction for each index to
+	 * avoid having open transactions for an unnecessarily long time. A
+	 * concurrent build is done for each new index that will replace an old
+	 * one. Before doing that, we need to wait until no running transaction
+	 * could still have the parent table of an index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index's concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Index relation has been closed by previous commit, so reopen it to
+		 * determine if it is used as a primary key.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+
+		/* Perform concurrent build of new index */
+		index_build_concurrent(indexRel->rd_index->indrelid, concurrentOid);
+
+		/* Keep lock until the end of this transaction */
+		index_close(indexRel, NoLock);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update of the new
+		 * index visible to other sessions.
+		 */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the new indexes catch up with any tuples that were
+	 * inserted during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Scan the heap for each new index, then insert any missing index
+	 * entries.
+	 */
+	foreach(lc, newIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		TransactionId limitXmin;
+		Snapshot	snapshot;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for the new index's
+		 * validation.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate index, which might be a toast */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * This new index is now valid as it contains all the tuples
+		 * necessary. However, it might not have taken into account deleted
+		 * tuples before the reference snapshot was taken, so we need to wait
+		 * for the transactions that might have older snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction now that the new index is valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the new indexes have been validated, it is necessary to swap
+	 * each new index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes as invalid at the
+	 * same time to make sure we only get constraint violations from the
+	 * indexes with the correct names.
+	 */
+
+	StartTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		char	   *oldName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseRelationName(get_rel_name(indOid),
+									 NULL,
+									 "ccold",
+									 get_rel_namespace(relOid),
+									 false);
+
+		/* Swap old index with the new one */
+		index_swap_concurrent(concurrentOid, indOid, oldName);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * CCI here so that subsequent iterations see the oldName in the
+		 * catalog and can choose a nonconflicting name for their oldName.
+		 * Otherwise, this could lead to conflicts if a table has two indexes
+		 * whose names are equal for the first NAMEDATALEN-minus-a-few
+		 * characters.
+		 */
+		CommandCounterIncrement();
+	}
+
+	/* Commit this transaction and make index swaps visible */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * Mark the old indexes as dead so they can later be dropped.
+	 *
+	 * Note that it is necessary to wait for virtual locks on the parent
+	 * relation before setting the index as dead.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Finish the index invalidation and set it as dead. */
+		index_set_dead_concurrent(relOid, indOid);
+	}
+
+	/* Commit this transaction to make the updates visible. */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the old indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe as all the old entries are already
+	 * considered invalid and not ready, so they will not be used by other
+	 * backends for any read or write operations.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	/* Get fresh snapshot for next step */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+
+		CHECK_FOR_INTERRUPTS();
+
+		index_drop_concurrent(indOid);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Finally, release the session-level locks on the parent tables and
+	 * indexes.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Log what we did */
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		if (relkind == RELKIND_INDEX)
+			ereport(INFO,
+					(errmsg("index \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+		else
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+	}
+
+	/* Start a new transaction to finish process properly */
+	StartTransactionCommand();
+
+	MemoryContextDelete(private_context);
+
+	return true;
+}
+
 /*
  *	ReindexPartitionedIndex
  *		Reindex each child of the given partitioned index.
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d2781cbf19..5d866ebc47 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1245,6 +1245,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	bool		is_partition;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1307,7 +1308,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(get_rel_relkind(relOid)),
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check for the case of a system index that might have been left invalid
+	 * by a failed concurrent operation, and allow it to be dropped. For the
+	 * time being, this only concerns indexes of toast relations that became
+	 * invalid during a REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 006a3d1772..f7f8e040a3 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4352,6 +4352,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 133df1b364..880d4a0bdb 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2097,6 +2097,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c086235b25..74a87ef1fc 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -8312,42 +8312,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 27ae6be751..140a67e548 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -774,16 +774,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventInTransactionBlock(isTopLevel,
+											  "REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -799,7 +803,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												  (stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												  (stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												  "REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index b11d7ac6ce..8297bf9aa5 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -2149,6 +2149,22 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY are not allowed in
+			 * xacts.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 292b1f483a..2e495b7c9e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3197,12 +3197,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("REINDEX"))
 		COMPLETE_WITH("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+	else if (Matches("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 5dea27016e..24b47d4fc6 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -251,6 +251,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+								 Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, char deptype, Oid *tableId, int32 *colId);
@@ -261,6 +264,8 @@ extern Oid	get_constraint_index(Oid constraintId);
 
 extern Oid	get_index_constraint(Oid indexId);
 
+extern List *get_index_ref_constraints(Oid indexId);
+
 /* in pg_shdepend.c */
 
 extern void recordSharedDependencyOn(ObjectAddress *depender,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 129c00fd0c..803fa6a2c8 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -112,6 +112,19 @@ extern void index_build(Relation heapRelation,
 
 extern void index_build_concurrent(Oid heapOid, Oid indexOid);
 
+extern Oid index_create_copy_concurrent(Relation heapRelation,
+										Oid oldIndexId,
+										const char *newName);
+
+extern void index_swap_concurrent(Oid newIndexId,
+								  Oid oldIndexId,
+								  const char *oldName);
+
+extern void index_set_dead_concurrent(Oid heapOid,
+									  Oid indexOid);
+
+extern void index_drop_concurrent(Oid indexId);
+
 extern double IndexBuildHeapScan(Relation heapRelation,
 				   Relation indexRelation,
 				   IndexInfo *indexInfo,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index e592a914a4..e11caf2cd1 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -34,10 +34,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_not_in_use,
 			bool skip_build,
 			bool quiet);
-extern void ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern void ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 27782fed6c..4bfe4ce05b 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3303,6 +3303,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 91d9d90135..e32886bacb 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -42,6 +42,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 46deb55c67..a5e382bf28 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3292,3 +3292,73 @@ DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
 NOTICE:  drop cascades to 6 other objects
+RESET client_min_messages;
+RESET search_path;
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+ERROR:  conflicting key value violates exclusion constraint "concur_reindex_tab_c3_excl"
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  can only reindex the currently open database
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+WARNING:  concurrent reindex is not supported for catalog relations, skipping all
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+          Table "public.concur_reindex_tab"
+ Column |   Type    | Collation | Nullable | Default 
+--------+-----------+-----------+----------+---------
+ c1     | integer   |           | not null | 
+ c2     | text      |           |          | 
+ c3     | int4range |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+    "concur_reindex_tab_c3_excl" EXCLUDE USING gist (c3 WITH &&)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 59da6b6592..9f13e718a1 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1207,3 +1207,54 @@ RESET ROLE;
 DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
+RESET client_min_messages;
+RESET search_path;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
-- 
2.20.1

#115Vik Fearing
vik.fearing@2ndquadrant.com
In reply to: Alvaro Herrera (#112)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 16/01/2019 18:59, Alvaro Herrera wrote:

On 2019-Jan-16, Michael Paquier wrote:

Regarding the grammar, we tend for the last couple of years to avoid
complicating the main grammar and move on to parenthesized grammars
(see VACUUM, ANALYZE, EXPLAIN, etc). So in the same vein I think that
it would make sense to only support CONCURRENTLY within parenthesis
and just plugin that with the VERBOSE option.

That's my opinion too, but I was outvoted in another subthread -- see
/messages/by-id/20181214144529.wvmjwmy7wxgmgyb3@alvherre.pgsql
Stephen Frost, Andrew Gierth and Andres Freund all voted to put
CONCURRENTLY outside the parens. It seems we now have three votes to
put it *in* the parens (you, Peter Eisentraut, me). I guess more votes
are needed to settle this issue.

My vote is to have homogeneous syntax for all of this, and so put it in
parentheses, but we should also allow CREATE INDEX and DROP INDEX to use
parentheses for it, too.

I suppose we'll keep what would then be the legacy syntax for a few
decades or more.
--
Vik Fearing +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#116Michael Paquier
michael@paquier.xyz
In reply to: Vik Fearing (#115)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Fri, Jan 18, 2019 at 07:58:06PM +0100, Vik Fearing wrote:

My vote is to have homogeneous syntax for all of this, and so put it in
parentheses, but we should also allow CREATE INDEX and DROP INDEX to use
parentheses for it, too.

That would be a new thing as these variants don't exist yet, and WITH
is for storage parameters. In my opinion, the long-term take on doing
such things is that we are then able to reduce the number of reserved
keywords in the grammar. Even if for the case of CONCURRENTLY we may
see humans on Mars before this actually happens, this does not mean
that we should not do it moving forward for other keywords in the
grammar.
--
Michael

#117Vik Fearing
vik.fearing@2ndquadrant.com
In reply to: Michael Paquier (#116)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 19/01/2019 02:33, Michael Paquier wrote:

On Fri, Jan 18, 2019 at 07:58:06PM +0100, Vik Fearing wrote:

My vote is to have homogeneous syntax for all of this, and so put it in
parentheses, but we should also allow CREATE INDEX and DROP INDEX to use
parentheses for it, too.

That would be a new thing as these variants don't exist yet, and WITH
is for storage parameters. In my opinion, the long-term take on doing
such things is that we are then able to reduce the number of reserved
keywords in the grammar. Even if for the case of CONCURRENTLY we may
see humans on Mars before this actually happens, this does not mean
that we should not do it moving forward for other keywords in the
grammar.

I'm not sure I understand your point.

I don't want a situation like this:
CREATE INDEX CONCURRENTLY ...
DROP INDEX CONCURRENTLY ...
REINDEX INDEX (CONCURRENTLY) ...

All three should be the same, and my suggestion is to add the
parenthesized version to CREATE and DROP and not add the unparenthesized
version to REINDEX.

I never said anything about WITH.
--
Vik Fearing +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#118Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#114)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Thu, Jan 17, 2019 at 02:11:01PM +0900, Michael Paquier wrote:

Okay, I have begun digging into the patch, and extracted for now two
things which can be refactored first, giving a total of three patches:
- 0001, which adds WaitForOlderSnapshots to snapmgr.c. I think
that this can be useful for external extensions to have a process wait
for snapshots older than a minimum threshold held by other
transactions.
- 0002, which moves the concurrent index build into its own routine,
index_build_concurrent(). At the same time, index_build() has an
isprimary argument which is not used, so let's remove it. This
simplifies the refactoring a bit as well.
- 0003 is the core patch, realigned with the rest, fixing some typos I
found on the way.

Are there any objections if I commit 0001? Introducing
WaitForOlderSnapshots() is quite independent from the rest, and the
refactoring is obvious. For 0002, I am still not 100% sure if
index_build_concurrent() is the best interface but I am planning to
look more at this stuff next week, particularly the drop portion which
needs more work.
--
Michael

#119Sergei Kornilov
In reply to: Vik Fearing (#117)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Hello

I don't want a situation like this:
    CREATE INDEX CONCURRENTLY ...
    DROP INDEX CONCURRENTLY ...
    REINDEX INDEX (CONCURRENTLY) ...

All three should be the same, and my suggestion is to add the
parenthesized version to CREATE and DROP and not add the unparenthesized
version to REINDEX.

We already have parenthesized VERBOSE option for REINDEX. So proposed syntax was:

REINDEX (CONCURRENTLY) INDEX ...
REINDEX (VERBOSE, CONCURRENTLY) INDEX ...

Like parameters for EXPLAIN, VACUUM. And completely unlike create/drop index.

So consistent syntax for create/drop would be:

CREATE (CONCURRENTLY) INDEX ...
CREATE (UNIQUE, CONCURRENTLY) INDEX ... # or do we want parenthesized CONCURRENTLY but not UNIQUE? CREATE UNIQUE (CONCURRENTLY) INDEX?
DROP (CONCURRENTLY) INDEX ...

How about REFRESH MATERIALIZED VIEW? Do not change?

regards, Sergei
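
The precedent Sergei refers to already exists in released versions; a minimal sketch (the table name is hypothetical):

```sql
-- Already valid today: the parenthesized VERBOSE option
REINDEX (VERBOSE) TABLE some_table;
-- The proposal would add CONCURRENTLY to the same option list:
REINDEX (VERBOSE, CONCURRENTLY) TABLE some_table;
```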

#120Stephen Frost
sfrost@snowman.net
In reply to: Vik Fearing (#115)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Greetings,

* Vik Fearing (vik.fearing@2ndquadrant.com) wrote:

On 16/01/2019 18:59, Alvaro Herrera wrote:

On 2019-Jan-16, Michael Paquier wrote:

Regarding the grammar, we tend for the last couple of years to avoid
complicating the main grammar and move on to parenthesized grammars
(see VACUUM, ANALYZE, EXPLAIN, etc). So in the same vein I think that
it would make sense to only support CONCURRENTLY within parenthesis
and just plugin that with the VERBOSE option.

That's my opinion too, but I was outvoted in another subthread -- see
/messages/by-id/20181214144529.wvmjwmy7wxgmgyb3@alvherre.pgsql
Stephen Frost, Andrew Gierth and Andres Freund all voted to put
CONCURRENTLY outside the parens. It seems we now have three votes to
put it *in* the parens (you, Peter Eisentraut, me). I guess more votes
are needed to settle this issue.

My vote is to have homogeneous syntax for all of this, and so put it in
parentheses, but we should also allow CREATE INDEX and DROP INDEX to use
parentheses for it, too.

I suppose we'll keep what would then be the legacy syntax for a few
decades or more.

I'm still of the opinion that we should have CONCURRENTLY allowed
without the parentheses. I could see allowing it with them, as well,
but I do feel that we should be using the parentheses-based approach
more as a last-resort kind of thing instead of just baking in everything
to require them.

We have said before that we don't want to have things implemented in a
purely functional way (see the discussions around pglogical and such)
and while this isn't quite the same, I do think it heads in that
direction. It's certainly harder to have to think about how to
structure these commands so that they look like they belong in SQL but I
think it has benefits too.

Thanks!

Stephen

#121Andres Freund
andres@anarazel.de
In reply to: Stephen Frost (#120)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On January 19, 2019 7:32:55 AM PST, Stephen Frost <sfrost@snowman.net> wrote:

Greetings,

* Vik Fearing (vik.fearing@2ndquadrant.com) wrote:

My vote is to have homogeneous syntax for all of this, and so put it in
parentheses, but we should also allow CREATE INDEX and DROP INDEX to use
parentheses for it, too.

I suppose we'll keep what would then be the legacy syntax for a few
decades or more.

I'm still of the opinion that we should have CONCURRENTLY allowed
without the parentheses. I could see allowing it with them, as well,
but I do feel that we should be using the parentheses-based approach
more as a last-resort kind of thing instead of just baking in everything
to require them.

+1

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

#122Michael Paquier
michael@paquier.xyz
In reply to: Vik Fearing (#117)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Sat, Jan 19, 2019 at 03:01:07AM +0100, Vik Fearing wrote:

On 19/01/2019 02:33, Michael Paquier wrote:

On Fri, Jan 18, 2019 at 07:58:06PM +0100, Vik Fearing wrote:

My vote is to have homogeneous syntax for all of this, and so put it in
parentheses, but we should also allow CREATE INDEX and DROP INDEX to use
parentheses for it, too.

That would be a new thing as these variants don't exist yet, and WITH
is for storage parameters. In my opinion, the long-term take on doing
such things is that we are then able to reduce the number of reserved
keywords in the grammar. Even if for the case of CONCURRENTLY we may
see humans on Mars before this actually happens, this does not mean
that we should not do it moving forward for other keywords in the
grammar.

I'm not sure I understand your point.

I don't want a situation like this:
CREATE INDEX CONCURRENTLY ...
DROP INDEX CONCURRENTLY ...
REINDEX INDEX (CONCURRENTLY) ...

All three should be the same, and my suggestion is to add the
parenthesized version to CREATE and DROP and not add the unparenthesized
version to REINDEX.

I am not sure what actual reason would force us to decide that all
three queries should have the same grammar, or why this has anything to
do with a thread about REINDEX. REINDEX can work on many more object
types than an index, so its scope is much larger, contrary to
CREATE/DROP INDEX. An advantage of using the parenthesized grammar and
prioritizing it is that you don't have to add it to the list of reserved
keywords, and the parser can rely on IDENT for its work.

I personally prefer the parenthesized grammar for that reason. If the
crowd votes in majority for the other option, that's of course fine to
me too.

I never said anything about WITH.

Perhaps I have not explained my thoughts clearly here. My point was
that if some day we decide to drop the non-parenthesized grammar of
CREATE/DROP INDEX, one possibility would be to have a "concurrent"
option as part of WITH, even if that's used only now for storage
parameters. That's the only actual part of the grammar which is
extensible.
--
Michael

#123Robert Haas
robertmhaas@gmail.com
In reply to: Vik Fearing (#117)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Fri, Jan 18, 2019 at 9:01 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:

I don't want a situation like this:
CREATE INDEX CONCURRENTLY ...
DROP INDEX CONCURRENTLY ...
REINDEX INDEX (CONCURRENTLY) ...

All three should be the same, and my suggestion is to add the
parenthesized version to CREATE and DROP and not add the unparenthesized
version to REINDEX.

+1 for all three being the same. I could see allowing only the
unparenthesized format for all three, or allowing both forms for all
three, but I think having only one form for each and having them not
agree will be too confusing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#124Pavel Stehule
pavel.stehule@gmail.com
In reply to: Robert Haas (#123)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Wed, Jan 23, 2019 at 19:17, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Jan 18, 2019 at 9:01 PM Vik Fearing <vik.fearing@2ndquadrant.com>
wrote:

I don't want a situation like this:
CREATE INDEX CONCURRENTLY ...
DROP INDEX CONCURRENTLY ...
REINDEX INDEX (CONCURRENTLY) ...

All three should be the same, and my suggestion is to add the
parenthesized version to CREATE and DROP and not add the unparenthesized
version to REINDEX.

+1 for all three being the same. I could see allowing only the
unparenthesized format for all three, or allowing both forms for all
three, but I think having only one form for each and having them not
agree will be too confusing.

+1

Pavel

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#125Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#123)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Hi,

On 2019-01-23 13:17:26 -0500, Robert Haas wrote:

On Fri, Jan 18, 2019 at 9:01 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:

I don't want a situation like this:
CREATE INDEX CONCURRENTLY ...
DROP INDEX CONCURRENTLY ...
REINDEX INDEX (CONCURRENTLY) ...

All three should be the same, and my suggestion is to add the
parenthesized version to CREATE and DROP and not add the unparenthesized
version to REINDEX.

+1 for all three being the same. I could see allowing only the
unparenthesized format for all three, or allowing both forms for all
three, but I think having only one form for each and having them not
agree will be too confusing.

It seems quite unnecessarily confusing to me to require parens for
REINDEX CONCURRENTLY when we've historically not required that for
CREATE/DROP INDEX CONCURRENTLY. Besides that, training people that it's
the correct form to use parens for CIC/DIC creates an unnecessary
version dependency.

I think it actually makes sense to see the CONCURRENTLY versions as
somewhat separate types of statements than the non concurrent
versions. They have significantly different transactional behaviour
(like not being able to be run within one, and leaving gunk behind in
case of error). For me it semantically makes sense to have that denoted
at the toplevel, it's a related but different type of DDL statement.

Greetings,

Andres Freund
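
The transactional difference Andres points to can be made concrete; a minimal sketch (table and index names are hypothetical):

```sql
CREATE TABLE t (i int);
-- The concurrent variant refuses to run inside a transaction block:
BEGIN;
CREATE INDEX CONCURRENTLY t_i_idx ON t (i);  -- raises an error
ROLLBACK;
-- And if a concurrent build fails partway through, it leaves an
-- invalid index behind (visible as pg_index.indisvalid = false),
-- unlike the plain, fully transactional CREATE INDEX.
```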

#126Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#110)
1 attachment(s)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

Here is an updated patch, which addresses some of your issues below as
well as the earlier reported issue that comments were lost during
REINDEX CONCURRENTLY.

On 16/01/2019 09:27, Michael Paquier wrote:

On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote:

NOTICE seems unnecessary here.

Unfortunately, concurrent reindex loses comments; reproducer:

Yes, the NOTICE message makes little sense.

This is existing behavior of reindex-not-concurrently.

I am getting back in touch with this stuff. It has been some time but
the core of the patch has not actually changed in its base concept, so
I am still very familiar with it as the original author. There are
even typos I may have introduced a couple of years back, like
"contraint". I have not yet spent much time on that, but there are at
quick glance a bunch of things that could be retouched to get pieces
of that committable.

+    The concurrent index created during the processing has a name ending in
+    the suffix ccnew, or ccold if it is an old index definiton which we failed
+    to drop. Invalid indexes can be dropped using <literal>DROP INDEX</literal>,
+    including invalid toast indexes.
This needs <literal> markups for "ccnew" and "ccold".  "definiton" is
not correct.

Fixed those.
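
As the updated documentation describes, a failed concurrent run can leave invalid indexes behind with the ccnew/ccold suffixes; a minimal sketch of finding and removing them manually (the index name shown is hypothetical):

```sql
-- List invalid indexes left behind by a failed concurrent operation
SELECT indexrelid::regclass AS index_name
FROM pg_index
WHERE NOT indisvalid;

-- Drop a leftover manually
DROP INDEX concur_reindex_ind1_ccold;
```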

index_create does not actually need its extra argument with the tuple
descriptor. I think that we had better grab the column name list from
indexInfo and just pass that down to index_create() (patched on my
local branch), so it is overkill to take a full copy of the index's
TupleDesc.

Please send a fixup patch.

The patch, standing as-is, is close to 2k lines long, so let's cut
that first into more pieces refactoring the concurrent build code.
Here are some preliminary notes:
- WaitForOlderSnapshots() could be in its own patch.
- index_concurrently_build() and index_concurrently_set_dead() can be
in an independent patch. set_dead() had better be a wrapper on top of
index_set_state_flags actually which is able to set any kind of
flags.
- A couple of pieces in index_create() could be cut as well.

I'm not a fan of that. I had already considered all the ways in which
subparts of this patch could get committed, and some of it was
committed, so what's left now is what I thought should stay together.
The patch isn't really that big and most of it is moving code around. I
would also avoid chopping around in this patch now and focus on getting
it finished instead. The functionality seems solid, so if it's good,
let's commit it, if it's not, let's get it fixed up.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v7-0001-REINDEX-CONCURRENTLY.patchtext/plain; charset=UTF-8; name=v7-0001-REINDEX-CONCURRENTLY.patch; x-mac-creator=0; x-mac-type=0Download
From e9600073108c9fbfe64087932f4bb2ea12f58418 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 29 Jan 2019 21:45:29 +0100
Subject: [PATCH v7] REINDEX CONCURRENTLY

---
 doc/src/sgml/mvcc.sgml                        |   1 +
 doc/src/sgml/ref/reindex.sgml                 | 184 +++-
 src/backend/catalog/index.c                   | 547 ++++++++++-
 src/backend/catalog/pg_depend.c               | 143 +++
 src/backend/catalog/toasting.c                |   2 +-
 src/backend/commands/indexcmds.c              | 882 +++++++++++++++---
 src/backend/commands/tablecmds.c              |  32 +-
 src/backend/nodes/copyfuncs.c                 |   1 +
 src/backend/nodes/equalfuncs.c                |   1 +
 src/backend/parser/gram.y                     |  22 +-
 src/backend/tcop/utility.c                    |  10 +-
 src/bin/psql/common.c                         |  16 +
 src/bin/psql/tab-complete.c                   |  18 +-
 src/include/catalog/dependency.h              |   5 +
 src/include/catalog/index.h                   |  17 +
 src/include/commands/defrem.h                 |   6 +-
 src/include/nodes/parsenodes.h                |   1 +
 .../expected/reindex-concurrently.out         |  78 ++
 src/test/isolation/isolation_schedule         |   1 +
 .../isolation/specs/reindex-concurrently.spec |  40 +
 src/test/regress/expected/create_index.out    |  95 ++
 src/test/regress/sql/create_index.sql         |  61 ++
 22 files changed, 1989 insertions(+), 174 deletions(-)
 create mode 100644 src/test/isolation/expected/reindex-concurrently.out
 create mode 100644 src/test/isolation/specs/reindex-concurrently.spec

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index bedd9a008d..9b7ef8bf09 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,6 +926,7 @@ <title>Table-level Lock Modes</title>
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
          <command>ANALYZE</command>, <command>CREATE INDEX CONCURRENTLY</command>,
+         <command>REINDEX CONCURRENTLY</command>,
          <command>CREATE STATISTICS</command>, and certain <command>ALTER
          INDEX</command> and <command>ALTER TABLE</command> variants (for full
          details see <xref linkend="sql-alterindex"/> and <xref
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 47cef987d4..ee22c267c1 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="parameter">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="parameter">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -67,10 +67,7 @@ <title>Description</title>
      <para>
       An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
       an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.
      </para>
     </listitem>
 
@@ -151,6 +148,21 @@ <title>Parameters</title>
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</productname> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="sql-reindex-concurrently"
+      endterm="sql-reindex-concurrently-title"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
@@ -241,6 +253,160 @@ <title>Notes</title>
    Each individual partition can be reindexed separately instead.
   </para>
 
+  <refsect2 id="sql-reindex-concurrently">
+   <title id="sql-reindex-concurrently-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="sql-reindex-concurrently">
+    <primary>index</primary>
+    <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</productname> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</productname> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</literal> option of <command>REINDEX</command>. When this option
+    is used, <productname>PostgreSQL</productname> must perform two scans of the table
+    for each index that needs to be rebuilt, and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction, except for the creation of the new index definitions, which
+    happens in a single transaction for all the indexes involved. Note that
+    if there are multiple indexes to be rebuilt, then each step loops through
+    all the indexes being rebuilt, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</command> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition will be used to replace the
+       old index. This step is done as a single transaction for all the indexes
+       involved in this process, meaning that if
+       <command>REINDEX CONCURRENTLY</command> is run on a table with multiple
+       indexes, all the catalog entries of the new indexes are created within a
+       single transaction. A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as
+       their parent tables to prevent any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_index.indisready</literal> is
+       switched to <quote>true</quote> to mark it ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while
+       the first pass build was running. This step is performed within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       All the constraints and foreign keys which refer to the index are swapped
+       to refer to the new index definition, and the names of the indexes are
+       changed. At this point <literal>pg_index.indisvalid</literal> is switched to
+       <quote>true</quote> for the new index and to <quote>false</quote> for the old, and
+       a cache invalidation is performed so that all sessions that referenced
+       the old index refresh their caches. This step is done within a single transaction
+       for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       After waiting for running queries that might still reference the old
+       indexes to complete, the old indexes have <literal>pg_index.indisready</literal>
+       switched to <quote>false</quote> to prevent any new tuple insertions.
+       This step is done within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The old index definition and its data are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</command>
+    command will fail but leave behind an <quote>invalid</quote> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</application> <command>\d</command> command will report
+    such an index as <literal>INVALID</literal>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid
+    index and try again to perform <command>REINDEX CONCURRENTLY</command>.
+    The transient index created during the processing has a name ending in
+    the suffix <literal>ccnew</literal>, or <literal>ccold</literal> if it is an old index definition
+    that could not be dropped. Invalid indexes can be dropped using <literal>DROP INDEX</literal>,
+    including invalid toast indexes.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</command> or <command>REINDEX INDEX</command>
+    command can be performed within a transaction block, but
+    <command>REINDEX CONCURRENTLY</command> cannot.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command> since system catalogs cannot be reindexed
+    concurrently.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -272,6 +438,14 @@ <title>Examples</title>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
+</programlisting></para>
+
+  <para>
+   Rebuild the indexes of a table, while allowing read and write operations
+   on the involved relations while the rebuild is in progress:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
 </programlisting></para>
  </refsect1>
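The step sequence documented in the "Rebuilding Indexes Concurrently" section above can be sketched as a small state model of the `pg_index` flags. This is a hedged illustration only, not PostgreSQL code; `IndexEntry` and `reindex_concurrently` are invented names for the sketch:

```python
# Illustrative model of the flag transitions performed by the
# REINDEX CONCURRENTLY steps documented above.  This is NOT PostgreSQL
# code; IndexEntry and reindex_concurrently are invented for the sketch.
from dataclasses import dataclass

@dataclass
class IndexEntry:
    name: str
    isready: bool = False   # models pg_index.indisready
    isvalid: bool = False   # models pg_index.indisvalid
    islive: bool = True     # models pg_index.indislive

def reindex_concurrently(old: IndexEntry) -> IndexEntry:
    # Step 1: new catalog entry, not yet visible for inserts or queries
    new = IndexEntry(name=old.name + "_ccnew")
    # Step 2: first build pass, then mark the new index ready for inserts
    new.isready = True
    # Step 3: second pass catches up with concurrent changes (no flag change)
    # Step 4: swap names and validity flags
    new.name, old.name = old.name, old.name + "_ccold"
    new.isvalid, old.isvalid = True, False
    # Step 5: stop inserts into the old index, then mark it dead
    old.isready = False
    old.islive = False
    # Step 6: the old entry is dropped (the caller discards `old`)
    return new
```

Each step above runs in its own transaction in the real implementation; the model only tracks the resulting flag values.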
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 225c078018..be857237e7 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -41,6 +41,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_constraint.h"
+#include "catalog/pg_description.h"
 #include "catalog/pg_depend.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_operator.h"
@@ -693,6 +694,7 @@ UpdateIndexRelation(Oid indexoid,
  * classObjectId: array of index opclass OIDs, one per index column
  * coloptions: array of per-index-column indoption settings
  * reloptions: AM-specific options
+ * tupdesc: Tuple descriptor used for the index if defined
  * flags: bitmask that can include any combination of these bits:
  *		INDEX_CREATE_IS_PRIMARY
  *			the index is a primary key
@@ -733,6 +735,7 @@ index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bits16 flags,
 			 bits16 constr_flags,
 			 bool allow_system_table_mods,
@@ -795,7 +798,7 @@ index_create(Relation heapRelation,
 	 * release locks before committing in catalogs
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemNamespace(get_rel_namespace(heapRelationId)))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
@@ -863,14 +866,20 @@ index_create(Relation heapRelation,
 	}
 
 	/*
-	 * construct tuple descriptor for index tuples
+	 * construct tuple descriptor for index tuples if not passed by caller
 	 */
-	indexTupDesc = ConstructTupleDescriptor(heapRelation,
-											indexInfo,
-											indexColNames,
-											accessMethodObjectId,
-											collationObjectId,
-											classObjectId);
+	if (!tupdesc)
+		indexTupDesc = ConstructTupleDescriptor(heapRelation,
+												indexInfo,
+												indexColNames,
+												accessMethodObjectId,
+												collationObjectId,
+												classObjectId);
+	else
+	{
+		Assert(indexColNames == NIL);
+		indexTupDesc = tupdesc;
+	}
 
 	/*
 	 * Allocate an OID for the index, unless we were told what to use.
@@ -1201,6 +1210,496 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+/*
+ * index_concurrently_create_copy
+ *
+ * Concurrently create an index based on the definition of the one provided by
+ * the caller.  The index is inserted into the catalogs and needs to be built
+ * later on.  This is called during concurrent reindex processing.
+ */
+Oid
+index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			newIndexId = InvalidOid;
+	HeapTuple	indexTuple,
+				classTuple;
+	Datum		indclassDatum,
+				colOptionDatum,
+				optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+
+	indexRelation = index_open(oldIndexId, RowExclusiveLock);
+
+	/* New index uses the same index information as old index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/* Do not copy exclusion constraint */
+	indexInfo->ii_ExclusionOps = NULL;
+	indexInfo->ii_ExclusionProcs = NULL;
+	indexInfo->ii_ExclusionStrats = NULL;
+
+	/* Create a copy of the tuple descriptor to be used for the new entry */
+	indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", oldIndexId);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, oldIndexId);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", oldIndexId);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/* Now create the new index */
+	newIndexId = index_create(heapRelation,
+							  newName,
+							  InvalidOid,	/* indexRelationId */
+							  InvalidOid,	/* parentIndexRelid */
+							  InvalidOid,	/* parentConstraintId */
+							  InvalidOid,	/* relFileNode */
+							  indexInfo,
+							  NIL,
+							  indexRelation->rd_rel->relam,
+							  indexRelation->rd_rel->reltablespace,
+							  indexRelation->rd_indcollation,
+							  indclass->values,
+							  indcoloptions->values,
+							  optionDatum,
+							  indexTupDesc,
+							  INDEX_CREATE_SKIP_BUILD | INDEX_CREATE_CONCURRENT,
+							  0,
+							  true,	/* allow table to be a system catalog? */
+							  false, /* is_internal? */
+							  NULL);
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return newIndexId;
+}
+
+/*
+ * index_concurrently_build
+ *
+ * Build an index for a concurrent operation.  The low-level locks taken here
+ * prevent only schema changes; they need to be kept until the end of the
+ * transaction performing this operation.
+ */
+void
+index_concurrently_build(Oid heapOid,
+						 Oid indexOid)
+{
+	Relation	heapRel,
+				indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* Open and lock the parent heap relation */
+	heapRel = table_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in the
+	 * commit of the transaction where this concurrent index was created at
+	 * the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, false, true);
+
+	/* Close both relations, and keep the locks */
+	table_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts.  Once
+	 * we commit this transaction, any new transactions that open the table
+	 * must insert new entries into the index for insertions and non-HOT
+	 * updates.
+	 */
+	index_set_state_flags(indexOid, INDEX_CREATE_SET_READY);
+}
+
+/*
+ * index_concurrently_swap
+ *
+ * Swap name, dependencies, and constraints of the old index over to the new
+ * index, while marking the old index as invalid and the new as valid.
+ */
+void
+index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
+{
+	Relation	pg_class,
+				pg_index,
+				pg_constraint,
+				pg_trigger;
+	Relation	oldClassRel,
+				newClassRel;
+	HeapTuple	oldClassTuple,
+				newClassTuple;
+	Form_pg_class oldClassForm,
+				newClassForm;
+	HeapTuple	oldIndexTuple,
+				newIndexTuple;
+	Form_pg_index oldIndexForm,
+				newIndexForm;
+	Oid			indexConstraintOid;
+	List	   *constraintOids = NIL;
+	ListCell   *lc;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldClassRel = relation_open(oldIndexId, ShareUpdateExclusiveLock);
+	newClassRel = relation_open(newIndexId, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = table_open(RelationRelationId, RowExclusiveLock);
+
+	oldClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldClassForm = (Form_pg_class) GETSTRUCT(oldClassTuple);
+	newClassForm = (Form_pg_class) GETSTRUCT(newClassTuple);
+
+	/* Swap the names */
+	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
+	namestrcpy(&oldClassForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
+	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
+
+	heap_freetuple(oldClassTuple);
+	heap_freetuple(newClassTuple);
+
+	/* Now swap index info */
+	pg_index = table_open(IndexRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldIndexForm = (Form_pg_index) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
+
+	/*
+	 * Copy constraint flags for old index. This is safe because the old index
+	 * guaranteed uniqueness.
+	 */
+	newIndexForm->indisprimary = oldIndexForm->indisprimary;
+	oldIndexForm->indisprimary = false;
+	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+	oldIndexForm->indisexclusion = false;
+	newIndexForm->indimmediate = oldIndexForm->indimmediate;
+	oldIndexForm->indimmediate = true;
+
+	/*
+	 * Mark the new index as valid and the old one as invalid, similarly to
+	 * what index_set_state_flags() would do.
+	 */
+	newIndexForm->indisvalid = true;
+	oldIndexForm->indisvalid = false;
+	oldIndexForm->indisclustered = false;
+
+	CatalogTupleUpdate(pg_index, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_index, &newIndexTuple->t_self, newIndexTuple);
+
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+
+	/*
+	 * Move constraints and triggers over to the new index
+	 */
+
+	constraintOids = get_index_ref_constraints(oldIndexId);
+
+	indexConstraintOid = get_index_constraint(oldIndexId);
+
+	if (OidIsValid(indexConstraintOid))
+		constraintOids = lappend_oid(constraintOids, indexConstraintOid);
+
+	pg_constraint = table_open(ConstraintRelationId, RowExclusiveLock);
+	pg_trigger = table_open(TriggerRelationId, RowExclusiveLock);
+
+	foreach(lc, constraintOids)
+	{
+		HeapTuple	constraintTuple,
+					triggerTuple;
+		Form_pg_constraint conForm;
+		ScanKeyData key[1];
+		SysScanDesc scan;
+		Oid			constraintOid = lfirst_oid(lc);
+
+		/* Move the constraint from the old to the new index */
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		conForm = ((Form_pg_constraint) GETSTRUCT(constraintTuple));
+
+		if (conForm->conindid == oldIndexId)
+		{
+			conForm->conindid = newIndexId;
+
+			CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+		}
+
+		heap_freetuple(constraintTuple);
+
+		/* Search for trigger records */
+		ScanKeyInit(&key[0],
+					Anum_pg_trigger_tgconstraint,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(constraintOid));
+
+		scan = systable_beginscan(pg_trigger, TriggerConstraintIndexId, true,
+								  NULL, 1, key);
+
+		while (HeapTupleIsValid((triggerTuple = systable_getnext(scan))))
+		{
+			Form_pg_trigger tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			if (tgForm->tgconstrindid != oldIndexId)
+				continue;
+
+			/* Make a modifiable copy */
+			triggerTuple = heap_copytuple(triggerTuple);
+			tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			tgForm->tgconstrindid = newIndexId;
+
+			CatalogTupleUpdate(pg_trigger, &triggerTuple->t_self, triggerTuple);
+
+			heap_freetuple(triggerTuple);
+		}
+
+		systable_endscan(scan);
+	}
+
+	/*
+	 * Move comment if any
+	 */
+	{
+		Relation	description;
+		ScanKeyData skey[3];
+		SysScanDesc sd;
+		HeapTuple	tuple;
+		Datum		values[Natts_pg_description] = {0};
+		bool		nulls[Natts_pg_description] = {0};
+		bool		replaces[Natts_pg_description] = {0};
+
+		values[Anum_pg_description_objoid - 1] = ObjectIdGetDatum(newIndexId);
+		replaces[Anum_pg_description_objoid - 1] = true;
+
+		ScanKeyInit(&skey[0],
+					Anum_pg_description_objoid,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(oldIndexId));
+		ScanKeyInit(&skey[1],
+					Anum_pg_description_classoid,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(RelationRelationId));
+		ScanKeyInit(&skey[2],
+					Anum_pg_description_objsubid,
+					BTEqualStrategyNumber, F_INT4EQ,
+					Int32GetDatum(0));
+
+		description = table_open(DescriptionRelationId, RowExclusiveLock);
+
+		sd = systable_beginscan(description, DescriptionObjIndexId, true,
+								NULL, 3, skey);
+
+		while ((tuple = systable_getnext(sd)) != NULL)
+		{
+			tuple = heap_modify_tuple(tuple, RelationGetDescr(description),
+									  values, nulls, replaces);
+			CatalogTupleUpdate(description, &tuple->t_self, tuple);
+
+			break;					/* Assume there can be only one match */
+		}
+
+		systable_endscan(sd);
+		table_close(description, NoLock);
+	}
+
+	/*
+	 * Move all dependencies on the old index to the new
+	 */
+
+	if (OidIsValid(indexConstraintOid))
+	{
+		ObjectAddress myself,
+					referenced;
+
+		/* Change to having the new index depend on the constraint */
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexId,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexId;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = indexConstraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
+
+	/* Close relations */
+	table_close(pg_class, RowExclusiveLock);
+	table_close(pg_index, RowExclusiveLock);
+	table_close(pg_constraint, RowExclusiveLock);
+	table_close(pg_trigger, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldClassRel, NoLock);
+	relation_close(newClassRel, NoLock);
+}
+
+/*
+ * index_concurrently_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction calling this
+ * function.
+ */
+void
+index_concurrently_set_dead(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRelation,
+				indexRelation;
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're about
+	 * to stop doing inserts into the index which could show conflicts with
+	 * existing predicate locks, so now is the time to move them to the heap
+	 * relation.
+	 */
+	heapRelation = table_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just might
+	 * have it open for updating it.  So now we can unset indisready and
+	 * indislive, then wait till nobody could be using it at all anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit all
+	 * sessions will refresh the table's index list.  Forgetting just the
+	 * index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	table_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrently_drop
+ *
+ * Drop a single index concurrently as the last step of concurrent index
+ * processing. Deletion is done through performDeletion, as otherwise the
+ * dependencies of the index would not get dropped. At this point all the
+ * indexes are already considered invalid and dead, so they can be dropped
+ * without using any concurrent options, as it is certain that they will not
+ * interact with other server sessions.
+ */
+void
+index_concurrently_drop(Oid indexId)
+{
+	Oid			constraintOid = get_index_constraint(indexId);
+	ObjectAddress object;
+	Form_pg_index indexForm;
+	Relation	pg_index;
+	HeapTuple	indexTuple;
+
+	/*
+	 * Check that the index being dropped here is not alive; if it were, it
+	 * might still be in use by other backends.
+	 */
+	pg_index = table_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexId);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, just to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexId);
+
+	/* Clean up */
+	table_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process. Register
+	 * constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexId;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object, DROP_RESTRICT, 0);
+}
+
 /*
  * index_constraint_create
  *
@@ -1590,36 +2089,8 @@ index_drop(Oid indexId, bool concurrent)
 		 */
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = table_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		table_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrently_set_dead(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
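The constraint-repointing loop inside `index_concurrently_swap()` above can be reduced to the following toy sketch over plain dictionaries. This is an illustration under invented names (`swap_constraint_targets`, the dictionary keys), not the catalog-tuple code itself, which updates `pg_constraint` and `pg_trigger` rows via the syscache:

```python
# Toy sketch of the constraint-repointing loop of index_concurrently_swap()
# above, using plain dictionaries in place of pg_constraint tuples.
# Invented names; the real code updates catalog tuples via the syscache.
def swap_constraint_targets(constraints, old_index_oid, new_index_oid):
    """Repoint every constraint whose conindid references the old index."""
    moved = 0
    for con in constraints:
        if con["conindid"] == old_index_oid:
            con["conindid"] = new_index_oid
            moved += 1
    return moved
```

Constraints attached to unrelated indexes are left untouched, mirroring the `conForm->conindid == oldIndexId` check in the patch.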
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index 2b8f651c99..969b34e752 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -375,6 +375,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 	return count;
 }
 
+/*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+					 Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = heap_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't rename stuff out
+	 * of pg_catalog, which would likely be a bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot remove dependency on %s because it is a system object",
+						getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	table_close(depRel, RowExclusiveLock);
+
+	return count;
+}
+
 /*
  * isObjectPinned()
  *
@@ -734,3 +822,58 @@ get_index_constraint(Oid indexId)
 
 	return constraintId;
 }
+
+/*
+ * get_index_ref_constraints
+ *		Given the OID of an index, return the OID of all foreign key
+ *		constraints which reference the index.
+ */
+List *
+get_index_ref_constraints(Oid indexId)
+{
+	List	   *result = NIL;
+	Relation	depRel;
+	ScanKeyData key[3];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	/* Search the dependency table for the index */
+	depRel = heap_open(DependRelationId, AccessShareLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexId));
+	ScanKeyInit(&key[2],
+				Anum_pg_depend_refobjsubid,
+				BTEqualStrategyNumber, F_INT4EQ,
+				Int32GetDatum(0));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 3, key);
+
+	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+	{
+		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+
+		/*
+		 * We assume any normal dependency from a constraint must be what we
+		 * are looking for.
+		 */
+		if (deprec->classid == ConstraintRelationId &&
+			deprec->objsubid == 0 &&
+			deprec->deptype == DEPENDENCY_NORMAL)
+		{
+			result = lappend_oid(result, deprec->objid);
+		}
+	}
+
+	systable_endscan(scan);
+	table_close(depRel, AccessShareLock);
+
+	return result;
+}
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 77be19175a..fb93c41c88 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -336,7 +336,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 list_make2("chunk_id", "chunk_seq"),
 				 BTREE_AM_OID,
 				 rel->rd_rel->reltablespace,
-				 collationObjectId, classObjectId, coloptions, (Datum) 0,
+				 collationObjectId, classObjectId, coloptions, (Datum) 0, NULL,
 				 INDEX_CREATE_IS_PRIMARY, 0, true, true, NULL);
 
 	table_close(toast_rel, NoLock);
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 5b2b8d2969..2b4971fc93 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -59,6 +59,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/partcache.h"
+#include "utils/pg_rusage.h"
 #include "utils/regproc.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -84,6 +85,7 @@ static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 static void ReindexPartitionedIndex(Relation parentIdx);
 
 /*
@@ -298,6 +300,90 @@ CheckIndexCompatible(Oid oldId,
 	return ret;
 }
 
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given xmin
+ * limit, because such a snapshot might not contain tuples deleted just
+ * before it was taken. Obtain a list of VXIDs of such transactions, and
+ * wait for them individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int			i,
+				n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue;			/* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int			n_newer_snapshots;
+			int			j;
+			int			k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue;	/* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
 /*
  * DefineIndex
  *		Creates a new index.
@@ -346,7 +432,6 @@ DefineIndex(Oid relationId,
 	List	   *indexColNames;
 	List	   *allIndexParams;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -361,9 +446,7 @@ DefineIndex(Oid relationId,
 	int			numberOfAttributes;
 	int			numberOfKeyAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -856,7 +939,7 @@ DefineIndex(Oid relationId,
 					 stmt->oldNode, indexInfo, indexColNames,
 					 accessMethodId, tablespaceId,
 					 collationObjectId, classObjectId,
-					 coloptions, reloptions,
+					 coloptions, reloptions, NULL,
 					 flags, constr_flags,
 					 allowSystemTableMods, !check_rights,
 					 &createdConstraintId);
@@ -1152,34 +1235,14 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = table_open(relationId, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, false, true);
-
-	/* Close both the relations, but keep the locks */
-	table_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_concurrently_build(RangeVarGetRelid(stmt->relation,
+											  ShareUpdateExclusiveLock,
+											  false),
+							 indexRelationId);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -1251,74 +1314,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots) /* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -2204,7 +2202,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 void
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -2216,7 +2214,8 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
 									  0,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
@@ -2236,7 +2235,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 }
 
 /*
@@ -2304,18 +2306,26 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, 0,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   0,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -2333,7 +2343,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -2345,6 +2355,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	List	   *relids = NIL;
 	ListCell   *l;
 	int			num_keys;
+	bool		concurrent_warning = false;
 
 	AssertArg(objectName);
 	Assert(objectKind == REINDEX_OBJECT_SCHEMA ||
@@ -2453,6 +2464,20 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 			!pg_class_ownercheck(relid, GetUserId()))
 			continue;
 
+		/*
+		 * Skip system tables, which index_create() would refuse to index
+		 * concurrently.
+		 */
+		if (concurrent && IsSystemNamespace(get_rel_namespace(relid)))
+		{
+			if (!concurrent_warning)
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("concurrent reindex is not supported for catalog relations, skipping all")));
+			concurrent_warning = true;
+			continue;
+		}
+
 		/* Save the list of relation OIDs in private context */
 		old = MemoryContextSwitchTo(private_context);
 
@@ -2479,26 +2504,661 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
 
-			if (options & REINDEXOPT_VERBOSE)
+		if (concurrent)
+		{
+			result = ReindexRelationConcurrently(relid, options);
+			/* ReindexRelationConcurrently() does the verbose output */
+
+			PushActiveSnapshot(GetTransactionSnapshot());
+		}
+		else
+		{
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+			if (result && (options & REINDEXOPT_VERBOSE))
 				ereport(INFO,
 						(errmsg("table \"%s.%s\" was reindexed",
 								get_namespace_name(get_rel_namespace(relid)),
 								get_rel_name(relid))));
+		}
+
+		PopActiveSnapshot();
+		CommitTransactionCommand();
+	}
+	StartTransactionCommand();
+
+	MemoryContextDelete(private_context);
+}
+
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for the given relation Oid. The relation can
+ * be either an index or a table. If a table is specified, each phase is
+ * processed one by one for all of the table's indexes, including the indexes
+ * of its toast table if it has one.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *parentRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *newIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc,
+			   *lc2;
+	MemoryContext private_context;
+	MemoryContext old;
+	char		relkind;
+	char	   *relationName = NULL;
+	char	   *relationNamespace = NULL;
+	PGRUsage	ru0;
+
+	/*
+	 * Create a memory context that will survive forced transaction commits we
+	 * do below.  Since it is a child of PortalContext, it will go away
+	 * eventually even if we suffer an error; there's no need for special
+	 * abort cleanup logic.
+	 */
+	private_context = AllocSetContextCreate(PortalContext,
+											"ReindexConcurrent",
+											ALLOCSET_SMALL_SIZES);
+
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		/* Save data needed by REINDEX VERBOSE in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		relationName = get_rel_name(relationOid);
+		relationNamespace = get_namespace_name(get_rel_namespace(relationOid));
+
+		pg_rusage_init(&ru0);
+
+		MemoryContextSwitchTo(old);
+	}
+
+	relkind = get_rel_relkind(relationOid);
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the relation is a
+	 * table, all its valid indexes will be rebuilt, including the indexes of
+	 * its toast table, if any. If the relkind is an index, that index itself
+	 * will be rebuilt. The locks taken on the parent relations and the
+	 * involved indexes are kept until this transaction commits, to protect
+	 * against schema changes that might occur before the session locks are
+	 * taken; the session locks then similarly protect against schema changes
+	 * across the multiple transactions used by this process.
+	 */
+	switch (relkind)
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes including
+				 * toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				MemoryContextSwitchTo(old);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = table_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+														   ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+					{
+						/* Save the list of relation OIDs in private context */
+						old = MemoryContextSwitchTo(private_context);
+
+						indexIds = lappend_oid(indexIds, cellOid);
+
+						MemoryContextSwitchTo(old);
+					}
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = table_open(toastOid,
+														   ShareUpdateExclusiveLock);
+
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					MemoryContextSwitchTo(old);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+															   ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+						{
+							/*
+							 * Save the list of relation OIDs in private
+							 * context
+							 */
+							old = MemoryContextSwitchTo(private_context);
+
+							indexIds = lappend_oid(indexIds, cellOid);
+
+							MemoryContextSwitchTo(old);
+						}
+
+						index_close(indexRelation, NoLock);
+					}
+
+					table_close(toastRelation, NoLock);
+				}
+
+				table_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			parentOid = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(parentOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(parentOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(parentOid);
+
+				MemoryContextSwitchTo(old);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					indexIds = lappend_oid(indexIds, relationOid);
+
+					MemoryContextSwitchTo(old);
+				}
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		case RELKIND_PARTITIONED_TABLE:
+			/* see reindex_relation() */
+			ereport(WARNING,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("REINDEX of partitioned tables is not yet implemented, skipping \"%s\"",
+							get_rel_name(relationOid))));
+			return false;
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We need first to create an index which is based on the same data as the
+	 * former index except that it will be only registered in catalogs and
+	 * will be built later. It is possible to perform all the operations on
+	 * all the indexes at the same time for a parent relation including
+	 * indexes for its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId	lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent relation; this might be a toast relation */
+		indexParentRel = table_open(indexRel->rd_index->indrelid,
+									ShareUpdateExclusiveLock);
+
+		/* Choose a temporary relation name for the new index */
+		concurrentName = ChooseRelationName(get_rel_name(indOid),
+											NULL,
+											"ccnew",
+											get_rel_namespace(indexRel->rd_index->indrelid),
+											false);
+
+		/* Create new index definition based on given index */
+		concurrentOid = index_concurrently_create_copy(indexParentRel,
+													   indOid,
+													   concurrentName);
+
+		/* Now open the relation of the new index, a lock is also needed on it */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the list of oids and locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Save the new index Oid */
+		newIndexIds = lappend_oid(newIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelids so that each relation can be protected from
+		 * being dropped, then close the relations. The lockrelid of the
+		 * parent relation is not saved here, to avoid taking multiple locks
+		 * on the same relation; we rely instead on parentRelationIds built
+		 * earlier.
+		 */
+		lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, &lockrelid);
+		lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, &lockrelid);
+
+		MemoryContextSwitchTo(old);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		table_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save a lock tag for each parent relation; these are used later to
+	 * wait for other backends whose transactions might conflict with this
+	 * session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = table_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId	lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		LOCKTAG    *heaplocktag;
+
+		/* Save the list of locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Add lockrelid of parent relation to the list of locked relations */
+		relationLocks = lappend(relationLocks, &lockrelid);
+
+		heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid.dbId, lockrelid.relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		MemoryContextSwitchTo(old);
+
+		/* Close heap relation */
+		table_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The new
+	 * index is marked as not ready and invalid so that no other transaction
+	 * will try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on each parent relation,
+	 * old index and new index, to ensure that none of them is dropped until
+	 * the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build each new index in a separate transaction, to avoid keeping open
+	 * transactions for an unnecessarily long time. A concurrent build is
+	 * done for each new index that will replace an old one. Before doing
+	 * that, we wait until no running transaction could have the parent
+	 * table of an index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			tableOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index's concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Index relation has been closed by previous commit, so reopen it to
+		 * get its information.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		tableOid = indexRel->rd_index->indrelid;
+		index_close(indexRel, NoLock);
+
+		/* Perform concurrent build of new index */
+		index_concurrently_build(tableOid, concurrentOid);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update visible for
+		 * concurrent index.
+		 */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the new indexes catch up with any tuples that were
+	 * inserted before they were marked as ready for inserts.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Scan the heap for each new index, then insert any missing index
+	 * entries.
+	 */
+	foreach(lc, newIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		TransactionId limitXmin;
+		Snapshot	snapshot;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for the old indexes
+		 * validation.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate index, which might be a toast */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * We can now do away with our active snapshot, we still need to save
+		 * the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
 		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * This new index is now valid as it contains all the tuples
+		 * necessary. However, it might not have taken into account deleted
+		 * tuples before the reference snapshot was taken, so we need to wait
+		 * for the transactions that might have older snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction now that the new index is valid */
 		CommitTransactionCommand();
 	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the new indexes have been validated, it is necessary to swap
+	 * each new index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes dead at the same
+	 * time to make sure we only get constraint violations from the indexes
+	 * with the correct names.
+	 */
+
+	StartTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		char	   *oldName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseRelationName(get_rel_name(indOid),
+									 NULL,
+									 "ccold",
+									 get_rel_namespace(relOid),
+									 false);
+
+		/* Swap old index with the new one */
+		index_concurrently_swap(concurrentOid, indOid, oldName);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * CCI here so that subsequent iterations see the oldName in the
+		 * catalog and can choose a nonconflicting name for their oldName.
+		 * Otherwise, this could lead to conflicts if a table has two indexes
+		 * whose names are equal for the first NAMEDATALEN-minus-a-few
+		 * characters.
+		 */
+		CommandCounterIncrement();
+	}
+
+	/* Commit this transaction and make index swaps visible */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * Mark the old indexes as dead so they can later be dropped.
+	 *
+	 * Note that it is necessary to wait for virtual locks on the parent
+	 * relation before setting the index as dead.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Finish the index invalidation and set it as dead. */
+		index_concurrently_set_dead(relOid, indOid);
+	}
+
+	/* Commit this transaction to make the updates visible. */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the old indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe as all the old entries are already
+	 * considered invalid and not ready, so they will not be used by other
+	 * backends for any read or write operations.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	/* Get fresh snapshot for next step */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+
+		CHECK_FOR_INTERRUPTS();
+
+		index_concurrently_drop(indOid);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Finally, release the session-level locks on the parent tables.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Log what we did */
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		if (relkind == RELKIND_INDEX)
+			ereport(INFO,
+					(errmsg("index \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+		else
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+	}
+
+	/* Start a new transaction to finish the process properly */
 	StartTransactionCommand();
 
 	MemoryContextDelete(private_context);
+
+	return true;
 }
 
 /*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index ff76499137..ec3df34943 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1257,6 +1257,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	bool		is_partition;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1319,7 +1320,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(get_rel_relkind(relOid)),
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check the case of a system index that might have been invalidated by a
+	 * failed concurrent process and allow its drop. For the time being, this
+	 * only concerns indexes of toast relations that became invalid during a
+	 * REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 3eb7e95d64..0ac1205af5 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4353,6 +4353,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 5c4fa7d077..00874fb9a5 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2098,6 +2098,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c1faf4152c..d09043c6e2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -8266,42 +8266,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 6ec795f1b4..9f8f62b5de 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -774,16 +774,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventInTransactionBlock(isTopLevel,
+											  "REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -799,7 +803,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												  (stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												  (stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												  "REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 5d8634d818..82511e34ac 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -2192,6 +2192,22 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY are not allowed in
+			 * xacts.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 7b7a88fda3..da814bfec8 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3205,12 +3205,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("REINDEX"))
 		COMPLETE_WITH("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+	else if (Matches("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 5dea27016e..24b47d4fc6 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -251,6 +251,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+								 Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, char deptype, Oid *tableId, int32 *colId);
@@ -261,6 +264,8 @@ extern Oid	get_constraint_index(Oid constraintId);
 
 extern Oid	get_index_constraint(Oid indexId);
 
+extern List *get_index_ref_constraints(Oid indexId);
+
 /* in pg_shdepend.c */
 
 extern void recordSharedDependencyOn(ObjectAddress *depender,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 330c481a8b..84dd900dd6 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -65,6 +65,7 @@ extern Oid index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
+			 TupleDesc tupdesc,
 			 bits16 flags,
 			 bits16 constr_flags,
 			 bool allow_system_table_mods,
@@ -77,6 +78,22 @@ extern Oid index_create(Relation heapRelation,
 #define	INDEX_CONSTR_CREATE_UPDATE_INDEX	(1 << 3)
 #define	INDEX_CONSTR_CREATE_REMOVE_OLD_DEPS	(1 << 4)
 
+extern Oid index_concurrently_create_copy(Relation heapRelation,
+										  Oid oldIndexId,
+										  const char *newName);
+
+extern void index_concurrently_build(Oid heapOid,
+									 Oid indexOid);
+
+extern void index_concurrently_swap(Oid newIndexId,
+									Oid oldIndexId,
+									const char *oldName);
+
+extern void index_concurrently_set_dead(Oid heapOid,
+										Oid indexOid);
+
+extern void index_concurrently_drop(Oid indexId);
+
 extern ObjectAddress index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
 						Oid parentConstraintId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index e592a914a4..e11caf2cd1 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -34,10 +34,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_not_in_use,
 			bool skip_build,
 			bool quiet);
-extern void ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern void ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 4ec8a83541..fd353ed7bd 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3307,6 +3307,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 91d9d90135..e32886bacb 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -42,6 +42,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 46deb55c67..f10ff3c5c1 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3292,3 +3292,98 @@ DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
 NOTICE:  drop cascades to 6 other objects
+RESET client_min_messages;
+RESET search_path;
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+ERROR:  conflicting key value violates exclusion constraint "concur_reindex_tab_c3_excl"
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check that comments are preserved
+CREATE TABLE testcomment (i int);
+CREATE INDEX testcomment_idx1 ON testcomment (i);
+COMMENT ON INDEX testcomment_idx1 IS 'test comment';
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+REINDEX TABLE testcomment;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+REINDEX TABLE CONCURRENTLY testcomment ;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+DROP TABLE testcomment;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  can only reindex the currently open database
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+WARNING:  concurrent reindex is not supported for catalog relations, skipping all
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+          Table "public.concur_reindex_tab"
+ Column |   Type    | Collation | Nullable | Default 
+--------+-----------+-----------+----------+---------
+ c1     | integer   |           | not null | 
+ c2     | text      |           |          | 
+ c3     | int4range |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+    "concur_reindex_tab_c3_excl" EXCLUDE USING gist (c3 WITH &&)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 59da6b6592..1669f6a0d8 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1207,3 +1207,64 @@ CREATE ROLE regress_reindexuser NOLOGIN;
 DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
+RESET client_min_messages;
+RESET search_path;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check that comments are preserved
+CREATE TABLE testcomment (i int);
+CREATE INDEX testcomment_idx1 ON testcomment (i);
+COMMENT ON INDEX testcomment_idx1 IS 'test comment';
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+REINDEX TABLE testcomment;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+REINDEX TABLE CONCURRENTLY testcomment ;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+DROP TABLE testcomment;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;

base-commit: e77cfa54d700557ea700d47454c9e570f20f1841
-- 
2.20.1

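For reference, the new syntax added by this patch can be exercised as follows. This is a minimal sketch assuming the patch is applied; the table and index names are illustrative, and the behavior shown matches the regression and isolation tests included above:

```sql
-- Rebuild a single index without blocking concurrent writes on its table
CREATE TABLE demo_tab (id serial PRIMARY KEY, payload text);
CREATE INDEX demo_payload_idx ON demo_tab (payload);

REINDEX INDEX CONCURRENTLY demo_payload_idx;

-- Or rebuild all indexes of the table, including its TOAST index
REINDEX TABLE CONCURRENTLY demo_tab;

-- Like DROP INDEX CONCURRENTLY, this is not allowed in a transaction block
BEGIN;
REINDEX TABLE CONCURRENTLY demo_tab;  -- ERROR: cannot run inside a transaction block
COMMIT;
```

As with CREATE INDEX CONCURRENTLY, the rebuild takes a ShareUpdateExclusiveLock, so reads and writes on the table proceed while the new index is built, validated, swapped in, and the old one dropped.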
#127Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#126)
2 attachment(s)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Tue, Jan 29, 2019 at 09:51:35PM +0100, Peter Eisentraut wrote:

On 16/01/2019 09:27, Michael Paquier wrote:

index_create does not actually need its extra argument with the tuple
descriptor. I think that we had better grab the column name list from
indexInfo and just pass that down to index_create() (patched on my
local branch), so it is an overkill to take a full copy of the index's
TupleDesc.

Please send a fixup patch.

Sure. Attached is a patch which can be applied on top of what you
sent last, based on what I noticed during review, here and there.  You
also forgot to switch two heap_open() to table_open() in pg_depend.c.

The patch, as it stands, is close to 2k lines long, so let's first cut
it into smaller pieces by refactoring the concurrent build code.
Here are some preliminary notes:
- WaitForOlderSnapshots() could be in its own patch.
- index_concurrently_build() and index_concurrently_set_dead() can be
in an independent patch.  set_dead() had actually better be a wrapper
on top of index_set_state_flags, which would be able to set any kind
of flags.
- A couple of pieces in index_create() could be cut as well.

I'm not a fan of that. I had already considered all the ways in which
subparts of this patch could get committed, and some of it was
committed, so what's left now is what I thought should stay together.
The patch isn't really that big and most of it is moving code around. I
would also avoid chopping around in this patch now and focus on getting
it finished instead. The functionality seems solid, so if it's good,
let's commit it, if it's not, let's get it fixed up.

Well, the feature looks solid to me, and not much of its code has
actually changed over the years FWIW.

Committing large and complex patches is something you have more
experience with than myself, and I find the exercise very difficult.
So if you feel it's appropriate to keep things grouped together, that
is not an issue for me.  I'll follow your lead.
--
Michael

Attachments:

reindex-conc-v7-full-michael.patchtext/x-diff; charset=us-asciiDownload
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index bedd9a008d..9b7ef8bf09 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,6 +926,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
          <command>ANALYZE</command>, <command>CREATE INDEX CONCURRENTLY</command>,
+         <command>REINDEX CONCURRENTLY</command>,
          <command>CREATE STATISTICS</command>, and certain <command>ALTER
          INDEX</command> and <command>ALTER TABLE</command> variants (for full
          details see <xref linkend="sql-alterindex"/> and <xref
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 47cef987d4..b0983c7ea1 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="parameter">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="parameter">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
      <para>
       An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
       an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.
      </para>
     </listitem>
 
@@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</productname> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="sql-reindex-concurrently"
+      endterm="sql-reindex-concurrently-title"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
@@ -241,6 +253,161 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
    Each individual partition can be reindexed separately instead.
   </para>
 
+  <refsect2 id="sql-reindex-concurrently">
+   <title id="sql-reindex-concurrently-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="sql-reindex-concurrently">
+    <primary>index</primary>
+    <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</productname> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</productname> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</literal> option of <command>REINDEX</command>. When this option
+    is used, <productname>PostgreSQL</productname> must perform two scans of the table
+    for each index that needs to be rebuilt, and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created, where all
+    the concurrent entries are created using only one transaction. Note that
+    if there are multiple indexes to be rebuilt then each step loops through
+    all the indexes we're rebuilding, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</command> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition will be used to replace the
+       old index. This step is done as a single transaction for all the indexes
+       involved in this process, meaning that if
+       <command>REINDEX CONCURRENTLY</command> is run on a table with multiple
+       indexes, all the catalog entries of the new indexes are created within a
+       single transaction. A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as their
+       parent tables to prevent any schema modification during processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_index.indisready</literal> is
+       switched to <quote>true</quote> to make it ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while
+       the first pass build was running. This step is performed within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       All the constraints and foreign keys which refer to the index are swapped
+       to refer to the new index definition, and the names of the indexes are
+       changed. At this point <literal>pg_index.indisvalid</literal> is switched to
+       <quote>true</quote> for the new index and to <quote>false</quote> for the old, and
+       a cache invalidation is done so that all sessions that referenced the
+       old index are invalidated. This step is done within a single transaction
+       for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Old indexes have <literal>pg_index.indisready</literal> switched to <quote>false</quote>
+       to prevent any new tuple insertions, after waiting for running queries that
+       might reference the old index to complete. This step is done within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The old index definition and its data are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</command>
+    command will fail but leave behind an <quote>invalid</quote> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</application> <command>\d</command> command will report
+    such an index as <literal>INVALID</literal>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid
+    index and try again to perform <command>REINDEX CONCURRENTLY</command>.
+    The concurrent index created during the processing has a name ending in
+    the suffix <literal>ccnew</literal>, or <literal>ccold</literal> if it
+    is an old index definition which we failed to drop. Invalid indexes can
+    be dropped using <literal>DROP INDEX</literal>, including invalid toast
+    indexes.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same
+    table to occur in parallel, but only one concurrent index build can
+    occur on a table at a time. In both cases, no other types of schema
+    modification on the table are allowed meanwhile.  Another difference
+    is that a regular <command>REINDEX TABLE</command> or
+    <command>REINDEX INDEX</command> command can be performed within a
+    transaction block, but <command>REINDEX CONCURRENTLY</command> cannot.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command> since system catalogs cannot be reindexed
+    concurrently.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -272,6 +439,14 @@ $ <userinput>psql broken_db</userinput>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
+</programlisting></para>
+
+  <para>
+   Rebuild the indexes of a table, while allowing read and write operations
+   on the involved relations while it is performed:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
 </programlisting></para>
  </refsect1>
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 169b2de6cf..b84c05736f 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -41,6 +41,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_constraint.h"
+#include "catalog/pg_description.h"
 #include "catalog/pg_depend.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_operator.h"
@@ -739,6 +740,7 @@ index_create(Relation heapRelation,
 			 Oid *constraintId)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
+	Oid			heapNamespaceId = get_rel_namespace(heapRelationId);
 	Relation	pg_class;
 	Relation	indexRelation;
 	TupleDesc	indexTupDesc;
@@ -791,10 +793,12 @@ index_create(Relation heapRelation,
 
 	/*
 	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * release locks before committing in catalogs.  Toast relations are fine,
+	 * though, as they are associated with a root relation that can be
+	 * reindexed concurrently.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemNamespace(heapNamespaceId))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
@@ -1200,6 +1204,512 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+/*
+ * index_create_copy_concurrent
+ *
+ * Create concurrently an index based on the definition of the one provided by
+ * caller.  The index is inserted into catalogs and needs to be built later
+ * on.  This is called during concurrent reindex processing.
+ */
+Oid
+index_create_copy_concurrent(Relation heapRelation, Oid oldIndexId,
+							 const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			newIndexId = InvalidOid;
+	HeapTuple	indexTuple,
+				classTuple;
+	Datum		indclassDatum,
+				colOptionDatum,
+				optionDatum;
+	TupleDesc	indexTupDesc;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	List	   *indexColNames = NIL;
+	int			i;
+
+	indexRelation = index_open(oldIndexId, RowExclusiveLock);
+
+	/* New index uses the same index information as old index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/* Do not copy exclusion constraint */
+	indexInfo->ii_ExclusionOps = NULL;
+	indexInfo->ii_ExclusionProcs = NULL;
+	indexInfo->ii_ExclusionStrats = NULL;
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", oldIndexId);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, oldIndexId);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", oldIndexId);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/*
+	 * Extract the list of column names to be used for the index
+	 * creation.
+	 */
+	indexTupDesc = RelationGetDescr(indexRelation);
+	for (i = 0; i < indexInfo->ii_NumIndexAttrs; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(indexTupDesc, i);
+
+		/* Grab the column name and save it to the list */
+		indexColNames = lappend(indexColNames, NameStr(att->attname));
+	}
+
+	/* Now create the new index */
+	newIndexId = index_create(heapRelation,
+							  newName,
+							  InvalidOid,	/* indexRelationId */
+							  InvalidOid,	/* parentIndexRelid */
+							  InvalidOid,	/* parentConstraintId */
+							  InvalidOid,	/* relFileNode */
+							  indexInfo,
+							  indexColNames,
+							  indexRelation->rd_rel->relam,
+							  indexRelation->rd_rel->reltablespace,
+							  indexRelation->rd_indcollation,
+							  indclass->values,
+							  indcoloptions->values,
+							  optionDatum,
+							  INDEX_CREATE_SKIP_BUILD | INDEX_CREATE_CONCURRENT,
+							  0,
+							  true,	/* allow table to be a system catalog? */
+							  false, /* is_internal? */
+							  NULL);
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return newIndexId;
+}
+
+/*
+ * index_build_concurrent
+ *
+ * Build index for a concurrent operation.  Low-level locks are taken when
+ * this operation is performed, strong enough only to prevent schema changes;
+ * they need to be kept until the end of the transaction performing it.
+ * 'indexOid' refers to an index relation OID already created as part of
+ * previous processing, and 'heapOid' refers to its parent heap relation.
+ */
+void
+index_build_concurrent(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRel,
+				indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* This had better make sure that a snapshot is active */
+	Assert(ActiveSnapshotSet());
+
+	/* Open and lock the parent heap relation */
+	heapRel = table_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in the
+	 * commit of the transaction where this concurrent index was created at
+	 * the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, false, true);
+
+	/* Close both relations, and keep the locks */
+	table_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts.  Once
+	 * we commit this transaction, any new transactions that open the table
+	 * must insert new entries into the index for insertions and non-HOT
+	 * updates.
+	 */
+	index_set_state_flags(indexOid, INDEX_CREATE_SET_READY);
+}
+
+/*
+ * index_swap_concurrent
+ *
+ * Swap name, dependencies, and constraints of the old index over to the new
+ * index, while marking the old index as invalid and the new as valid.
+ */
+void
+index_swap_concurrent(Oid newIndexId, Oid oldIndexId, const char *oldName)
+{
+	Relation	pg_class,
+				pg_index,
+				pg_constraint,
+				pg_trigger;
+	Relation	oldClassRel,
+				newClassRel;
+	HeapTuple	oldClassTuple,
+				newClassTuple;
+	Form_pg_class oldClassForm,
+				newClassForm;
+	HeapTuple	oldIndexTuple,
+				newIndexTuple;
+	Form_pg_index oldIndexForm,
+				newIndexForm;
+	Oid			indexConstraintOid;
+	List	   *constraintOids = NIL;
+	ListCell   *lc;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldClassRel = relation_open(oldIndexId, ShareUpdateExclusiveLock);
+	newClassRel = relation_open(newIndexId, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = table_open(RelationRelationId, RowExclusiveLock);
+
+	oldClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldClassForm = (Form_pg_class) GETSTRUCT(oldClassTuple);
+	newClassForm = (Form_pg_class) GETSTRUCT(newClassTuple);
+
+	/* Swap the names */
+	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
+	namestrcpy(&oldClassForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
+	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
+
+	heap_freetuple(oldClassTuple);
+	heap_freetuple(newClassTuple);
+
+	/* Now swap index info */
+	pg_index = table_open(IndexRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldIndexForm = (Form_pg_index) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
+
+	/*
+	 * Copy constraint flags from the old index. This is safe because the old
+	 * index guaranteed uniqueness.
+	 */
+	newIndexForm->indisprimary = oldIndexForm->indisprimary;
+	oldIndexForm->indisprimary = false;
+	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+	oldIndexForm->indisexclusion = false;
+	newIndexForm->indimmediate = oldIndexForm->indimmediate;
+	oldIndexForm->indimmediate = true;
+
+	/* Mark the new index as valid and the old as invalid, as per index_set_state_flags */
+	newIndexForm->indisvalid = true;
+	oldIndexForm->indisvalid = false;
+	oldIndexForm->indisclustered = false;
+
+	CatalogTupleUpdate(pg_index, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_index, &newIndexTuple->t_self, newIndexTuple);
+
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+
+	/*
+	 * Move constraints and triggers over to the new index
+	 */
+
+	constraintOids = get_index_ref_constraints(oldIndexId);
+
+	indexConstraintOid = get_index_constraint(oldIndexId);
+
+	if (OidIsValid(indexConstraintOid))
+		constraintOids = lappend_oid(constraintOids, indexConstraintOid);
+
+	pg_constraint = table_open(ConstraintRelationId, RowExclusiveLock);
+	pg_trigger = table_open(TriggerRelationId, RowExclusiveLock);
+
+	foreach(lc, constraintOids)
+	{
+		HeapTuple	constraintTuple,
+					triggerTuple;
+		Form_pg_constraint conForm;
+		ScanKeyData key[1];
+		SysScanDesc scan;
+		Oid			constraintOid = lfirst_oid(lc);
+
+		/* Move the constraint from the old to the new index */
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		conForm = ((Form_pg_constraint) GETSTRUCT(constraintTuple));
+
+		if (conForm->conindid == oldIndexId)
+		{
+			conForm->conindid = newIndexId;
+
+			CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+		}
+
+		heap_freetuple(constraintTuple);
+
+		/* Search for trigger records */
+		ScanKeyInit(&key[0],
+					Anum_pg_trigger_tgconstraint,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(constraintOid));
+
+		scan = systable_beginscan(pg_trigger, TriggerConstraintIndexId, true,
+								  NULL, 1, key);
+
+		while (HeapTupleIsValid((triggerTuple = systable_getnext(scan))))
+		{
+			Form_pg_trigger tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			if (tgForm->tgconstrindid != oldIndexId)
+				continue;
+
+			/* Make a modifiable copy */
+			triggerTuple = heap_copytuple(triggerTuple);
+			tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			tgForm->tgconstrindid = newIndexId;
+
+			CatalogTupleUpdate(pg_trigger, &triggerTuple->t_self, triggerTuple);
+
+			heap_freetuple(triggerTuple);
+		}
+
+		systable_endscan(scan);
+	}
+
+	/*
+	 * Move comment if any.
+	 */
+	{
+		Relation	description;
+		ScanKeyData skey[3];
+		SysScanDesc sd;
+		HeapTuple	tuple;
+		Datum		values[Natts_pg_description] = {0};
+		bool		nulls[Natts_pg_description] = {0};
+		bool		replaces[Natts_pg_description] = {0};
+
+		values[Anum_pg_description_objoid - 1] = ObjectIdGetDatum(newIndexId);
+		replaces[Anum_pg_description_objoid - 1] = true;
+
+		ScanKeyInit(&skey[0],
+					Anum_pg_description_objoid,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(oldIndexId));
+		ScanKeyInit(&skey[1],
+					Anum_pg_description_classoid,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(RelationRelationId));
+		ScanKeyInit(&skey[2],
+					Anum_pg_description_objsubid,
+					BTEqualStrategyNumber, F_INT4EQ,
+					Int32GetDatum(0));
+
+		description = table_open(DescriptionRelationId, RowExclusiveLock);
+
+		sd = systable_beginscan(description, DescriptionObjIndexId, true,
+								NULL, 3, skey);
+
+		while ((tuple = systable_getnext(sd)) != NULL)
+		{
+			tuple = heap_modify_tuple(tuple, RelationGetDescr(description),
+									  values, nulls, replaces);
+			CatalogTupleUpdate(description, &tuple->t_self, tuple);
+
+			break;					/* Assume there can be only one match */
+		}
+
+		systable_endscan(sd);
+		table_close(description, NoLock);
+	}
+
+	/*
+	 * Move all dependencies on the old index to the new one.
+	 */
+
+	if (OidIsValid(indexConstraintOid))
+	{
+		ObjectAddress myself,
+					referenced;
+
+		/* Change to having the new index depend on the constraint */
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexId,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexId;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = indexConstraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
+
+	/* Close relations */
+	table_close(pg_class, RowExclusiveLock);
+	table_close(pg_index, RowExclusiveLock);
+	table_close(pg_constraint, RowExclusiveLock);
+	table_close(pg_trigger, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldClassRel, NoLock);
+	relation_close(newClassRel, NoLock);
+}
+
+/*
+ * index_set_dead_concurrent
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all the backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction doing calling this
+ * function.
+ */
+void
+index_set_dead_concurrent(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRelation,
+				indexRelation;
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're about
+	 * to stop doing inserts into the index which could show conflicts with
+	 * existing predicate locks, so now is the time to move them to the heap
+	 * relation.
+	 */
+	heapRelation = table_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just might
+	 * have it open for updating it.  So now we can unset indisready and
+	 * indislive, then wait till nobody could be using it at all anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit all
+	 * sessions will refresh the table's index list.  Forgetting just the
+	 * index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	table_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_drop_concurrent
+ *
+ * Drop a single index concurrently as the last step of a concurrent index
+ * operation. Deletion is done through performDeletion, or dependencies of
+ * the index would not get dropped. At this point all the indexes are already
+ * considered invalid and dead, so they can be dropped without using any
+ * concurrent options, as it is certain that they will not interact with other
+ * server sessions.
+ */
+void
+index_drop_concurrent(Oid indexId)
+{
+	Oid			constraintOid = get_index_constraint(indexId);
+	ObjectAddress object;
+	Form_pg_index indexForm;
+	Relation	pg_index;
+	HeapTuple	indexTuple;
+
+	/*
+	 * Check that the index dropped here is not alive; otherwise it might
+	 * still be used by other backends.
+	 */
+	pg_index = table_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexId);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, just to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexId);
+
+	/* Clean up */
+	table_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process. Register
+	 * constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexId;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object, DROP_RESTRICT, 0);
+}
+
 /*
  * index_constraint_create
  *
@@ -1589,36 +2099,8 @@ index_drop(Oid indexId, bool concurrent)
 		 */
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = table_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		table_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_set_dead_concurrent(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index 2b8f651c99..b5b8f62b19 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -375,6 +375,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 	return count;
 }
 
+/*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+					 Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = table_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't rename stuff out
+	 * of pg_catalog, which would likely be a bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot remove dependency on %s because it is a system object",
+						getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	table_close(depRel, RowExclusiveLock);
+
+	return count;
+}
+
 /*
  * isObjectPinned()
  *
@@ -734,3 +822,58 @@ get_index_constraint(Oid indexId)
 
 	return constraintId;
 }
+
+/*
+ * get_index_ref_constraints
+ *		Given the OID of an index, return the OID of all foreign key
+ *		constraints which reference the index.
+ */
+List *
+get_index_ref_constraints(Oid indexId)
+{
+	List	   *result = NIL;
+	Relation	depRel;
+	ScanKeyData key[3];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	/* Search the dependency table for the index */
+	depRel = table_open(DependRelationId, AccessShareLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexId));
+	ScanKeyInit(&key[2],
+				Anum_pg_depend_refobjsubid,
+				BTEqualStrategyNumber, F_INT4EQ,
+				Int32GetDatum(0));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 3, key);
+
+	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+	{
+		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+
+		/*
+		 * We assume any normal dependency from a constraint must be what we
+		 * are looking for.
+		 */
+		if (deprec->classid == ConstraintRelationId &&
+			deprec->objsubid == 0 &&
+			deprec->deptype == DEPENDENCY_NORMAL)
+		{
+			result = lappend_oid(result, deprec->objid);
+		}
+	}
+
+	systable_endscan(scan);
+	table_close(depRel, AccessShareLock);
+
+	return result;
+}
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index bd85099c28..8a80308c3b 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -57,6 +57,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/partcache.h"
+#include "utils/pg_rusage.h"
 #include "utils/regproc.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -82,6 +83,7 @@ static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 static void ReindexPartitionedIndex(Relation parentIdx);
 
 /*
@@ -296,6 +298,90 @@ CheckIndexCompatible(Oid oldId,
 	return ret;
 }
 
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given
+ * xmin limit, because the new index might not contain tuples deleted just
+ * before the reference snapshot was taken. Obtain a list of VXIDs of such
+ * transactions, and wait for them individually.  This is used when building
+ * an index concurrently.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see the
+ * index worked on.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int			i,
+				n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue;			/* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int			n_newer_snapshots;
+			int			j;
+			int			k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue;	/* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
 /*
  * DefineIndex
  *		Creates a new index.
@@ -344,7 +430,6 @@ DefineIndex(Oid relationId,
 	List	   *indexColNames;
 	List	   *allIndexParams;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -359,9 +444,7 @@ DefineIndex(Oid relationId,
 	int			numberOfAttributes;
 	int			numberOfKeyAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -1150,34 +1233,11 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = table_open(relationId, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, false, true);
-
-	/* Close both the relations, but keep the locks */
-	table_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_build_concurrent(relationId, indexRelationId);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -1249,74 +1309,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots) /* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -2202,7 +2197,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 void
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -2214,7 +2209,8 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
 									  0,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
@@ -2234,7 +2230,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 }
 
 /*
@@ -2302,18 +2301,26 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, 0,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   0,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -2331,7 +2338,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -2343,6 +2350,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	List	   *relids = NIL;
 	ListCell   *l;
 	int			num_keys;
+	bool		concurrent_warning = false;
 
 	AssertArg(objectName);
 	Assert(objectKind == REINDEX_OBJECT_SCHEMA ||
@@ -2451,6 +2459,20 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 			!pg_class_ownercheck(relid, GetUserId()))
 			continue;
 
+		/*
+		 * Skip system tables that index_create() would refuse to index
+		 * concurrently.
+		 */
+		if (concurrent && IsSystemNamespace(get_rel_namespace(relid)))
+		{
+			if (!concurrent_warning)
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("concurrent reindex is not supported for catalog relations, skipping all")));
+			concurrent_warning = true;
+			continue;
+		}
+
 		/* Save the list of relation OIDs in private context */
 		old = MemoryContextSwitchTo(private_context);
 
@@ -2477,20 +2499,33 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
 
-			if (options & REINDEXOPT_VERBOSE)
+		if (concurrent)
+		{
+			result = ReindexRelationConcurrently(relid, options);
+			/* ReindexRelationConcurrently() does the verbose output */
+
+			PushActiveSnapshot(GetTransactionSnapshot());
+		}
+		else
+		{
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+			if (result && (options & REINDEXOPT_VERBOSE))
 				ereport(INFO,
 						(errmsg("table \"%s.%s\" was reindexed",
 								get_namespace_name(get_rel_namespace(relid)),
 								get_rel_name(relid))));
+		}
+
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 	}
@@ -2499,6 +2534,629 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	MemoryContextDelete(private_context);
 }
 
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for a given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each phase is
+ * processed one by one for all of the table's indexes, as well as the
+ * indexes of its toast relation if it has one.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *parentRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *newIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc,
+			   *lc2;
+	MemoryContext private_context;
+	MemoryContext old;
+	char		relkind;
+	char	   *relationName = NULL;
+	char	   *relationNamespace = NULL;
+	PGRUsage	ru0;
+
+	/*
+	 * Create a memory context that will survive forced transaction commits we
+	 * do below.  Since it is a child of PortalContext, it will go away
+	 * eventually even if we suffer an error; there's no need for special
+	 * abort cleanup logic.
+	 */
+	private_context = AllocSetContextCreate(PortalContext,
+											"ReindexConcurrent",
+											ALLOCSET_SMALL_SIZES);
+
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		/* Save data needed by REINDEX VERBOSE in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		relationName = get_rel_name(relationOid);
+		relationNamespace = get_namespace_name(get_rel_namespace(relationOid));
+
+		pg_rusage_init(&ru0);
+
+		MemoryContextSwitchTo(old);
+	}
+
+	relkind = get_rel_relkind(relationOid);
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given relation
+	 * Oid is a table, all its valid indexes will be rebuilt, including its
+	 * associated toast table indexes. If the relkind is an index, the index
+	 * itself will be rebuilt. The locks taken on the parent relations and the
+	 * involved indexes are kept until this transaction is committed, to
+	 * protect against schema changes that might occur before the session lock
+	 * is taken on each relation; that session lock similarly protects against
+	 * any schema change that could happen within the multiple transactions
+	 * used during this process.
+	 */
+	switch (relkind)
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes including
+				 * toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				MemoryContextSwitchTo(old);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = table_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+														   ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+					{
+						/* Save the list of relation OIDs in private context */
+						old = MemoryContextSwitchTo(private_context);
+
+						indexIds = lappend_oid(indexIds, cellOid);
+
+						MemoryContextSwitchTo(old);
+					}
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = table_open(toastOid,
+														   ShareUpdateExclusiveLock);
+
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					MemoryContextSwitchTo(old);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+															   ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+						{
+							/*
+							 * Save the list of relation OIDs in private
+							 * context
+							 */
+							old = MemoryContextSwitchTo(private_context);
+
+							indexIds = lappend_oid(indexIds, cellOid);
+
+							MemoryContextSwitchTo(old);
+						}
+
+						index_close(indexRelation, NoLock);
+					}
+
+					table_close(toastRelation, NoLock);
+				}
+
+				table_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			parentOid = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(parentOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(parentOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(parentOid);
+
+				MemoryContextSwitchTo(old);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					indexIds = lappend_oid(indexIds, relationOid);
+
+					MemoryContextSwitchTo(old);
+				}
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		case RELKIND_PARTITIONED_TABLE:
+			/* see reindex_relation() */
+			ereport(WARNING,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("REINDEX of partitioned tables is not yet implemented, skipping \"%s\"",
+							get_rel_name(relationOid))));
+			return false;
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We need first to create an index which is based on the same data as the
+	 * former index except that it will be only registered in catalogs and
+	 * will be built later. It is possible to perform all the operations on
+	 * all the indexes at the same time for a parent relation including
+	 * indexes for its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId	lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent, which might be a plain or toast relation */
+		indexParentRel = table_open(indexRel->rd_index->indrelid,
+									ShareUpdateExclusiveLock);
+
+		/* Choose a temporary relation name for the new index */
+		concurrentName = ChooseRelationName(get_rel_name(indOid),
+											NULL,
+											"ccnew",
+											get_rel_namespace(indexRel->rd_index->indrelid),
+											false);
+
+		/* Create new index definition based on given index */
+		concurrentOid = index_create_copy_concurrent(indexParentRel,
+													 indOid,
+													 concurrentName);
+
+		/* Now open the relation of the new index, a lock is also needed on it */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the list of oids and locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Save the new index Oid */
+		newIndexIds = lappend_oid(newIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelids to protect each relation from being dropped,
+		 * then close the relations. The lockrelid of the parent relation is
+		 * not saved here, to avoid taking duplicate locks on the same
+		 * relation; instead we rely on parentRelationIds built earlier.
+		 */
+		lockrelid = indexRel->rd_lockInfo.lockRelId;
+		/* append a palloc'd copy, not the address of this local variable */
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+		lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		MemoryContextSwitchTo(old);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		table_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap lock for the visibility checks that follow, since other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = table_open(lfirst_oid(lc),
+											  ShareUpdateExclusiveLock);
+		LockRelId	lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		LOCKTAG    *heaplocktag;
+
+		/* Save the list of locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Add lockrelid of parent relation to the list of locked relations */
+		/* append a palloc'd copy, not the address of this local variable */
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid.dbId, lockrelid.relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		MemoryContextSwitchTo(old);
+
+		/* Close heap relation */
+		table_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so as no other transactions will try
+	 * is marked as not ready and invalid so that no other transactions will try
+	 *
+	 * Before committing, get a session-level lock on the relation, the
+	 * concurrent index and its copy to ensure that none of them are dropped
+	 * until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the new indexes in a separate transaction for each index to avoid
+	 * having open transactions for an unnecessarily long time. A concurrent
+	 * build is done for each index that will replace an old index. Before
+	 * doing that, we must wait until no running transaction could still have
+	 * the parent table of the index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index's concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Index relation has been closed by previous commit, so reopen it to
+		 * determine if it is used as a primary key.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+
+		/* Perform concurrent build of new index */
+		index_build_concurrent(indexRel->rd_index->indrelid, concurrentOid);
+
+		/* Keep lock until the end of this transaction */
+		index_close(indexRel, NoLock);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update visible for
+		 * Commit this transaction to make the indisready update of the
+		 * concurrent index visible to other transactions.
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the new indexes catch up with any new tuples that
+	 * were created during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Scan the heap for each new index, then insert any missing index
+	 * entries.
+	 */
+	foreach(lc, newIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		TransactionId limitXmin;
+		Snapshot	snapshot;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for the new index's
+		 * validation.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate index, which might be a toast */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * We can now do away with our active snapshot, we still need to save
+		 * the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * This new index is now valid as it contains all the tuples
+		 * necessary. However, it might not contain tuples deleted just before
+		 * the reference snapshot was taken, so we have to wait out any
+		 * transactions that might have older snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction now that the new index is valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the new indexes have been validated, it is necessary to swap
+	 * each new index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes dead at the same
+	 * time to make sure we only get constraint violations from the indexes
+	 * with the correct names.
+	 */
+
+	StartTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		char	   *oldName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseRelationName(get_rel_name(indOid),
+									 NULL,
+									 "ccold",
+									 get_rel_namespace(relOid),
+									 false);
+
+		/* Swap old index with the new one */
+		index_swap_concurrent(concurrentOid, indOid, oldName);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * CCI here so that subsequent iterations see the oldName in the
+		 * catalog and can choose a nonconflicting name for their oldName.
+		 * Otherwise, this could lead to conflicts if a table has two indexes
+		 * whose names are equal for the first NAMEDATALEN-minus-a-few
+		 * characters.
+		 */
+		CommandCounterIncrement();
+	}
+
+	/* Commit this transaction and make index swaps visible */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * Mark the old indexes as dead so they can later be dropped.
+	 *
+	 * Note that it is necessary to wait for virtual locks on the parent
+	 * relation before setting the index as dead.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Finish the index invalidation and set it as dead. */
+		index_set_dead_concurrent(relOid, indOid);
+	}
+
+	/* Commit this transaction to make the updates visible. */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the old indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe because all the old entries are already
+	 * marked as invalid and not ready, so they will not be used by other
+	 * backends for any read or write operations.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	/* Get fresh snapshot for next step */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+
+		CHECK_FOR_INTERRUPTS();
+
+		index_drop_concurrent(indOid);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Finally release the session-level lock on the parent table.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Log what we did */
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		if (relkind == RELKIND_INDEX)
+			ereport(INFO,
+					(errmsg("index \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+		else
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+	}
+
+	/* Start a new transaction to finish the process properly */
+	StartTransactionCommand();
+
+	MemoryContextDelete(private_context);
+
+	return true;
+}
+
 /*
  *	ReindexPartitionedIndex
  *		Reindex each child of the given partitioned index.
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 434be403fe..35bdc6a02f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1253,6 +1253,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	bool		is_partition;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1315,7 +1316,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(get_rel_relkind(relOid)),
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check whether this is a system index that was invalidated by a failed
+	 * concurrent operation, and if so allow it to be dropped. For the time
+	 * being, this only concerns indexes of toast relations that became
+	 * invalid during a REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 807393dfaa..aa4180f48f 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4353,6 +4353,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index a397de155e..2468fa9228 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2098,6 +2098,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c1faf4152c..d09043c6e2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -8266,42 +8266,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 6ec795f1b4..9f8f62b5de 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -774,16 +774,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventInTransactionBlock(isTopLevel,
+											  "REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -799,7 +803,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												  (stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												  (stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												  "REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 5d8634d818..82511e34ac 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -2192,6 +2192,22 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY isn't allowed in
+			 * xacts.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 7b7a88fda3..da814bfec8 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3205,12 +3205,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("REINDEX"))
 		COMPLETE_WITH("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+	else if (Matches("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 5dea27016e..24b47d4fc6 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -251,6 +251,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+								 Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, char deptype, Oid *tableId, int32 *colId);
@@ -261,6 +264,8 @@ extern Oid	get_constraint_index(Oid constraintId);
 
 extern Oid	get_index_constraint(Oid indexId);
 
+extern List *get_index_ref_constraints(Oid indexId);
+
 /* in pg_shdepend.c */
 
 extern void recordSharedDependencyOn(ObjectAddress *depender,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 330c481a8b..803fa6a2c8 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -110,6 +110,21 @@ extern void index_build(Relation heapRelation,
 			bool isreindex,
 			bool parallel);
 
+extern void index_build_concurrent(Oid heapOid, Oid indexOid);
+
+extern Oid index_create_copy_concurrent(Relation heapRelation,
+										Oid oldIndexId,
+										const char *newName);
+
+extern void index_swap_concurrent(Oid newIndexId,
+								  Oid oldIndexId,
+								  const char *oldName);
+
+extern void index_set_dead_concurrent(Oid heapOid,
+									  Oid indexOid);
+
+extern void index_drop_concurrent(Oid indexId);
+
 extern double IndexBuildHeapScan(Relation heapRelation,
 				   Relation indexRelation,
 				   IndexInfo *indexInfo,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index e592a914a4..e11caf2cd1 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -34,10 +34,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_not_in_use,
 			bool skip_build,
 			bool quiet);
-extern void ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern void ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 4ec8a83541..fd353ed7bd 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3307,6 +3307,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 91d9d90135..e32886bacb 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -42,6 +42,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 46deb55c67..f10ff3c5c1 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3292,3 +3292,98 @@ DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
 NOTICE:  drop cascades to 6 other objects
+RESET client_min_messages;
+RESET search_path;
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+ERROR:  conflicting key value violates exclusion constraint "concur_reindex_tab_c3_excl"
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check that comments are preserved
+CREATE TABLE testcomment (i int);
+CREATE INDEX testcomment_idx1 ON testcomment (i);
+COMMENT ON INDEX testcomment_idx1 IS 'test comment';
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+REINDEX TABLE testcomment;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+REINDEX TABLE CONCURRENTLY testcomment ;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+DROP TABLE testcomment;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  can only reindex the currently open database
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+WARNING:  concurrent reindex is not supported for catalog relations, skipping all
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+          Table "public.concur_reindex_tab"
+ Column |   Type    | Collation | Nullable | Default 
+--------+-----------+-----------+----------+---------
+ c1     | integer   |           | not null | 
+ c2     | text      |           |          | 
+ c3     | int4range |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+    "concur_reindex_tab_c3_excl" EXCLUDE USING gist (c3 WITH &&)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 59da6b6592..1669f6a0d8 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1207,3 +1207,64 @@ RESET ROLE;
 DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
+RESET client_min_messages;
+RESET search_path;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check that comments are preserved
+CREATE TABLE testcomment (i int);
+CREATE INDEX testcomment_idx1 ON testcomment (i);
+COMMENT ON INDEX testcomment_idx1 IS 'test comment';
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+REINDEX TABLE testcomment;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+REINDEX TABLE CONCURRENTLY testcomment ;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+DROP TABLE testcomment;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
reindex-conc-v7-review.patch (text/x-diff; charset=us-ascii)
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index ee22c267c1..b0983c7ea1 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -386,9 +386,10 @@ Indexes:
     The recommended recovery method in such cases is to drop the invalid
     index and try again to perform <command>REINDEX CONCURRENTLY</command>.
     The concurrent index created during the processing has a name ending in
-    the suffix <literal>ccnew</literal>, or <literal>ccold</literal> if it is an old index definition which we failed
-    to drop. Invalid indexes can be dropped using <literal>DROP INDEX</literal>,
-    including invalid toast indexes.
+    the suffix <literal>ccnew</literal>, or <literal>ccold</literal> if it
+    is an old index definition which we failed to drop. Invalid indexes can
+    be dropped using <literal>DROP INDEX</literal>, including invalid toast
+    indexes.
    </para>
 
    <para>
@@ -396,9 +397,9 @@ Indexes:
     table to occur in parallel, but only one concurrent index build can
     occur on a table at a time. In both cases, no other types of schema
     modification on the table are allowed meanwhile.  Another difference
-    is that a regular <command>REINDEX TABLE</command> or <command>REINDEX INDEX</command>
-    command can be performed within a transaction block, but
-    <command>REINDEX CONCURRENTLY</command> cannot.
+    is that a regular <command>REINDEX TABLE</command> or
+    <command>REINDEX INDEX</command> command can be performed within a
+    transaction block, but <command>REINDEX CONCURRENTLY</command> cannot.
    </para>
 
    <para>
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 9b3a742663..b84c05736f 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -693,7 +693,6 @@ UpdateIndexRelation(Oid indexoid,
  * classObjectId: array of index opclass OIDs, one per index column
  * coloptions: array of per-index-column indoption settings
  * reloptions: AM-specific options
- * tupdesc: Tuple descriptor used for the index if defined
  * flags: bitmask that can include any combination of these bits:
  *		INDEX_CREATE_IS_PRIMARY
  *			the index is a primary key
@@ -734,7 +733,6 @@ index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
-			 TupleDesc tupdesc,
 			 bits16 flags,
 			 bits16 constr_flags,
 			 bool allow_system_table_mods,
@@ -742,6 +740,7 @@ index_create(Relation heapRelation,
 			 Oid *constraintId)
 {
 	Oid			heapRelationId = RelationGetRelid(heapRelation);
+	Oid			heapNamespaceId = get_rel_namespace(heapRelationId);
 	Relation	pg_class;
 	Relation	indexRelation;
 	TupleDesc	indexTupDesc;
@@ -794,10 +793,12 @@ index_create(Relation heapRelation,
 
 	/*
 	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * release locks before committing in catalogs.  Toast relations are fine
+	 * though, as they are associated with a root relation that can be
+	 * reindexed concurrently.
 	 */
 	if (concurrent &&
-		IsSystemNamespace(get_rel_namespace(heapRelationId)))
+		IsSystemNamespace(heapNamespaceId))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
@@ -865,20 +866,14 @@ index_create(Relation heapRelation,
 	}
 
 	/*
-	 * construct tuple descriptor for index tuples if not passed by caller
+	 * construct tuple descriptor for index tuples
 	 */
-	if (!tupdesc)
-		indexTupDesc = ConstructTupleDescriptor(heapRelation,
-												indexInfo,
-												indexColNames,
-												accessMethodObjectId,
-												collationObjectId,
-												classObjectId);
-	else
-	{
-		Assert(indexColNames == NIL);
-		indexTupDesc = tupdesc;
-	}
+	indexTupDesc = ConstructTupleDescriptor(heapRelation,
+											indexInfo,
+											indexColNames,
+											accessMethodObjectId,
+											collationObjectId,
+											classObjectId);
 
 	/*
 	 * Allocate an OID for the index, unless we were told what to use.
@@ -1210,14 +1205,15 @@ index_create(Relation heapRelation,
 }
 
 /*
- * index_concurrently_create_copy
+ * index_create_copy_concurrent
  *
  * Create concurrently an index based on the definition of the one provided by
  * caller.  The index is inserted into catalogs and needs to be built later
  * on.  This is called during concurrent reindex processing.
  */
 Oid
-index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char *newName)
+index_create_copy_concurrent(Relation heapRelation, Oid oldIndexId,
+							 const char *newName)
 {
 	Relation	indexRelation;
 	IndexInfo  *indexInfo;
@@ -1231,6 +1227,8 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char
 	oidvector  *indclass;
 	int2vector *indcoloptions;
 	bool		isnull;
+	List	   *indexColNames = NIL;
+	int			i;
 
 	indexRelation = index_open(oldIndexId, RowExclusiveLock);
 
@@ -1242,9 +1240,6 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char
 	indexInfo->ii_ExclusionProcs = NULL;
 	indexInfo->ii_ExclusionStrats = NULL;
 
-	/* Create a copy of the tuple descriptor to be used for the new entry */
-	indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
-
 	/* Get the array of class and column options IDs from index info */
 	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(oldIndexId));
 	if (!HeapTupleIsValid(indexTuple))
@@ -1266,6 +1261,19 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char
 	optionDatum = SysCacheGetAttr(RELOID, classTuple,
 								  Anum_pg_class_reloptions, &isnull);
 
+	/*
+	 * Extract the list of column names to be used for the index
+	 * creation.
+	 */
+	indexTupDesc = RelationGetDescr(indexRelation);
+	for (i = 0; i < indexInfo->ii_NumIndexAttrs; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(indexTupDesc, i);
+
+		/* Grab the column name and save it to the list */
+		indexColNames = lappend(indexColNames, NameStr(att->attname));
+	}
+
 	/* Now create the new index */
 	newIndexId = index_create(heapRelation,
 							  newName,
@@ -1274,14 +1282,13 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char
 							  InvalidOid,	/* parentConstraintId */
 							  InvalidOid,	/* relFileNode */
 							  indexInfo,
-							  NIL,
+							  indexColNames,
 							  indexRelation->rd_rel->relam,
 							  indexRelation->rd_rel->reltablespace,
 							  indexRelation->rd_indcollation,
 							  indclass->values,
 							  indcoloptions->values,
 							  optionDatum,
-							  indexTupDesc,
 							  INDEX_CREATE_SKIP_BUILD | INDEX_CREATE_CONCURRENT,
 							  0,
 							  true,	/* allow table to be a system catalog? */
@@ -1297,20 +1304,24 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char
 }
 
 /*
- * index_concurrently_build
+ * index_build_concurrent
  *
  * Build index for a concurrent operation.  Low-level locks are taken when
- * this operation is performed to prevent only schema changes, but they need to
- * be kept until the end of the transaction performing this operation.
+ * this operation is performed to prevent only schema changes, but they need
+ * to be kept until the end of the transaction performing this operation.
+ * 'indexOid' refers to an index relation OID already created as part of
+ * previous processing, and 'heapOid' refers to its parent heap relation.
  */
 void
-index_concurrently_build(Oid heapOid,
-						 Oid indexOid)
+index_build_concurrent(Oid heapOid, Oid indexOid)
 {
 	Relation	heapRel,
 				indexRelation;
 	IndexInfo  *indexInfo;
 
+	/* This had better make sure that a snapshot is active */
+	Assert(ActiveSnapshotSet());
+
 	/* Open and lock the parent heap relation */
 	heapRel = table_open(heapOid, ShareUpdateExclusiveLock);
 
@@ -1344,13 +1355,13 @@ index_concurrently_build(Oid heapOid,
 }
 
 /*
- * index_concurrently_swap
+ * index_swap_concurrent
  *
  * Swap name, dependencies, and constraints of the old index over to the new
  * index, while marking the old index as invalid and the new as valid.
  */
 void
-index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
+index_swap_concurrent(Oid newIndexId, Oid oldIndexId, const char *oldName)
 {
 	Relation	pg_class,
 				pg_index,
@@ -1417,8 +1428,8 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
 	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
 
 	/*
-	 * Copy constraint flags for old index. This is safe because the old index
-	 * guaranteed uniqueness.
+	 * Copy constraint flags from the old index. This is safe because the old
+	 * index guaranteed uniqueness.
 	 */
 	newIndexForm->indisprimary = oldIndexForm->indisprimary;
 	oldIndexForm->indisprimary = false;
@@ -1509,7 +1520,7 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
 	}
 
 	/*
-	 * Move comment if any
+	 * Move comment if any.
 	 */
 	{
 		Relation	description;
@@ -1555,7 +1566,7 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
 	}
 
 	/*
-	 * Move all dependencies on the old index to the new
+	 * Move all dependencies on the old index to the new one.
 	 */
 
 	if (OidIsValid(indexConstraintOid))
@@ -1592,7 +1603,7 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
 }
 
 /*
- * index_concurrently_set_dead
+ * index_set_dead_concurrent
  *
  * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
  * CONCURRENTLY before actually dropping the index. After calling this
@@ -1601,7 +1612,7 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
  * function.
  */
 void
-index_concurrently_set_dead(Oid heapOid, Oid indexOid)
+index_set_dead_concurrent(Oid heapOid, Oid indexOid)
 {
 	Relation	heapRelation,
 				indexRelation;
@@ -1638,7 +1649,7 @@ index_concurrently_set_dead(Oid heapOid, Oid indexOid)
 }
 
 /*
- * index_concurrently_drop
+ * index_drop_concurrent
  *
  * Drop a single index concurrently as the last step of an index concurrent
  * process. Deletion is done through performDeletion or dependencies of the
@@ -1648,7 +1659,7 @@ index_concurrently_set_dead(Oid heapOid, Oid indexOid)
  * server sessions.
  */
 void
-index_concurrently_drop(Oid indexId)
+index_drop_concurrent(Oid indexId)
 {
 	Oid			constraintOid = get_index_constraint(indexId);
 	ObjectAddress object;
@@ -2089,7 +2100,7 @@ index_drop(Oid indexId, bool concurrent)
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
 		/* Finish invalidation of index and mark it as dead */
-		index_concurrently_set_dead(heapId, indexId);
+		index_set_dead_concurrent(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index 969b34e752..b5b8f62b19 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -395,7 +395,7 @@ changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
 	ObjectAddress objAddr;
 	bool		newIsPinned;
 
-	depRel = heap_open(DependRelationId, RowExclusiveLock);
+	depRel = table_open(DependRelationId, RowExclusiveLock);
 
 	/*
 	 * If oldRefObjectId is pinned, there won't be any dependency entries on
@@ -838,7 +838,7 @@ get_index_ref_constraints(Oid indexId)
 	HeapTuple	tup;
 
 	/* Search the dependency table for the index */
-	depRel = heap_open(DependRelationId, AccessShareLock);
+	depRel = table_open(DependRelationId, AccessShareLock);
 
 	ScanKeyInit(&key[0],
 				Anum_pg_depend_refclassid,
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index fb93c41c88..77be19175a 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -336,7 +336,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 				 list_make2("chunk_id", "chunk_seq"),
 				 BTREE_AM_OID,
 				 rel->rd_rel->reltablespace,
-				 collationObjectId, classObjectId, coloptions, (Datum) 0, NULL,
+				 collationObjectId, classObjectId, coloptions, (Datum) 0,
 				 INDEX_CREATE_IS_PRIMARY, 0, true, true, NULL);
 
 	table_close(toast_rel, NoLock);
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index a3b6aed0a7..8a80308c3b 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -305,15 +305,15 @@ CheckIndexCompatible(Oid oldId,
  * Wait for transactions that might have an older snapshot than the given xmin
  * limit, because it might not contain tuples deleted just before it has
  * been taken. Obtain a list of VXIDs of such transactions, and wait for them
- * individually.
+ * individually.  This is used when building an index concurrently.
  *
  * We can exclude any running transactions that have xmin > the xmin given;
  * their oldest snapshot must be newer than our xmin limit.
  * We can also exclude any transactions that have xmin = zero, since they
  * evidently have no live snapshot at all (and any one they might be in
  * process of taking is certainly newer than ours).  Transactions in other
- * DBs can be ignored too, since they'll never even be able to see this
- * index.
+ * DBs can be ignored too, since they'll never even be able to see the
+ * index being worked on.
  *
  * We can also exclude autovacuum processes and processes running manual
  * lazy VACUUMs, because they won't be fazed by missing index entries
@@ -937,7 +937,7 @@ DefineIndex(Oid relationId,
 					 stmt->oldNode, indexInfo, indexColNames,
 					 accessMethodId, tablespaceId,
 					 collationObjectId, classObjectId,
-					 coloptions, reloptions, NULL,
+					 coloptions, reloptions,
 					 flags, constr_flags,
 					 allowSystemTableMods, !check_rights,
 					 &createdConstraintId);
@@ -1237,10 +1237,7 @@ DefineIndex(Oid relationId,
 	PushActiveSnapshot(GetTransactionSnapshot());
 
 	/* Perform concurrent build of index */
-	index_concurrently_build(RangeVarGetRelid(stmt->relation,
-											  ShareUpdateExclusiveLock,
-											  false),
-							 indexRelationId);
+	index_build_concurrent(relationId, indexRelationId);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -2814,9 +2811,9 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 											false);
 
 		/* Create new index definition based on given index */
-		concurrentOid = index_concurrently_create_copy(indexParentRel,
-													   indOid,
-													   concurrentName);
+		concurrentOid = index_create_copy_concurrent(indexParentRel,
+													 indOid,
+													 concurrentName);
 
 		/* Now open the relation of the new index, a lock is also needed on it */
 		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
@@ -2851,7 +2848,8 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	 */
 	foreach(lc, parentRelationIds)
 	{
-		Relation	heapRelation = table_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		Relation	heapRelation = table_open(lfirst_oid(lc),
+											  ShareUpdateExclusiveLock);
 		LockRelId	lockrelid = heapRelation->rd_lockInfo.lockRelId;
 		LOCKTAG    *heaplocktag;
 
@@ -2914,7 +2912,6 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 		Relation	indexRel;
 		Oid			indOid = lfirst_oid(lc);
 		Oid			concurrentOid = lfirst_oid(lc2);
-		Oid			tableOid;
 
 		CHECK_FOR_INTERRUPTS();
 
@@ -2926,14 +2923,15 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 
 		/*
 		 * Index relation has been closed by previous commit, so reopen it to
-		 * get its information.
+		 * determine if it is used as a primary key.
 		 */
 		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
-		tableOid = indexRel->rd_index->indrelid;
-		index_close(indexRel, NoLock);
 
 		/* Perform concurrent build of new index */
-		index_concurrently_build(tableOid, concurrentOid);
+		index_build_concurrent(indexRel->rd_index->indrelid, concurrentOid);
+
+		/* Keep lock until the end of this transaction */
+		index_close(indexRel, NoLock);
 
 		/* We can do away with our snapshot */
 		PopActiveSnapshot();
@@ -3045,7 +3043,7 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 									 false);
 
 		/* Swap old index with the new one */
-		index_concurrently_swap(concurrentOid, indOid, oldName);
+		index_swap_concurrent(concurrentOid, indOid, oldName);
 
 		/*
 		 * Invalidate the relcache for the table, so that after this commit
@@ -3090,7 +3088,7 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 		relOid = IndexGetRelation(indOid, false);
 
 		/* Finish the index invalidation and set it as dead. */
-		index_concurrently_set_dead(relOid, indOid);
+		index_set_dead_concurrent(relOid, indOid);
 	}
 
 	/* Commit this transaction to make the updates visible. */
@@ -3118,14 +3116,14 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 
 		CHECK_FOR_INTERRUPTS();
 
-		index_concurrently_drop(indOid);
+		index_drop_concurrent(indOid);
 	}
 
 	PopActiveSnapshot();
 	CommitTransactionCommand();
 
 	/*
-	 * Finallt release the session-level lock on the parent table.
+	 * Finally release the session-level lock on the parent table.
 	 */
 	foreach(lc, relationLocks)
 	{
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 84dd900dd6..803fa6a2c8 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -65,7 +65,6 @@ extern Oid index_create(Relation heapRelation,
 			 Oid *classObjectId,
 			 int16 *coloptions,
 			 Datum reloptions,
-			 TupleDesc tupdesc,
 			 bits16 flags,
 			 bits16 constr_flags,
 			 bool allow_system_table_mods,
@@ -78,22 +77,6 @@ extern Oid index_create(Relation heapRelation,
 #define	INDEX_CONSTR_CREATE_UPDATE_INDEX	(1 << 3)
 #define	INDEX_CONSTR_CREATE_REMOVE_OLD_DEPS	(1 << 4)
 
-extern Oid index_concurrently_create_copy(Relation heapRelation,
-										  Oid oldIndexId,
-										  const char *newName);
-
-extern void index_concurrently_build(Oid heapOid,
-									 Oid indexOid);
-
-extern void index_concurrently_swap(Oid newIndexId,
-									Oid oldIndexId,
-									const char *oldName);
-
-extern void index_concurrently_set_dead(Oid heapOid,
-										Oid indexOid);
-
-extern void index_concurrently_drop(Oid indexId);
-
 extern ObjectAddress index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
 						Oid parentConstraintId,
@@ -127,6 +110,21 @@ extern void index_build(Relation heapRelation,
 			bool isreindex,
 			bool parallel);
 
+extern void index_build_concurrent(Oid heapOid, Oid indexOid);
+
+extern Oid index_create_copy_concurrent(Relation heapRelation,
+										Oid oldIndexId,
+										const char *newName);
+
+extern void index_swap_concurrent(Oid newIndexId,
+								  Oid oldIndexId,
+								  const char *oldName);
+
+extern void index_set_dead_concurrent(Oid heapOid,
+									  Oid indexOid);
+
+extern void index_drop_concurrent(Oid indexId);
+
 extern double IndexBuildHeapScan(Relation heapRelation,
 				   Relation indexRelation,
 				   IndexInfo *indexInfo,
#128Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#127)
1 attachment(s)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 30/01/2019 06:16, Michael Paquier wrote:

On Tue, Jan 29, 2019 at 09:51:35PM +0100, Peter Eisentraut wrote:

On 16/01/2019 09:27, Michael Paquier wrote:

index_create does not actually need its extra argument with the tuple
descriptor. I think that we had better grab the column name list from
indexInfo and just pass that down to index_create() (patched on my
local branch), so it is an overkill to take a full copy of the index's
TupleDesc.

Please send a fixup patch.

Sure. Attached is a patch which can be applied on top of what you
sent last, based on what I noticed at review, here and there. You
also forgot to switch two heap_open() to table_open() in pg_depend.c.

OK, applied most of that.

I didn't take your function renaming. I had deliberately named the
functions index_concurrently_${task}, so their relationship is more
easily visible.

Anyway, that's all cosmetics. Are there any more functional or
correctness issues to be addressed?

Another thing I was thinking of: We need some database-global tests.
For example, at some point during development, I had broken some variant
of REINDEX DATABASE. Where could we put those tests? Maybe with reindexdb?
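A rough sketch of what such a test could look like, assuming the usual
PostgresNode/TestLib infrastructure and the conventions of the existing
src/bin/scripts/t/090_reindexdb.pl (the file placement, object names, and
exact assertions here are illustrative, not part of the patch):

```perl
# Illustrative sketch only -- assumes the PostgresNode/TestLib modules
# as used by the existing reindexdb TAP suite.
use strict;
use warnings;
use PostgresNode;
use TestLib;
use Test::More tests => 2;

my $node = get_new_node('main');
$node->init;
$node->start;

# Give REINDEX DATABASE something non-trivial to chew on.
$node->safe_psql('postgres',
	'CREATE TABLE test1 (a int);
	 CREATE INDEX test1x ON test1 (a);');

# Database-wide reindex through reindexdb should issue REINDEX DATABASE.
$node->issues_sql_like(
	[ 'reindexdb', 'postgres' ],
	qr/statement: REINDEX DATABASE postgres;/,
	'SQL REINDEX DATABASE run');

# And the same through SQL directly, to catch breakage of the
# database-global code path.
$node->safe_psql('postgres', 'REINDEX DATABASE postgres;');
ok(1, 'REINDEX DATABASE completed');
```

That keeps the database-global coverage next to the client that exercises it,
and the same file could grow a CONCURRENTLY variant once the syntax is in.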

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v8-0001-REINDEX-CONCURRENTLY.patch (text/plain; charset=UTF-8)
From 5ea43abfdb81269157576ff722396475c8881d2c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 7 Feb 2019 12:43:09 +0100
Subject: [PATCH v8] REINDEX CONCURRENTLY

Discussion: https://www.postgresql.org/message-id/flat/60052986-956b-4478-45ed-8bd119e9b9cf%402ndquadrant.com#74948a1044c56c5e817a5050f554ddee
---
 doc/src/sgml/mvcc.sgml                        |   1 +
 doc/src/sgml/ref/reindex.sgml                 | 185 +++-
 src/backend/catalog/index.c                   | 544 ++++++++++-
 src/backend/catalog/pg_depend.c               | 143 +++
 src/backend/commands/indexcmds.c              | 877 +++++++++++++++---
 src/backend/commands/tablecmds.c              |  32 +-
 src/backend/nodes/copyfuncs.c                 |   1 +
 src/backend/nodes/equalfuncs.c                |   1 +
 src/backend/parser/gram.y                     |  22 +-
 src/backend/tcop/utility.c                    |  10 +-
 src/bin/psql/common.c                         |  16 +
 src/bin/psql/tab-complete.c                   |  18 +-
 src/include/catalog/dependency.h              |   5 +
 src/include/catalog/index.h                   |  16 +
 src/include/commands/defrem.h                 |   6 +-
 src/include/nodes/parsenodes.h                |   1 +
 .../expected/reindex-concurrently.out         |  78 ++
 src/test/isolation/isolation_schedule         |   1 +
 .../isolation/specs/reindex-concurrently.spec |  40 +
 src/test/regress/expected/create_index.out    |  95 ++
 src/test/regress/sql/create_index.sql         |  61 ++
 21 files changed, 1986 insertions(+), 167 deletions(-)
 create mode 100644 src/test/isolation/expected/reindex-concurrently.out
 create mode 100644 src/test/isolation/specs/reindex-concurrently.spec

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index bedd9a008d..9b7ef8bf09 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,6 +926,7 @@ <title>Table-level Lock Modes</title>
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
          <command>ANALYZE</command>, <command>CREATE INDEX CONCURRENTLY</command>,
+         <command>REINDEX CONCURRENTLY</command>,
          <command>CREATE STATISTICS</command>, and certain <command>ALTER
          INDEX</command> and <command>ALTER TABLE</command> variants (for full
          details see <xref linkend="sql-alterindex"/> and <xref
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 47cef987d4..f41c432598 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="parameter">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="parameter">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -67,10 +67,7 @@ <title>Description</title>
      <para>
       An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
       an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.
      </para>
     </listitem>
 
@@ -151,6 +148,21 @@ <title>Parameters</title>
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</productname> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="sql-reindex-concurrently"
+      endterm="sql-reindex-concurrently-title"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
@@ -241,6 +253,161 @@ <title>Notes</title>
    Each individual partition can be reindexed separately instead.
   </para>
 
+  <refsect2 id="sql-reindex-concurrently">
+   <title id="sql-reindex-concurrently-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="sql-reindex-concurrently">
+    <primary>index</primary>
+    <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</productname> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</productname> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</literal> option of <command>REINDEX</command>. When this option
+    is used, <productname>PostgreSQL</productname> must perform two scans of the table
+    for each index that needs to be rebuild and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created, where all
+    the concurrent entries are created using only one transaction. Note that
+    if there are multiple indexes to be rebuilt then each step loops through
+    all the indexes we're rebuilding, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</command> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition will be used to replace the
+       old index. This step is done as a single transaction for all the indexes
+       involved in this process, meaning that if
+       <command>REINDEX CONCURRENTLY</command> is run on a table with multiple
+       indexes, all the catalog entries of the new indexes are created within a
+       single transaction. A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as their
+       parent tables to prevent any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_index.indisready</literal> is
+       switched to <quote>true</quote> to make it ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while
+       the first pass build was running. This step is performed within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       All the constraints and foreign keys which refer to the index are swapped
+       to refer to the new index definition, and the names of the indexes are
+       changed. At this point <literal>pg_index.indisvalid</literal> is switched to
+       <quote>true</quote> for the new index and to <quote>false</quote> for the old, and
+       a cache invalidation is done so as all the sessions that referenced the
+       old index are invalidated. This step is done within a single transaction
+       for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Old indexes have <literal>pg_index.indisready</literal> switched to <quote>false</quote>
+       to prevent any new tuple insertions, after waiting for running queries that
+       may reference the old index to complete. This step is done within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The old index definition and its data are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</command>
+    command will fail but leave behind an <quote>invalid</quote> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</application> <command>\d</command> command will report
+    such an index as <literal>INVALID</literal>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid index
+    and try again to perform <command>REINDEX CONCURRENTLY</command>.  The
+    concurrent index created during the processing has a name ending in the
+    suffix <literal>ccnew</literal>, or <literal>ccold</literal> if it is an
+    old index definition which we failed to drop. Invalid indexes can be
+    dropped using <literal>DROP INDEX</literal>, including invalid toast
+    indexes.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same table
+    to occur in parallel, but only one concurrent index build can occur on a
+    table at a time. In both cases, no other types of schema modification on
+    the table are allowed meanwhile.  Another difference is that a regular
+    <command>REINDEX TABLE</command> or <command>REINDEX INDEX</command>
+    command can be performed within a transaction block, but <command>REINDEX
+    CONCURRENTLY</command> cannot.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command> since system catalogs cannot be reindexed
+    concurrently.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -272,6 +439,14 @@ <title>Examples</title>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
+</programlisting></para>
+
+  <para>
+   Rebuild the indexes of a table without blocking read and write operations
+   on the involved relations while the rebuild is in progress:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
 </programlisting></para>
  </refsect1>
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index faf6956813..bdd15cf662 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -41,6 +41,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_constraint.h"
+#include "catalog/pg_description.h"
 #include "catalog/pg_depend.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_operator.h"
@@ -790,11 +791,13 @@ index_create(Relation heapRelation,
 				 errmsg("user-defined indexes on system catalog tables are not supported")));
 
 	/*
-	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * Concurrent index build on a system catalog is unsafe because we tend to
+	 * release locks before committing in catalogs.  Toast catalogs are fine
+	 * though as they are associated with a root relation which could be
+	 * reindexed concurrently.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemNamespace(get_rel_namespace(heapRelationId)))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
@@ -1200,6 +1203,509 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+/*
+ * index_concurrently_create_copy
+ *
+ * Create concurrently an index based on the definition of the one provided by
+ * caller.  The index is inserted into catalogs and needs to be built later
+ * on.  This is called during concurrent reindex processing.
+ */
+Oid
+index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			newIndexId = InvalidOid;
+	HeapTuple	indexTuple,
+				classTuple;
+	Datum		indclassDatum,
+				colOptionDatum,
+				optionDatum;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	List	   *indexColNames = NIL;
+
+	indexRelation = index_open(oldIndexId, RowExclusiveLock);
+
+	/* New index uses the same index information as old index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/* Do not copy exclusion constraint */
+	indexInfo->ii_ExclusionOps = NULL;
+	indexInfo->ii_ExclusionProcs = NULL;
+	indexInfo->ii_ExclusionStrats = NULL;
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", oldIndexId);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, oldIndexId);
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", oldIndexId);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/*
+	 * Extract the list of column names to be used for the index
+	 * creation.
+	 */
+	for (int i = 0; i < indexInfo->ii_NumIndexAttrs; i++)
+	{
+		TupleDesc	indexTupDesc = RelationGetDescr(indexRelation);
+		Form_pg_attribute att = TupleDescAttr(indexTupDesc, i);
+
+		indexColNames = lappend(indexColNames, NameStr(att->attname));
+	}
+
+	/* Now create the new index */
+	newIndexId = index_create(heapRelation,
+							  newName,
+							  InvalidOid,	/* indexRelationId */
+							  InvalidOid,	/* parentIndexRelid */
+							  InvalidOid,	/* parentConstraintId */
+							  InvalidOid,	/* relFileNode */
+							  indexInfo,
+							  indexColNames,
+							  indexRelation->rd_rel->relam,
+							  indexRelation->rd_rel->reltablespace,
+							  indexRelation->rd_indcollation,
+							  indclass->values,
+							  indcoloptions->values,
+							  optionDatum,
+							  INDEX_CREATE_SKIP_BUILD | INDEX_CREATE_CONCURRENT,
+							  0,
+							  true,	/* allow table to be a system catalog? */
+							  false, /* is_internal? */
+							  NULL);
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return newIndexId;
+}
+
+/*
+ * index_concurrently_build
+ *
+ * Build index for a concurrent operation.  The low-level locks taken while
+ * this operation is performed block only schema changes, but they need
+ * to be kept until the end of the transaction performing this operation.
+ * 'indexOid' refers to an index relation OID already created as part of
+ * previous processing, and 'heapOid' refers to its parent heap relation.
+ */
+void
+index_concurrently_build(Oid heapOid,
+						 Oid indexOid)
+{
+	Relation	heapRel,
+				indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* This had better make sure that a snapshot is active */
+	Assert(ActiveSnapshotSet());
+
+	/* Open and lock the parent heap relation */
+	heapRel = table_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in the
+	 * commit of the transaction where this concurrent index was created at
+	 * the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, false, true);
+
+	/* Close both relations, and keep the locks */
+	table_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts.  Once
+	 * we commit this transaction, any new transactions that open the table
+	 * must insert new entries into the index for insertions and non-HOT
+	 * updates.
+	 */
+	index_set_state_flags(indexOid, INDEX_CREATE_SET_READY);
+}
+
+/*
+ * index_concurrently_swap
+ *
+ * Swap name, dependencies, and constraints of the old index over to the new
+ * index, while marking the old index as invalid and the new as valid.
+ */
+void
+index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
+{
+	Relation	pg_class,
+				pg_index,
+				pg_constraint,
+				pg_trigger;
+	Relation	oldClassRel,
+				newClassRel;
+	HeapTuple	oldClassTuple,
+				newClassTuple;
+	Form_pg_class oldClassForm,
+				newClassForm;
+	HeapTuple	oldIndexTuple,
+				newIndexTuple;
+	Form_pg_index oldIndexForm,
+				newIndexForm;
+	Oid			indexConstraintOid;
+	List	   *constraintOids = NIL;
+	ListCell   *lc;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldClassRel = relation_open(oldIndexId, ShareUpdateExclusiveLock);
+	newClassRel = relation_open(newIndexId, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = table_open(RelationRelationId, RowExclusiveLock);
+
+	oldClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldClassForm = (Form_pg_class) GETSTRUCT(oldClassTuple);
+	newClassForm = (Form_pg_class) GETSTRUCT(newClassTuple);
+
+	/* Swap the names */
+	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
+	namestrcpy(&oldClassForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
+	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
+
+	heap_freetuple(oldClassTuple);
+	heap_freetuple(newClassTuple);
+
+	/* Now swap index info */
+	pg_index = table_open(IndexRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldIndexForm = (Form_pg_index) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
+
+	/*
+	 * Copy constraint flags from the old index. This is safe because the old
+	 * index guaranteed uniqueness.
+	 */
+	newIndexForm->indisprimary = oldIndexForm->indisprimary;
+	oldIndexForm->indisprimary = false;
+	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+	oldIndexForm->indisexclusion = false;
+	newIndexForm->indimmediate = oldIndexForm->indimmediate;
+	oldIndexForm->indimmediate = true;
+
+	/* Mark the new index as valid and the old as invalid, as index_set_state_flags() would */
+	newIndexForm->indisvalid = true;
+	oldIndexForm->indisvalid = false;
+	oldIndexForm->indisclustered = false;
+
+	CatalogTupleUpdate(pg_index, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_index, &newIndexTuple->t_self, newIndexTuple);
+
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+
+	/*
+	 * Move constraints and triggers over to the new index
+	 */
+
+	constraintOids = get_index_ref_constraints(oldIndexId);
+
+	indexConstraintOid = get_index_constraint(oldIndexId);
+
+	if (OidIsValid(indexConstraintOid))
+		constraintOids = lappend_oid(constraintOids, indexConstraintOid);
+
+	pg_constraint = table_open(ConstraintRelationId, RowExclusiveLock);
+	pg_trigger = table_open(TriggerRelationId, RowExclusiveLock);
+
+	foreach(lc, constraintOids)
+	{
+		HeapTuple	constraintTuple,
+					triggerTuple;
+		Form_pg_constraint conForm;
+		ScanKeyData key[1];
+		SysScanDesc scan;
+		Oid			constraintOid = lfirst_oid(lc);
+
+		/* Move the constraint from the old to the new index */
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		conForm = ((Form_pg_constraint) GETSTRUCT(constraintTuple));
+
+		if (conForm->conindid == oldIndexId)
+		{
+			conForm->conindid = newIndexId;
+
+			CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+		}
+
+		heap_freetuple(constraintTuple);
+
+		/* Search for trigger records */
+		ScanKeyInit(&key[0],
+					Anum_pg_trigger_tgconstraint,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(constraintOid));
+
+		scan = systable_beginscan(pg_trigger, TriggerConstraintIndexId, true,
+								  NULL, 1, key);
+
+		while (HeapTupleIsValid((triggerTuple = systable_getnext(scan))))
+		{
+			Form_pg_trigger tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			if (tgForm->tgconstrindid != oldIndexId)
+				continue;
+
+			/* Make a modifiable copy */
+			triggerTuple = heap_copytuple(triggerTuple);
+			tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			tgForm->tgconstrindid = newIndexId;
+
+			CatalogTupleUpdate(pg_trigger, &triggerTuple->t_self, triggerTuple);
+
+			heap_freetuple(triggerTuple);
+		}
+
+		systable_endscan(scan);
+	}
+
+	/*
+	 * Move comment if any
+	 */
+	{
+		Relation	description;
+		ScanKeyData skey[3];
+		SysScanDesc sd;
+		HeapTuple	tuple;
+		Datum		values[Natts_pg_description] = {0};
+		bool		nulls[Natts_pg_description] = {0};
+		bool		replaces[Natts_pg_description] = {0};
+
+		values[Anum_pg_description_objoid - 1] = ObjectIdGetDatum(newIndexId);
+		replaces[Anum_pg_description_objoid - 1] = true;
+
+		ScanKeyInit(&skey[0],
+					Anum_pg_description_objoid,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(oldIndexId));
+		ScanKeyInit(&skey[1],
+					Anum_pg_description_classoid,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(RelationRelationId));
+		ScanKeyInit(&skey[2],
+					Anum_pg_description_objsubid,
+					BTEqualStrategyNumber, F_INT4EQ,
+					Int32GetDatum(0));
+
+		description = table_open(DescriptionRelationId, RowExclusiveLock);
+
+		sd = systable_beginscan(description, DescriptionObjIndexId, true,
+								NULL, 3, skey);
+
+		while ((tuple = systable_getnext(sd)) != NULL)
+		{
+			tuple = heap_modify_tuple(tuple, RelationGetDescr(description),
+									  values, nulls, replaces);
+			CatalogTupleUpdate(description, &tuple->t_self, tuple);
+
+			break;					/* Assume there can be only one match */
+		}
+
+		systable_endscan(sd);
+		table_close(description, NoLock);
+	}
+
+	/*
+	 * Move all dependencies on the old index to the new one
+	 */
+
+	if (OidIsValid(indexConstraintOid))
+	{
+		ObjectAddress myself,
+					referenced;
+
+		/* Change to having the new index depend on the constraint */
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexId,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexId;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = indexConstraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
+
+	/* Close relations */
+	table_close(pg_class, RowExclusiveLock);
+	table_close(pg_index, RowExclusiveLock);
+	table_close(pg_constraint, RowExclusiveLock);
+	table_close(pg_trigger, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldClassRel, NoLock);
+	relation_close(newClassRel, NoLock);
+}
+
+/*
+ * index_concurrently_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all the backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction calling this
+ * function.
+ */
+void
+index_concurrently_set_dead(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRelation,
+				indexRelation;
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're about
+	 * to stop doing inserts into the index which could show conflicts with
+	 * existing predicate locks, so now is the time to move them to the heap
+	 * relation.
+	 */
+	heapRelation = table_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just might
+	 * have it open for updating it.  So now we can unset indisready and
+	 * indislive, then wait till nobody could be using it at all anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit all
+	 * sessions will refresh the table's index list.  Forgetting just the
+	 * index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	table_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrently_drop
+ *
+ * Drop a single index as the last step of a concurrent reindex process.
+ * Deletion must be done through performDeletion, or the dependencies of
+ * the index would not get dropped. At this point all the indexes are
+ * already considered invalid and dead, so they can be dropped without
+ * any concurrent option: it is certain that they will not interact with
+ * other server sessions.
+ */
+void
+index_concurrently_drop(Oid indexId)
+{
+	Oid			constraintOid = get_index_constraint(indexId);
+	ObjectAddress object;
+	Form_pg_index indexForm;
+	Relation	pg_index;
+	HeapTuple	indexTuple;
+
+	/*
+	 * Check that the index being dropped here is not alive; if it were, it
+	 * might still be in use by other backends.
+	 */
+	pg_index = table_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexId);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexId);
+
+	/* Clean up */
+	table_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process. Register
+	 * constraint or index for drop.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexId;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object, DROP_RESTRICT, 0);
+}
+
 /*
  * index_constraint_create
  *
@@ -1589,36 +2095,8 @@ index_drop(Oid indexId, bool concurrent)
 		 */
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = table_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		table_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrently_set_dead(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index 2b8f651c99..b5b8f62b19 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -375,6 +375,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 	return count;
 }
 
+/*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+					 Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = table_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't rename stuff out
+	 * of pg_catalog, which would likely be a bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot remove dependency on %s because it is a system object",
+						getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	table_close(depRel, RowExclusiveLock);
+
+	return count;
+}
+
 /*
  * isObjectPinned()
  *
@@ -734,3 +822,58 @@ get_index_constraint(Oid indexId)
 
 	return constraintId;
 }
+
+/*
+ * get_index_ref_constraints
+ *		Given the OID of an index, return the OID of all foreign key
+ *		constraints which reference the index.
+ */
+List *
+get_index_ref_constraints(Oid indexId)
+{
+	List	   *result = NIL;
+	Relation	depRel;
+	ScanKeyData key[3];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	/* Search the dependency table for the index */
+	depRel = table_open(DependRelationId, AccessShareLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexId));
+	ScanKeyInit(&key[2],
+				Anum_pg_depend_refobjsubid,
+				BTEqualStrategyNumber, F_INT4EQ,
+				Int32GetDatum(0));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 3, key);
+
+	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+	{
+		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+
+		/*
+		 * We assume any normal dependency from a constraint must be what we
+		 * are looking for.
+		 */
+		if (deprec->classid == ConstraintRelationId &&
+			deprec->objsubid == 0 &&
+			deprec->deptype == DEPENDENCY_NORMAL)
+		{
+			result = lappend_oid(result, deprec->objid);
+		}
+	}
+
+	systable_endscan(scan);
+	table_close(depRel, AccessShareLock);
+
+	return result;
+}
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index bd85099c28..e867475fa5 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -57,6 +57,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/partcache.h"
+#include "utils/pg_rusage.h"
 #include "utils/regproc.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -82,6 +83,7 @@ static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 static void ReindexPartitionedIndex(Relation parentIdx);
 
 /*
@@ -296,6 +298,90 @@ CheckIndexCompatible(Oid oldId,
 	return ret;
 }
 
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given xmin
+ * limit, because the new index might not contain tuples deleted just before
+ * the limit snapshot was taken. Obtain a list of VXIDs of such transactions,
+ * and wait for them individually. This is used when building an index
+ * concurrently.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see the
+ * index being worked on.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int			i,
+				n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue;			/* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int			n_newer_snapshots;
+			int			j;
+			int			k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue;	/* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
 /*
  * DefineIndex
  *		Creates a new index.
@@ -344,7 +430,6 @@ DefineIndex(Oid relationId,
 	List	   *indexColNames;
 	List	   *allIndexParams;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -359,9 +444,7 @@ DefineIndex(Oid relationId,
 	int			numberOfAttributes;
 	int			numberOfKeyAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -1150,34 +1233,11 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = table_open(relationId, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, false, true);
-
-	/* Close both the relations, but keep the locks */
-	table_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_concurrently_build(relationId, indexRelationId);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -1249,74 +1309,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots) /* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -2202,7 +2197,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 void
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -2214,7 +2209,8 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
 									  0,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
@@ -2234,7 +2230,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 }
 
 /*
@@ -2302,18 +2301,26 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, 0,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   0,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -2331,7 +2338,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -2343,6 +2350,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	List	   *relids = NIL;
 	ListCell   *l;
 	int			num_keys;
+	bool		concurrent_warning = false;
 
 	AssertArg(objectName);
 	Assert(objectKind == REINDEX_OBJECT_SCHEMA ||
@@ -2451,6 +2459,20 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 			!pg_class_ownercheck(relid, GetUserId()))
 			continue;
 
+		/*
+		 * Skip system tables that index_create() would refuse to index
+		 * concurrently.
+		 */
+		if (concurrent && IsSystemNamespace(get_rel_namespace(relid)))
+		{
+			if (!concurrent_warning)
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("concurrent reindex is not supported for catalog relations, skipping all")));
+			concurrent_warning = true;
+			continue;
+		}
+
 		/* Save the list of relation OIDs in private context */
 		old = MemoryContextSwitchTo(private_context);
 
@@ -2477,26 +2499,661 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
 
-			if (options & REINDEXOPT_VERBOSE)
+		if (concurrent)
+		{
+			result = ReindexRelationConcurrently(relid, options);
+			/* ReindexRelationConcurrently() does the verbose output */
+
+			PushActiveSnapshot(GetTransactionSnapshot());
+		}
+		else
+		{
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+			if (result && (options & REINDEXOPT_VERBOSE))
 				ereport(INFO,
 						(errmsg("table \"%s.%s\" was reindexed",
 								get_namespace_name(get_rel_namespace(relid)),
 								get_rel_name(relid))));
+		}
+
+		PopActiveSnapshot();
+		CommitTransactionCommand();
+	}
+	StartTransactionCommand();
+
+	MemoryContextDelete(private_context);
+}
+
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for a given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each phase is
+ * processed one by one for each of the table's indexes, as well as the
+ * indexes of its toast table if it has one.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *parentRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *newIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc,
+			   *lc2;
+	MemoryContext private_context;
+	MemoryContext old;
+	char		relkind;
+	char	   *relationName = NULL;
+	char	   *relationNamespace = NULL;
+	PGRUsage	ru0;
+
+	/*
+	 * Create a memory context that will survive forced transaction commits we
+	 * do below.  Since it is a child of PortalContext, it will go away
+	 * eventually even if we suffer an error; there's no need for special
+	 * abort cleanup logic.
+	 */
+	private_context = AllocSetContextCreate(PortalContext,
+											"ReindexConcurrent",
+											ALLOCSET_SMALL_SIZES);
+
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		/* Save data needed by REINDEX VERBOSE in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		relationName = get_rel_name(relationOid);
+		relationNamespace = get_namespace_name(get_rel_namespace(relationOid));
+
+		pg_rusage_init(&ru0);
+
+		MemoryContextSwitchTo(old);
+	}
+
+	relkind = get_rel_relkind(relationOid);
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation Oid given by the caller. If the relkind of the given relation
+	 * Oid is a table, all its valid indexes will be rebuilt, including the
+	 * indexes of its associated toast table. If relkind is an index, that
+	 * index itself will be rebuilt. The locks taken on the parent relations
+	 * and the involved indexes are kept until this transaction is committed,
+	 * to protect against schema changes that might occur before a session
+	 * lock is taken on each relation. The session locks similarly protect
+	 * against any schema change that could happen within the multiple
+	 * transactions used during this process.
+	 */
+	switch (relkind)
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes including
+				 * toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				MemoryContextSwitchTo(old);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = table_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+														   ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+					{
+						/* Save the list of relation OIDs in private context */
+						old = MemoryContextSwitchTo(private_context);
+
+						indexIds = lappend_oid(indexIds, cellOid);
+
+						MemoryContextSwitchTo(old);
+					}
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = table_open(toastOid,
+														   ShareUpdateExclusiveLock);
+
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					MemoryContextSwitchTo(old);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+															   ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+						{
+							/*
+							 * Save the list of relation OIDs in private
+							 * context
+							 */
+							old = MemoryContextSwitchTo(private_context);
+
+							indexIds = lappend_oid(indexIds, cellOid);
+
+							MemoryContextSwitchTo(old);
+						}
+
+						index_close(indexRelation, NoLock);
+					}
+
+					table_close(toastRelation, NoLock);
+				}
+
+				table_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its Oid to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			parentOid = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(parentOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(parentOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(parentOid);
+
+				MemoryContextSwitchTo(old);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					indexIds = lappend_oid(indexIds, relationOid);
+
+					MemoryContextSwitchTo(old);
+				}
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		case RELKIND_PARTITIONED_TABLE:
+			/* see reindex_relation() */
+			ereport(WARNING,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("REINDEX of partitioned tables is not yet implemented, skipping \"%s\"",
+							get_rel_name(relationOid))));
+			return false;
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+		return false;
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We must first create an index based on the same data as the former
+	 * index, except that it will only be registered in the catalogs and
+	 * will be built later. It is possible to perform all the operations on
+	 * all the indexes at the same time for a parent relation, including the
+	 * indexes of its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId	lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent relation; it might be a toast relation */
+		indexParentRel = table_open(indexRel->rd_index->indrelid,
+									ShareUpdateExclusiveLock);
+
+		/* Choose a temporary relation name for the new index */
+		concurrentName = ChooseRelationName(get_rel_name(indOid),
+											NULL,
+											"ccnew",
+											get_rel_namespace(indexRel->rd_index->indrelid),
+											false);
+
+		/* Create new index definition based on given index */
+		concurrentOid = index_concurrently_create_copy(indexParentRel,
+													   indOid,
+													   concurrentName);
+
+		/* Now open the relation of the new index, a lock is also needed on it */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the list of oids and locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Save the new index Oid */
+		newIndexIds = lappend_oid(newIndexIds, concurrentOid);
+
+		/*
+		 * Save lockrelid to protect each relation from being dropped, then
+		 * close the relations. The lockrelid of the parent relation is not
+		 * taken here to avoid taking multiple locks on the same relation;
+		 * instead we rely on parentRelationIds built earlier.
+		 */
+		lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, &lockrelid);
+		lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, &lockrelid);
+
+		MemoryContextSwitchTo(old);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		table_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the following visibility checks, as other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = table_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId	lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		LOCKTAG    *heaplocktag;
+
+		/* Save the list of locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Add lockrelid of parent relation to the list of locked relations */
+		relationLocks = lappend(relationLocks, &lockrelid);
+
+		heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid.dbId, lockrelid.relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		MemoryContextSwitchTo(old);
+
+		/* Close heap relation */
+		table_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The new
+	 * index is marked as not ready and invalid so that no other transaction
+	 * will try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on the relation, the
+	 * concurrent index and its copy to ensure that none of them are dropped
+	 * until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the new indexes in a separate transaction for each index to
+	 * avoid having open transactions for an unnecessarily long time. A
+	 * concurrent build is done for each index that will replace an old
+	 * index. Before doing that, we need to wait until no running
+	 * transaction could have the parent table of the index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			tableOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index's concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Index relation has been closed by previous commit, so reopen it to
+		 * get its information.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		tableOid = indexRel->rd_index->indrelid;
+		index_close(indexRel, NoLock);
+
+		/* Perform concurrent build of new index */
+		index_concurrently_build(tableOid, concurrentOid);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the indisready update of the new
+		 * index visible to other transactions.
+		 */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the new indexes catch up with any tuples that were
+	 * created during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Scan the heap for each new index, then insert any missing index
+	 * entries.
+	 */
+	foreach(lc, newIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		TransactionId limitXmin;
+		Snapshot	snapshot;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used to validate the new
+		 * index.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate the index, which might be a toast index */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
 		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * The new index is now valid as it contains all the necessary
+		 * tuples. However, it might not have taken into account tuples
+		 * deleted before the reference snapshot was taken, so we need to
+		 * wait for transactions that might have older snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction now that the new index is valid */
 		CommitTransactionCommand();
 	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the new indexes have been validated, it is necessary to swap
+	 * each new index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes dead at the same
+	 * time to make sure we only get constraint violations from the indexes
+	 * with the correct names.
+	 */
+
+	StartTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		char	   *oldName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseRelationName(get_rel_name(indOid),
+									 NULL,
+									 "ccold",
+									 get_rel_namespace(relOid),
+									 false);
+
+		/* Swap old index with the new one */
+		index_concurrently_swap(concurrentOid, indOid, oldName);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * CCI here so that subsequent iterations see the oldName in the
+		 * catalog and can choose a nonconflicting name for their oldName.
+		 * Otherwise, this could lead to conflicts if a table has two indexes
+		 * whose names are equal for the first NAMEDATALEN-minus-a-few
+		 * characters.
+		 */
+		CommandCounterIncrement();
+	}
+
+	/* Commit this transaction and make index swaps visible */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * Mark the old indexes as dead so they can later be dropped.
+	 *
+	 * Note that it is necessary to wait for virtual locks on the parent
+	 * relations before setting the indexes as dead.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Finish the index invalidation and set it as dead. */
+		index_concurrently_set_dead(relOid, indOid);
+	}
+
+	/* Commit this transaction to make the updates visible. */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the old indexes, using the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe as all the old indexes are already
+	 * considered invalid and not ready, so they will not be used by other
+	 * backends for any read or write operation.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	/* Get fresh snapshot for next step */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+
+		CHECK_FOR_INTERRUPTS();
+
+		index_concurrently_drop(indOid);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Finally, release the session-level locks on all the relations.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Log what we did */
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		if (relkind == RELKIND_INDEX)
+			ereport(INFO,
+					(errmsg("index \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+		else
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+	}
+
+	/* Start a new transaction to finish the process properly */
 	StartTransactionCommand();
 
 	MemoryContextDelete(private_context);
+
+	return true;
 }
 
 /*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 35a9ade059..8db3368148 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1253,6 +1253,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	bool		is_partition;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1315,7 +1316,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(get_rel_relkind(relOid)),
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check the case of a system index that might have been invalidated by a
+	 * failed concurrent process and allow its drop. For the time being, this
+	 * only concerns indexes of toast relations that became invalid during a
+	 * REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index b44ead269f..7311a97371 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4353,6 +4353,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 1e169e0b9c..8298f4863f 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2098,6 +2098,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c1faf4152c..d09043c6e2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -8266,42 +8266,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 6ec795f1b4..9f8f62b5de 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -774,16 +774,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventInTransactionBlock(isTopLevel,
+											  "REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -799,7 +803,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												  (stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												  (stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												  "REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 5d8634d818..82511e34ac 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -2192,6 +2192,22 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY is not allowed in
+			 * xacts.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 7b7a88fda3..da814bfec8 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3205,12 +3205,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("REINDEX"))
 		COMPLETE_WITH("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+	else if (Matches("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 5dea27016e..24b47d4fc6 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -251,6 +251,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+								 Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, char deptype, Oid *tableId, int32 *colId);
@@ -261,6 +264,8 @@ extern Oid	get_constraint_index(Oid constraintId);
 
 extern Oid	get_index_constraint(Oid indexId);
 
+extern List *get_index_ref_constraints(Oid indexId);
+
 /* in pg_shdepend.c */
 
 extern void recordSharedDependencyOn(ObjectAddress *depender,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 330c481a8b..0093154e80 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -77,6 +77,22 @@ extern Oid index_create(Relation heapRelation,
 #define	INDEX_CONSTR_CREATE_UPDATE_INDEX	(1 << 3)
 #define	INDEX_CONSTR_CREATE_REMOVE_OLD_DEPS	(1 << 4)
 
+extern Oid index_concurrently_create_copy(Relation heapRelation,
+										  Oid oldIndexId,
+										  const char *newName);
+
+extern void index_concurrently_build(Oid heapOid,
+									 Oid indexOid);
+
+extern void index_concurrently_swap(Oid newIndexId,
+									Oid oldIndexId,
+									const char *oldName);
+
+extern void index_concurrently_set_dead(Oid heapOid,
+										Oid indexOid);
+
+extern void index_concurrently_drop(Oid indexId);
+
 extern ObjectAddress index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
 						Oid parentConstraintId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index e592a914a4..e11caf2cd1 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -34,10 +34,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_not_in_use,
 			bool skip_build,
 			bool quiet);
-extern void ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern void ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2fe14d7db2..ce07b9081c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3307,6 +3307,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 91d9d90135..e32886bacb 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -42,6 +42,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 46deb55c67..f10ff3c5c1 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3292,3 +3292,98 @@ DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
 NOTICE:  drop cascades to 6 other objects
+RESET client_min_messages;
+RESET search_path;
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+ERROR:  conflicting key value violates exclusion constraint "concur_reindex_tab_c3_excl"
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check that comments are preserved
+CREATE TABLE testcomment (i int);
+CREATE INDEX testcomment_idx1 ON testcomment (i);
+COMMENT ON INDEX testcomment_idx1 IS 'test comment';
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+REINDEX TABLE testcomment;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+REINDEX TABLE CONCURRENTLY testcomment ;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+DROP TABLE testcomment;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  can only reindex the currently open database
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+WARNING:  concurrent reindex is not supported for catalog relations, skipping all
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+          Table "public.concur_reindex_tab"
+ Column |   Type    | Collation | Nullable | Default 
+--------+-----------+-----------+----------+---------
+ c1     | integer   |           | not null | 
+ c2     | text      |           |          | 
+ c3     | int4range |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+    "concur_reindex_tab_c3_excl" EXCLUDE USING gist (c3 WITH &&)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 59da6b6592..1669f6a0d8 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1207,3 +1207,64 @@ CREATE ROLE regress_reindexuser NOLOGIN;
 DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
+RESET client_min_messages;
+RESET search_path;
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check that comments are preserved
+CREATE TABLE testcomment (i int);
+CREATE INDEX testcomment_idx1 ON testcomment (i);
+COMMENT ON INDEX testcomment_idx1 IS 'test comment';
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+REINDEX TABLE testcomment;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+REINDEX TABLE CONCURRENTLY testcomment ;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+DROP TABLE testcomment;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;

base-commit: 793c736d69091d385a967b2740cc93cfb7a7b076
-- 
2.20.1

#129Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#128)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Thu, Feb 07, 2019 at 12:49:43PM +0100, Peter Eisentraut wrote:

Anyway, that's all cosmetics. Are there any more functional or
correctness issues to be addressed?

Not that I can think of. At least this evening.

Another thing I was thinking of: We need some database-global tests.
For example, at some point during development, I had broken some variant
of REINDEX DATABASE. Where could we put those tests? Maybe with
reindexdb?

Having some coverage in the TAP tests of reindexdb is a good idea.
--
Michael

In reply to: Peter Eisentraut (#128)
Re: REINDEX CONCURRENTLY 2.0

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: tested, passed

Hello
Sorry for the late response. I have reviewed the latest patch version.

I didn't find any new behavioral bugs.

REINDEX CONCURRENTLY can deadlock with another REINDEX CONCURRENTLY (or a manual VACUUM, perhaps a wraparound autovacuum, or a CREATE INDEX CONCURRENTLY on the same table). But I think this is not an issue for this feature; two CREATE INDEX CONCURRENTLY commands can deadlock too.

Just one small note for documentation:

+Indexes:
+    "idx" btree (col)
+    "idx_cct" btree (col) INVALID

Second index should be idx_ccnew (or idx_ccold), right?
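For context, an invalid leftover index like the one shown above can be inspected and cleaned up by hand. A hedged sketch (the index name is hypothetical; the `_ccnew` suffix is the one the patch documents for a failed concurrent rebuild, and invalid indexes are droppable with DROP INDEX):

```sql
-- List invalid indexes left behind by a failed REINDEX CONCURRENTLY
SELECT indexrelid::regclass AS index_name
FROM pg_index
WHERE NOT indisvalid;

-- Drop a hypothetical leftover index without blocking writers
DROP INDEX CONCURRENTLY idx_ccnew;
```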

Code looks good for me.

regards, Sergei

#131Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#128)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On 2019-Feb-07, Peter Eisentraut wrote:

Another thing I was thinking of: We need some database-global tests.
For example, at some point during development, I had broken some variant
of REINDEX DATABASE. Where could we put those tests? Maybe with reindexdb?

What's wrong with a new reindex.sql in regress?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#132Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#131)
Re: [HACKERS] REINDEX CONCURRENTLY 2.0

On Thu, Feb 07, 2019 at 12:07:01PM -0300, Alvaro Herrera wrote:

On 2019-Feb-07, Peter Eisentraut wrote:

Another thing I was thinking of: We need some database-global tests.
For example, at some point during development, I had broken some variant
of REINDEX DATABASE. Where could we put those tests? Maybe with reindexdb?

What's wrong with a new reindex.sql in regress?

Depending on the number of objects created and remaining around
before the script is run in the main suite, this would be costly. I
think this approach would not scale well in the long term.
Having a TAP test with reindexdb gives you access to a full instance
whose contents are always fully known at test time.
--
Michael

In reply to: Sergei Kornilov (#130)
Re: REINDEX CONCURRENTLY 2.0

Hello

The patch is marked with target version 12 but has been inactive for a few weeks. I think many users want this feature and the patch is in good shape. Do we have any open questions on this thread?

The latest patch can still be applied cleanly; it builds and passes tests.

regards, Sergei

#134Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Sergei Kornilov (#133)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On 2019-03-13 15:13, Sergei Kornilov wrote:

The patch is marked with target version 12 but has been inactive for a few weeks. I think many users want this feature and the patch is in good shape. Do we have any open questions on this thread?

The latest patch can still be applied cleanly; it builds and passes tests.

Here is an updated patch. It includes the documentation typo fix
you reported, some small bug fixes, a new reindexdb --concurrently option,
and, based on that, some whole-database tests, as discussed recently.

I think this addresses all open issues.
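As a quick usage sketch of what the patch enables (relation names here are hypothetical), the CONCURRENTLY keyword slots into the existing REINDEX variants, though it cannot run inside a transaction block:

```sql
-- New grammar added by the patch
REINDEX INDEX CONCURRENTLY my_index;
REINDEX TABLE CONCURRENTLY my_table;
REINDEX SCHEMA CONCURRENTLY public;
```

From the shell, the equivalent via the new reindexdb flag would be something like `reindexdb --concurrently --table=my_table mydb`.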

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v9-0001-REINDEX-CONCURRENTLY.patchtext/plain; charset=UTF-8; name=v9-0001-REINDEX-CONCURRENTLY.patch; x-mac-creator=0; x-mac-type=0Download
From 83df99b2862f02e1b200b2db0c036d75f5df936a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 13 Mar 2019 23:01:29 +0100
Subject: [PATCH v9] REINDEX CONCURRENTLY

This adds the CONCURRENTLY option to the REINDEX command.  A REINDEX
CONCURRENTLY on a specific index behaves like

    CREATE INDEX CONCURRENTLY new
    DROP INDEX CONCURRENTLY old
    ALTER INDEX new RENAME TO old

based on existing functionality.  The REINDEX command also has the
capability to run its other variants (TABLE, DATABASE, SYSTEM) with
the CONCURRENTLY option (although SYSTEM will end up skipping all
system tables).

The reindexdb command gets the --concurrently option.

Author: TODO
Reviewed-by: TODO
Discussion: https://www.postgresql.org/message-id/flat/60052986-956b-4478-45ed-8bd119e9b9cf%402ndquadrant.com#74948a1044c56c5e817a5050f554ddee
---
 doc/src/sgml/mvcc.sgml                        |   1 +
 doc/src/sgml/ref/reindex.sgml                 | 185 +++-
 doc/src/sgml/ref/reindexdb.sgml               |  10 +
 src/backend/catalog/index.c                   | 544 ++++++++++-
 src/backend/catalog/pg_depend.c               | 143 +++
 src/backend/commands/indexcmds.c              | 879 +++++++++++++++---
 src/backend/commands/tablecmds.c              |  32 +-
 src/backend/nodes/copyfuncs.c                 |   1 +
 src/backend/nodes/equalfuncs.c                |   1 +
 src/backend/parser/gram.y                     |  22 +-
 src/backend/tcop/utility.c                    |  10 +-
 src/bin/psql/common.c                         |  16 +
 src/bin/psql/tab-complete.c                   |  18 +-
 src/bin/scripts/reindexdb.c                   |  42 +-
 src/bin/scripts/t/090_reindexdb.pl            |  30 +-
 src/include/catalog/dependency.h              |   5 +
 src/include/catalog/index.h                   |  16 +
 src/include/commands/defrem.h                 |   6 +-
 src/include/nodes/parsenodes.h                |   1 +
 .../expected/reindex-concurrently.out         |  78 ++
 src/test/isolation/isolation_schedule         |   1 +
 .../isolation/specs/reindex-concurrently.spec |  40 +
 src/test/regress/expected/create_index.out    |  97 ++
 src/test/regress/sql/create_index.sql         |  63 ++
 24 files changed, 2059 insertions(+), 182 deletions(-)
 create mode 100644 src/test/isolation/expected/reindex-concurrently.out
 create mode 100644 src/test/isolation/specs/reindex-concurrently.spec

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index bedd9a008d..9b7ef8bf09 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,6 +926,7 @@ <title>Table-level Lock Modes</title>
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
          <command>ANALYZE</command>, <command>CREATE INDEX CONCURRENTLY</command>,
+         <command>REINDEX CONCURRENTLY</command>,
          <command>CREATE STATISTICS</command>, and certain <command>ALTER
          INDEX</command> and <command>ALTER TABLE</command> variants (for full
          details see <xref linkend="sql-alterindex"/> and <xref
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 47cef987d4..a60488c6a0 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="parameter">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="parameter">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -67,10 +67,7 @@ <title>Description</title>
      <para>
       An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
       an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.
      </para>
     </listitem>
 
@@ -151,6 +148,21 @@ <title>Parameters</title>
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</productname> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="sql-reindex-concurrently"
+      endterm="sql-reindex-concurrently-title"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
@@ -241,6 +253,161 @@ <title>Notes</title>
    Each individual partition can be reindexed separately instead.
   </para>
 
+  <refsect2 id="sql-reindex-concurrently">
+   <title id="sql-reindex-concurrently-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="sql-reindex-concurrently">
+    <primary>index</primary>
+    <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</productname> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</productname> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</literal> option of <command>REINDEX</command>. When this option
+    is used, <productname>PostgreSQL</productname> must perform two scans of the table
+    for each index that needs to be rebuilt and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created, where all
+    the concurrent entries are created using only one transaction. Note that
+    if there are multiple indexes to be rebuilt then each step loops through
+    all the indexes we're rebuilding, using a separate transaction for each one.
+    <command>REINDEX CONCURRENTLY</command> proceeds as follows when rebuilding
+    indexes:
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>. This definition will be used to replace the
+       old index. This step is done as a single transaction for all the indexes
+       involved in this process, meaning that if
+       <command>REINDEX CONCURRENTLY</command> is run on a table with multiple
+       indexes, all the catalog entries of the new indexes are created within a
+       single transaction. A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as its
+       parent table to prevent any schema modification while processing.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index entry.
+       Once the index is built, its flag <literal>pg_index.indisready</literal> is
+       switched to <quote>true</quote> to make it ready for inserts, making it
+       visible to other sessions once the transaction that performed the
+       build is finished. This step is done within a single transaction
+       for each entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while
+       the first pass build was running. This step is performed within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       All the constraints and foreign keys which refer to the index are swapped
+       to refer to the new index definition, and the names of the indexes are
+       changed. At this point <literal>pg_index.indisvalid</literal> is switched to
+       <quote>true</quote> for the new index and to <quote>false</quote> for the old, and
+       a cache invalidation is done so that all the sessions that referenced the
+       old index are invalidated. This step is done within a single transaction
+       for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Old indexes have <literal>pg_index.indisready</literal> switched to <quote>false</quote>
+       to prevent any new tuple insertions, after waiting for running queries that
+       may reference the old index to complete. This step is done within a single
+       transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The old index definition and its data are dropped. This step is done within
+       a single transaction for each temporary entry.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+       for all the indexes processed as well as their parent tables.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</command>
+    command will fail but leave behind an <quote>invalid</quote> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</application> <command>\d</command> command will report
+    such an index as <literal>INVALID</literal>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_ccnew" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid index
+    and try again to perform <command>REINDEX CONCURRENTLY</command>.  The
+    concurrent index created during the processing has a name ending in the
+    suffix <literal>ccnew</literal>, or <literal>ccold</literal> if it is an
+    old index definition which we failed to drop. Invalid indexes can be
+    dropped using <literal>DROP INDEX</literal>, including invalid toast
+    indexes.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same table
+    to occur in parallel, but only one concurrent index build can occur on a
+    table at a time. In both cases, no other types of schema modification on
+    the table are allowed meanwhile.  Another difference is that a regular
+    <command>REINDEX TABLE</command> or <command>REINDEX INDEX</command>
+    command can be performed within a transaction block, but <command>REINDEX
+    CONCURRENTLY</command> cannot.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command> since system catalogs cannot be reindexed
+    concurrently.
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -272,6 +439,14 @@ <title>Examples</title>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
+</programlisting></para>
+
+  <para>
+   Rebuild indexes in a table, without blocking read and write operations
+   on involved relations while reindexing is in progress:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
 </programlisting></para>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/reindexdb.sgml b/doc/src/sgml/ref/reindexdb.sgml
index 1273dad807..cdfac3fe4f 100644
--- a/doc/src/sgml/ref/reindexdb.sgml
+++ b/doc/src/sgml/ref/reindexdb.sgml
@@ -118,6 +118,16 @@ <title>Options</title>
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--concurrently</option></term>
+      <listitem>
+       <para>
+        Use the <literal>CONCURRENTLY</literal> option.  See <xref
+        linkend="sql-reindex"/> for further information.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option><optional>-d</optional> <replaceable class="parameter">dbname</replaceable></option></term>
       <term><option><optional>--dbname=</optional><replaceable class="parameter">dbname</replaceable></option></term>
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index c339a2bb77..da77254092 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -42,6 +42,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_constraint.h"
+#include "catalog/pg_description.h"
 #include "catalog/pg_depend.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_operator.h"
@@ -791,11 +792,13 @@ index_create(Relation heapRelation,
 				 errmsg("user-defined indexes on system catalog tables are not supported")));
 
 	/*
-	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * Concurrent index build on a system catalog is unsafe because we tend to
+	 * release locks before committing in catalogs.  Toast catalogs are fine
+	 * though as they are associated with a root relation which could be
+	 * reindexed concurrently.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsSystemNamespace(get_rel_namespace(heapRelationId)))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
@@ -1210,6 +1213,509 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+/*
+ * index_concurrently_create_copy
+ *
+ * Concurrently create an index based on the definition of the one provided
+ * by the caller.  The index is inserted into the catalogs and needs to be built
+ * on.  This is called during concurrent reindex processing.
+ */
+Oid
+index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			newIndexId = InvalidOid;
+	HeapTuple	indexTuple,
+				classTuple;
+	Datum		indclassDatum,
+				colOptionDatum,
+				optionDatum;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	List	   *indexColNames = NIL;
+
+	indexRelation = index_open(oldIndexId, RowExclusiveLock);
+
+	/* New index uses the same index information as old index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/* Do not copy exclusion constraint */
+	indexInfo->ii_ExclusionOps = NULL;
+	indexInfo->ii_ExclusionProcs = NULL;
+	indexInfo->ii_ExclusionStrats = NULL;
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", oldIndexId);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", oldIndexId);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/*
+	 * Extract the list of column names to be used for the index
+	 * creation.
+	 */
+	for (int i = 0; i < indexInfo->ii_NumIndexAttrs; i++)
+	{
+		TupleDesc	indexTupDesc = RelationGetDescr(indexRelation);
+		Form_pg_attribute att = TupleDescAttr(indexTupDesc, i);
+
+		indexColNames = lappend(indexColNames, NameStr(att->attname));
+	}
+
+	/* Now create the new index */
+	newIndexId = index_create(heapRelation,
+							  newName,
+							  InvalidOid,	/* indexRelationId */
+							  InvalidOid,	/* parentIndexRelid */
+							  InvalidOid,	/* parentConstraintId */
+							  InvalidOid,	/* relFileNode */
+							  indexInfo,
+							  indexColNames,
+							  indexRelation->rd_rel->relam,
+							  indexRelation->rd_rel->reltablespace,
+							  indexRelation->rd_indcollation,
+							  indclass->values,
+							  indcoloptions->values,
+							  optionDatum,
+							  INDEX_CREATE_SKIP_BUILD | INDEX_CREATE_CONCURRENT,
+							  0,
+							  true,	/* allow table to be a system catalog? */
+							  false, /* is_internal? */
+							  NULL);
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return newIndexId;
+}
+
+/*
+ * index_concurrently_build
+ *
+ * Build an index for a concurrent operation.  Low-level locks taken here
+ * prevent only schema changes, but they need to be kept until the end of
+ * the transaction performing this operation.
+ * 'indexOid' refers to an index relation OID already created as part of
+ * previous processing, and 'heapOid' refers to its parent heap relation.
+ */
+void
+index_concurrently_build(Oid heapOid,
+						 Oid indexOid)
+{
+	Relation	heapRel,
+				indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* This had better make sure that a snapshot is active */
+	Assert(ActiveSnapshotSet());
+
+	/* Open and lock the parent heap relation */
+	heapRel = table_open(heapOid, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexOid, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in the
+	 * commit of the transaction where this concurrent index was created at
+	 * the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, false, true);
+
+	/* Close both relations, and keep the locks */
+	table_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts.  Once
+	 * we commit this transaction, any new transactions that open the table
+	 * must insert new entries into the index for insertions and non-HOT
+	 * updates.
+	 */
+	index_set_state_flags(indexOid, INDEX_CREATE_SET_READY);
+}
+
+/*
+ * index_concurrently_swap
+ *
+ * Swap name, dependencies, and constraints of the old index over to the new
+ * index, while marking the old index as invalid and the new as valid.
+ */
+void
+index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
+{
+	Relation	pg_class,
+				pg_index,
+				pg_constraint,
+				pg_trigger;
+	Relation	oldClassRel,
+				newClassRel;
+	HeapTuple	oldClassTuple,
+				newClassTuple;
+	Form_pg_class oldClassForm,
+				newClassForm;
+	HeapTuple	oldIndexTuple,
+				newIndexTuple;
+	Form_pg_index oldIndexForm,
+				newIndexForm;
+	Oid			indexConstraintOid;
+	List	   *constraintOids = NIL;
+	ListCell   *lc;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldClassRel = relation_open(oldIndexId, ShareUpdateExclusiveLock);
+	newClassRel = relation_open(newIndexId, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = table_open(RelationRelationId, RowExclusiveLock);
+
+	oldClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldClassForm = (Form_pg_class) GETSTRUCT(oldClassTuple);
+	newClassForm = (Form_pg_class) GETSTRUCT(newClassTuple);
+
+	/* Swap the names */
+	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
+	namestrcpy(&oldClassForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
+	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
+
+	heap_freetuple(oldClassTuple);
+	heap_freetuple(newClassTuple);
+
+	/* Now swap index info */
+	pg_index = table_open(IndexRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldIndexForm = (Form_pg_index) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
+
+	/*
+	 * Copy constraint flags from the old index. This is safe because the old
+	 * index guaranteed uniqueness.
+	 */
+	newIndexForm->indisprimary = oldIndexForm->indisprimary;
+	oldIndexForm->indisprimary = false;
+	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+	oldIndexForm->indisexclusion = false;
+	newIndexForm->indimmediate = oldIndexForm->indimmediate;
+	oldIndexForm->indimmediate = true;
+
+	/* Mark the new index as valid and the old one as invalid, as index_set_state_flags() would */
+	newIndexForm->indisvalid = true;
+	oldIndexForm->indisvalid = false;
+	oldIndexForm->indisclustered = false;
+
+	CatalogTupleUpdate(pg_index, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_index, &newIndexTuple->t_self, newIndexTuple);
+
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+
+	/*
+	 * Move constraints and triggers over to the new index
+	 */
+
+	constraintOids = get_index_ref_constraints(oldIndexId);
+
+	indexConstraintOid = get_index_constraint(oldIndexId);
+
+	if (OidIsValid(indexConstraintOid))
+		constraintOids = lappend_oid(constraintOids, indexConstraintOid);
+
+	pg_constraint = table_open(ConstraintRelationId, RowExclusiveLock);
+	pg_trigger = table_open(TriggerRelationId, RowExclusiveLock);
+
+	foreach(lc, constraintOids)
+	{
+		HeapTuple	constraintTuple,
+					triggerTuple;
+		Form_pg_constraint conForm;
+		ScanKeyData key[1];
+		SysScanDesc scan;
+		Oid			constraintOid = lfirst_oid(lc);
+
+		/* Move the constraint from the old to the new index */
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		conForm = ((Form_pg_constraint) GETSTRUCT(constraintTuple));
+
+		if (conForm->conindid == oldIndexId)
+		{
+			conForm->conindid = newIndexId;
+
+			CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+		}
+
+		heap_freetuple(constraintTuple);
+
+		/* Search for trigger records */
+		ScanKeyInit(&key[0],
+					Anum_pg_trigger_tgconstraint,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(constraintOid));
+
+		scan = systable_beginscan(pg_trigger, TriggerConstraintIndexId, true,
+								  NULL, 1, key);
+
+		while (HeapTupleIsValid((triggerTuple = systable_getnext(scan))))
+		{
+			Form_pg_trigger tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			if (tgForm->tgconstrindid != oldIndexId)
+				continue;
+
+			/* Make a modifiable copy */
+			triggerTuple = heap_copytuple(triggerTuple);
+			tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			tgForm->tgconstrindid = newIndexId;
+
+			CatalogTupleUpdate(pg_trigger, &triggerTuple->t_self, triggerTuple);
+
+			heap_freetuple(triggerTuple);
+		}
+
+		systable_endscan(scan);
+	}
+
+	/*
+	 * Move comment if any
+	 */
+	{
+		Relation	description;
+		ScanKeyData skey[3];
+		SysScanDesc sd;
+		HeapTuple	tuple;
+		Datum		values[Natts_pg_description] = {0};
+		bool		nulls[Natts_pg_description] = {0};
+		bool		replaces[Natts_pg_description] = {0};
+
+		values[Anum_pg_description_objoid - 1] = ObjectIdGetDatum(newIndexId);
+		replaces[Anum_pg_description_objoid - 1] = true;
+
+		ScanKeyInit(&skey[0],
+					Anum_pg_description_objoid,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(oldIndexId));
+		ScanKeyInit(&skey[1],
+					Anum_pg_description_classoid,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(RelationRelationId));
+		ScanKeyInit(&skey[2],
+					Anum_pg_description_objsubid,
+					BTEqualStrategyNumber, F_INT4EQ,
+					Int32GetDatum(0));
+
+		description = table_open(DescriptionRelationId, RowExclusiveLock);
+
+		sd = systable_beginscan(description, DescriptionObjIndexId, true,
+								NULL, 3, skey);
+
+		while ((tuple = systable_getnext(sd)) != NULL)
+		{
+			tuple = heap_modify_tuple(tuple, RelationGetDescr(description),
+									  values, nulls, replaces);
+			CatalogTupleUpdate(description, &tuple->t_self, tuple);
+
+			break;					/* Assume there can be only one match */
+		}
+
+		systable_endscan(sd);
+		table_close(description, NoLock);
+	}
+
+	/*
+	 * Move all dependencies on the old index to the new one
+	 */
+
+	if (OidIsValid(indexConstraintOid))
+	{
+		ObjectAddress myself,
+					referenced;
+
+		/* Change to having the new index depend on the constraint */
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexId,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexId;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = indexConstraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
+
+	/* Close relations */
+	table_close(pg_class, RowExclusiveLock);
+	table_close(pg_index, RowExclusiveLock);
+	table_close(pg_constraint, RowExclusiveLock);
+	table_close(pg_trigger, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldClassRel, NoLock);
+	relation_close(newClassRel, NoLock);
+}
+
+/*
+ * index_concurrently_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index.  After calling this
+ * function, the index is seen by all backends as dead.  Low-level locks
+ * taken here are kept until the end of the transaction calling this
+ * function.
+ */
+void
+index_concurrently_set_dead(Oid heapOid, Oid indexOid)
+{
+	Relation	heapRelation,
+				indexRelation;
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're about
+	 * to stop doing inserts into the index which could show conflicts with
+	 * existing predicate locks, so now is the time to move them to the heap
+	 * relation.
+	 */
+	heapRelation = table_open(heapOid, ShareUpdateExclusiveLock);
+	indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(indexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just might
+	 * have it open for updating it.  So now we can unset indisready and
+	 * indislive, then wait till nobody could be using it at all anymore.
+	 */
+	index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit all
+	 * sessions will refresh the table's index list.  Forgetting just the
+	 * index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(heapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	table_close(heapRelation, NoLock);
+	index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrently_drop
+ *
+ * Drop a single index concurrently as the last step of a concurrent reindex
+ * process.  Deletion has to go through performDeletion(), or dependencies of
+ * the index would not get dropped.  At this point all the indexes are
+ * already considered invalid and dead, so they can be dropped without using
+ * any concurrent options, as it is certain that they will not interact with
+ * other server sessions.
+ */
+void
+index_concurrently_drop(Oid indexId)
+{
+	Oid			constraintOid = get_index_constraint(indexId);
+	ObjectAddress object;
+	Form_pg_index indexForm;
+	Relation	pg_index;
+	HeapTuple	indexTuple;
+
+	/*
+	 * Check that the index being dropped here is not alive; if it were, it
+	 * might still be in use by other backends.
+	 */
+	pg_index = table_open(IndexRelationId, RowExclusiveLock);
+
+	indexTuple = SearchSysCacheCopy1(INDEXRELID,
+									 ObjectIdGetDatum(indexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indexId);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	/*
+	 * This is only a safety check, just to prevent live indexes from being
+	 * dropped.
+	 */
+	if (indexForm->indislive)
+		elog(ERROR, "cannot drop live index with OID %u", indexId);
+
+	/* Clean up */
+	table_close(pg_index, RowExclusiveLock);
+
+	/*
+	 * We are sure to have a dead index, so begin the drop process.  Register
+	 * the owning constraint, if any, or else the index itself, for deletion.
+	 */
+	if (OidIsValid(constraintOid))
+	{
+		object.classId = ConstraintRelationId;
+		object.objectId = constraintOid;
+	}
+	else
+	{
+		object.classId = RelationRelationId;
+		object.objectId = indexId;
+	}
+
+	object.objectSubId = 0;
+
+	/* Perform deletion for normal and toast indexes */
+	performDeletion(&object, DROP_RESTRICT, 0);
+}
+
 /*
  * index_constraint_create
  *
@@ -1601,36 +2107,8 @@ index_drop(Oid indexId, bool concurrent)
 		 */
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = table_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		table_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrently_set_dead(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index 23b01f841e..d63bf5e56d 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -395,6 +395,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 	return count;
 }
 
+/*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+					 Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = table_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't rename stuff out
+	 * of pg_catalog, which would likely be a bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot remove dependency on %s because it is a system object",
+						getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	table_close(depRel, RowExclusiveLock);
+
+	return count;
+}
+
 /*
  * isObjectPinned()
  *
@@ -754,3 +842,58 @@ get_index_constraint(Oid indexId)
 
 	return constraintId;
 }
+
+/*
+ * get_index_ref_constraints
+ *		Given the OID of an index, return the OID of all foreign key
+ *		constraints which reference the index.
+ */
+List *
+get_index_ref_constraints(Oid indexId)
+{
+	List	   *result = NIL;
+	Relation	depRel;
+	ScanKeyData key[3];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	/* Search the dependency table for the index */
+	depRel = table_open(DependRelationId, AccessShareLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexId));
+	ScanKeyInit(&key[2],
+				Anum_pg_depend_refobjsubid,
+				BTEqualStrategyNumber, F_INT4EQ,
+				Int32GetDatum(0));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 3, key);
+
+	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+	{
+		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+
+		/*
+		 * We assume any normal dependency from a constraint must be what we
+		 * are looking for.
+		 */
+		if (deprec->classid == ConstraintRelationId &&
+			deprec->objsubid == 0 &&
+			deprec->deptype == DEPENDENCY_NORMAL)
+		{
+			result = lappend_oid(result, deprec->objid);
+		}
+	}
+
+	systable_endscan(scan);
+	table_close(depRel, AccessShareLock);
+
+	return result;
+}
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 94006c1189..62cc6620fc 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -58,6 +58,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/partcache.h"
+#include "utils/pg_rusage.h"
 #include "utils/regproc.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -83,6 +84,7 @@ static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 static void ReindexPartitionedIndex(Relation parentIdx);
 
 /*
@@ -297,6 +299,90 @@ CheckIndexCompatible(Oid oldId,
 	return ret;
 }
 
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given xmin
+ * limit, because the reference snapshot might not contain tuples deleted just
+ * before it was taken.  Obtain a list of VXIDs of such transactions, and wait
+ * for them individually.  This is used when building an index concurrently.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see the
+ * index being worked on.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int			i,
+				n_old_snapshots;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue;			/* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int			n_newer_snapshots;
+			int			j;
+			int			k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue;	/* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
+
+
 /*
  * DefineIndex
  *		Creates a new index.
@@ -345,7 +431,6 @@ DefineIndex(Oid relationId,
 	List	   *indexColNames;
 	List	   *allIndexParams;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -360,9 +445,7 @@ DefineIndex(Oid relationId,
 	int			numberOfAttributes;
 	int			numberOfKeyAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -1152,34 +1235,11 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = table_open(relationId, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, false, true);
-
-	/* Close both the relations, but keep the locks */
-	table_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_concurrently_build(relationId, indexRelationId);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -1251,74 +1311,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots) /* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -2204,7 +2199,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 void
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -2216,7 +2211,8 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
 									  0,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
@@ -2236,7 +2232,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 }
 
 /*
@@ -2304,18 +2303,26 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, 0,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   0,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -2333,7 +2340,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -2345,6 +2352,7 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	List	   *relids = NIL;
 	ListCell   *l;
 	int			num_keys;
+	bool		concurrent_warning = false;
 
 	AssertArg(objectName);
 	Assert(objectKind == REINDEX_OBJECT_SCHEMA ||
@@ -2453,6 +2461,20 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 			!pg_class_ownercheck(relid, GetUserId()))
 			continue;
 
+		/*
+		 * Skip system tables, which index_create() would refuse to index
+		 * concurrently.
+		 */
+		if (concurrent && IsSystemNamespace(get_rel_namespace(relid)))
+		{
+			if (!concurrent_warning)
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("concurrent reindex is not supported for catalog relations, skipping all")));
+			concurrent_warning = true;
+			continue;
+		}
+
 		/* Save the list of relation OIDs in private context */
 		old = MemoryContextSwitchTo(private_context);
 
@@ -2479,26 +2501,663 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
 
-			if (options & REINDEXOPT_VERBOSE)
+		if (concurrent)
+		{
+			result = ReindexRelationConcurrently(relid, options);
+			/* ReindexRelationConcurrently() does the verbose output */
+		}
+		else
+		{
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+			if (result && (options & REINDEXOPT_VERBOSE))
 				ereport(INFO,
 						(errmsg("table \"%s.%s\" was reindexed",
 								get_namespace_name(get_rel_namespace(relid)),
 								get_rel_name(relid))));
+
+			PopActiveSnapshot();
+		}
+
+		CommitTransactionCommand();
+	}
+	StartTransactionCommand();
+
+	MemoryContextDelete(private_context);
+}
+
+
+/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for the given relation OID. The relation can
+ * be either an index or a table. If a table is specified, each phase is
+ * applied one by one to all of the table's indexes, as well as to the
+ * indexes of its toast table, if it has one.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *parentRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *newIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc,
+			   *lc2;
+	MemoryContext private_context;
+	MemoryContext old;
+	char		relkind;
+	char	   *relationName = NULL;
+	char	   *relationNamespace = NULL;
+	PGRUsage	ru0;
+
+	/*
+	 * Create a memory context that will survive forced transaction commits we
+	 * do below.  Since it is a child of PortalContext, it will go away
+	 * eventually even if we suffer an error; there's no need for special
+	 * abort cleanup logic.
+	 */
+	private_context = AllocSetContextCreate(PortalContext,
+											"ReindexConcurrent",
+											ALLOCSET_SMALL_SIZES);
+
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		/* Save data needed by REINDEX VERBOSE in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		relationName = get_rel_name(relationOid);
+		relationNamespace = get_namespace_name(get_rel_namespace(relationOid));
+
+		pg_rusage_init(&ru0);
+
+		MemoryContextSwitchTo(old);
+	}
+
+	relkind = get_rel_relkind(relationOid);
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * relation OID given by the caller. If the relation is a table, all its
+	 * valid indexes will be rebuilt, including the indexes of its associated
+	 * toast table. If the relation is an index, that index itself will be
+	 * rebuilt. The locks taken on parent relations and involved indexes are
+	 * kept until this transaction commits, to protect against schema changes
+	 * that might occur before the session lock is taken on each relation;
+	 * the session lock similarly protects against any schema change that
+	 * could happen within the multiple transactions used by this process.
+	 */
+	switch (relkind)
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes including
+				 * toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track this relation for session locks */
+				parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+				MemoryContextSwitchTo(old);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(relationOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Open relation to get its indexes */
+				heapRelation = table_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+														   ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+					{
+						/* Save the list of relation OIDs in private context */
+						old = MemoryContextSwitchTo(private_context);
+
+						indexIds = lappend_oid(indexIds, cellOid);
+
+						MemoryContextSwitchTo(old);
+					}
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = table_open(toastOid,
+														   ShareUpdateExclusiveLock);
+
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					/* Track this relation for session locks */
+					parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+					MemoryContextSwitchTo(old);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+															   ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+						{
+							/*
+							 * Save the list of relation OIDs in private
+							 * context
+							 */
+							old = MemoryContextSwitchTo(private_context);
+
+							indexIds = lappend_oid(indexIds, cellOid);
+
+							MemoryContextSwitchTo(old);
+						}
+
+						index_close(indexRelation, NoLock);
+					}
+
+					table_close(toastRelation, NoLock);
+				}
+
+				table_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its OID to the list. Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			parentOid = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(parentOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(parentOid)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Save the list of relation OIDs in private context */
+				old = MemoryContextSwitchTo(private_context);
+
+				/* Track the parent relation of this index for session locks */
+				parentRelationIds = list_make1_oid(parentOid);
+
+				MemoryContextSwitchTo(old);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					old = MemoryContextSwitchTo(private_context);
+
+					indexIds = lappend_oid(indexIds, relationOid);
+
+					MemoryContextSwitchTo(old);
+				}
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		case RELKIND_PARTITIONED_TABLE:
+			/* see reindex_relation() */
+			ereport(WARNING,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("REINDEX of partitioned tables is not yet implemented, skipping \"%s\"",
+							get_rel_name(relationOid))));
+			return false;
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+	{
 		PopActiveSnapshot();
+		return false;
+	}
+
+	Assert(parentRelationIds != NIL);
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Here begins the process for concurrently rebuilding the index entries.
+	 * We need first to create an index which is based on the same data as the
+	 * former index except that it will be only registered in catalogs and
+	 * will be built later. It is possible to perform all the operations on
+	 * all the indexes at the same time for a parent relation including
+	 * indexes for its toast relation.
+	 */
+
+	/* Do the concurrent index creation for each index */
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = InvalidOid;
+		Relation	indexRel,
+					indexParentRel,
+					indexConcurrentRel;
+		LockRelId	lockrelid;
+
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		/* Open the index's parent relation, a plain table or a toast table */
+		indexParentRel = table_open(indexRel->rd_index->indrelid,
+									ShareUpdateExclusiveLock);
+
+		/* Choose a temporary relation name for the new index */
+		concurrentName = ChooseRelationName(get_rel_name(indOid),
+											NULL,
+											"ccnew",
+											get_rel_namespace(indexRel->rd_index->indrelid),
+											false);
+
+		/* Create new index definition based on given index */
+		concurrentOid = index_concurrently_create_copy(indexParentRel,
+													   indOid,
+													   concurrentName);
+
+		/* Now open the relation of the new index, a lock is also needed on it */
+		indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+		/* Save the list of oids and locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Save the new index Oid */
+		newIndexIds = lappend_oid(newIndexIds, concurrentOid);
+
+		/*
+		 * Save the lockrelid to protect each relation from being dropped,
+		 * then close the relations. The lockrelid of the parent relation is
+		 * not taken here, to avoid taking multiple locks on the same
+		 * relation; instead we rely on parentRelationIds built earlier.
+		 */
+		lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, &lockrelid);
+		lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks, &lockrelid);
+
+		MemoryContextSwitchTo(old);
+
+		index_close(indexRel, NoLock);
+		index_close(indexConcurrentRel, NoLock);
+		table_close(indexParentRel, NoLock);
+	}
+
+	/*
+	 * Save the heap locks for the following visibility checks; other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, parentRelationIds)
+	{
+		Relation	heapRelation = table_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId	lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		LOCKTAG    *heaplocktag;
+
+		/* Save the list of locks in private context */
+		old = MemoryContextSwitchTo(private_context);
+
+		/* Add lockrelid of parent relation to the list of locked relations */
+		relationLocks = lappend(relationLocks, &lockrelid);
+
+		heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid.dbId, lockrelid.relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		MemoryContextSwitchTo(old);
+
+		/* Close heap relation */
+		table_close(heapRelation, NoLock);
+	}
+
+	/*
+	 * For a concurrent build, it is necessary to make the catalog entries
+	 * visible to the other transactions before actually building the index.
+	 * This will prevent them from making incompatible HOT updates. The index
+	 * is marked as not ready and invalid so that no other transaction will
+	 * try to use it for INSERT or SELECT.
+	 *
+	 * Before committing, get a session-level lock on the relation, the
+	 * concurrent index and its copy, to ensure that none of them are dropped
+	 * until the operation is done.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the new indexes in a separate transaction for each index to avoid
+	 * having open transactions for an unnecessarily long time. A concurrent
+	 * build is done for each index that will replace the old indexes. Before
+	 * doing that, we need to wait on the parent relations until no running
+	 * transactions could have the parent table of index open.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		Relation	indexRel;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			tableOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index's concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Index relation has been closed by previous commit, so reopen it to
+		 * get its information.
+		 */
+		indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+		tableOid = indexRel->rd_index->indrelid;
+		index_close(indexRel, NoLock);
+
+		/* Perform concurrent build of new index */
+		index_concurrently_build(tableOid, concurrentOid);
+
+		/* We can do away with our snapshot */
+		PopActiveSnapshot();
+
+		/*
+		 * Commit this transaction to make the concurrent index's indisready
+		 * update visible to other sessions.
+		 */
 		CommitTransactionCommand();
 	}
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the new indexes catch up with any new tuples that
+	 * were created during the previous phase.
+	 *
+	 * We once again wait until no transaction can have the table open with
+	 * the index marked as read-only for updates. Each index validation is
+	 * done in a separate transaction to minimize how long we hold an open
+	 * transaction.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	/*
+	 * Scan the heap for each new index, then insert any missing index
+	 * entries.
+	 */
+	foreach(lc, newIndexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+		TransactionId limitXmin;
+		Snapshot	snapshot;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Open separate transaction to validate index */
+		StartTransactionCommand();
+
+		/* Get the parent relation Oid */
+		relOid = IndexGetRelation(indOid, false);
+
+		/*
+		 * Take the reference snapshot that will be used for the new index's
+		 * validation.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		/* Validate index, which might be a toast */
+		validate_index(relOid, indOid, snapshot);
+
+		/*
+		 * We can now do away with our active snapshot, we still need to save
+		 * the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+		PopActiveSnapshot();
+
+		/* And we can remove the validating snapshot too */
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * This new index is now valid as it contains all the tuples
+		 * necessary. However, it might not have taken into account deleted
+		 * tuples before the reference snapshot was taken, so we need to wait
+		 * for the transactions that might have older snapshots than ours.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
+		/* Commit this transaction now that the new index is valid */
+		CommitTransactionCommand();
+	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the new indexes have been validated, it is necessary to swap
+	 * each new index with its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes as invalid at
+	 * the same time to make sure we only get constraint violations from
+	 * the indexes with the correct names.
+	 */
+
+	StartTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		char	   *oldName;
+		Oid			indOid = lfirst_oid(lc);
+		Oid			concurrentOid = lfirst_oid(lc2);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseRelationName(get_rel_name(indOid),
+									 NULL,
+									 "ccold",
+									 get_rel_namespace(relOid),
+									 false);
+
+		/* Swap old index with the new one */
+		index_concurrently_swap(concurrentOid, indOid, oldName);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(relOid);
+
+		/*
+		 * CCI here so that subsequent iterations see the oldName in the
+		 * catalog and can choose a nonconflicting name for their oldName.
+		 * Otherwise, this could lead to conflicts if a table has two indexes
+		 * whose names are equal for the first NAMEDATALEN-minus-a-few
+		 * characters.
+		 */
+		CommandCounterIncrement();
+	}
+
+	/* Commit this transaction and make index swaps visible */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * Mark the old indexes as dead so they can later be dropped.
+	 *
+	 * Note that it is necessary to wait for virtual locks on the parent
+	 * relation before setting the index as dead.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+		Oid			relOid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		relOid = IndexGetRelation(indOid, false);
+
+		/* Finish the index invalidation and set it as dead. */
+		index_concurrently_set_dead(relOid, indOid);
+	}
+
+	/* Commit this transaction to make the updates visible. */
+	CommitTransactionCommand();
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the old indexes, using in fact the same code path as DROP INDEX
+	 * CONCURRENTLY. This is safe because all the old entries are already
+	 * marked as invalid and not ready, so they will not be used by other
+	 * backends for any read or write operations.
+	 */
+
+	/* Perform a wait on all the session locks */
+	StartTransactionCommand();
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	/* Get fresh snapshot for next step */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	foreach(lc, indexIds)
+	{
+		Oid			indOid = lfirst_oid(lc);
+
+		CHECK_FOR_INTERRUPTS();
+
+		index_concurrently_drop(indOid);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Finally, release the session-level locks taken earlier.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Log what we did */
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		if (relkind == RELKIND_INDEX)
+			ereport(INFO,
+					(errmsg("index \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+		else
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+	}
+
+	/* Start a new transaction to finish the process properly */
 	StartTransactionCommand();
 
 	MemoryContextDelete(private_context);
+
+	return true;
 }
 
 /*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 5ed560b02f..108cb300c7 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1295,6 +1295,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	bool		is_partition;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1357,7 +1358,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(get_rel_relkind(relOid)),
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check for a system index that might have been invalidated by a failed
+	 * concurrent operation, and allow dropping it. For the time being, this
+	 * only concerns indexes of toast relations that became invalid during a
+	 * REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index a8a735c247..9cbe82e653 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4357,6 +4357,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 3cab90e9f8..6b804d9994 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2100,6 +2100,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e23e68fdb3..ed864695db 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -8296,42 +8296,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 6ec795f1b4..9f8f62b5de 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -774,16 +774,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventInTransactionBlock(isTopLevel,
+											  "REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -799,7 +803,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												  (stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												  (stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												  "REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 5d8634d818..82511e34ac 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -2192,6 +2192,22 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY are not allowed in
+			 * xacts.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 10ae21cc61..0373b38ece 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3211,12 +3211,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("REINDEX"))
 		COMPLETE_WITH("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+	else if (Matches("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/bin/scripts/reindexdb.c b/src/bin/scripts/reindexdb.c
index 1cd1ccc951..d88d90d9fe 100644
--- a/src/bin/scripts/reindexdb.c
+++ b/src/bin/scripts/reindexdb.c
@@ -19,16 +19,17 @@ static void reindex_one_database(const char *name, const char *dbname,
 					 const char *type, const char *host,
 					 const char *port, const char *username,
 					 enum trivalue prompt_password, const char *progname,
-					 bool echo, bool verbose);
+					 bool echo, bool verbose, bool concurrently);
 static void reindex_all_databases(const char *maintenance_db,
 					  const char *host, const char *port,
 					  const char *username, enum trivalue prompt_password,
 					  const char *progname, bool echo,
-					  bool quiet, bool verbose);
+					  bool quiet, bool verbose, bool concurrently);
 static void reindex_system_catalogs(const char *dbname,
 						const char *host, const char *port,
 						const char *username, enum trivalue prompt_password,
-						const char *progname, bool echo, bool verbose);
+						const char *progname, bool echo, bool verbose,
+						bool concurrently);
 static void help(const char *progname);
 
 int
@@ -49,6 +50,7 @@ main(int argc, char *argv[])
 		{"table", required_argument, NULL, 't'},
 		{"index", required_argument, NULL, 'i'},
 		{"verbose", no_argument, NULL, 'v'},
+		{"concurrently", no_argument, NULL, 1},
 		{"maintenance-db", required_argument, NULL, 2},
 		{NULL, 0, NULL, 0}
 	};
@@ -68,6 +70,7 @@ main(int argc, char *argv[])
 	bool		echo = false;
 	bool		quiet = false;
 	bool		verbose = false;
+	bool		concurrently = false;
 	SimpleStringList indexes = {NULL, NULL};
 	SimpleStringList tables = {NULL, NULL};
 	SimpleStringList schemas = {NULL, NULL};
@@ -124,6 +127,9 @@ main(int argc, char *argv[])
 			case 'v':
 				verbose = true;
 				break;
+			case 1:
+				concurrently = true;
+				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -182,7 +188,7 @@ main(int argc, char *argv[])
 		}
 
 		reindex_all_databases(maintenance_db, host, port, username,
-							  prompt_password, progname, echo, quiet, verbose);
+							  prompt_password, progname, echo, quiet, verbose, concurrently);
 	}
 	else if (syscatalog)
 	{
@@ -213,7 +219,7 @@ main(int argc, char *argv[])
 		}
 
 		reindex_system_catalogs(dbname, host, port, username, prompt_password,
-								progname, echo, verbose);
+								progname, echo, verbose, concurrently);
 	}
 	else
 	{
@@ -234,7 +240,7 @@ main(int argc, char *argv[])
 			for (cell = schemas.head; cell; cell = cell->next)
 			{
 				reindex_one_database(cell->val, dbname, "SCHEMA", host, port,
-									 username, prompt_password, progname, echo, verbose);
+									 username, prompt_password, progname, echo, verbose, concurrently);
 			}
 		}
 
@@ -245,7 +251,7 @@ main(int argc, char *argv[])
 			for (cell = indexes.head; cell; cell = cell->next)
 			{
 				reindex_one_database(cell->val, dbname, "INDEX", host, port,
-									 username, prompt_password, progname, echo, verbose);
+									 username, prompt_password, progname, echo, verbose, concurrently);
 			}
 		}
 		if (tables.head != NULL)
@@ -255,7 +261,7 @@ main(int argc, char *argv[])
 			for (cell = tables.head; cell; cell = cell->next)
 			{
 				reindex_one_database(cell->val, dbname, "TABLE", host, port,
-									 username, prompt_password, progname, echo, verbose);
+									 username, prompt_password, progname, echo, verbose, concurrently);
 			}
 		}
 
@@ -265,7 +271,7 @@ main(int argc, char *argv[])
 		 */
 		if (indexes.head == NULL && tables.head == NULL && schemas.head == NULL)
 			reindex_one_database(NULL, dbname, "DATABASE", host, port,
-								 username, prompt_password, progname, echo, verbose);
+								 username, prompt_password, progname, echo, verbose, concurrently);
 	}
 
 	exit(0);
@@ -275,7 +281,7 @@ static void
 reindex_one_database(const char *name, const char *dbname, const char *type,
 					 const char *host, const char *port, const char *username,
 					 enum trivalue prompt_password, const char *progname, bool echo,
-					 bool verbose)
+					 bool verbose, bool concurrently)
 {
 	PQExpBufferData sql;
 
@@ -293,6 +299,8 @@ reindex_one_database(const char *name, const char *dbname, const char *type,
 
 	appendPQExpBufferStr(&sql, type);
 	appendPQExpBufferChar(&sql, ' ');
+	if (concurrently)
+		appendPQExpBufferStr(&sql, "CONCURRENTLY ");
 	if (strcmp(type, "TABLE") == 0 ||
 		strcmp(type, "INDEX") == 0)
 		appendQualifiedRelation(&sql, name, conn, progname, echo);
@@ -328,7 +336,8 @@ static void
 reindex_all_databases(const char *maintenance_db,
 					  const char *host, const char *port,
 					  const char *username, enum trivalue prompt_password,
-					  const char *progname, bool echo, bool quiet, bool verbose)
+					  const char *progname, bool echo, bool quiet, bool verbose,
+					  bool concurrently)
 {
 	PGconn	   *conn;
 	PGresult   *result;
@@ -357,7 +366,7 @@ reindex_all_databases(const char *maintenance_db,
 
 		reindex_one_database(NULL, connstr.data, "DATABASE", host,
 							 port, username, prompt_password,
-							 progname, echo, verbose);
+							 progname, echo, verbose, concurrently);
 	}
 	termPQExpBuffer(&connstr);
 
@@ -367,7 +376,7 @@ reindex_all_databases(const char *maintenance_db,
 static void
 reindex_system_catalogs(const char *dbname, const char *host, const char *port,
 						const char *username, enum trivalue prompt_password,
-						const char *progname, bool echo, bool verbose)
+						const char *progname, bool echo, bool verbose, bool concurrently)
 {
 	PGconn	   *conn;
 	PQExpBufferData sql;
@@ -382,7 +391,11 @@ reindex_system_catalogs(const char *dbname, const char *host, const char *port,
 	if (verbose)
 		appendPQExpBuffer(&sql, " (VERBOSE)");
 
-	appendPQExpBuffer(&sql, " SYSTEM %s;", fmtId(PQdb(conn)));
+	appendPQExpBufferStr(&sql, " SYSTEM ");
+	if (concurrently)
+		appendPQExpBuffer(&sql, "CONCURRENTLY ");
+	appendPQExpBufferStr(&sql, fmtId(PQdb(conn)));
+	appendPQExpBufferChar(&sql, ';');
 
 	if (!executeMaintenanceCommand(conn, sql.data, echo))
 	{
@@ -403,6 +416,7 @@ help(const char *progname)
 	printf(_("  %s [OPTION]... [DBNAME]\n"), progname);
 	printf(_("\nOptions:\n"));
 	printf(_("  -a, --all                 reindex all databases\n"));
+	printf(_("      --concurrently        reindex concurrently\n"));
 	printf(_("  -d, --dbname=DBNAME       database to reindex\n"));
 	printf(_("  -e, --echo                show the commands being sent to the server\n"));
 	printf(_("  -i, --index=INDEX         recreate specific index(es) only\n"));
diff --git a/src/bin/scripts/t/090_reindexdb.pl b/src/bin/scripts/t/090_reindexdb.pl
index e57a5e2bad..a6bd035f01 100644
--- a/src/bin/scripts/t/090_reindexdb.pl
+++ b/src/bin/scripts/t/090_reindexdb.pl
@@ -3,7 +3,7 @@
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 23;
+use Test::More tests => 35;
 
 program_help_ok('reindexdb');
 program_version_ok('reindexdb');
@@ -43,6 +43,34 @@
 	qr/statement: REINDEX \(VERBOSE\) TABLE public\.test1;/,
 	'reindex with verbose output');
 
+# the same with --concurrently
+$node->issues_sql_like(
+	[ 'reindexdb', '--concurrently', 'postgres' ],
+	qr/statement: REINDEX DATABASE CONCURRENTLY postgres;/,
+	'SQL REINDEX CONCURRENTLY run');
+
+$node->issues_sql_like(
+	[ 'reindexdb', '--concurrently', '-t', 'test1', 'postgres' ],
+	qr/statement: REINDEX TABLE CONCURRENTLY public\.test1;/,
+	'reindex specific table concurrently');
+$node->issues_sql_like(
+	[ 'reindexdb', '--concurrently', '-i', 'test1x', 'postgres' ],
+	qr/statement: REINDEX INDEX CONCURRENTLY public\.test1x;/,
+	'reindex specific index concurrently');
+$node->issues_sql_like(
+	[ 'reindexdb', '--concurrently', '-S', 'public', 'postgres' ],
+	qr/statement: REINDEX SCHEMA CONCURRENTLY public;/,
+	'reindex specific schema concurrently');
+$node->issues_sql_like(
+	[ 'reindexdb', '--concurrently', '-s', 'postgres' ],
+	qr/statement: REINDEX SYSTEM CONCURRENTLY postgres;/,
+	'reindex system tables concurrently');
+$node->issues_sql_like(
+	[ 'reindexdb', '-v', '-t', 'test1', 'postgres' ],
+	qr/statement: REINDEX \(VERBOSE\) TABLE public\.test1;/,
+	'reindex with verbose output');
+
+# connection strings
 $node->command_ok([qw(reindexdb --echo --table=pg_am dbname=template1)],
 	'reindexdb table with connection string');
 $node->command_ok(
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index b235a23f5d..4502fa7e84 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -196,6 +196,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+								 Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, char deptype, Oid *tableId, int32 *colId);
@@ -206,6 +209,8 @@ extern Oid	get_constraint_index(Oid constraintId);
 
 extern Oid	get_index_constraint(Oid indexId);
 
+extern List *get_index_ref_constraints(Oid indexId);
+
 /* in pg_shdepend.c */
 
 extern void recordSharedDependencyOn(ObjectAddress *depender,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 29f7ed6237..c6bd54f022 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -77,6 +77,22 @@ extern Oid index_create(Relation heapRelation,
 #define	INDEX_CONSTR_CREATE_UPDATE_INDEX	(1 << 3)
 #define	INDEX_CONSTR_CREATE_REMOVE_OLD_DEPS	(1 << 4)
 
+extern Oid index_concurrently_create_copy(Relation heapRelation,
+										  Oid oldIndexId,
+										  const char *newName);
+
+extern void index_concurrently_build(Oid heapOid,
+									 Oid indexOid);
+
+extern void index_concurrently_swap(Oid newIndexId,
+									Oid oldIndexId,
+									const char *oldName);
+
+extern void index_concurrently_set_dead(Oid heapOid,
+										Oid indexOid);
+
+extern void index_concurrently_drop(Oid indexId);
+
 extern ObjectAddress index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
 						Oid parentConstraintId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index e592a914a4..e11caf2cd1 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -34,10 +34,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_not_in_use,
 			bool skip_build,
 			bool quiet);
-extern void ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern void ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index fe35783359..7fc21ae4fe 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3315,6 +3315,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 91d9d90135..e32886bacb 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -42,6 +42,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 5d4eb59a0c..e473dc165c 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3238,6 +3238,99 @@ REINDEX (VERBOSE) TABLE reindex_verbose;
 INFO:  index "reindex_verbose_pkey" was reindexed
 DROP TABLE reindex_verbose;
 --
+-- REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+ERROR:  conflicting key value violates exclusion constraint "concur_reindex_tab_c3_excl"
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check that comments are preserved
+CREATE TABLE testcomment (i int);
+CREATE INDEX testcomment_idx1 ON testcomment (i);
+COMMENT ON INDEX testcomment_idx1 IS 'test comment';
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+REINDEX TABLE testcomment;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+REINDEX TABLE CONCURRENTLY testcomment ;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+DROP TABLE testcomment;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  can only reindex the currently open database
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+WARNING:  concurrent reindex is not supported for catalog relations, skipping all
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+          Table "public.concur_reindex_tab"
+ Column |   Type    | Collation | Nullable | Default 
+--------+-----------+-----------+----------+---------
+ c1     | integer   |           | not null | 
+ c2     | text      |           |          | 
+ c3     | int4range |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+    "concur_reindex_tab_c3_excl" EXCLUDE USING gist (c3 WITH &&)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
+--
 -- REINDEX SCHEMA
 --
 REINDEX SCHEMA schema_to_reindex; -- failure, schema does not exist
@@ -3296,6 +3389,8 @@ BEGIN;
 REINDEX SCHEMA schema_to_reindex; -- failure, cannot run in a transaction
 ERROR:  REINDEX SCHEMA cannot run inside a transaction block
 END;
+-- concurrently
+REINDEX SCHEMA CONCURRENTLY schema_to_reindex;
 -- Failure for unauthorized user
 CREATE ROLE regress_reindexuser NOLOGIN;
 SET SESSION ROLE regress_reindexuser;
@@ -3307,3 +3402,5 @@ DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
 NOTICE:  drop cascades to 6 other objects
+RESET client_min_messages;
+RESET search_path;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 67ecad8dd5..fa503951cd 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1159,6 +1159,64 @@ CREATE TABLE reindex_verbose(id integer primary key);
 REINDEX (VERBOSE) TABLE reindex_verbose;
 DROP TABLE reindex_verbose;
 
+--
+-- REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex of exclusion constraint
+ALTER TABLE concur_reindex_tab ADD COLUMN c3 int4range, ADD EXCLUDE USING gist (c3 WITH &&);
+INSERT INTO concur_reindex_tab VALUES  (3, 'a', '[1,2]');
+REINDEX TABLE concur_reindex_tab;
+INSERT INTO concur_reindex_tab VALUES  (4, 'a', '[2,4]');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check that comments are preserved
+CREATE TABLE testcomment (i int);
+CREATE INDEX testcomment_idx1 ON testcomment (i);
+COMMENT ON INDEX testcomment_idx1 IS 'test comment';
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+REINDEX TABLE testcomment;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+REINDEX TABLE CONCURRENTLY testcomment ;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+DROP TABLE testcomment;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
+
 --
 -- REINDEX SCHEMA
 --
@@ -1201,6 +1259,9 @@ CREATE TABLE reindex_after AS SELECT oid, relname, relfilenode, relkind
 REINDEX SCHEMA schema_to_reindex; -- failure, cannot run in a transaction
 END;
 
+-- concurrently
+REINDEX SCHEMA CONCURRENTLY schema_to_reindex;
+
 -- Failure for unauthorized user
 CREATE ROLE regress_reindexuser NOLOGIN;
 SET SESSION ROLE regress_reindexuser;
@@ -1211,3 +1272,5 @@ CREATE ROLE regress_reindexuser NOLOGIN;
 DROP ROLE regress_reindexuser;
 \set VERBOSITY terse \\ -- suppress cascade details
 DROP SCHEMA schema_to_reindex CASCADE;
+RESET client_min_messages;
+RESET search_path;

base-commit: b0825d28ea83e44139bd319e6d1db2c499cd4c6a
-- 
2.21.0

#135Michael Banck
michael.banck@credativ.de
In reply to: Peter Eisentraut (#134)
Re: REINDEX CONCURRENTLY 2.0

Hi,

Am Mittwoch, den 13.03.2019, 23:10 +0100 schrieb Peter Eisentraut:

Here is an updated patch.

I had a quick look at some of the comments and noticed some possible
nitpicky-level problems:

+/*
+ * index_concurrently_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all the backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction doing calling this
+ * function.
+ */

"the transaction doing calling this function." sounds wrong to me.

+      * Extract the list of indexes that are going to be rebuilt based on the
+      * list of relation Oids given by caller. For each element in given list,
+      * if the relkind of given relation Oid is a table, all its valid indexes
+      * will be rebuilt, including its associated toast table indexes. If
+      * relkind is an index, this index itself will be rebuilt. The locks taken
+      * on parent relations and involved indexes are kept until this
+      * transaction is committed to protect against schema changes that might
+      * occur until the session lock is taken on each relation, session lock
+      * used to similarly protect from any schema change that could happen
+      * within the multiple transactions that are used during this process.
+      */

I think the last sentence in the above should be split up into several
sentences, maybe at "session lock used..."? Or maybe it should just say
"a session lock is used" instead?

+                                             else
+                                             {
+                                                     /*
+                                                      * Save the list of relation OIDs in private
+                                                      * context
+                                                      */

Missing full stop at end of comment.

+ /* Definetely no indexes, so leave */

s/Definetely/Definitely/.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

#136Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Banck (#135)
Re: REINDEX CONCURRENTLY 2.0

On 2019-03-15 22:32, Michael Banck wrote:

I had a quick look at some of the comments and noticed some possible
nitpicky-level problems:

Thanks, I've integrated these changes into my local branch.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#137Sergei Kornilov
In reply to: Peter Eisentraut (#134)
Re: REINDEX CONCURRENTLY 2.0

Hello

Yet another review of this patch from me...

An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.

Not sure we can just say "use REINDEX", since only a non-concurrent REINDEX can rebuild such an index. I propose not changing this part.

+    The following steps occur in a concurrent index build, each in a separate
+    transaction except when the new index definitions are created
+       All the constraints and foreign keys which refer to the index are swapped...
+       ... This step is done within a single transaction
+       for each temporary entry.
+       Old indexes have <literal>pg_index.indisready</literal> switched to <quote>false</quote>
+       to prevent any new tuple insertions after waiting for running queries which
+       may reference the old index to complete. This step is done within a single
+       transaction for each temporary entry.

According to the code, index_concurrently_swap is called in a loop inside one transaction for all processed indexes of a table. The same goes for the index_concurrently_set_dead and index_concurrently_drop calls. So this part of the documentation seems incorrect.

And few questions:
- reindexdb has the concurrently flag logic even in reindex_system_catalogs, but "reindex concurrently" cannot reindex system catalogs. Is this expected?
- should reindexdb check the server version? For example, a binary from patched HEAD can reindex a v11 cluster and will obviously fail if --concurrently was requested.
- psql/tab-complete.c vs old releases? Seems we should suggest the CONCURRENTLY keyword only for releases with concurrently support.
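For reference on the catalog cases, the server-side behavior is already pinned down by the patch's regression tests (create_index.out above), e.g.:

```sql
REINDEX TABLE CONCURRENTLY pg_class;
-- ERROR:  concurrent reindex is not supported for catalog relations
REINDEX SCHEMA CONCURRENTLY pg_catalog;
-- WARNING:  concurrent reindex is not supported for catalog relations, skipping all
```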

Well, I still have no new questions about the backend logic. Maybe we should mark the patch as "Ready for Committer" in order to get more attention from other committers?

regards, Sergei

#138Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Sergei Kornilov (#137)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On 2019-03-23 20:04, Sergei Kornilov wrote:

An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      convenient to use <command>REINDEX</command> to rebuild them.

Not sure we can just say "use REINDEX", since only a non-concurrent REINDEX can rebuild such an index. I propose not changing this part.

Yeah, I reverted that and adjusted the wording a bit.
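For context, the pre-existing workaround that wording describes looks like this (index, table, and column names are illustrative), versus the single command this patch allows:

```sql
-- Workaround described by the existing docs:
DROP INDEX CONCURRENTLY some_idx;
CREATE INDEX CONCURRENTLY some_idx ON some_tab (some_col);

-- With this patch:
REINDEX INDEX CONCURRENTLY some_idx;
```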

+       Old indexes have <literal>pg_index.indisready</literal> switched to <quote>false</quote>
+       to prevent any new tuple insertions after waiting for running queries which
+       may reference the old index to complete. This step is done within a single
+       transaction for each temporary entry.

According to the code, index_concurrently_swap is called in a loop inside one transaction for all processed indexes of a table. The same goes for the index_concurrently_set_dead and index_concurrently_drop calls. So this part of the documentation seems incorrect.

I rewrote that whole procedure to make it a bit simpler.

And few questions:
- reindexdb has the concurrently flag logic even in reindex_system_catalogs, but "reindex concurrently" cannot reindex system catalogs. Is this expected?

If support is ever added, then reindexdb supports it automatically. It
seems simpler to not have to repeat the same checks in two places.

- should reindexdb check the server version? For example, a binary from patched HEAD can reindex a v11 cluster and will obviously fail if --concurrently was requested.

Added.

- psql/tab-complete.c vs old releases? It seems we need to suggest the CONCURRENTLY keyword only for releases with concurrently support.

It seems we don't do version checks for tab completion of keywords.

Well, I still have no new questions about the backend logic. Maybe we need to mark the patch as "Ready for Committer" in order to get more attention from other committers?

Let's do it. :-)

I've gone over this patch a few more times. I've read all the
discussion since 2012 again and made sure all the issues were addressed.
I made particularly sure that during the refactoring nothing in CREATE
INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY was inadvertently
changed. I checked all the steps again. I'm happy with it.

One more change I made was in the drop phase. I had to hack it up a bit
so that we can call index_drop() with a concurrent lock but not actually
doing the concurrent processing (that would be a bit recursive). The
previous patch was actually taking too strong a lock here.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v10-0001-REINDEX-CONCURRENTLY.patchtext/plain; charset=UTF-8; name=v10-0001-REINDEX-CONCURRENTLY.patch; x-mac-creator=0; x-mac-type=0Download
From 6a74a529920471d1c53850b3e3719814d2902e42 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Mon, 25 Mar 2019 16:17:23 +0100
Subject: [PATCH v10] REINDEX CONCURRENTLY

This adds the CONCURRENTLY option to the REINDEX command.  A REINDEX
CONCURRENTLY on a specific index creates a new index (like CREATE
INDEX CONCURRENTLY), then renames the old index away and the new index
in place and adjusts the dependencies, and then drops the old
index (like DROP INDEX CONCURRENTLY).  The REINDEX command also has
the capability to run its other variants (TABLE, DATABASE) with the
CONCURRENTLY option (but not SYSTEM).

The reindexdb command gets the --concurrently option.

Author: Michael Paquier, Andreas Karlsson, Peter Eisentraut
Reviewed-by: Andres Freund, Fujii Masao, Jim Nasby, Sergei Kornilov
Discussion: https://www.postgresql.org/message-id/flat/60052986-956b-4478-45ed-8bd119e9b9cf%402ndquadrant.com#74948a1044c56c5e817a5050f554ddee
---
 doc/src/sgml/mvcc.sgml                        |   1 +
 doc/src/sgml/ref/create_index.sgml            |   1 +
 doc/src/sgml/ref/reindex.sgml                 | 190 +++-
 doc/src/sgml/ref/reindexdb.sgml               |  10 +
 src/backend/catalog/dependency.c              |   7 +-
 src/backend/catalog/index.c                   | 505 +++++++++-
 src/backend/catalog/pg_depend.c               | 143 +++
 src/backend/commands/indexcmds.c              | 889 +++++++++++++++---
 src/backend/commands/tablecmds.c              |  32 +-
 src/backend/nodes/copyfuncs.c                 |   1 +
 src/backend/nodes/equalfuncs.c                |   1 +
 src/backend/parser/gram.y                     |  22 +-
 src/backend/tcop/utility.c                    |  10 +-
 src/bin/psql/common.c                         |  16 +
 src/bin/psql/tab-complete.c                   |  18 +-
 src/bin/scripts/reindexdb.c                   |  50 +-
 src/bin/scripts/t/090_reindexdb.pl            |  29 +-
 src/include/catalog/dependency.h              |   6 +
 src/include/catalog/index.h                   |  16 +-
 src/include/commands/defrem.h                 |   6 +-
 src/include/nodes/parsenodes.h                |   1 +
 .../expected/reindex-concurrently.out         |  78 ++
 src/test/isolation/isolation_schedule         |   1 +
 .../isolation/specs/reindex-concurrently.spec |  40 +
 src/test/regress/expected/create_index.out    |  97 ++
 src/test/regress/sql/create_index.sql         |  62 ++
 26 files changed, 2049 insertions(+), 183 deletions(-)
 create mode 100644 src/test/isolation/expected/reindex-concurrently.out
 create mode 100644 src/test/isolation/specs/reindex-concurrently.spec

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index bedd9a008d..9b7ef8bf09 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,6 +926,7 @@ <title>Table-level Lock Modes</title>
         <para>
          Acquired by <command>VACUUM</command> (without <option>FULL</option>),
          <command>ANALYZE</command>, <command>CREATE INDEX CONCURRENTLY</command>,
+         <command>REINDEX CONCURRENTLY</command>,
          <command>CREATE STATISTICS</command>, and certain <command>ALTER
          INDEX</command> and <command>ALTER TABLE</command> variants (for full
          details see <xref linkend="sql-alterindex"/> and <xref
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index d8f018f4da..d9d95b20e3 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -844,6 +844,7 @@ <title>See Also</title>
   <simplelist type="inline">
    <member><xref linkend="sql-alterindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
+   <member><xref linkend="sql-reindex"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 47cef987d4..ccabb330cb 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@
 
  <refsynopsisdiv>
 <synopsis>
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="parameter">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="parameter">name</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -68,7 +68,7 @@ <title>Description</title>
       An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
       an <quote>invalid</quote> index. Such indexes are useless but it can be
       convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build. To build the
+      <command>REINDEX</command> will not perform a concurrent build on an invalid index. To build the
       index without interfering with production you should drop the index and
       reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
      </para>
@@ -151,6 +151,21 @@ <title>Parameters</title>
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>CONCURRENTLY</literal></term>
+    <listitem>
+     <para>
+      When this option is used, <productname>PostgreSQL</productname> will rebuild the
+      index without taking any locks that prevent concurrent inserts,
+      updates, or deletes on the table; whereas a standard reindex build
+      locks out writes (but not reads) on the table until it's done.
+      There are several caveats to be aware of when using this option
+      &mdash; see <xref linkend="sql-reindex-concurrently"
+      endterm="sql-reindex-concurrently-title"/>.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>VERBOSE</literal></term>
     <listitem>
@@ -241,6 +256,159 @@ <title>Notes</title>
    Each individual partition can be reindexed separately instead.
   </para>
 
+  <refsect2 id="sql-reindex-concurrently">
+   <title id="sql-reindex-concurrently-title">Rebuilding Indexes Concurrently</title>
+
+   <indexterm zone="sql-reindex-concurrently">
+    <primary>index</primary>
+    <secondary>rebuilding concurrently</secondary>
+   </indexterm>
+
+   <para>
+    Rebuilding an index can interfere with regular operation of a database.
+    Normally <productname>PostgreSQL</productname> locks the table whose index is rebuilt
+    against writes and performs the entire index build with a single scan of the
+    table. Other transactions can still read the table, but if they try to
+    insert, update, or delete rows in the table they will block until the
+    index rebuild is finished. This could have a severe effect if the system is
+    a live production database. Very large tables can take many hours to be
+    indexed, and even for smaller tables, an index rebuild can lock out writers
+    for periods that are unacceptably long for a production system.
+   </para>
+
+   <para>
+    <productname>PostgreSQL</productname> supports rebuilding indexes with minimum locking
+    of writes.  This method is invoked by specifying the
+    <literal>CONCURRENTLY</literal> option of <command>REINDEX</command>. When this option
+    is used, <productname>PostgreSQL</productname> must perform two scans of the table
+    for each index that needs to be rebuilt and in addition it must wait for
+    all existing transactions that could potentially use the index to
+    terminate. This method requires more total work than a standard index
+    rebuild and takes significantly longer to complete as it needs to wait
+    for unfinished transactions that might modify the index. However, since
+    it allows normal operations to continue while the index is rebuilt, this
+    method is useful for rebuilding indexes in a production environment. Of
+    course, the extra CPU, memory and I/O load imposed by the index rebuild
+    may slow down other operations.
+   </para>
+
+   <para>
+    The following steps occur in a concurrent reindex.  Each step is run in a
+    separate transaction.  If there are multiple indexes to be rebuilt, then
+    each step loops through all the indexes before moving to the next step.
+
+    <orderedlist>
+     <listitem>
+      <para>
+       A new temporary index definition is added into the catalog
+       <literal>pg_index</literal>.  This definition will be used to replace
+       the old index.  A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
+       session level is taken on the indexes being reindexed as well as its
+       associated table to prevent any schema modification while processing.
+      </para>
+     </listitem>
+
+     <listitem>
+      <para>
+       A first pass to build the index is done for each new index.  Once the
+       index is built, its flag <literal>pg_index.indisready</literal> is
+       switched to <quote>true</quote> to mark it ready for inserts, making it
+       visible to other sessions once the transaction that performed the build
+       is finished.  This step is done in a separate transaction for each
+       index.
+      </para>
+     </listitem>
+
+     <listitem>
+      <para>
+       Then a second pass is performed to add tuples that were added while the
+       first pass build was running.  This step is also done in a separate
+       transaction for each index.
+      </para>
+     </listitem>
+
+     <listitem>
+      <para>
+       All the constraints that refer to the index are changed to refer to the
+       new index definition, and the names of the indexes are changed.  At
+       this point <literal>pg_index.indisvalid</literal> is switched to
+       <quote>true</quote> for the new index and to <quote>false</quote> for
+       the old, and a cache invalidation is done so that all sessions that
+       referenced the old index are invalidated.
+      </para>
+     </listitem>
+
+     <listitem>
+      <para>
+       The old indexes have <literal>pg_index.indisready</literal> switched to
+       <quote>false</quote> to prevent any new tuple insertions, after waiting
+       for running queries that might reference the old index to complete.
+      </para>
+     </listitem>
+
+     <listitem>
+      <para>
+       The old indexes are dropped.  The <literal>SHARE UPDATE
+       EXCLUSIVE</literal> session locks for the indexes and the table are
+       released.
+      </para>
+     </listitem>
+    </orderedlist>
+   </para>
+
+   <para>
+    If a problem arises while rebuilding the indexes, such as a
+    uniqueness violation in a unique index, the <command>REINDEX</command>
+    command will fail but leave behind an <quote>invalid</quote> new index on top
+    of the existing one. This index will be ignored for querying purposes
+    because it might be incomplete; however it will still consume update
+    overhead. The <application>psql</application> <command>\d</command> command will report
+    such an index as <literal>INVALID</literal>:
+
+<programlisting>
+postgres=# \d tab
+       Table "public.tab"
+ Column |  Type   | Modifiers
+--------+---------+-----------
+ col    | integer |
+Indexes:
+    "idx" btree (col)
+    "idx_ccnew" btree (col) INVALID
+</programlisting>
+
+    The recommended recovery method in such cases is to drop the invalid index
+    and try again to perform <command>REINDEX CONCURRENTLY</command>.  The
+    concurrent index created during the processing has a name ending in the
+    suffix <literal>ccnew</literal>, or <literal>ccold</literal> if it is an
+    old index definition which we failed to drop. Invalid indexes can be
+    dropped using <literal>DROP INDEX</literal>, including invalid toast
+    indexes.
+   </para>
+
+   <para>
+    Regular index builds permit other regular index builds on the same table
+    to occur in parallel, but only one concurrent index build can occur on a
+    table at a time. In both cases, no other types of schema modification on
+    the table are allowed meanwhile.  Another difference is that a regular
+    <command>REINDEX TABLE</command> or <command>REINDEX INDEX</command>
+    command can be performed within a transaction block, but <command>REINDEX
+    CONCURRENTLY</command> cannot.
+   </para>
+
+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command> since system catalogs cannot be reindexed
+    concurrently.
+   </para>
+
+   <para>
+    Furthermore, indexes for exclusion constraints cannot be reindexed
+    concurrently.  If such an index is named directly in this command, an
+    error is raised.  If a table or database with exclusion constraint indexes
+    is reindexed concurrently, those indexes will be skipped.  (It is possible
+    to reindex such indexes without the concurrently option.)
+   </para>
+  </refsect2>
  </refsect1>
 
  <refsect1>
@@ -272,6 +440,14 @@ <title>Examples</title>
 ...
 broken_db=&gt; REINDEX DATABASE broken_db;
 broken_db=&gt; \q
+</programlisting></para>
+
+  <para>
+   Rebuild a table while allowing read and write operations on the involved
+   relations during the rebuild:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
 </programlisting></para>
  </refsect1>
 
@@ -282,4 +458,14 @@ <title>Compatibility</title>
    There is no <command>REINDEX</command> command in the SQL standard.
   </para>
  </refsect1>
+
+ <refsect1>
+  <title>See Also</title>
+
+  <simplelist type="inline">
+   <member><xref linkend="sql-createindex"/></member>
+   <member><xref linkend="sql-dropindex"/></member>
+   <member><xref linkend="app-reindexdb"/></member>
+  </simplelist>
+ </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/reindexdb.sgml b/doc/src/sgml/ref/reindexdb.sgml
index 1273dad807..cdfac3fe4f 100644
--- a/doc/src/sgml/ref/reindexdb.sgml
+++ b/doc/src/sgml/ref/reindexdb.sgml
@@ -118,6 +118,16 @@ <title>Options</title>
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--concurrently</option></term>
+      <listitem>
+       <para>
+        Use the <literal>CONCURRENTLY</literal> option.  See <xref
+        linkend="sql-reindex"/> for further information.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option><optional>-d</optional> <replaceable class="parameter">dbname</replaceable></option></term>
       <term><option><optional>--dbname=</optional><replaceable class="parameter">dbname</replaceable></option></term>
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index f7acb4103e..7af1670c0d 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -306,6 +306,10 @@ deleteObjectsInList(ObjectAddresses *targetObjects, Relation *depRel,
  * PERFORM_DELETION_SKIP_EXTENSIONS: do not delete extensions, even when
  * deleting objects that are part of an extension.  This should generally
  * be used only when dropping temporary objects.
+ *
+ * PERFORM_DELETION_CONCURRENT_LOCK: perform the drop normally but with a lock
+ * as if it were concurrent.  This is used by REINDEX CONCURRENTLY.
+ *
  */
 void
 performDeletion(const ObjectAddress *object,
@@ -1316,9 +1320,10 @@ doDeletion(const ObjectAddress *object, int flags)
 					relKind == RELKIND_PARTITIONED_INDEX)
 				{
 					bool		concurrent = ((flags & PERFORM_DELETION_CONCURRENTLY) != 0);
+					bool		concurrent_lock_mode = ((flags & PERFORM_DELETION_CONCURRENT_LOCK) != 0);
 
 					Assert(object->objectSubId == 0);
-					index_drop(object->objectId, concurrent);
+					index_drop(object->objectId, concurrent, concurrent_lock_mode);
 				}
 				else
 				{
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index cb2c001017..e7b88c3865 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -42,6 +42,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_constraint.h"
+#include "catalog/pg_description.h"
 #include "catalog/pg_depend.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_operator.h"
@@ -59,6 +60,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
 #include "parser/parser.h"
+#include "pgstat.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -791,11 +793,11 @@ index_create(Relation heapRelation,
 				 errmsg("user-defined indexes on system catalog tables are not supported")));
 
 	/*
-	 * concurrent index build on a system catalog is unsafe because we tend to
-	 * release locks before committing in catalogs
+	 * Concurrent index build on a system catalog is unsafe because we tend to
+	 * release locks before committing in catalogs.
 	 */
 	if (concurrent &&
-		IsSystemRelation(heapRelation))
+		IsCatalogRelation(heapRelation))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("concurrent index creation on system catalog tables is not supported")));
@@ -1210,6 +1212,462 @@ index_create(Relation heapRelation,
 	return indexRelationId;
 }
 
+/*
+ * index_concurrently_create_copy
+ *
+ * Create concurrently an index based on the definition of the one provided by
+ * caller.  The index is inserted into catalogs and needs to be built later
+ * on.  This is called during concurrent reindex processing.
+ */
+Oid
+index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char *newName)
+{
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+	Oid			newIndexId = InvalidOid;
+	HeapTuple	indexTuple,
+				classTuple;
+	Datum		indclassDatum,
+				colOptionDatum,
+				optionDatum;
+	oidvector  *indclass;
+	int2vector *indcoloptions;
+	bool		isnull;
+	List	   *indexColNames = NIL;
+
+	indexRelation = index_open(oldIndexId, RowExclusiveLock);
+
+	/* New index uses the same index information as old index */
+	indexInfo = BuildIndexInfo(indexRelation);
+
+	/* Get the array of class and column options IDs from index info */
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", oldIndexId);
+	indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									Anum_pg_index_indclass, &isnull);
+	Assert(!isnull);
+	indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+	colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+									 Anum_pg_index_indoption, &isnull);
+	Assert(!isnull);
+	indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+	/* Fetch options of index if any */
+	classTuple = SearchSysCache1(RELOID, ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(classTuple))
+		elog(ERROR, "cache lookup failed for relation %u", oldIndexId);
+	optionDatum = SysCacheGetAttr(RELOID, classTuple,
+								  Anum_pg_class_reloptions, &isnull);
+
+	/*
+	 * Extract the list of column names to be used for the index
+	 * creation.
+	 */
+	for (int i = 0; i < indexInfo->ii_NumIndexAttrs; i++)
+	{
+		TupleDesc	indexTupDesc = RelationGetDescr(indexRelation);
+		Form_pg_attribute att = TupleDescAttr(indexTupDesc, i);
+
+		indexColNames = lappend(indexColNames, NameStr(att->attname));
+	}
+
+	/* Now create the new index */
+	newIndexId = index_create(heapRelation,
+							  newName,
+							  InvalidOid,	/* indexRelationId */
+							  InvalidOid,	/* parentIndexRelid */
+							  InvalidOid,	/* parentConstraintId */
+							  InvalidOid,	/* relFileNode */
+							  indexInfo,
+							  indexColNames,
+							  indexRelation->rd_rel->relam,
+							  indexRelation->rd_rel->reltablespace,
+							  indexRelation->rd_indcollation,
+							  indclass->values,
+							  indcoloptions->values,
+							  optionDatum,
+							  INDEX_CREATE_SKIP_BUILD | INDEX_CREATE_CONCURRENT,
+							  0,
+							  true,	/* allow table to be a system catalog? */
+							  false, /* is_internal? */
+							  NULL);
+
+	/* Close the relations used and clean up */
+	index_close(indexRelation, NoLock);
+	ReleaseSysCache(indexTuple);
+	ReleaseSysCache(classTuple);
+
+	return newIndexId;
+}
+
+/*
+ * index_concurrently_build
+ *
+ * Build index for a concurrent operation.  Low-level locks are taken when
+ * this operation is performed to prevent only schema changes, but they need
+ * to be kept until the end of the transaction performing this operation.
+ * 'indexOid' refers to an index relation OID already created as part of
+ * previous processing, and 'heapOid' refers to its parent heap relation.
+ */
+void
+index_concurrently_build(Oid heapRelationId,
+						 Oid indexRelationId)
+{
+	Relation	heapRel;
+	Relation	indexRelation;
+	IndexInfo  *indexInfo;
+
+	/* This had better make sure that a snapshot is active */
+	Assert(ActiveSnapshotSet());
+
+	/* Open and lock the parent heap relation */
+	heapRel = table_open(heapRelationId, ShareUpdateExclusiveLock);
+
+	/* And the target index relation */
+	indexRelation = index_open(indexRelationId, RowExclusiveLock);
+
+	/*
+	 * We have to re-build the IndexInfo struct, since it was lost in the
+	 * commit of the transaction where this concurrent index was created at
+	 * the catalog level.
+	 */
+	indexInfo = BuildIndexInfo(indexRelation);
+	Assert(!indexInfo->ii_ReadyForInserts);
+	indexInfo->ii_Concurrent = true;
+	indexInfo->ii_BrokenHotChain = false;
+
+	/* Now build the index */
+	index_build(heapRel, indexRelation, indexInfo, false, true);
+
+	/* Close both the relations, but keep the locks */
+	table_close(heapRel, NoLock);
+	index_close(indexRelation, NoLock);
+
+	/*
+	 * Update the pg_index row to mark the index as ready for inserts. Once we
+	 * commit this transaction, any new transactions that open the table must
+	 * insert new entries into the index for insertions and non-HOT updates.
+	 */
+	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+}
+
+/*
+ * index_concurrently_swap
+ *
+ * Swap name, dependencies, and constraints of the old index over to the new
+ * index, while marking the old index as invalid and the new as valid.
+ */
+void
+index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
+{
+	Relation	pg_class,
+				pg_index,
+				pg_constraint,
+				pg_trigger;
+	Relation	oldClassRel,
+				newClassRel;
+	HeapTuple	oldClassTuple,
+				newClassTuple;
+	Form_pg_class oldClassForm,
+				newClassForm;
+	HeapTuple	oldIndexTuple,
+				newIndexTuple;
+	Form_pg_index oldIndexForm,
+				newIndexForm;
+	Oid			indexConstraintOid;
+	List	   *constraintOids = NIL;
+	ListCell   *lc;
+
+	/*
+	 * Take a necessary lock on the old and new index before swapping them.
+	 */
+	oldClassRel = relation_open(oldIndexId, ShareUpdateExclusiveLock);
+	newClassRel = relation_open(newIndexId, ShareUpdateExclusiveLock);
+
+	/* Now swap names and dependencies of those indexes */
+	pg_class = table_open(RelationRelationId, RowExclusiveLock);
+
+	oldClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newClassTuple = SearchSysCacheCopy1(RELOID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newClassTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldClassForm = (Form_pg_class) GETSTRUCT(oldClassTuple);
+	newClassForm = (Form_pg_class) GETSTRUCT(newClassTuple);
+
+	/* Swap the names */
+	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
+	namestrcpy(&oldClassForm->relname, oldName);
+
+	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
+	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
+
+	heap_freetuple(oldClassTuple);
+	heap_freetuple(newClassTuple);
+
+	/* Now swap index info */
+	pg_index = table_open(IndexRelationId, RowExclusiveLock);
+
+	oldIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(oldIndexId));
+	if (!HeapTupleIsValid(oldIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", oldIndexId);
+	newIndexTuple = SearchSysCacheCopy1(INDEXRELID,
+										ObjectIdGetDatum(newIndexId));
+	if (!HeapTupleIsValid(newIndexTuple))
+		elog(ERROR, "could not find tuple for relation %u", newIndexId);
+
+	oldIndexForm = (Form_pg_index) GETSTRUCT(oldIndexTuple);
+	newIndexForm = (Form_pg_index) GETSTRUCT(newIndexTuple);
+
+	/*
+	 * Copy constraint flags from the old index. This is safe because the old
+	 * index guaranteed uniqueness.
+	 */
+	newIndexForm->indisprimary = oldIndexForm->indisprimary;
+	oldIndexForm->indisprimary = false;
+	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+	oldIndexForm->indisexclusion = false;
+	newIndexForm->indimmediate = oldIndexForm->indimmediate;
+	oldIndexForm->indimmediate = true;
+
+	/* Mark old index as valid and new as invalid as index_set_state_flags */
+	newIndexForm->indisvalid = true;
+	oldIndexForm->indisvalid = false;
+	oldIndexForm->indisclustered = false;
+
+	CatalogTupleUpdate(pg_index, &oldIndexTuple->t_self, oldIndexTuple);
+	CatalogTupleUpdate(pg_index, &newIndexTuple->t_self, newIndexTuple);
+
+	heap_freetuple(oldIndexTuple);
+	heap_freetuple(newIndexTuple);
+
+	/*
+	 * Move constraints and triggers over to the new index
+	 */
+
+	constraintOids = get_index_ref_constraints(oldIndexId);
+
+	indexConstraintOid = get_index_constraint(oldIndexId);
+
+	if (OidIsValid(indexConstraintOid))
+		constraintOids = lappend_oid(constraintOids, indexConstraintOid);
+
+	pg_constraint = table_open(ConstraintRelationId, RowExclusiveLock);
+	pg_trigger = table_open(TriggerRelationId, RowExclusiveLock);
+
+	foreach(lc, constraintOids)
+	{
+		HeapTuple	constraintTuple,
+					triggerTuple;
+		Form_pg_constraint conForm;
+		ScanKeyData key[1];
+		SysScanDesc scan;
+		Oid			constraintOid = lfirst_oid(lc);
+
+		/* Move the constraint from the old to the new index */
+		constraintTuple = SearchSysCacheCopy1(CONSTROID,
+											  ObjectIdGetDatum(constraintOid));
+		if (!HeapTupleIsValid(constraintTuple))
+			elog(ERROR, "could not find tuple for constraint %u", constraintOid);
+
+		conForm = ((Form_pg_constraint) GETSTRUCT(constraintTuple));
+
+		if (conForm->conindid == oldIndexId)
+		{
+			conForm->conindid = newIndexId;
+
+			CatalogTupleUpdate(pg_constraint, &constraintTuple->t_self, constraintTuple);
+		}
+
+		heap_freetuple(constraintTuple);
+
+		/* Search for trigger records */
+		ScanKeyInit(&key[0],
+					Anum_pg_trigger_tgconstraint,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(constraintOid));
+
+		scan = systable_beginscan(pg_trigger, TriggerConstraintIndexId, true,
+								  NULL, 1, key);
+
+		while (HeapTupleIsValid((triggerTuple = systable_getnext(scan))))
+		{
+			Form_pg_trigger tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			if (tgForm->tgconstrindid != oldIndexId)
+				continue;
+
+			/* Make a modifiable copy */
+			triggerTuple = heap_copytuple(triggerTuple);
+			tgForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
+
+			tgForm->tgconstrindid = newIndexId;
+
+			CatalogTupleUpdate(pg_trigger, &triggerTuple->t_self, triggerTuple);
+
+			heap_freetuple(triggerTuple);
+		}
+
+		systable_endscan(scan);
+	}
+
+	/*
+	 * Move comment if any
+	 */
+	{
+		Relation	description;
+		ScanKeyData skey[3];
+		SysScanDesc sd;
+		HeapTuple	tuple;
+		Datum		values[Natts_pg_description] = {0};
+		bool		nulls[Natts_pg_description] = {0};
+		bool		replaces[Natts_pg_description] = {0};
+
+		values[Anum_pg_description_objoid - 1] = ObjectIdGetDatum(newIndexId);
+		replaces[Anum_pg_description_objoid - 1] = true;
+
+		ScanKeyInit(&skey[0],
+					Anum_pg_description_objoid,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(oldIndexId));
+		ScanKeyInit(&skey[1],
+					Anum_pg_description_classoid,
+					BTEqualStrategyNumber, F_OIDEQ,
+					ObjectIdGetDatum(RelationRelationId));
+		ScanKeyInit(&skey[2],
+					Anum_pg_description_objsubid,
+					BTEqualStrategyNumber, F_INT4EQ,
+					Int32GetDatum(0));
+
+		description = table_open(DescriptionRelationId, RowExclusiveLock);
+
+		sd = systable_beginscan(description, DescriptionObjIndexId, true,
+								NULL, 3, skey);
+
+		while ((tuple = systable_getnext(sd)) != NULL)
+		{
+			tuple = heap_modify_tuple(tuple, RelationGetDescr(description),
+									  values, nulls, replaces);
+			CatalogTupleUpdate(description, &tuple->t_self, tuple);
+
+			break;					/* Assume there can be only one match */
+		}
+
+		systable_endscan(sd);
+		table_close(description, NoLock);
+	}
+
+	/*
+	 * Move all dependencies on the old index to the new one
+	 */
+
+	if (OidIsValid(indexConstraintOid))
+	{
+		ObjectAddress myself,
+					referenced;
+
+		/* Change to having the new index depend on the constraint */
+		deleteDependencyRecordsForClass(RelationRelationId, oldIndexId,
+										ConstraintRelationId, DEPENDENCY_INTERNAL);
+
+		myself.classId = RelationRelationId;
+		myself.objectId = newIndexId;
+		myself.objectSubId = 0;
+
+		referenced.classId = ConstraintRelationId;
+		referenced.objectId = indexConstraintOid;
+		referenced.objectSubId = 0;
+
+		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
+	}
+
+	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
+
+	/*
+	 * Copy over statistics from old to new index
+	 */
+	{
+		PgStat_StatTabEntry *tabentry;
+
+		tabentry = pgstat_fetch_stat_tabentry(oldIndexId);
+		if (tabentry)
+		{
+			if (newClassRel->pgstat_info)
+			{
+				newClassRel->pgstat_info->t_counts.t_numscans = tabentry->numscans;
+				newClassRel->pgstat_info->t_counts.t_tuples_returned = tabentry->tuples_returned;
+				newClassRel->pgstat_info->t_counts.t_tuples_fetched = tabentry->tuples_fetched;
+				newClassRel->pgstat_info->t_counts.t_blocks_fetched = tabentry->blocks_fetched;
+				newClassRel->pgstat_info->t_counts.t_blocks_hit = tabentry->blocks_hit;
+				/* The data will be sent by the next pgstat_report_stat() call. */
+			}
+		}
+	}
+
+	/* Close relations */
+	table_close(pg_class, RowExclusiveLock);
+	table_close(pg_index, RowExclusiveLock);
+	table_close(pg_constraint, RowExclusiveLock);
+	table_close(pg_trigger, RowExclusiveLock);
+
+	/* The lock taken previously is not released until the end of transaction */
+	relation_close(oldClassRel, NoLock);
+	relation_close(newClassRel, NoLock);
+}
+
+/*
+ * index_concurrently_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index.  After calling this
+ * function, the index is seen by all the backends as dead.  Low-level locks
+ * taken here are kept until the end of the transaction calling this function.
+ */
+void
+index_concurrently_set_dead(Oid heapId, Oid indexId)
+{
+	Relation	userHeapRelation;
+	Relation	userIndexRelation;
+
+	/*
+	 * No more predicate locks will be acquired on this index, and we're
+	 * about to stop doing inserts into the index which could show
+	 * conflicts with existing predicate locks, so now is the time to move
+	 * them to the heap relation.
+	 */
+	userHeapRelation = table_open(heapId, ShareUpdateExclusiveLock);
+	userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
+	TransferPredicateLocksToHeapRelation(userIndexRelation);
+
+	/*
+	 * Now we are sure that nobody uses the index for queries; they just
+	 * might have it open for updating it.  So now we can unset indisready
+	 * and indislive, then wait till nobody could be using it at all
+	 * anymore.
+	 */
+	index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
+
+	/*
+	 * Invalidate the relcache for the table, so that after this commit
+	 * all sessions will refresh the table's index list.  Forgetting just
+	 * the index's relcache entry is not enough.
+	 */
+	CacheInvalidateRelcache(userHeapRelation);
+
+	/*
+	 * Close the relations again, though still holding session lock.
+	 */
+	table_close(userHeapRelation, NoLock);
+	index_close(userIndexRelation, NoLock);
+}
+
 /*
  * index_constraint_create
  *
@@ -1447,9 +1905,14 @@ index_constraint_create(Relation heapRelation,
  *
  * NOTE: this routine should now only be called through performDeletion(),
  * else associated dependencies won't be cleaned up.
+ *
+ * If concurrent is true, do a DROP INDEX CONCURRENTLY.  If concurrent is
+ * false but concurrent_lock_mode is true, then do a normal DROP INDEX but
+ * take a lock for CONCURRENTLY processing.  That is used as part of REINDEX
+ * CONCURRENTLY.
  */
 void
-index_drop(Oid indexId, bool concurrent)
+index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
 {
 	Oid			heapId;
 	Relation	userHeapRelation;
@@ -1481,7 +1944,7 @@ index_drop(Oid indexId, bool concurrent)
 	 * using it.)
 	 */
 	heapId = IndexGetRelation(indexId, false);
-	lockmode = concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock;
+	lockmode = (concurrent || concurrent_lock_mode) ? ShareUpdateExclusiveLock : AccessExclusiveLock;
 	userHeapRelation = table_open(heapId, lockmode);
 	userIndexRelation = index_open(indexId, lockmode);
 
@@ -1596,36 +2059,8 @@ index_drop(Oid indexId, bool concurrent)
 		 */
 		WaitForLockers(heaplocktag, AccessExclusiveLock);
 
-		/*
-		 * No more predicate locks will be acquired on this index, and we're
-		 * about to stop doing inserts into the index which could show
-		 * conflicts with existing predicate locks, so now is the time to move
-		 * them to the heap relation.
-		 */
-		userHeapRelation = table_open(heapId, ShareUpdateExclusiveLock);
-		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
-		TransferPredicateLocksToHeapRelation(userIndexRelation);
-
-		/*
-		 * Now we are sure that nobody uses the index for queries; they just
-		 * might have it open for updating it.  So now we can unset indisready
-		 * and indislive, then wait till nobody could be using it at all
-		 * anymore.
-		 */
-		index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
-		/*
-		 * Invalidate the relcache for the table, so that after this commit
-		 * all sessions will refresh the table's index list.  Forgetting just
-		 * the index's relcache entry is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
-		/*
-		 * Close the relations again, though still holding session lock.
-		 */
-		table_close(userHeapRelation, NoLock);
-		index_close(userIndexRelation, NoLock);
+		/* Finish invalidation of index and mark it as dead */
+		index_concurrently_set_dead(heapId, indexId);
 
 		/*
 		 * Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index 23b01f841e..d63bf5e56d 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -395,6 +395,94 @@ changeDependencyFor(Oid classId, Oid objectId,
 	return count;
 }
 
+/*
+ * Adjust all dependency records to point to a different object of the same type
+ *
+ * refClassId/oldRefObjectId specify the old referenced object.
+ * newRefObjectId is the new referenced object (must be of class refClassId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+					 Oid newRefObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+	ObjectAddress objAddr;
+	bool		newIsPinned;
+
+	depRel = table_open(DependRelationId, RowExclusiveLock);
+
+	/*
+	 * If oldRefObjectId is pinned, there won't be any dependency entries on
+	 * it --- we can't cope in that case.  (This isn't really worth expending
+	 * code to fix, in current usage; it just means you can't rename stuff out
+	 * of pg_catalog, which would likely be a bad move anyway.)
+	 */
+	objAddr.classId = refClassId;
+	objAddr.objectId = oldRefObjectId;
+	objAddr.objectSubId = 0;
+
+	if (isObjectPinned(&objAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot remove dependency on %s because it is a system object",
+						getObjectDescription(&objAddr))));
+
+	/*
+	 * We can handle adding a dependency on something pinned, though, since
+	 * that just means deleting the dependency entry.
+	 */
+	objAddr.objectId = newRefObjectId;
+
+	newIsPinned = isObjectPinned(&objAddr, depRel);
+
+	/* Now search for dependency records */
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(refClassId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldRefObjectId));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (newIsPinned)
+			CatalogTupleDelete(depRel, &tup->t_self);
+		else
+		{
+			/* make a modifiable copy */
+			tup = heap_copytuple(tup);
+			depform = (Form_pg_depend) GETSTRUCT(tup);
+
+			depform->refobjid = newRefObjectId;
+
+			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+			heap_freetuple(tup);
+		}
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	table_close(depRel, RowExclusiveLock);
+
+	return count;
+}
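Outside the patch proper, the update-or-delete rule in changeDependenciesOn() above (repoint `refobjid` to the new object, or delete the row outright when the new referenced object is pinned, since pinned objects carry no dependency entries) can be sanity-checked with a toy model over an in-memory array instead of pg_depend. This is a standalone sketch, not backend code; `DepRec` and the function name are invented for the illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a pg_depend row (invented for this sketch) */
typedef struct DepRec
{
	int			refclassid;
	int			refobjid;
	bool		deleted;
} DepRec;

/*
 * Repoint every record referencing (refclassid, oldref) to newref, or
 * delete it when the new referenced object is pinned, since pinned
 * objects carry no dependency entries.  Returns the number of records
 * touched, mirroring the count returned by changeDependenciesOn().
 */
long
change_dependencies_on(DepRec *deps, size_t ndeps,
					   int refclassid, int oldref, int newref,
					   bool new_is_pinned)
{
	long		count = 0;

	for (size_t i = 0; i < ndeps; i++)
	{
		if (deps[i].deleted ||
			deps[i].refclassid != refclassid ||
			deps[i].refobjid != oldref)
			continue;

		if (new_is_pinned)
			deps[i].deleted = true;
		else
			deps[i].refobjid = newref;
		count++;
	}
	return count;
}
```

The same two-outcome shape is what makes the swap cheap during REINDEX CONCURRENTLY: no dependent objects are dropped or recreated, only their reference rows are rewritten.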
+
 /*
  * isObjectPinned()
  *
@@ -754,3 +842,58 @@ get_index_constraint(Oid indexId)
 
 	return constraintId;
 }
+
+/*
+ * get_index_ref_constraints
+ *		Given the OID of an index, return the OID of all foreign key
+ *		constraints which reference the index.
+ */
+List *
+get_index_ref_constraints(Oid indexId)
+{
+	List	   *result = NIL;
+	Relation	depRel;
+	ScanKeyData key[3];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	/* Search the dependency table for the index */
+	depRel = table_open(DependRelationId, AccessShareLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_refclassid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_refobjid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexId));
+	ScanKeyInit(&key[2],
+				Anum_pg_depend_refobjsubid,
+				BTEqualStrategyNumber, F_INT4EQ,
+				Int32GetDatum(0));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+							  NULL, 3, key);
+
+	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+	{
+		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+
+		/*
+		 * We assume any normal dependency from a constraint must be what we
+		 * are looking for.
+		 */
+		if (deprec->classid == ConstraintRelationId &&
+			deprec->objsubid == 0 &&
+			deprec->deptype == DEPENDENCY_NORMAL)
+		{
+			result = lappend_oid(result, deprec->objid);
+		}
+	}
+
+	systable_endscan(scan);
+	table_close(depRel, AccessShareLock);
+
+	return result;
+}
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index c3a53d81aa..62c457a8a6 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -58,6 +58,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/partcache.h"
+#include "utils/pg_rusage.h"
 #include "utils/regproc.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -83,6 +84,7 @@ static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
 static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
 								Oid relId, Oid oldRelId, void *arg);
+static bool ReindexRelationConcurrently(Oid relationOid, int options);
 static void ReindexPartitionedIndex(Relation parentIdx);
 
 /*
@@ -297,6 +299,90 @@ CheckIndexCompatible(Oid oldId,
 	return ret;
 }
 
+
+/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have an older snapshot than the given xmin
+ * limit, because such a snapshot might not contain tuples deleted just before
+ * it was taken.  Obtain a list of VXIDs of such transactions, and wait for
+ * them individually.  This is used when building an index concurrently.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours).  Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see the
+ * index being worked on.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either.  (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument.  We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait.  We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+	int			n_old_snapshots;
+	int			i;
+	VirtualTransactionId *old_snapshots;
+
+	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+										  &n_old_snapshots);
+
+	for (i = 0; i < n_old_snapshots; i++)
+	{
+		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+			continue;			/* found uninteresting in previous cycle */
+
+		if (i > 0)
+		{
+			/* see if anything's changed ... */
+			VirtualTransactionId *newer_snapshots;
+			int			n_newer_snapshots;
+			int			j;
+			int			k;
+
+			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+													true, false,
+													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+													&n_newer_snapshots);
+			for (j = i; j < n_old_snapshots; j++)
+			{
+				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+					continue;	/* found uninteresting in previous cycle */
+				for (k = 0; k < n_newer_snapshots; k++)
+				{
+					if (VirtualTransactionIdEquals(old_snapshots[j],
+												   newer_snapshots[k]))
+						break;
+				}
+				if (k >= n_newer_snapshots) /* not there anymore */
+					SetInvalidVirtualTransactionId(old_snapshots[j]);
+			}
+			pfree(newer_snapshots);
+		}
+
+		if (VirtualTransactionIdIsValid(old_snapshots[i]))
+			VirtualXactLock(old_snapshots[i], true);
+	}
+}
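The recheck logic in WaitForOlderSnapshots() above — forget any vxid that no longer shows up in the output of GetCurrentVirtualXIDs(), so we never block on a transaction that already went idle — reduces to pruning a wait list against a fresh snapshot of running vxids. As a standalone sketch (plain C, with ints standing in for VirtualTransactionId and 0 as the invalid value; this is illustration, not PostgreSQL code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INVALID_VXID 0

/* Is vxid still present in the freshly obtained running set? */
bool
vxid_still_running(int vxid, const int *running, size_t n_running)
{
	for (size_t k = 0; k < n_running; k++)
		if (running[k] == vxid)
			return true;
	return false;
}

/*
 * Mark as invalid every entry of old_vxids[first..n-1] that no longer
 * appears in the running set, so the caller never blocks waiting on a
 * vxid that already went idle or committed.
 */
void
prune_wait_list(int *old_vxids, size_t first, size_t n,
				const int *running, size_t n_running)
{
	for (size_t j = first; j < n; j++)
	{
		if (old_vxids[j] == INVALID_VXID)
			continue;			/* found uninteresting in a previous cycle */
		if (!vxid_still_running(old_vxids[j], running, n_running))
			old_vxids[j] = INVALID_VXID;	/* not there anymore */
	}
}
```

In the real function this pruning runs before each VirtualXactLock() wait, which is why a vxid that vanishes between waits costs nothing further.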
+
+
 /*
  * DefineIndex
  *		Creates a new index.
@@ -345,7 +431,6 @@ DefineIndex(Oid relationId,
 	List	   *indexColNames;
 	List	   *allIndexParams;
 	Relation	rel;
-	Relation	indexRelation;
 	HeapTuple	tuple;
 	Form_pg_am	accessMethodForm;
 	IndexAmRoutine *amRoutine;
@@ -360,9 +445,7 @@ DefineIndex(Oid relationId,
 	int			numberOfAttributes;
 	int			numberOfKeyAttributes;
 	TransactionId limitXmin;
-	VirtualTransactionId *old_snapshots;
 	ObjectAddress address;
-	int			n_old_snapshots;
 	LockRelId	heaprelid;
 	LOCKTAG		heaplocktag;
 	LOCKMODE	lockmode;
@@ -1152,34 +1235,11 @@ DefineIndex(Oid relationId,
 	 * HOT-chain or the extension of the chain is HOT-safe for this index.
 	 */
 
-	/* Open and lock the parent heap relation */
-	rel = table_open(relationId, ShareUpdateExclusiveLock);
-
-	/* And the target index relation */
-	indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
 	/* Set ActiveSnapshot since functions in the indexes may need it */
 	PushActiveSnapshot(GetTransactionSnapshot());
 
-	/* We have to re-build the IndexInfo struct, since it was lost in commit */
-	indexInfo = BuildIndexInfo(indexRelation);
-	Assert(!indexInfo->ii_ReadyForInserts);
-	indexInfo->ii_Concurrent = true;
-	indexInfo->ii_BrokenHotChain = false;
-
-	/* Now build the index */
-	index_build(rel, indexRelation, indexInfo, false, true);
-
-	/* Close both the relations, but keep the locks */
-	table_close(rel, NoLock);
-	index_close(indexRelation, NoLock);
-
-	/*
-	 * Update the pg_index row to mark the index as ready for inserts. Once we
-	 * commit this transaction, any new transactions that open the table must
-	 * insert new entries into the index for insertions and non-HOT updates.
-	 */
-	index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+	/* Perform concurrent build of index */
+	index_concurrently_build(relationId, indexRelationId);
 
 	/* we can do away with our snapshot */
 	PopActiveSnapshot();
@@ -1251,74 +1311,9 @@ DefineIndex(Oid relationId,
 	 * The index is now valid in the sense that it contains all currently
 	 * interesting tuples.  But since it might not contain tuples deleted just
 	 * before the reference snap was taken, we have to wait out any
-	 * transactions that might have older snapshots.  Obtain a list of VXIDs
-	 * of such transactions, and wait for them individually.
-	 *
-	 * We can exclude any running transactions that have xmin > the xmin of
-	 * our reference snapshot; their oldest snapshot must be newer than ours.
-	 * We can also exclude any transactions that have xmin = zero, since they
-	 * evidently have no live snapshot at all (and any one they might be in
-	 * process of taking is certainly newer than ours).  Transactions in other
-	 * DBs can be ignored too, since they'll never even be able to see this
-	 * index.
-	 *
-	 * We can also exclude autovacuum processes and processes running manual
-	 * lazy VACUUMs, because they won't be fazed by missing index entries
-	 * either.  (Manual ANALYZEs, however, can't be excluded because they
-	 * might be within transactions that are going to do arbitrary operations
-	 * later.)
-	 *
-	 * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
-	 * check for that.
-	 *
-	 * If a process goes idle-in-transaction with xmin zero, we do not need to
-	 * wait for it anymore, per the above argument.  We do not have the
-	 * infrastructure right now to stop waiting if that happens, but we can at
-	 * least avoid the folly of waiting when it is idle at the time we would
-	 * begin to wait.  We do this by repeatedly rechecking the output of
-	 * GetCurrentVirtualXIDs.  If, during any iteration, a particular vxid
-	 * doesn't show up in the output, we know we can forget about it.
+	 * transactions that might have older snapshots.
 	 */
-	old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
-										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-										  &n_old_snapshots);
-
-	for (i = 0; i < n_old_snapshots; i++)
-	{
-		if (!VirtualTransactionIdIsValid(old_snapshots[i]))
-			continue;			/* found uninteresting in previous cycle */
-
-		if (i > 0)
-		{
-			/* see if anything's changed ... */
-			VirtualTransactionId *newer_snapshots;
-			int			n_newer_snapshots;
-			int			j;
-			int			k;
-
-			newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
-													true, false,
-													PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
-													&n_newer_snapshots);
-			for (j = i; j < n_old_snapshots; j++)
-			{
-				if (!VirtualTransactionIdIsValid(old_snapshots[j]))
-					continue;	/* found uninteresting in previous cycle */
-				for (k = 0; k < n_newer_snapshots; k++)
-				{
-					if (VirtualTransactionIdEquals(old_snapshots[j],
-												   newer_snapshots[k]))
-						break;
-				}
-				if (k >= n_newer_snapshots) /* not there anymore */
-					SetInvalidVirtualTransactionId(old_snapshots[j]);
-			}
-			pfree(newer_snapshots);
-		}
-
-		if (VirtualTransactionIdIsValid(old_snapshots[i]))
-			VirtualXactLock(old_snapshots[i], true);
-	}
+	WaitForOlderSnapshots(limitXmin);
 
 	/*
 	 * Index can now be marked valid -- update its pg_index entry
@@ -2205,7 +2200,7 @@ ChooseIndexColumnNames(List *indexElems)
  *		Recreate a specific index.
  */
 void
-ReindexIndex(RangeVar *indexRelation, int options)
+ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
@@ -2217,7 +2212,8 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	 * obtain lock on table first, to avoid deadlock hazard.  The lock level
 	 * used here must match the index lock obtained in reindex_index().
 	 */
-	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
+	indOid = RangeVarGetRelidExtended(indexRelation,
+									  concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
 									  0,
 									  RangeVarCallbackForReindexIndex,
 									  (void *) &heapOid);
@@ -2237,7 +2233,10 @@ ReindexIndex(RangeVar *indexRelation, int options)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
-	reindex_index(indOid, false, persistence, options);
+	if (concurrent)
+		ReindexRelationConcurrently(indOid, options);
+	else
+		reindex_index(indOid, false, persistence, options);
 }
 
 /*
@@ -2305,18 +2304,26 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
  *		Recreate all indexes of a table (and of its toast table, if any)
  */
 Oid
-ReindexTable(RangeVar *relation, int options)
+ReindexTable(RangeVar *relation, int options, bool concurrent)
 {
 	Oid			heapOid;
+	bool		result;
 
 	/* The lock level used here should match reindex_relation(). */
-	heapOid = RangeVarGetRelidExtended(relation, ShareLock, 0,
+	heapOid = RangeVarGetRelidExtended(relation,
+									   concurrent ? ShareUpdateExclusiveLock : ShareLock,
+									   0,
 									   RangeVarCallbackOwnsTable, NULL);
 
-	if (!reindex_relation(heapOid,
-						  REINDEX_REL_PROCESS_TOAST |
-						  REINDEX_REL_CHECK_CONSTRAINTS,
-						  options))
+	if (concurrent)
+		result = ReindexRelationConcurrently(heapOid, options);
+	else
+		result = reindex_relation(heapOid,
+								  REINDEX_REL_PROCESS_TOAST |
+								  REINDEX_REL_CHECK_CONSTRAINTS,
+								  options);
+
+	if (!result)
 		ereport(NOTICE,
 				(errmsg("table \"%s\" has no indexes",
 						relation->relname)));
@@ -2334,7 +2341,7 @@ ReindexTable(RangeVar *relation, int options)
  */
 void
 ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options)
+					  int options, bool concurrent)
 {
 	Oid			objectOid;
 	Relation	relationRelation;
@@ -2346,12 +2353,18 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	List	   *relids = NIL;
 	ListCell   *l;
 	int			num_keys;
+	bool		concurrent_warning = false;
 
 	AssertArg(objectName);
 	Assert(objectKind == REINDEX_OBJECT_SCHEMA ||
 		   objectKind == REINDEX_OBJECT_SYSTEM ||
 		   objectKind == REINDEX_OBJECT_DATABASE);
 
+	if (objectKind == REINDEX_OBJECT_SYSTEM && concurrent)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("concurrent reindex of system catalogs is not supported")));
+
 	/*
 	 * Get OID of object to reindex, being the database currently being used
 	 * by session for a database or for system catalogs, or the schema defined
@@ -2454,6 +2467,25 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 			!pg_class_ownercheck(relid, GetUserId()))
 			continue;
 
+		/*
+		 * Skip system tables that index_create() would refuse to index
+		 * concurrently.  XXX We need the additional check for
+		 * FirstNormalObjectId to skip information_schema tables, because
+		 * IsCatalogClass() here does not cover information_schema, but the
+		 * check in index_create() will error on the TOAST tables of
+		 * information_schema tables.
+		 */
+		if (concurrent &&
+			(IsCatalogClass(relid, classtuple) || relid < FirstNormalObjectId))
+		{
+			if (!concurrent_warning)
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("concurrent reindex is not supported for catalog relations, skipping all")));
+			concurrent_warning = true;
+			continue;
+		}
+
 		/* Save the list of relation OIDs in private context */
 		old = MemoryContextSwitchTo(private_context);
 
@@ -2480,26 +2512,663 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
 	foreach(l, relids)
 	{
 		Oid			relid = lfirst_oid(l);
+		bool		result;
 
 		StartTransactionCommand();
 		/* functions in indexes may want a snapshot set */
 		PushActiveSnapshot(GetTransactionSnapshot());
-		if (reindex_relation(relid,
-							 REINDEX_REL_PROCESS_TOAST |
-							 REINDEX_REL_CHECK_CONSTRAINTS,
-							 options))
 
-			if (options & REINDEXOPT_VERBOSE)
+		if (concurrent)
+		{
+			result = ReindexRelationConcurrently(relid, options);
+			/* ReindexRelationConcurrently() does the verbose output */
+		}
+		else
+		{
+			result = reindex_relation(relid,
+									  REINDEX_REL_PROCESS_TOAST |
+									  REINDEX_REL_CHECK_CONSTRAINTS,
+									  options);
+
+			if (result && (options & REINDEXOPT_VERBOSE))
 				ereport(INFO,
 						(errmsg("table \"%s.%s\" was reindexed",
 								get_namespace_name(get_rel_namespace(relid)),
 								get_rel_name(relid))));
+
+			PopActiveSnapshot();
+		}
+
+		CommitTransactionCommand();
+	}
+	StartTransactionCommand();
+
+	MemoryContextDelete(private_context);
+}
+
+
+/*
+ * ReindexRelationConcurrently - process REINDEX CONCURRENTLY for given
+ * relation OID
+ *
+ * The relation can be either an index or a table.  If it is a table, all its
+ * valid indexes will be rebuilt, including its associated toast table
+ * indexes.  If it is an index, this index itself will be rebuilt.
+ *
+ * The locks taken on parent tables and involved indexes are kept until the
+ * transaction is committed, at which point a session lock is taken on each
+ * relation.  Both of these protect against concurrent schema changes.
+ */
+static bool
+ReindexRelationConcurrently(Oid relationOid, int options)
+{
+	List	   *heapRelationIds = NIL;
+	List	   *indexIds = NIL;
+	List	   *newIndexIds = NIL;
+	List	   *relationLocks = NIL;
+	List	   *lockTags = NIL;
+	ListCell   *lc,
+			   *lc2;
+	MemoryContext private_context;
+	MemoryContext oldcontext;
+	char		relkind;
+	char	   *relationName = NULL;
+	char	   *relationNamespace = NULL;
+	PGRUsage	ru0;
+
+	/*
+	 * Create a memory context that will survive forced transaction commits we
+	 * do below.  Since it is a child of PortalContext, it will go away
+	 * eventually even if we suffer an error; there's no need for special
+	 * abort cleanup logic.
+	 */
+	private_context = AllocSetContextCreate(PortalContext,
+											"ReindexConcurrent",
+											ALLOCSET_SMALL_SIZES);
+
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		/* Save data needed by REINDEX VERBOSE in private context */
+		oldcontext = MemoryContextSwitchTo(private_context);
+
+		relationName = get_rel_name(relationOid);
+		relationNamespace = get_namespace_name(get_rel_namespace(relationOid));
+
+		pg_rusage_init(&ru0);
+
+		MemoryContextSwitchTo(oldcontext);
+	}
+
+	relkind = get_rel_relkind(relationOid);
+
+	/*
+	 * Extract the list of indexes that are going to be rebuilt based on the
+	 * list of relation Oids given by caller.
+	 */
+	switch (relkind)
+	{
+		case RELKIND_RELATION:
+		case RELKIND_MATVIEW:
+		case RELKIND_TOASTVALUE:
+			{
+				/*
+				 * In the case of a relation, find all its indexes including
+				 * toast indexes.
+				 */
+				Relation	heapRelation;
+
+				/* Save the list of relation OIDs in private context */
+				oldcontext = MemoryContextSwitchTo(private_context);
+
+				/* Track this relation for session locks */
+				heapRelationIds = lappend_oid(heapRelationIds, relationOid);
+
+				MemoryContextSwitchTo(oldcontext);
+
+				/* Open relation to get its indexes */
+				heapRelation = table_open(relationOid, ShareUpdateExclusiveLock);
+
+				/* Add all the valid indexes of relation to list */
+				foreach(lc, RelationGetIndexList(heapRelation))
+				{
+					Oid			cellOid = lfirst_oid(lc);
+					Relation	indexRelation = index_open(cellOid,
+														   ShareUpdateExclusiveLock);
+
+					if (!indexRelation->rd_index->indisvalid)
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else if (indexRelation->rd_index->indisexclusion)
+						ereport(WARNING,
+								(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+								 errmsg("cannot reindex exclusion constraint index \"%s.%s\" concurrently, skipping",
+										get_namespace_name(get_rel_namespace(cellOid)),
+										get_rel_name(cellOid))));
+					else
+					{
+						/* Save the list of relation OIDs in private context */
+						oldcontext = MemoryContextSwitchTo(private_context);
+
+						indexIds = lappend_oid(indexIds, cellOid);
+
+						MemoryContextSwitchTo(oldcontext);
+					}
+
+					index_close(indexRelation, NoLock);
+				}
+
+				/* Also add the toast indexes */
+				if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+				{
+					Oid			toastOid = heapRelation->rd_rel->reltoastrelid;
+					Relation	toastRelation = table_open(toastOid,
+														   ShareUpdateExclusiveLock);
+
+					/* Save the list of relation OIDs in private context */
+					oldcontext = MemoryContextSwitchTo(private_context);
+
+					/* Track this relation for session locks */
+					heapRelationIds = lappend_oid(heapRelationIds, toastOid);
+
+					MemoryContextSwitchTo(oldcontext);
+
+					foreach(lc2, RelationGetIndexList(toastRelation))
+					{
+						Oid			cellOid = lfirst_oid(lc2);
+						Relation	indexRelation = index_open(cellOid,
+															   ShareUpdateExclusiveLock);
+
+						if (!indexRelation->rd_index->indisvalid)
+							ereport(WARNING,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+											get_namespace_name(get_rel_namespace(cellOid)),
+											get_rel_name(cellOid))));
+						else
+						{
+							/*
+							 * Save the list of relation OIDs in private
+							 * context
+							 */
+							oldcontext = MemoryContextSwitchTo(private_context);
+
+							indexIds = lappend_oid(indexIds, cellOid);
+
+							MemoryContextSwitchTo(oldcontext);
+						}
+
+						index_close(indexRelation, NoLock);
+					}
+
+					table_close(toastRelation, NoLock);
+				}
+
+				table_close(heapRelation, NoLock);
+				break;
+			}
+		case RELKIND_INDEX:
+			{
+				/*
+				 * For an index, simply add its OID to the list.  Invalid
+				 * indexes cannot be included in the list.
+				 */
+				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+				Oid			heapId = IndexGetRelation(relationOid, false);
+
+				/* A shared relation cannot be reindexed concurrently */
+				if (IsSharedRelation(heapId))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for shared relations")));
+
+				/* A system catalog cannot be reindexed concurrently */
+				if (IsSystemNamespace(get_rel_namespace(heapId)))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("concurrent reindex is not supported for catalog relations")));
+
+				/* Save the list of relation OIDs in private context */
+				oldcontext = MemoryContextSwitchTo(private_context);
+
+				/* Track the heap relation of this index for session locks */
+				heapRelationIds = list_make1_oid(heapId);
+
+				MemoryContextSwitchTo(oldcontext);
+
+				if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					oldcontext = MemoryContextSwitchTo(private_context);
+
+					indexIds = lappend_oid(indexIds, relationOid);
+
+					MemoryContextSwitchTo(oldcontext);
+				}
+
+				index_close(indexRelation, NoLock);
+				break;
+			}
+		case RELKIND_PARTITIONED_TABLE:
+			/* see reindex_relation() */
+			ereport(WARNING,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("REINDEX of partitioned tables is not yet implemented, skipping \"%s\"",
+							get_rel_name(relationOid))));
+			return false;
+		default:
+			/* Return error if type of relation is not supported */
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot reindex this type of relation concurrently")));
+			break;
+	}
+
+	/* Definitely no indexes, so leave */
+	if (indexIds == NIL)
+	{
+		PopActiveSnapshot();
+		return false;
+	}
+
+	Assert(heapRelationIds != NIL);
+
+	/*-----
+	 * Now we have all the indexes we want to process in indexIds.
+	 *
+	 * The phases now are:
+	 *
+	 * 1. create new indexes in the catalog
+	 * 2. build new indexes
+	 * 3. let new indexes catch up with tuples inserted in the meantime
+	 * 4. swap index names
+	 * 5. mark old indexes as dead
+	 * 6. drop old indexes
+	 *
+	 * We process each phase for all indexes before moving to the next phase,
+	 * for efficiency.
+	 */
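The ordering the comment above describes — run phase N for all indexes before any index starts phase N+1, instead of finishing one index at a time — is a phase-major double loop. A minimal standalone illustration (generic C, invented names, not the patch's code):

```c
#include <assert.h>
#include <stddef.h>

#define NPHASES 6

/* One processed step: which phase was applied to which index */
typedef struct Step
{
	int			phase;
	int			index;
} Step;

/*
 * Process each phase for all indexes before moving to the next phase,
 * appending every step to log[] (which must hold NPHASES * nindexes
 * entries); returns the number of steps taken.
 */
size_t
run_phases(int nindexes, Step *log)
{
	size_t		nsteps = 0;

	for (int phase = 1; phase <= NPHASES; phase++)
		for (int i = 0; i < nindexes; i++)
		{
			log[nsteps].phase = phase;
			log[nsteps].index = i;
			nsteps++;
		}
	return nsteps;
}
```

The payoff of this ordering in the patch is that the expensive wait steps (WaitForLockersMultiple, WaitForOlderSnapshots) are paid once per phase rather than once per index.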
+
+	/*
+	 * Phase 1 of REINDEX CONCURRENTLY
+	 *
+	 * Create a new index with the same properties as the old one, but it is
+	 * only registered in catalogs and will be built later.  Then get session
+	 * locks on all involved tables.  See analogous code in DefineIndex() for
+	 * more detailed comments.
+	 */
+
+	foreach(lc, indexIds)
+	{
+		char	   *concurrentName;
+		Oid			indexId = lfirst_oid(lc);
+		Oid			newIndexId;
+		Relation	indexRel;
+		Relation	heapRel;
+		Relation	newIndexRel;
+		LockRelId	lockrelid;
+
+		indexRel = index_open(indexId, ShareUpdateExclusiveLock);
+		heapRel = table_open(indexRel->rd_index->indrelid,
+							 ShareUpdateExclusiveLock);
+
+		/* Choose a temporary relation name for the new index */
+		concurrentName = ChooseRelationName(get_rel_name(indexId),
+											NULL,
+											"ccnew",
+											get_rel_namespace(indexRel->rd_index->indrelid),
+											false);
+
+		/* Create new index definition based on given index */
+		newIndexId = index_concurrently_create_copy(heapRel,
+													indexId,
+													concurrentName);
+
+		/* Now open the relation of the new index; a lock is also needed on it */
+		newIndexRel = index_open(newIndexId, ShareUpdateExclusiveLock);
+
+		/*
+		 * Save the list of OIDs and locks in private context
+		 */
+		oldcontext = MemoryContextSwitchTo(private_context);
+
+		newIndexIds = lappend_oid(newIndexIds, newIndexId);
+
+		/*
+		 * Save the lockrelids to protect each relation from being dropped,
+		 * then close the relations.  The entries must be palloc'd copies so
+		 * that they survive past this loop iteration.  The lockrelid of the
+		 * parent relation is not saved here, to avoid taking multiple locks
+		 * on the same relation; instead we rely on heapRelationIds built
+		 * earlier.
+		 */
+		lockrelid = indexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+		lockrelid = newIndexRel->rd_lockInfo.lockRelId;
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		MemoryContextSwitchTo(oldcontext);
+
+		index_close(indexRel, NoLock);
+		index_close(newIndexRel, NoLock);
+		table_close(heapRel, NoLock);
+	}
+
+	/*
+	 * Save the heap lock for the following visibility checks, since other
+	 * backends might conflict with this session.
+	 */
+	foreach(lc, heapRelationIds)
+	{
+		Relation	heapRelation = table_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+		LockRelId	lockrelid = heapRelation->rd_lockInfo.lockRelId;
+		LOCKTAG    *heaplocktag;
+
+		/* Save the list of locks in private context */
+		oldcontext = MemoryContextSwitchTo(private_context);
+
+		/* Add a copy of the heap relation's lockrelid to the list */
+		relationLocks = lappend(relationLocks,
+								memcpy(palloc(sizeof(LockRelId)),
+									   &lockrelid, sizeof(LockRelId)));
+
+		heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+		/* Save the LOCKTAG for this parent relation for the wait phase */
+		SET_LOCKTAG_RELATION(*heaplocktag, lockrelid.dbId, lockrelid.relId);
+		lockTags = lappend(lockTags, heaplocktag);
+
+		MemoryContextSwitchTo(oldcontext);
+
+		/* Close heap relation */
+		table_close(heapRelation, NoLock);
+	}
+
+	/* Get a session-level lock on each table. */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+	StartTransactionCommand();
+
+	/*
+	 * Phase 2 of REINDEX CONCURRENTLY
+	 *
+	 * Build the new indexes in a separate transaction for each index to avoid
+	 * having open transactions for an unnecessarily long time.  But before
+	 * doing that, wait until no running transactions could have the table of
+	 * the index open with the old list of indexes.  See "phase 2" in
+	 * DefineIndex() for more details.
+	 */
+
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		Relation	indexRel;
+		Oid			oldIndexId = lfirst_oid(lc);
+		Oid			newIndexId = lfirst_oid(lc2);
+		Oid			heapId;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Start new transaction for this index's concurrent build */
+		StartTransactionCommand();
+
+		/* Set ActiveSnapshot since functions in the indexes may need it */
+		PushActiveSnapshot(GetTransactionSnapshot());
+
+		/*
+		 * Index relation has been closed by previous commit, so reopen it to
+		 * get its information.
+		 */
+		indexRel = index_open(oldIndexId, ShareUpdateExclusiveLock);
+		heapId = indexRel->rd_index->indrelid;
+		index_close(indexRel, NoLock);
+
+		/* Perform concurrent build of new index */
+		index_concurrently_build(heapId, newIndexId);
+
+		PopActiveSnapshot();
+		CommitTransactionCommand();
+	}
+	StartTransactionCommand();
+
+	/*
+	 * Phase 3 of REINDEX CONCURRENTLY
+	 *
+	 * During this phase the old indexes catch up with any new tuples that
+	 * were created during the previous phase.  See "phase 3" in DefineIndex()
+	 * for more details.
+	 */
+
+	WaitForLockersMultiple(lockTags, ShareLock);
+	CommitTransactionCommand();
+
+	foreach(lc, newIndexIds)
+	{
+		Oid			newIndexId = lfirst_oid(lc);
+		Oid			heapId;
+		TransactionId limitXmin;
+		Snapshot	snapshot;
+
+		CHECK_FOR_INTERRUPTS();
+
+		StartTransactionCommand();
+
+		heapId = IndexGetRelation(newIndexId, false);
+
+		/*
+		 * Take the "reference snapshot" that will be used by validate_index()
+		 * to filter candidate tuples.
+		 */
+		snapshot = RegisterSnapshot(GetTransactionSnapshot());
+		PushActiveSnapshot(snapshot);
+
+		validate_index(heapId, newIndexId, snapshot);
+
+		/*
+		 * We can now do away with our active snapshot, but we still need to
+		 * save the xmin limit to wait for older snapshots.
+		 */
+		limitXmin = snapshot->xmin;
+
 		PopActiveSnapshot();
+		UnregisterSnapshot(snapshot);
+
+		/*
+		 * To ensure no deadlocks, we must commit and start yet another
+		 * transaction, and do our wait before any snapshot has been taken in
+		 * it.
+		 */
+		CommitTransactionCommand();
+		StartTransactionCommand();
+
+		/*
+		 * The index is now valid in the sense that it contains all currently
+		 * interesting tuples.  But since it might not contain tuples deleted
+		 * just before the reference snapshot was taken, we have to wait out
+		 * any transactions that might have older snapshots.
+		 */
+		WaitForOlderSnapshots(limitXmin);
+
 		CommitTransactionCommand();
 	}
+
+	/*
+	 * Phase 4 of REINDEX CONCURRENTLY
+	 *
+	 * Now that the new indexes have been validated, swap each new index with
+	 * its corresponding old index.
+	 *
+	 * We mark the new indexes as valid and the old indexes as not valid at
+	 * the same time to make sure we only get constraint violations from the
+	 * indexes with the correct names.
+	 */
+
 	StartTransactionCommand();
 
+	forboth(lc, indexIds, lc2, newIndexIds)
+	{
+		char	   *oldName;
+		Oid			oldIndexId = lfirst_oid(lc);
+		Oid			newIndexId = lfirst_oid(lc2);
+		Oid			heapId;
+
+		CHECK_FOR_INTERRUPTS();
+
+		heapId = IndexGetRelation(oldIndexId, false);
+
+		/* Choose a relation name for old index */
+		oldName = ChooseRelationName(get_rel_name(oldIndexId),
+									 NULL,
+									 "ccold",
+									 get_rel_namespace(heapId),
+									 false);
+
+		/*
+		 * Swap old index with the new one.  This also marks the new one as
+		 * valid and the old one as not valid.
+		 */
+		index_concurrently_swap(newIndexId, oldIndexId, oldName);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this commit
+		 * all sessions will refresh any cached plans that might reference the
+		 * index.
+		 */
+		CacheInvalidateRelcacheByRelid(heapId);
+
+		/*
+		 * CCI here so that subsequent iterations see the oldName in the
+		 * catalog and can choose a nonconflicting name for their oldName.
+		 * Otherwise, this could lead to conflicts if a table has two indexes
+		 * whose names are equal for the first NAMEDATALEN-minus-a-few
+		 * characters.
+		 */
+		CommandCounterIncrement();
+	}
+
+	/* Commit this transaction and make index swaps visible */
+	CommitTransactionCommand();
+	StartTransactionCommand();
+
+	/*
+	 * Phase 5 of REINDEX CONCURRENTLY
+	 *
+	 * Mark the old indexes as dead.  First we must wait until no running
+	 * transaction could be using the index for a query.  See also
+	 * index_drop() for more details.
+	 */
+
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	foreach(lc, indexIds)
+	{
+		Oid			oldIndexId = lfirst_oid(lc);
+		Oid			heapId;
+
+		CHECK_FOR_INTERRUPTS();
+		heapId = IndexGetRelation(oldIndexId, false);
+		index_concurrently_set_dead(heapId, oldIndexId);
+	}
+
+	/* Commit this transaction to make the updates visible. */
+	CommitTransactionCommand();
+	StartTransactionCommand();
+
+	/*
+	 * Phase 6 of REINDEX CONCURRENTLY
+	 *
+	 * Drop the old indexes.
+	 */
+
+	WaitForLockersMultiple(lockTags, AccessExclusiveLock);
+
+	PushActiveSnapshot(GetTransactionSnapshot());
+
+	{
+		ObjectAddresses *objects = new_object_addresses();
+
+		foreach(lc, indexIds)
+		{
+			Oid			oldIndexId = lfirst_oid(lc);
+			ObjectAddress *object = palloc(sizeof(ObjectAddress));
+
+			object->classId = RelationRelationId;
+			object->objectId = oldIndexId;
+			object->objectSubId = 0;
+
+			add_exact_object_address(object, objects);
+		}
+
+		/*
+		 * Use PERFORM_DELETION_CONCURRENT_LOCK so that index_drop() uses the
+		 * right lock level.
+		 */
+		performMultipleDeletions(objects, DROP_RESTRICT,
+								 PERFORM_DELETION_CONCURRENT_LOCK | PERFORM_DELETION_INTERNAL);
+	}
+
+	PopActiveSnapshot();
+	CommitTransactionCommand();
+
+	/*
+	 * Finally, release the session-level locks on the tables.
+	 */
+	foreach(lc, relationLocks)
+	{
+		LockRelId	lockRel = *((LockRelId *) lfirst(lc));
+
+		UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+	}
+
+	/* Start a new transaction to finish the process properly */
+	StartTransactionCommand();
+
+	/* Log what we did */
+	if (options & REINDEXOPT_VERBOSE)
+	{
+		if (relkind == RELKIND_INDEX)
+			ereport(INFO,
+					(errmsg("index \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+		else
+		{
+			foreach(lc, newIndexIds)
+			{
+				Oid			indOid = lfirst_oid(lc);
+
+				ereport(INFO,
+						(errmsg("index \"%s.%s\" was reindexed",
+								get_namespace_name(get_rel_namespace(indOid)),
+								get_rel_name(indOid))));
+				/* Don't show rusage here, since it's not per index. */
+			}
+
+			ereport(INFO,
+					(errmsg("table \"%s.%s\" was reindexed",
+							relationNamespace, relationName),
+					 errdetail("%s.",
+							   pg_rusage_show(&ru0))));
+		}
+	}
+
 	MemoryContextDelete(private_context);
+
+	return true;
 }
 
 /*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3183b2aaa1..7bd49fdb7b 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1299,6 +1299,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 	bool		is_partition;
 	Form_pg_class classform;
 	LOCKMODE	heap_lockmode;
+	bool		invalid_system_index = false;
 
 	state = (struct DropRelationCallbackState *) arg;
 	relkind = state->relkind;
@@ -1361,7 +1362,36 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
 		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(get_rel_relkind(relOid)),
 					   rel->relname);
 
-	if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+	/*
+	 * Check for the case of a system index that might have been invalidated
+	 * by a failed concurrent operation, and allow it to be dropped.  For the
+	 * time being, this only concerns indexes of toast relations that became
+	 * invalid during a REINDEX CONCURRENTLY process.
+	 */
+	if (IsSystemClass(relOid, classform) && relkind == RELKIND_INDEX)
+	{
+		HeapTuple		locTuple;
+		Form_pg_index	indexform;
+		bool			indisvalid;
+
+		locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+		if (!HeapTupleIsValid(locTuple))
+		{
+			ReleaseSysCache(tuple);
+			return;
+		}
+
+		indexform = (Form_pg_index) GETSTRUCT(locTuple);
+		indisvalid = indexform->indisvalid;
+		ReleaseSysCache(locTuple);
+
+		/* Mark object as being an invalid index of system catalogs */
+		if (!indisvalid)
+			invalid_system_index = true;
+	}
+
+	/* In the case of an invalid index, it is fine to bypass this check */
+	if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index d97781e1cb..9a2400c015 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4367,6 +4367,7 @@ _copyReindexStmt(const ReindexStmt *from)
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(name);
 	COPY_SCALAR_FIELD(options);
+	COPY_SCALAR_FIELD(concurrent);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 91c007ad5b..7eb9f1dd92 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2103,6 +2103,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(name);
 	COMPARE_SCALAR_FIELD(options);
+	COMPARE_SCALAR_FIELD(concurrent);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0a4822829a..d711f9a736 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -8300,42 +8300,46 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type <name>
+ *		REINDEX [ (options) ] type [CONCURRENTLY] <name>
  *****************************************************************************/
 
 ReindexStmt:
-			REINDEX reindex_target_type qualified_name
+			REINDEX reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->relation = $3;
+					n->concurrent = $3;
+					n->relation = $4;
 					n->name = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX reindex_target_multitable name
+			| REINDEX reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $2;
-					n->name = $3;
+					n->concurrent = $3;
+					n->name = $4;
 					n->relation = NULL;
 					n->options = 0;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_type qualified_name
+			| REINDEX '(' reindex_option_list ')' reindex_target_type opt_concurrently qualified_name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->relation = $6;
+					n->concurrent = $6;
+					n->relation = $7;
 					n->name = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
 				}
-			| REINDEX '(' reindex_option_list ')' reindex_target_multitable name
+			| REINDEX '(' reindex_option_list ')' reindex_target_multitable opt_concurrently name
 				{
 					ReindexStmt *n = makeNode(ReindexStmt);
 					n->kind = $5;
-					n->name = $6;
+					n->concurrent = $6;
+					n->name = $7;
 					n->relation = NULL;
 					n->options = $3;
 					$$ = (Node *)n;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 857b7a8b43..edf24c438c 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -774,16 +774,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ReindexStmt *stmt = (ReindexStmt *) parsetree;
 
+				if (stmt->concurrent)
+					PreventInTransactionBlock(isTopLevel,
+											  "REINDEX CONCURRENTLY");
+
 				/* we choose to allow this during "read only" transactions */
 				PreventCommandDuringRecovery("REINDEX");
 				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
-						ReindexIndex(stmt->relation, stmt->options);
+						ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_TABLE:
-						ReindexTable(stmt->relation, stmt->options);
+						ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
 						break;
 					case REINDEX_OBJECT_SCHEMA:
 					case REINDEX_OBJECT_SYSTEM:
@@ -799,7 +803,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 												  (stmt->kind == REINDEX_OBJECT_SCHEMA) ? "REINDEX SCHEMA" :
 												  (stmt->kind == REINDEX_OBJECT_SYSTEM) ? "REINDEX SYSTEM" :
 												  "REINDEX DATABASE");
-						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
+						ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);
 						break;
 					default:
 						elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 5d8634d818..82511e34ac 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -2192,6 +2192,22 @@ command_no_begin(const char *query)
 			return true;
 		if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
 			return true;
+		if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+							 pg_strncasecmp(query, "table", 5) == 0))
+		{
+			query += wordlen;
+			query = skip_white_space(query);
+			wordlen = 0;
+			while (isalpha((unsigned char) query[wordlen]))
+				wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+			/*
+			 * REINDEX [ TABLE | INDEX ] CONCURRENTLY are not allowed in
+			 * xacts.
+			 */
+			if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+				return true;
+		}
 
 		/* DROP INDEX CONCURRENTLY isn't allowed in xacts */
 		if (wordlen == 5 && pg_strncasecmp(query, "index", 5) == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 3ba3498496..2270fc5bea 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3213,12 +3213,24 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("REINDEX"))
 		COMPLETE_WITH("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches("REINDEX", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "INDEX"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+								   " UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SCHEMA"))
-		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+							" UNION SELECT 'CONCURRENTLY'");
 	else if (Matches("REINDEX", "SYSTEM|DATABASE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_databases
+							" UNION SELECT 'CONCURRENTLY'");
+	else if (Matches("REINDEX", "TABLE", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables, NULL);
+	else if (Matches("REINDEX", "INDEX", "CONCURRENTLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
+	else if (Matches("REINDEX", "SCHEMA", "CONCURRENTLY"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
+	else if (Matches("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
 
 /* SECURITY LABEL */
diff --git a/src/bin/scripts/reindexdb.c b/src/bin/scripts/reindexdb.c
index 1cd1ccc951..438500cb08 100644
--- a/src/bin/scripts/reindexdb.c
+++ b/src/bin/scripts/reindexdb.c
@@ -19,16 +19,17 @@ static void reindex_one_database(const char *name, const char *dbname,
 					 const char *type, const char *host,
 					 const char *port, const char *username,
 					 enum trivalue prompt_password, const char *progname,
-					 bool echo, bool verbose);
+					 bool echo, bool verbose, bool concurrently);
 static void reindex_all_databases(const char *maintenance_db,
 					  const char *host, const char *port,
 					  const char *username, enum trivalue prompt_password,
 					  const char *progname, bool echo,
-					  bool quiet, bool verbose);
+					  bool quiet, bool verbose, bool concurrently);
 static void reindex_system_catalogs(const char *dbname,
 						const char *host, const char *port,
 						const char *username, enum trivalue prompt_password,
-						const char *progname, bool echo, bool verbose);
+						const char *progname, bool echo, bool verbose,
+						bool concurrently);
 static void help(const char *progname);
 
 int
@@ -49,6 +50,7 @@ main(int argc, char *argv[])
 		{"table", required_argument, NULL, 't'},
 		{"index", required_argument, NULL, 'i'},
 		{"verbose", no_argument, NULL, 'v'},
+		{"concurrently", no_argument, NULL, 1},
 		{"maintenance-db", required_argument, NULL, 2},
 		{NULL, 0, NULL, 0}
 	};
@@ -68,6 +70,7 @@ main(int argc, char *argv[])
 	bool		echo = false;
 	bool		quiet = false;
 	bool		verbose = false;
+	bool		concurrently = false;
 	SimpleStringList indexes = {NULL, NULL};
 	SimpleStringList tables = {NULL, NULL};
 	SimpleStringList schemas = {NULL, NULL};
@@ -124,6 +127,9 @@ main(int argc, char *argv[])
 			case 'v':
 				verbose = true;
 				break;
+			case 1:
+				concurrently = true;
+				break;
 			case 2:
 				maintenance_db = pg_strdup(optarg);
 				break;
@@ -182,7 +188,7 @@ main(int argc, char *argv[])
 		}
 
 		reindex_all_databases(maintenance_db, host, port, username,
-							  prompt_password, progname, echo, quiet, verbose);
+							  prompt_password, progname, echo, quiet, verbose, concurrently);
 	}
 	else if (syscatalog)
 	{
@@ -213,7 +219,7 @@ main(int argc, char *argv[])
 		}
 
 		reindex_system_catalogs(dbname, host, port, username, prompt_password,
-								progname, echo, verbose);
+								progname, echo, verbose, concurrently);
 	}
 	else
 	{
@@ -234,7 +240,7 @@ main(int argc, char *argv[])
 			for (cell = schemas.head; cell; cell = cell->next)
 			{
 				reindex_one_database(cell->val, dbname, "SCHEMA", host, port,
-									 username, prompt_password, progname, echo, verbose);
+									 username, prompt_password, progname, echo, verbose, concurrently);
 			}
 		}
 
@@ -245,7 +251,7 @@ main(int argc, char *argv[])
 			for (cell = indexes.head; cell; cell = cell->next)
 			{
 				reindex_one_database(cell->val, dbname, "INDEX", host, port,
-									 username, prompt_password, progname, echo, verbose);
+									 username, prompt_password, progname, echo, verbose, concurrently);
 			}
 		}
 		if (tables.head != NULL)
@@ -255,7 +261,7 @@ main(int argc, char *argv[])
 			for (cell = tables.head; cell; cell = cell->next)
 			{
 				reindex_one_database(cell->val, dbname, "TABLE", host, port,
-									 username, prompt_password, progname, echo, verbose);
+									 username, prompt_password, progname, echo, verbose, concurrently);
 			}
 		}
 
@@ -265,7 +271,7 @@ main(int argc, char *argv[])
 		 */
 		if (indexes.head == NULL && tables.head == NULL && schemas.head == NULL)
 			reindex_one_database(NULL, dbname, "DATABASE", host, port,
-								 username, prompt_password, progname, echo, verbose);
+								 username, prompt_password, progname, echo, verbose, concurrently);
 	}
 
 	exit(0);
@@ -275,7 +281,7 @@ static void
 reindex_one_database(const char *name, const char *dbname, const char *type,
 					 const char *host, const char *port, const char *username,
 					 enum trivalue prompt_password, const char *progname, bool echo,
-					 bool verbose)
+					 bool verbose, bool concurrently)
 {
 	PQExpBufferData sql;
 
@@ -284,6 +290,14 @@ reindex_one_database(const char *name, const char *dbname, const char *type,
 	conn = connectDatabase(dbname, host, port, username, prompt_password,
 						   progname, echo, false, false);
 
+	if (concurrently && PQserverVersion(conn) < 120000)
+	{
+		PQfinish(conn);
+		fprintf(stderr, _("%s: cannot use the \"%s\" option on server versions older than PostgreSQL %s\n"),
+				progname, "concurrently", "12");
+		exit(1);
+	}
+
 	initPQExpBuffer(&sql);
 
 	appendPQExpBufferStr(&sql, "REINDEX ");
@@ -293,6 +307,8 @@ reindex_one_database(const char *name, const char *dbname, const char *type,
 
 	appendPQExpBufferStr(&sql, type);
 	appendPQExpBufferChar(&sql, ' ');
+	if (concurrently)
+		appendPQExpBufferStr(&sql, "CONCURRENTLY ");
 	if (strcmp(type, "TABLE") == 0 ||
 		strcmp(type, "INDEX") == 0)
 		appendQualifiedRelation(&sql, name, conn, progname, echo);
@@ -328,7 +344,8 @@ static void
 reindex_all_databases(const char *maintenance_db,
 					  const char *host, const char *port,
 					  const char *username, enum trivalue prompt_password,
-					  const char *progname, bool echo, bool quiet, bool verbose)
+					  const char *progname, bool echo, bool quiet, bool verbose,
+					  bool concurrently)
 {
 	PGconn	   *conn;
 	PGresult   *result;
@@ -357,7 +374,7 @@ reindex_all_databases(const char *maintenance_db,
 
 		reindex_one_database(NULL, connstr.data, "DATABASE", host,
 							 port, username, prompt_password,
-							 progname, echo, verbose);
+							 progname, echo, verbose, concurrently);
 	}
 	termPQExpBuffer(&connstr);
 
@@ -367,7 +384,7 @@ reindex_all_databases(const char *maintenance_db,
 static void
 reindex_system_catalogs(const char *dbname, const char *host, const char *port,
 						const char *username, enum trivalue prompt_password,
-						const char *progname, bool echo, bool verbose)
+						const char *progname, bool echo, bool verbose, bool concurrently)
 {
 	PGconn	   *conn;
 	PQExpBufferData sql;
@@ -382,7 +399,11 @@ reindex_system_catalogs(const char *dbname, const char *host, const char *port,
 	if (verbose)
 		appendPQExpBuffer(&sql, " (VERBOSE)");
 
-	appendPQExpBuffer(&sql, " SYSTEM %s;", fmtId(PQdb(conn)));
+	appendPQExpBufferStr(&sql, " SYSTEM ");
+	if (concurrently)
+		appendPQExpBufferStr(&sql, "CONCURRENTLY ");
+	appendPQExpBufferStr(&sql, fmtId(PQdb(conn)));
+	appendPQExpBufferChar(&sql, ';');
 
 	if (!executeMaintenanceCommand(conn, sql.data, echo))
 	{
@@ -403,6 +424,7 @@ help(const char *progname)
 	printf(_("  %s [OPTION]... [DBNAME]\n"), progname);
 	printf(_("\nOptions:\n"));
 	printf(_("  -a, --all                 reindex all databases\n"));
+	printf(_("      --concurrently        reindex concurrently\n"));
 	printf(_("  -d, --dbname=DBNAME       database to reindex\n"));
 	printf(_("  -e, --echo                show the commands being sent to the server\n"));
 	printf(_("  -i, --index=INDEX         recreate specific index(es) only\n"));
diff --git a/src/bin/scripts/t/090_reindexdb.pl b/src/bin/scripts/t/090_reindexdb.pl
index e57a5e2bad..ef83be767a 100644
--- a/src/bin/scripts/t/090_reindexdb.pl
+++ b/src/bin/scripts/t/090_reindexdb.pl
@@ -3,7 +3,7 @@
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 23;
+use Test::More tests => 34;
 
 program_help_ok('reindexdb');
 program_version_ok('reindexdb');
@@ -43,6 +43,33 @@
 	qr/statement: REINDEX \(VERBOSE\) TABLE public\.test1;/,
 	'reindex with verbose output');
 
+# the same with --concurrently
+$node->issues_sql_like(
+	[ 'reindexdb', '--concurrently', 'postgres' ],
+	qr/statement: REINDEX DATABASE CONCURRENTLY postgres;/,
+	'SQL REINDEX CONCURRENTLY run');
+
+$node->issues_sql_like(
+	[ 'reindexdb', '--concurrently', '-t', 'test1', 'postgres' ],
+	qr/statement: REINDEX TABLE CONCURRENTLY public\.test1;/,
+	'reindex specific table concurrently');
+$node->issues_sql_like(
+	[ 'reindexdb', '--concurrently', '-i', 'test1x', 'postgres' ],
+	qr/statement: REINDEX INDEX CONCURRENTLY public\.test1x;/,
+	'reindex specific index concurrently');
+$node->issues_sql_like(
+	[ 'reindexdb', '--concurrently', '-S', 'public', 'postgres' ],
+	qr/statement: REINDEX SCHEMA CONCURRENTLY public;/,
+	'reindex specific schema concurrently');
+$node->command_fails(
+	[ 'reindexdb', '--concurrently', '-s', 'postgres' ],
+	'reindex system tables concurrently');
+$node->issues_sql_like(
+	[ 'reindexdb', '-v', '-t', 'test1', 'postgres' ],
+	qr/statement: REINDEX \(VERBOSE\) TABLE public\.test1;/,
+	'reindex with verbose output');
+
+# connection strings
 $node->command_ok([qw(reindexdb --echo --table=pg_am dbname=template1)],
 	'reindexdb table with connection string');
 $node->command_ok(
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index f537f01587..4f9dde9df9 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -136,6 +136,7 @@ typedef enum ObjectClass
 #define PERFORM_DELETION_QUIETLY			0x0004	/* suppress notices */
 #define PERFORM_DELETION_SKIP_ORIGINAL		0x0008	/* keep original obj */
 #define PERFORM_DELETION_SKIP_EXTENSIONS	0x0010	/* keep extensions */
+#define PERFORM_DELETION_CONCURRENT_LOCK	0x0020	/* normal drop with concurrent lock mode */
 
 
 /* in dependency.c */
@@ -198,6 +199,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+								 Oid newRefObjectId);
+
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
 
 extern bool sequenceIsOwned(Oid seqId, char deptype, Oid *tableId, int32 *colId);
@@ -208,6 +212,8 @@ extern Oid	get_constraint_index(Oid constraintId);
 
 extern Oid	get_index_constraint(Oid indexId);
 
+extern List *get_index_ref_constraints(Oid indexId);
+
 /* in pg_shdepend.c */
 
 extern void recordSharedDependencyOn(ObjectAddress *depender,
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index 29f7ed6237..70b35bdfcb 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -77,6 +77,20 @@ extern Oid index_create(Relation heapRelation,
 #define	INDEX_CONSTR_CREATE_UPDATE_INDEX	(1 << 3)
 #define	INDEX_CONSTR_CREATE_REMOVE_OLD_DEPS	(1 << 4)
 
+extern Oid index_concurrently_create_copy(Relation heapRelation,
+										  Oid oldIndexId,
+										  const char *newName);
+
+extern void index_concurrently_build(Oid heapRelationId,
+									 Oid indexRelationId);
+
+extern void index_concurrently_swap(Oid newIndexId,
+									Oid oldIndexId,
+									const char *oldName);
+
+extern void index_concurrently_set_dead(Oid heapId,
+										Oid indexId);
+
 extern ObjectAddress index_constraint_create(Relation heapRelation,
 						Oid indexRelationId,
 						Oid parentConstraintId,
@@ -87,7 +101,7 @@ extern ObjectAddress index_constraint_create(Relation heapRelation,
 						bool allow_system_table_mods,
 						bool is_internal);
 
-extern void index_drop(Oid indexId, bool concurrent);
+extern void index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode);
 
 extern IndexInfo *BuildIndexInfo(Relation index);
 
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 3bc2e8eb16..7f49625987 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -34,10 +34,10 @@ extern ObjectAddress DefineIndex(Oid relationId,
 			bool check_not_in_use,
 			bool skip_build,
 			bool quiet);
-extern void ReindexIndex(RangeVar *indexRelation, int options);
-extern Oid	ReindexTable(RangeVar *relation, int options);
+extern void ReindexIndex(RangeVar *indexRelation, int options, bool concurrent);
+extern Oid	ReindexTable(RangeVar *relation, int options, bool concurrent);
 extern void ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
-					  int options);
+					  int options, bool concurrent);
 extern char *makeObjectName(const char *name1, const char *name2,
 			   const char *label);
 extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index bdd2bd2fd9..e81c626913 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3305,6 +3305,7 @@ typedef struct ReindexStmt
 	RangeVar   *relation;		/* Table or index to reindex */
 	const char *name;			/* name of database to reindex */
 	int			options;		/* Reindex options flags */
+	bool		concurrent;		/* reindex concurrently? */
 } ReindexStmt;
 
 /* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000000..9e04169b2f
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data           
+
+aaaa           
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 70d47b3e68..f1ae50e5ba 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -42,6 +42,7 @@ test: multixact-no-forget
 test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
+test: reindex-concurrently
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000000..eb59fe0cba
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+	CREATE TABLE reind_con_tab(id serial primary key, data text);
+	INSERT INTO reind_con_tab(data) VALUES ('aa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+	INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+	DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index cc3dda4c70..6b77d25deb 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3250,6 +3250,101 @@ INFO:  index "reindex_verbose_pkey" was reindexed
 \set VERBOSITY default
 DROP TABLE reindex_verbose;
 --
+-- REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE:  table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex concurrently of exclusion constraint currently not supported
+CREATE TABLE concur_reindex_tab3 (c1 int, c2 int4range, EXCLUDE USING gist (c2 WITH &&));
+INSERT INTO concur_reindex_tab3 VALUES  (3, '[1,2]');
+REINDEX INDEX CONCURRENTLY  concur_reindex_tab3_c2_excl;  -- error
+ERROR:  concurrent index creation for exclusion constraints is not supported
+REINDEX TABLE CONCURRENTLY concur_reindex_tab3;  -- succeeds with warning
+WARNING:  cannot reindex concurrently exclusion constraint index "public.concur_reindex_tab3_c2_excl", skipping
+INSERT INTO concur_reindex_tab3 VALUES  (4, '[2,4]');
+ERROR:  conflicting key value violates exclusion constraint "concur_reindex_tab3_c2_excl"
+DETAIL:  Key (c2)=([2,5)) conflicts with existing key (c2)=([1,3)).
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check that comments are preserved
+CREATE TABLE testcomment (i int);
+CREATE INDEX testcomment_idx1 ON testcomment (i);
+COMMENT ON INDEX testcomment_idx1 IS 'test comment';
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+REINDEX TABLE testcomment;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+REINDEX TABLE CONCURRENTLY testcomment ;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+ obj_description 
+-----------------
+ test comment
+(1 row)
+
+DROP TABLE testcomment;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR:  concurrent index creation on system catalog tables is not supported
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR:  concurrent index creation on system catalog tables is not supported
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR:  concurrent reindex of system catalogs is not supported
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+WARNING:  concurrent reindex is not supported for catalog relations, skipping all
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+         Table "public.concur_reindex_tab"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ c1     | integer |           | not null | 
+ c2     | text    |           |          | 
+Indexes:
+    "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+    "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+    "concur_reindex_ind2" btree (c2)
+    "concur_reindex_ind4" btree (c1, c1, c2)
+Referenced by:
+    TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2, concur_reindex_tab3;
+--
 -- REINDEX SCHEMA
 --
 REINDEX SCHEMA schema_to_reindex; -- failure, schema does not exist
@@ -3308,6 +3403,8 @@ BEGIN;
 REINDEX SCHEMA schema_to_reindex; -- failure, cannot run in a transaction
 ERROR:  REINDEX SCHEMA cannot run inside a transaction block
 END;
+-- concurrently
+REINDEX SCHEMA CONCURRENTLY schema_to_reindex;
 -- Failure for unauthorized user
 CREATE ROLE regress_reindexuser NOLOGIN;
 SET SESSION ROLE regress_reindexuser;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 15c0f1f5d1..9ff2dc68ff 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1172,6 +1172,65 @@ CREATE TABLE reindex_verbose(id integer primary key);
 \set VERBOSITY default
 DROP TABLE reindex_verbose;
 
+--
+-- REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES  (1, 'a');
+INSERT INTO concur_reindex_tab VALUES  (2, 'a');
+-- Reindex concurrently of exclusion constraint currently not supported
+CREATE TABLE concur_reindex_tab3 (c1 int, c2 int4range, EXCLUDE USING gist (c2 WITH &&));
+INSERT INTO concur_reindex_tab3 VALUES  (3, '[1,2]');
+REINDEX INDEX CONCURRENTLY  concur_reindex_tab3_c2_excl;  -- error
+REINDEX TABLE CONCURRENTLY concur_reindex_tab3;  -- succeeds with warning
+INSERT INTO concur_reindex_tab3 VALUES  (4, '[2,4]');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check that comments are preserved
+CREATE TABLE testcomment (i int);
+CREATE INDEX testcomment_idx1 ON testcomment (i);
+COMMENT ON INDEX testcomment_idx1 IS 'test comment';
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+REINDEX TABLE testcomment;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+REINDEX TABLE CONCURRENTLY testcomment ;
+SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
+DROP TABLE testcomment;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+-- Warns about catalog relations
+REINDEX SCHEMA CONCURRENTLY pg_catalog;
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2, concur_reindex_tab3;
+
 --
 -- REINDEX SCHEMA
 --
@@ -1214,6 +1273,9 @@ CREATE TABLE reindex_after AS SELECT oid, relname, relfilenode, relkind
 REINDEX SCHEMA schema_to_reindex; -- failure, cannot run in a transaction
 END;
 
+-- concurrently
+REINDEX SCHEMA CONCURRENTLY schema_to_reindex;
+
 -- Failure for unauthorized user
 CREATE ROLE regress_reindexuser NOLOGIN;
 SET SESSION ROLE regress_reindexuser;

base-commit: 148cf5f462e53f374a2085b2fa8dcde944539b03
-- 
2.21.0

#139Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#138)
Re: REINDEX CONCURRENTLY 2.0

On Mon, Mar 25, 2019 at 04:23:34PM +0100, Peter Eisentraut wrote:

Let's do it. :-)

I am pretty sure that this has been said at least once since 2012.

I've gone over this patch a few more times. I've read all the
discussion since 2012 again and made sure all the issues were addressed.
I made particularly sure that during the refactoring nothing in CREATE
INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY was inadvertently
changed. I checked all the steps again. I'm happy with it.

From my side, I would be happy to look at this patch. Unfortunately I
won't have the time to look at it this week, I think :(

But if you are happy with it that's fine by me, at least I can fix
anything which is broken :)
--
Michael

#140Sergei Kornilov
sk@zsrv.org
In reply to: Peter Eisentraut (#138)
Re: REINDEX CONCURRENTLY 2.0

Hi

Unfortunately the patch does not apply due to recent commits. Any chance this can be fixed (and even committed in pg12)?

 And a few questions:
 - reindexdb has the concurrently flag logic even in reindex_system_catalogs, but "reindex concurrently" cannot reindex system catalogs. Is this expected?

If support is ever added, then reindexdb supports it automatically. It
seems simpler to not have to repeat the same checks in two places.

ok, reasonable for me

 - psql/tab-complete.c vs old releases? It seems we should suggest the CONCURRENTLY keyword only for releases with CONCURRENTLY support.

It seems we don't do version checks for tab completion of keywords.

Hmm, yes, I found only a few checks, for the "CREATE TRIGGER" syntax around the "EXECUTE FUNCTION"/"EXECUTE PROCEDURE" difference in 11

regards, Sergei

#141Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Sergei Kornilov (#140)
Re: REINDEX CONCURRENTLY 2.0

On 2019-03-28 09:07, Sergei Kornilov wrote:

Unfortunately patch does not apply due recent commits. Any chance this can be fixed (and even committed in pg12)?

Committed :)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#142Sergei Kornilov
sk@zsrv.org
In reply to: Peter Eisentraut (#141)
Re: REINDEX CONCURRENTLY 2.0

 Unfortunately patch does not apply due recent commits. Any chance this can be fixed (and even committed in pg12)?

Committed :)

wow! Congratulations! This was a very long way

my favorite pg12 feature

regards, Sergei

#143Michael Paquier
michael@paquier.xyz
In reply to: Sergei Kornilov (#142)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Mar 29, 2019 at 10:39:23AM +0300, Sergei Kornilov wrote:

wow! Congratulations! This was very long way

my favorite pg12 feature

So this has been committed, nice! Thanks a lot to all for keeping
alive this patch over the ages, with particular thanks to Andreas and
Peter.
--
Michael

#144Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#143)
Re: REINDEX CONCURRENTLY 2.0

On 2019-03-29 09:04, Michael Paquier wrote:

On Fri, Mar 29, 2019 at 10:39:23AM +0300, Sergei Kornilov wrote:

wow! Congratulations! This was very long way

my favorite pg12 feature

So this has been committed, nice! Thanks a lot to all for keeping
alive this patch over the ages, with particular thanks to Andreas and
Peter.

So, we're getting buildfarm failures, only with clang. I can reproduce
those (with clang).

It seems the issue is somewhere near indexcmds.c "Phase 6 of REINDEX
CONCURRENTLY". More eyes welcome.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#145Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Eisentraut (#144)
Re: REINDEX CONCURRENTLY 2.0

On 2019-03-29 09:13, Peter Eisentraut wrote:

On 2019-03-29 09:04, Michael Paquier wrote:

On Fri, Mar 29, 2019 at 10:39:23AM +0300, Sergei Kornilov wrote:

wow! Congratulations! This was very long way

my favorite pg12 feature

So this has been committed, nice! Thanks a lot to all for keeping
alive this patch over the ages, with particular thanks to Andreas and
Peter.

So, we're getting buildfarm failures, only with clang. I can reproduce
those (with clang).

It seems the issue is somewhere near indexcmds.c "Phase 6 of REINDEX
CONCURRENTLY". More eyes welcome.

I think I found a fix.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#146Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#144)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Mar 29, 2019 at 09:13:35AM +0100, Peter Eisentraut wrote:

So, we're getting buildfarm failures, only with clang. I can reproduce
those (with clang).

Indeed, I can reproduce the failures using -O2 with clang. I am
wondering whether we are missing a volatile qualifier somewhere and
whether some code reordering is the cause here.

It seems the issue is somewhere near indexcmds.c "Phase 6 of REINDEX
CONCURRENTLY". More eyes welcome.

Here is a short reproducer:
create materialized view aam as select 1 AS a;
create index aai on aam(a);
reindex table CONCURRENTLY aam;
--
Michael

#147Shinoda, Noriyoshi (PN Japan A&PS Delivery)
In reply to: Michael Paquier (#146)
RE: REINDEX CONCURRENTLY 2.0

Hi hackers,

I tried this great feature for partition index.
The first time the REINDEX TABLE CONCURRENTLY statement is executed to the partition, then an error occurs.
The second run succeeds but leaves an index with an INVALID status.
I think this is not the desired behaviour.

# TEST
postgres=> CREATE TABLE part1(c1 INT) PARTITION BY RANGE(c1);
CREATE TABLE
postgres=> CREATE TABLE part1v1 PARTITION OF part1 FOR VALUES FROM (0) TO (100);
CREATE TABLE
postgres=> CREATE INDEX idx1_part1 ON part1(c1);
CREATE INDEX
postgres=> REINDEX TABLE CONCURRENTLY part1v1;
ERROR: cannot drop index part1v1_c1_idx_ccold because index idx1_part1 requires it
HINT: You can drop index idx1_part1 instead.
postgres=> \d+ part1v1
Table "public.part1v1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
c1 | integer | | | | plain | |
Partition of: part1 FOR VALUES FROM (0) TO (100)
Partition constraint: ((c1 IS NOT NULL) AND (c1 >= 0) AND (c1 < 100))
Indexes:
"part1v1_c1_idx" btree (c1)
"part1v1_c1_idx_ccold" btree (c1) INVALID
Access method: heap

postgres=> REINDEX TABLE CONCURRENTLY part1v1;
REINDEX
postgres=> \d+ part1v1
Table "public.part1v1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
c1 | integer | | | | plain | |
Partition of: part1 FOR VALUES FROM (0) TO (100)
Partition constraint: ((c1 IS NOT NULL) AND (c1 >= 0) AND (c1 < 100))
Indexes:
"part1v1_c1_idx" btree (c1)
"part1v1_c1_idx_ccold" btree (c1) INVALID
Access method: heap

Regards,
Noriyoshi Shinoda

-----Original Message-----
From: Michael Paquier [mailto:michael@paquier.xyz]
Sent: Friday, March 29, 2019 6:21 PM
To: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Cc: Sergei Kornilov <sk@zsrv.org>; pgsql-hackers@lists.postgresql.org
Subject: Re: REINDEX CONCURRENTLY 2.0

On Fri, Mar 29, 2019 at 09:13:35AM +0100, Peter Eisentraut wrote:

So, we're getting buildfarm failures, only with clang. I can
reproduce those (with clang).

Indeed, I can reproduce the failures using -O2 with clang. I am wondering if we are not missing a volatile flag somewhere and that some code reordering is at cause here.

It seems the issue is somewhere near indexcmds.c "Phase 6 of REINDEX
CONCURRENTLY". More eyes welcome.

Here is a short reproducer:
create materialized view aam as select 1 AS a;
create index aai on aam(a);
reindex table CONCURRENTLY aam;
--
Michael

#148Robert Treat
rob@xzilla.net
In reply to: Peter Eisentraut (#141)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Mar 29, 2019 at 3:28 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2019-03-28 09:07, Sergei Kornilov wrote:

Unfortunately patch does not apply due recent commits. Any chance this can be fixed (and even committed in pg12)?

Committed :)

Given this has been committed I've probably missed the window, but
philosophically speaking, is there any reason not to make the
"concurrently" behavior the default behavior, and require a keyword
for the more heavy-weight old behavior? In most production scenarios
you probably want to avoid exclusive locking, and in the cases where
that isn't an issue, 'concurrently' isn't so much slower that most
users would object to it. I would perhaps give a nod to historical
syntax concerns, but this would more closely align with the behavior
in vacuum vs vacuum full, and we've done behavior modifying changes
such as the recent WITH ... MATERIALIZED change. Thoughts?

Robert Treat
https://xzilla.net

#149Andres Freund
andres@anarazel.de
In reply to: Robert Treat (#148)
Re: REINDEX CONCURRENTLY 2.0

Hi,

On 2019-03-29 11:47:10 -0400, Robert Treat wrote:

On Fri, Mar 29, 2019 at 3:28 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2019-03-28 09:07, Sergei Kornilov wrote:

Unfortunately patch does not apply due recent commits. Any chance this can be fixed (and even committed in pg12)?

Committed :)

Given this has been committed I've probably missed the window, but
philosophically speaking, is there any reason not to make the
"concurrently" behavior the default behavior, and require a keyword
for the more heavy-weight old behavior?

Yes, it increases the total runtime quite considerably. And it adds new
failure modes with partially built invalid indexes hanging around that
need to be dropped manually.

In most production scenarios
you probably want to avoid exclusive locking, and in the cases where
that isn't an issue, 'concurrently' isn't that much slower that most
users would object to it.

It does at *least* twice as much IO.

Greetings,

Andres Freund
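
For what it's worth, such leftover invalid indexes can be spotted through pg_index and then removed by hand; a minimal sketch (the dropped index name is illustrative):

```sql
-- List indexes left invalid by a failed CREATE/REINDEX ... CONCURRENTLY
SELECT indexrelid::regclass AS index_name, indrelid::regclass AS table_name
FROM pg_index
WHERE NOT indisvalid;

-- Then drop each leftover manually; CONCURRENTLY avoids taking an
-- exclusive lock on the table (index name is illustrative)
DROP INDEX CONCURRENTLY concur_reindex_ind2_ccnew;
```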

#150Bossart, Nathan
bossartn@amazon.com
In reply to: Andres Freund (#149)
Re: REINDEX CONCURRENTLY 2.0

I noticed a very small typo in the documentation for this feature.

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index ccabb330cb..e45bf86c8d 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -349,7 +349,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
      <listitem>
       <para>
        The old indexes are dropped.  The <literal>SHARE UPDATE
-       EXCLUSIVE</literal> session locks for the indexes and the table ar
+       EXCLUSIVE</literal> session locks for the indexes and the table are
        released.
       </para>
      </listitem>

Nathan

#151Justin Pryzby
pryzby@telsasoft.com
In reply to: Bossart, Nathan (#150)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Mar 29, 2019 at 03:53:05PM +0000, Bossart, Nathan wrote:

I noticed a very small typo in the documentation for this feature.

I submit a bunch more changes for consideration, attached.

Attachments:

v1-0001-Doc-review-for-REINDEX-CONCURRENTLY.patchtext/x-diff; charset=us-asciiDownload
From dafdb15fb3e7c69de82a2206c9bf07588b5665ce Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Fri, 29 Mar 2019 10:59:59 -0500
Subject: [PATCH v1] Doc review for REINDEX CONCURRENTLY

---
 doc/src/sgml/ref/reindex.sgml | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index ccabb33..e05a76c 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -300,11 +300,11 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
     <orderedlist>
      <listitem>
       <para>
-       A new temporary index definition is added into the catalog
+       A new temporary index definition is added to the catalog
        <literal>pg_index</literal>.  This definition will be used to replace
        the old index.  A <literal>SHARE UPDATE EXCLUSIVE</literal> lock at
-       session level is taken on the indexes being reindexed as well as its
-       associated table to prevent any schema modification while processing.
+       session level is taken on the indexes being reindexed as well as their
+       associated tables to prevent any schema modification while processing.
       </para>
      </listitem>
 
@@ -312,7 +312,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
       <para>
        A first pass to build the index is done for each new index.  Once the
        index is built, its flag <literal>pg_index.indisready</literal> is
-       switched to <quote>true</quote> to make ready for inserts, making it
+       switched to <quote>true</quote> to make it ready for inserts, making it
        visible to other sessions once the transaction that performed the build
        is finished.  This step is done in a separate transaction for each
        index.
@@ -322,7 +322,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
      <listitem>
       <para>
        Then a second pass is performed to add tuples that were added while the
-       first pass build was running.  This step is also done in a separate
+       first pass was running.  This step is also done in a separate
        transaction for each index.
       </para>
      </listitem>
@@ -331,10 +331,10 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
       <para>
        All the constraints that refer to the index are changed to refer to the
        new index definition, and the names of the indexes are changed.  At
-       this point <literal>pg_index.indisvalid</literal> is switched to
+       this point, <literal>pg_index.indisvalid</literal> is switched to
        <quote>true</quote> for the new index and to <quote>false</quote> for
-       the old, and a cache invalidation is done so as all the sessions that
-       referenced the old index are invalidated.
+       the old, and a cache invalidation is done causing all sessions that
+       referenced the old index to be invalidated.
       </para>
      </listitem>
 
@@ -349,7 +349,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
      <listitem>
       <para>
        The old indexes are dropped.  The <literal>SHARE UPDATE
-       EXCLUSIVE</literal> session locks for the indexes and the table ar
+       EXCLUSIVE</literal> session locks for the indexes and the table are
        released.
       </para>
      </listitem>
@@ -359,8 +359,8 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
    <para>
     If a problem arises while rebuilding the indexes, such as a
     uniqueness violation in a unique index, the <command>REINDEX</command>
-    command will fail but leave behind an <quote>invalid</quote> new index on top
-    of the existing one. This index will be ignored for querying purposes
+    command will fail but leave behind an <quote>invalid</quote> new index in addition to
+    the pre-existing one. This index will be ignored for querying purposes
     because it might be incomplete; however it will still consume update
     overhead. The <application>psql</application> <command>\d</command> command will report
     such an index as <literal>INVALID</literal>:
@@ -387,7 +387,7 @@ Indexes:
 
    <para>
     Regular index builds permit other regular index builds on the same table
-    to occur in parallel, but only one concurrent index build can occur on a
+    to occur simultaneously, but only one concurrent index build can occur on a
     table at a time. In both cases, no other types of schema modification on
     the table are allowed meanwhile.  Another difference is that a regular
     <command>REINDEX TABLE</command> or <command>REINDEX INDEX</command>
@@ -406,7 +406,7 @@ Indexes:
     concurrently.  If such an index is named directly in this command, an
     error is raised.  If a table or database with exclusion constraint indexes
     is reindexed concurrently, those indexes will be skipped.  (It is possible
-    to reindex such indexes without the concurrently option.)
+    to reindex such indexes without the <command>CONCURRENTLY</command> option.)
    </para>
   </refsect2>
  </refsect1>
-- 
2.1.4

#152Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Bossart, Nathan (#150)
Re: REINDEX CONCURRENTLY 2.0

On 2019-03-29 16:53, Bossart, Nathan wrote:

I noticed a very small typo in the documentation for this feature.

fixed

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#153Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Justin Pryzby (#151)
Re: REINDEX CONCURRENTLY 2.0

On 2019-03-29 17:01, Justin Pryzby wrote:

On Fri, Mar 29, 2019 at 03:53:05PM +0000, Bossart, Nathan wrote:

I noticed a very small typo in the documentation for this feature.

I submit a bunch more changes for consideration, attached.

fixed, thanks

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#154Michael Paquier
michael@paquier.xyz
In reply to: Shinoda, Noriyoshi (PN Japan A&PS Delivery) (#147)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Mar 29, 2019 at 03:10:23PM +0000, Shinoda, Noriyoshi (PN Japan A&PS Delivery) wrote:

I tried this great feature for partition index.
The first time the REINDEX TABLE CONCURRENTLY statement is executed
to the partition, then an error occurs.

Yes, that's a problem. I am adding an open item.

The second run succeeds but leaves an index with an INVALID status.
I think this is not the desired behaviour.

This one is partially expected actually. Invalid indexes are ignored
when processing, as including them would cause at least a table-level
reindex to double its amount of work. However, it is not possible to
detach an index from a partition tree, hence the invalid index cannot
be dropped directly either. It seems to me that the root of the
problem is that the partition indexes created as copycats of the
original ones should never be in a state where they are attached to
the index tree.
--
Michael

#155Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#149)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Mar 29, 2019 at 08:48:03AM -0700, Andres Freund wrote:

Yes, it increases the total runtime quite considerably. And it adds new
failure modes with partially built invalid indexes hanging around that
need to be dropped manually.

On top of that CONCURRENTLY needs multiple transactions per index to
perform its different phases: build, validation, swap and cleanup.
So it cannot run in a transaction block. Having a separate option
makes the most sense.

It does at *least* twice as much IO.

Yeah, I can guarantee you that it is much slower, at the advantage of
being lock-free.
--
Michael
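
As a reminder, the multi-transaction requirement is why the command refuses to run inside a transaction block, something the regression tests added by this patch exercise as well:

```sql
BEGIN;
REINDEX TABLE CONCURRENTLY concur_reindex_tab;
-- ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
COMMIT;
```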

#156Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Shinoda, Noriyoshi (PN Japan A&PS Delivery) (#147)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On 2019-03-29 16:10, Shinoda, Noriyoshi (PN Japan A&PS Delivery) wrote:

postgres=> CREATE TABLE part1(c1 INT) PARTITION BY RANGE(c1);
CREATE TABLE
postgres=> CREATE TABLE part1v1 PARTITION OF part1 FOR VALUES FROM (0) TO (100);
CREATE TABLE
postgres=> CREATE INDEX idx1_part1 ON part1(c1);
CREATE INDEX
postgres=> REINDEX TABLE CONCURRENTLY part1v1;
ERROR: cannot drop index part1v1_c1_idx_ccold because index idx1_part1 requires it
HINT: You can drop index idx1_part1 instead.

The attached patch fixes this. The issue was that we didn't move all
dependencies from the index (only in the other direction). Maybe that
was sufficient when the patch was originally written, before partitioned
indexes.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-Fix-REINDEX-CONCURRENTLY-of-partitions.patchtext/plain; charset=UTF-8; name=0001-Fix-REINDEX-CONCURRENTLY-of-partitions.patch; x-mac-creator=0; x-mac-type=0Download
From 66a5ce802ede0bd28a6f4881e063ac92a35c0c8c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Sat, 30 Mar 2019 09:37:19 +0100
Subject: [PATCH] Fix REINDEX CONCURRENTLY of partitions

When swapping the old and new index, we not only need to change
dependencies referencing the index, but also dependencies of the
index referencing something else.  The previous code did this only
specifically for a constraint, but we also need to do this for
partitioned indexes.  So instead write a generic function that does it
for all dependencies.
---
 src/backend/catalog/index.c                | 24 +---------
 src/backend/catalog/pg_depend.c            | 56 ++++++++++++++++++++++
 src/include/catalog/dependency.h           |  3 ++
 src/test/regress/expected/create_index.out |  6 +++
 src/test/regress/sql/create_index.sql      |  6 +++
 5 files changed, 73 insertions(+), 22 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 0d9d405c54..77e2088034 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1555,29 +1555,9 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
 	}
 
 	/*
-	 * Move all dependencies on the old index to the new one
+	 * Move all dependencies of and on the old index to the new one
 	 */
-
-	if (OidIsValid(indexConstraintOid))
-	{
-		ObjectAddress myself,
-					referenced;
-
-		/* Change to having the new index depend on the constraint */
-		deleteDependencyRecordsForClass(RelationRelationId, oldIndexId,
-										ConstraintRelationId, DEPENDENCY_INTERNAL);
-
-		myself.classId = RelationRelationId;
-		myself.objectId = newIndexId;
-		myself.objectSubId = 0;
-
-		referenced.classId = ConstraintRelationId;
-		referenced.objectId = indexConstraintOid;
-		referenced.objectSubId = 0;
-
-		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
-	}
-
+	changeDependenciesOf(RelationRelationId, oldIndexId, newIndexId);
 	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
 
 	/*
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index d63bf5e56d..f7caedcc02 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -395,6 +395,62 @@ changeDependencyFor(Oid classId, Oid objectId,
 	return count;
 }
 
+/*
+ * Adjust all dependency records to come from a different object of the same type
+ *
+ * classId/oldObjectId specify the old referencing object.
+ * newObjectId is the new referencing object (must be of class classId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependenciesOf(Oid classId, Oid oldObjectId,
+					 Oid newObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	depRel = table_open(DependRelationId, RowExclusiveLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_classid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(classId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_objid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldObjectId));
+
+	scan = systable_beginscan(depRel, DependDependerIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		/* make a modifiable copy */
+		tup = heap_copytuple(tup);
+		depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		depform->objid = newObjectId;
+
+		CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+		heap_freetuple(tup);
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	table_close(depRel, RowExclusiveLock);
+
+	return count;
+}
+
 /*
  * Adjust all dependency records to point to a different object of the same type
  *
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 4f9dde9df9..57545b70d8 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -199,6 +199,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependenciesOf(Oid classId, Oid oldObjectId,
+								 Oid newObjectId);
+
 extern long changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
 								 Oid newRefObjectId);
 
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 6b77d25deb..61c7a3a67f 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3312,6 +3312,12 @@ SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
 (1 row)
 
 DROP TABLE testcomment;
+-- partitions
+CREATE TABLE concur_reindex_part1 (c1 int) PARTITION BY RANGE (c1);
+CREATE TABLE concur_reindex_part1v1 PARTITION OF concur_reindex_part1 FOR VALUES FROM (0) TO (100);
+CREATE INDEX concur_reindex_idx1_part1 ON concur_reindex_part1 (c1);
+REINDEX TABLE CONCURRENTLY concur_reindex_part1v1;
+DROP TABLE concur_reindex_part1;
 -- Check errors
 -- Cannot run inside a transaction block
 BEGIN;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 9ff2dc68ff..559a6e5cb8 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1214,6 +1214,12 @@ CREATE INDEX testcomment_idx1 ON testcomment (i);
 REINDEX TABLE CONCURRENTLY testcomment ;
 SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
 DROP TABLE testcomment;
+-- partitions
+CREATE TABLE concur_reindex_part1 (c1 int) PARTITION BY RANGE (c1);
+CREATE TABLE concur_reindex_part1v1 PARTITION OF concur_reindex_part1 FOR VALUES FROM (0) TO (100);
+CREATE INDEX concur_reindex_idx1_part1 ON concur_reindex_part1 (c1);
+REINDEX TABLE CONCURRENTLY concur_reindex_part1v1;
+DROP TABLE concur_reindex_part1;
 
 -- Check errors
 -- Cannot run inside a transaction block
-- 
2.21.0

#158Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#156)
Re: REINDEX CONCURRENTLY 2.0

On Sat, Mar 30, 2019 at 11:56:27AM +0100, Peter Eisentraut wrote:

The attached patch fixes this. The issue was that we didn't move all
dependencies from the index (only in the other direction). Maybe that
was sufficient when the patch was originally written, before partitioned
indexes.

Hm. I don't think that it is quite right either because the new index
is missing from the partition tree after the reindex. Taking the
example from your patch I see that:
=# CREATE TABLE concur_reindex_part1 (c1 int) PARTITION BY RANGE (c1);
CREATE TABLE
=# CREATE TABLE concur_reindex_part1v1 PARTITION OF
concur_reindex_part1 FOR VALUES FROM (0) TO (100);
CREATE TABLE
=# SELECT relid, level FROM
pg_partition_tree('concur_reindex_idx1_part1');
relid | level
-------------------------------+-------
concur_reindex_idx1_part1 | 0
concur_reindex_part1v1_c1_idx | 1
(2 rows)
=# CREATE INDEX concur_reindex_idx1_part1 ON
concur_reindex_part1 (c1);
CREATE INDEX
=# REINDEX TABLE CONCURRENTLY concur_reindex_part1v1;
REINDEX
SELECT relid, level FROM
pg_partition_tree('concur_reindex_idx1_part1');
relid | level
---------------------------+-------
concur_reindex_idx1_part1 | 0
(1 row)

And I would have expected concur_reindex_part1v1_c1_idx to still be
part of the partition tree. I think that the issue is in
index_concurrently_create_copy() where we create the new index with
index_create() without setting parentIndexRelid, causing the
dependency to be lost. This parameter ought to be set to the OID of
the parent index so I think that we need to look at the ancestors of
the index if relispartition is set, and use get_partition_ancestors()
for that purpose.
--
Michael

#159Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#158)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On Mon, Apr 01, 2019 at 03:43:43PM +0900, Michael Paquier wrote:

And I would have expected concur_reindex_part1v1_c1_idx to still be
part of the partition tree. I think that the issue is in
index_concurrently_create_copy() where we create the new index with
index_create() without setting parentIndexRelid, causing the
dependency to be lost. This parameter ought to be set to the OID of
the parent index so I think that we need to look at the ancestors of
the index if relispartition is set, and use get_partition_ancestors()
for that purpose.

And here is the patch to address this issue. It happens that a bit
more than the dependency switch was lacking here:
- At swap time, we need to have the new index definition track
relispartition from the old index.
- Again at swap time, the inheritance link needs to be updated between
the old/new index and its parent when reindexing a partition index.

Tracking the OID of the parent via index_concurrently_create_copy() is
not a good idea, as it would make it impossible to drop the invalid
indexes left behind if the REINDEX CONCURRENTLY failed in the middle
(I added some manual elog(ERROR) calls to test that). I have added a
comment before making the index duplicate. I have also expanded the
regression tests so that we have more coverage for all that, finishing
with the attached patch, which keeps partition trees consistent across
the operations. Thoughts?
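The two swap-time adjustments can be pictured with a toy sketch
(illustrative Python, not server code; the catalog names are borrowed
from the patch, the data structures are made up): copy relispartition
to the new index, then replace the pg_inherits link.

```python
# Toy sketch of the swap-time fix for partition indexes.
# Illustrative only; real catalogs are not Python dicts and sets.

def swap_partition_index(pg_class, pg_inherits, old_idx, new_idx, parent):
    # Copy the partitioning flag so the new index is seen as a partition.
    pg_class[new_idx]["relispartition"] = pg_class[old_idx]["relispartition"]
    # Remove the old inheritance link, then add one for the new index.
    pg_inherits.discard((old_idx, parent))
    pg_inherits.add((new_idx, parent))

pg_class = {"old_idx": {"relispartition": True},
            "new_idx": {"relispartition": False}}
pg_inherits = {("old_idx", "parent_idx")}
swap_partition_index(pg_class, pg_inherits, "old_idx", "new_idx", "parent_idx")
```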
--
Michael

Attachments:

reindex-conc-partition.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 9b1d546791..9c6c305c1e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -39,6 +39,7 @@
 #include "catalog/heap.h"
 #include "catalog/index.h"
 #include "catalog/objectaccess.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_constraint.h"
@@ -1263,7 +1264,12 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, const char
 		indexColNames = lappend(indexColNames, NameStr(att->attname));
 	}
 
-	/* Now create the new index */
+	/*
+	 * Now create the new index.  Note that for partition indexes the
+	 * partitioning dependency is switched at swap time, together with
+	 * the other dependencies, to ensure the consistency of the whole
+	 * operation, so parentIndexRelid should never be set here.
+	 */
 	newIndexId = index_create(heapRelation,
 							  newName,
 							  InvalidOid,	/* indexRelationId */
@@ -1395,6 +1401,9 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
 	namestrcpy(&newClassForm->relname, NameStr(oldClassForm->relname));
 	namestrcpy(&oldClassForm->relname, oldName);
 
+	/* Copy partitioning flag to track inheritance properly */
+	newClassForm->relispartition = oldClassForm->relispartition;
+
 	CatalogTupleUpdate(pg_class, &oldClassTuple->t_self, oldClassTuple);
 	CatalogTupleUpdate(pg_class, &newClassTuple->t_self, newClassTuple);
 
@@ -1555,31 +1564,28 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
 	}
 
 	/*
-	 * Move all dependencies on the old index to the new one
+	 * Move all dependencies of and on the old index to the new one.
 	 */
-
-	if (OidIsValid(indexConstraintOid))
-	{
-		ObjectAddress myself,
-					referenced;
-
-		/* Change to having the new index depend on the constraint */
-		deleteDependencyRecordsForClass(RelationRelationId, oldIndexId,
-										ConstraintRelationId, DEPENDENCY_INTERNAL);
-
-		myself.classId = RelationRelationId;
-		myself.objectId = newIndexId;
-		myself.objectSubId = 0;
-
-		referenced.classId = ConstraintRelationId;
-		referenced.objectId = indexConstraintOid;
-		referenced.objectSubId = 0;
-
-		recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL);
-	}
-
+	changeDependenciesOf(RelationRelationId, oldIndexId, newIndexId);
 	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
 
+	/*
+	 * Inheritance needs to be swapped for partition indexes.
+	 */
+	if (get_rel_relispartition(oldIndexId))
+	{
+		List   *ancestors = get_partition_ancestors(oldIndexId);
+		Oid		parentIndexRelid = linitial_oid(ancestors);
+
+		/* Remove the old inheritance link first */
+		DeleteInheritsTuple(oldIndexId, parentIndexRelid);
+
+		/* Then add the new one */
+		StoreSingleInheritance(newIndexId, parentIndexRelid, 1);
+
+		list_free(ancestors);
+	}
+
 	/*
 	 * Copy over statistics from old to new index
 	 */
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index d63bf5e56d..f7caedcc02 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -395,6 +395,62 @@ changeDependencyFor(Oid classId, Oid objectId,
 	return count;
 }
 
+/*
+ * Adjust all dependency records to come from a different object of the same type
+ *
+ * classId/oldObjectId specify the old referencing object.
+ * newObjectId is the new referencing object (must be of class classId).
+ *
+ * Returns the number of records updated.
+ */
+long
+changeDependenciesOf(Oid classId, Oid oldObjectId,
+					 Oid newObjectId)
+{
+	long		count = 0;
+	Relation	depRel;
+	ScanKeyData key[2];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	depRel = table_open(DependRelationId, RowExclusiveLock);
+
+	ScanKeyInit(&key[0],
+				Anum_pg_depend_classid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(classId));
+	ScanKeyInit(&key[1],
+				Anum_pg_depend_objid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(oldObjectId));
+
+	scan = systable_beginscan(depRel, DependDependerIndexId, true,
+							  NULL, 2, key);
+
+	while (HeapTupleIsValid((tup = systable_getnext(scan))))
+	{
+		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		/* make a modifiable copy */
+		tup = heap_copytuple(tup);
+		depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		depform->objid = newObjectId;
+
+		CatalogTupleUpdate(depRel, &tup->t_self, tup);
+
+		heap_freetuple(tup);
+
+		count++;
+	}
+
+	systable_endscan(scan);
+
+	table_close(depRel, RowExclusiveLock);
+
+	return count;
+}
+
 /*
  * Adjust all dependency records to point to a different object of the same type
  *
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 4f9dde9df9..57545b70d8 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -199,6 +199,9 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 					Oid refClassId, Oid oldRefObjectId,
 					Oid newRefObjectId);
 
+extern long changeDependenciesOf(Oid classId, Oid oldObjectId,
+								 Oid newObjectId);
+
 extern long changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
 								 Oid newRefObjectId);
 
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 388d709875..9b2a4e0324 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -3309,6 +3309,91 @@ SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
 (1 row)
 
 DROP TABLE testcomment;
+-- Partitions
+-- Create partition table layer.
+CREATE TABLE concur_reindex_part (c1 int, c2 int) PARTITION BY RANGE (c1);
+CREATE TABLE concur_reindex_part_0 PARTITION OF concur_reindex_part
+  FOR VALUES FROM (0) TO (10) PARTITION BY list (c2);
+CREATE TABLE concur_reindex_part_0_1 PARTITION OF concur_reindex_part_0
+  FOR VALUES IN (1);
+CREATE TABLE concur_reindex_part_0_2 PARTITION OF concur_reindex_part_0
+  FOR VALUES IN (2);
+-- This partitioned table should remain with no partitions.
+CREATE TABLE concur_reindex_part_10 PARTITION OF concur_reindex_part
+  FOR VALUES FROM (10) TO (20) PARTITION BY list (c2);
+-- Create partition index layer.
+CREATE INDEX concur_reindex_part_index ON ONLY concur_reindex_part (c1);
+CREATE INDEX concur_reindex_part_index_0 ON ONLY concur_reindex_part_0 (c1);
+ALTER INDEX concur_reindex_part_index ATTACH PARTITION concur_reindex_part_index_0;
+-- This partitioned index should remain with no partitions.
+CREATE INDEX concur_reindex_part_index_10 ON ONLY concur_reindex_part_10 (c1);
+ALTER INDEX concur_reindex_part_index ATTACH PARTITION concur_reindex_part_index_10;
+CREATE INDEX concur_reindex_part_index_0_1 ON ONLY concur_reindex_part_0_1 (c1);
+ALTER INDEX concur_reindex_part_index_0 ATTACH PARTITION concur_reindex_part_index_0_1;
+CREATE INDEX concur_reindex_part_index_0_2 ON ONLY concur_reindex_part_0_2 (c1);
+ALTER INDEX concur_reindex_part_index_0 ATTACH PARTITION concur_reindex_part_index_0_2;
+SELECT relid, parentrelid, level FROM pg_partition_tree('concur_reindex_part_index')
+  ORDER BY relid, level;
+             relid             |         parentrelid         | level 
+-------------------------------+-----------------------------+-------
+ concur_reindex_part_index     |                             |     0
+ concur_reindex_part_index_0   | concur_reindex_part_index   |     1
+ concur_reindex_part_index_10  | concur_reindex_part_index   |     1
+ concur_reindex_part_index_0_1 | concur_reindex_part_index_0 |     2
+ concur_reindex_part_index_0_2 | concur_reindex_part_index_0 |     2
+(5 rows)
+
+-- REINDEX fails for partitioned indexes
+REINDEX INDEX concur_reindex_part_index_10;
+ERROR:  REINDEX is not yet implemented for partitioned indexes
+REINDEX INDEX CONCURRENTLY concur_reindex_part_index_10;
+ERROR:  REINDEX is not yet implemented for partitioned indexes
+-- REINDEX is a no-op for partitioned tables
+REINDEX TABLE concur_reindex_part_10;
+WARNING:  REINDEX of partitioned tables is not yet implemented, skipping "concur_reindex_part_10"
+NOTICE:  table "concur_reindex_part_10" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_part_10;
+WARNING:  REINDEX of partitioned tables is not yet implemented, skipping "concur_reindex_part_10"
+NOTICE:  table "concur_reindex_part_10" has no indexes
+SELECT relid, parentrelid, level FROM pg_partition_tree('concur_reindex_part_index')
+  ORDER BY relid, level;
+             relid             |         parentrelid         | level 
+-------------------------------+-----------------------------+-------
+ concur_reindex_part_index     |                             |     0
+ concur_reindex_part_index_0   | concur_reindex_part_index   |     1
+ concur_reindex_part_index_10  | concur_reindex_part_index   |     1
+ concur_reindex_part_index_0_1 | concur_reindex_part_index_0 |     2
+ concur_reindex_part_index_0_2 | concur_reindex_part_index_0 |     2
+(5 rows)
+
+-- REINDEX should preserve dependencies of partition tree.
+REINDEX INDEX CONCURRENTLY concur_reindex_part_index_0_1;
+REINDEX INDEX CONCURRENTLY concur_reindex_part_index_0_2;
+SELECT relid, parentrelid, level FROM pg_partition_tree('concur_reindex_part_index')
+  ORDER BY relid, level;
+             relid             |         parentrelid         | level 
+-------------------------------+-----------------------------+-------
+ concur_reindex_part_index     |                             |     0
+ concur_reindex_part_index_0   | concur_reindex_part_index   |     1
+ concur_reindex_part_index_10  | concur_reindex_part_index   |     1
+ concur_reindex_part_index_0_1 | concur_reindex_part_index_0 |     2
+ concur_reindex_part_index_0_2 | concur_reindex_part_index_0 |     2
+(5 rows)
+
+REINDEX TABLE CONCURRENTLY concur_reindex_part_0_1;
+REINDEX TABLE CONCURRENTLY concur_reindex_part_0_2;
+SELECT relid, parentrelid, level FROM pg_partition_tree('concur_reindex_part_index')
+  ORDER BY relid, level;
+             relid             |         parentrelid         | level 
+-------------------------------+-----------------------------+-------
+ concur_reindex_part_index     |                             |     0
+ concur_reindex_part_index_0   | concur_reindex_part_index   |     1
+ concur_reindex_part_index_10  | concur_reindex_part_index   |     1
+ concur_reindex_part_index_0_1 | concur_reindex_part_index_0 |     2
+ concur_reindex_part_index_0_2 | concur_reindex_part_index_0 |     2
+(5 rows)
+
+DROP TABLE concur_reindex_part;
 -- Check errors
 -- Cannot run inside a transaction block
 BEGIN;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 4d2535b482..b1f3767165 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1211,6 +1211,49 @@ SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
 REINDEX TABLE CONCURRENTLY testcomment ;
 SELECT obj_description('testcomment_idx1'::regclass, 'pg_class');
 DROP TABLE testcomment;
+-- Partitions
+-- Create partition table layer.
+CREATE TABLE concur_reindex_part (c1 int, c2 int) PARTITION BY RANGE (c1);
+CREATE TABLE concur_reindex_part_0 PARTITION OF concur_reindex_part
+  FOR VALUES FROM (0) TO (10) PARTITION BY list (c2);
+CREATE TABLE concur_reindex_part_0_1 PARTITION OF concur_reindex_part_0
+  FOR VALUES IN (1);
+CREATE TABLE concur_reindex_part_0_2 PARTITION OF concur_reindex_part_0
+  FOR VALUES IN (2);
+-- This partitioned table should remain with no partitions.
+CREATE TABLE concur_reindex_part_10 PARTITION OF concur_reindex_part
+  FOR VALUES FROM (10) TO (20) PARTITION BY list (c2);
+-- Create partition index layer.
+CREATE INDEX concur_reindex_part_index ON ONLY concur_reindex_part (c1);
+CREATE INDEX concur_reindex_part_index_0 ON ONLY concur_reindex_part_0 (c1);
+ALTER INDEX concur_reindex_part_index ATTACH PARTITION concur_reindex_part_index_0;
+-- This partitioned index should remain with no partitions.
+CREATE INDEX concur_reindex_part_index_10 ON ONLY concur_reindex_part_10 (c1);
+ALTER INDEX concur_reindex_part_index ATTACH PARTITION concur_reindex_part_index_10;
+CREATE INDEX concur_reindex_part_index_0_1 ON ONLY concur_reindex_part_0_1 (c1);
+ALTER INDEX concur_reindex_part_index_0 ATTACH PARTITION concur_reindex_part_index_0_1;
+CREATE INDEX concur_reindex_part_index_0_2 ON ONLY concur_reindex_part_0_2 (c1);
+ALTER INDEX concur_reindex_part_index_0 ATTACH PARTITION concur_reindex_part_index_0_2;
+SELECT relid, parentrelid, level FROM pg_partition_tree('concur_reindex_part_index')
+  ORDER BY relid, level;
+-- REINDEX fails for partitioned indexes
+REINDEX INDEX concur_reindex_part_index_10;
+REINDEX INDEX CONCURRENTLY concur_reindex_part_index_10;
+-- REINDEX is a no-op for partitioned tables
+REINDEX TABLE concur_reindex_part_10;
+REINDEX TABLE CONCURRENTLY concur_reindex_part_10;
+SELECT relid, parentrelid, level FROM pg_partition_tree('concur_reindex_part_index')
+  ORDER BY relid, level;
+-- REINDEX should preserve dependencies of partition tree.
+REINDEX INDEX CONCURRENTLY concur_reindex_part_index_0_1;
+REINDEX INDEX CONCURRENTLY concur_reindex_part_index_0_2;
+SELECT relid, parentrelid, level FROM pg_partition_tree('concur_reindex_part_index')
+  ORDER BY relid, level;
+REINDEX TABLE CONCURRENTLY concur_reindex_part_0_1;
+REINDEX TABLE CONCURRENTLY concur_reindex_part_0_2;
+SELECT relid, parentrelid, level FROM pg_partition_tree('concur_reindex_part_index')
+  ORDER BY relid, level;
+DROP TABLE concur_reindex_part;
 
 -- Check errors
 -- Cannot run inside a transaction block

#160Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#159)
Re: REINDEX CONCURRENTLY 2.0

On Tue, Apr 09, 2019 at 03:50:27PM +0900, Michael Paquier wrote:

And here is the patch to address this issue. It happens that a bit
more than the dependency switch was lacking here:
- At swap time, we need to have the new index definition track
relispartition from the old index.
- Again at swap time, the inheritance link needs to be updated between
the old/new index and its parent when reindexing a partition index.

Peter, this is an open item, and as the committer of the feature I
think you are its owner. That said, I don't mind taking ownership if
need be, as I know this stuff. Anyway, could you have a look at the
proposed patch and see if you have any issues with it?
--
Michael

#161Dagfinn Ilmari Mannsåker
ilmari@ilmari.org
In reply to: Peter Eisentraut (#141)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 2019-03-28 09:07, Sergei Kornilov wrote:

Unfortunately patch does not apply due recent commits. Any chance
this can be fixed (and even committed in pg12)?

Committed :)

I noticed that the docs for how to recover from a failed CREATE INDEX
CONCURRENTLY say that «REINDEX does not support concurrent builds»,
which is no longer true. I was going to just remove the caveat, but
then I discovered that REINDEX CONCURRENTLY doesn't work on INVALID
indexes (why?).

Attached is a patch that instead adjusts the claim to say that REINDEX
does not support rebuilding invalid indexes concurrently.

- ilmari
--
"A disappointingly low fraction of the human race is,
at any given time, on fire." - Stig Sandbeck Mathisen

Attachments:

0001-Correct-claim-about-REINDEX-CONCURRENTLY-in-CREATE-I.patchtext/x-diffDownload
From e2de72b348f8a96e24128fc4188bd542eb676610 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Thu, 11 Apr 2019 10:58:47 +0100
Subject: [PATCH] Correct claim about REINDEX CONCURRENTLY in CREATE INDEX
 CONCURRENTLY docs

REINDEX CONCURRENTLY exists, but cannot reindex invalid indexes.
---
 doc/src/sgml/ref/create_index.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index d9d95b20e3..c458f54ef1 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -557,8 +557,8 @@
     method in such cases is to drop the index and try again to perform
     <command>CREATE INDEX CONCURRENTLY</command>.  (Another possibility is to rebuild
     the index with <command>REINDEX</command>.  However, since <command>REINDEX</command>
-    does not support concurrent builds, this option is unlikely to seem
-    attractive.)
+    does not support reindexing invalid indexes concurrently, this option is
+    unlikely to seem attractive.)
    </para>
 
    <para>
-- 
2.20.1

#162Michael Paquier
michael@paquier.xyz
In reply to: Noname (#161)
Re: REINDEX CONCURRENTLY 2.0

On Thu, Apr 11, 2019 at 11:21:29AM +0100, Dagfinn Ilmari Mannsåker wrote:

I noticed that the docs for how to recover from a failed CREATE INDEX
CONCURRENTLY say that «REINDEX does not support concurrent builds»,
which is no longer true.

Good catch. I'll apply that in a couple of hours.

I was going to just remove the caveat, but then I discovered that
REINDEX CONCURRENTLY doesn't work on INVALID indexes (why?).

This is a deliberate choice. The index built in parallel with the
existing one during a concurrent reindex is marked invalid during most
of the operation. Hence, if the reindex is interrupted or fails, you
finish with potentially twice the number of original indexes, half of
them invalid and the other half the ones in use. If the user then
decides to rinse and repeat the concurrent reindex, and if we were to
also select invalid indexes for the operation, you could end up doing
twice the amount of work on the relation, half of it for nothing.
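The selection rule described here can be sketched as follows
(illustrative Python, not server code; the index names, including the
_ccnew-style leftover, are invented for the example):

```python
# Toy sketch: when gathering a whole relation's indexes for a
# concurrent reindex, invalid leftovers from an earlier failed
# concurrent build are skipped so a retry does not rebuild them
# for nothing.  Illustrative only.

def select_indexes_for_table_reindex(indexes):
    """indexes: list of (name, is_valid) pairs for one relation."""
    return [name for name, is_valid in indexes if is_valid]

indexes = [("idx_a", True), ("idx_a_ccnew", False), ("idx_b", True)]
selected = select_indexes_for_table_reindex(indexes)
```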
--
Michael

#163Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#162)
Re: REINDEX CONCURRENTLY 2.0

On 2019-Apr-11, Michael Paquier wrote:

I was going to just remove the caveat, but then I discovered that
REINDEX CONCURRENTLY doesn't work on INVALID indexes (why?).

This is a deliberate choice. The index built in parallel with the
existing one during a concurrent reindex is marked invalid during most
of the operation. Hence, if the reindex is interrupted or fails, you
finish with potentially twice the number of original indexes, half of
them invalid and the other half the ones in use. If the user then
decides to rinse and repeat the concurrent reindex, and if we were to
also select invalid indexes for the operation, you could end up doing
twice the amount of work on the relation, half of it for nothing.

Hmm, I suppose that makes sense for REINDEX TABLE or anything that
reindexes more than one index, but if you do REINDEX INDEX surely it
is reasonable to allow the case?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#164Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#163)
1 attachment(s)
Re: REINDEX CONCURRENTLY 2.0

On Thu, Apr 11, 2019 at 09:49:47AM -0400, Alvaro Herrera wrote:

Hmm, I suppose that makes sense for REINDEX TABLE or anything that
reindexes more than one index, but if you do REINDEX INDEX surely it
is reasonable to allow the case?

Yes, we could revisit the REINDEX INDEX portion of the decision; after
sleeping on it, my previous argument makes limited sense for a REINDEX
operating on only one index. One could note that the header comment of
ReindexRelationConcurrently() kind of implies the same conclusion as
you do, but that is perhaps just an accident.

So I have come up with the attached patch. I have introduced some
tests using a trick with CREATE INDEX CONCURRENTLY to have an invalid
index to work on.
--
Michael

Attachments:

reindex-conc-invalid.patchtext/x-diff; charset=us-asciiDownload
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index d9d95b20e3..929a326ae7 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -555,10 +555,8 @@ Indexes:
 
     The recommended recovery
     method in such cases is to drop the index and try again to perform
-    <command>CREATE INDEX CONCURRENTLY</command>.  (Another possibility is to rebuild
-    the index with <command>REINDEX</command>.  However, since <command>REINDEX</command>
-    does not support concurrent builds, this option is unlikely to seem
-    attractive.)
+    <command>CREATE INDEX CONCURRENTLY</command>.  (Another possibility is
+    to rebuild the index with <command>REINDEX INDEX CONCURRENTLY</command>.
    </para>
 
    <para>
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index e05a76c6d8..303436c89d 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -65,12 +65,11 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
 
     <listitem>
      <para>
-      An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
-      an <quote>invalid</quote> index. Such indexes are useless but it can be
-      convenient to use <command>REINDEX</command> to rebuild them. Note that
-      <command>REINDEX</command> will not perform a concurrent build on an invalid index. To build the
-      index without interfering with production you should drop the index and
-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
+      If an index build fails with the <literal>CONCURRENTLY</literal> option,
+      this index is left as <quote>invalid</quote>. Such indexes are useless
+      but it can be convenient to use <command>REINDEX</command> to rebuild
+      them. Note that only <command>REINDEX INDEX</command> is able
+      to perform a concurrent build on an invalid index.
      </para>
     </listitem>
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 46f32c21f9..a1c91b5fb8 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -2776,11 +2776,6 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 			}
 		case RELKIND_INDEX:
 			{
-				/*
-				 * For an index simply add its Oid to list. Invalid indexes
-				 * cannot be included in list.
-				 */
-				Relation	indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
 				Oid			heapId = IndexGetRelation(relationOid, false);
 
 				/* A shared relation cannot be reindexed concurrently */
@@ -2801,25 +2796,13 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 				/* Track the heap relation of this index for session locks */
 				heapRelationIds = list_make1_oid(heapId);
 
+				/*
+				 * Save the list of relation OIDs in private context.  Note
+				 * that invalid indexes are allowed here.
+				 */
+				indexIds = lappend_oid(indexIds, relationOid);
+
 				MemoryContextSwitchTo(oldcontext);
-
-				if (!indexRelation->rd_index->indisvalid)
-					ereport(WARNING,
-							(errcode(ERRCODE_INDEX_CORRUPTED),
-							 errmsg("cannot reindex concurrently invalid index \"%s.%s\", skipping",
-									get_namespace_name(get_rel_namespace(relationOid)),
-									get_rel_name(relationOid))));
-				else
-				{
-					/* Save the list of relation OIDs in private context */
-					oldcontext = MemoryContextSwitchTo(private_context);
-
-					indexIds = lappend_oid(indexIds, relationOid);
-
-					MemoryContextSwitchTo(oldcontext);
-				}
-
-				index_close(indexRelation, NoLock);
 				break;
 			}
 		case RELKIND_PARTITIONED_TABLE:
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index f9b4768aee..8bfcf57d5a 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2033,6 +2033,38 @@ Referenced by:
 
 DROP MATERIALIZED VIEW concur_reindex_matview;
 DROP TABLE concur_reindex_tab, concur_reindex_tab2, concur_reindex_tab3;
+-- Check handling of invalid indexes
+CREATE TABLE concur_reindex_tab4 (c1 int);
+INSERT INTO concur_reindex_tab4 VALUES (1), (1), (2);
+-- This trick creates an invalid index.
+CREATE UNIQUE INDEX CONCURRENTLY concur_reindex_ind5 ON concur_reindex_tab4 (c1);
+ERROR:  could not create unique index "concur_reindex_ind5"
+DETAIL:  Key (c1)=(1) is duplicated.
+-- And this makes the previous failure go away, so the index can become valid.
+DELETE FROM concur_reindex_tab4 WHERE c1 = 1;
+-- The invalid index is not processed when running REINDEX TABLE.
+REINDEX TABLE CONCURRENTLY concur_reindex_tab4;
+WARNING:  cannot reindex concurrently invalid index "public.concur_reindex_ind5", skipping
+NOTICE:  table "concur_reindex_tab4" has no indexes
+\d concur_reindex_tab4
+        Table "public.concur_reindex_tab4"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ c1     | integer |           |          | 
+Indexes:
+    "concur_reindex_ind5" UNIQUE, btree (c1) INVALID
+
+-- But it is fixed with REINDEX INDEX
+REINDEX INDEX CONCURRENTLY concur_reindex_ind5;
+\d concur_reindex_tab4
+        Table "public.concur_reindex_tab4"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ c1     | integer |           |          | 
+Indexes:
+    "concur_reindex_ind5" UNIQUE, btree (c1)
+
+DROP TABLE concur_reindex_tab4;
 --
 -- REINDEX SCHEMA
 --
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 2f0e9a63e6..74242098e3 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -806,6 +806,21 @@ REINDEX SCHEMA CONCURRENTLY pg_catalog;
 DROP MATERIALIZED VIEW concur_reindex_matview;
 DROP TABLE concur_reindex_tab, concur_reindex_tab2, concur_reindex_tab3;
 
+-- Check handling of invalid indexes
+CREATE TABLE concur_reindex_tab4 (c1 int);
+INSERT INTO concur_reindex_tab4 VALUES (1), (1), (2);
+-- This trick creates an invalid index.
+CREATE UNIQUE INDEX CONCURRENTLY concur_reindex_ind5 ON concur_reindex_tab4 (c1);
+-- And this makes the previous failure go away, so the index can become valid.
+DELETE FROM concur_reindex_tab4 WHERE c1 = 1;
+-- The invalid index is not processed when running REINDEX TABLE.
+REINDEX TABLE CONCURRENTLY concur_reindex_tab4;
+\d concur_reindex_tab4
+-- But it is fixed with REINDEX INDEX
+REINDEX INDEX CONCURRENTLY concur_reindex_ind5;
+\d concur_reindex_tab4
+DROP TABLE concur_reindex_tab4;
+
 --
 -- REINDEX SCHEMA
 --
#165Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#160)
Re: REINDEX CONCURRENTLY 2.0

On 2019-04-11 05:59, Michael Paquier wrote:

On Tue, Apr 09, 2019 at 03:50:27PM +0900, Michael Paquier wrote:

And here is the patch to address this issue. It happens that a bit
more than the dependency switch was lacking here:
- At swap time, we need to have the new index definition track
relispartition from the old index.
- Again at swap time, the inheritance link needs to be updated between
the old/new index and its parent when reindexing a partition index.
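A minimal scenario exercising those two points might look like the following (relation names are hypothetical; the leaf index name assumes the usual `<table>_<column>_idx` default):

```sql
-- Reindexing an index attached to a partition must preserve
-- pg_class.relispartition on the new index and keep the pg_inherits
-- link between the rebuilt index and its parent partitioned index.
CREATE TABLE parent_tab (id int) PARTITION BY RANGE (id);
CREATE TABLE child_tab PARTITION OF parent_tab FOR VALUES FROM (0) TO (10);
CREATE INDEX parent_idx ON parent_tab (id);  -- cascades to child_tab
REINDEX INDEX CONCURRENTLY child_tab_id_idx;
-- After the fix, the rebuilt index is still a child of parent_idx:
-- SELECT inhparent::regclass FROM pg_inherits
--   WHERE inhrelid = 'child_tab_id_idx'::regclass;
```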

Peter, this is an open item, and I think as the committer of the
feature you are its owner. Well, in this case, I don't mind taking
the ownership as need be as I know this stuff. Anyway, could you have
a look at the patch proposed and see if you have any issues with it?

Looks good, committed.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#166Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#165)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Apr 12, 2019 at 08:44:18AM +0200, Peter Eisentraut wrote:

Looks good, committed.

Thanks for committing!
--
Michael

#167Dagfinn Ilmari Mannsåker
ilmari@ilmari.org
In reply to: Michael Paquier (#164)
Re: REINDEX CONCURRENTLY 2.0

Michael Paquier <michael@paquier.xyz> writes:

So... I am coming up with the patch attached. I have introduced some
tests using a trick with CIC to have an invalid index to work on.

I don't have any comments on the code (but the test looks sensible, it's
the same trick I used to discover the issue in the first place).

However, the doc patch lost the trailing paren:

The recommended recovery
method in such cases is to drop the index and try again to perform
-    <command>CREATE INDEX CONCURRENTLY</command>.  (Another possibility is to rebuild
-    the index with <command>REINDEX</command>.  However, since <command>REINDEX</command>
-    does not support concurrent builds, this option is unlikely to seem
-    attractive.)
+    <command>CREATE INDEX CONCURRENTLY</command>.  (Another possibility is
+    to rebuild the index with <command>REINDEX INDEX CONCURRENTLY</command>.
</para>

- ilmari
--
- Twitter seems more influential [than blogs] in the 'gets reported in
the mainstream press' sense at least. - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
to a mainstream media article. - Calle Dybedahl

#168Michael Paquier
michael@paquier.xyz
In reply to: Dagfinn Ilmari Mannsåker (#167)
Re: REINDEX CONCURRENTLY 2.0

On Fri, Apr 12, 2019 at 12:11:12PM +0100, Dagfinn Ilmari Mannsåker wrote:

I don't have any comments on the code (but the test looks sensible, it's
the same trick I used to discover the issue in the first place).

After thinking some more on it, this behavior looks rather sensible to
me. Are there any objections?

However, the doc patch lost the trailing paren:

Fixed on my branch, thanks.
--
Michael

#169Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#168)
Re: REINDEX CONCURRENTLY 2.0

On 2019-04-16 08:19, Michael Paquier wrote:

On Fri, Apr 12, 2019 at 12:11:12PM +0100, Dagfinn Ilmari Mannsåker wrote:

I don't have any comments on the code (but the test looks sensible, it's
the same trick I used to discover the issue in the first place).

After thinking some more on it, this behavior looks rather sensible to
me. Are there any objections?

Looks good to me.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#170Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#169)
Re: REINDEX CONCURRENTLY 2.0

On Tue, Apr 16, 2019 at 08:50:31AM +0200, Peter Eisentraut wrote:

Looks good to me.

Thanks, committed. If there are additional discussions on various
points of the feature, let's move to a new thread please. This one
has already been used quite extensively ;)
--
Michael