refresh materialized view concurrently

Started by Kevin Grittnerover 12 years ago43 messages
#1Kevin Grittner
kgrittn@ymail.com
1 attachment(s)

Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1.  The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.

It is my hope to get this committed during this CF to allow me to
focus on incremental maintenance for the rest of the release cycle.

I didn't need to touch very much outside of matview-specific files
for this.  My biggest concern is that I needed two small functions
which did *exactly* what some static functions in ri_triggers.c
were doing and couldn't see where the best place to share them from
was.  For the moment I just duplicated them, but my hope would be
that they could be put in a suitable location and called from both
places, rather than duplicating the 30-some lines of code.  The
function signatures are:

void quoteOneName(char *buffer, const char *name)
void quoteRelationName(char *buffer, Relation rel)

Comments in the patch describe the technique used for the
transactional refresh, but I'm not sure how easy it is to
understand the technique from the comments.  Here is a
demonstration of the basic technique, using a table to mock the
materialized view so it can be run directly.

-------------------------------------------------------------------

--
-- Setup
--
drop table if exists n, nt, nd cascade;
drop table if exists nm;

create table n (id int not null primary key, val text);
insert into n values
  (1, 'one'), (2, 'two'), (3, 'three'), (4, 'four'), (5, 'five'),
  (6, null), (7, null), (8, null), (9, null);
-- We use a table to mock this materialized view definition:
--   create materialized view nm as select * from n;
create table nm as select * from n;
insert into n values (10, 'ten'), (11, null);
update n set val = 'zwei' where id = 2;
update n set val = null where id = 3;
update n set id = 44, val = 'forty-four' where id = 4;
update n set val = 'seven' where id = 7;
delete from n where id = 5;
delete from n where id = 8;

vacuum analyze;

--
-- Sample of internal processing for REFRESH MV CONCURRENTLY.
--
begin;
create temp table nt as select * from n;
analyze nt;
create temp table nd as
  SELECT x.ctid as tid, y
    FROM nm x
    FULL JOIN n y ON (y.id OPERATOR(pg_catalog.=) x.id)
    WHERE (y.*) IS DISTINCT FROM (x.*)
    ORDER BY tid;
analyze nd;

delete from nm where ctid in
  (select tid from nd
    where tid is not null and y is not distinct from null);
update nm x set id = (d.y).id, val = (d.y).val from nd d
  where d.tid is not null and x.ctid = d.tid;
insert into nm select (y).* from nd where tid is null;
commit;

--
-- Check that results match.
--
select * from n order by id;
select * from nm order by id;

-------------------------------------------------------------------

I also tried a million-row materialized view with the patch to see
what the performace was like on a large table with just a few
changes.  I was surprised that a small change-set like this was
actually faster than replacing the heap, at least on my machine.
Obviously, when a larger number of rows are affected the
transactional CONCURRENTLY option will be slower, and this is not
intended in any way as a performace-enhancing feature, that was
just a happy surprise in testing.

-------------------------------------------------------------------

-- drop from previous test
drop table if exists testv cascade;

-- create and populate permanent table
create table testv (id int primary key, val text);
insert into testv
  select n, cash_words((floor(random() * 100000000) / 100)::text::money)
  from (select generate_series(1, 2000000, 2)) s(n);
update testv
  set val = NULL
  where id = 547345;

create materialized view matv as select * from testv;
create unique index matv_id on matv (id);
vacuum analyze matv;

delete from testv where id = 16405;
insert into testv
  values (393466, cash_words((floor(random() * 100000000) / 100)::text::money));
update testv
  set val = cash_words((floor(random() * 100000000) / 100)::text::money)
  where id = 1947141;

refresh materialized view concurrently matv;

-------------------------------------------------------------------

People may be surprised to see this using SPI even more than
ri_triggers.c does.  I think this is the safest and most
maintainable approach, although I welcome alternative suggestions.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

refresh-concurrently-v1.patchtext/x-diff; name=refresh-concurrently-v1.patchDownload
*** a/doc/src/sgml/mvcc.sgml
--- b/doc/src/sgml/mvcc.sgml
***************
*** 928,935 **** ERROR:  could not serialize access due to read/write dependencies among transact
          </para>
  
          <para>
!          This lock mode is not automatically acquired on tables by any
!          <productname>PostgreSQL</productname> command.
          </para>
         </listitem>
        </varlistentry>
--- 928,934 ----
          </para>
  
          <para>
!          Acquired by <command>REFRESH MATERIALIZED VIEW CONCURRENTLY</command>.
          </para>
         </listitem>
        </varlistentry>
*** a/doc/src/sgml/ref/refresh_materialized_view.sgml
--- b/doc/src/sgml/ref/refresh_materialized_view.sgml
***************
*** 21,27 **** PostgreSQL documentation
  
   <refsynopsisdiv>
  <synopsis>
! REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
      [ WITH [ NO ] DATA ]
  </synopsis>
   </refsynopsisdiv>
--- 21,27 ----
  
   <refsynopsisdiv>
  <synopsis>
! REFRESH MATERIALIZED VIEW [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
      [ WITH [ NO ] DATA ]
  </synopsis>
   </refsynopsisdiv>
***************
*** 38,43 **** REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
--- 38,47 ----
     data is generated and the materialized view is left in an unscannable
     state.
    </para>
+   <para>
+    <literal>CONCURRENTLY</literal> and <literal>WITH NO DATA</literal> may not
+    be specified together.
+   </para>
   </refsect1>
  
   <refsect1>
***************
*** 45,50 **** REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
--- 49,82 ----
  
    <variablelist>
     <varlistentry>
+     <term><literal>CONCURRENTLY</literal></term>
+     <listitem>
+      <para>
+       Refresh the materialized view without locking out concurrent selects on
+       the materialized view.  Without this option a refresh which affects a
+       lot of rows will tend to use fewer resources and complete more quickly,
+       but could block other connections which are trying to read from the
+       materialized view.  This option may be faster in cases where a small
+       number of rows are affected.
+      </para>
+      <para>
+       This option is only allowed if there is at least one
+       <literal>UNIQUE</literal> index on the materialized view which uses only
+       column names and includes all rows; that is, it must not index on any
+       expressions nor include a <literal>WHERE</literal> clause.
+      </para>
+      <para>
+       This option may not be used when the materialized view is not already
+       populated.
+      </para>
+      <para>
+       Even with this option only one <literal>REFRESH</literal> at a time may
+       run against any one materialized view.
+      </para>
+     </listitem>
+    </varlistentry>
+ 
+    <varlistentry>
      <term><replaceable class="PARAMETER">name</replaceable></term>
      <listitem>
       <para>
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
***************
*** 583,589 **** rebuild_relation(Relation OldHeap, Oid indexOid,
  	heap_close(OldHeap, NoLock);
  
  	/* Create the transient table that will receive the re-ordered data */
! 	OIDNewHeap = make_new_heap(tableOid, tableSpace);
  
  	/* Copy the heap data into the new table in the desired order */
  	copy_heap_data(OIDNewHeap, tableOid, indexOid,
--- 583,590 ----
  	heap_close(OldHeap, NoLock);
  
  	/* Create the transient table that will receive the re-ordered data */
! 	OIDNewHeap = make_new_heap(tableOid, tableSpace, false,
! 							   AccessExclusiveLock);
  
  	/* Copy the heap data into the new table in the desired order */
  	copy_heap_data(OIDNewHeap, tableOid, indexOid,
***************
*** 610,616 **** rebuild_relation(Relation OldHeap, Oid indexOid,
   * data, then call finish_heap_swap to complete the operation.
   */
  Oid
! make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  {
  	TupleDesc	OldHeapDesc;
  	char		NewHeapName[NAMEDATALEN];
--- 611,618 ----
   * data, then call finish_heap_swap to complete the operation.
   */
  Oid
! make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
! 			  LOCKMODE lockmode)
  {
  	TupleDesc	OldHeapDesc;
  	char		NewHeapName[NAMEDATALEN];
***************
*** 620,627 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  	HeapTuple	tuple;
  	Datum		reloptions;
  	bool		isNull;
  
! 	OldHeap = heap_open(OIDOldHeap, AccessExclusiveLock);
  	OldHeapDesc = RelationGetDescr(OldHeap);
  
  	/*
--- 622,631 ----
  	HeapTuple	tuple;
  	Datum		reloptions;
  	bool		isNull;
+ 	Oid			namespaceid;
+ 	char		relpersistence;
  
! 	OldHeap = heap_open(OIDOldHeap, lockmode);
  	OldHeapDesc = RelationGetDescr(OldHeap);
  
  	/*
***************
*** 642,647 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
--- 646,662 ----
  	if (isNull)
  		reloptions = (Datum) 0;
  
+ 	if (forcetemp)
+ 	{
+ 		namespaceid = LookupCreationNamespace("pg_temp");
+ 		relpersistence = RELPERSISTENCE_TEMP;
+ 	}
+ 	else
+ 	{
+ 		namespaceid = RelationGetNamespace(OldHeap);
+ 		relpersistence = OldHeap->rd_rel->relpersistence;
+ 	}
+ 
  	/*
  	 * Create the new heap, using a temporary name in the same namespace as
  	 * the existing table.	NOTE: there is some risk of collision with user
***************
*** 657,663 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  	snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap);
  
  	OIDNewHeap = heap_create_with_catalog(NewHeapName,
! 										  RelationGetNamespace(OldHeap),
  										  NewTableSpace,
  										  InvalidOid,
  										  InvalidOid,
--- 672,678 ----
  	snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap);
  
  	OIDNewHeap = heap_create_with_catalog(NewHeapName,
! 										  namespaceid,
  										  NewTableSpace,
  										  InvalidOid,
  										  InvalidOid,
***************
*** 665,672 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  										  OldHeap->rd_rel->relowner,
  										  OldHeapDesc,
  										  NIL,
! 										  OldHeap->rd_rel->relkind,
! 										  OldHeap->rd_rel->relpersistence,
  										  false,
  										  RelationIsMapped(OldHeap),
  										  true,
--- 680,687 ----
  										  OldHeap->rd_rel->relowner,
  										  OldHeapDesc,
  										  NIL,
! 										  RELKIND_RELATION,
! 										  relpersistence,
  										  false,
  										  RelationIsMapped(OldHeap),
  										  true,
*** a/src/backend/commands/matview.c
--- b/src/backend/commands/matview.c
***************
*** 20,36 ****
--- 20,42 ----
  #include "catalog/catalog.h"
  #include "catalog/indexing.h"
  #include "catalog/namespace.h"
+ #include "catalog/pg_operator.h"
  #include "commands/cluster.h"
  #include "commands/matview.h"
  #include "commands/tablecmds.h"
+ #include "commands/tablespace.h"
  #include "executor/executor.h"
+ #include "executor/spi.h"
  #include "miscadmin.h"
+ #include "parser/parse_relation.h"
  #include "rewrite/rewriteHandler.h"
  #include "storage/smgr.h"
  #include "tcop/tcopprot.h"
+ #include "utils/lsyscache.h"
  #include "utils/rel.h"
  #include "utils/snapmgr.h"
  #include "utils/syscache.h"
+ #include "utils/typcache.h"
  
  
  typedef struct
***************
*** 44,49 **** typedef struct
--- 50,60 ----
  	BulkInsertState bistate;	/* bulk insert state */
  } DR_transientrel;
  
+ #define MAX_QUOTED_NAME_LEN  (NAMEDATALEN*2+3)
+ #define MAX_QUOTED_REL_NAME_LEN  (MAX_QUOTED_NAME_LEN*2)
+ 
+ static int	matview_maintenance_depth = 0;
+ 
  static void transientrel_startup(DestReceiver *self, int operation, TupleDesc typeinfo);
  static void transientrel_receive(TupleTableSlot *slot, DestReceiver *self);
  static void transientrel_shutdown(DestReceiver *self);
***************
*** 51,56 **** static void transientrel_destroy(DestReceiver *self);
--- 62,77 ----
  static void refresh_matview_datafill(DestReceiver *dest, Query *query,
  						 const char *queryString);
  
+ static void quoteOneName(char *buffer, const char *name);
+ static void quoteRelationName(char *buffer, Relation rel);
+ static void mv_GenerateOper(StringInfo buf, Oid opoid);
+ 
+ static void refresh_by_match_merge(Oid matviewOid, Oid tempOid);
+ static void refresh_by_heap_swap(Oid matviewOid, Oid OIDNewHeap);
+ 
+ static void OpenMatViewIncrementalMaintenance(void);
+ static void CloseMatViewIncrementalMaintenance(void);
+ 
  /*
   * SetMatViewPopulatedState
   *		Mark a materialized view as populated, or not.
***************
*** 125,136 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  	Oid			tableSpace;
  	Oid			OIDNewHeap;
  	DestReceiver *dest;
  
  	/*
  	 * Get a lock until end of transaction.
  	 */
  	matviewOid = RangeVarGetRelidExtended(stmt->relation,
! 										  AccessExclusiveLock, false, false,
  										  RangeVarCallbackOwnsTable, NULL);
  	matviewRel = heap_open(matviewOid, NoLock);
  
--- 146,163 ----
  	Oid			tableSpace;
  	Oid			OIDNewHeap;
  	DestReceiver *dest;
+ 	bool		concurrent;
+ 	LOCKMODE	lockmode;
+ 
+ 	/* Determine strength of lock needed. */
+ 	concurrent = stmt->concurrent;
+ 	lockmode = concurrent ? ExclusiveLock : AccessExclusiveLock;
  
  	/*
  	 * Get a lock until end of transaction.
  	 */
  	matviewOid = RangeVarGetRelidExtended(stmt->relation,
! 										  lockmode, false, false,
  										  RangeVarCallbackOwnsTable, NULL);
  	matviewRel = heap_open(matviewOid, NoLock);
  
***************
*** 141,151 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  				 errmsg("\"%s\" is not a materialized view",
  						RelationGetRelationName(matviewRel))));
  
! 	/*
! 	 * We're not using materialized views in the system catalogs.
! 	 */
  	Assert(!IsSystemRelation(matviewRel));
  
  	Assert(!matviewRel->rd_rel->relhasoids);
  
  	/*
--- 168,189 ----
  				 errmsg("\"%s\" is not a materialized view",
  						RelationGetRelationName(matviewRel))));
  
! 	/* Check that CONCURRENTLY is not specified if not populated. */
! 	if (concurrent && !RelationIsPopulated(matviewRel))
! 		ereport(ERROR,
! 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
! 				 errmsg("CONCURRENTLY cannot be used when the materialized view is not populated")));
! 
! 	/* Check that conflicting options have not been specified. */
! 	if (concurrent && stmt->skipData)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_SYNTAX_ERROR),
! 				 errmsg("CONCURRENTLY and WITH NO DATA options cannot be used together")));
! 
! 	/* We're not using materialized views in the system catalogs. */
  	Assert(!IsSystemRelation(matviewRel));
  
+ 	/* We don't allow an oid column for a materialized view. */
  	Assert(!matviewRel->rd_rel->relhasoids);
  
  	/*
***************
*** 197,222 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  	 */
  	SetMatViewPopulatedState(matviewRel, !stmt->skipData);
  
! 	tableSpace = matviewRel->rd_rel->reltablespace;
  
  	heap_close(matviewRel, NoLock);
  
  	/* Create the transient table that will receive the regenerated data. */
! 	OIDNewHeap = make_new_heap(matviewOid, tableSpace);
  	dest = CreateTransientRelDestReceiver(OIDNewHeap);
  
  	/* Generate the data, if wanted. */
  	if (!stmt->skipData)
  		refresh_matview_datafill(dest, dataQuery, queryString);
  
! 	/*
! 	 * Swap the physical files of the target and transient tables, then
! 	 * rebuild the target's indexes and throw away the transient table.
! 	 */
! 	finish_heap_swap(matviewOid, OIDNewHeap, false, false, true, true,
! 					 RecentXmin, ReadNextMultiXactId());
! 
! 	RelationCacheInvalidateEntry(matviewOid);
  }
  
  /*
--- 235,262 ----
  	 */
  	SetMatViewPopulatedState(matviewRel, !stmt->skipData);
  
! 	/* Concurrent refresh builds new data in temp tablespace, and does diff. */
! 	if (concurrent)
! 		tableSpace = GetDefaultTablespace(RELPERSISTENCE_TEMP);
! 	else
! 		tableSpace = matviewRel->rd_rel->reltablespace;
  
  	heap_close(matviewRel, NoLock);
  
  	/* Create the transient table that will receive the regenerated data. */
! 	OIDNewHeap = make_new_heap(matviewOid, tableSpace, concurrent,
! 							   ExclusiveLock);
  	dest = CreateTransientRelDestReceiver(OIDNewHeap);
  
  	/* Generate the data, if wanted. */
  	if (!stmt->skipData)
  		refresh_matview_datafill(dest, dataQuery, queryString);
  
! 	/* Make the matview match the newly generated data. */
! 	if (concurrent)
! 		refresh_by_match_merge(matviewOid, OIDNewHeap);
! 	else
! 		refresh_by_heap_swap(matviewOid, OIDNewHeap);
  }
  
  /*
***************
*** 369,371 **** transientrel_destroy(DestReceiver *self)
--- 409,778 ----
  {
  	pfree(self);
  }
+ 
+ 
+ /*
+  * quoteOneName --- safely quote a single SQL name
+  *
+  * buffer must be MAX_QUOTED_NAME_LEN long (includes room for \0)
+  */
+ static void
+ quoteOneName(char *buffer, const char *name)
+ {
+ 	/* Rather than trying to be smart, just always quote it. */
+ 	*buffer++ = '"';
+ 	while (*name)
+ 	{
+ 		if (*name == '"')
+ 			*buffer++ = '"';
+ 		*buffer++ = *name++;
+ 	}
+ 	*buffer++ = '"';
+ 	*buffer = '\0';
+ }
+ 
+ /*
+  * quoteRelationName --- safely quote a fully qualified relation name
+  *
+  * buffer must be MAX_QUOTED_REL_NAME_LEN long (includes room for \0)
+  */
+ static void
+ quoteRelationName(char *buffer, Relation rel)
+ {
+ 	quoteOneName(buffer, get_namespace_name(RelationGetNamespace(rel)));
+ 	buffer += strlen(buffer);
+ 	*buffer++ = '.';
+ 	quoteOneName(buffer, RelationGetRelationName(rel));
+ }
+ 
+ static void
+ mv_GenerateOper(StringInfo buf, Oid opoid)
+ {
+ 	HeapTuple	opertup;
+ 	Form_pg_operator operform;
+ 	char		nspname[MAX_QUOTED_NAME_LEN];
+ 
+ 	opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(opoid));
+ 	if (!HeapTupleIsValid(opertup))
+ 		elog(ERROR, "cache lookup failed for operator %u", opoid);
+ 	operform = (Form_pg_operator) GETSTRUCT(opertup);
+ 	Assert(operform->oprkind == 'b');
+ 
+ 	quoteOneName(nspname, get_namespace_name(operform->oprnamespace));
+ 	appendStringInfo(buf, "OPERATOR(%s.%s)",
+ 					 nspname, NameStr(operform->oprname));
+ 
+ 	ReleaseSysCache(opertup);
+ }
+ 
+ /*
+  * refresh_by_match_merge
+  *
+  * Refresh a materialized view with transactional semantics, while allowing
+  * concurrent reads.
+  *
+  * This is called after a new version of the data has been created in a
+  * temporary table.  It performs a full outer join against the old version of
+  * the data, producing "diff" results.	This join cannot work if there are any
+  * duplicated rows in either the old or new versions, in the sense that every
+  * column would compare as equal between the two rows.	It does work correctly
+  * in the face of rows which have at least one NULL value, with all non-NULL
+  * columns equal.  The behavior of NULLs on equality tests and on UNIQUE
+  * indexes turns out to be quite convenient here; the tests we need to make
+  * are consistent with default behavior.  If there is at least one UNIQUE
+  * index on the materialized view, we have exactly the guarantee we need.  By
+  * joining based on equality on all columns which are part of any unique
+  * index, we identify the rows on which we can use UPDATE without any problem.
+  * If any column is NULL in either the old or new version of a row (or both),
+  * we must use DELETE and INSERT, since there could be multiple rows which are
+  * NOT DISTINCT FROM each other, and we could otherwise end up with the wrong
+  * number of occurrences in the updated relation.  The temporary table used to
+  * hold the diff results contains just the TID of the old record (if matched)
+  * and the ROW from the new table as a single column of complex record type
+  * (if matched).
+  *
+  * Once we have the diff table, we perform set-based DELETE, UPDATE, and
+  * INSERT operations against the materialized view, and discard both temporary
+  * tables.
+  *
+  * Everything from the generation of the new data to applying the differences
+  * takes place under cover of an ExclusiveLock, since it seems as though we
+  * would want to prohibit not only concurrent REFRESH operations, but also
+  * incremental maintenance.  It also doesn't seem reasonable or safe to allow
+  * SELECT FOR UPDATE or SELECT FOR SHARE on rows being updated or deleted by
+  * this command.
+  */
+ static void
+ refresh_by_match_merge(Oid matviewOid, Oid tempOid)
+ {
+ 	StringInfoData querybuf;
+ 	Relation	matviewRel;
+ 	Relation	tempRel;
+ 	char		matviewname[MAX_QUOTED_REL_NAME_LEN];
+ 	char		tempname[MAX_QUOTED_REL_NAME_LEN];
+ 	char		diffname[MAX_QUOTED_REL_NAME_LEN];
+ 	TupleDesc	tupdesc;
+ 	bool		foundUniqueIndex;
+ 	List	   *indexoidlist;
+ 	ListCell   *indexoidscan;
+ 	int16		relnatts;
+ 	bool	   *usedForQual;
+ 
+ 	initStringInfo(&querybuf);
+ 	matviewRel = heap_open(matviewOid, NoLock);
+ 	quoteRelationName(matviewname, matviewRel);
+ 	tempRel = heap_open(tempOid, NoLock);
+ 	quoteRelationName(tempname, tempRel);
+ 	strcpy(diffname, tempname);
+ 	strcpy(diffname + strlen(diffname) - 1, "_2\"");
+ 
+ 	relnatts = matviewRel->rd_rel->relnatts;
+ 	usedForQual = (bool *) palloc0(sizeof(bool) * relnatts);
+ 
+ 	/* Open SPI context. */
+ 	if (SPI_connect() != SPI_OK_CONNECT)
+ 		elog(ERROR, "SPI_connect failed");
+ 
+ 	/* Analyze the temp table with the new contents. */
+ 	appendStringInfo(&querybuf, "ANALYZE %s", tempname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Start building the query for creating the diff table. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "CREATE TEMP TABLE %s AS "
+ 					 "SELECT x.ctid AS tid, y FROM %s x FULL JOIN %s y ON (",
+ 					 diffname, matviewname, tempname);
+ 
+ 	/*
+ 	 * Get the list of index OIDs for the table from the relcache, and look up
+ 	 * each one in the pg_index syscache.  We will test for equality on all
+ 	 * columns present in all unique indexes which only reference columns and
+ 	 * include all rows.
+ 	 */
+ 	tupdesc = matviewRel->rd_att;
+ 	foundUniqueIndex = false;
+ 	indexoidlist = RelationGetIndexList(matviewRel);
+ 
+ 	foreach(indexoidscan, indexoidlist)
+ 	{
+ 		Oid			indexoid = lfirst_oid(indexoidscan);
+ 		HeapTuple	indexTuple;
+ 		Form_pg_index index;
+ 
+ 		indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexoid));
+ 		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
+ 			elog(ERROR, "cache lookup failed for index %u", indexoid);
+ 		index = (Form_pg_index) GETSTRUCT(indexTuple);
+ 
+ 		/* We're only interested if it is unique and valid. */
+ 		if (index->indisunique && IndexIsValid(index))
+ 		{
+ 			int			numatts = index->indnatts;
+ 			int			i;
+ 			bool		expr = false;
+ 			Relation	indexRel;
+ 
+ 			/* Skip any index on an expression. */
+ 			for (i = 0; i < numatts; i++)
+ 			{
+ 				if (index->indkey.values[i] == 0)
+ 				{
+ 					expr = true;
+ 					break;
+ 				}
+ 			}
+ 			if (expr)
+ 			{
+ 				ReleaseSysCache(indexTuple);
+ 				continue;
+ 			}
+ 
+ 			/*
+ 			 * Skip partial indexes.  We count on the ExclusiveLock on the
+ 			 * heap to keep things stable while we check this.
+ 			 */
+ 			indexRel = index_open(index->indexrelid, NoLock);
+ 			if (indexRel->rd_indpred != NIL)
+ 			{
+ 				index_close(indexRel, NoLock);
+ 				ReleaseSysCache(indexTuple);
+ 				continue;
+ 			}
+ 			index_close(indexRel, NoLock);
+ 
+ 			/* Add quals for all columns from this index. */
+ 			for (i = 0; i < numatts; i++)
+ 			{
+ 				int			attnum = index->indkey.values[i];
+ 				Oid			type;
+ 				Oid			op;
+ 				char		colname[MAX_QUOTED_NAME_LEN];
+ 
+ 				/*
+ 				 * Only include the column once regardless of how many times
+ 				 * it shows up in how many indexes.
+ 				 *
+ 				 * This is also useful later to omit columns which can not
+ 				 * have changed from the SET clause of the UPDATE statement.
+ 				 */
+ 				if (usedForQual[attnum - 1])
+ 					continue;
+ 				usedForQual[attnum - 1] = true;
+ 
+ 				/*
+ 				 * Actually add the qual, ANDed with any others.
+ 				 */
+ 				if (foundUniqueIndex)
+ 					appendStringInfoString(&querybuf, " AND ");
+ 
+ 				quoteOneName(colname,
+ 							 NameStr((tupdesc->attrs[attnum - 1])->attname));
+ 				appendStringInfo(&querybuf, "y.%s ", colname);
+ 				type = attnumTypeId(matviewRel, attnum);
+ 				op = lookup_type_cache(type, TYPECACHE_EQ_OPR)->eq_opr;
+ 				mv_GenerateOper(&querybuf, op);
+ 				appendStringInfo(&querybuf, " x.%s", colname);
+ 
+ 				foundUniqueIndex = true;
+ 			}
+ 		}
+ 		ReleaseSysCache(indexTuple);
+ 	}
+ 
+ 	list_free(indexoidlist);
+ 
+ 	if (!foundUniqueIndex)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ 				 errmsg("concurrent refresh requires a unique index on just columns for all rows of the materialized view")));
+ 
+ 	appendStringInfoString(&querybuf,
+ 						   ") WHERE (y.*) IS DISTINCT FROM (x.*)"
+ 						   " ORDER BY tid");
+ 
+ 	/* Create the temporary "diff" table. */
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/*
+ 	 * We have no further use for data from the "full-data" temp table, but we
+ 	 * must keep it around because its type is reference from the diff table.
+ 	 */
+ 
+ 	/* Analyze the diff table. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "ANALYZE %s", diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	OpenMatViewIncrementalMaintenance();
+ 
+ 	/* Deletes must come before inserts; do them first. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "DELETE FROM %s WHERE ctid IN "
+ 					 "(SELECT d.tid FROM %s d "
+ 					 "WHERE d.tid IS NOT NULL "
+ 					 "AND (d.y) IS NOT DISTINCT FROM NULL)",
+ 					 matviewname, diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_DELETE)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Updates before inserts gives a better chance at HOT updates. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "UPDATE %s x SET ", matviewname);
+ 
+ 	{
+ 		int			i;
+ 		bool		needComma = false;
+ 
+ 		for (i = 0; i < tupdesc->natts; i++)
+ 		{
+ 			char		colname[MAX_QUOTED_NAME_LEN];
+ 
+ 			if (tupdesc->attrs[i]->attisdropped)
+ 				continue;
+ 
+ 			if (usedForQual[i])
+ 				continue;
+ 
+ 			if (needComma)
+ 				appendStringInfoString(&querybuf, ", ");
+ 			needComma = true;
+ 
+ 			quoteOneName(colname,
+ 						 NameStr((tupdesc->attrs[i])->attname));
+ 			appendStringInfo(&querybuf, "%s = (d.y).%s", colname, colname);
+ 		}
+ 	}
+ 
+ 	appendStringInfo(&querybuf,
+ 					 " FROM %s d WHERE d.tid IS NOT NULL AND x.ctid = d.tid",
+ 					 diffname);
+ 
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UPDATE)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Inserts go last. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "INSERT INTO %s SELECT (y).* FROM %s WHERE tid IS NULL",
+ 					 matviewname, diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_INSERT)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* We're done maintaining the materialized view. */
+ 	CloseMatViewIncrementalMaintenance();
+ 	heap_close(tempRel, NoLock);
+ 	heap_close(matviewRel, NoLock);
+ 
+ 	/* Clean up temp tables. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "DROP TABLE %s, %s", diffname, tempname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Close SPI context. */
+ 	if (SPI_finish() != SPI_OK_FINISH)
+ 		elog(ERROR, "SPI_finish failed");
+ }
+ 
+ /*
+  * Swap the physical files of the target and transient tables, then rebuild
+  * the target's indexes and throw away the transient table.
+  */
+ static void
+ refresh_by_heap_swap(Oid matviewOid, Oid OIDNewHeap)
+ {
+ 	finish_heap_swap(matviewOid, OIDNewHeap, false, false, true, true,
+ 					 RecentXmin, ReadNextMultiXactId());
+ 
+ 	RelationCacheInvalidateEntry(matviewOid);
+ }
+ 
+ static void
+ OpenMatViewIncrementalMaintenance(void)
+ {
+ 	matview_maintenance_depth++;
+ }
+ 
+ static void
+ CloseMatViewIncrementalMaintenance(void)
+ {
+ 	matview_maintenance_depth--;
+ 	Assert(matview_maintenance_depth >= 0);
+ }
+ 
+ /*
+  * This should be used to test whether the backend is in a context where it is
+  * OK to allow DML statements to modify materialized views.  We only want to
+  * allow that for internal code driven by the materialized view definition,
+  * not for arbitrary user-supplied code.
+  */
+ bool
+ MatViewIncrementalMaintenanceIsEnabled(void)
+ {
+ 	return matview_maintenance_depth > 0;
+ }
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***************
*** 3529,3535 **** ATRewriteTables(List **wqueue, LOCKMODE lockmode)
  			heap_close(OldHeap, NoLock);
  
  			/* Create transient table that will receive the modified data */
! 			OIDNewHeap = make_new_heap(tab->relid, NewTableSpace);
  
  			/*
  			 * Copy the heap data into the new table with the desired
--- 3529,3536 ----
  			heap_close(OldHeap, NoLock);
  
  			/* Create transient table that will receive the modified data */
! 			OIDNewHeap = make_new_heap(tab->relid, NewTableSpace, false,
! 									   AccessExclusiveLock);
  
  			/*
  			 * Copy the heap data into the new table with the desired
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
***************
*** 42,47 ****
--- 42,48 ----
  #include "access/transam.h"
  #include "access/xact.h"
  #include "catalog/namespace.h"
+ #include "commands/matview.h"
  #include "commands/trigger.h"
  #include "executor/execdebug.h"
  #include "foreign/fdwapi.h"
***************
*** 999,1008 **** CheckValidResultRel(Relation resultRel, CmdType operation)
  			}
  			break;
  		case RELKIND_MATVIEW:
! 			ereport(ERROR,
! 					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
! 					 errmsg("cannot change materialized view \"%s\"",
! 							RelationGetRelationName(resultRel))));
  			break;
  		case RELKIND_FOREIGN_TABLE:
  			/* Okay only if the FDW supports it */
--- 1000,1010 ----
  			}
  			break;
  		case RELKIND_MATVIEW:
! 			if (!MatViewIncrementalMaintenanceIsEnabled())
! 				ereport(ERROR,
! 						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
! 						 errmsg("cannot change materialized view \"%s\"",
! 								RelationGetRelationName(resultRel))));
  			break;
  		case RELKIND_FOREIGN_TABLE:
  			/* Okay only if the FDW supports it */
*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
***************
*** 950,956 **** ExecModifyTable(ModifyTableState *node)
  				bool		isNull;
  
  				relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 				if (relkind == RELKIND_RELATION)
  				{
  					datum = ExecGetJunkAttribute(slot,
  												 junkfilter->jf_junkAttNo,
--- 950,956 ----
  				bool		isNull;
  
  				relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 				if (relkind == RELKIND_RELATION || relkind == RELKIND_MATVIEW)
  				{
  					datum = ExecGetJunkAttribute(slot,
  												 junkfilter->jf_junkAttNo,
***************
*** 1280,1286 **** ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
  					char		relkind;
  
  					relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 					if (relkind == RELKIND_RELATION)
  					{
  						j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
  						if (!AttributeNumberIsValid(j->jf_junkAttNo))
--- 1280,1287 ----
  					char		relkind;
  
  					relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 					if (relkind == RELKIND_RELATION ||
! 						relkind == RELKIND_MATVIEW)
  					{
  						j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
  						if (!AttributeNumberIsValid(j->jf_junkAttNo))
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 3241,3246 **** _copyRefreshMatViewStmt(const RefreshMatViewStmt *from)
--- 3241,3247 ----
  {
  	RefreshMatViewStmt *newnode = makeNode(RefreshMatViewStmt);
  
+ 	COPY_SCALAR_FIELD(concurrent);
  	COPY_SCALAR_FIELD(skipData);
  	COPY_NODE_FIELD(relation);
  
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 1521,1526 **** _equalCreateTableAsStmt(const CreateTableAsStmt *a, const CreateTableAsStmt *b)
--- 1521,1527 ----
  static bool
  _equalRefreshMatViewStmt(const RefreshMatViewStmt *a, const RefreshMatViewStmt *b)
  {
+ 	COMPARE_SCALAR_FIELD(concurrent);
  	COMPARE_SCALAR_FIELD(skipData);
  	COMPARE_NODE_FIELD(relation);
  
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 3284,3294 **** OptNoLog:	UNLOGGED					{ $$ = RELPERSISTENCE_UNLOGGED; }
   *****************************************************************************/
  
  RefreshMatViewStmt:
! 			REFRESH MATERIALIZED VIEW qualified_name opt_with_data
  				{
  					RefreshMatViewStmt *n = makeNode(RefreshMatViewStmt);
! 					n->relation = $4;
! 					n->skipData = !($5);
  					$$ = (Node *) n;
  				}
  		;
--- 3284,3295 ----
   *****************************************************************************/
  
  RefreshMatViewStmt:
! 			REFRESH MATERIALIZED VIEW opt_concurrently qualified_name opt_with_data
  				{
  					RefreshMatViewStmt *n = makeNode(RefreshMatViewStmt);
! 					n->concurrent = $4;
! 					n->relation = $5;
! 					n->skipData = !($6);
  					$$ = (Node *) n;
  				}
  		;
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
***************
*** 2871,2876 **** psql_completion(char *text, int start, int end)
--- 2871,2882 ----
  	else if (pg_strcasecmp(prev3_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev2_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev_wd, "VIEW") == 0)
+ 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_matviews,
+ 								   " UNION SELECT 'CONCURRENTLY'");
+ 	else if (pg_strcasecmp(prev4_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
  		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_matviews, NULL);
  	else if (pg_strcasecmp(prev4_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev3_wd, "MATERIALIZED") == 0 &&
***************
*** 2879,2884 **** psql_completion(char *text, int start, int end)
--- 2885,2895 ----
  	else if (pg_strcasecmp(prev5_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev4_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev3_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "CONCURRENTLY") == 0)
+ 		COMPLETE_WITH_CONST("WITH DATA");
+ 	else if (pg_strcasecmp(prev5_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "VIEW") == 0 &&
  			 pg_strcasecmp(prev_wd, "WITH") == 0)
  	{
  		static const char *const list_WITH_DATA[] =
***************
*** 2889,2894 **** psql_completion(char *text, int start, int end)
--- 2900,2911 ----
  	else if (pg_strcasecmp(prev6_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev5_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev4_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "CONCURRENTLY") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "WITH") == 0)
+ 		COMPLETE_WITH_CONST("DATA");
+ 	else if (pg_strcasecmp(prev6_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev5_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "VIEW") == 0 &&
  			 pg_strcasecmp(prev2_wd, "WITH") == 0 &&
  			 pg_strcasecmp(prev_wd, "NO") == 0)
  		COMPLETE_WITH_CONST("DATA");
*** a/src/include/commands/cluster.h
--- b/src/include/commands/cluster.h
***************
*** 25,31 **** extern void check_index_is_clusterable(Relation OldHeap, Oid indexOid,
  						   bool recheck, LOCKMODE lockmode);
  extern void mark_index_clustered(Relation rel, Oid indexOid, bool is_internal);
  
! extern Oid	make_new_heap(Oid OIDOldHeap, Oid NewTableSpace);
  extern void finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
  				 bool is_system_catalog,
  				 bool swap_toast_by_content,
--- 25,32 ----
  						   bool recheck, LOCKMODE lockmode);
  extern void mark_index_clustered(Relation rel, Oid indexOid, bool is_internal);
  
! extern Oid make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
! 			  LOCKMODE lockmode);
  extern void finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
  				 bool is_system_catalog,
  				 bool swap_toast_by_content,
*** a/src/include/commands/matview.h
--- b/src/include/commands/matview.h
***************
*** 27,30 **** extern void ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString
--- 27,32 ----
  
  extern DestReceiver *CreateTransientRelDestReceiver(Oid oid);
  
+ extern bool MatViewIncrementalMaintenanceIsEnabled(void);
+ 
  #endif   /* MATVIEW_H */
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 2472,2477 **** typedef struct CreateTableAsStmt
--- 2472,2478 ----
  typedef struct RefreshMatViewStmt
  {
  	NodeTag		type;
+ 	bool		concurrent;		/* allow concurrent access? */
  	bool		skipData;		/* true for WITH NO DATA */
  	RangeVar   *relation;		/* relation to insert into */
  } RefreshMatViewStmt;
*** a/src/test/regress/expected/matview.out
--- b/src/test/regress/expected/matview.out
***************
*** 73,78 **** SELECT * FROM tvm;
--- 73,80 ----
  
  CREATE MATERIALIZED VIEW tmm AS SELECT sum(totamt) AS grandtot FROM tm;
  CREATE MATERIALIZED VIEW tvmm AS SELECT sum(totamt) AS grandtot FROM tvm;
+ CREATE UNIQUE INDEX tvmm_expr ON tvmm ((grandtot > 0));
+ CREATE UNIQUE INDEX tvmm_pred ON tvmm (grandtot) WHERE grandtot < 0;
  CREATE VIEW tvv AS SELECT sum(totamt) AS grandtot FROM tv;
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tvvm AS SELECT * FROM tvv;
***************
*** 141,146 **** ALTER MATERIALIZED VIEW tvm SET SCHEMA mvschema;
--- 143,151 ----
    Column  |  Type   | Modifiers | Storage | Stats target | Description 
  ----------+---------+-----------+---------+--------------+-------------
   grandtot | numeric |           | main    |              | 
+ Indexes:
+     "tvmm_expr" UNIQUE, btree ((grandtot > 0::numeric))
+     "tvmm_pred" UNIQUE, btree (grandtot) WHERE grandtot < 0::numeric
  View definition:
   SELECT sum(tvm.totamt) AS grandtot
     FROM mvschema.tvm;
***************
*** 177,183 **** SELECT * FROM tvm ORDER BY type;
   z    |     11
  (3 rows)
  
! REFRESH MATERIALIZED VIEW tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
   type | totamt 
--- 182,188 ----
   z    |     11
  (3 rows)
  
! REFRESH MATERIALIZED VIEW CONCURRENTLY tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
   type | totamt 
***************
*** 237,242 **** SELECT * FROM tvvm;
--- 242,249 ----
  (1 row)
  
  REFRESH MATERIALIZED VIEW tmm;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm;
+ ERROR:  concurrent refresh requires a unique index on just columns for all rows of the materialized view
  REFRESH MATERIALIZED VIEW tvmm;
  REFRESH MATERIALIZED VIEW tvvm;
  EXPLAIN (costs off)
***************
*** 281,286 **** SELECT * FROM tvvm;
--- 288,296 ----
  -- test diemv when the mv does not exist
  DROP MATERIALIZED VIEW IF EXISTS no_such_mv;
  NOTICE:  materialized view "no_such_mv" does not exist, skipping
+ -- make sure invalid comination of options is prohibited
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm WITH NO DATA;
+ ERROR:  CONCURRENTLY and WITH NO DATA options cannot be used together
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
   type | mtot | vtot 
*** a/src/test/regress/sql/matview.sql
--- b/src/test/regress/sql/matview.sql
***************
*** 29,34 **** CREATE MATERIALIZED VIEW tvm AS SELECT * FROM tv ORDER BY type;
--- 29,36 ----
  SELECT * FROM tvm;
  CREATE MATERIALIZED VIEW tmm AS SELECT sum(totamt) AS grandtot FROM tm;
  CREATE MATERIALIZED VIEW tvmm AS SELECT sum(totamt) AS grandtot FROM tvm;
+ CREATE UNIQUE INDEX tvmm_expr ON tvmm ((grandtot > 0));
+ CREATE UNIQUE INDEX tvmm_pred ON tvmm (grandtot) WHERE grandtot < 0;
  CREATE VIEW tvv AS SELECT sum(totamt) AS grandtot FROM tv;
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tvvm AS SELECT * FROM tvv;
***************
*** 57,63 **** INSERT INTO t VALUES (6, 'z', 13);
  -- confirm pre- and post-refresh contents of fairly simple materialized views
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
! REFRESH MATERIALIZED VIEW tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
--- 59,65 ----
  -- confirm pre- and post-refresh contents of fairly simple materialized views
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
! REFRESH MATERIALIZED VIEW CONCURRENTLY tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
***************
*** 74,79 **** SELECT * FROM tmm;
--- 76,82 ----
  SELECT * FROM tvmm;
  SELECT * FROM tvvm;
  REFRESH MATERIALIZED VIEW tmm;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm;
  REFRESH MATERIALIZED VIEW tvmm;
  REFRESH MATERIALIZED VIEW tvvm;
  EXPLAIN (costs off)
***************
*** 89,94 **** SELECT * FROM tvvm;
--- 92,100 ----
  -- test diemv when the mv does not exist
  DROP MATERIALIZED VIEW IF EXISTS no_such_mv;
  
+ -- make sure invalid comination of options is prohibited
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm WITH NO DATA;
+ 
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
  
#2Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#1)
Re: refresh materialized view concurrently

On 14 June 2013 17:05, Kevin Grittner <kgrittn@ymail.com> wrote:

Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.

Is there a reason to keep the non-concurrent behaviour? Anybody that
wants to explicitly lock should just run a LOCK statement. Can't we
treat behaviour when fully locked as an optimisation, so we can just
do the right thing without extra thought and keywords?

It is my hope to get this committed during this CF to allow me to
focus on incremental maintenance for the rest of the release cycle.

Incremental maintenance will be very straightforward using the logical
changeset extraction code Andres is working on. Having two parallel
mechanisms for changeset extraction in one release seems like a waste
of time. Especially when one is known to be better than the other
already.

Given that we also want to do concurrent CLUSTER and ALTER TABLE ...
SET TABLESPACE using changeset extraction I think its time that
discussion happened on hackers.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Kevin Grittner
kgrittn@ymail.com
In reply to: Simon Riggs (#2)
Re: refresh materialized view concurrently

Simon Riggs <simon@2ndQuadrant.com> wrote:

On 14 June 2013 17:05, Kevin Grittner <kgrittn@ymail.com> wrote:

Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY
for 9.4 CF1.  The goal of this patch is to allow a refresh
without interfering with concurrent reads, using transactional
semantics.

Is there a reason to keep the non-concurrent behaviour?

Yes.  For initial population, truncation, and replacement of
contents when more than a small percentage of rows are affected.

Anybody that wants to explicitly lock should just run a LOCK
statement. Can't we treat behaviour when fully locked as an
optimisation, so we can just do the right thing without extra
thought and keywords?

Are you suggesting that we use heap replacement or DML depending on
what heavyweight locks held when the statement is executed?

It is my hope to get this committed during this CF to allow me
to focus on incremental maintenance for the rest of the release
cycle.

Incremental maintenance will be very straightforward using the
logical changeset extraction code Andres is working on.

At most, changeset extraction will help with obtaining the initial
delta for the base relations, which is less than 5% of what needs
doing for incremental maintenance of materialized views.  If it
looks like a good fit, of course I'll use it.

Having two parallel mechanisms for changeset extraction in one
release seems like a waste of time.

I haven't looked in depth at what technique to use for capturing
the base relation deltas.  The changeset extraction technique is
something to consider, for sure.  I have a lot of work left to see
whether it works for this.  In particular, to handle all requested
timings, it would need to have low enough latency to provide a
delta during the completion of each DML statement, to support
requests for "eager" maintenance of a materialized view -- where
the transaction which just changed the base relation would see the
effect if they queried the matview.  That may not be the something
to try to tackle in this release, but there are people who want it,
and I would prefer to pick a technique which didn't have a latency
high enough to make that impractical.  That's not to say that I
know that to be a problem for using the changeset extraction
technique for this -- just that I haven't gotten to the point of
evaluating that.

Especially when one is known to be better than the other already.

What is the hypothetical technique you're arguing is inferior?  For
my own part, I haven't gotten beyond the phase of knowing that to
meet all requests for the feature, it would need to be available at
about the same point that AFTER EACH STATEMENT triggers fire, but
that it should not involve any user-written triggers.  Have you
implemented something similar to what you think I might be
considering?  Do you have benchmark results?  Can you share
details?

Given that we also want to do concurrent CLUSTER and ALTER TABLE
... SET TABLESPACE using changeset extraction I think its time
that discussion happened on hackers.

No objections to that here; but please don't hijack this thread for
that discussion.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#3)
Re: refresh materialized view concurrently

On 17 June 2013 00:43, Kevin Grittner <kgrittn@ymail.com> wrote:

Especially when one is known to be better than the other already.

What is the hypothetical technique you're arguing is inferior? For
my own part, I haven't gotten beyond the phase of knowing that to
meet all requests for the feature, it would need to be available at
about the same point that AFTER EACH STATEMENT triggers fire, but
that it should not involve any user-written triggers. Have you
implemented something similar to what you think I might be
considering? Do you have benchmark results? Can you share
details?

Recording the changeset required by replication is known to be more
efficient using WAL based extraction than using triggers. WAL writes
are effectively free and using WAL concentrates the reads to avoid
random I/O in large databases. That would be the most suitable
approach for continuously updated matviews, or frequently updates.

Extraction using multiple snapshots is also possible, using a
technique similar to "concurrently" mechanism. That would require
re-scanning the whole table which might be overkill depending upon the
number of changes. That would work for reasonably infrequent updates.

Given that we also want to do concurrent CLUSTER and ALTER TABLE
... SET TABLESPACE using changeset extraction I think its time
that discussion happened on hackers.

No objections to that here; but please don't hijack this thread for
that discussion.

There are multiple features all requiring efficient change set
extraction. It seems extremely relevant to begin discussing what that
mechanism might be in each case, so we don't develop 2 or even 3
different ones while everybody ignores each other. As you said, we
should be helping each other and working together, and I agree.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Kevin Grittner (#1)
Re: refresh materialized view concurrently

On 14.06.2013 19:05, Kevin Grittner wrote:

Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.

It is my hope to get this committed during this CF to allow me to
focus on incremental maintenance for the rest of the release cycle.

I must say this seems a bit pointless on its own. But if it's a stepping
stone to incremental maintenance, I have no objections.

I didn't need to touch very much outside of matview-specific files
for this. My biggest concern is that I needed two small functions
which did *exactly* what some static functions in ri_triggers.c
were doing and couldn't see where the best place to share them from
was. For the moment I just duplicated them, but my hope would be
that they could be put in a suitable location and called from both
places, rather than duplicating the 30-some lines of code. The
function signatures are:

void quoteOneName(char *buffer, const char *name)
void quoteRelationName(char *buffer, Relation rel)

I'd just use quote_identifier and quote_qualified_identifier instead.

I didn't understand this error message:

+ 	if (!foundUniqueIndex)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ 				 errmsg("concurrent refresh requires a unique index on just 
columns for all rows of the materialized view")));
+

What does that mean?

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Kevin Grittner
kgrittn@ymail.com
In reply to: Simon Riggs (#4)
Re: refresh materialized view concurrently

Simon Riggs <simon@2ndQuadrant.com> wrote:

There are multiple features all requiring efficient change set
extraction. It seems extremely relevant to begin discussing what
that mechanism might be in each case

Changeset extraction has nothing to do with this patch, and cannot
possibly be useful for it.  Please keep discussion which is
completely unrelated to this patch off this thread.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#5)
Re: refresh materialized view concurrently

On 17 June 2013 12:13, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

On 14.06.2013 19:05, Kevin Grittner wrote:

Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.

It is my hope to get this committed during this CF to allow me to
focus on incremental maintenance for the rest of the release cycle.

I must say this seems a bit pointless on its own. But if it's a stepping
stone to incremental maintenance, I have no objections.

There are generally 4 kinds of mat view

1. Transactionally updated
2. Incremental update, eventually consistent
3. Incremental update, regular refresh
4. Full refresh

At the moment we only have type 4 and it holds a full lock while it
runs. We definitely need a CONCURRENTLY option and this is it.

Implementing the other types won't invalidate what we currently have,
so this makes sense to me.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#6)
Re: refresh materialized view concurrently

On 17 June 2013 13:15, Kevin Grittner <kgrittn@ymail.com> wrote:

Simon Riggs <simon@2ndQuadrant.com> wrote:

There are multiple features all requiring efficient change set
extraction. It seems extremely relevant to begin discussing what
that mechanism might be in each case

Changeset extraction has nothing to do with this patch, and cannot
possibly be useful for it. Please keep discussion which is
completely unrelated to this patch off this thread.

Kevin,

You mentioned "incremental maintenance" in your original post and I
have been discussing it. Had you not mentioned it, I doubt I would
have thought of it.

But since you did mention it, and its clearly an important issue, it
seems relevant to have discussed it here and now.

I'm happy to wait for you to start the thread discussing it directly though.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Kevin Grittner
kgrittn@ymail.com
In reply to: Heikki Linnakangas (#5)
Re: refresh materialized view concurrently

Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

On 14.06.2013 19:05, Kevin Grittner wrote:

Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY
for 9.4 CF1.  The goal of this patch is to allow a refresh
without interfering with concurrent reads, using transactional
semantics.

It is my hope to get this committed during this CF to allow me
to focus on incremental maintenance for the rest of the release
cycle.

I must say this seems a bit pointless on its own.

I completely disagree.  When I read what people were posting about
the materialized view creation that went into 9.3, there were many
comments by people that they can't use it until the materialized
views can be refreshed without blocking readers.  There is a clear
need for this.  It doesn't do much to advance incremental
maintenance, but it is a much smaller patch which will make
matviews usable by a lot of people who can't use the initial
feature set.

I didn't understand this error message:

+     if (!foundUniqueIndex)
+         ereport(ERROR,
+                 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                  errmsg("concurrent refresh requires a unique index on just columns for all rows of the materialized view")));
+

What does that mean?

It means that the REFRESH MATERIALIZED VIEW CONCURRENTLY command
cannot be used on a materialized view unless it has at least one
UNIQUE index which is not partial (i.e., there is no WHERE clause)
and is not indexing on an expression (i.e., the index is entirely
on bare column names).  Set logic to do the "diff" is hard to get
right if the tables are not proper sets (i.e., they contain
duplicate rows).  I can see at least three ways it *could* be done,
but all of them are much more complex and significantly slower.
With a UNIQUE index on some set of columns in all rows the correct
guarantees exist to use fast set logic.  It isn't that it's needed
for access; it is needed to provide a guarantee that there is no
row without NULLs that exactly duplicates another row.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Kevin Grittner
kgrittn@ymail.com
In reply to: Simon Riggs (#8)
Re: refresh materialized view concurrently

Simon Riggs <simon@2ndQuadrant.com> wrote:

Kevin Grittner <kgrittn@ymail.com> wrote:

Changeset extraction has nothing to do with this patch, and
cannot possibly be useful for it.  Please keep discussion which
is completely unrelated to this patch off this thread.

You mentioned "incremental maintenance" in your original post and
I have been discussing it. Had you not mentioned it, I doubt I
would have thought of it.

But since you did mention it, and its clearly an important issue,
it seems relevant to have discussed it here and now.

What I said was that I wanted to get this out of the way before I
started working on incremental maintenance.

I'm happy to wait for you to start the thread discussing it
directly though.

Cool.

-Kevin

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Nicolas Barbier
nicolas.barbier@gmail.com
In reply to: Heikki Linnakangas (#5)
Re: refresh materialized view concurrently

2013/6/17 Heikki Linnakangas <hlinnakangas@vmware.com>:

+ errmsg("concurrent refresh requires a
unique index on just columns for all rows of the materialized view")));

Maybe my english is failing me here, but I don’t understand the “just” part.

Nicolas

--
A. Because it breaks the logical sequence of discussion.
Q. Why is top posting bad?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Kevin Grittner
kgrittn@ymail.com
In reply to: Nicolas Barbier (#11)
Re: refresh materialized view concurrently

Nicolas Barbier <nicolas.barbier@gmail.com> wrote:

2013/6/17 Heikki Linnakangas <hlinnakangas@vmware.com>:

+                                errmsg("concurrent refresh requires a
unique index on just columns for all rows of the materialized view")));

Maybe my english is failing me here, but I don’t understand the “just” part.

It means that the index must not use any expressions in the list of
what it's indexing on -- only column names.  Suggestions for better
wording would be welcome.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Robert Haas
robertmhaas@gmail.com
In reply to: Kevin Grittner (#12)
Re: refresh materialized view concurrently

On Mon, Jun 17, 2013 at 11:21 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

Nicolas Barbier <nicolas.barbier@gmail.com> wrote:

2013/6/17 Heikki Linnakangas <hlinnakangas@vmware.com>:

+ errmsg("concurrent refresh requires a
unique index on just columns for all rows of the materialized view")));

Maybe my english is failing me here, but I don’t understand the “just” part.

It means that the index must not use any expressions in the list of
what it's indexing on -- only column names. Suggestions for better
wording would be welcome.

Random idea:

ERROR: materialized view \"%s\" does not have a unique key

Perhaps augmented with:

HINT: Create a UNIQUE btree index with no WHERE clause on one or more
columns of the materialized view.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Josh Berkus
josh@agliodbs.com
In reply to: Kevin Grittner (#1)
Re: refresh materialized view concurrently

On 06/17/2013 04:13 AM, Heikki Linnakangas wrote:

On 14.06.2013 19:05, Kevin Grittner wrote:

Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.

It is my hope to get this committed during this CF to allow me to
focus on incremental maintenance for the rest of the release cycle.

I must say this seems a bit pointless on its own. But if it's a stepping
stone to incremental maintenance, I have no objections.

Actually, CONCURRENTLY was cited as the #1 deficiency for the matview
feature for 9.3. With it, the use-case for Postgres matviews is
broadened considerably. So it's very valuable on its own, even if (for
example) INCREMENTAL didn't get done for 9.3.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Kevin Grittner (#1)
Re: refresh materialized view concurrently

On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.

I spent a few hours to review the patch.

As far as I can tell, the overall approach is as follows.

- create a new temp heap as non-concurrent does, but with ExclusiveLock on
the matview, so that reader wouldn't be blocked
- with this temp table and the matview, query FULL JOIN and extract
difference between original matview and temp heap (via SPI)
- this operation requires unique index for performance reason (or
correctness reason too)
- run ANALYZE on this diff table
- run UPDATE, INSERT and DELETE via SPI, to do the merge
- close these temp heap

If I don't miss something, the requirement for the CONCURRENTLY option is
to allow simple SELECT reader to read the matview concurrently while the
view is populating the new data, and INSERT/UPDATE/DELETE and SELECT FOR
UPDATE/SHARE are still blocked. So, I wonder why it is not possible just
to acquire ExclusiveLock on the matview while populating the data and swap
the relfile by taking small AccessExclusiveLock. This lock escalation is
no dead lock hazard, I suppose, because concurrent operation would block
the other at the point ExclusiveLock is acquired, and ExclusiveLock
conflicts AccessExclusiveLock. Then you don't need the complicated SPI
logic or unique key index dependency.

Assuming I'm asking something wrong and going for the current approach,
here's what I've found in the patch.

- I'm not sure if unique key index requirement is reasonable or not,
because users only add CONCURRENTLY option and not merging or incremental
update.
- This could be an overflow in diffname buffer.
+     quoteRelationName(tempname, tempRel);
+     strcpy(diffname, tempname);
+     strcpy(diffname + strlen(diffname) - 1, "_2\"");
- As others pointed out, quoteOneName can be replaced with quote_identifier
- This looks harmless, but I wonder if you need to change relkind.
*** 665,672 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
                                            OldHeap->rd_rel->relowner,
                                            OldHeapDesc,
                                            NIL,
!                                           OldHeap->rd_rel->relkind,
!                                           OldHeap->rd_rel->relpersistence,
                                            false,
                                            RelationIsMapped(OldHeap),
                                            true,
--- 680,687 ----
                                            OldHeap->rd_rel->relowner,
                                            OldHeapDesc,
                                            NIL,
!                                           RELKIND_RELATION,
!                                           relpersistence,
                                            false,
                                            RelationIsMapped(OldHeap),
                                            true,

Since OldHeap->rd_rel->relkind has been working with 'm', too, not only 'r'?
- I found two additional parameters on make_new_heap ugly, but couldn't
come up with better solution. Maybe we can pass Relation of old heap to
the function instead of Oid..

That's about it from me.

Thanks,
--
Hitoshi Harada

#16Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Hitoshi Harada (#15)
Re: refresh materialized view concurrently

On Fri, Jun 21, 2013 at 2:20 AM, Hitoshi Harada <umi.tanuki@gmail.com>wrote:

On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.

I spent a few hours to review the patch.

Oh, BTW, though it is not part of this patch, but I came across this.

! /*
! * We're not using materialized views in the system catalogs.
! */
Assert(!IsSystemRelation(matviewRel));

Of course we don't have builtin matview on system catalog, but it is
possible to create such one by allow_system_table_mods=true, so Assert
doesn't look like good to me.

Thanks,
--
Hitoshi Harada

#17Andres Freund
andres@2ndquadrant.com
In reply to: Hitoshi Harada (#16)
Re: refresh materialized view concurrently

On 2013-06-21 02:43:23 -0700, Hitoshi Harada wrote:

On Fri, Jun 21, 2013 at 2:20 AM, Hitoshi Harada <umi.tanuki@gmail.com>wrote:

On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.

I spent a few hours to review the patch.

Oh, BTW, though it is not part of this patch, but I came across this.

! /*
! * We're not using materialized views in the system catalogs.
! */
Assert(!IsSystemRelation(matviewRel));

Of course we don't have builtin matview on system catalog, but it is
possible to create such one by allow_system_table_mods=true, so Assert
doesn't look like good to me.

Anybody using allow_system_table_mods gets to keep the pieces. There are
so many ways to break just about everything things using it that I don't
think worrying much makes sense.
If you want you could replace that by an elog(), but...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Robert Haas
robertmhaas@gmail.com
In reply to: Hitoshi Harada (#15)
Re: refresh materialized view concurrently

On Fri, Jun 21, 2013 at 5:20 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:

If I don't miss something, the requirement for the CONCURRENTLY option is to
allow simple SELECT reader to read the matview concurrently while the view
is populating the new data, and INSERT/UPDATE/DELETE and SELECT FOR
UPDATE/SHARE are still blocked. So, I wonder why it is not possible just to
acquire ExclusiveLock on the matview while populating the data and swap the
relfile by taking small AccessExclusiveLock. This lock escalation is no
dead lock hazard, I suppose, because concurrent operation would block the
other at the point ExclusiveLock is acquired, and ExclusiveLock conflicts
AccessExclusiveLock. Then you don't need the complicated SPI logic or
unique key index dependency.

This is no good. One, all lock upgrades are deadlock hazards. In
this case, that plays out as follows: suppose that the session running
REFRESH MATERIALIZED VIEW CONCURRENTLY also holds a lock on something
else. Some other process takes an AccessShareLock on the materialized
view and then tries to take a conflicting lock on the other object.
Kaboom, deadlock. Granted, the chances of that happening in practice
are small, but it IS the reason why we typically try to having
long-running operations perform lock upgrades. Users get really
annoyed when their DDL runs for an hour and then rolls back.

Two, until we get MVCC catalog scans, it's not safe to update any
system catalog tuple without an AccessExclusiveLock on some locktag
that will prevent concurrent catalog scans for that tuple. Under
SnapshotNow semantics, concurrent readers can fail to see that the
object is present at all, leading to mysterious failures - especially
if some of the object's catalog scans are seen and others are missed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Kevin Grittner
kgrittn@ymail.com
In reply to: Hitoshi Harada (#15)
1 attachment(s)
Re: refresh materialized view concurrently

Hitoshi Harada <umi.tanuki@gmail.com> wrote:

I spent a few hours to review the patch.

Thanks!

As far as I can tell, the overall approach is as follows.

- create a new temp heap as non-concurrent does, but with
ExclusiveLock on the matview, so that reader wouldn't be blocked

Non-concurrent creates the heap in the matview's tablespace and
namespace, so the "temp" part is different in concurrent
generation.  This difference is why concurrent can be faster when
few rows change.

Also, before the next step there is an ANALYZE of the temp table,
so the planner can make good choices in the next step.

- with this temp table and the matview, query FULL JOIN and
extract difference between original matview and temp heap (via SPI)

Right; into another temp table.

- this operation requires unique index for performance reason (or
correctness reason too)

It is primarily for correctness in the face of duplicate rows which
have no nulls.  Do you think the reasons need to be better
documented with comments?

- run ANALYZE on this diff table

- run UPDATE, INSERT and DELETE via SPI, to do the merge

- close these temp heap

Right, then drop them.

Assuming I'm asking something wrong and going for the current
approach, here's what I've found in the patch.

- I'm not sure if unique key index requirement is reasonable or
not, because users only add CONCURRENTLY option and not merging
or incremental update.

The patch would need to be about an order of magnitude more complex
without that requirement due to the problems handling duplicate
rows.  This patch uses set logic, and set logic falls down badly in
the face of duplicate rows.  Consider how you would try to make
this technique work if the "old" data has 3 versions of a row and
the "new" data in the temp table has 4 versions of that same row.
You can play around with that by modifying the examples of the
logic using regular tables I included with the first version of the
patch.  A UNIQUE index is the only way to prevent duplicate rows.
The fact that there can be duplicates with NULL in one or more of
the columns which are part of a duplicate index is not a fatal
problem; it just means that those cases much be handled through
DELETE and re-INSERT of the rows containing NULL in any column
which is part of a duplicate index key.

- As others pointed out, quoteOneName can be replaced with
quote_identifier

OK, I did it that way.  The quote_identifier and
quote_qualified_identifier functions seem rather clumsy for the
case where you already know you have an identifier (because you are
dealing with an open Relation).  I think we should have different
functions for that case, but that should probably not be material
for this patch, so changed to use the existing tools.

- This looks harmless, but I wonder if you need to change relkind.

*** 665,672 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
                                             OldHeap->rd_rel->relowner,
                                             OldHeapDesc,
                                             NIL,
!                                           OldHeap->rd_rel->relkind,
!                                           OldHeap->rd_rel->relpersistence,
                                             false,
                                             RelationIsMapped(OldHeap),
                                             true,
--- 680,687 ----
                                             OldHeap->rd_rel->relowner,
                                             OldHeapDesc,
                                             NIL,
!                                           RELKIND_RELATION,
!                                           relpersistence,
                                             false,
                                             RelationIsMapped(OldHeap),
                                             true,

Since OldHeap->rd_rel->relkind has been working with 'm', too,
not only 'r'?

No, the relation created by this is not going to be around when
we're done; we're either going to move its heap onto the existing
matview or drop the temp table.  Either way, it's OK for it to be a
table until we do.  I don't see the benefit of complicating the
code to do otherwise.

- I found two additional parameters on make_new_heap ugly, but
couldn't come up with better solution.  Maybe we can pass
Relation of old heap to the function instead of Oid..

This was the cleanest way I could see.  Opening the relation
elsewhere and passing it in would not only be uglier IMO, it would
do nothing to tell the function that it should create a temporary
table.

I also modified the confusing error message to something close to
the suggestion from Robert.

What to do about the Assert that the matview is not a system
relation seems like material for a separate patch.  After review,
I'm inclined to remove the test altogether, so that extensions can
create matviews in pg_catalog.

New version attached.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

refresh-concurrently-v2.patchtext/x-diff; name=refresh-concurrently-v2.patchDownload
*** a/doc/src/sgml/mvcc.sgml
--- b/doc/src/sgml/mvcc.sgml
***************
*** 928,935 **** ERROR:  could not serialize access due to read/write dependencies among transact
          </para>
  
          <para>
!          This lock mode is not automatically acquired on tables by any
!          <productname>PostgreSQL</productname> command.
          </para>
         </listitem>
        </varlistentry>
--- 928,934 ----
          </para>
  
          <para>
!          Acquired by <command>REFRESH MATERIALIZED VIEW CONCURRENTLY</command>.
          </para>
         </listitem>
        </varlistentry>
*** a/doc/src/sgml/ref/refresh_materialized_view.sgml
--- b/doc/src/sgml/ref/refresh_materialized_view.sgml
***************
*** 21,27 **** PostgreSQL documentation
  
   <refsynopsisdiv>
  <synopsis>
! REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
      [ WITH [ NO ] DATA ]
  </synopsis>
   </refsynopsisdiv>
--- 21,27 ----
  
   <refsynopsisdiv>
  <synopsis>
! REFRESH MATERIALIZED VIEW [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
      [ WITH [ NO ] DATA ]
  </synopsis>
   </refsynopsisdiv>
***************
*** 38,43 **** REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
--- 38,47 ----
     data is generated and the materialized view is left in an unscannable
     state.
    </para>
+   <para>
+    <literal>CONCURRENTLY</literal> and <literal>WITH NO DATA</literal> may not
+    be specified together.
+   </para>
   </refsect1>
  
   <refsect1>
***************
*** 45,50 **** REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
--- 49,82 ----
  
    <variablelist>
     <varlistentry>
+     <term><literal>CONCURRENTLY</literal></term>
+     <listitem>
+      <para>
+       Refresh the materialized view without locking out concurrent selects on
+       the materialized view.  Without this option a refresh which affects a
+       lot of rows will tend to use fewer resources and complete more quickly,
+       but could block other connections which are trying to read from the
+       materialized view.  This option may be faster in cases where a small
+       number of rows are affected.
+      </para>
+      <para>
+       This option is only allowed if there is at least one
+       <literal>UNIQUE</literal> index on the materialized view which uses only
+       column names and includes all rows; that is, it must not index on any
+       expressions nor include a <literal>WHERE</literal> clause.
+      </para>
+      <para>
+       This option may not be used when the materialized view is not already
+       populated.
+      </para>
+      <para>
+       Even with this option only one <literal>REFRESH</literal> at a time may
+       run against any one materialized view.
+      </para>
+     </listitem>
+    </varlistentry>
+ 
+    <varlistentry>
      <term><replaceable class="PARAMETER">name</replaceable></term>
      <listitem>
       <para>
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
***************
*** 583,589 **** rebuild_relation(Relation OldHeap, Oid indexOid,
  	heap_close(OldHeap, NoLock);
  
  	/* Create the transient table that will receive the re-ordered data */
! 	OIDNewHeap = make_new_heap(tableOid, tableSpace);
  
  	/* Copy the heap data into the new table in the desired order */
  	copy_heap_data(OIDNewHeap, tableOid, indexOid,
--- 583,590 ----
  	heap_close(OldHeap, NoLock);
  
  	/* Create the transient table that will receive the re-ordered data */
! 	OIDNewHeap = make_new_heap(tableOid, tableSpace, false,
! 							   AccessExclusiveLock);
  
  	/* Copy the heap data into the new table in the desired order */
  	copy_heap_data(OIDNewHeap, tableOid, indexOid,
***************
*** 610,616 **** rebuild_relation(Relation OldHeap, Oid indexOid,
   * data, then call finish_heap_swap to complete the operation.
   */
  Oid
! make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  {
  	TupleDesc	OldHeapDesc;
  	char		NewHeapName[NAMEDATALEN];
--- 611,618 ----
   * data, then call finish_heap_swap to complete the operation.
   */
  Oid
! make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
! 			  LOCKMODE lockmode)
  {
  	TupleDesc	OldHeapDesc;
  	char		NewHeapName[NAMEDATALEN];
***************
*** 620,627 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  	HeapTuple	tuple;
  	Datum		reloptions;
  	bool		isNull;
  
! 	OldHeap = heap_open(OIDOldHeap, AccessExclusiveLock);
  	OldHeapDesc = RelationGetDescr(OldHeap);
  
  	/*
--- 622,631 ----
  	HeapTuple	tuple;
  	Datum		reloptions;
  	bool		isNull;
+ 	Oid			namespaceid;
+ 	char		relpersistence;
  
! 	OldHeap = heap_open(OIDOldHeap, lockmode);
  	OldHeapDesc = RelationGetDescr(OldHeap);
  
  	/*
***************
*** 642,647 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
--- 646,662 ----
  	if (isNull)
  		reloptions = (Datum) 0;
  
+ 	if (forcetemp)
+ 	{
+ 		namespaceid = LookupCreationNamespace("pg_temp");
+ 		relpersistence = RELPERSISTENCE_TEMP;
+ 	}
+ 	else
+ 	{
+ 		namespaceid = RelationGetNamespace(OldHeap);
+ 		relpersistence = OldHeap->rd_rel->relpersistence;
+ 	}
+ 
  	/*
  	 * Create the new heap, using a temporary name in the same namespace as
  	 * the existing table.	NOTE: there is some risk of collision with user
***************
*** 657,663 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  	snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap);
  
  	OIDNewHeap = heap_create_with_catalog(NewHeapName,
! 										  RelationGetNamespace(OldHeap),
  										  NewTableSpace,
  										  InvalidOid,
  										  InvalidOid,
--- 672,678 ----
  	snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap);
  
  	OIDNewHeap = heap_create_with_catalog(NewHeapName,
! 										  namespaceid,
  										  NewTableSpace,
  										  InvalidOid,
  										  InvalidOid,
***************
*** 665,672 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  										  OldHeap->rd_rel->relowner,
  										  OldHeapDesc,
  										  NIL,
! 										  OldHeap->rd_rel->relkind,
! 										  OldHeap->rd_rel->relpersistence,
  										  false,
  										  RelationIsMapped(OldHeap),
  										  true,
--- 680,687 ----
  										  OldHeap->rd_rel->relowner,
  										  OldHeapDesc,
  										  NIL,
! 										  RELKIND_RELATION,
! 										  relpersistence,
  										  false,
  										  RelationIsMapped(OldHeap),
  										  true,
*** a/src/backend/commands/matview.c
--- b/src/backend/commands/matview.c
***************
*** 20,36 ****
--- 20,43 ----
  #include "catalog/catalog.h"
  #include "catalog/indexing.h"
  #include "catalog/namespace.h"
+ #include "catalog/pg_operator.h"
  #include "commands/cluster.h"
  #include "commands/matview.h"
  #include "commands/tablecmds.h"
+ #include "commands/tablespace.h"
  #include "executor/executor.h"
+ #include "executor/spi.h"
  #include "miscadmin.h"
+ #include "parser/parse_relation.h"
  #include "rewrite/rewriteHandler.h"
  #include "storage/smgr.h"
  #include "tcop/tcopprot.h"
+ #include "utils/builtins.h"
+ #include "utils/lsyscache.h"
  #include "utils/rel.h"
  #include "utils/snapmgr.h"
  #include "utils/syscache.h"
+ #include "utils/typcache.h"
  
  
  typedef struct
***************
*** 44,49 **** typedef struct
--- 51,58 ----
  	BulkInsertState bistate;	/* bulk insert state */
  } DR_transientrel;
  
+ static int	matview_maintenance_depth = 0;
+ 
  static void transientrel_startup(DestReceiver *self, int operation, TupleDesc typeinfo);
  static void transientrel_receive(TupleTableSlot *slot, DestReceiver *self);
  static void transientrel_shutdown(DestReceiver *self);
***************
*** 51,56 **** static void transientrel_destroy(DestReceiver *self);
--- 60,74 ----
  static void refresh_matview_datafill(DestReceiver *dest, Query *query,
  						 const char *queryString);
  
+ static char *make_temptable_name_n(char *tempname, int n);
+ static void mv_GenerateOper(StringInfo buf, Oid opoid);
+ 
+ static void refresh_by_match_merge(Oid matviewOid, Oid tempOid);
+ static void refresh_by_heap_swap(Oid matviewOid, Oid OIDNewHeap);
+ 
+ static void OpenMatViewIncrementalMaintenance(void);
+ static void CloseMatViewIncrementalMaintenance(void);
+ 
  /*
   * SetMatViewPopulatedState
   *		Mark a materialized view as populated, or not.
***************
*** 125,136 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  	Oid			tableSpace;
  	Oid			OIDNewHeap;
  	DestReceiver *dest;
  
  	/*
  	 * Get a lock until end of transaction.
  	 */
  	matviewOid = RangeVarGetRelidExtended(stmt->relation,
! 										  AccessExclusiveLock, false, false,
  										  RangeVarCallbackOwnsTable, NULL);
  	matviewRel = heap_open(matviewOid, NoLock);
  
--- 143,160 ----
  	Oid			tableSpace;
  	Oid			OIDNewHeap;
  	DestReceiver *dest;
+ 	bool		concurrent;
+ 	LOCKMODE	lockmode;
+ 
+ 	/* Determine strength of lock needed. */
+ 	concurrent = stmt->concurrent;
+ 	lockmode = concurrent ? ExclusiveLock : AccessExclusiveLock;
  
  	/*
  	 * Get a lock until end of transaction.
  	 */
  	matviewOid = RangeVarGetRelidExtended(stmt->relation,
! 										  lockmode, false, false,
  										  RangeVarCallbackOwnsTable, NULL);
  	matviewRel = heap_open(matviewOid, NoLock);
  
***************
*** 141,151 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  				 errmsg("\"%s\" is not a materialized view",
  						RelationGetRelationName(matviewRel))));
  
! 	/*
! 	 * We're not using materialized views in the system catalogs.
! 	 */
  	Assert(!IsSystemRelation(matviewRel));
  
  	Assert(!matviewRel->rd_rel->relhasoids);
  
  	/*
--- 165,186 ----
  				 errmsg("\"%s\" is not a materialized view",
  						RelationGetRelationName(matviewRel))));
  
! 	/* Check that CONCURRENTLY is not specified if not populated. */
! 	if (concurrent && !RelationIsPopulated(matviewRel))
! 		ereport(ERROR,
! 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
! 				 errmsg("CONCURRENTLY cannot be used when the materialized view is not populated")));
! 
! 	/* Check that conflicting options have not been specified. */
! 	if (concurrent && stmt->skipData)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_SYNTAX_ERROR),
! 				 errmsg("CONCURRENTLY and WITH NO DATA options cannot be used together")));
! 
! 	/* We're not using materialized views in the system catalogs. */
  	Assert(!IsSystemRelation(matviewRel));
  
+ 	/* We don't allow an oid column for a materialized view. */
  	Assert(!matviewRel->rd_rel->relhasoids);
  
  	/*
***************
*** 197,222 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  	 */
  	SetMatViewPopulatedState(matviewRel, !stmt->skipData);
  
! 	tableSpace = matviewRel->rd_rel->reltablespace;
  
  	heap_close(matviewRel, NoLock);
  
  	/* Create the transient table that will receive the regenerated data. */
! 	OIDNewHeap = make_new_heap(matviewOid, tableSpace);
  	dest = CreateTransientRelDestReceiver(OIDNewHeap);
  
  	/* Generate the data, if wanted. */
  	if (!stmt->skipData)
  		refresh_matview_datafill(dest, dataQuery, queryString);
  
! 	/*
! 	 * Swap the physical files of the target and transient tables, then
! 	 * rebuild the target's indexes and throw away the transient table.
! 	 */
! 	finish_heap_swap(matviewOid, OIDNewHeap, false, false, true, true,
! 					 RecentXmin, ReadNextMultiXactId());
! 
! 	RelationCacheInvalidateEntry(matviewOid);
  }
  
  /*
--- 232,259 ----
  	 */
  	SetMatViewPopulatedState(matviewRel, !stmt->skipData);
  
! 	/* Concurrent refresh builds new data in temp tablespace, and does diff. */
! 	if (concurrent)
! 		tableSpace = GetDefaultTablespace(RELPERSISTENCE_TEMP);
! 	else
! 		tableSpace = matviewRel->rd_rel->reltablespace;
  
  	heap_close(matviewRel, NoLock);
  
  	/* Create the transient table that will receive the regenerated data. */
! 	OIDNewHeap = make_new_heap(matviewOid, tableSpace, concurrent,
! 							   ExclusiveLock);
  	dest = CreateTransientRelDestReceiver(OIDNewHeap);
  
  	/* Generate the data, if wanted. */
  	if (!stmt->skipData)
  		refresh_matview_datafill(dest, dataQuery, queryString);
  
! 	/* Make the matview match the newly generated data. */
! 	if (concurrent)
! 		refresh_by_match_merge(matviewOid, OIDNewHeap);
! 	else
! 		refresh_by_heap_swap(matviewOid, OIDNewHeap);
  }
  
  /*
***************
*** 369,371 **** transientrel_destroy(DestReceiver *self)
--- 406,759 ----
  {
  	pfree(self);
  }
+ 
+ 
+ /*
+  * Given a qualified temporary table name, append an underscore followed by
+  * the given integer, to make a new table name based on the old one.
+  *
+  * This leaks memory through palloc(), which won't be cleaned up until the
+  * current memory memory context is freed.
+  */
+ static char *
+ make_temptable_name_n(char *tempname, int n)
+ {
+ 	StringInfoData namebuf;
+ 
+ 	initStringInfo(&namebuf);
+ 	appendStringInfoString(&namebuf, tempname);
+ 	appendStringInfo(&namebuf, "_%i", n);
+ 	return namebuf.data;
+ }
+ 
+ static void
+ mv_GenerateOper(StringInfo buf, Oid opoid)
+ {
+ 	HeapTuple	opertup;
+ 	Form_pg_operator operform;
+ 
+ 	opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(opoid));
+ 	if (!HeapTupleIsValid(opertup))
+ 		elog(ERROR, "cache lookup failed for operator %u", opoid);
+ 	operform = (Form_pg_operator) GETSTRUCT(opertup);
+ 	Assert(operform->oprkind == 'b');
+ 
+ 	appendStringInfo(buf, "OPERATOR(%s.%s)",
+ 					 quote_identifier(get_namespace_name(operform->oprnamespace)),
+ 					 NameStr(operform->oprname));
+ 
+ 	ReleaseSysCache(opertup);
+ }
+ 
+ /*
+  * refresh_by_match_merge
+  *
+  * Refresh a materialized view with transactional semantics, while allowing
+  * concurrent reads.
+  *
+  * This is called after a new version of the data has been created in a
+  * temporary table.  It performs a full outer join against the old version of
+  * the data, producing "diff" results.	This join cannot work if there are any
+  * duplicated rows in either the old or new versions, in the sense that every
+  * column would compare as equal between the two rows.	It does work correctly
+  * in the face of rows which have at least one NULL value, with all non-NULL
+  * columns equal.  The behavior of NULLs on equality tests and on UNIQUE
+  * indexes turns out to be quite convenient here; the tests we need to make
+  * are consistent with default behavior.  If there is at least one UNIQUE
+  * index on the materialized view, we have exactly the guarantee we need.  By
+  * joining based on equality on all columns which are part of any unique
+  * index, we identify the rows on which we can use UPDATE without any problem.
+  * If any column is NULL in either the old or new version of a row (or both),
+  * we must use DELETE and INSERT, since there could be multiple rows which are
+  * NOT DISTINCT FROM each other, and we could otherwise end up with the wrong
+  * number of occurrences in the updated relation.  The temporary table used to
+  * hold the diff results contains just the TID of the old record (if matched)
+  * and the ROW from the new table as a single column of complex record type
+  * (if matched).
+  *
+  * Once we have the diff table, we perform set-based DELETE, UPDATE, and
+  * INSERT operations against the materialized view, and discard both temporary
+  * tables.
+  *
+  * Everything from the generation of the new data to applying the differences
+  * takes place under cover of an ExclusiveLock, since it seems as though we
+  * would want to prohibit not only concurrent REFRESH operations, but also
+  * incremental maintenance.  It also doesn't seem reasonable or safe to allow
+  * SELECT FOR UPDATE or SELECT FOR SHARE on rows being updated or deleted by
+  * this command.
+  */
+ static void
+ refresh_by_match_merge(Oid matviewOid, Oid tempOid)
+ {
+ 	StringInfoData querybuf;
+ 	Relation	matviewRel;
+ 	Relation	tempRel;
+ 	char	   *matviewname;
+ 	char	   *tempname;
+ 	char	   *diffname;
+ 	TupleDesc	tupdesc;
+ 	bool		foundUniqueIndex;
+ 	List	   *indexoidlist;
+ 	ListCell   *indexoidscan;
+ 	int16		relnatts;
+ 	bool	   *usedForQual;
+ 
+ 	initStringInfo(&querybuf);
+ 	matviewRel = heap_open(matviewOid, NoLock);
+ 	matviewname = quote_qualified_identifier(get_namespace_name(RelationGetNamespace(matviewRel)),
+ 											  RelationGetRelationName(matviewRel));
+ 	tempRel = heap_open(tempOid, NoLock);
+ 	tempname = quote_qualified_identifier(get_namespace_name(RelationGetNamespace(tempRel)),
+ 										   RelationGetRelationName(tempRel));
+ 	diffname = make_temptable_name_n(tempname, 2);
+ 
+ 	relnatts = matviewRel->rd_rel->relnatts;
+ 	usedForQual = (bool *) palloc0(sizeof(bool) * relnatts);
+ 
+ 	/* Open SPI context. */
+ 	if (SPI_connect() != SPI_OK_CONNECT)
+ 		elog(ERROR, "SPI_connect failed");
+ 
+ 	/* Analyze the temp table with the new contents. */
+ 	appendStringInfo(&querybuf, "ANALYZE %s", tempname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Start building the query for creating the diff table. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "CREATE TEMP TABLE %s AS "
+ 					 "SELECT x.ctid AS tid, y FROM %s x FULL JOIN %s y ON (",
+ 					 diffname, matviewname, tempname);
+ 
+ 	/*
+ 	 * Get the list of index OIDs for the table from the relcache, and look up
+ 	 * each one in the pg_index syscache.  We will test for equality on all
+ 	 * columns present in all unique indexes which only reference columns and
+ 	 * include all rows.
+ 	 */
+ 	tupdesc = matviewRel->rd_att;
+ 	foundUniqueIndex = false;
+ 	indexoidlist = RelationGetIndexList(matviewRel);
+ 
+ 	foreach(indexoidscan, indexoidlist)
+ 	{
+ 		Oid			indexoid = lfirst_oid(indexoidscan);
+ 		HeapTuple	indexTuple;
+ 		Form_pg_index index;
+ 
+ 		indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexoid));
+ 		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
+ 			elog(ERROR, "cache lookup failed for index %u", indexoid);
+ 		index = (Form_pg_index) GETSTRUCT(indexTuple);
+ 
+ 		/* We're only interested if it is unique and valid. */
+ 		if (index->indisunique && IndexIsValid(index))
+ 		{
+ 			int			numatts = index->indnatts;
+ 			int			i;
+ 			bool		expr = false;
+ 			Relation	indexRel;
+ 
+ 			/* Skip any index on an expression. */
+ 			for (i = 0; i < numatts; i++)
+ 			{
+ 				if (index->indkey.values[i] == 0)
+ 				{
+ 					expr = true;
+ 					break;
+ 				}
+ 			}
+ 			if (expr)
+ 			{
+ 				ReleaseSysCache(indexTuple);
+ 				continue;
+ 			}
+ 
+ 			/*
+ 			 * Skip partial indexes.  We count on the ExclusiveLock on the
+ 			 * heap to keep things stable while we check this.
+ 			 */
+ 			indexRel = index_open(index->indexrelid, NoLock);
+ 			if (indexRel->rd_indpred != NIL)
+ 			{
+ 				index_close(indexRel, NoLock);
+ 				ReleaseSysCache(indexTuple);
+ 				continue;
+ 			}
+ 			index_close(indexRel, NoLock);
+ 
+ 			/* Add quals for all columns from this index. */
+ 			for (i = 0; i < numatts; i++)
+ 			{
+ 				int			attnum = index->indkey.values[i];
+ 				Oid			type;
+ 				Oid			op;
+ 				const char	   *colname;
+ 
+ 				/*
+ 				 * Only include the column once regardless of how many times
+ 				 * it shows up in how many indexes.
+ 				 *
+ 				 * This is also useful later to omit columns which can not
+ 				 * have changed from the SET clause of the UPDATE statement.
+ 				 */
+ 				if (usedForQual[attnum - 1])
+ 					continue;
+ 				usedForQual[attnum - 1] = true;
+ 
+ 				/*
+ 				 * Actually add the qual, ANDed with any others.
+ 				 */
+ 				if (foundUniqueIndex)
+ 					appendStringInfoString(&querybuf, " AND ");
+ 
+ 				colname = quote_identifier(NameStr((tupdesc->attrs[attnum - 1])->attname));
+ 				appendStringInfo(&querybuf, "y.%s ", colname);
+ 				type = attnumTypeId(matviewRel, attnum);
+ 				op = lookup_type_cache(type, TYPECACHE_EQ_OPR)->eq_opr;
+ 				mv_GenerateOper(&querybuf, op);
+ 				appendStringInfo(&querybuf, " x.%s", colname);
+ 
+ 				foundUniqueIndex = true;
+ 			}
+ 		}
+ 		ReleaseSysCache(indexTuple);
+ 	}
+ 
+ 	list_free(indexoidlist);
+ 
+ 	if (!foundUniqueIndex)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ 				 errmsg("cannot refresh materialized view \"%s\" concurrently",
+ 						matviewname),
+ 				 errhint("Create a UNIQUE index with no WHERE clause on one or more columns of the materialized view.")));
+ 
+ 	appendStringInfoString(&querybuf,
+ 						   ") WHERE (y.*) IS DISTINCT FROM (x.*)"
+ 						   " ORDER BY tid");
+ 
+ 	/* Create the temporary "diff" table. */
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/*
+ 	 * We have no further use for data from the "full-data" temp table, but we
+ 	 * must keep it around because its type is reference from the diff table.
+ 	 */
+ 
+ 	/* Analyze the diff table. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "ANALYZE %s", diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	OpenMatViewIncrementalMaintenance();
+ 
+ 	/* Deletes must come before inserts; do them first. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "DELETE FROM %s WHERE ctid IN "
+ 					 "(SELECT d.tid FROM %s d "
+ 					 "WHERE d.tid IS NOT NULL "
+ 					 "AND (d.y) IS NOT DISTINCT FROM NULL)",
+ 					 matviewname, diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_DELETE)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Updates before inserts gives a better chance at HOT updates. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "UPDATE %s x SET ", matviewname);
+ 
+ 	{
+ 		int			i;
+ 		bool		needComma = false;
+ 
+ 		for (i = 0; i < tupdesc->natts; i++)
+ 		{
+ 			const char	   *colname;
+ 
+ 			if (tupdesc->attrs[i]->attisdropped)
+ 				continue;
+ 
+ 			if (usedForQual[i])
+ 				continue;
+ 
+ 			if (needComma)
+ 				appendStringInfoString(&querybuf, ", ");
+ 			needComma = true;
+ 
+ 			colname = quote_identifier(NameStr((tupdesc->attrs[i])->attname));
+ 			appendStringInfo(&querybuf, "%s = (d.y).%s", colname, colname);
+ 		}
+ 	}
+ 
+ 	appendStringInfo(&querybuf,
+ 					 " FROM %s d WHERE d.tid IS NOT NULL AND x.ctid = d.tid",
+ 					 diffname);
+ 
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UPDATE)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Inserts go last. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "INSERT INTO %s SELECT (y).* FROM %s WHERE tid IS NULL",
+ 					 matviewname, diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_INSERT)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* We're done maintaining the materialized view. */
+ 	CloseMatViewIncrementalMaintenance();
+ 	heap_close(tempRel, NoLock);
+ 	heap_close(matviewRel, NoLock);
+ 
+ 	/* Clean up temp tables. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "DROP TABLE %s, %s", diffname, tempname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Close SPI context. */
+ 	if (SPI_finish() != SPI_OK_FINISH)
+ 		elog(ERROR, "SPI_finish failed");
+ }
+ 
+ /*
+  * Swap the physical files of the target and transient tables, then rebuild
+  * the target's indexes and throw away the transient table.
+  */
+ static void
+ refresh_by_heap_swap(Oid matviewOid, Oid OIDNewHeap)
+ {
+ 	finish_heap_swap(matviewOid, OIDNewHeap, false, false, true, true,
+ 					 RecentXmin, ReadNextMultiXactId());
+ 
+ 	RelationCacheInvalidateEntry(matviewOid);
+ }
+ 
+ static void
+ OpenMatViewIncrementalMaintenance(void)
+ {
+ 	matview_maintenance_depth++;
+ }
+ 
+ static void
+ CloseMatViewIncrementalMaintenance(void)
+ {
+ 	matview_maintenance_depth--;
+ 	Assert(matview_maintenance_depth >= 0);
+ }
+ 
+ /*
+  * This should be used to test whether the backend is in a context where it is
+  * OK to allow DML statements to modify materialized views.  We only want to
+  * allow that for internal code driven by the materialized view definition,
+  * not for arbitrary user-supplied code.
+  */
+ bool
+ MatViewIncrementalMaintenanceIsEnabled(void)
+ {
+ 	return matview_maintenance_depth > 0;
+ }
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***************
*** 3529,3535 **** ATRewriteTables(List **wqueue, LOCKMODE lockmode)
  			heap_close(OldHeap, NoLock);
  
  			/* Create transient table that will receive the modified data */
! 			OIDNewHeap = make_new_heap(tab->relid, NewTableSpace);
  
  			/*
  			 * Copy the heap data into the new table with the desired
--- 3529,3536 ----
  			heap_close(OldHeap, NoLock);
  
  			/* Create transient table that will receive the modified data */
! 			OIDNewHeap = make_new_heap(tab->relid, NewTableSpace, false,
! 									   AccessExclusiveLock);
  
  			/*
  			 * Copy the heap data into the new table with the desired
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
***************
*** 42,47 ****
--- 42,48 ----
  #include "access/transam.h"
  #include "access/xact.h"
  #include "catalog/namespace.h"
+ #include "commands/matview.h"
  #include "commands/trigger.h"
  #include "executor/execdebug.h"
  #include "foreign/fdwapi.h"
***************
*** 999,1008 **** CheckValidResultRel(Relation resultRel, CmdType operation)
  			}
  			break;
  		case RELKIND_MATVIEW:
! 			ereport(ERROR,
! 					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
! 					 errmsg("cannot change materialized view \"%s\"",
! 							RelationGetRelationName(resultRel))));
  			break;
  		case RELKIND_FOREIGN_TABLE:
  			/* Okay only if the FDW supports it */
--- 1000,1010 ----
  			}
  			break;
  		case RELKIND_MATVIEW:
! 			if (!MatViewIncrementalMaintenanceIsEnabled())
! 				ereport(ERROR,
! 						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
! 						 errmsg("cannot change materialized view \"%s\"",
! 								RelationGetRelationName(resultRel))));
  			break;
  		case RELKIND_FOREIGN_TABLE:
  			/* Okay only if the FDW supports it */
*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
***************
*** 950,956 **** ExecModifyTable(ModifyTableState *node)
  				bool		isNull;
  
  				relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 				if (relkind == RELKIND_RELATION)
  				{
  					datum = ExecGetJunkAttribute(slot,
  												 junkfilter->jf_junkAttNo,
--- 950,956 ----
  				bool		isNull;
  
  				relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 				if (relkind == RELKIND_RELATION || relkind == RELKIND_MATVIEW)
  				{
  					datum = ExecGetJunkAttribute(slot,
  												 junkfilter->jf_junkAttNo,
***************
*** 1280,1286 **** ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
  					char		relkind;
  
  					relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 					if (relkind == RELKIND_RELATION)
  					{
  						j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
  						if (!AttributeNumberIsValid(j->jf_junkAttNo))
--- 1280,1287 ----
  					char		relkind;
  
  					relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 					if (relkind == RELKIND_RELATION ||
! 						relkind == RELKIND_MATVIEW)
  					{
  						j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
  						if (!AttributeNumberIsValid(j->jf_junkAttNo))
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 3241,3246 **** _copyRefreshMatViewStmt(const RefreshMatViewStmt *from)
--- 3241,3247 ----
  {
  	RefreshMatViewStmt *newnode = makeNode(RefreshMatViewStmt);
  
+ 	COPY_SCALAR_FIELD(concurrent);
  	COPY_SCALAR_FIELD(skipData);
  	COPY_NODE_FIELD(relation);
  
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 1521,1526 **** _equalCreateTableAsStmt(const CreateTableAsStmt *a, const CreateTableAsStmt *b)
--- 1521,1527 ----
  static bool
  _equalRefreshMatViewStmt(const RefreshMatViewStmt *a, const RefreshMatViewStmt *b)
  {
+ 	COMPARE_SCALAR_FIELD(concurrent);
  	COMPARE_SCALAR_FIELD(skipData);
  	COMPARE_NODE_FIELD(relation);
  
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 3284,3294 **** OptNoLog:	UNLOGGED					{ $$ = RELPERSISTENCE_UNLOGGED; }
   *****************************************************************************/
  
  RefreshMatViewStmt:
! 			REFRESH MATERIALIZED VIEW qualified_name opt_with_data
  				{
  					RefreshMatViewStmt *n = makeNode(RefreshMatViewStmt);
! 					n->relation = $4;
! 					n->skipData = !($5);
  					$$ = (Node *) n;
  				}
  		;
--- 3284,3295 ----
   *****************************************************************************/
  
  RefreshMatViewStmt:
! 			REFRESH MATERIALIZED VIEW opt_concurrently qualified_name opt_with_data
  				{
  					RefreshMatViewStmt *n = makeNode(RefreshMatViewStmt);
! 					n->concurrent = $4;
! 					n->relation = $5;
! 					n->skipData = !($6);
  					$$ = (Node *) n;
  				}
  		;
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
***************
*** 2871,2876 **** psql_completion(char *text, int start, int end)
--- 2871,2882 ----
  	else if (pg_strcasecmp(prev3_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev2_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev_wd, "VIEW") == 0)
+ 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_matviews,
+ 								   " UNION SELECT 'CONCURRENTLY'");
+ 	else if (pg_strcasecmp(prev4_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
  		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_matviews, NULL);
  	else if (pg_strcasecmp(prev4_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev3_wd, "MATERIALIZED") == 0 &&
***************
*** 2879,2884 **** psql_completion(char *text, int start, int end)
--- 2885,2895 ----
  	else if (pg_strcasecmp(prev5_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev4_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev3_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "CONCURRENTLY") == 0)
+ 		COMPLETE_WITH_CONST("WITH DATA");
+ 	else if (pg_strcasecmp(prev5_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "VIEW") == 0 &&
  			 pg_strcasecmp(prev_wd, "WITH") == 0)
  	{
  		static const char *const list_WITH_DATA[] =
***************
*** 2889,2894 **** psql_completion(char *text, int start, int end)
--- 2900,2911 ----
  	else if (pg_strcasecmp(prev6_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev5_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev4_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "CONCURRENTLY") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "WITH") == 0)
+ 		COMPLETE_WITH_CONST("DATA");
+ 	else if (pg_strcasecmp(prev6_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev5_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "VIEW") == 0 &&
  			 pg_strcasecmp(prev2_wd, "WITH") == 0 &&
  			 pg_strcasecmp(prev_wd, "NO") == 0)
  		COMPLETE_WITH_CONST("DATA");
*** a/src/include/commands/cluster.h
--- b/src/include/commands/cluster.h
***************
*** 25,31 **** extern void check_index_is_clusterable(Relation OldHeap, Oid indexOid,
  						   bool recheck, LOCKMODE lockmode);
  extern void mark_index_clustered(Relation rel, Oid indexOid, bool is_internal);
  
! extern Oid	make_new_heap(Oid OIDOldHeap, Oid NewTableSpace);
  extern void finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
  				 bool is_system_catalog,
  				 bool swap_toast_by_content,
--- 25,32 ----
  						   bool recheck, LOCKMODE lockmode);
  extern void mark_index_clustered(Relation rel, Oid indexOid, bool is_internal);
  
! extern Oid make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
! 			  LOCKMODE lockmode);
  extern void finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
  				 bool is_system_catalog,
  				 bool swap_toast_by_content,
*** a/src/include/commands/matview.h
--- b/src/include/commands/matview.h
***************
*** 27,30 **** extern void ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString
--- 27,32 ----
  
  extern DestReceiver *CreateTransientRelDestReceiver(Oid oid);
  
+ extern bool MatViewIncrementalMaintenanceIsEnabled(void);
+ 
  #endif   /* MATVIEW_H */
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 2472,2477 **** typedef struct CreateTableAsStmt
--- 2472,2478 ----
  typedef struct RefreshMatViewStmt
  {
  	NodeTag		type;
+ 	bool		concurrent;		/* allow concurrent access? */
  	bool		skipData;		/* true for WITH NO DATA */
  	RangeVar   *relation;		/* relation to insert into */
  } RefreshMatViewStmt;
*** a/src/test/regress/expected/matview.out
--- b/src/test/regress/expected/matview.out
***************
*** 73,78 **** SELECT * FROM tvm;
--- 73,80 ----
  
  CREATE MATERIALIZED VIEW tmm AS SELECT sum(totamt) AS grandtot FROM tm;
  CREATE MATERIALIZED VIEW tvmm AS SELECT sum(totamt) AS grandtot FROM tvm;
+ CREATE UNIQUE INDEX tvmm_expr ON tvmm ((grandtot > 0));
+ CREATE UNIQUE INDEX tvmm_pred ON tvmm (grandtot) WHERE grandtot < 0;
  CREATE VIEW tvv AS SELECT sum(totamt) AS grandtot FROM tv;
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tvvm AS SELECT * FROM tvv;
***************
*** 141,146 **** ALTER MATERIALIZED VIEW tvm SET SCHEMA mvschema;
--- 143,151 ----
    Column  |  Type   | Modifiers | Storage | Stats target | Description 
  ----------+---------+-----------+---------+--------------+-------------
   grandtot | numeric |           | main    |              | 
+ Indexes:
+     "tvmm_expr" UNIQUE, btree ((grandtot > 0::numeric))
+     "tvmm_pred" UNIQUE, btree (grandtot) WHERE grandtot < 0::numeric
  View definition:
   SELECT sum(tvm.totamt) AS grandtot
     FROM mvschema.tvm;
***************
*** 177,183 **** SELECT * FROM tvm ORDER BY type;
   z    |     11
  (3 rows)
  
! REFRESH MATERIALIZED VIEW tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
   type | totamt 
--- 182,188 ----
   z    |     11
  (3 rows)
  
! REFRESH MATERIALIZED VIEW CONCURRENTLY tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
   type | totamt 
***************
*** 237,242 **** SELECT * FROM tvvm;
--- 242,250 ----
  (1 row)
  
  REFRESH MATERIALIZED VIEW tmm;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm;
+ ERROR:  cannot refresh materialized view "public.tvmm" concurrently
+ HINT:  Create a UNIQUE index with no WHERE clause on one or more columns of the materialized view.
  REFRESH MATERIALIZED VIEW tvmm;
  REFRESH MATERIALIZED VIEW tvvm;
  EXPLAIN (costs off)
***************
*** 281,286 **** SELECT * FROM tvvm;
--- 289,297 ----
  -- test diemv when the mv does not exist
  DROP MATERIALIZED VIEW IF EXISTS no_such_mv;
  NOTICE:  materialized view "no_such_mv" does not exist, skipping
+ -- make sure invalid comination of options is prohibited
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm WITH NO DATA;
+ ERROR:  CONCURRENTLY and WITH NO DATA options cannot be used together
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
   type | mtot | vtot 
*** a/src/test/regress/sql/matview.sql
--- b/src/test/regress/sql/matview.sql
***************
*** 29,34 **** CREATE MATERIALIZED VIEW tvm AS SELECT * FROM tv ORDER BY type;
--- 29,36 ----
  SELECT * FROM tvm;
  CREATE MATERIALIZED VIEW tmm AS SELECT sum(totamt) AS grandtot FROM tm;
  CREATE MATERIALIZED VIEW tvmm AS SELECT sum(totamt) AS grandtot FROM tvm;
+ CREATE UNIQUE INDEX tvmm_expr ON tvmm ((grandtot > 0));
+ CREATE UNIQUE INDEX tvmm_pred ON tvmm (grandtot) WHERE grandtot < 0;
  CREATE VIEW tvv AS SELECT sum(totamt) AS grandtot FROM tv;
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tvvm AS SELECT * FROM tvv;
***************
*** 57,63 **** INSERT INTO t VALUES (6, 'z', 13);
  -- confirm pre- and post-refresh contents of fairly simple materialized views
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
! REFRESH MATERIALIZED VIEW tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
--- 59,65 ----
  -- confirm pre- and post-refresh contents of fairly simple materialized views
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
! REFRESH MATERIALIZED VIEW CONCURRENTLY tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
***************
*** 74,79 **** SELECT * FROM tmm;
--- 76,82 ----
  SELECT * FROM tvmm;
  SELECT * FROM tvvm;
  REFRESH MATERIALIZED VIEW tmm;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm;
  REFRESH MATERIALIZED VIEW tvmm;
  REFRESH MATERIALIZED VIEW tvvm;
  EXPLAIN (costs off)
***************
*** 89,94 **** SELECT * FROM tvvm;
--- 92,100 ----
  -- test diemv when the mv does not exist
  DROP MATERIALIZED VIEW IF EXISTS no_such_mv;
  
+ -- make sure invalid comination of options is prohibited
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm WITH NO DATA;
+ 
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
  
#20Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Robert Haas (#18)
Re: refresh materialized view concurrently

On Tue, Jun 25, 2013 at 9:07 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Jun 21, 2013 at 5:20 AM, Hitoshi Harada <umi.tanuki@gmail.com>
wrote:

If I don't miss something, the requirement for the CONCURRENTLY option

is to

allow simple SELECT reader to read the matview concurrently while the

view

is populating the new data, and INSERT/UPDATE/DELETE and SELECT FOR
UPDATE/SHARE are still blocked. So, I wonder why it is not possible

just to

acquire ExclusiveLock on the matview while populating the data and swap

the

relfile by taking small AccessExclusiveLock. This lock escalation is no
dead lock hazard, I suppose, because concurrent operation would block the
other at the point ExclusiveLock is acquired, and ExclusiveLock conflicts
AccessExclusiveLock. Then you don't need the complicated SPI logic or
unique key index dependency.

This is no good. One, all lock upgrades are deadlock hazards. In
this case, that plays out as follows: suppose that the session running
REFRESH MATERIALIZED VIEW CONCURRENTLY also holds a lock on something
else. Some other process takes an AccessShareLock on the materialized
view and then tries to take a conflicting lock on the other object.
Kaboom, deadlock. Granted, the chances of that happening in practice
are small, but it IS the reason why we typically try to having
long-running operations perform lock upgrades. Users get really
annoyed when their DDL runs for an hour and then rolls back.

OK, that' not safe. What I was thinking was something similar to
compare-and-swap, where the whole operation is atomic under an
AccessExclusiveLock. What if we release ExclusiveLock once a new matview
was created and re-acquire AccessExclusiveLock before trying swap? Note
matview is a little different from index which I know people are talking
about in REINDEX CONCURRENTLY thread, in that the content of matview does
not change incrementally (at least at this point), but only does change
fully in swapping operation by the same REFRESH MATERIALIZED VIEW command.
The only race condition is between releasing Exclusive lock and re-acquire
AccessExclusiveLock someone else can go ahead with the same operation and
could create another one. If it happens, let's abort us, because I guess
that's the way our transaction system is working anyway; in case of unique
key index insertion for example, if I find another guy is inserting the
same value in the index, I wait for the other guy to finish his work and if
his transaction commits I give up, otherwise I go ahead. Maybe it's
annoying if an hour operation finally gets aborted, but my purpose is
actually achieved by the other guy. If the primary goal of this feature is
let reader reads the matview concurrently it should be ok?

Hmm, but in such cases the first guy is always win and the second guy who
may come an hour later loses so we cannot get the result from the latest
command... I still wonder there should be some way.

Two, until we get MVCC catalog scans, it's not safe to update any
system catalog tuple without an AccessExclusiveLock on some locktag
that will prevent concurrent catalog scans for that tuple. Under
SnapshotNow semantics, concurrent readers can fail to see that the
object is present at all, leading to mysterious failures - especially
if some of the object's catalog scans are seen and others are missed.

So what I'm saying above is take AccessExclusiveLock on swapping relfile

in catalog. This doesn't violate your statement, I suppose. I'm actually
still skeptical about MVCC catalog, because even if you can make catalog
lookup MVCC, relfile on the filesystem is not MVCC. If session 1 changes
relfilenode in pg_class and commit transaction, delete the old relfile from
the filesystem, but another concurrent session 2 that just took a snapshot
before 1 made such change keeps running and tries to open this relation,
grabbing the old relfile and open it from filesystem -- ERROR: relfile not
found. So everyone actually needs to see up-to-date information that
synchronizes with what filesystem says and that's SnapshotNow. In my
experimental thought above about compare-and-swap way, in compare phase he
needs to see the most recent valid information, otherwise he never thinks
someone did something new. Since I haven't read the whole thread, maybe we
have already discussed about it, but it would help if you clarify this
concern.

Thanks,
--
Hitoshi Harada

#21Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Kevin Grittner (#19)
Re: refresh materialized view concurrently

On Wed, Jun 26, 2013 at 1:38 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

Hitoshi Harada <umi.tanuki@gmail.com> wrote:

I spent a few hours to review the patch.

Thanks!

As far as I can tell, the overall approach is as follows.

- create a new temp heap as non-concurrent does, but with
ExclusiveLock on the matview, so that reader wouldn't be blocked

Non-concurrent creates the heap in the matview's tablespace and
namespace, so the "temp" part is different in concurrent
generation. This difference is why concurrent can be faster when
few rows change.

It's still not clear to me why you need temp in concurrent and not in
non-concurrent. If this type of operations is always creating "temp" table
and just swap it with existing one, why can't we just make it temp always?
And if the performance is the only concern, is temp better than just
turning off WAL for the table or UNLOGGED table?

Also, before the next step there is an ANALYZE of the temp table,
so the planner can make good choices in the next step.

- with this temp table and the matview, query FULL JOIN and
extract difference between original matview and temp heap (via SPI)

Right; into another temp table.

- this operation requires unique index for performance reason (or
correctness reason too)

It is primarily for correctness in the face of duplicate rows which
have no nulls. Do you think the reasons need to be better
documented with comments?

Ah, yes, even after looking at patch I was confused if it was for
performance or correctness. It's a shame we cannot refresh it concurrently
if we have duplicate rows in the matview. I thought it would make sense to
allow it without unique key if it was only performance tradeoffs.

I also modified the confusing error message to something close to
the suggestion from Robert.

What to do about the Assert that the matview is not a system
relation seems like material for a separate patch. After review,
I'm inclined to remove the test altogether, so that extensions can
create matviews in pg_catalog.

I like this better.

New version attached.

Will take another look.

Thanks,
--
Hitoshi Harada

#22Andres Freund
andres@2ndquadrant.com
In reply to: Hitoshi Harada (#20)
Re: refresh materialized view concurrently

On 2013-06-27 00:12:07 -0700, Hitoshi Harada wrote:

Two, until we get MVCC catalog scans, it's not safe to update any
system catalog tuple without an AccessExclusiveLock on some locktag
that will prevent concurrent catalog scans for that tuple. Under
SnapshotNow semantics, concurrent readers can fail to see that the
object is present at all, leading to mysterious failures - especially
if some of the object's catalog scans are seen and others are missed.

So what I'm saying above is take AccessExclusiveLock on swapping relfile

in catalog. This doesn't violate your statement, I suppose. I'm actually
still skeptical about MVCC catalog, because even if you can make catalog
lookup MVCC, relfile on the filesystem is not MVCC. If session 1 changes
relfilenode in pg_class and commit transaction, delete the old relfile from
the filesystem, but another concurrent session 2 that just took a snapshot
before 1 made such change keeps running and tries to open this relation,
grabbing the old relfile and open it from filesystem -- ERROR: relfile not
found.

We can play cute tricks akin to what CREATE INDEX CONCURRENTLY currently
does, i.e. wait for all other relations that could have possibly seen
the old relfilenode (they must have at least a share lock on the
relation) before dropping the actual storage.

The reason we cannot currently do that in most scenarios is that we
cannot perform transactional/mvcc updates of non-exclusively locked
objects due to the SnapshotNow problems of seeing multiple or no
versions of a row during a single scan.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23Kevin Grittner
kgrittn@ymail.com
In reply to: Hitoshi Harada (#21)
Re: refresh materialized view concurrently

Hitoshi Harada <umi.tanuki@gmail.com> wrote:

Kevin Grittner <kgrittn@ymail.com> wrote:

Hitoshi Harada <umi.tanuki@gmail.com> wrote:

As far as I can tell, the overall approach is as follows.

- create a new temp heap as non-concurrent does, but with
ExclusiveLock on the matview, so that reader wouldn't be
blocked

Non-concurrent creates the heap in the matview's tablespace and
namespace, so the "temp" part is different in concurrent
generation.  This difference is why concurrent can be faster
when few rows change.

It's still not clear to me why you need temp in concurrent and
not in non-concurrent.

Well, temp tables can be in an entirely different tablespace, so
you can't just move the heap of a temp table into place as the new
heap for the matview (as we do for a non-concurrent REFRESH) -- at
least not without potentially copying everything an extra time.

For concurrent we are modifying the existing matview heap, and the
freshly generated set of values, as well as the "diff" table, are
just needed, well, temporarily.  That makes the behavior of temp
tables ideal.  Not only are they not logged, they are potentially
placed on faster tablespaces, and the data written to them might
never be written to disk:

http://www.postgresql.org/docs/9.2/interactive/runtime-config-resource.html#GUC-TEMP-BUFFERS

If this type of operations is always creating "temp" table and
just swap it with existing one, why can't we just make it temp
always?

Because of the potentially differrent tablespaces.

And if the performance is the only concern, is temp better than
just turning off WAL for the table or UNLOGGED table?

Yes, it is.

- this operation requires unique index for performance reason
(or correctness reason too)

It is primarily for correctness in the face of duplicate rows
which have no nulls.  Do you think the reasons need to be better
documented with comments?

Ah, yes, even after looking at patch I was confused if it was for
performance or correctness.

This is the part of the function's comment which attempts to
explain the problem.

 * This join cannot work if there are any
 * duplicated rows in either the old or new versions, in the sense that every
 * column would compare as equal between the two rows.  It does work correctly
 * in the face of rows which have at least one NULL value, with all non-NULL
 * columns equal.  The behavior of NULLs on equality tests and on UNIQUE
 * indexes turns out to be quite convenient here; the tests we need to make
 * are consistent with default behavior.  If there is at least one UNIQUE
 * index on the materialized view, we have exactly the guarantee we need.  By
 * joining based on equality on all columns which are part of any unique
 * index, we identify the rows on which we can use UPDATE without any problem.
 * If any column is NULL in either the old or new version of a row (or both),
 * we must use DELETE and INSERT, since there could be multiple rows which are
 * NOT DISTINCT FROM each other, and we could otherwise end up with the wrong
 * number of occurrences in the updated relation.

I'm open to suggestions on better wording.

As an example of the way the full join gets into trouble with
duplicate rows when used for a diff, see this example:

test=# create table old (c1 int, c2 int);
CREATE TABLE
test=# create table new (c1 int, c2 int);
CREATE TABLE
test=# insert into old values
test-#   (1,1),(1,2),(1,2),(1,2),(1,3),(1,null),(1,null);
INSERT 0 7
test=# insert into new values
test-#   (1,1),(1,2),(1,2),(1,2),(1,2),(1,2),(1,4),(1,null),(1,null),(1,null);
INSERT 0 10

At this point it is clear that we need to add two rows with values
(1,2) and we need to wind up with one more row with values (1,null)
than we already have.  We also need to delete (1,3) and add (1,4).
But full join logic fails to get things right for the case of
duplicate rows with no nulls:

test=# select old, new
test-#   from old
test-#   full join new on old.c1 = new.c1 and old.c2 = new.c2
test-#   where (old.*) is distinct from (new.*);
  old  |  new  
-------+-------
 (1,3) |
 (1,)  |
 (1,)  |
       | (1,4)
       | (1,)
       | (1,)
       | (1,)
(7 rows)

It's a shame we cannot refresh it concurrently if we have
duplicate rows in the matview.  I thought it would make sense to
allow it without unique key if it as only performance tradeoffs.

Well, we *could* allow it without a unique index, but the code
would be more complex and significantly slower.  I think we would
still want to handle it the way the current patch does when a
unique index is present, even if we have a way to handle cases
where such an index is not present.  Even if I were convinced it
was worthwhile to support the more general case, I would want to
commit this patch first, and add the more complicated code as a
follow-on patch.

At this point I'm not convinced of the value of supporting
concurrent refresh of materialized views which contain duplicate
rows, nor of the benefit of allowing it to work ten times slower on
matviews without duplicate rows but on which no unique index has
been added, rather than requiring the user to add a unique index if
they want the concurrent refresh.  I'm open to arguments as to why
either of those is a good idea and worth doing ahead of incremental
maintenance of matviews.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#24Kevin Grittner
kgrittn@ymail.com
In reply to: Andres Freund (#22)
Re: refresh materialized view concurrently

Andres Freund <andres@2ndquadrant.com> wrote:

We can play cute tricks akin to what CREATE INDEX CONCURRENTLY currently
does, i.e. wait for all other relations that could have possibly seen
the old relfilenode (they must have at least a share lock on the
relation) before dropping the actual storage.

The reason we cannot currently do that in most scenarios is that we
cannot perform transactional/mvcc updates of non-exclusively locked
objects due to the SnapshotNow problems of seeing multiple or no
versions of a row during a single scan.

Not only would that be slower than the submitted patch in cases
where only a few rows differ, but it could be waiting to swap in
that new heap for an unbounded amount of time.  I think the current
patch will "play nicer" with incremental maintenance than what you
suggest here.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#25Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Hitoshi Harada (#21)
Re: refresh materialized view concurrently

On Thu, Jun 27, 2013 at 12:19 AM, Hitoshi Harada <umi.tanuki@gmail.com>wrote:

On Wed, Jun 26, 2013 at 1:38 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

New version attached.

Will take another look.

Oops!

drop materialized view if exists mv;
drop table if exists foo;
create table foo(a, b) as values(1, 10);
create materialized view mv as select * from foo;
create unique index on mv(a);
insert into foo select * from foo;
refresh materialized view mv;
refresh materialized view concurrently mv;

test=# refresh materialized view mv;
ERROR: could not create unique index "mv_a_idx"
DETAIL: Key (a)=(1) is duplicated.
test=# refresh materialized view concurrently mv;
REFRESH MATERIALIZED VIEW

Here's one more.

create table foo(a, b, c) as values(1, 2, 3);
create materialized view mv as select * from foo;
create unique index on mv (a);
create unique index on mv (b);
create unique index on mv (c);
insert into foo values(2, 3, 4);
insert into foo values(3, 4, 5);
refresh materialized view concurrently mv;

test=# refresh materialized view concurrently mv;
ERROR: syntax error at or near "FROM"
LINE 1: UPDATE public.mv x SET FROM pg_temp_2.pg_temp_16615_2 d WHE...
^
QUERY: UPDATE public.mv x SET FROM pg_temp_2.pg_temp_16615_2 d WHERE
d.tid IS NOT NULL AND x.ctid = d.tid

Other than these, I've found index is opened with NoLock, relying on
ExclusiveLock of parent matview, and ALTER INDEX SET TABLESPACE or
something similar can run concurrently, but it is presumably safe. DROP
INDEX, REINDEX would be blocked by the ExclusiveLock.

--
Hitoshi Harada

#26Robert Haas
robertmhaas@gmail.com
In reply to: Hitoshi Harada (#25)
Re: refresh materialized view concurrently

On Tue, Jul 2, 2013 at 4:02 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:

Other than these, I've found index is opened with NoLock, relying on
ExclusiveLock of parent matview, and ALTER INDEX SET TABLESPACE or something
similar can run concurrently, but it is presumably safe. DROP INDEX,
REINDEX would be blocked by the ExclusiveLock.

I doubt very much that this is safe. And even if it is safe today, I
think it's a bad idea, because we're likely to try to reduce lock
levels in the future. Taking no lock on a relation we're opening,
even an index, seems certain to be a bad idea.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#27Kevin Grittner
kgrittn@ymail.com
In reply to: Robert Haas (#26)
Re: refresh materialized view concurrently

Robert Haas <robertmhaas@gmail.com> wrote:

Hitoshi Harada <umi.tanuki@gmail.com> wrote:

Other than these, I've found index is opened with NoLock,
relying on ExclusiveLock of parent matview, and ALTER INDEX SET
TABLESPACE or something similar can run concurrently, but it is
presumably safe.  DROP INDEX, REINDEX would be blocked by the
ExclusiveLock.

I doubt very much that this is safe.  And even if it is safe
today, I think it's a bad idea, because we're likely to try to
reduce lock levels in the future.  Taking no lock on a relation
we're opening, even an index, seems certain to be a bad idea.

What we're talking about is taking a look at the index definition
while the indexed table involved is covered by an ExclusiveLock.
Why is that more dangerous than inserting entries into an index
without taking a lock on that index while the indexed table is
covered by a RowExclusiveLock, as happens on INSERT?
RowExclusiveLock is a much weaker lock, and we can't add entries to
an index without looking at its definition.  Should we be taking
out locks on every index for every INSERT?  If not, what makes that
safe while merely looking at the definition is too risky?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#27)
Re: refresh materialized view concurrently

Kevin Grittner <kgrittn@ymail.com> writes:

Robert Haas <robertmhaas@gmail.com> wrote:

I doubt very much that this is safe.� And even if it is safe
today, I think it's a bad idea, because we're likely to try to
reduce lock levels in the future.� Taking no lock on a relation
we're opening, even an index, seems certain to be a bad idea.

I'm with Robert on this.

What we're talking about is taking a look at the index definition
while the indexed table involved is covered by an ExclusiveLock.
Why is that more dangerous than inserting entries into an index
without taking a lock on that index while the indexed table is
covered by a RowExclusiveLock, as happens on INSERT?

I don't believe that that happens. If it does, it's a bug. Either the
planner or the executor should be taking a lock on each index touched
by a query.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#29Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#28)
Re: refresh materialized view concurrently

On Wed, Jul 3, 2013 at 10:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Kevin Grittner <kgrittn@ymail.com> writes:

Robert Haas <robertmhaas@gmail.com> wrote:

I doubt very much that this is safe. And even if it is safe
today, I think it's a bad idea, because we're likely to try to
reduce lock levels in the future. Taking no lock on a relation
we're opening, even an index, seems certain to be a bad idea.

I'm with Robert on this.

What we're talking about is taking a look at the index definition
while the indexed table involved is covered by an ExclusiveLock.
Why is that more dangerous than inserting entries into an index
without taking a lock on that index while the indexed table is
covered by a RowExclusiveLock, as happens on INSERT?

I don't believe that that happens. If it does, it's a bug. Either the
planner or the executor should be taking a lock on each index touched
by a query.

It seems Kevin's right. Not sure why that doesn't break.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#29)
Re: refresh materialized view concurrently

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Jul 3, 2013 at 10:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't believe that that happens. If it does, it's a bug. Either the
planner or the executor should be taking a lock on each index touched
by a query.

It seems Kevin's right. Not sure why that doesn't break.

Are we somehow not going through ExecOpenIndices?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#31Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#30)
Re: refresh materialized view concurrently

On Wed, Jul 3, 2013 at 10:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Jul 3, 2013 at 10:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't believe that that happens. If it does, it's a bug. Either the
planner or the executor should be taking a lock on each index touched
by a query.

It seems Kevin's right. Not sure why that doesn't break.

Are we somehow not going through ExecOpenIndices?

I dunno. I just did a quick black-box test:

CREATE TABLE foo (a int primary key);
BEGIN;
INSERT INTO foo VALUES (1);
SELECT relation::regclass, locktype, mode, granted FROM pg_locks;

I get:

relation | locktype | mode | granted
----------+---------------+------------------+---------
pg_locks | relation | AccessShareLock | t
foo | relation | RowExclusiveLock | t
| virtualxid | ExclusiveLock | t
| transactionid | ExclusiveLock | t

No foo_pkey anywhere.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#31)
Re: refresh materialized view concurrently

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Jul 3, 2013 at 10:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Are we somehow not going through ExecOpenIndices?

I dunno. I just did a quick black-box test:

CREATE TABLE foo (a int primary key);
BEGIN;
INSERT INTO foo VALUES (1);
SELECT relation::regclass, locktype, mode, granted FROM pg_locks;

I get:

relation | locktype | mode | granted
----------+---------------+------------------+---------
pg_locks | relation | AccessShareLock | t
foo | relation | RowExclusiveLock | t
| virtualxid | ExclusiveLock | t
| transactionid | ExclusiveLock | t

No foo_pkey anywhere.

That proves nothing, as we don't keep such locks after the query
(and there's no reason to AFAICS). See ExecCloseIndices.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#33Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#32)
Re: refresh materialized view concurrently

On 2013-07-03 11:08:32 -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Jul 3, 2013 at 10:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Are we somehow not going through ExecOpenIndices?

I dunno. I just did a quick black-box test:

CREATE TABLE foo (a int primary key);
BEGIN;
INSERT INTO foo VALUES (1);
SELECT relation::regclass, locktype, mode, granted FROM pg_locks;

I get:

relation | locktype | mode | granted
----------+---------------+------------------+---------
pg_locks | relation | AccessShareLock | t
foo | relation | RowExclusiveLock | t
| virtualxid | ExclusiveLock | t
| transactionid | ExclusiveLock | t

No foo_pkey anywhere.

That proves nothing, as we don't keep such locks after the query
(and there's no reason to AFAICS). See ExecCloseIndices.

Should be easy enough to test by hacking LOCK TABLE to lock indexes and
taking out a conflicting lock (SHARE?) in a second transaction. I wonder
if we shouldn't just generally allow that, I remember relaxing that
check before (when playing with CREATE/DROP CONCURRENTLY afair).

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#34Kevin Grittner
kgrittn@ymail.com
In reply to: Tom Lane (#32)
Re: refresh materialized view concurrently

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Are we somehow not going through ExecOpenIndices?

I dunno.  I just did a quick black-box test:

[ begin; insert; without commit ]

No foo_pkey anywhere.

That proves nothing, as we don't keep such locks after the query
(and there's no reason to AFAICS).  See ExecCloseIndices.

OK.  I had seen that no locks were held after the insert and wasn't
aware that we acquired and then released them for each insert
within a transaction.  On the other hand, we acquire locks on all
indexes even for a HOT UPDATE which uses a seqscan, and hold those
until end of transaction.  Is there a reason for that?

I suppose that since a concurrent refresh is very likely to lock
all these indexes with RowExclusiveLock anyway, as a result of the
DELETE, UPDATE and INSERT statements subsequently issued through
SPI, I might was well acquire that lock when I look at the
definition, and not release it -- so that the subsequent locks are
local to the backend, and therefore faster.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#34)
Re: refresh materialized view concurrently

Kevin Grittner <kgrittn@ymail.com> writes:

OK.� I had seen that no locks were held after the insert and wasn't
aware that we acquired and then released them for each insert
within a transaction.� On the other hand, we acquire locks on all
indexes even for a HOT UPDATE which uses a seqscan, and hold those
until end of transaction.� Is there a reason for that?

Sounds dubious to me; although in the HOT code it might be that there's
no convenient place to release the index locks.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#36Kevin Grittner
kgrittn@ymail.com
In reply to: Tom Lane (#35)
Re: refresh materialized view concurrently

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Kevin Grittner <kgrittn@ymail.com> writes:

we acquire locks on all indexes even for a HOT UPDATE which uses
a seqscan, and hold those until end of transaction.  Is there a
reason for that?

Sounds dubious to me; although in the HOT code it might be that
there's no convenient place to release the index locks.

Further testing shows that any UPDATE or DELETE statement acquires
a RowExclusiveLock on every index on the table and holds it until
end of transaction, whether or not any rows are affected and
regardless of whether an index scan or a seqscan is used.  In fact,
just an EXPLAIN of an UPDATE or DELETE does so.  It is only INSERT
which releases the locks at the end of the statement.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#36)
Re: refresh materialized view concurrently

Kevin Grittner <kgrittn@ymail.com> writes:

Further testing shows that any UPDATE or DELETE statement acquires
a RowExclusiveLock on every index on the table and holds it until
end of transaction, whether or not any rows are affected and
regardless of whether an index scan or a seqscan is used.� In fact,
just an EXPLAIN of an UPDATE or DELETE does so.� It is only INSERT
which releases the locks at the end of the statement.

Hm, possibly the planner is taking those locks. I don't think the
executor's behavior is any different. But the planner only cares about
indexes in SELECT/UPDATE/DELETE, since an INSERT has no interest in
scanning the target table.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#38Kevin Grittner
kgrittn@ymail.com
In reply to: Hitoshi Harada (#25)
1 attachment(s)
Re: refresh materialized view concurrently

Hitoshi Harada <umi.tanuki@gmail.com> wrote:

Oops!

Indeed.  Thanks for the careful testing.

drop materialized view if exists mv;
drop table if exists foo;
create table foo(a, b) as values(1, 10);
create materialized view mv as select * from foo;
create unique index on mv(a);
insert into foo select * from foo;
refresh materialized view mv;
refresh materialized view concurrently mv;

test=# refresh materialized view mv;
ERROR:  could not create unique index "mv_a_idx"
DETAIL:  Key (a)=(1) is duplicated.
test=# refresh materialized view concurrently mv;
REFRESH MATERIALIZED VIEW

Fixed by scanning the temp table for duplicates before generating
the diff:

test=# refresh materialized view concurrently mv;
ERROR:  new data for "mv" contains duplicate rows without any NULL columns
DETAIL:  Row: (1,10)

[ matview with all columns covered by unique indexes fails ]

Fixed.

Other than these, I've found index is opened with NoLock, relying
on ExclusiveLock of parent matview, and ALTER INDEX SET
TABLESPACE or something similar can run concurrently, but it is
presumably safe.  DROP INDEX, REINDEX would be blocked by the
ExclusiveLock.

Since others were also worried that an index definition could be
modified while another process is holding an ExclusiveLock on its
table, I changed this.

New version attached.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

refresh-concurrently-v3.patchtext/x-diff; name=refresh-concurrently-v3.patchDownload
*** a/doc/src/sgml/mvcc.sgml
--- b/doc/src/sgml/mvcc.sgml
***************
*** 928,935 **** ERROR:  could not serialize access due to read/write dependencies among transact
          </para>
  
          <para>
!          This lock mode is not automatically acquired on tables by any
!          <productname>PostgreSQL</productname> command.
          </para>
         </listitem>
        </varlistentry>
--- 928,934 ----
          </para>
  
          <para>
!          Acquired by <command>REFRESH MATERIALIZED VIEW CONCURRENTLY</command>.
          </para>
         </listitem>
        </varlistentry>
*** a/doc/src/sgml/ref/refresh_materialized_view.sgml
--- b/doc/src/sgml/ref/refresh_materialized_view.sgml
***************
*** 21,27 **** PostgreSQL documentation
  
   <refsynopsisdiv>
  <synopsis>
! REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
      [ WITH [ NO ] DATA ]
  </synopsis>
   </refsynopsisdiv>
--- 21,27 ----
  
   <refsynopsisdiv>
  <synopsis>
! REFRESH MATERIALIZED VIEW [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
      [ WITH [ NO ] DATA ]
  </synopsis>
   </refsynopsisdiv>
***************
*** 38,43 **** REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
--- 38,47 ----
     data is generated and the materialized view is left in an unscannable
     state.
    </para>
+   <para>
+    <literal>CONCURRENTLY</literal> and <literal>WITH NO DATA</literal> may not
+    be specified together.
+   </para>
   </refsect1>
  
   <refsect1>
***************
*** 45,50 **** REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
--- 49,82 ----
  
    <variablelist>
     <varlistentry>
+     <term><literal>CONCURRENTLY</literal></term>
+     <listitem>
+      <para>
+       Refresh the materialized view without locking out concurrent selects on
+       the materialized view.  Without this option a refresh which affects a
+       lot of rows will tend to use fewer resources and complete more quickly,
+       but could block other connections which are trying to read from the
+       materialized view.  This option may be faster in cases where a small
+       number of rows are affected.
+      </para>
+      <para>
+       This option is only allowed if there is at least one
+       <literal>UNIQUE</literal> index on the materialized view which uses only
+       column names and includes all rows; that is, it must not index on any
+       expressions nor include a <literal>WHERE</literal> clause.
+      </para>
+      <para>
+       This option may not be used when the materialized view is not already
+       populated.
+      </para>
+      <para>
+       Even with this option only one <literal>REFRESH</literal> at a time may
+       run against any one materialized view.
+      </para>
+     </listitem>
+    </varlistentry>
+ 
+    <varlistentry>
      <term><replaceable class="PARAMETER">name</replaceable></term>
      <listitem>
       <para>
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
***************
*** 589,595 **** rebuild_relation(Relation OldHeap, Oid indexOid,
  	heap_close(OldHeap, NoLock);
  
  	/* Create the transient table that will receive the re-ordered data */
! 	OIDNewHeap = make_new_heap(tableOid, tableSpace);
  
  	/* Copy the heap data into the new table in the desired order */
  	copy_heap_data(OIDNewHeap, tableOid, indexOid,
--- 589,596 ----
  	heap_close(OldHeap, NoLock);
  
  	/* Create the transient table that will receive the re-ordered data */
! 	OIDNewHeap = make_new_heap(tableOid, tableSpace, false,
! 							   AccessExclusiveLock);
  
  	/* Copy the heap data into the new table in the desired order */
  	copy_heap_data(OIDNewHeap, tableOid, indexOid,
***************
*** 616,622 **** rebuild_relation(Relation OldHeap, Oid indexOid,
   * data, then call finish_heap_swap to complete the operation.
   */
  Oid
! make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  {
  	TupleDesc	OldHeapDesc;
  	char		NewHeapName[NAMEDATALEN];
--- 617,624 ----
   * data, then call finish_heap_swap to complete the operation.
   */
  Oid
! make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
! 			  LOCKMODE lockmode)
  {
  	TupleDesc	OldHeapDesc;
  	char		NewHeapName[NAMEDATALEN];
***************
*** 626,633 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  	HeapTuple	tuple;
  	Datum		reloptions;
  	bool		isNull;
  
! 	OldHeap = heap_open(OIDOldHeap, AccessExclusiveLock);
  	OldHeapDesc = RelationGetDescr(OldHeap);
  
  	/*
--- 628,637 ----
  	HeapTuple	tuple;
  	Datum		reloptions;
  	bool		isNull;
+ 	Oid			namespaceid;
+ 	char		relpersistence;
  
! 	OldHeap = heap_open(OIDOldHeap, lockmode);
  	OldHeapDesc = RelationGetDescr(OldHeap);
  
  	/*
***************
*** 648,653 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
--- 652,668 ----
  	if (isNull)
  		reloptions = (Datum) 0;
  
+ 	if (forcetemp)
+ 	{
+ 		namespaceid = LookupCreationNamespace("pg_temp");
+ 		relpersistence = RELPERSISTENCE_TEMP;
+ 	}
+ 	else
+ 	{
+ 		namespaceid = RelationGetNamespace(OldHeap);
+ 		relpersistence = OldHeap->rd_rel->relpersistence;
+ 	}
+ 
  	/*
  	 * Create the new heap, using a temporary name in the same namespace as
  	 * the existing table.	NOTE: there is some risk of collision with user
***************
*** 663,669 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  	snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap);
  
  	OIDNewHeap = heap_create_with_catalog(NewHeapName,
! 										  RelationGetNamespace(OldHeap),
  										  NewTableSpace,
  										  InvalidOid,
  										  InvalidOid,
--- 678,684 ----
  	snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap);
  
  	OIDNewHeap = heap_create_with_catalog(NewHeapName,
! 										  namespaceid,
  										  NewTableSpace,
  										  InvalidOid,
  										  InvalidOid,
***************
*** 671,678 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  										  OldHeap->rd_rel->relowner,
  										  OldHeapDesc,
  										  NIL,
! 										  OldHeap->rd_rel->relkind,
! 										  OldHeap->rd_rel->relpersistence,
  										  false,
  										  RelationIsMapped(OldHeap),
  										  true,
--- 686,693 ----
  										  OldHeap->rd_rel->relowner,
  										  OldHeapDesc,
  										  NIL,
! 										  RELKIND_RELATION,
! 										  relpersistence,
  										  false,
  										  RelationIsMapped(OldHeap),
  										  true,
*** a/src/backend/commands/matview.c
--- b/src/backend/commands/matview.c
***************
*** 20,36 ****
--- 20,43 ----
  #include "catalog/catalog.h"
  #include "catalog/indexing.h"
  #include "catalog/namespace.h"
+ #include "catalog/pg_operator.h"
  #include "commands/cluster.h"
  #include "commands/matview.h"
  #include "commands/tablecmds.h"
+ #include "commands/tablespace.h"
  #include "executor/executor.h"
+ #include "executor/spi.h"
  #include "miscadmin.h"
+ #include "parser/parse_relation.h"
  #include "rewrite/rewriteHandler.h"
  #include "storage/smgr.h"
  #include "tcop/tcopprot.h"
+ #include "utils/builtins.h"
+ #include "utils/lsyscache.h"
  #include "utils/rel.h"
  #include "utils/snapmgr.h"
  #include "utils/syscache.h"
+ #include "utils/typcache.h"
  
  
  typedef struct
***************
*** 44,49 **** typedef struct
--- 51,58 ----
  	BulkInsertState bistate;	/* bulk insert state */
  } DR_transientrel;
  
+ static int	matview_maintenance_depth = 0;
+ 
  static void transientrel_startup(DestReceiver *self, int operation, TupleDesc typeinfo);
  static void transientrel_receive(TupleTableSlot *slot, DestReceiver *self);
  static void transientrel_shutdown(DestReceiver *self);
***************
*** 51,56 **** static void transientrel_destroy(DestReceiver *self);
--- 60,74 ----
  static void refresh_matview_datafill(DestReceiver *dest, Query *query,
  						 const char *queryString);
  
+ static char *make_temptable_name_n(char *tempname, int n);
+ static void mv_GenerateOper(StringInfo buf, Oid opoid);
+ 
+ static void refresh_by_match_merge(Oid matviewOid, Oid tempOid);
+ static void refresh_by_heap_swap(Oid matviewOid, Oid OIDNewHeap);
+ 
+ static void OpenMatViewIncrementalMaintenance(void);
+ static void CloseMatViewIncrementalMaintenance(void);
+ 
  /*
   * SetMatViewPopulatedState
   *		Mark a materialized view as populated, or not.
***************
*** 125,136 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  	Oid			tableSpace;
  	Oid			OIDNewHeap;
  	DestReceiver *dest;
  
  	/*
  	 * Get a lock until end of transaction.
  	 */
  	matviewOid = RangeVarGetRelidExtended(stmt->relation,
! 										  AccessExclusiveLock, false, false,
  										  RangeVarCallbackOwnsTable, NULL);
  	matviewRel = heap_open(matviewOid, NoLock);
  
--- 143,160 ----
  	Oid			tableSpace;
  	Oid			OIDNewHeap;
  	DestReceiver *dest;
+ 	bool		concurrent;
+ 	LOCKMODE	lockmode;
+ 
+ 	/* Determine strength of lock needed. */
+ 	concurrent = stmt->concurrent;
+ 	lockmode = concurrent ? ExclusiveLock : AccessExclusiveLock;
  
  	/*
  	 * Get a lock until end of transaction.
  	 */
  	matviewOid = RangeVarGetRelidExtended(stmt->relation,
! 										  lockmode, false, false,
  										  RangeVarCallbackOwnsTable, NULL);
  	matviewRel = heap_open(matviewOid, NoLock);
  
***************
*** 141,151 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  				 errmsg("\"%s\" is not a materialized view",
  						RelationGetRelationName(matviewRel))));
  
! 	/*
! 	 * We're not using materialized views in the system catalogs.
! 	 */
  	Assert(!IsSystemRelation(matviewRel));
  
  	Assert(!matviewRel->rd_rel->relhasoids);
  
  	/*
--- 165,186 ----
  				 errmsg("\"%s\" is not a materialized view",
  						RelationGetRelationName(matviewRel))));
  
! 	/* Check that CONCURRENTLY is not specified if not populated. */
! 	if (concurrent && !RelationIsPopulated(matviewRel))
! 		ereport(ERROR,
! 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
! 				 errmsg("CONCURRENTLY cannot be used when the materialized view is not populated")));
! 
! 	/* Check that conflicting options have not been specified. */
! 	if (concurrent && stmt->skipData)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_SYNTAX_ERROR),
! 				 errmsg("CONCURRENTLY and WITH NO DATA options cannot be used together")));
! 
! 	/* We're not using materialized views in the system catalogs. */
  	Assert(!IsSystemRelation(matviewRel));
  
+ 	/* We don't allow an oid column for a materialized view. */
  	Assert(!matviewRel->rd_rel->relhasoids);
  
  	/*
***************
*** 197,222 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  	 */
  	SetMatViewPopulatedState(matviewRel, !stmt->skipData);
  
! 	tableSpace = matviewRel->rd_rel->reltablespace;
  
  	heap_close(matviewRel, NoLock);
  
  	/* Create the transient table that will receive the regenerated data. */
! 	OIDNewHeap = make_new_heap(matviewOid, tableSpace);
  	dest = CreateTransientRelDestReceiver(OIDNewHeap);
  
  	/* Generate the data, if wanted. */
  	if (!stmt->skipData)
  		refresh_matview_datafill(dest, dataQuery, queryString);
  
! 	/*
! 	 * Swap the physical files of the target and transient tables, then
! 	 * rebuild the target's indexes and throw away the transient table.
! 	 */
! 	finish_heap_swap(matviewOid, OIDNewHeap, false, false, true, true,
! 					 RecentXmin, ReadNextMultiXactId());
! 
! 	RelationCacheInvalidateEntry(matviewOid);
  }
  
  /*
--- 232,259 ----
  	 */
  	SetMatViewPopulatedState(matviewRel, !stmt->skipData);
  
! 	/* Concurrent refresh builds new data in temp tablespace, and does diff. */
! 	if (concurrent)
! 		tableSpace = GetDefaultTablespace(RELPERSISTENCE_TEMP);
! 	else
! 		tableSpace = matviewRel->rd_rel->reltablespace;
  
  	heap_close(matviewRel, NoLock);
  
  	/* Create the transient table that will receive the regenerated data. */
! 	OIDNewHeap = make_new_heap(matviewOid, tableSpace, concurrent,
! 							   ExclusiveLock);
  	dest = CreateTransientRelDestReceiver(OIDNewHeap);
  
  	/* Generate the data, if wanted. */
  	if (!stmt->skipData)
  		refresh_matview_datafill(dest, dataQuery, queryString);
  
! 	/* Make the matview match the newly generated data. */
! 	if (concurrent)
! 		refresh_by_match_merge(matviewOid, OIDNewHeap);
! 	else
! 		refresh_by_heap_swap(matviewOid, OIDNewHeap);
  }
  
  /*
***************
*** 369,371 **** transientrel_destroy(DestReceiver *self)
--- 406,786 ----
  {
  	pfree(self);
  }
+ 
+ 
+ /*
+  * Given a qualified temporary table name, append an underscore followed by
+  * the given integer, to make a new table name based on the old one.
+  *
+  * This leaks memory through palloc(), which won't be cleaned up until the
+  * current memory memory context is freed.
+  */
+ static char *
+ make_temptable_name_n(char *tempname, int n)
+ {
+ 	StringInfoData namebuf;
+ 
+ 	initStringInfo(&namebuf);
+ 	appendStringInfoString(&namebuf, tempname);
+ 	appendStringInfo(&namebuf, "_%i", n);
+ 	return namebuf.data;
+ }
+ 
+ static void
+ mv_GenerateOper(StringInfo buf, Oid opoid)
+ {
+ 	HeapTuple	opertup;
+ 	Form_pg_operator operform;
+ 
+ 	opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(opoid));
+ 	if (!HeapTupleIsValid(opertup))
+ 		elog(ERROR, "cache lookup failed for operator %u", opoid);
+ 	operform = (Form_pg_operator) GETSTRUCT(opertup);
+ 	Assert(operform->oprkind == 'b');
+ 
+ 	appendStringInfo(buf, "OPERATOR(%s.%s)",
+ 					 quote_identifier(get_namespace_name(operform->oprnamespace)),
+ 					 NameStr(operform->oprname));
+ 
+ 	ReleaseSysCache(opertup);
+ }
+ 
+ /*
+  * refresh_by_match_merge
+  *
+  * Refresh a materialized view with transactional semantics, while allowing
+  * concurrent reads.
+  *
+  * This is called after a new version of the data has been created in a
+  * temporary table.  It performs a full outer join against the old version of
+  * the data, producing "diff" results.	This join cannot work if there are any
+  * duplicated rows in either the old or new versions, in the sense that every
+  * column would compare as equal between the two rows.	It does work correctly
+  * in the face of rows which have at least one NULL value, with all non-NULL
+  * columns equal.  The behavior of NULLs on equality tests and on UNIQUE
+  * indexes turns out to be quite convenient here; the tests we need to make
+  * are consistent with default behavior.  If there is at least one UNIQUE
+  * index on the materialized view, we have exactly the guarantee we need.  By
+  * joining based on equality on all columns which are part of any unique
+  * index, we identify the rows on which we can use UPDATE without any problem.
+  * If any column is NULL in either the old or new version of a row (or both),
+  * we must use DELETE and INSERT, since there could be multiple rows which are
+  * NOT DISTINCT FROM each other, and we could otherwise end up with the wrong
+  * number of occurrences in the updated relation.  The temporary table used to
+  * hold the diff results contains just the TID of the old record (if matched)
+  * and the ROW from the new table as a single column of complex record type
+  * (if matched).
+  *
+  * Once we have the diff table, we perform set-based DELETE, UPDATE, and
+  * INSERT operations against the materialized view, and discard both temporary
+  * tables.
+  *
+  * Everything from the generation of the new data to applying the differences
+  * takes place under cover of an ExclusiveLock, since it seems as though we
+  * would want to prohibit not only concurrent REFRESH operations, but also
+  * incremental maintenance.  It also doesn't seem reasonable or safe to allow
+  * SELECT FOR UPDATE or SELECT FOR SHARE on rows being updated or deleted by
+  * this command.
+  */
+ static void
+ refresh_by_match_merge(Oid matviewOid, Oid tempOid)
+ {
+ 	StringInfoData querybuf;
+ 	Relation	matviewRel;
+ 	Relation	tempRel;
+ 	char	   *matviewname;
+ 	char	   *tempname;
+ 	char	   *diffname;
+ 	TupleDesc	tupdesc;
+ 	bool		foundUniqueIndex;
+ 	List	   *indexoidlist;
+ 	ListCell   *indexoidscan;
+ 	int16		relnatts;
+ 	bool	   *usedForQual;
+ 
+ 	initStringInfo(&querybuf);
+ 	matviewRel = heap_open(matviewOid, NoLock);
+ 	matviewname = quote_qualified_identifier(get_namespace_name(RelationGetNamespace(matviewRel)),
+ 											  RelationGetRelationName(matviewRel));
+ 	tempRel = heap_open(tempOid, NoLock);
+ 	tempname = quote_qualified_identifier(get_namespace_name(RelationGetNamespace(tempRel)),
+ 										   RelationGetRelationName(tempRel));
+ 	diffname = make_temptable_name_n(tempname, 2);
+ 
+ 	relnatts = matviewRel->rd_rel->relnatts;
+ 	usedForQual = (bool *) palloc0(sizeof(bool) * relnatts);
+ 
+ 	/* Open SPI context. */
+ 	if (SPI_connect() != SPI_OK_CONNECT)
+ 		elog(ERROR, "SPI_connect failed");
+ 
+ 	/* Analyze the temp table with the new contents. */
+ 	appendStringInfo(&querybuf, "ANALYZE %s", tempname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/*
+ 	 * We need to ensure that there are not duplicate rows without NULLs in
+ 	 * the new data set before we can count on the "diff" results.  Check for
+ 	 * that in a way that allows showing the first duplicated row found.  Even
+ 	 * after we pass this test, a unique index on the materialized view may
+ 	 * find a duplicate key problem.
+ 	 */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					  "SELECT x FROM %s x WHERE x IS NOT NULL AND EXISTS "
+ 					  "(SELECT * FROM %s y WHERE (y.*) = (x.*) "
+ 					  "AND y.ctid <> x.ctid) LIMIT 1",
+ 					  tempname, tempname);
+ 	if (SPI_execute(querybuf.data, false, 1) != SPI_OK_SELECT)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 	if (SPI_processed > 0)
+ 	{
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_CARDINALITY_VIOLATION),
+ 				 errmsg("new data for \"%s\" contains duplicate rows without any NULL columns",
+ 						RelationGetRelationName(matviewRel)),
+ 				 errdetail("Row: %s",
+ 						   SPI_getvalue(SPI_tuptable->vals[0], SPI_tuptable->tupdesc, 1))));
+ 	}
+ 
+ 	/* Start building the query for creating the diff table. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "CREATE TEMP TABLE %s AS "
+ 					 "SELECT x.ctid AS tid, y FROM %s x FULL JOIN %s y ON (",
+ 					 diffname, matviewname, tempname);
+ 
+ 	/*
+ 	 * Get the list of index OIDs for the table from the relcache, and look up
+ 	 * each one in the pg_index syscache.  We will test for equality on all
+ 	 * columns present in all unique indexes which only reference columns and
+ 	 * include all rows.
+ 	 */
+ 	tupdesc = matviewRel->rd_att;
+ 	foundUniqueIndex = false;
+ 	indexoidlist = RelationGetIndexList(matviewRel);
+ 
+ 	foreach(indexoidscan, indexoidlist)
+ 	{
+ 		Oid			indexoid = lfirst_oid(indexoidscan);
+ 		HeapTuple	indexTuple;
+ 		Form_pg_index index;
+ 
+ 		indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexoid));
+ 		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
+ 			elog(ERROR, "cache lookup failed for index %u", indexoid);
+ 		index = (Form_pg_index) GETSTRUCT(indexTuple);
+ 
+ 		/* We're only interested if it is unique and valid. */
+ 		if (index->indisunique && IndexIsValid(index))
+ 		{
+ 			int			numatts = index->indnatts;
+ 			int			i;
+ 			bool		expr = false;
+ 			Relation	indexRel;
+ 
+ 			/* Skip any index on an expression. */
+ 			for (i = 0; i < numatts; i++)
+ 			{
+ 				if (index->indkey.values[i] == 0)
+ 				{
+ 					expr = true;
+ 					break;
+ 				}
+ 			}
+ 			if (expr)
+ 			{
+ 				ReleaseSysCache(indexTuple);
+ 				continue;
+ 			}
+ 
+ 			/* Skip partial indexes. */
+ 			indexRel = index_open(index->indexrelid, RowExclusiveLock);
+ 			if (indexRel->rd_indpred != NIL)
+ 			{
+ 				index_close(indexRel, NoLock);
+ 				ReleaseSysCache(indexTuple);
+ 				continue;
+ 			}
+ 			/* Hold the locks, since we're about to run DML which needs them. */
+ 			index_close(indexRel, NoLock);
+ 
+ 			/* Add quals for all columns from this index. */
+ 			for (i = 0; i < numatts; i++)
+ 			{
+ 				int			attnum = index->indkey.values[i];
+ 				Oid			type;
+ 				Oid			op;
+ 				const char	   *colname;
+ 
+ 				/*
+ 				 * Only include the column once regardless of how many times
+ 				 * it shows up in how many indexes.
+ 				 *
+ 				 * This is also useful later to omit columns which can not
+ 				 * have changed from the SET clause of the UPDATE statement.
+ 				 */
+ 				if (usedForQual[attnum - 1])
+ 					continue;
+ 				usedForQual[attnum - 1] = true;
+ 
+ 				/*
+ 				 * Actually add the qual, ANDed with any others.
+ 				 */
+ 				if (foundUniqueIndex)
+ 					appendStringInfoString(&querybuf, " AND ");
+ 
+ 				colname = quote_identifier(NameStr((tupdesc->attrs[attnum - 1])->attname));
+ 				appendStringInfo(&querybuf, "y.%s ", colname);
+ 				type = attnumTypeId(matviewRel, attnum);
+ 				op = lookup_type_cache(type, TYPECACHE_EQ_OPR)->eq_opr;
+ 				mv_GenerateOper(&querybuf, op);
+ 				appendStringInfo(&querybuf, " x.%s", colname);
+ 
+ 				foundUniqueIndex = true;
+ 			}
+ 		}
+ 		ReleaseSysCache(indexTuple);
+ 	}
+ 
+ 	list_free(indexoidlist);
+ 
+ 	if (!foundUniqueIndex)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ 				 errmsg("cannot refresh materialized view \"%s\" concurrently",
+ 						matviewname),
+ 				 errhint("Create a UNIQUE index with no WHERE clause on one or more columns of the materialized view.")));
+ 
+ 	appendStringInfoString(&querybuf,
+ 						   ") WHERE (y.*) IS DISTINCT FROM (x.*)"
+ 						   " ORDER BY tid");
+ 
+ 	/* Create the temporary "diff" table. */
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/*
+ 	 * We have no further use for data from the "full-data" temp table, but we
+ 	 * must keep it around because its type is reference from the diff table.
+ 	 */
+ 
+ 	/* Analyze the diff table. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "ANALYZE %s", diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	OpenMatViewIncrementalMaintenance();
+ 
+ 	/* Deletes must come before inserts; do them first. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "DELETE FROM %s WHERE ctid IN "
+ 					 "(SELECT d.tid FROM %s d "
+ 					 "WHERE d.tid IS NOT NULL "
+ 					 "AND (d.y) IS NOT DISTINCT FROM NULL)",
+ 					 matviewname, diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_DELETE)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Updates before inserts gives a better chance at HOT updates. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "UPDATE %s x SET ", matviewname);
+ 
+ 	{
+ 		int			i;
+ 		bool		targetColFound = false;
+ 
+ 		for (i = 0; i < tupdesc->natts; i++)
+ 		{
+ 			const char	   *colname;
+ 
+ 			if (tupdesc->attrs[i]->attisdropped)
+ 				continue;
+ 
+ 			if (usedForQual[i])
+ 				continue;
+ 
+ 			if (targetColFound)
+ 				appendStringInfoString(&querybuf, ", ");
+ 			targetColFound = true;
+ 
+ 			colname = quote_identifier(NameStr((tupdesc->attrs[i])->attname));
+ 			appendStringInfo(&querybuf, "%s = (d.y).%s", colname, colname);
+ 		}
+ 
+ 		if (targetColFound)
+ 		{
+ 			appendStringInfo(&querybuf,
+ 							  " FROM %s d "
+ 							  "WHERE d.tid IS NOT NULL AND x.ctid = d.tid",
+ 							  diffname);
+ 
+ 			if (SPI_exec(querybuf.data, 0) != SPI_OK_UPDATE)
+ 				elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 		}
+ 	}
+ 
+ 	/* Inserts go last. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "INSERT INTO %s SELECT (y).* FROM %s WHERE tid IS NULL",
+ 					 matviewname, diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_INSERT)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* We're done maintaining the materialized view. */
+ 	CloseMatViewIncrementalMaintenance();
+ 	heap_close(tempRel, NoLock);
+ 	heap_close(matviewRel, NoLock);
+ 
+ 	/* Clean up temp tables. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "DROP TABLE %s, %s", diffname, tempname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Close SPI context. */
+ 	if (SPI_finish() != SPI_OK_FINISH)
+ 		elog(ERROR, "SPI_finish failed");
+ }
+ 
+ /*
+  * Swap the physical files of the target and transient tables, then rebuild
+  * the target's indexes and throw away the transient table.
+  */
+ static void
+ refresh_by_heap_swap(Oid matviewOid, Oid OIDNewHeap)
+ {
+ 	finish_heap_swap(matviewOid, OIDNewHeap, false, false, true, true,
+ 					 RecentXmin, ReadNextMultiXactId());
+ 
+ 	RelationCacheInvalidateEntry(matviewOid);
+ }
+ 
+ static void
+ OpenMatViewIncrementalMaintenance(void)
+ {
+ 	matview_maintenance_depth++;
+ }
+ 
+ static void
+ CloseMatViewIncrementalMaintenance(void)
+ {
+ 	matview_maintenance_depth--;
+ 	Assert(matview_maintenance_depth >= 0);
+ }
+ 
+ /*
+  * This should be used to test whether the backend is in a context where it is
+  * OK to allow DML statements to modify materialized views.  We only want to
+  * allow that for internal code driven by the materialized view definition,
+  * not for arbitrary user-supplied code.
+  */
+ bool
+ MatViewIncrementalMaintenanceIsEnabled(void)
+ {
+ 	return matview_maintenance_depth > 0;
+ }
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***************
*** 3541,3547 **** ATRewriteTables(List **wqueue, LOCKMODE lockmode)
  			heap_close(OldHeap, NoLock);
  
  			/* Create transient table that will receive the modified data */
! 			OIDNewHeap = make_new_heap(tab->relid, NewTableSpace);
  
  			/*
  			 * Copy the heap data into the new table with the desired
--- 3541,3548 ----
  			heap_close(OldHeap, NoLock);
  
  			/* Create transient table that will receive the modified data */
! 			OIDNewHeap = make_new_heap(tab->relid, NewTableSpace, false,
! 									   AccessExclusiveLock);
  
  			/*
  			 * Copy the heap data into the new table with the desired
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
***************
*** 42,47 ****
--- 42,48 ----
  #include "access/transam.h"
  #include "access/xact.h"
  #include "catalog/namespace.h"
+ #include "commands/matview.h"
  #include "commands/trigger.h"
  #include "executor/execdebug.h"
  #include "foreign/fdwapi.h"
***************
*** 999,1008 **** CheckValidResultRel(Relation resultRel, CmdType operation)
  			}
  			break;
  		case RELKIND_MATVIEW:
! 			ereport(ERROR,
! 					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
! 					 errmsg("cannot change materialized view \"%s\"",
! 							RelationGetRelationName(resultRel))));
  			break;
  		case RELKIND_FOREIGN_TABLE:
  			/* Okay only if the FDW supports it */
--- 1000,1010 ----
  			}
  			break;
  		case RELKIND_MATVIEW:
! 			if (!MatViewIncrementalMaintenanceIsEnabled())
! 				ereport(ERROR,
! 						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
! 						 errmsg("cannot change materialized view \"%s\"",
! 								RelationGetRelationName(resultRel))));
  			break;
  		case RELKIND_FOREIGN_TABLE:
  			/* Okay only if the FDW supports it */
*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
***************
*** 950,956 **** ExecModifyTable(ModifyTableState *node)
  				bool		isNull;
  
  				relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 				if (relkind == RELKIND_RELATION)
  				{
  					datum = ExecGetJunkAttribute(slot,
  												 junkfilter->jf_junkAttNo,
--- 950,956 ----
  				bool		isNull;
  
  				relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 				if (relkind == RELKIND_RELATION || relkind == RELKIND_MATVIEW)
  				{
  					datum = ExecGetJunkAttribute(slot,
  												 junkfilter->jf_junkAttNo,
***************
*** 1280,1286 **** ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
  					char		relkind;
  
  					relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 					if (relkind == RELKIND_RELATION)
  					{
  						j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
  						if (!AttributeNumberIsValid(j->jf_junkAttNo))
--- 1280,1287 ----
  					char		relkind;
  
  					relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 					if (relkind == RELKIND_RELATION ||
! 						relkind == RELKIND_MATVIEW)
  					{
  						j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
  						if (!AttributeNumberIsValid(j->jf_junkAttNo))
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 3241,3246 **** _copyRefreshMatViewStmt(const RefreshMatViewStmt *from)
--- 3241,3247 ----
  {
  	RefreshMatViewStmt *newnode = makeNode(RefreshMatViewStmt);
  
+ 	COPY_SCALAR_FIELD(concurrent);
  	COPY_SCALAR_FIELD(skipData);
  	COPY_NODE_FIELD(relation);
  
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 1521,1526 **** _equalCreateTableAsStmt(const CreateTableAsStmt *a, const CreateTableAsStmt *b)
--- 1521,1527 ----
  static bool
  _equalRefreshMatViewStmt(const RefreshMatViewStmt *a, const RefreshMatViewStmt *b)
  {
+ 	COMPARE_SCALAR_FIELD(concurrent);
  	COMPARE_SCALAR_FIELD(skipData);
  	COMPARE_NODE_FIELD(relation);
  
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 3301,3311 **** OptNoLog:	UNLOGGED					{ $$ = RELPERSISTENCE_UNLOGGED; }
   *****************************************************************************/
  
  RefreshMatViewStmt:
! 			REFRESH MATERIALIZED VIEW qualified_name opt_with_data
  				{
  					RefreshMatViewStmt *n = makeNode(RefreshMatViewStmt);
! 					n->relation = $4;
! 					n->skipData = !($5);
  					$$ = (Node *) n;
  				}
  		;
--- 3301,3312 ----
   *****************************************************************************/
  
  RefreshMatViewStmt:
! 			REFRESH MATERIALIZED VIEW opt_concurrently qualified_name opt_with_data
  				{
  					RefreshMatViewStmt *n = makeNode(RefreshMatViewStmt);
! 					n->concurrent = $4;
! 					n->relation = $5;
! 					n->skipData = !($6);
  					$$ = (Node *) n;
  				}
  		;
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
***************
*** 2871,2876 **** psql_completion(char *text, int start, int end)
--- 2871,2882 ----
  	else if (pg_strcasecmp(prev3_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev2_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev_wd, "VIEW") == 0)
+ 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_matviews,
+ 								   " UNION SELECT 'CONCURRENTLY'");
+ 	else if (pg_strcasecmp(prev4_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
  		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_matviews, NULL);
  	else if (pg_strcasecmp(prev4_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev3_wd, "MATERIALIZED") == 0 &&
***************
*** 2879,2884 **** psql_completion(char *text, int start, int end)
--- 2885,2895 ----
  	else if (pg_strcasecmp(prev5_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev4_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev3_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "CONCURRENTLY") == 0)
+ 		COMPLETE_WITH_CONST("WITH DATA");
+ 	else if (pg_strcasecmp(prev5_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "VIEW") == 0 &&
  			 pg_strcasecmp(prev_wd, "WITH") == 0)
  	{
  		static const char *const list_WITH_DATA[] =
***************
*** 2889,2894 **** psql_completion(char *text, int start, int end)
--- 2900,2911 ----
  	else if (pg_strcasecmp(prev6_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev5_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev4_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "CONCURRENTLY") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "WITH") == 0)
+ 		COMPLETE_WITH_CONST("DATA");
+ 	else if (pg_strcasecmp(prev6_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev5_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "VIEW") == 0 &&
  			 pg_strcasecmp(prev2_wd, "WITH") == 0 &&
  			 pg_strcasecmp(prev_wd, "NO") == 0)
  		COMPLETE_WITH_CONST("DATA");
*** a/src/include/commands/cluster.h
--- b/src/include/commands/cluster.h
***************
*** 25,31 **** extern void check_index_is_clusterable(Relation OldHeap, Oid indexOid,
  						   bool recheck, LOCKMODE lockmode);
  extern void mark_index_clustered(Relation rel, Oid indexOid, bool is_internal);
  
! extern Oid	make_new_heap(Oid OIDOldHeap, Oid NewTableSpace);
  extern void finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
  				 bool is_system_catalog,
  				 bool swap_toast_by_content,
--- 25,32 ----
  						   bool recheck, LOCKMODE lockmode);
  extern void mark_index_clustered(Relation rel, Oid indexOid, bool is_internal);
  
! extern Oid make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
! 			  LOCKMODE lockmode);
  extern void finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
  				 bool is_system_catalog,
  				 bool swap_toast_by_content,
*** a/src/include/commands/matview.h
--- b/src/include/commands/matview.h
***************
*** 27,30 **** extern void ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString
--- 27,32 ----
  
  extern DestReceiver *CreateTransientRelDestReceiver(Oid oid);
  
+ extern bool MatViewIncrementalMaintenanceIsEnabled(void);
+ 
  #endif   /* MATVIEW_H */
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 2478,2483 **** typedef struct CreateTableAsStmt
--- 2478,2484 ----
  typedef struct RefreshMatViewStmt
  {
  	NodeTag		type;
+ 	bool		concurrent;		/* allow concurrent access? */
  	bool		skipData;		/* true for WITH NO DATA */
  	RangeVar   *relation;		/* relation to insert into */
  } RefreshMatViewStmt;
*** a/src/test/regress/expected/matview.out
--- b/src/test/regress/expected/matview.out
***************
*** 73,78 **** SELECT * FROM tvm;
--- 73,80 ----
  
  CREATE MATERIALIZED VIEW tmm AS SELECT sum(totamt) AS grandtot FROM tm;
  CREATE MATERIALIZED VIEW tvmm AS SELECT sum(totamt) AS grandtot FROM tvm;
+ CREATE UNIQUE INDEX tvmm_expr ON tvmm ((grandtot > 0));
+ CREATE UNIQUE INDEX tvmm_pred ON tvmm (grandtot) WHERE grandtot < 0;
  CREATE VIEW tvv AS SELECT sum(totamt) AS grandtot FROM tv;
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tvvm AS SELECT * FROM tvv;
***************
*** 141,146 **** ALTER MATERIALIZED VIEW tvm SET SCHEMA mvschema;
--- 143,151 ----
    Column  |  Type   | Modifiers | Storage | Stats target | Description 
  ----------+---------+-----------+---------+--------------+-------------
   grandtot | numeric |           | main    |              | 
+ Indexes:
+     "tvmm_expr" UNIQUE, btree ((grandtot > 0::numeric))
+     "tvmm_pred" UNIQUE, btree (grandtot) WHERE grandtot < 0::numeric
  View definition:
   SELECT sum(tvm.totamt) AS grandtot
     FROM mvschema.tvm;
***************
*** 177,183 **** SELECT * FROM tvm ORDER BY type;
   z    |     11
  (3 rows)
  
! REFRESH MATERIALIZED VIEW tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
   type | totamt 
--- 182,188 ----
   z    |     11
  (3 rows)
  
! REFRESH MATERIALIZED VIEW CONCURRENTLY tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
   type | totamt 
***************
*** 237,242 **** SELECT * FROM tvvm;
--- 242,250 ----
  (1 row)
  
  REFRESH MATERIALIZED VIEW tmm;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm;
+ ERROR:  cannot refresh materialized view "public.tvmm" concurrently
+ HINT:  Create a UNIQUE index with no WHERE clause on one or more columns of the materialized view.
  REFRESH MATERIALIZED VIEW tvmm;
  REFRESH MATERIALIZED VIEW tvvm;
  EXPLAIN (costs off)
***************
*** 281,286 **** SELECT * FROM tvvm;
--- 289,297 ----
  -- test diemv when the mv does not exist
  DROP MATERIALIZED VIEW IF EXISTS no_such_mv;
  NOTICE:  materialized view "no_such_mv" does not exist, skipping
+ -- make sure invalid comination of options is prohibited
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm WITH NO DATA;
+ ERROR:  CONCURRENTLY and WITH NO DATA options cannot be used together
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
   type | mtot | vtot 
***************
*** 385,387 **** SELECT * FROM hogeview WHERE i < 10;
--- 396,423 ----
  
  DROP TABLE hoge CASCADE;
  NOTICE:  drop cascades to materialized view hogeview
+ -- test that duplicate values on unique index prevent refresh
+ CREATE TABLE foo(a, b) AS VALUES(1, 10);
+ CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo;
+ CREATE UNIQUE INDEX ON mv(a);
+ INSERT INTO foo SELECT * FROM foo;
+ REFRESH MATERIALIZED VIEW mv;
+ ERROR:  could not create unique index "mv_a_idx"
+ DETAIL:  Key (a)=(1) is duplicated.
+ REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ ERROR:  new data for "mv" contains duplicate rows without any NULL columns
+ DETAIL:  Row: (1,10)
+ DROP TABLE foo CASCADE;
+ NOTICE:  drop cascades to materialized view mv
+ -- make sure that all indexes covered by unique indexes works
+ CREATE TABLE foo(a, b, c) AS VALUES(1, 2, 3);
+ CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo;
+ CREATE UNIQUE INDEX ON mv (a);
+ CREATE UNIQUE INDEX ON mv (b);
+ CREATE UNIQUE INDEX on mv (c);
+ INSERT INTO foo VALUES(2, 3, 4);
+ INSERT INTO foo VALUES(3, 4, 5);
+ REFRESH MATERIALIZED VIEW mv;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ DROP TABLE foo CASCADE;
+ NOTICE:  drop cascades to materialized view mv
*** a/src/test/regress/sql/matview.sql
--- b/src/test/regress/sql/matview.sql
***************
*** 29,34 **** CREATE MATERIALIZED VIEW tvm AS SELECT * FROM tv ORDER BY type;
--- 29,36 ----
  SELECT * FROM tvm;
  CREATE MATERIALIZED VIEW tmm AS SELECT sum(totamt) AS grandtot FROM tm;
  CREATE MATERIALIZED VIEW tvmm AS SELECT sum(totamt) AS grandtot FROM tvm;
+ CREATE UNIQUE INDEX tvmm_expr ON tvmm ((grandtot > 0));
+ CREATE UNIQUE INDEX tvmm_pred ON tvmm (grandtot) WHERE grandtot < 0;
  CREATE VIEW tvv AS SELECT sum(totamt) AS grandtot FROM tv;
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tvvm AS SELECT * FROM tvv;
***************
*** 57,63 **** INSERT INTO t VALUES (6, 'z', 13);
  -- confirm pre- and post-refresh contents of fairly simple materialized views
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
! REFRESH MATERIALIZED VIEW tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
--- 59,65 ----
  -- confirm pre- and post-refresh contents of fairly simple materialized views
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
! REFRESH MATERIALIZED VIEW CONCURRENTLY tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
***************
*** 74,79 **** SELECT * FROM tmm;
--- 76,82 ----
  SELECT * FROM tvmm;
  SELECT * FROM tvvm;
  REFRESH MATERIALIZED VIEW tmm;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm;
  REFRESH MATERIALIZED VIEW tvmm;
  REFRESH MATERIALIZED VIEW tvvm;
  EXPLAIN (costs off)
***************
*** 89,94 **** SELECT * FROM tvvm;
--- 92,100 ----
  -- test diemv when the mv does not exist
  DROP MATERIALIZED VIEW IF EXISTS no_such_mv;
  
+ -- make sure invalid comination of options is prohibited
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm WITH NO DATA;
+ 
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
  
***************
*** 124,126 **** SELECT * FROM hogeview WHERE i < 10;
--- 130,153 ----
  VACUUM ANALYZE;
  SELECT * FROM hogeview WHERE i < 10;
  DROP TABLE hoge CASCADE;
+ 
+ -- test that duplicate values on unique index prevent refresh
+ CREATE TABLE foo(a, b) AS VALUES(1, 10);
+ CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo;
+ CREATE UNIQUE INDEX ON mv(a);
+ INSERT INTO foo SELECT * FROM foo;
+ REFRESH MATERIALIZED VIEW mv;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ DROP TABLE foo CASCADE;
+ 
+ -- make sure that all indexes covered by unique indexes works
+ CREATE TABLE foo(a, b, c) AS VALUES(1, 2, 3);
+ CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo;
+ CREATE UNIQUE INDEX ON mv (a);
+ CREATE UNIQUE INDEX ON mv (b);
+ CREATE UNIQUE INDEX on mv (c);
+ INSERT INTO foo VALUES(2, 3, 4);
+ INSERT INTO foo VALUES(3, 4, 5);
+ REFRESH MATERIALIZED VIEW mv;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ DROP TABLE foo CASCADE;
#39Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Kevin Grittner (#38)
Re: refresh materialized view concurrently

On Sat, Jul 6, 2013 at 9:20 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

Hitoshi Harada <umi.tanuki@gmail.com> wrote:

Oops!

Indeed. Thanks for the careful testing.

drop materialized view if exists mv;
drop table if exists foo;
create table foo(a, b) as values(1, 10);
create materialized view mv as select * from foo;
create unique index on mv(a);
insert into foo select * from foo;
refresh materialized view mv;
refresh materialized view concurrently mv;

test=# refresh materialized view mv;
ERROR: could not create unique index "mv_a_idx"
DETAIL: Key (a)=(1) is duplicated.
test=# refresh materialized view concurrently mv;
REFRESH MATERIALIZED VIEW

Fixed by scanning the temp table for duplicates before generating
the diff:

test=# refresh materialized view concurrently mv;
ERROR: new data for "mv" contains duplicate rows without any NULL columns
DETAIL: Row: (1,10)

I think the point is not check the duplicate rows. It is about unique
key constraint violation. So, if you change the original table foo as
values(1, 10), (1, 20), the issue is still reproduced. In
non-concurrent operation it is checked by reindex_index when swapping
the heap, but apparently we are not doing constraint check in
concurrent mode.

[ matview with all columns covered by unique indexes fails ]

Fixed.

Other than these, I've found index is opened with NoLock, relying
on ExclusiveLock of parent matview, and ALTER INDEX SET
TABLESPACE or something similar can run concurrently, but it is
presumably safe. DROP INDEX, REINDEX would be blocked by the
ExclusiveLock.

Since others were also worried that an index definition could be
modified while another process is holding an ExclusiveLock on its
table, I changed this.

OK, others are resolved. One thing I need to apology
make_temptable_name_n, because I pointed the previous coding would be
a potential memory overrun, but actually the relation name is defined
by make_new_heap, so unless the function generates stupid long name,
there is not possibility to make such big name that overruns
NAMEDATALEN. So +1 for revert make_temptable_name_n, which is also
meaninglessly invented.

I've found another issue, which is regarding
matview_maintenance_depth. If SPI calls failed (pg_cancel_backend is
quite possible) and errors out in the middle of processing, this value
stays above zero, so subsequently we can issue DML on the matview. We
should make sure the value becomes zero before jumping out from this
function.

--
Hitoshi Harada

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#40Kevin Grittner
kgrittn@ymail.com
In reply to: Hitoshi Harada (#39)
1 attachment(s)
Re: refresh materialized view concurrently

Hitoshi Harada <umi.tanuki@gmail.com> wrote:

I think the point is not check the duplicate rows.  It is about unique
key constraint violation.  So, if you change the original table foo as
values(1, 10), (1, 20), the issue is still reproduced.  In
non-concurrent operation it is checked by reindex_index when swapping
the heap, but apparently we are not doing constraint check in
concurrent mode.

The check for duplicate rows is necessary to allow the FULL join to
work correctly, but it is not sufficient without this:

@@ -654,7 +668,7 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid)
                 errhint("Create a UNIQUE index with no WHERE clause on one or more columns of the materialized view.")));
 
    appendStringInfoString(&querybuf,
-                          ") WHERE (y.*) IS DISTINCT FROM (x.*)"
+                          " AND y = x) WHERE (y.*) IS DISTINCT FROM (x.*)"
                           " ORDER BY tid");
 
    /* Create the temporary "diff" table. */

The sad thing is that I had had that extra test in much earlier
because the relational algebra seemed to require it, but convinced
myself that it wasn't really needed because I couldn't think of a
test that caused a failure without it.  It turns out that was a
failure of imagination on my part.  I guess I should trust the
math.

The only thing I don't like about this is that the duplicate key
errors show as their context the SPI query doing the UPDATE or
INSERT.  I'm not sure whether it's worth the extra processing time
to avoid that.

One thing I need to apology > make_temptable_name_n, because I
pointed the previous coding would be a potential memory overrun,
but actually the relation name is defined by make_new_heap, so
unless the function generates stupid long name, there is not
possibility to make such big name that overruns NAMEDATALEN.  So
+1 for revert make_temptable_name_n, which is also meaninglessly
invented.

I knew that but didn't point it out since I was changing to the
existing functions which use palloc() -- that became immaterial.

I've found another issue, which is regarding
matview_maintenance_depth.  If SPI calls failed
(pg_cancel_backend is quite possible) and errors out in the
middle of processing, this value stays above zero, so
subsequently we can issue DML on the matview.  We should make
sure the value becomes zero before jumping out from this
function.

Good point.  There's more than one way to do that, but this seemed
cleanest to me because it avoids allowing any modification of
matview_maintenance_depth outside of matview.c:

@@ -251,7 +251,21 @@ ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
 
    /* Make the matview match the newly generated data. */
    if (concurrent)
-       refresh_by_match_merge(matviewOid, OIDNewHeap);
+   {
+       int     old_depth = matview_maintenance_depth;
+
+       PG_TRY();
+       {
+           refresh_by_match_merge(matviewOid, OIDNewHeap);
+       }
+       PG_CATCH();
+       {
+           matview_maintenance_depth = old_depth;
+           PG_RE_THROW();
+       }
+       PG_END_TRY();
+       Assert(matview_maintenance_depth == old_depth);
+   }
    else
        refresh_by_heap_swap(matviewOid, OIDNewHeap);
 }

Thanks again!  New patch attached.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

refresh-concurrently-v4.patchtext/x-diff; name=refresh-concurrently-v4.patchDownload
*** a/doc/src/sgml/mvcc.sgml
--- b/doc/src/sgml/mvcc.sgml
***************
*** 928,935 **** ERROR:  could not serialize access due to read/write dependencies among transact
          </para>
  
          <para>
!          This lock mode is not automatically acquired on tables by any
!          <productname>PostgreSQL</productname> command.
          </para>
         </listitem>
        </varlistentry>
--- 928,934 ----
          </para>
  
          <para>
!          Acquired by <command>REFRESH MATERIALIZED VIEW CONCURRENTLY</command>.
          </para>
         </listitem>
        </varlistentry>
*** a/doc/src/sgml/ref/refresh_materialized_view.sgml
--- b/doc/src/sgml/ref/refresh_materialized_view.sgml
***************
*** 21,27 **** PostgreSQL documentation
  
   <refsynopsisdiv>
  <synopsis>
! REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
      [ WITH [ NO ] DATA ]
  </synopsis>
   </refsynopsisdiv>
--- 21,27 ----
  
   <refsynopsisdiv>
  <synopsis>
! REFRESH MATERIALIZED VIEW [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
      [ WITH [ NO ] DATA ]
  </synopsis>
   </refsynopsisdiv>
***************
*** 38,43 **** REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
--- 38,47 ----
     data is generated and the materialized view is left in an unscannable
     state.
    </para>
+   <para>
+    <literal>CONCURRENTLY</literal> and <literal>WITH NO DATA</literal> may not
+    be specified together.
+   </para>
   </refsect1>
  
   <refsect1>
***************
*** 45,50 **** REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
--- 49,82 ----
  
    <variablelist>
     <varlistentry>
+     <term><literal>CONCURRENTLY</literal></term>
+     <listitem>
+      <para>
+       Refresh the materialized view without locking out concurrent selects on
+       the materialized view.  Without this option a refresh which affects a
+       lot of rows will tend to use fewer resources and complete more quickly,
+       but could block other connections which are trying to read from the
+       materialized view.  This option may be faster in cases where a small
+       number of rows are affected.
+      </para>
+      <para>
+       This option is only allowed if there is at least one
+       <literal>UNIQUE</literal> index on the materialized view which uses only
+       column names and includes all rows; that is, it must not index on any
+       expressions nor include a <literal>WHERE</literal> clause.
+      </para>
+      <para>
+       This option may not be used when the materialized view is not already
+       populated.
+      </para>
+      <para>
+       Even with this option only one <literal>REFRESH</literal> at a time may
+       run against any one materialized view.
+      </para>
+     </listitem>
+    </varlistentry>
+ 
+    <varlistentry>
      <term><replaceable class="PARAMETER">name</replaceable></term>
      <listitem>
       <para>
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
***************
*** 589,595 **** rebuild_relation(Relation OldHeap, Oid indexOid,
  	heap_close(OldHeap, NoLock);
  
  	/* Create the transient table that will receive the re-ordered data */
! 	OIDNewHeap = make_new_heap(tableOid, tableSpace);
  
  	/* Copy the heap data into the new table in the desired order */
  	copy_heap_data(OIDNewHeap, tableOid, indexOid,
--- 589,596 ----
  	heap_close(OldHeap, NoLock);
  
  	/* Create the transient table that will receive the re-ordered data */
! 	OIDNewHeap = make_new_heap(tableOid, tableSpace, false,
! 							   AccessExclusiveLock);
  
  	/* Copy the heap data into the new table in the desired order */
  	copy_heap_data(OIDNewHeap, tableOid, indexOid,
***************
*** 616,622 **** rebuild_relation(Relation OldHeap, Oid indexOid,
   * data, then call finish_heap_swap to complete the operation.
   */
  Oid
! make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  {
  	TupleDesc	OldHeapDesc;
  	char		NewHeapName[NAMEDATALEN];
--- 617,624 ----
   * data, then call finish_heap_swap to complete the operation.
   */
  Oid
! make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
! 			  LOCKMODE lockmode)
  {
  	TupleDesc	OldHeapDesc;
  	char		NewHeapName[NAMEDATALEN];
***************
*** 626,633 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  	HeapTuple	tuple;
  	Datum		reloptions;
  	bool		isNull;
  
! 	OldHeap = heap_open(OIDOldHeap, AccessExclusiveLock);
  	OldHeapDesc = RelationGetDescr(OldHeap);
  
  	/*
--- 628,637 ----
  	HeapTuple	tuple;
  	Datum		reloptions;
  	bool		isNull;
+ 	Oid			namespaceid;
+ 	char		relpersistence;
  
! 	OldHeap = heap_open(OIDOldHeap, lockmode);
  	OldHeapDesc = RelationGetDescr(OldHeap);
  
  	/*
***************
*** 648,653 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
--- 652,668 ----
  	if (isNull)
  		reloptions = (Datum) 0;
  
+ 	if (forcetemp)
+ 	{
+ 		namespaceid = LookupCreationNamespace("pg_temp");
+ 		relpersistence = RELPERSISTENCE_TEMP;
+ 	}
+ 	else
+ 	{
+ 		namespaceid = RelationGetNamespace(OldHeap);
+ 		relpersistence = OldHeap->rd_rel->relpersistence;
+ 	}
+ 
  	/*
  	 * Create the new heap, using a temporary name in the same namespace as
  	 * the existing table.	NOTE: there is some risk of collision with user
***************
*** 663,669 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  	snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap);
  
  	OIDNewHeap = heap_create_with_catalog(NewHeapName,
! 										  RelationGetNamespace(OldHeap),
  										  NewTableSpace,
  										  InvalidOid,
  										  InvalidOid,
--- 678,684 ----
  	snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap);
  
  	OIDNewHeap = heap_create_with_catalog(NewHeapName,
! 										  namespaceid,
  										  NewTableSpace,
  										  InvalidOid,
  										  InvalidOid,
***************
*** 671,678 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  										  OldHeap->rd_rel->relowner,
  										  OldHeapDesc,
  										  NIL,
! 										  OldHeap->rd_rel->relkind,
! 										  OldHeap->rd_rel->relpersistence,
  										  false,
  										  RelationIsMapped(OldHeap),
  										  true,
--- 686,693 ----
  										  OldHeap->rd_rel->relowner,
  										  OldHeapDesc,
  										  NIL,
! 										  RELKIND_RELATION,
! 										  relpersistence,
  										  false,
  										  RelationIsMapped(OldHeap),
  										  true,
*** a/src/backend/commands/matview.c
--- b/src/backend/commands/matview.c
***************
*** 20,36 ****
--- 20,43 ----
  #include "catalog/catalog.h"
  #include "catalog/indexing.h"
  #include "catalog/namespace.h"
+ #include "catalog/pg_operator.h"
  #include "commands/cluster.h"
  #include "commands/matview.h"
  #include "commands/tablecmds.h"
+ #include "commands/tablespace.h"
  #include "executor/executor.h"
+ #include "executor/spi.h"
  #include "miscadmin.h"
+ #include "parser/parse_relation.h"
  #include "rewrite/rewriteHandler.h"
  #include "storage/smgr.h"
  #include "tcop/tcopprot.h"
+ #include "utils/builtins.h"
+ #include "utils/lsyscache.h"
  #include "utils/rel.h"
  #include "utils/snapmgr.h"
  #include "utils/syscache.h"
+ #include "utils/typcache.h"
  
  
  typedef struct
***************
*** 44,49 **** typedef struct
--- 51,58 ----
  	BulkInsertState bistate;	/* bulk insert state */
  } DR_transientrel;
  
+ static int	matview_maintenance_depth = 0;
+ 
  static void transientrel_startup(DestReceiver *self, int operation, TupleDesc typeinfo);
  static void transientrel_receive(TupleTableSlot *slot, DestReceiver *self);
  static void transientrel_shutdown(DestReceiver *self);
***************
*** 51,56 **** static void transientrel_destroy(DestReceiver *self);
--- 60,74 ----
  static void refresh_matview_datafill(DestReceiver *dest, Query *query,
  						 const char *queryString);
  
+ static char *make_temptable_name_n(char *tempname, int n);
+ static void mv_GenerateOper(StringInfo buf, Oid opoid);
+ 
+ static void refresh_by_match_merge(Oid matviewOid, Oid tempOid);
+ static void refresh_by_heap_swap(Oid matviewOid, Oid OIDNewHeap);
+ 
+ static void OpenMatViewIncrementalMaintenance(void);
+ static void CloseMatViewIncrementalMaintenance(void);
+ 
  /*
   * SetMatViewPopulatedState
   *		Mark a materialized view as populated, or not.
***************
*** 125,136 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  	Oid			tableSpace;
  	Oid			OIDNewHeap;
  	DestReceiver *dest;
  
  	/*
  	 * Get a lock until end of transaction.
  	 */
  	matviewOid = RangeVarGetRelidExtended(stmt->relation,
! 										  AccessExclusiveLock, false, false,
  										  RangeVarCallbackOwnsTable, NULL);
  	matviewRel = heap_open(matviewOid, NoLock);
  
--- 143,160 ----
  	Oid			tableSpace;
  	Oid			OIDNewHeap;
  	DestReceiver *dest;
+ 	bool		concurrent;
+ 	LOCKMODE	lockmode;
+ 
+ 	/* Determine strength of lock needed. */
+ 	concurrent = stmt->concurrent;
+ 	lockmode = concurrent ? ExclusiveLock : AccessExclusiveLock;
  
  	/*
  	 * Get a lock until end of transaction.
  	 */
  	matviewOid = RangeVarGetRelidExtended(stmt->relation,
! 										  lockmode, false, false,
  										  RangeVarCallbackOwnsTable, NULL);
  	matviewRel = heap_open(matviewOid, NoLock);
  
***************
*** 141,151 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  				 errmsg("\"%s\" is not a materialized view",
  						RelationGetRelationName(matviewRel))));
  
! 	/*
! 	 * We're not using materialized views in the system catalogs.
! 	 */
  	Assert(!IsSystemRelation(matviewRel));
  
  	Assert(!matviewRel->rd_rel->relhasoids);
  
  	/*
--- 165,186 ----
  				 errmsg("\"%s\" is not a materialized view",
  						RelationGetRelationName(matviewRel))));
  
! 	/* Check that CONCURRENTLY is not specified if not populated. */
! 	if (concurrent && !RelationIsPopulated(matviewRel))
! 		ereport(ERROR,
! 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
! 				 errmsg("CONCURRENTLY cannot be used when the materialized view is not populated")));
! 
! 	/* Check that conflicting options have not been specified. */
! 	if (concurrent && stmt->skipData)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_SYNTAX_ERROR),
! 				 errmsg("CONCURRENTLY and WITH NO DATA options cannot be used together")));
! 
! 	/* We're not using materialized views in the system catalogs. */
  	Assert(!IsSystemRelation(matviewRel));
  
+ 	/* We don't allow an oid column for a materialized view. */
  	Assert(!matviewRel->rd_rel->relhasoids);
  
  	/*
***************
*** 197,222 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  	 */
  	SetMatViewPopulatedState(matviewRel, !stmt->skipData);
  
! 	tableSpace = matviewRel->rd_rel->reltablespace;
  
  	heap_close(matviewRel, NoLock);
  
  	/* Create the transient table that will receive the regenerated data. */
! 	OIDNewHeap = make_new_heap(matviewOid, tableSpace);
  	dest = CreateTransientRelDestReceiver(OIDNewHeap);
  
  	/* Generate the data, if wanted. */
  	if (!stmt->skipData)
  		refresh_matview_datafill(dest, dataQuery, queryString);
  
! 	/*
! 	 * Swap the physical files of the target and transient tables, then
! 	 * rebuild the target's indexes and throw away the transient table.
! 	 */
! 	finish_heap_swap(matviewOid, OIDNewHeap, false, false, true, true,
! 					 RecentXmin, ReadNextMultiXactId());
! 
! 	RelationCacheInvalidateEntry(matviewOid);
  }
  
  /*
--- 232,273 ----
  	 */
  	SetMatViewPopulatedState(matviewRel, !stmt->skipData);
  
! 	/* Concurrent refresh builds new data in temp tablespace, and does diff. */
! 	if (concurrent)
! 		tableSpace = GetDefaultTablespace(RELPERSISTENCE_TEMP);
! 	else
! 		tableSpace = matviewRel->rd_rel->reltablespace;
  
  	heap_close(matviewRel, NoLock);
  
  	/* Create the transient table that will receive the regenerated data. */
! 	OIDNewHeap = make_new_heap(matviewOid, tableSpace, concurrent,
! 							   ExclusiveLock);
  	dest = CreateTransientRelDestReceiver(OIDNewHeap);
  
  	/* Generate the data, if wanted. */
  	if (!stmt->skipData)
  		refresh_matview_datafill(dest, dataQuery, queryString);
  
! 	/* Make the matview match the newly generated data. */
! 	if (concurrent)
! 	{
! 		int		old_depth = matview_maintenance_depth;
! 
! 		PG_TRY();
! 		{
! 			refresh_by_match_merge(matviewOid, OIDNewHeap);
! 		}
! 		PG_CATCH();
! 		{
! 			matview_maintenance_depth = old_depth;
! 			PG_RE_THROW();
! 		}
! 		PG_END_TRY();
! 		Assert(matview_maintenance_depth == old_depth);
! 	}
! 	else
! 		refresh_by_heap_swap(matviewOid, OIDNewHeap);
  }
  
  /*
***************
*** 369,371 **** transientrel_destroy(DestReceiver *self)
--- 420,800 ----
  {
  	pfree(self);
  }
+ 
+ 
+ /*
+  * Given a qualified temporary table name, append an underscore followed by
+  * the given integer, to make a new table name based on the old one.
+  *
+  * This leaks memory through palloc(), which won't be cleaned up until the
+  * current memory memory context is freed.
+  */
+ static char *
+ make_temptable_name_n(char *tempname, int n)
+ {
+ 	StringInfoData namebuf;
+ 
+ 	initStringInfo(&namebuf);
+ 	appendStringInfoString(&namebuf, tempname);
+ 	appendStringInfo(&namebuf, "_%i", n);
+ 	return namebuf.data;
+ }
+ 
+ static void
+ mv_GenerateOper(StringInfo buf, Oid opoid)
+ {
+ 	HeapTuple	opertup;
+ 	Form_pg_operator operform;
+ 
+ 	opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(opoid));
+ 	if (!HeapTupleIsValid(opertup))
+ 		elog(ERROR, "cache lookup failed for operator %u", opoid);
+ 	operform = (Form_pg_operator) GETSTRUCT(opertup);
+ 	Assert(operform->oprkind == 'b');
+ 
+ 	appendStringInfo(buf, "OPERATOR(%s.%s)",
+ 					 quote_identifier(get_namespace_name(operform->oprnamespace)),
+ 					 NameStr(operform->oprname));
+ 
+ 	ReleaseSysCache(opertup);
+ }
+ 
+ /*
+  * refresh_by_match_merge
+  *
+  * Refresh a materialized view with transactional semantics, while allowing
+  * concurrent reads.
+  *
+  * This is called after a new version of the data has been created in a
+  * temporary table.  It performs a full outer join against the old version of
+  * the data, producing "diff" results.	This join cannot work if there are any
+  * duplicated rows in either the old or new versions, in the sense that every
+  * column would compare as equal between the two rows.	It does work correctly
+  * in the face of rows which have at least one NULL value, with all non-NULL
+  * columns equal.  The behavior of NULLs on equality tests and on UNIQUE
+  * indexes turns out to be quite convenient here; the tests we need to make
+  * are consistent with default behavior.  If there is at least one UNIQUE
+  * index on the materialized view, we have exactly the guarantee we need.  By
+  * joining based on equality on all columns which are part of any unique
+  * index, we identify the rows on which we can use UPDATE without any problem.
+  * If any column is NULL in either the old or new version of a row (or both),
+  * we must use DELETE and INSERT, since there could be multiple rows which are
+  * NOT DISTINCT FROM each other, and we could otherwise end up with the wrong
+  * number of occurrences in the updated relation.  The temporary table used to
+  * hold the diff results contains just the TID of the old record (if matched)
+  * and the ROW from the new table as a single column of complex record type
+  * (if matched).
+  *
+  * Once we have the diff table, we perform set-based DELETE, UPDATE, and
+  * INSERT operations against the materialized view, and discard both temporary
+  * tables.
+  *
+  * Everything from the generation of the new data to applying the differences
+  * takes place under cover of an ExclusiveLock, since it seems as though we
+  * would want to prohibit not only concurrent REFRESH operations, but also
+  * incremental maintenance.  It also doesn't seem reasonable or safe to allow
+  * SELECT FOR UPDATE or SELECT FOR SHARE on rows being updated or deleted by
+  * this command.
+  */
+ static void
+ refresh_by_match_merge(Oid matviewOid, Oid tempOid)
+ {
+ 	StringInfoData querybuf;
+ 	Relation	matviewRel;
+ 	Relation	tempRel;
+ 	char	   *matviewname;
+ 	char	   *tempname;
+ 	char	   *diffname;
+ 	TupleDesc	tupdesc;
+ 	bool		foundUniqueIndex;
+ 	List	   *indexoidlist;
+ 	ListCell   *indexoidscan;
+ 	int16		relnatts;
+ 	bool	   *usedForQual;
+ 
+ 	initStringInfo(&querybuf);
+ 	matviewRel = heap_open(matviewOid, NoLock);
+ 	matviewname = quote_qualified_identifier(get_namespace_name(RelationGetNamespace(matviewRel)),
+ 											  RelationGetRelationName(matviewRel));
+ 	tempRel = heap_open(tempOid, NoLock);
+ 	tempname = quote_qualified_identifier(get_namespace_name(RelationGetNamespace(tempRel)),
+ 										   RelationGetRelationName(tempRel));
+ 	diffname = make_temptable_name_n(tempname, 2);
+ 
+ 	relnatts = matviewRel->rd_rel->relnatts;
+ 	usedForQual = (bool *) palloc0(sizeof(bool) * relnatts);
+ 
+ 	/* Open SPI context. */
+ 	if (SPI_connect() != SPI_OK_CONNECT)
+ 		elog(ERROR, "SPI_connect failed");
+ 
+ 	/* Analyze the temp table with the new contents. */
+ 	appendStringInfo(&querybuf, "ANALYZE %s", tempname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/*
+ 	 * We need to ensure that there are not duplicate rows without NULLs in
+ 	 * the new data set before we can count on the "diff" results.  Check for
+ 	 * that in a way that allows showing the first duplicated row found.  Even
+ 	 * after we pass this test, a unique index on the materialized view may
+ 	 * find a duplicate key problem.
+ 	 */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					  "SELECT x FROM %s x WHERE x IS NOT NULL AND EXISTS "
+ 					  "(SELECT * FROM %s y WHERE y IS NOT NULL "
+ 					  "AND (y.*) = (x.*) AND y.ctid <> x.ctid) LIMIT 1",
+ 					  tempname, tempname);
+ 	if (SPI_execute(querybuf.data, false, 1) != SPI_OK_SELECT)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 	if (SPI_processed > 0)
+ 	{
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_CARDINALITY_VIOLATION),
+ 				 errmsg("new data for \"%s\" contains duplicate rows without any NULL columns",
+ 						RelationGetRelationName(matviewRel)),
+ 				 errdetail("Row: %s",
+ 						   SPI_getvalue(SPI_tuptable->vals[0], SPI_tuptable->tupdesc, 1))));
+ 	}
+ 
+ 	/* Start building the query for creating the diff table. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "CREATE TEMP TABLE %s AS "
+ 					 "SELECT x.ctid AS tid, y FROM %s x FULL JOIN %s y ON (",
+ 					 diffname, matviewname, tempname);
+ 
+ 	/*
+ 	 * Get the list of index OIDs for the table from the relcache, and look up
+ 	 * each one in the pg_index syscache.  We will test for equality on all
+ 	 * columns present in all unique indexes which only reference columns and
+ 	 * include all rows.
+ 	 */
+ 	tupdesc = matviewRel->rd_att;
+ 	foundUniqueIndex = false;
+ 	indexoidlist = RelationGetIndexList(matviewRel);
+ 
+ 	foreach(indexoidscan, indexoidlist)
+ 	{
+ 		Oid			indexoid = lfirst_oid(indexoidscan);
+ 		HeapTuple	indexTuple;
+ 		Form_pg_index index;
+ 
+ 		indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexoid));
+ 		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
+ 			elog(ERROR, "cache lookup failed for index %u", indexoid);
+ 		index = (Form_pg_index) GETSTRUCT(indexTuple);
+ 
+ 		/* We're only interested if it is unique and valid. */
+ 		if (index->indisunique && IndexIsValid(index))
+ 		{
+ 			int			numatts = index->indnatts;
+ 			int			i;
+ 			bool		expr = false;
+ 			Relation	indexRel;
+ 
+ 			/* Skip any index on an expression. */
+ 			for (i = 0; i < numatts; i++)
+ 			{
+ 				if (index->indkey.values[i] == 0)
+ 				{
+ 					expr = true;
+ 					break;
+ 				}
+ 			}
+ 			if (expr)
+ 			{
+ 				ReleaseSysCache(indexTuple);
+ 				continue;
+ 			}
+ 
+ 			/* Skip partial indexes. */
+ 			indexRel = index_open(index->indexrelid, RowExclusiveLock);
+ 			if (indexRel->rd_indpred != NIL)
+ 			{
+ 				index_close(indexRel, NoLock);
+ 				ReleaseSysCache(indexTuple);
+ 				continue;
+ 			}
+ 			/* Hold the locks, since we're about to run DML which needs them. */
+ 			index_close(indexRel, NoLock);
+ 
+ 			/* Add quals for all columns from this index. */
+ 			for (i = 0; i < numatts; i++)
+ 			{
+ 				int			attnum = index->indkey.values[i];
+ 				Oid			type;
+ 				Oid			op;
+ 				const char	   *colname;
+ 
+ 				/*
+ 				 * Only include the column once regardless of how many times
+ 				 * it shows up in how many indexes.
+ 				 *
+ 				 * This is also useful later to omit columns which can not
+ 				 * have changed from the SET clause of the UPDATE statement.
+ 				 */
+ 				if (usedForQual[attnum - 1])
+ 					continue;
+ 				usedForQual[attnum - 1] = true;
+ 
+ 				/*
+ 				 * Actually add the qual, ANDed with any others.
+ 				 */
+ 				if (foundUniqueIndex)
+ 					appendStringInfoString(&querybuf, " AND ");
+ 
+ 				colname = quote_identifier(NameStr((tupdesc->attrs[attnum - 1])->attname));
+ 				appendStringInfo(&querybuf, "y.%s ", colname);
+ 				type = attnumTypeId(matviewRel, attnum);
+ 				op = lookup_type_cache(type, TYPECACHE_EQ_OPR)->eq_opr;
+ 				mv_GenerateOper(&querybuf, op);
+ 				appendStringInfo(&querybuf, " x.%s", colname);
+ 
+ 				foundUniqueIndex = true;
+ 			}
+ 		}
+ 		ReleaseSysCache(indexTuple);
+ 	}
+ 
+ 	list_free(indexoidlist);
+ 
+ 	if (!foundUniqueIndex)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ 				 errmsg("cannot refresh materialized view \"%s\" concurrently",
+ 						matviewname),
+ 				 errhint("Create a UNIQUE index with no WHERE clause on one or more columns of the materialized view.")));
+ 
+ 	appendStringInfoString(&querybuf,
+ 						   " AND y = x) WHERE (y.*) IS DISTINCT FROM (x.*)"
+ 						   " ORDER BY tid");
+ 
+ 	/* Create the temporary "diff" table. */
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/*
+ 	 * We have no further use for data from the "full-data" temp table, but we
+ 	 * must keep it around because its type is reference from the diff table.
+ 	 */
+ 
+ 	/* Analyze the diff table. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "ANALYZE %s", diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	OpenMatViewIncrementalMaintenance();
+ 
+ 	/* Deletes must come before inserts; do them first. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "DELETE FROM %s WHERE ctid IN "
+ 					 "(SELECT d.tid FROM %s d "
+ 					 "WHERE d.tid IS NOT NULL "
+ 					 "AND (d.y) IS NOT DISTINCT FROM NULL)",
+ 					 matviewname, diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_DELETE)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Updates before inserts gives a better chance at HOT updates. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "UPDATE %s x SET ", matviewname);
+ 
+ 	{
+ 		int			i;
+ 		bool		targetColFound = false;
+ 
+ 		for (i = 0; i < tupdesc->natts; i++)
+ 		{
+ 			const char	   *colname;
+ 
+ 			if (tupdesc->attrs[i]->attisdropped)
+ 				continue;
+ 
+ 			if (usedForQual[i])
+ 				continue;
+ 
+ 			if (targetColFound)
+ 				appendStringInfoString(&querybuf, ", ");
+ 			targetColFound = true;
+ 
+ 			colname = quote_identifier(NameStr((tupdesc->attrs[i])->attname));
+ 			appendStringInfo(&querybuf, "%s = (d.y).%s", colname, colname);
+ 		}
+ 
+ 		if (targetColFound)
+ 		{
+ 			appendStringInfo(&querybuf,
+ 							  " FROM %s d "
+ 							  "WHERE d.tid IS NOT NULL AND x.ctid = d.tid",
+ 							  diffname);
+ 
+ 			if (SPI_exec(querybuf.data, 0) != SPI_OK_UPDATE)
+ 				elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 		}
+ 	}
+ 
+ 	/* Inserts go last. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "INSERT INTO %s SELECT (y).* FROM %s WHERE tid IS NULL",
+ 					 matviewname, diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_INSERT)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* We're done maintaining the materialized view. */
+ 	CloseMatViewIncrementalMaintenance();
+ 	heap_close(tempRel, NoLock);
+ 	heap_close(matviewRel, NoLock);
+ 
+ 	/* Clean up temp tables. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "DROP TABLE %s, %s", diffname, tempname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Close SPI context. */
+ 	if (SPI_finish() != SPI_OK_FINISH)
+ 		elog(ERROR, "SPI_finish failed");
+ }
+ 
+ /*
+  * Swap the physical files of the target and transient tables, then rebuild
+  * the target's indexes and throw away the transient table.
+  */
+ static void
+ refresh_by_heap_swap(Oid matviewOid, Oid OIDNewHeap)
+ {
+ 	finish_heap_swap(matviewOid, OIDNewHeap, false, false, true, true,
+ 					 RecentXmin, ReadNextMultiXactId());
+ 
+ 	RelationCacheInvalidateEntry(matviewOid);
+ }
+ 
+ static void
+ OpenMatViewIncrementalMaintenance(void)
+ {
+ 	matview_maintenance_depth++;
+ }
+ 
+ static void
+ CloseMatViewIncrementalMaintenance(void)
+ {
+ 	matview_maintenance_depth--;
+ 	Assert(matview_maintenance_depth >= 0);
+ }
+ 
+ /*
+  * This should be used to test whether the backend is in a context where it is
+  * OK to allow DML statements to modify materialized views.  We only want to
+  * allow that for internal code driven by the materialized view definition,
+  * not for arbitrary user-supplied code.
+  */
+ bool
+ MatViewIncrementalMaintenanceIsEnabled(void)
+ {
+ 	return matview_maintenance_depth > 0;
+ }
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***************
*** 3541,3547 **** ATRewriteTables(List **wqueue, LOCKMODE lockmode)
  			heap_close(OldHeap, NoLock);
  
  			/* Create transient table that will receive the modified data */
! 			OIDNewHeap = make_new_heap(tab->relid, NewTableSpace);
  
  			/*
  			 * Copy the heap data into the new table with the desired
--- 3541,3548 ----
  			heap_close(OldHeap, NoLock);
  
  			/* Create transient table that will receive the modified data */
! 			OIDNewHeap = make_new_heap(tab->relid, NewTableSpace, false,
! 									   AccessExclusiveLock);
  
  			/*
  			 * Copy the heap data into the new table with the desired
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
***************
*** 42,47 ****
--- 42,48 ----
  #include "access/transam.h"
  #include "access/xact.h"
  #include "catalog/namespace.h"
+ #include "commands/matview.h"
  #include "commands/trigger.h"
  #include "executor/execdebug.h"
  #include "foreign/fdwapi.h"
***************
*** 999,1008 **** CheckValidResultRel(Relation resultRel, CmdType operation)
  			}
  			break;
  		case RELKIND_MATVIEW:
! 			ereport(ERROR,
! 					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
! 					 errmsg("cannot change materialized view \"%s\"",
! 							RelationGetRelationName(resultRel))));
  			break;
  		case RELKIND_FOREIGN_TABLE:
  			/* Okay only if the FDW supports it */
--- 1000,1010 ----
  			}
  			break;
  		case RELKIND_MATVIEW:
! 			if (!MatViewIncrementalMaintenanceIsEnabled())
! 				ereport(ERROR,
! 						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
! 						 errmsg("cannot change materialized view \"%s\"",
! 								RelationGetRelationName(resultRel))));
  			break;
  		case RELKIND_FOREIGN_TABLE:
  			/* Okay only if the FDW supports it */
*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
***************
*** 950,956 **** ExecModifyTable(ModifyTableState *node)
  				bool		isNull;
  
  				relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 				if (relkind == RELKIND_RELATION)
  				{
  					datum = ExecGetJunkAttribute(slot,
  												 junkfilter->jf_junkAttNo,
--- 950,956 ----
  				bool		isNull;
  
  				relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 				if (relkind == RELKIND_RELATION || relkind == RELKIND_MATVIEW)
  				{
  					datum = ExecGetJunkAttribute(slot,
  												 junkfilter->jf_junkAttNo,
***************
*** 1280,1286 **** ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
  					char		relkind;
  
  					relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 					if (relkind == RELKIND_RELATION)
  					{
  						j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
  						if (!AttributeNumberIsValid(j->jf_junkAttNo))
--- 1280,1287 ----
  					char		relkind;
  
  					relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 					if (relkind == RELKIND_RELATION ||
! 						relkind == RELKIND_MATVIEW)
  					{
  						j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
  						if (!AttributeNumberIsValid(j->jf_junkAttNo))
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 3241,3246 **** _copyRefreshMatViewStmt(const RefreshMatViewStmt *from)
--- 3241,3247 ----
  {
  	RefreshMatViewStmt *newnode = makeNode(RefreshMatViewStmt);
  
+ 	COPY_SCALAR_FIELD(concurrent);
  	COPY_SCALAR_FIELD(skipData);
  	COPY_NODE_FIELD(relation);
  
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 1521,1526 **** _equalCreateTableAsStmt(const CreateTableAsStmt *a, const CreateTableAsStmt *b)
--- 1521,1527 ----
  static bool
  _equalRefreshMatViewStmt(const RefreshMatViewStmt *a, const RefreshMatViewStmt *b)
  {
+ 	COMPARE_SCALAR_FIELD(concurrent);
  	COMPARE_SCALAR_FIELD(skipData);
  	COMPARE_NODE_FIELD(relation);
  
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 3301,3311 **** OptNoLog:	UNLOGGED					{ $$ = RELPERSISTENCE_UNLOGGED; }
   *****************************************************************************/
  
  RefreshMatViewStmt:
! 			REFRESH MATERIALIZED VIEW qualified_name opt_with_data
  				{
  					RefreshMatViewStmt *n = makeNode(RefreshMatViewStmt);
! 					n->relation = $4;
! 					n->skipData = !($5);
  					$$ = (Node *) n;
  				}
  		;
--- 3301,3312 ----
   *****************************************************************************/
  
  RefreshMatViewStmt:
! 			REFRESH MATERIALIZED VIEW opt_concurrently qualified_name opt_with_data
  				{
  					RefreshMatViewStmt *n = makeNode(RefreshMatViewStmt);
! 					n->concurrent = $4;
! 					n->relation = $5;
! 					n->skipData = !($6);
  					$$ = (Node *) n;
  				}
  		;
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
***************
*** 2871,2876 **** psql_completion(char *text, int start, int end)
--- 2871,2882 ----
  	else if (pg_strcasecmp(prev3_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev2_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev_wd, "VIEW") == 0)
+ 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_matviews,
+ 								   " UNION SELECT 'CONCURRENTLY'");
+ 	else if (pg_strcasecmp(prev4_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
  		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_matviews, NULL);
  	else if (pg_strcasecmp(prev4_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev3_wd, "MATERIALIZED") == 0 &&
***************
*** 2879,2884 **** psql_completion(char *text, int start, int end)
--- 2885,2895 ----
  	else if (pg_strcasecmp(prev5_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev4_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev3_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "CONCURRENTLY") == 0)
+ 		COMPLETE_WITH_CONST("WITH DATA");
+ 	else if (pg_strcasecmp(prev5_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "VIEW") == 0 &&
  			 pg_strcasecmp(prev_wd, "WITH") == 0)
  	{
  		static const char *const list_WITH_DATA[] =
***************
*** 2889,2894 **** psql_completion(char *text, int start, int end)
--- 2900,2911 ----
  	else if (pg_strcasecmp(prev6_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev5_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev4_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "CONCURRENTLY") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "WITH") == 0)
+ 		COMPLETE_WITH_CONST("DATA");
+ 	else if (pg_strcasecmp(prev6_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev5_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "VIEW") == 0 &&
  			 pg_strcasecmp(prev2_wd, "WITH") == 0 &&
  			 pg_strcasecmp(prev_wd, "NO") == 0)
  		COMPLETE_WITH_CONST("DATA");
*** a/src/include/commands/cluster.h
--- b/src/include/commands/cluster.h
***************
*** 25,31 **** extern void check_index_is_clusterable(Relation OldHeap, Oid indexOid,
  						   bool recheck, LOCKMODE lockmode);
  extern void mark_index_clustered(Relation rel, Oid indexOid, bool is_internal);
  
! extern Oid	make_new_heap(Oid OIDOldHeap, Oid NewTableSpace);
  extern void finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
  				 bool is_system_catalog,
  				 bool swap_toast_by_content,
--- 25,32 ----
  						   bool recheck, LOCKMODE lockmode);
  extern void mark_index_clustered(Relation rel, Oid indexOid, bool is_internal);
  
! extern Oid make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
! 			  LOCKMODE lockmode);
  extern void finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
  				 bool is_system_catalog,
  				 bool swap_toast_by_content,
*** a/src/include/commands/matview.h
--- b/src/include/commands/matview.h
***************
*** 27,30 **** extern void ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString
--- 27,32 ----
  
  extern DestReceiver *CreateTransientRelDestReceiver(Oid oid);
  
+ extern bool MatViewIncrementalMaintenanceIsEnabled(void);
+ 
  #endif   /* MATVIEW_H */
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 2478,2483 **** typedef struct CreateTableAsStmt
--- 2478,2484 ----
  typedef struct RefreshMatViewStmt
  {
  	NodeTag		type;
+ 	bool		concurrent;		/* allow concurrent access? */
  	bool		skipData;		/* true for WITH NO DATA */
  	RangeVar   *relation;		/* relation to insert into */
  } RefreshMatViewStmt;
*** a/src/test/regress/expected/matview.out
--- b/src/test/regress/expected/matview.out
***************
*** 73,78 **** SELECT * FROM tvm;
--- 73,80 ----
  
  CREATE MATERIALIZED VIEW tmm AS SELECT sum(totamt) AS grandtot FROM tm;
  CREATE MATERIALIZED VIEW tvmm AS SELECT sum(totamt) AS grandtot FROM tvm;
+ CREATE UNIQUE INDEX tvmm_expr ON tvmm ((grandtot > 0));
+ CREATE UNIQUE INDEX tvmm_pred ON tvmm (grandtot) WHERE grandtot < 0;
  CREATE VIEW tvv AS SELECT sum(totamt) AS grandtot FROM tv;
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tvvm AS SELECT * FROM tvv;
***************
*** 141,146 **** ALTER MATERIALIZED VIEW tvm SET SCHEMA mvschema;
--- 143,151 ----
    Column  |  Type   | Modifiers | Storage | Stats target | Description 
  ----------+---------+-----------+---------+--------------+-------------
   grandtot | numeric |           | main    |              | 
+ Indexes:
+     "tvmm_expr" UNIQUE, btree ((grandtot > 0::numeric))
+     "tvmm_pred" UNIQUE, btree (grandtot) WHERE grandtot < 0::numeric
  View definition:
   SELECT sum(tvm.totamt) AS grandtot
     FROM mvschema.tvm;
***************
*** 177,183 **** SELECT * FROM tvm ORDER BY type;
   z    |     11
  (3 rows)
  
! REFRESH MATERIALIZED VIEW tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
   type | totamt 
--- 182,188 ----
   z    |     11
  (3 rows)
  
! REFRESH MATERIALIZED VIEW CONCURRENTLY tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
   type | totamt 
***************
*** 237,242 **** SELECT * FROM tvvm;
--- 242,250 ----
  (1 row)
  
  REFRESH MATERIALIZED VIEW tmm;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm;
+ ERROR:  cannot refresh materialized view "public.tvmm" concurrently
+ HINT:  Create a UNIQUE index with no WHERE clause on one or more columns of the materialized view.
  REFRESH MATERIALIZED VIEW tvmm;
  REFRESH MATERIALIZED VIEW tvvm;
  EXPLAIN (costs off)
***************
*** 281,286 **** SELECT * FROM tvvm;
--- 289,297 ----
  -- test diemv when the mv does not exist
  DROP MATERIALIZED VIEW IF EXISTS no_such_mv;
  NOTICE:  materialized view "no_such_mv" does not exist, skipping
+ -- make sure invalid comination of options is prohibited
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm WITH NO DATA;
+ ERROR:  CONCURRENTLY and WITH NO DATA options cannot be used together
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
   type | mtot | vtot 
***************
*** 385,387 **** SELECT * FROM hogeview WHERE i < 10;
--- 396,423 ----
  
  DROP TABLE hoge CASCADE;
  NOTICE:  drop cascades to materialized view hogeview
+ -- test that duplicate values on unique index prevent refresh
+ CREATE TABLE foo(a, b) AS VALUES(1, 10);
+ CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo;
+ CREATE UNIQUE INDEX ON mv(a);
+ INSERT INTO foo SELECT * FROM foo;
+ REFRESH MATERIALIZED VIEW mv;
+ ERROR:  could not create unique index "mv_a_idx"
+ DETAIL:  Key (a)=(1) is duplicated.
+ REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ ERROR:  new data for "mv" contains duplicate rows without any NULL columns
+ DETAIL:  Row: (1,10)
+ DROP TABLE foo CASCADE;
+ NOTICE:  drop cascades to materialized view mv
+ -- make sure that all indexes covered by unique indexes works
+ CREATE TABLE foo(a, b, c) AS VALUES(1, 2, 3);
+ CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo;
+ CREATE UNIQUE INDEX ON mv (a);
+ CREATE UNIQUE INDEX ON mv (b);
+ CREATE UNIQUE INDEX on mv (c);
+ INSERT INTO foo VALUES(2, 3, 4);
+ INSERT INTO foo VALUES(3, 4, 5);
+ REFRESH MATERIALIZED VIEW mv;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ DROP TABLE foo CASCADE;
+ NOTICE:  drop cascades to materialized view mv
*** a/src/test/regress/sql/matview.sql
--- b/src/test/regress/sql/matview.sql
***************
*** 29,34 **** CREATE MATERIALIZED VIEW tvm AS SELECT * FROM tv ORDER BY type;
--- 29,36 ----
  SELECT * FROM tvm;
  CREATE MATERIALIZED VIEW tmm AS SELECT sum(totamt) AS grandtot FROM tm;
  CREATE MATERIALIZED VIEW tvmm AS SELECT sum(totamt) AS grandtot FROM tvm;
+ CREATE UNIQUE INDEX tvmm_expr ON tvmm ((grandtot > 0));
+ CREATE UNIQUE INDEX tvmm_pred ON tvmm (grandtot) WHERE grandtot < 0;
  CREATE VIEW tvv AS SELECT sum(totamt) AS grandtot FROM tv;
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tvvm AS SELECT * FROM tvv;
***************
*** 57,63 **** INSERT INTO t VALUES (6, 'z', 13);
  -- confirm pre- and post-refresh contents of fairly simple materialized views
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
! REFRESH MATERIALIZED VIEW tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
--- 59,65 ----
  -- confirm pre- and post-refresh contents of fairly simple materialized views
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
! REFRESH MATERIALIZED VIEW CONCURRENTLY tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
***************
*** 74,79 **** SELECT * FROM tmm;
--- 76,82 ----
  SELECT * FROM tvmm;
  SELECT * FROM tvvm;
  REFRESH MATERIALIZED VIEW tmm;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm;
  REFRESH MATERIALIZED VIEW tvmm;
  REFRESH MATERIALIZED VIEW tvvm;
  EXPLAIN (costs off)
***************
*** 89,94 **** SELECT * FROM tvvm;
--- 92,100 ----
  -- test diemv when the mv does not exist
  DROP MATERIALIZED VIEW IF EXISTS no_such_mv;
  
+ -- make sure invalid comination of options is prohibited
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm WITH NO DATA;
+ 
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
  
***************
*** 124,126 **** SELECT * FROM hogeview WHERE i < 10;
--- 130,153 ----
  VACUUM ANALYZE;
  SELECT * FROM hogeview WHERE i < 10;
  DROP TABLE hoge CASCADE;
+ 
+ -- test that duplicate values on unique index prevent refresh
+ CREATE TABLE foo(a, b) AS VALUES(1, 10);
+ CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo;
+ CREATE UNIQUE INDEX ON mv(a);
+ INSERT INTO foo SELECT * FROM foo;
+ REFRESH MATERIALIZED VIEW mv;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ DROP TABLE foo CASCADE;
+ 
+ -- make sure that all indexes covered by unique indexes works
+ CREATE TABLE foo(a, b, c) AS VALUES(1, 2, 3);
+ CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo;
+ CREATE UNIQUE INDEX ON mv (a);
+ CREATE UNIQUE INDEX ON mv (b);
+ CREATE UNIQUE INDEX on mv (c);
+ INSERT INTO foo VALUES(2, 3, 4);
+ INSERT INTO foo VALUES(3, 4, 5);
+ REFRESH MATERIALIZED VIEW mv;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ DROP TABLE foo CASCADE;
#41Hitoshi Harada
umi.tanuki@gmail.com
In reply to: Kevin Grittner (#40)
Re: refresh materialized view concurrently

On Tue, Jul 9, 2013 at 12:50 PM, Kevin Grittner <kgrittn@ymail.com> wrote:

Thanks again! New patch attached.

After a couple of more attempts trying to break it, I mark this as
ready to go. One small question: why do we use multiple unique
indexes if exist? One index isn't enough?

--
Hitoshi Harada

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#42Kevin Grittner
kgrittn@ymail.com
In reply to: Hitoshi Harada (#41)
Re: refresh materialized view concurrently

Hitoshi Harada <umi.tanuki@gmail.com> wrote:

After a couple of more attempts trying to break it, I mark this
as ready to go.

Thanks.

One small question:  why do we use multiple unique indexes if
exist?  

Two reasons.

(1)  By only matching up rows which test as equal on all columns
used in primary keys, we can use UPDATE for the matches rather than
DELETE and INSERT without fear of hitting a transient duplicate key
error on one of the indexes.  Since any update to an indexed column
prevents a HOT update, and results in the equivalent of a DELETE
and an INSERT anyway, there's no performance loss from letting them
not match if *any* column which is part of *any* unique index
doesn't match.

(2)  The planner can use one of the unique indexes to optimize the
diff phase.  How would we pick the one which is fastest?  By
comparing for equality on all the columns used in all unique
indexes, we let the planner decide.  I see no reason to try to
second-guess it; just let it do it's thing.

One index isn't enough?

It would be enough for correctness, yes.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#43Kevin Grittner
kgrittn@ymail.com
In reply to: Hitoshi Harada (#41)
1 attachment(s)
Re: refresh materialized view concurrently

Hitoshi Harada <umi.tanuki@gmail.com> wrote:

After a couple of more attempts trying to break it, I mark this
as ready to go.

Pushed, after merging with Noah's security patch.  Merged version
attached, but it was fairly mechanical and continued to pass all
tests.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

refresh-concurrently-v5.patchtext/x-diff; name=refresh-concurrently-v5.patchDownload
*** a/doc/src/sgml/mvcc.sgml
--- b/doc/src/sgml/mvcc.sgml
***************
*** 928,935 **** ERROR:  could not serialize access due to read/write dependencies among transact
          </para>
  
          <para>
!          This lock mode is not automatically acquired on tables by any
!          <productname>PostgreSQL</productname> command.
          </para>
         </listitem>
        </varlistentry>
--- 928,934 ----
          </para>
  
          <para>
!          Acquired by <command>REFRESH MATERIALIZED VIEW CONCURRENTLY</command>.
          </para>
         </listitem>
        </varlistentry>
*** a/doc/src/sgml/ref/refresh_materialized_view.sgml
--- b/doc/src/sgml/ref/refresh_materialized_view.sgml
***************
*** 21,27 **** PostgreSQL documentation
  
   <refsynopsisdiv>
  <synopsis>
! REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
      [ WITH [ NO ] DATA ]
  </synopsis>
   </refsynopsisdiv>
--- 21,27 ----
  
   <refsynopsisdiv>
  <synopsis>
! REFRESH MATERIALIZED VIEW [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
      [ WITH [ NO ] DATA ]
  </synopsis>
   </refsynopsisdiv>
***************
*** 38,43 **** REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
--- 38,47 ----
     data is generated and the materialized view is left in an unscannable
     state.
    </para>
+   <para>
+    <literal>CONCURRENTLY</literal> and <literal>WITH NO DATA</literal> may not
+    be specified together.
+   </para>
   </refsect1>
  
   <refsect1>
***************
*** 45,50 **** REFRESH MATERIALIZED VIEW <replaceable class="PARAMETER">name</replaceable>
--- 49,82 ----
  
    <variablelist>
     <varlistentry>
+     <term><literal>CONCURRENTLY</literal></term>
+     <listitem>
+      <para>
+       Refresh the materialized view without locking out concurrent selects on
+       the materialized view.  Without this option a refresh which affects a
+       lot of rows will tend to use fewer resources and complete more quickly,
+       but could block other connections which are trying to read from the
+       materialized view.  This option may be faster in cases where a small
+       number of rows are affected.
+      </para>
+      <para>
+       This option is only allowed if there is at least one
+       <literal>UNIQUE</literal> index on the materialized view which uses only
+       column names and includes all rows; that is, it must not index on any
+       expressions nor include a <literal>WHERE</literal> clause.
+      </para>
+      <para>
+       This option may not be used when the materialized view is not already
+       populated.
+      </para>
+      <para>
+       Even with this option only one <literal>REFRESH</literal> at a time may
+       run against any one materialized view.
+      </para>
+     </listitem>
+    </varlistentry>
+ 
+    <varlistentry>
      <term><replaceable class="PARAMETER">name</replaceable></term>
      <listitem>
       <para>
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
***************
*** 589,595 **** rebuild_relation(Relation OldHeap, Oid indexOid,
  	heap_close(OldHeap, NoLock);
  
  	/* Create the transient table that will receive the re-ordered data */
! 	OIDNewHeap = make_new_heap(tableOid, tableSpace);
  
  	/* Copy the heap data into the new table in the desired order */
  	copy_heap_data(OIDNewHeap, tableOid, indexOid,
--- 589,596 ----
  	heap_close(OldHeap, NoLock);
  
  	/* Create the transient table that will receive the re-ordered data */
! 	OIDNewHeap = make_new_heap(tableOid, tableSpace, false,
! 							   AccessExclusiveLock);
  
  	/* Copy the heap data into the new table in the desired order */
  	copy_heap_data(OIDNewHeap, tableOid, indexOid,
***************
*** 616,622 **** rebuild_relation(Relation OldHeap, Oid indexOid,
   * data, then call finish_heap_swap to complete the operation.
   */
  Oid
! make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  {
  	TupleDesc	OldHeapDesc;
  	char		NewHeapName[NAMEDATALEN];
--- 617,624 ----
   * data, then call finish_heap_swap to complete the operation.
   */
  Oid
! make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
! 			  LOCKMODE lockmode)
  {
  	TupleDesc	OldHeapDesc;
  	char		NewHeapName[NAMEDATALEN];
***************
*** 626,633 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  	HeapTuple	tuple;
  	Datum		reloptions;
  	bool		isNull;
  
! 	OldHeap = heap_open(OIDOldHeap, AccessExclusiveLock);
  	OldHeapDesc = RelationGetDescr(OldHeap);
  
  	/*
--- 628,637 ----
  	HeapTuple	tuple;
  	Datum		reloptions;
  	bool		isNull;
+ 	Oid			namespaceid;
+ 	char		relpersistence;
  
! 	OldHeap = heap_open(OIDOldHeap, lockmode);
  	OldHeapDesc = RelationGetDescr(OldHeap);
  
  	/*
***************
*** 648,653 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
--- 652,668 ----
  	if (isNull)
  		reloptions = (Datum) 0;
  
+ 	if (forcetemp)
+ 	{
+ 		namespaceid = LookupCreationNamespace("pg_temp");
+ 		relpersistence = RELPERSISTENCE_TEMP;
+ 	}
+ 	else
+ 	{
+ 		namespaceid = RelationGetNamespace(OldHeap);
+ 		relpersistence = OldHeap->rd_rel->relpersistence;
+ 	}
+ 
  	/*
  	 * Create the new heap, using a temporary name in the same namespace as
  	 * the existing table.	NOTE: there is some risk of collision with user
***************
*** 663,669 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  	snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap);
  
  	OIDNewHeap = heap_create_with_catalog(NewHeapName,
! 										  RelationGetNamespace(OldHeap),
  										  NewTableSpace,
  										  InvalidOid,
  										  InvalidOid,
--- 678,684 ----
  	snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap);
  
  	OIDNewHeap = heap_create_with_catalog(NewHeapName,
! 										  namespaceid,
  										  NewTableSpace,
  										  InvalidOid,
  										  InvalidOid,
***************
*** 671,678 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
  										  OldHeap->rd_rel->relowner,
  										  OldHeapDesc,
  										  NIL,
! 										  OldHeap->rd_rel->relkind,
! 										  OldHeap->rd_rel->relpersistence,
  										  false,
  										  RelationIsMapped(OldHeap),
  										  true,
--- 686,693 ----
  										  OldHeap->rd_rel->relowner,
  										  OldHeapDesc,
  										  NIL,
! 										  RELKIND_RELATION,
! 										  relpersistence,
  										  false,
  										  RelationIsMapped(OldHeap),
  										  true,
*** a/src/backend/commands/matview.c
--- b/src/backend/commands/matview.c
***************
*** 20,36 ****
--- 20,43 ----
  #include "catalog/catalog.h"
  #include "catalog/indexing.h"
  #include "catalog/namespace.h"
+ #include "catalog/pg_operator.h"
  #include "commands/cluster.h"
  #include "commands/matview.h"
  #include "commands/tablecmds.h"
+ #include "commands/tablespace.h"
  #include "executor/executor.h"
+ #include "executor/spi.h"
  #include "miscadmin.h"
+ #include "parser/parse_relation.h"
  #include "rewrite/rewriteHandler.h"
  #include "storage/smgr.h"
  #include "tcop/tcopprot.h"
+ #include "utils/builtins.h"
+ #include "utils/lsyscache.h"
  #include "utils/rel.h"
  #include "utils/snapmgr.h"
  #include "utils/syscache.h"
+ #include "utils/typcache.h"
  
  
  typedef struct
***************
*** 44,55 **** typedef struct
  	BulkInsertState bistate;	/* bulk insert state */
  } DR_transientrel;
  
  static void transientrel_startup(DestReceiver *self, int operation, TupleDesc typeinfo);
  static void transientrel_receive(TupleTableSlot *slot, DestReceiver *self);
  static void transientrel_shutdown(DestReceiver *self);
  static void transientrel_destroy(DestReceiver *self);
  static void refresh_matview_datafill(DestReceiver *dest, Query *query,
! 						 const char *queryString);
  
  /*
   * SetMatViewPopulatedState
--- 51,73 ----
  	BulkInsertState bistate;	/* bulk insert state */
  } DR_transientrel;
  
+ static int	matview_maintenance_depth = 0;
+ 
  static void transientrel_startup(DestReceiver *self, int operation, TupleDesc typeinfo);
  static void transientrel_receive(TupleTableSlot *slot, DestReceiver *self);
  static void transientrel_shutdown(DestReceiver *self);
  static void transientrel_destroy(DestReceiver *self);
  static void refresh_matview_datafill(DestReceiver *dest, Query *query,
! 						 const char *queryString, Oid relowner);
! 
! static char *make_temptable_name_n(char *tempname, int n);
! static void mv_GenerateOper(StringInfo buf, Oid opoid);
! 
! static void refresh_by_match_merge(Oid matviewOid, Oid tempOid);
! static void refresh_by_heap_swap(Oid matviewOid, Oid OIDNewHeap);
! 
! static void OpenMatViewIncrementalMaintenance(void);
! static void CloseMatViewIncrementalMaintenance(void);
  
  /*
   * SetMatViewPopulatedState
***************
*** 122,139 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  	RewriteRule *rule;
  	List	   *actions;
  	Query	   *dataQuery;
- 	Oid			save_userid;
- 	int			save_sec_context;
- 	int			save_nestlevel;
  	Oid			tableSpace;
  	Oid			OIDNewHeap;
  	DestReceiver *dest;
  
  	/*
  	 * Get a lock until end of transaction.
  	 */
  	matviewOid = RangeVarGetRelidExtended(stmt->relation,
! 										  AccessExclusiveLock, false, false,
  										  RangeVarCallbackOwnsTable, NULL);
  	matviewRel = heap_open(matviewOid, NoLock);
  
--- 140,160 ----
  	RewriteRule *rule;
  	List	   *actions;
  	Query	   *dataQuery;
  	Oid			tableSpace;
  	Oid			OIDNewHeap;
  	DestReceiver *dest;
+ 	bool		concurrent;
+ 	LOCKMODE	lockmode;
+ 
+ 	/* Determine strength of lock needed. */
+ 	concurrent = stmt->concurrent;
+ 	lockmode = concurrent ? ExclusiveLock : AccessExclusiveLock;
  
  	/*
  	 * Get a lock until end of transaction.
  	 */
  	matviewOid = RangeVarGetRelidExtended(stmt->relation,
! 										  lockmode, false, false,
  										  RangeVarCallbackOwnsTable, NULL);
  	matviewRel = heap_open(matviewOid, NoLock);
  
***************
*** 144,154 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  				 errmsg("\"%s\" is not a materialized view",
  						RelationGetRelationName(matviewRel))));
  
! 	/*
! 	 * We're not using materialized views in the system catalogs.
! 	 */
  	Assert(!IsSystemRelation(matviewRel));
  
  	Assert(!matviewRel->rd_rel->relhasoids);
  
  	/*
--- 165,186 ----
  				 errmsg("\"%s\" is not a materialized view",
  						RelationGetRelationName(matviewRel))));
  
! 	/* Check that CONCURRENTLY is not specified if not populated. */
! 	if (concurrent && !RelationIsPopulated(matviewRel))
! 		ereport(ERROR,
! 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
! 				 errmsg("CONCURRENTLY cannot be used when the materialized view is not populated")));
! 
! 	/* Check that conflicting options have not been specified. */
! 	if (concurrent && stmt->skipData)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_SYNTAX_ERROR),
! 				 errmsg("CONCURRENTLY and WITH NO DATA options cannot be used together")));
! 
! 	/* We're not using materialized views in the system catalogs. */
  	Assert(!IsSystemRelation(matviewRel));
  
+ 	/* We don't allow an oid column for a materialized view. */
  	Assert(!matviewRel->rd_rel->relhasoids);
  
  	/*
***************
*** 195,241 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  	CheckTableNotInUse(matviewRel, "REFRESH MATERIALIZED VIEW");
  
  	/*
- 	 * Switch to the owner's userid, so that any functions are run as that
- 	 * user.  Also lock down security-restricted operations and arrange to
- 	 * make GUC variable changes local to this command.
- 	 */
- 	GetUserIdAndSecContext(&save_userid, &save_sec_context);
- 	SetUserIdAndSecContext(matviewRel->rd_rel->relowner,
- 						   save_sec_context | SECURITY_RESTRICTED_OPERATION);
- 	save_nestlevel = NewGUCNestLevel();
- 
- 	/*
  	 * Tentatively mark the matview as populated or not (this will roll back
  	 * if we fail later).
  	 */
  	SetMatViewPopulatedState(matviewRel, !stmt->skipData);
  
! 	tableSpace = matviewRel->rd_rel->reltablespace;
  
  	heap_close(matviewRel, NoLock);
  
  	/* Create the transient table that will receive the regenerated data. */
! 	OIDNewHeap = make_new_heap(matviewOid, tableSpace);
  	dest = CreateTransientRelDestReceiver(OIDNewHeap);
  
  	/* Generate the data, if wanted. */
  	if (!stmt->skipData)
! 		refresh_matview_datafill(dest, dataQuery, queryString);
! 
! 	/*
! 	 * Swap the physical files of the target and transient tables, then
! 	 * rebuild the target's indexes and throw away the transient table.
! 	 */
! 	finish_heap_swap(matviewOid, OIDNewHeap, false, false, true, true,
! 					 RecentXmin, ReadNextMultiXactId());
! 
! 	RelationCacheInvalidateEntry(matviewOid);
! 
! 	/* Roll back any GUC changes */
! 	AtEOXact_GUC(false, save_nestlevel);
! 
! 	/* Restore userid and security context */
! 	SetUserIdAndSecContext(save_userid, save_sec_context);
  }
  
  /*
--- 227,274 ----
  	CheckTableNotInUse(matviewRel, "REFRESH MATERIALIZED VIEW");
  
  	/*
  	 * Tentatively mark the matview as populated or not (this will roll back
  	 * if we fail later).
  	 */
  	SetMatViewPopulatedState(matviewRel, !stmt->skipData);
  
! 	/* Concurrent refresh builds new data in temp tablespace, and does diff. */
! 	if (concurrent)
! 		tableSpace = GetDefaultTablespace(RELPERSISTENCE_TEMP);
! 	else
! 		tableSpace = matviewRel->rd_rel->reltablespace;
  
  	heap_close(matviewRel, NoLock);
  
  	/* Create the transient table that will receive the regenerated data. */
! 	OIDNewHeap = make_new_heap(matviewOid, tableSpace, concurrent,
! 							   ExclusiveLock);
  	dest = CreateTransientRelDestReceiver(OIDNewHeap);
  
  	/* Generate the data, if wanted. */
  	if (!stmt->skipData)
! 		refresh_matview_datafill(dest, dataQuery, queryString,
! 								 matviewRel->rd_rel->relowner);
! 
! 	/* Make the matview match the newly generated data. */
! 	if (concurrent)
! 	{
! 		int			old_depth = matview_maintenance_depth;
! 
! 		PG_TRY();
! 		{
! 			refresh_by_match_merge(matviewOid, OIDNewHeap);
! 		}
! 		PG_CATCH();
! 		{
! 			matview_maintenance_depth = old_depth;
! 			PG_RE_THROW();
! 		}
! 		PG_END_TRY();
! 		Assert(matview_maintenance_depth == old_depth);
! 	}
! 	else
! 		refresh_by_heap_swap(matviewOid, OIDNewHeap);
  }
  
  /*
***************
*** 243,253 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
   */
  static void
  refresh_matview_datafill(DestReceiver *dest, Query *query,
! 						 const char *queryString)
  {
  	List	   *rewritten;
  	PlannedStmt *plan;
  	QueryDesc  *queryDesc;
  
  	/* Rewrite, copying the given Query to make sure it's not changed */
  	rewritten = QueryRewrite((Query *) copyObject(query));
--- 276,299 ----
   */
  static void
  refresh_matview_datafill(DestReceiver *dest, Query *query,
! 						 const char *queryString, Oid relowner)
  {
  	List	   *rewritten;
  	PlannedStmt *plan;
  	QueryDesc  *queryDesc;
+ 	Oid			save_userid;
+ 	int			save_sec_context;
+ 	int			save_nestlevel;
+ 
+ 	/*
+ 	 * Switch to the owner's userid, so that any functions are run as that
+ 	 * user.  Also lock down security-restricted operations and arrange to
+ 	 * make GUC variable changes local to this command.
+ 	 */
+ 	GetUserIdAndSecContext(&save_userid, &save_sec_context);
+ 	SetUserIdAndSecContext(relowner,
+ 						   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+ 	save_nestlevel = NewGUCNestLevel();
  
  	/* Rewrite, copying the given Query to make sure it's not changed */
  	rewritten = QueryRewrite((Query *) copyObject(query));
***************
*** 290,295 **** refresh_matview_datafill(DestReceiver *dest, Query *query,
--- 336,347 ----
  	FreeQueryDesc(queryDesc);
  
  	PopActiveSnapshot();
+ 
+ 	/* Roll back any GUC changes */
+ 	AtEOXact_GUC(false, save_nestlevel);
+ 
+ 	/* Restore userid and security context */
+ 	SetUserIdAndSecContext(save_userid, save_sec_context);
  }
  
  DestReceiver *
***************
*** 388,390 **** transientrel_destroy(DestReceiver *self)
--- 440,840 ----
  {
  	pfree(self);
  }
+ 
+ 
+ /*
+  * Given a qualified temporary table name, append an underscore followed by
+  * the given integer, to make a new table name based on the old one.
+  *
+  * This leaks memory through palloc(), which won't be cleaned up until the
+  * current memory memory context is freed.
+  */
+ static char *
+ make_temptable_name_n(char *tempname, int n)
+ {
+ 	StringInfoData namebuf;
+ 
+ 	initStringInfo(&namebuf);
+ 	appendStringInfoString(&namebuf, tempname);
+ 	appendStringInfo(&namebuf, "_%i", n);
+ 	return namebuf.data;
+ }
+ 
+ static void
+ mv_GenerateOper(StringInfo buf, Oid opoid)
+ {
+ 	HeapTuple	opertup;
+ 	Form_pg_operator operform;
+ 
+ 	opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(opoid));
+ 	if (!HeapTupleIsValid(opertup))
+ 		elog(ERROR, "cache lookup failed for operator %u", opoid);
+ 	operform = (Form_pg_operator) GETSTRUCT(opertup);
+ 	Assert(operform->oprkind == 'b');
+ 
+ 	appendStringInfo(buf, "OPERATOR(%s.%s)",
+ 				quote_identifier(get_namespace_name(operform->oprnamespace)),
+ 					 NameStr(operform->oprname));
+ 
+ 	ReleaseSysCache(opertup);
+ }
+ 
+ /*
+  * refresh_by_match_merge
+  *
+  * Refresh a materialized view with transactional semantics, while allowing
+  * concurrent reads.
+  *
+  * This is called after a new version of the data has been created in a
+  * temporary table.  It performs a full outer join against the old version of
+  * the data, producing "diff" results.	This join cannot work if there are any
+  * duplicated rows in either the old or new versions, in the sense that every
+  * column would compare as equal between the two rows.	It does work correctly
+  * in the face of rows which have at least one NULL value, with all non-NULL
+  * columns equal.  The behavior of NULLs on equality tests and on UNIQUE
+  * indexes turns out to be quite convenient here; the tests we need to make
+  * are consistent with default behavior.  If there is at least one UNIQUE
+  * index on the materialized view, we have exactly the guarantee we need.  By
+  * joining based on equality on all columns which are part of any unique
+  * index, we identify the rows on which we can use UPDATE without any problem.
+  * If any column is NULL in either the old or new version of a row (or both),
+  * we must use DELETE and INSERT, since there could be multiple rows which are
+  * NOT DISTINCT FROM each other, and we could otherwise end up with the wrong
+  * number of occurrences in the updated relation.  The temporary table used to
+  * hold the diff results contains just the TID of the old record (if matched)
+  * and the ROW from the new table as a single column of complex record type
+  * (if matched).
+  *
+  * Once we have the diff table, we perform set-based DELETE, UPDATE, and
+  * INSERT operations against the materialized view, and discard both temporary
+  * tables.
+  *
+  * Everything from the generation of the new data to applying the differences
+  * takes place under cover of an ExclusiveLock, since it seems as though we
+  * would want to prohibit not only concurrent REFRESH operations, but also
+  * incremental maintenance.  It also doesn't seem reasonable or safe to allow
+  * SELECT FOR UPDATE or SELECT FOR SHARE on rows being updated or deleted by
+  * this command.
+  */
+ static void
+ refresh_by_match_merge(Oid matviewOid, Oid tempOid)
+ {
+ 	StringInfoData querybuf;
+ 	Relation	matviewRel;
+ 	Relation	tempRel;
+ 	char	   *matviewname;
+ 	char	   *tempname;
+ 	char	   *diffname;
+ 	TupleDesc	tupdesc;
+ 	bool		foundUniqueIndex;
+ 	List	   *indexoidlist;
+ 	ListCell   *indexoidscan;
+ 	int16		relnatts;
+ 	bool	   *usedForQual;
+ 	Oid			save_userid;
+ 	int			save_sec_context;
+ 	int			save_nestlevel;
+ 
+ 	initStringInfo(&querybuf);
+ 	matviewRel = heap_open(matviewOid, NoLock);
+ 	matviewname = quote_qualified_identifier(get_namespace_name(RelationGetNamespace(matviewRel)),
+ 										RelationGetRelationName(matviewRel));
+ 	tempRel = heap_open(tempOid, NoLock);
+ 	tempname = quote_qualified_identifier(get_namespace_name(RelationGetNamespace(tempRel)),
+ 										  RelationGetRelationName(tempRel));
+ 	diffname = make_temptable_name_n(tempname, 2);
+ 
+ 	relnatts = matviewRel->rd_rel->relnatts;
+ 	usedForQual = (bool *) palloc0(sizeof(bool) * relnatts);
+ 
+ 	/* Open SPI context. */
+ 	if (SPI_connect() != SPI_OK_CONNECT)
+ 		elog(ERROR, "SPI_connect failed");
+ 
+ 	/* Analyze the temp table with the new contents. */
+ 	appendStringInfo(&querybuf, "ANALYZE %s", tempname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/*
+ 	 * We need to ensure that there are not duplicate rows without NULLs in
+ 	 * the new data set before we can count on the "diff" results.	Check for
+ 	 * that in a way that allows showing the first duplicated row found.  Even
+ 	 * after we pass this test, a unique index on the materialized view may
+ 	 * find a duplicate key problem.
+ 	 */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "SELECT x FROM %s x WHERE x IS NOT NULL AND EXISTS "
+ 					 "(SELECT * FROM %s y WHERE y IS NOT NULL "
+ 					 "AND (y.*) = (x.*) AND y.ctid <> x.ctid) LIMIT 1",
+ 					 tempname, tempname);
+ 	if (SPI_execute(querybuf.data, false, 1) != SPI_OK_SELECT)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 	if (SPI_processed > 0)
+ 	{
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_CARDINALITY_VIOLATION),
+ 				 errmsg("new data for \"%s\" contains duplicate rows without any NULL columns",
+ 						RelationGetRelationName(matviewRel)),
+ 				 errdetail("Row: %s",
+ 			SPI_getvalue(SPI_tuptable->vals[0], SPI_tuptable->tupdesc, 1))));
+ 	}
+ 
+ 	/* Start building the query for creating the diff table. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "CREATE TEMP TABLE %s AS "
+ 					 "SELECT x.ctid AS tid, y FROM %s x FULL JOIN %s y ON (",
+ 					 diffname, matviewname, tempname);
+ 
+ 	/*
+ 	 * Get the list of index OIDs for the table from the relcache, and look up
+ 	 * each one in the pg_index syscache.  We will test for equality on all
+ 	 * columns present in all unique indexes which only reference columns and
+ 	 * include all rows.
+ 	 */
+ 	tupdesc = matviewRel->rd_att;
+ 	foundUniqueIndex = false;
+ 	indexoidlist = RelationGetIndexList(matviewRel);
+ 
+ 	foreach(indexoidscan, indexoidlist)
+ 	{
+ 		Oid			indexoid = lfirst_oid(indexoidscan);
+ 		HeapTuple	indexTuple;
+ 		Form_pg_index index;
+ 
+ 		indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexoid));
+ 		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
+ 			elog(ERROR, "cache lookup failed for index %u", indexoid);
+ 		index = (Form_pg_index) GETSTRUCT(indexTuple);
+ 
+ 		/* We're only interested if it is unique and valid. */
+ 		if (index->indisunique && IndexIsValid(index))
+ 		{
+ 			int			numatts = index->indnatts;
+ 			int			i;
+ 			bool		expr = false;
+ 			Relation	indexRel;
+ 
+ 			/* Skip any index on an expression. */
+ 			for (i = 0; i < numatts; i++)
+ 			{
+ 				if (index->indkey.values[i] == 0)
+ 				{
+ 					expr = true;
+ 					break;
+ 				}
+ 			}
+ 			if (expr)
+ 			{
+ 				ReleaseSysCache(indexTuple);
+ 				continue;
+ 			}
+ 
+ 			/* Skip partial indexes. */
+ 			indexRel = index_open(index->indexrelid, RowExclusiveLock);
+ 			if (indexRel->rd_indpred != NIL)
+ 			{
+ 				index_close(indexRel, NoLock);
+ 				ReleaseSysCache(indexTuple);
+ 				continue;
+ 			}
+ 			/* Hold the locks, since we're about to run DML which needs them. */
+ 			index_close(indexRel, NoLock);
+ 
+ 			/* Add quals for all columns from this index. */
+ 			for (i = 0; i < numatts; i++)
+ 			{
+ 				int			attnum = index->indkey.values[i];
+ 				Oid			type;
+ 				Oid			op;
+ 				const char *colname;
+ 
+ 				/*
+ 				 * Only include the column once regardless of how many times
+ 				 * it shows up in how many indexes.
+ 				 *
+ 				 * This is also useful later to omit columns which can not
+ 				 * have changed from the SET clause of the UPDATE statement.
+ 				 */
+ 				if (usedForQual[attnum - 1])
+ 					continue;
+ 				usedForQual[attnum - 1] = true;
+ 
+ 				/*
+ 				 * Actually add the qual, ANDed with any others.
+ 				 */
+ 				if (foundUniqueIndex)
+ 					appendStringInfoString(&querybuf, " AND ");
+ 
+ 				colname = quote_identifier(NameStr((tupdesc->attrs[attnum - 1])->attname));
+ 				appendStringInfo(&querybuf, "y.%s ", colname);
+ 				type = attnumTypeId(matviewRel, attnum);
+ 				op = lookup_type_cache(type, TYPECACHE_EQ_OPR)->eq_opr;
+ 				mv_GenerateOper(&querybuf, op);
+ 				appendStringInfo(&querybuf, " x.%s", colname);
+ 
+ 				foundUniqueIndex = true;
+ 			}
+ 		}
+ 		ReleaseSysCache(indexTuple);
+ 	}
+ 
+ 	list_free(indexoidlist);
+ 
+ 	if (!foundUniqueIndex)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ 			   errmsg("cannot refresh materialized view \"%s\" concurrently",
+ 					  matviewname),
+ 				 errhint("Create a UNIQUE index with no WHERE clause on one or more columns of the materialized view.")));
+ 
+ 	appendStringInfoString(&querybuf,
+ 						   " AND y = x) WHERE (y.*) IS DISTINCT FROM (x.*)"
+ 						   " ORDER BY tid");
+ 
+ 	/* Create the temporary "diff" table. */
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/*
+ 	 * We have no further use for data from the "full-data" temp table, but we
+ 	 * must keep it around because its type is reference from the diff table.
+ 	 */
+ 
+ 	/* Analyze the diff table. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "ANALYZE %s", diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	OpenMatViewIncrementalMaintenance();
+ 
+ 	/*
+ 	 * Switch to the owner's userid, so that any functions are run as that
+ 	 * user.  Also lock down security-restricted operations and arrange to
+ 	 * make GUC variable changes local to this command.
+ 	 */
+ 	GetUserIdAndSecContext(&save_userid, &save_sec_context);
+ 	SetUserIdAndSecContext(matviewRel->rd_rel->relowner,
+ 						   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+ 	save_nestlevel = NewGUCNestLevel();
+ 
+ 	/* Deletes must come before inserts; do them first. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "DELETE FROM %s WHERE ctid IN "
+ 					 "(SELECT d.tid FROM %s d "
+ 					 "WHERE d.tid IS NOT NULL "
+ 					 "AND (d.y) IS NOT DISTINCT FROM NULL)",
+ 					 matviewname, diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_DELETE)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Updates before inserts gives a better chance at HOT updates. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "UPDATE %s x SET ", matviewname);
+ 
+ 	{
+ 		int			i;
+ 		bool		targetColFound = false;
+ 
+ 		for (i = 0; i < tupdesc->natts; i++)
+ 		{
+ 			const char *colname;
+ 
+ 			if (tupdesc->attrs[i]->attisdropped)
+ 				continue;
+ 
+ 			if (usedForQual[i])
+ 				continue;
+ 
+ 			if (targetColFound)
+ 				appendStringInfoString(&querybuf, ", ");
+ 			targetColFound = true;
+ 
+ 			colname = quote_identifier(NameStr((tupdesc->attrs[i])->attname));
+ 			appendStringInfo(&querybuf, "%s = (d.y).%s", colname, colname);
+ 		}
+ 
+ 		if (targetColFound)
+ 		{
+ 			appendStringInfo(&querybuf,
+ 							 " FROM %s d "
+ 							 "WHERE d.tid IS NOT NULL AND x.ctid = d.tid",
+ 							 diffname);
+ 
+ 			if (SPI_exec(querybuf.data, 0) != SPI_OK_UPDATE)
+ 				elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 		}
+ 	}
+ 
+ 	/* Inserts go last. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf,
+ 					 "INSERT INTO %s SELECT (y).* FROM %s WHERE tid IS NULL",
+ 					 matviewname, diffname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_INSERT)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Roll back any GUC changes */
+ 	AtEOXact_GUC(false, save_nestlevel);
+ 
+ 	/* Restore userid and security context */
+ 	SetUserIdAndSecContext(save_userid, save_sec_context);
+ 
+ 	/* We're done maintaining the materialized view. */
+ 	CloseMatViewIncrementalMaintenance();
+ 	heap_close(tempRel, NoLock);
+ 	heap_close(matviewRel, NoLock);
+ 
+ 	/* Clean up temp tables. */
+ 	resetStringInfo(&querybuf);
+ 	appendStringInfo(&querybuf, "DROP TABLE %s, %s", diffname, tempname);
+ 	if (SPI_exec(querybuf.data, 0) != SPI_OK_UTILITY)
+ 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
+ 
+ 	/* Close SPI context. */
+ 	if (SPI_finish() != SPI_OK_FINISH)
+ 		elog(ERROR, "SPI_finish failed");
+ }
+ 
+ /*
+  * Swap the physical files of the target and transient tables, then rebuild
+  * the target's indexes and throw away the transient table.  Security context
+  * swapping is handled by the called function, so it is not needed here.
+  */
+ static void
+ refresh_by_heap_swap(Oid matviewOid, Oid OIDNewHeap)
+ {
+ 	finish_heap_swap(matviewOid, OIDNewHeap, false, false, true, true,
+ 					 RecentXmin, ReadNextMultiXactId());
+ 
+ 	RelationCacheInvalidateEntry(matviewOid);
+ }
+ 
+ static void
+ OpenMatViewIncrementalMaintenance(void)
+ {
+ 	matview_maintenance_depth++;
+ }
+ 
+ static void
+ CloseMatViewIncrementalMaintenance(void)
+ {
+ 	matview_maintenance_depth--;
+ 	Assert(matview_maintenance_depth >= 0);
+ }
+ 
+ /*
+  * This should be used to test whether the backend is in a context where it is
+  * OK to allow DML statements to modify materialized views.  We only want to
+  * allow that for internal code driven by the materialized view definition,
+  * not for arbitrary user-supplied code.
+  */
+ bool
+ MatViewIncrementalMaintenanceIsEnabled(void)
+ {
+ 	return matview_maintenance_depth > 0;
+ }
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***************
*** 3541,3547 **** ATRewriteTables(List **wqueue, LOCKMODE lockmode)
  			heap_close(OldHeap, NoLock);
  
  			/* Create transient table that will receive the modified data */
! 			OIDNewHeap = make_new_heap(tab->relid, NewTableSpace);
  
  			/*
  			 * Copy the heap data into the new table with the desired
--- 3541,3548 ----
  			heap_close(OldHeap, NoLock);
  
  			/* Create transient table that will receive the modified data */
! 			OIDNewHeap = make_new_heap(tab->relid, NewTableSpace, false,
! 									   AccessExclusiveLock);
  
  			/*
  			 * Copy the heap data into the new table with the desired
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
***************
*** 42,47 ****
--- 42,48 ----
  #include "access/transam.h"
  #include "access/xact.h"
  #include "catalog/namespace.h"
+ #include "commands/matview.h"
  #include "commands/trigger.h"
  #include "executor/execdebug.h"
  #include "foreign/fdwapi.h"
***************
*** 999,1008 **** CheckValidResultRel(Relation resultRel, CmdType operation)
  			}
  			break;
  		case RELKIND_MATVIEW:
! 			ereport(ERROR,
! 					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
! 					 errmsg("cannot change materialized view \"%s\"",
! 							RelationGetRelationName(resultRel))));
  			break;
  		case RELKIND_FOREIGN_TABLE:
  			/* Okay only if the FDW supports it */
--- 1000,1010 ----
  			}
  			break;
  		case RELKIND_MATVIEW:
! 			if (!MatViewIncrementalMaintenanceIsEnabled())
! 				ereport(ERROR,
! 						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
! 						 errmsg("cannot change materialized view \"%s\"",
! 								RelationGetRelationName(resultRel))));
  			break;
  		case RELKIND_FOREIGN_TABLE:
  			/* Okay only if the FDW supports it */
*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
***************
*** 950,956 **** ExecModifyTable(ModifyTableState *node)
  				bool		isNull;
  
  				relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 				if (relkind == RELKIND_RELATION)
  				{
  					datum = ExecGetJunkAttribute(slot,
  												 junkfilter->jf_junkAttNo,
--- 950,956 ----
  				bool		isNull;
  
  				relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 				if (relkind == RELKIND_RELATION || relkind == RELKIND_MATVIEW)
  				{
  					datum = ExecGetJunkAttribute(slot,
  												 junkfilter->jf_junkAttNo,
***************
*** 1280,1286 **** ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
  					char		relkind;
  
  					relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 					if (relkind == RELKIND_RELATION)
  					{
  						j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
  						if (!AttributeNumberIsValid(j->jf_junkAttNo))
--- 1280,1287 ----
  					char		relkind;
  
  					relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
! 					if (relkind == RELKIND_RELATION ||
! 						relkind == RELKIND_MATVIEW)
  					{
  						j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
  						if (!AttributeNumberIsValid(j->jf_junkAttNo))
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 3241,3246 **** _copyRefreshMatViewStmt(const RefreshMatViewStmt *from)
--- 3241,3247 ----
  {
  	RefreshMatViewStmt *newnode = makeNode(RefreshMatViewStmt);
  
+ 	COPY_SCALAR_FIELD(concurrent);
  	COPY_SCALAR_FIELD(skipData);
  	COPY_NODE_FIELD(relation);
  
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 1521,1526 **** _equalCreateTableAsStmt(const CreateTableAsStmt *a, const CreateTableAsStmt *b)
--- 1521,1527 ----
  static bool
  _equalRefreshMatViewStmt(const RefreshMatViewStmt *a, const RefreshMatViewStmt *b)
  {
+ 	COMPARE_SCALAR_FIELD(concurrent);
  	COMPARE_SCALAR_FIELD(skipData);
  	COMPARE_NODE_FIELD(relation);
  
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 3301,3311 **** OptNoLog:	UNLOGGED					{ $$ = RELPERSISTENCE_UNLOGGED; }
   *****************************************************************************/
  
  RefreshMatViewStmt:
! 			REFRESH MATERIALIZED VIEW qualified_name opt_with_data
  				{
  					RefreshMatViewStmt *n = makeNode(RefreshMatViewStmt);
! 					n->relation = $4;
! 					n->skipData = !($5);
  					$$ = (Node *) n;
  				}
  		;
--- 3301,3312 ----
   *****************************************************************************/
  
  RefreshMatViewStmt:
! 			REFRESH MATERIALIZED VIEW opt_concurrently qualified_name opt_with_data
  				{
  					RefreshMatViewStmt *n = makeNode(RefreshMatViewStmt);
! 					n->concurrent = $4;
! 					n->relation = $5;
! 					n->skipData = !($6);
  					$$ = (Node *) n;
  				}
  		;
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
***************
*** 2871,2876 **** psql_completion(char *text, int start, int end)
--- 2871,2882 ----
  	else if (pg_strcasecmp(prev3_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev2_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev_wd, "VIEW") == 0)
+ 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_matviews,
+ 								   " UNION SELECT 'CONCURRENTLY'");
+ 	else if (pg_strcasecmp(prev4_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
  		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_matviews, NULL);
  	else if (pg_strcasecmp(prev4_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev3_wd, "MATERIALIZED") == 0 &&
***************
*** 2879,2884 **** psql_completion(char *text, int start, int end)
--- 2885,2895 ----
  	else if (pg_strcasecmp(prev5_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev4_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev3_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "CONCURRENTLY") == 0)
+ 		COMPLETE_WITH_CONST("WITH DATA");
+ 	else if (pg_strcasecmp(prev5_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "VIEW") == 0 &&
  			 pg_strcasecmp(prev_wd, "WITH") == 0)
  	{
  		static const char *const list_WITH_DATA[] =
***************
*** 2889,2894 **** psql_completion(char *text, int start, int end)
--- 2900,2911 ----
  	else if (pg_strcasecmp(prev6_wd, "REFRESH") == 0 &&
  			 pg_strcasecmp(prev5_wd, "MATERIALIZED") == 0 &&
  			 pg_strcasecmp(prev4_wd, "VIEW") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "CONCURRENTLY") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "WITH") == 0)
+ 		COMPLETE_WITH_CONST("DATA");
+ 	else if (pg_strcasecmp(prev6_wd, "REFRESH") == 0 &&
+ 			 pg_strcasecmp(prev5_wd, "MATERIALIZED") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "VIEW") == 0 &&
  			 pg_strcasecmp(prev2_wd, "WITH") == 0 &&
  			 pg_strcasecmp(prev_wd, "NO") == 0)
  		COMPLETE_WITH_CONST("DATA");
*** a/src/include/commands/cluster.h
--- b/src/include/commands/cluster.h
***************
*** 25,31 **** extern void check_index_is_clusterable(Relation OldHeap, Oid indexOid,
  						   bool recheck, LOCKMODE lockmode);
  extern void mark_index_clustered(Relation rel, Oid indexOid, bool is_internal);
  
! extern Oid	make_new_heap(Oid OIDOldHeap, Oid NewTableSpace);
  extern void finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
  				 bool is_system_catalog,
  				 bool swap_toast_by_content,
--- 25,32 ----
  						   bool recheck, LOCKMODE lockmode);
  extern void mark_index_clustered(Relation rel, Oid indexOid, bool is_internal);
  
! extern Oid make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
! 			  LOCKMODE lockmode);
  extern void finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
  				 bool is_system_catalog,
  				 bool swap_toast_by_content,
*** a/src/include/commands/matview.h
--- b/src/include/commands/matview.h
***************
*** 27,30 **** extern void ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString
--- 27,32 ----
  
  extern DestReceiver *CreateTransientRelDestReceiver(Oid oid);
  
+ extern bool MatViewIncrementalMaintenanceIsEnabled(void);
+ 
  #endif   /* MATVIEW_H */
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 2478,2483 **** typedef struct CreateTableAsStmt
--- 2478,2484 ----
  typedef struct RefreshMatViewStmt
  {
  	NodeTag		type;
+ 	bool		concurrent;		/* allow concurrent access? */
  	bool		skipData;		/* true for WITH NO DATA */
  	RangeVar   *relation;		/* relation to insert into */
  } RefreshMatViewStmt;
*** a/src/test/regress/expected/matview.out
--- b/src/test/regress/expected/matview.out
***************
*** 73,78 **** SELECT * FROM tvm;
--- 73,80 ----
  
  CREATE MATERIALIZED VIEW tmm AS SELECT sum(totamt) AS grandtot FROM tm;
  CREATE MATERIALIZED VIEW tvmm AS SELECT sum(totamt) AS grandtot FROM tvm;
+ CREATE UNIQUE INDEX tvmm_expr ON tvmm ((grandtot > 0));
+ CREATE UNIQUE INDEX tvmm_pred ON tvmm (grandtot) WHERE grandtot < 0;
  CREATE VIEW tvv AS SELECT sum(totamt) AS grandtot FROM tv;
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tvvm AS SELECT * FROM tvv;
***************
*** 141,146 **** ALTER MATERIALIZED VIEW tvm SET SCHEMA mvschema;
--- 143,151 ----
    Column  |  Type   | Modifiers | Storage | Stats target | Description 
  ----------+---------+-----------+---------+--------------+-------------
   grandtot | numeric |           | main    |              | 
+ Indexes:
+     "tvmm_expr" UNIQUE, btree ((grandtot > 0::numeric))
+     "tvmm_pred" UNIQUE, btree (grandtot) WHERE grandtot < 0::numeric
  View definition:
   SELECT sum(tvm.totamt) AS grandtot
     FROM mvschema.tvm;
***************
*** 177,183 **** SELECT * FROM tvm ORDER BY type;
   z    |     11
  (3 rows)
  
! REFRESH MATERIALIZED VIEW tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
   type | totamt 
--- 182,188 ----
   z    |     11
  (3 rows)
  
! REFRESH MATERIALIZED VIEW CONCURRENTLY tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
   type | totamt 
***************
*** 237,242 **** SELECT * FROM tvvm;
--- 242,250 ----
  (1 row)
  
  REFRESH MATERIALIZED VIEW tmm;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm;
+ ERROR:  cannot refresh materialized view "public.tvmm" concurrently
+ HINT:  Create a UNIQUE index with no WHERE clause on one or more columns of the materialized view.
  REFRESH MATERIALIZED VIEW tvmm;
  REFRESH MATERIALIZED VIEW tvvm;
  EXPLAIN (costs off)
***************
*** 281,286 **** SELECT * FROM tvvm;
--- 289,297 ----
  -- test diemv when the mv does not exist
  DROP MATERIALIZED VIEW IF EXISTS no_such_mv;
  NOTICE:  materialized view "no_such_mv" does not exist, skipping
+ -- make sure invalid comination of options is prohibited
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm WITH NO DATA;
+ ERROR:  CONCURRENTLY and WITH NO DATA options cannot be used together
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
   type | mtot | vtot 
***************
*** 385,387 **** SELECT * FROM hogeview WHERE i < 10;
--- 396,423 ----
  
  DROP TABLE hoge CASCADE;
  NOTICE:  drop cascades to materialized view hogeview
+ -- test that duplicate values on unique index prevent refresh
+ CREATE TABLE foo(a, b) AS VALUES(1, 10);
+ CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo;
+ CREATE UNIQUE INDEX ON mv(a);
+ INSERT INTO foo SELECT * FROM foo;
+ REFRESH MATERIALIZED VIEW mv;
+ ERROR:  could not create unique index "mv_a_idx"
+ DETAIL:  Key (a)=(1) is duplicated.
+ REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ ERROR:  new data for "mv" contains duplicate rows without any NULL columns
+ DETAIL:  Row: (1,10)
+ DROP TABLE foo CASCADE;
+ NOTICE:  drop cascades to materialized view mv
+ -- make sure that all indexes covered by unique indexes works
+ CREATE TABLE foo(a, b, c) AS VALUES(1, 2, 3);
+ CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo;
+ CREATE UNIQUE INDEX ON mv (a);
+ CREATE UNIQUE INDEX ON mv (b);
+ CREATE UNIQUE INDEX on mv (c);
+ INSERT INTO foo VALUES(2, 3, 4);
+ INSERT INTO foo VALUES(3, 4, 5);
+ REFRESH MATERIALIZED VIEW mv;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ DROP TABLE foo CASCADE;
+ NOTICE:  drop cascades to materialized view mv
*** a/src/test/regress/sql/matview.sql
--- b/src/test/regress/sql/matview.sql
***************
*** 29,34 **** CREATE MATERIALIZED VIEW tvm AS SELECT * FROM tv ORDER BY type;
--- 29,36 ----
  SELECT * FROM tvm;
  CREATE MATERIALIZED VIEW tmm AS SELECT sum(totamt) AS grandtot FROM tm;
  CREATE MATERIALIZED VIEW tvmm AS SELECT sum(totamt) AS grandtot FROM tvm;
+ CREATE UNIQUE INDEX tvmm_expr ON tvmm ((grandtot > 0));
+ CREATE UNIQUE INDEX tvmm_pred ON tvmm (grandtot) WHERE grandtot < 0;
  CREATE VIEW tvv AS SELECT sum(totamt) AS grandtot FROM tv;
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tvvm AS SELECT * FROM tvv;
***************
*** 57,63 **** INSERT INTO t VALUES (6, 'z', 13);
  -- confirm pre- and post-refresh contents of fairly simple materialized views
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
! REFRESH MATERIALIZED VIEW tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
--- 59,65 ----
  -- confirm pre- and post-refresh contents of fairly simple materialized views
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
! REFRESH MATERIALIZED VIEW CONCURRENTLY tm;
  REFRESH MATERIALIZED VIEW tvm;
  SELECT * FROM tm ORDER BY type;
  SELECT * FROM tvm ORDER BY type;
***************
*** 74,79 **** SELECT * FROM tmm;
--- 76,82 ----
  SELECT * FROM tvmm;
  SELECT * FROM tvvm;
  REFRESH MATERIALIZED VIEW tmm;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm;
  REFRESH MATERIALIZED VIEW tvmm;
  REFRESH MATERIALIZED VIEW tvvm;
  EXPLAIN (costs off)
***************
*** 89,94 **** SELECT * FROM tvvm;
--- 92,100 ----
  -- test diemv when the mv does not exist
  DROP MATERIALIZED VIEW IF EXISTS no_such_mv;
  
+ -- make sure invalid comination of options is prohibited
+ REFRESH MATERIALIZED VIEW CONCURRENTLY tvmm WITH NO DATA;
+ 
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
  
***************
*** 124,126 **** SELECT * FROM hogeview WHERE i < 10;
--- 130,153 ----
  VACUUM ANALYZE;
  SELECT * FROM hogeview WHERE i < 10;
  DROP TABLE hoge CASCADE;
+ 
+ -- test that duplicate values on unique index prevent refresh
+ CREATE TABLE foo(a, b) AS VALUES(1, 10);
+ CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo;
+ CREATE UNIQUE INDEX ON mv(a);
+ INSERT INTO foo SELECT * FROM foo;
+ REFRESH MATERIALIZED VIEW mv;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ DROP TABLE foo CASCADE;
+ 
+ -- make sure that all indexes covered by unique indexes works
+ CREATE TABLE foo(a, b, c) AS VALUES(1, 2, 3);
+ CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo;
+ CREATE UNIQUE INDEX ON mv (a);
+ CREATE UNIQUE INDEX ON mv (b);
+ CREATE UNIQUE INDEX on mv (c);
+ INSERT INTO foo VALUES(2, 3, 4);
+ INSERT INTO foo VALUES(3, 4, 5);
+ REFRESH MATERIALIZED VIEW mv;
+ REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ DROP TABLE foo CASCADE;