adding partitioned tables to publications

Started by Amit Langote, over 6 years ago, 81 messages
#1 Amit Langote
amitlangote09@gmail.com
1 attachment(s)

One cannot currently add partitioned tables to a publication.

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

create publication publish_p for table p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.

One can do this instead:

create publication publish_p1 for table p1;
create publication publish_p2 for table p2;
create publication publish_p3 for table p3;

but that may be too much code for users to maintain.

I propose that we make this command:

create publication publish_p for table p;

automatically add all of its partitions to the publication. Any
future partitions will likewise be added automatically, so publishing
a partitioned table publishes all of its existing and future
partitions. The attached patch implements that.
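
For example, with the patch applied, the following should show the
partitions being published (a sketch of what I'd expect, not actual
output):

create publication publish_p for table p;
select pubname, tablename from pg_publication_tables
  where pubname = 'publish_p';

This should list p1, p2, and p3, and a later "create table p4
partition of p ..." would add p4 to publish_p automatically.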

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with those on the
publication side, because changes are still replicated as being made
to the individual partitions, not to the root partitioned table. It
might be useful to also support publishing changes as being made to
the root table, because that would let users define the replication
target tables any way they need to, but this patch doesn't implement
that.
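
To illustrate the one-to-one requirement, the subscriber would need an
identically partitioned table with identically named partitions, along
these lines (the connection string here is made up):

-- on the subscriber
create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);
create subscription sub_p
  connection 'host=publisher dbname=postgres'
  publication publish_p;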

Thanks,
Amit

Attachments:

0001-Support-adding-partitioned-tables-to-publication.patch (application/octet-stream)
From cc9751bed8d98927669eb2e2349dd53782d765eb Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Wed, 2 Oct 2019 18:52:49 +0900
Subject: [PATCH] Support adding partitioned tables to publication

Adding a partitioned table to a publication in turn adds all of its
existing and future partitions.  Detaching a partition doesn't remove
it from the publication, but its membership is dissociated from
the parent's membership, that is, it becomes a standalone member.
---
 doc/src/sgml/logical-replication.sgml       |  22 +-
 doc/src/sgml/ref/alter_publication.sgml     |  11 +-
 doc/src/sgml/ref/create_publication.sgml    |  12 +-
 src/backend/catalog/pg_publication.c        |  89 +++++--
 src/backend/commands/publicationcmds.c      | 394 +++++++++++++++++++---------
 src/backend/commands/tablecmds.c            |  13 +-
 src/backend/executor/execReplication.c      |  19 +-
 src/backend/replication/logical/tablesync.c |   7 +
 src/bin/pg_dump/pg_dump.c                   |  20 +-
 src/include/catalog/pg_publication.h        |   6 +-
 src/include/catalog/pg_publication_rel.h    |   1 +
 src/include/commands/publicationcmds.h      |   2 +
 src/test/regress/expected/publication.out   | 152 ++++++++++-
 src/test/regress/sql/publication.sql        |  80 +++++-
 14 files changed, 644 insertions(+), 184 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f657d1d06e..c14861ddfb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,13 +402,21 @@
 
    <listitem>
     <para>
-     Replication is only possible from base tables to base tables.  That is,
-     the tables on the publication and on the subscription side must be normal
-     tables, not views, materialized views, partition root tables, or foreign
-     tables.  In the case of partitions, you can therefore replicate a
-     partition hierarchy one-to-one, but you cannot currently replicate to a
-     differently partitioned setup.  Attempts to replicate tables other than
-     base tables will result in an error.
+     Replication is only possible between combinations of regular and
+     partitioned tables.  That is, the tables on the publication and on the
+     subscription side must be normal or partitioned tables, not views,
+     materialized views, or foreign tables.  Attempts to replicate tables other
+     than regular and partitioned tables will result in an error.
+    </para>
+
+    <para>
+     When a partitioned table is added to a publication, all of its
+     existing and future partitions are automatically added to the
+     publication.  Any changes made to the leaf partitions are sent to the
+     subscription server, which must contain a partitioned table whose
+     partition hierarchy matches the publication side's one-to-one.  For
+     the partitioned tables on the two sides to match, each partition with
+     a given partition constraint must have the same name on both sides.
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index 534e598d93..e9db773d9b 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -46,7 +46,11 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
    tables from the publication.  Note that adding tables to a publication that
    is already subscribed to will require a <literal>ALTER SUBSCRIPTION
    ... REFRESH PUBLICATION</literal> action on the subscribing side in order
-   to become effective.
+   to become effective.  Using <literal>DROP TABLE</literal> to remove a
+   partitioned table from a publication will also remove all of its partitions
+   from the publication unless <literal>ONLY</literal> is specified.  However,
+   removing a partition from a publication without first removing its parent
+   will result in an error.
   </para>
 
   <para>
@@ -91,7 +95,10 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
       table name, only that table is affected.  If <literal>ONLY</literal> is not
       specified, the table and all its descendant tables (if any) are
       affected.  Optionally, <literal>*</literal> can be specified after the table
-      name to explicitly indicate that descendant tables are included.
+      name to explicitly indicate that descendant tables are included.  Specifying
+      <literal>ONLY</literal> with <literal>SET TABLE</literal> will result in an
+      error for a partitioned table if it contains partitions, because partitions
+      must be added to the publication too.
      </para>
     </listitem>
    </varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..7354665e47 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -72,11 +72,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
      </para>
 
      <para>
-      Only persistent base tables can be part of a publication.  Temporary
-      tables, unlogged tables, foreign tables, materialized views, regular
-      views, and partitioned tables cannot be part of a publication.  To
-      replicate a partitioned table, add the individual partitions to the
-      publication.
+      Only persistent base and partitioned tables can be part of a publication.
+      Temporary tables, unlogged tables, foreign tables, materialized views,
+      and regular views cannot be part of a publication.  Specifying
+      <literal>ONLY</literal> results in an error for a partitioned table if
+      it contains partitions, because partitions must be added to the
+      publication too.  See <xref linkend="logical-replication-publication"/>
+      for details about how partitioned tables are replicated.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fd5da7d5f7..2547cb71f8 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -50,17 +50,9 @@
 static void
 check_publication_add_relation(Relation targetrel)
 {
-	/* Give more specific error for partitioned tables */
-	if (RelationGetForm(targetrel)->relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("\"%s\" is a partitioned table",
-						RelationGetRelationName(targetrel)),
-				 errdetail("Adding partitioned tables to publications is not supported."),
-				 errhint("You can add the table partitions individually.")));
-
 	/* Must be table */
-	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION)
+	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("\"%s\" is not a table",
@@ -106,7 +98,8 @@ check_publication_add_relation(Relation targetrel)
 static bool
 is_publishable_class(Oid relid, Form_pg_class reltuple)
 {
-	return reltuple->relkind == RELKIND_RELATION &&
+	return (reltuple->relkind == RELKIND_RELATION ||
+			reltuple->relkind == RELKIND_PARTITIONED_TABLE) &&
 		!IsCatalogRelationOid(relid) &&
 		reltuple->relpersistence == RELPERSISTENCE_PERMANENT &&
 		relid >= FirstNormalObjectId;
@@ -144,13 +137,56 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(result);
 }
 
+/*
+ * Update prinh flag for a given relation's pg_publication_rel entry
+ */
+void
+publication_rel_update_inheritance(Relation pubrelCatalog, Oid pubid,
+								   Relation rel, bool inh)
+{
+	Oid			relid = RelationGetRelid(rel);
+	HeapTuple	tuple;
+	Form_pg_publication_rel prform;
+
+	Assert(pubrelCatalog != NULL);
+
+	tuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
+							ObjectIdGetDatum(pubid));
+	Assert(tuple != NULL);
+
+	prform = (Form_pg_publication_rel) GETSTRUCT(tuple);
+	if (prform->prinh != inh)
+	{
+		Datum		newValues[Natts_pg_publication_rel];
+		bool		newNulls[Natts_pg_publication_rel];
+		bool		replaces[Natts_pg_publication_rel];
+		HeapTuple	newTuple;
+
+		MemSet(newValues, 0, sizeof(newValues));
+		MemSet(newNulls, false, sizeof(newValues));
+		MemSet(replaces, false, sizeof(replaces));
+		newValues[Anum_pg_publication_rel_prinh - 1] = inh;
+		newNulls[Anum_pg_publication_rel_prinh - 1] = false;
+		replaces[Anum_pg_publication_rel_prinh - 1] = true;
+
+		newTuple = heap_modify_tuple(tuple,
+									 RelationGetDescr(pubrelCatalog),
+									 newValues, newNulls,
+									 replaces);
+		CatalogTupleUpdate(pubrelCatalog, &newTuple->t_self, newTuple);
+		heap_freetuple(newTuple);
+	}
+
+	ReleaseSysCache(tuple);
+}
+
 
 /*
  * Insert new publication / relation mapping.
  */
 ObjectAddress
 publication_add_relation(Oid pubid, Relation targetrel,
-						 bool if_not_exists)
+						 bool inh, bool if_not_exists)
 {
 	Relation	rel;
 	HeapTuple	tup;
@@ -172,10 +208,20 @@ publication_add_relation(Oid pubid, Relation targetrel,
 	if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
 							  ObjectIdGetDatum(pubid)))
 	{
-		table_close(rel, RowExclusiveLock);
-
-		if (if_not_exists)
+		if (if_not_exists || inh)
+		{
+			/*
+			 * It's possible that the target relation is being re-added to the
+			 * publication due to inheritance recursion.  In that case, simply
+			 * set the inheritance flag of the found entry.  Note that the
+			 * flag is turned off when the partition is detached from the
+			 * parent.
+			 */
+			if (inh)
+				publication_rel_update_inheritance(rel, pubid, targetrel, inh);
+			table_close(rel, RowExclusiveLock);
 			return InvalidObjectAddress;
+		}
 
 		ereport(ERROR,
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
@@ -196,6 +242,9 @@ publication_add_relation(Oid pubid, Relation targetrel,
 		ObjectIdGetDatum(pubid);
 	values[Anum_pg_publication_rel_prrelid - 1] =
 		ObjectIdGetDatum(relid);
+	/* Set inheritance only for partitions. */
+	values[Anum_pg_publication_rel_prinh - 1] =
+		BoolGetDatum(inh && targetrel->rd_rel->relispartition);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -254,9 +303,12 @@ GetRelationPublications(Oid relid)
  *
  * This should only be used for normal publications, the FOR ALL TABLES
  * should use GetAllTablesPublicationRelations().
+ *
+ * Partitions that were added to the publication via their parent are
+ * returned only if 'get_children' is true.
  */
 List *
-GetPublicationRelations(Oid pubid)
+GetPublicationRelations(Oid pubid, bool get_children)
 {
 	List	   *result;
 	Relation	pubrelsrel;
@@ -282,7 +334,8 @@ GetPublicationRelations(Oid pubid)
 
 		pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
 
-		result = lappend_oid(result, pubrel->prrelid);
+		if (!pubrel->prinh || get_children)
+			result = lappend_oid(result, pubrel->prrelid);
 	}
 
 	systable_endscan(scan);
@@ -497,7 +550,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		if (publication->alltables)
 			tables = GetAllTablesPublicationRelations();
 		else
-			tables = GetPublicationRelations(publication->oid);
+			tables = GetPublicationRelations(publication->oid, true);
 		funcctx->user_fctx = (void *) tables;
 
 		MemoryContextSwitchTo(oldcontext);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f115d4bf80..9bd85b13de 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -37,6 +37,10 @@
 #include "commands/event_trigger.h"
 #include "commands/publicationcmds.h"
 
+#include "nodes/makefuncs.h"
+
+#include "partitioning/partdesc.h"
+
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/catcache.h"
@@ -50,11 +54,16 @@
 /* Same as MAXNUMMESSAGES in sinvaladt.c */
 #define MAX_RELCACHE_INVAL_MSGS 4096
 
-static List *OpenTableList(List *tables);
-static void CloseTableList(List *rels);
-static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
+static void PublicationAddTables(Oid pubid, List *tables, bool if_not_exists,
 								 AlterPublicationStmt *stmt);
+static void PublicationAddTable(Oid pubid, Relation rel, bool if_not_exists,
+					AlterPublicationStmt *stmt,
+					bool recurse, bool recursing,
+					List **processed_relids);
 static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
+static void PublicationDropTable(Oid pubid, Relation rel, bool missing_ok,
+					 bool recurse, bool recursing,
+					 List **processed_relids);
 
 static void
 parse_publication_options(List *options,
@@ -219,13 +228,8 @@ CreatePublication(CreatePublicationStmt *stmt)
 
 	if (stmt->tables)
 	{
-		List	   *rels;
-
 		Assert(list_length(stmt->tables) > 0);
-
-		rels = OpenTableList(stmt->tables);
-		PublicationAddTables(puboid, rels, true, NULL);
-		CloseTableList(rels);
+		PublicationAddTables(puboid, stmt->tables, true, NULL);
 	}
 
 	table_close(rel, RowExclusiveLock);
@@ -303,7 +307,7 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	}
 	else
 	{
-		List	   *relids = GetPublicationRelations(pubform->oid);
+		List	   *relids = GetPublicationRelations(pubform->oid, true);
 
 		/*
 		 * We don't want to send too many individual messages, at some point
@@ -338,7 +342,6 @@ static void
 AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 					   HeapTuple tup)
 {
-	List	   *rels = NIL;
 	Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
 	Oid			pubid = pubform->oid;
 
@@ -352,15 +355,18 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 
 	Assert(list_length(stmt->tables) > 0);
 
-	rels = OpenTableList(stmt->tables);
-
 	if (stmt->tableAction == DEFELEM_ADD)
-		PublicationAddTables(pubid, rels, false, stmt);
+		PublicationAddTables(pubid, stmt->tables, false, stmt);
 	else if (stmt->tableAction == DEFELEM_DROP)
-		PublicationDropTables(pubid, rels, false);
+		PublicationDropTables(pubid, stmt->tables, false);
 	else						/* DEFELEM_SET */
 	{
-		List	   *oldrelids = GetPublicationRelations(pubid);
+		/*
+		 * Only fetch directly-added relations, because partitions that
+		 * were added via a parent cannot be dropped on their own, as
+		 * the logic below might otherwise do.
+		 */
+		List	   *oldrelids = GetPublicationRelations(pubid, false);
 		List	   *delrels = NIL;
 		ListCell   *oldlc;
 
@@ -371,11 +377,13 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 			ListCell   *newlc;
 			bool		found = false;
 
-			foreach(newlc, rels)
+			foreach(newlc, stmt->tables)
 			{
-				Relation	newrel = (Relation) lfirst(newlc);
+				RangeVar *newrel = (RangeVar *) lfirst(newlc);
 
-				if (RelationGetRelid(newrel) == oldrelid)
+				if (RangeVarGetRelid(newrel,
+									 ShareUpdateExclusiveLock,
+									 false) == oldrelid)
 				{
 					found = true;
 					break;
@@ -384,10 +392,10 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 
 			if (!found)
 			{
-				Relation	oldrel = table_open(oldrelid,
-												ShareUpdateExclusiveLock);
+				RangeVar *oldrelrv = makeRangeVar(get_namespace_name(get_rel_namespace(oldrelid)),
+												  get_rel_name(oldrelid), -1);
 
-				delrels = lappend(delrels, oldrel);
+				delrels = lappend(delrels, oldrelrv);
 			}
 		}
 
@@ -398,12 +406,8 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 		 * Don't bother calculating the difference for adding, we'll catch and
 		 * skip existing ones when doing catalog update.
 		 */
-		PublicationAddTables(pubid, rels, true, stmt);
-
-		CloseTableList(delrels);
+		PublicationAddTables(pubid, stmt->tables, true, stmt);
 	}
-
-	CloseTableList(rels);
 }
 
 /*
@@ -501,19 +505,17 @@ RemovePublicationRelById(Oid proid)
 }
 
 /*
- * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode.
+ * Add listed tables to the publication.
  */
-static List *
-OpenTableList(List *tables)
+static void
+PublicationAddTables(Oid pubid, List *tables, bool if_not_exists,
+					 AlterPublicationStmt *stmt)
 {
 	List	   *relids = NIL;
-	List	   *rels = NIL;
 	ListCell   *lc;
 
-	/*
-	 * Open, share-lock, and check all the explicitly-specified relations
-	 */
+	Assert(!stmt || !stmt->for_all_tables);
+
 	foreach(lc, tables)
 	{
 		RangeVar   *rv = castNode(RangeVar, lfirst(lc));
@@ -540,129 +542,221 @@ OpenTableList(List *tables)
 			continue;
 		}
 
-		rels = lappend(rels, rel);
+		/* Add to processed list. */
 		relids = lappend_oid(relids, myrelid);
 
-		/* Add children of this rel, if requested */
-		if (recurse)
-		{
-			List	   *children;
-			ListCell   *child;
+		PublicationAddTable(pubid, rel, if_not_exists,
+							stmt, recurse, false, &relids);
 
-			children = find_all_inheritors(myrelid, ShareUpdateExclusiveLock,
-										   NULL);
-
-			foreach(child, children)
-			{
-				Oid			childrelid = lfirst_oid(child);
-
-				/* Allow query cancel in case this takes a long time */
-				CHECK_FOR_INTERRUPTS();
-
-				/*
-				 * Skip duplicates if user specified both parent and child
-				 * tables.
-				 */
-				if (list_member_oid(relids, childrelid))
-					continue;
-
-				/* find_all_inheritors already got lock */
-				rel = table_open(childrelid, NoLock);
-				rels = lappend(rels, rel);
-				relids = lappend_oid(relids, childrelid);
-			}
-		}
-	}
-
-	list_free(relids);
-
-	return rels;
-}
-
-/*
- * Close all relations in the list.
- */
-static void
-CloseTableList(List *rels)
-{
-	ListCell   *lc;
-
-	foreach(lc, rels)
-	{
-		Relation	rel = (Relation) lfirst(lc);
-
-		table_close(rel, NoLock);
+		table_close(rel, ShareUpdateExclusiveLock);
 	}
 }
 
 /*
- * Add listed tables to the publication.
+ * Add given table and children (if any) to the publication.
  */
 static void
-PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
-					 AlterPublicationStmt *stmt)
+PublicationAddTable(Oid pubid, Relation rel, bool if_not_exists,
+					AlterPublicationStmt *stmt,
+					bool recurse, bool recursing,
+					List **processed_relids)
 {
-	ListCell   *lc;
+	ObjectAddress obj;
+	Oid			relid = RelationGetRelid(rel);
 
-	Assert(!stmt || !stmt->for_all_tables);
+	/* Must be owner of the table or superuser. */
+	if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
+					   RelationGetRelationName(rel));
 
-	foreach(lc, rels)
+	obj = publication_add_relation(pubid, rel, recursing, if_not_exists);
+	if (stmt)
 	{
-		Relation	rel = (Relation) lfirst(lc);
-		ObjectAddress obj;
+		EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
+										 (Node *) stmt);
 
-		/* Must be owner of the table or superuser. */
-		if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
-			aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
-						   RelationGetRelationName(rel));
+		InvokeObjectPostCreateHook(PublicationRelRelationId,
+								   obj.objectId, 0);
+	}
 
-		obj = publication_add_relation(pubid, rel, if_not_exists);
-		if (stmt)
+	/* Process children of this rel, if requested */
+	if (recurse)
+	{
+		List	   *children;
+		ListCell   *child;
+
+		children = find_all_inheritors(relid, ShareUpdateExclusiveLock,
+									   NULL);
+
+		foreach(child, children)
 		{
-			EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
-											 (Node *) stmt);
+			Oid			childrelid = lfirst_oid(child);
 
-			InvokeObjectPostCreateHook(PublicationRelRelationId,
-									   obj.objectId, 0);
+			/* Allow query cancel in case this takes a long time */
+			CHECK_FOR_INTERRUPTS();
+
+			/*
+			 * Skip duplicates if user specified both parent and child
+			 * tables.
+			 */
+			if (list_member_oid(*processed_relids, childrelid))
+				continue;
+
+			/* find_all_inheritors already got lock */
+			rel = table_open(childrelid, NoLock);
+
+			/* Add to processed list. */
+			*processed_relids = lappend_oid(*processed_relids, childrelid);
+
+			/* Recursively add this child to the publication. */
+			PublicationAddTable(pubid, rel, if_not_exists, stmt,
+								recurse, true, processed_relids);
+			table_close(rel, NoLock);
 		}
 	}
+	else if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+			 rel->rd_partdesc->nparts > 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+				 errmsg("cannot add only partitioned table to publication when partitions exist"),
+				 errhint("Do not specify the ONLY keyword.")));
 }
 
 /*
  * Remove listed tables from the publication.
  */
 static void
-PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
+PublicationDropTables(Oid pubid, List *tables, bool missing_ok)
 {
-	ObjectAddress obj;
-	ListCell   *lc;
-	Oid			prid;
+	ListCell *lc;
+	List	 *relids = NIL;
 
-	foreach(lc, rels)
+	foreach(lc, tables)
 	{
-		Relation	rel = (Relation) lfirst(lc);
-		Oid			relid = RelationGetRelid(rel);
+		RangeVar   *rv = castNode(RangeVar, lfirst(lc));
+		bool		recurse = rv->inh;
+		Relation	rel;
+		Oid			myrelid;
 
-		prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
-							   ObjectIdGetDatum(relid),
-							   ObjectIdGetDatum(pubid));
-		if (!OidIsValid(prid))
+		/* Allow query cancel in case this takes a long time */
+		CHECK_FOR_INTERRUPTS();
+
+		rel = table_openrv(rv, ShareUpdateExclusiveLock);
+		myrelid = RelationGetRelid(rel);
+
+		/*
+		 * Filter out duplicates if user specifies "foo, foo".
+		 *
+		 * Note that this algorithm is known to not be very efficient (O(N^2))
+		 * but given that it only works on list of tables given to us by user
+		 * it's deemed acceptable.
+		 */
+		if (list_member_oid(relids, myrelid))
 		{
-			if (missing_ok)
-				continue;
-
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("relation \"%s\" is not part of the publication",
-							RelationGetRelationName(rel))));
+			table_close(rel, ShareUpdateExclusiveLock);
+			continue;
 		}
 
-		ObjectAddressSet(obj, PublicationRelRelationId, prid);
-		performDeletion(&obj, DROP_CASCADE, 0);
+		/* Add to processed list. */
+		relids = lappend_oid(relids, myrelid);
+
+		PublicationDropTable(pubid, rel, missing_ok, recurse, false, &relids);
+
+		table_close(rel, ShareUpdateExclusiveLock);
 	}
 }
 
 /*
+ * Remove given table and children (if any) from the publication.
+ */
+static void
+PublicationDropTable(Oid pubid, Relation rel, bool missing_ok,
+					 bool recurse, bool recursing,
+					 List **processed_relids)
+{
+	ObjectAddress obj;
+	Oid			relid = RelationGetRelid(rel);
+	Relation	pubrelCatalog;
+	HeapTuple	tuple;
+	Form_pg_publication_rel prform;
+	List	   *children;
+	ListCell   *child;
+
+	pubrelCatalog = table_open(PublicationRelRelationId, RowExclusiveLock);
+	tuple = SearchSysCache2(PUBLICATIONRELMAP,
+							ObjectIdGetDatum(relid),
+							ObjectIdGetDatum(pubid));
+	if (!HeapTupleIsValid(tuple))
+	{
+		if (missing_ok)
+			return;
+
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_OBJECT),
+				 errmsg("relation \"%s\" is not part of the publication",
+						RelationGetRelationName(rel))));
+	}
+
+	prform = (Form_pg_publication_rel) GETSTRUCT(tuple);
+
+	/*
+	 * For a partition, check if we can really drop it from the
+	 * publication.
+	 */
+	if (rel->rd_rel->relispartition && prform->prinh && !recursing)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+				 errmsg("cannot drop partition \"%s\" from an inherited publication",
+						RelationGetRelationName(rel)),
+				 errhint("Drop the parent from publication instead.")));
+
+	ObjectAddressSet(obj, PublicationRelRelationId, prform->oid);
+	performDeletion(&obj, DROP_CASCADE, 0);
+	ReleaseSysCache(tuple);
+
+	/* Process children of this rel, if requested */
+	children = find_all_inheritors(relid, ShareUpdateExclusiveLock,
+								   NULL);
+
+	foreach(child, children)
+	{
+		Oid			childrelid = lfirst_oid(child);
+		Relation	childrel;
+
+		/* Allow query cancel in case this takes a long time */
+		CHECK_FOR_INTERRUPTS();
+
+		/*
+		 * Skip duplicates if user specified both parent and child
+		 * tables.
+		 */
+		if (list_member_oid(*processed_relids, childrelid))
+			continue;
+
+		/* find_all_inheritors already got lock */
+		childrel = table_open(childrelid, NoLock);
+
+		/* Add to processed list. */
+		*processed_relids = lappend_oid(*processed_relids, childrelid);
+
+		/*
+		 * If requested, recursively drop this child from the publication.
+		 * Otherwise, simply reset the inheritance flag of child's publication
+		 * membership because the parent is no longer part of the publication.
+		 */
+		if (recurse)
+			PublicationDropTable(pubid, childrel, missing_ok, recurse, true,
+								 processed_relids);
+		else
+			publication_rel_update_inheritance(pubrelCatalog, pubid, childrel,
+											   false);
+		table_close(childrel, NoLock);
+	}
+
+	table_close(pubrelCatalog, NoLock);
+}
+
+/*
  * Internal workhorse for changing a publication owner
  */
 static void
@@ -772,3 +866,59 @@ AlterPublicationOwner_oid(Oid subid, Oid newOwnerId)
 
 	table_close(rel, RowExclusiveLock);
 }
+
+/*
+ * This adds the partition and its sub-partitions (if any) to all of
+ * the parent's publications.  The partition's membership in those
+ * publications is attached to the parent's membership, so the partition
+ * cannot be removed from a publication unless the parent is too.
+ */
+void
+ClonePublicationsForPartition(Relation parent, Relation partition)
+{
+	ListCell   *lc;
+	List	   *parentPubs = GetRelationPublications(RelationGetRelid(parent));
+
+	/* Add partition and its sub-partitions (if any) to each publication. */
+	foreach(lc, parentPubs)
+	{
+		Oid		pubid = lfirst_oid(lc);
+		List   *relids = list_make1_oid(RelationGetRelid(partition));
+
+		PublicationAddTable(pubid, partition, true, NULL, true, true,
+							&relids);
+	}
+}
+
+/*
+ * This marks partition's membership in parent's publications as standalone,
+ * that is, not attached to the parent's membership in those publications.
+ * This function is called when detaching a partition from its parent.
+ */
+void
+DetachPartitionPublications(Relation parent, Relation partition)
+{
+	ListCell   *lc;
+	List	   *parentPubs = GetRelationPublications(RelationGetRelid(parent));
+	Relation	pubrelCatalog = NULL;
+
+	if (parentPubs != NIL)
+		pubrelCatalog = table_open(PublicationRelRelationId,
+								   RowExclusiveLock);
+
+	/*
+	 * For each publication, mark the partition's membership as no longer
+	 * being inherited from parent.  Note that we don't recurse to
+	 * partition's own partitions.
+	 */
+	foreach(lc, parentPubs)
+	{
+		Oid		pubid = lfirst_oid(lc);
+
+		publication_rel_update_inheritance(pubrelCatalog, pubid, partition,
+										   false);
+	}
+
+	if (pubrelCatalog)
+		table_close(pubrelCatalog, NoLock);
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 05593f3316..f31af59ae9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -53,6 +53,7 @@
 #include "commands/defrem.h"
 #include "commands/event_trigger.h"
 #include "commands/policy.h"
+#include "commands/publicationcmds.h"
 #include "commands/sequence.h"
 #include "commands/tablecmds.h"
 #include "commands/tablespace.h"
@@ -1044,7 +1045,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 
 	/*
 	 * If we're creating a partition, create now all the indexes, triggers,
-	 * FKs defined in the parent.
+	 * FKs defined in the parent.  Also, add the partition to all
+	 * publications that the parent is part of.
 	 *
 	 * We can't do it earlier, because DefineIndex wants to know the partition
 	 * key which we just stored.
@@ -1117,6 +1119,9 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		 */
 		CloneForeignKeyConstraints(NULL, parent, rel);
 
+		/* Add the partition to any publications that the parent is part of. */
+		ClonePublicationsForPartition(parent, rel);
+
 		table_close(parent, NoLock);
 	}
 
@@ -15663,6 +15668,9 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd)
 	 */
 	CloneForeignKeyConstraints(wqueue, rel, attachrel);
 
+	/* Add the partition to any publications that the parent is part of. */
+	ClonePublicationsForPartition(rel, attachrel);
+
 	/*
 	 * Generate partition constraint from the partition bound specification.
 	 * If the parent itself is a partition, make sure to include its
@@ -16243,6 +16251,9 @@ ATExecDetachPartition(Relation rel, RangeVar *name)
 	}
 	CommandCounterIncrement();
 
+	/* Mark partition's membership in parent's publications as local. */
+	DetachPartitionPublications(rel, partRel);
+
 	/*
 	 * Invalidate the parent's relcache so that the partition is no longer
 	 * included in its partition descriptor.
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 95e027c970..f05f44c99f 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -591,17 +591,10 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * We currently only support writing to regular and partitioned tables.
+	 * However, give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -609,7 +602,11 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	/*
+	 * Subscriptions to partitioned tables are really placeholder objects, as
+	 * replication itself occurs at the individual partition level.
+	 */
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7881079e96..0955a57a32 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -758,6 +758,13 @@ copy_table(Relation rel)
 	List	   *attnamelist;
 	ParseState *pstate;
 
+	/*
+	 * Skip copy for partitioned tables, because their partitions would
+	 * be copied instead.
+	 */
+	if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		return;
+
 	/* Get the publisher relation info. */
 	fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
 							RelationGetRelationName(rel), &lrel);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f01fea5b91..5ef01faeed 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3972,8 +3972,9 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 	{
 		TableInfo  *tbinfo = &tblinfo[i];
 
-		/* Only plain tables can be aded to publications. */
-		if (tbinfo->relkind != RELKIND_RELATION)
+		/* Only plain and partitioned tables can be added to publications. */
+		if (tbinfo->relkind != RELKIND_RELATION &&
+			tbinfo->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
 
 		/*
@@ -3989,12 +3990,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 
 		resetPQExpBuffer(query);
 
-		/* Get the publication membership for the table. */
+		/*
+		 * Get the publication membership for the table.  Skip publications
+		 * for which it has been added as a child via inheritance.
+		 */
 		appendPQExpBuffer(query,
 						  "SELECT pr.tableoid, pr.oid, p.pubname "
 						  "FROM pg_publication_rel pr, pg_publication p "
 						  "WHERE pr.prrelid = '%u'"
-						  "  AND p.oid = pr.prpubid",
+						  "  AND p.oid = pr.prpubid"
+						  "  AND NOT prinh",
 						  tbinfo->dobj.catId.oid);
 		res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
 
@@ -4053,7 +4058,12 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
 
 	query = createPQExpBuffer();
 
-	appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
+	/* For partitioned tables, recurse by default to add partitions. */
+	if (pubrinfo->pubtable->relkind == RELKIND_PARTITIONED_TABLE)
+		appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE",
+					  fmtId(pubrinfo->pubname));
+	else
+		appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
 					  fmtId(pubrinfo->pubname));
 	appendPQExpBuffer(query, " %s;\n",
 					  fmtQualifiedDumpable(tbinfo));
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 20a2f0ac1b..4a0b17806f 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -81,13 +81,15 @@ typedef struct Publication
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
 extern List *GetRelationPublications(Oid relid);
-extern List *GetPublicationRelations(Oid pubid);
+extern List *GetPublicationRelations(Oid pubid, bool get_children);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
-											  bool if_not_exists);
+											  bool inh, bool if_not_exists);
+extern void publication_rel_update_inheritance(Relation pubrelCatalog, Oid pubid,
+								   Relation rel, bool inh);
 
 extern Oid	get_publication_oid(const char *pubname, bool missing_ok);
 extern char *get_publication_name(Oid pubid, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index 5f5bc92ab3..46e5039a12 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,7 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
 	Oid			oid;			/* oid */
 	Oid			prpubid;		/* Oid of the publication */
 	Oid			prrelid;		/* Oid of the relation */
+	bool		prinh;			/* Is relation added due to inheritance? */
 } FormData_pg_publication_rel;
 
 /* ----------------
diff --git a/src/include/commands/publicationcmds.h b/src/include/commands/publicationcmds.h
index c536b648f8..915b1e0056 100644
--- a/src/include/commands/publicationcmds.h
+++ b/src/include/commands/publicationcmds.h
@@ -25,5 +25,7 @@ extern void RemovePublicationRelById(Oid proid);
 
 extern ObjectAddress AlterPublicationOwner(const char *name, Oid newOwnerId);
 extern void AlterPublicationOwner_oid(Oid pubid, Oid newOwnerId);
+extern void ClonePublicationsForPartition(Relation parent, Relation partition);
+extern void DetachPartitionPublications(Relation parent, Relation partition);
 
 #endif							/* PUBLICATIONCMDS_H */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..bf378782f8 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -47,7 +47,6 @@ CREATE SCHEMA pub_test;
 CREATE TABLE testpub_tbl1 (id serial primary key, data text);
 CREATE TABLE pub_test.testpub_nopk (foo int, bar int);
 CREATE VIEW testpub_view AS SELECT 1;
-CREATE TABLE testpub_parted (a int) PARTITION BY LIST (a);
 SET client_min_messages = 'ERROR';
 CREATE PUBLICATION testpub_foralltables FOR ALL TABLES WITH (publish = 'insert');
 RESET client_min_messages;
@@ -142,11 +141,151 @@ Tables:
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
 DETAIL:  Only tables can be added to publications.
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
-ERROR:  "testpub_parted" is a partitioned table
-DETAIL:  Adding partitioned tables to publications is not supported.
-HINT:  You can add the table partitions individually.
+--
+-- Tests for partitioned tables
+--
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+CREATE TABLE testpub_parted (a int) PARTITION BY LIST (a);
+-- Can add "only" partitioned table when there are no partitions
+ALTER PUBLICATION testpub_forparted ADD TABLE ONLY testpub_parted;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
+CREATE TABLE testpub_parted1 partition of testpub_parted FOR VALUES IN (1) PARTITION BY LIST (a);
+CREATE TABLE testpub_parted11 partition of testpub_parted1 FOR VALUES IN (1);
+-- fail - cannot add "only" partitioned table
+ALTER PUBLICATION testpub_forparted ADD TABLE ONLY testpub_parted;
+ERROR:  cannot add only partitioned table to publication when partitions exist
+HINT:  Do not specify the ONLY keyword.
+-- ok, should add partition testpub_parted1, sub-partition testpub_parted11
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted1"
+    "public.testpub_parted11"
+
+-- New partitions should automatically get added to the publications that
+-- the parent is in.
+CREATE TABLE testpub_parted2 PARTITION OF testpub_parted FOR VALUES IN (2);
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted1"
+    "public.testpub_parted11"
+    "public.testpub_parted2"
+
+-- Create a table that will later be attached to the parent and add it to
+-- the same publication as the one that the parent is in
+CREATE TABLE testpub_parted3 (LIKE testpub_parted);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted3;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted1"
+    "public.testpub_parted11"
+    "public.testpub_parted2"
+    "public.testpub_parted3"
+
+-- Attaching to parent should not result in the table being duplicatively
+-- added to the publication, nor an error
+ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted3 FOR VALUES IN (3);
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted1"
+    "public.testpub_parted11"
+    "public.testpub_parted2"
+    "public.testpub_parted3"
+
+-- cannot drop a partition from publication which parent is still part of
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted1;
+ERROR:  cannot drop partition "testpub_parted1" from an inherited publication
+HINT:  Drop the parent from publication instead.
+-- When the partition is detached, it and sub-partitions continue to be
+-- members of the publication
+ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted1"
+    "public.testpub_parted11"
+    "public.testpub_parted2"
+    "public.testpub_parted3"
+
+-- sub-partition's membership is still inherited, so can't drop
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted11;
+ERROR:  cannot drop partition "testpub_parted11" from an inherited publication
+HINT:  Drop the parent from publication instead.
+-- dropping the now-detached partition should work though
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted1;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted2"
+    "public.testpub_parted3"
+
+-- Some tests for SET TABLE
+-- no changes to the membership, because testpub_parted and testpub_parted3
+-- are already in the publication
+ALTER PUBLICATION testpub_forparted SET TABLE testpub_parted, testpub_parted3;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted2"
+    "public.testpub_parted3"
+
+-- This should drop testpub_parted (hence, testpub_parted2, testpub_parted3)
+-- and add testpub_parted2, testpub_parted1 (hence testpub_parted11)
+ALTER PUBLICATION testpub_forparted SET TABLE testpub_parted2, testpub_parted1;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted1"
+    "public.testpub_parted11"
+    "public.testpub_parted2"
+
+DROP PUBLICATION testpub_forparted;
+DROP TABLE testpub_parted, testpub_parted1;
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default ADD TABLE pub_test.testpub_nopk;
@@ -219,7 +358,6 @@ ALTER PUBLICATION testpub2 ADD TABLE testpub_tbl1;  -- ok
 DROP PUBLICATION testpub2;
 SET ROLE regress_publication_user;
 REVOKE CREATE ON DATABASE regression FROM regress_publication_user2;
-DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a755cf..06ece9e338 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -35,7 +35,6 @@ CREATE SCHEMA pub_test;
 CREATE TABLE testpub_tbl1 (id serial primary key, data text);
 CREATE TABLE pub_test.testpub_nopk (foo int, bar int);
 CREATE VIEW testpub_view AS SELECT 1;
-CREATE TABLE testpub_parted (a int) PARTITION BY LIST (a);
 
 SET client_min_messages = 'ERROR';
 CREATE PUBLICATION testpub_foralltables FOR ALL TABLES WITH (publish = 'insert');
@@ -83,8 +82,82 @@ CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 
 -- fail - view
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
+
+--
+-- Tests for partitioned tables
+--
+
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+
+CREATE TABLE testpub_parted (a int) PARTITION BY LIST (a);
+
+-- Can add "only" partitioned table when there are no partitions
+ALTER PUBLICATION testpub_forparted ADD TABLE ONLY testpub_parted;
+
+\dRp+ testpub_forparted
+
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
+
+CREATE TABLE testpub_parted1 partition of testpub_parted FOR VALUES IN (1) PARTITION BY LIST (a);
+CREATE TABLE testpub_parted11 partition of testpub_parted1 FOR VALUES IN (1);
+
+-- fail - cannot add "only" partitioned table
+ALTER PUBLICATION testpub_forparted ADD TABLE ONLY testpub_parted;
+
+-- ok, should add partition testpub_parted1, sub-partition testpub_parted11
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+
+-- New partitions should automatically get added to the publications that
+-- the parent is in.
+CREATE TABLE testpub_parted2 PARTITION OF testpub_parted FOR VALUES IN (2);
+\dRp+ testpub_forparted
+
+-- Create a table that will later be attached to the parent and add it to
+-- the same publication as the one that the parent is in
+CREATE TABLE testpub_parted3 (LIKE testpub_parted);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted3;
+\dRp+ testpub_forparted
+
+-- Attaching to parent should not result in the table being duplicatively
+-- added to the publication, nor an error
+ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted3 FOR VALUES IN (3);
+\dRp+ testpub_forparted
+
+-- cannot drop a partition from publication which parent is still part of
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted1;
+
+-- When the partition is detached, it and sub-partitions continue to be
+-- members of the publication
+ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
+
+\dRp+ testpub_forparted
+
+-- sub-partition's membership is still inherited, so can't drop
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted11;
+
+-- dropping the now-detached partition should work though
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted1;
+
+\dRp+ testpub_forparted
+
+-- Some tests for SET TABLE
+
+-- no changes to the membership, because testpub_parted and testpub_parted3
+-- are already in the publication
+ALTER PUBLICATION testpub_forparted SET TABLE testpub_parted, testpub_parted3;
+
+\dRp+ testpub_forparted
+
+-- This should drop testpub_parted (hence, testpub_parted2, testpub_parted3)
+-- and add testpub_parted2, testpub_parted1 (hence testpub_parted11)
+ALTER PUBLICATION testpub_forparted SET TABLE testpub_parted2, testpub_parted1;
+\dRp+ testpub_forparted
+
+DROP PUBLICATION testpub_forparted;
+DROP TABLE testpub_parted, testpub_parted1;
 
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
@@ -125,7 +198,6 @@ DROP PUBLICATION testpub2;
 SET ROLE regress_publication_user;
 REVOKE CREATE ON DATABASE regression FROM regress_publication_user2;
 
-DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 
-- 
2.11.0

#2Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#1)
Re: adding partitioned tables to publications

On Mon, Oct 7, 2019 at 9:55 AM Amit Langote <amitlangote09@gmail.com> wrote:

One cannot currently add partitioned tables to a publication.

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

create publication publish_p for table p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.

One can do this instead:

create publication publish_p1 for table p1;
create publication publish_p2 for table p2;
create publication publish_p3 for table p3;

but maybe that's too much code to maintain for users.

I propose that we make this command:

create publication publish_p for table p;

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.
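
To make the one-to-one matching requirement concrete, a complete setup under this patch would look something like the following (an illustrative sketch; the subscription name and connection string are made up):

```sql
-- publisher
create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);
create publication publish_p for table p;

-- subscriber: the same hierarchy with matching partition names, because
-- changes are still replicated as changes to the individual partitions
create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);
create subscription sub_p
  connection 'host=... dbname=...'
  publication publish_p;
```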

Added this to the next CF: https://commitfest.postgresql.org/25/2301/

Thanks,
Amit

#3Rafia Sabih
rafia.pghackers@gmail.com
In reply to: Amit Langote (#2)
Re: adding partitioned tables to publications

On Thu, 10 Oct 2019 at 08:29, Amit Langote <amitlangote09@gmail.com> wrote:

On Mon, Oct 7, 2019 at 9:55 AM Amit Langote <amitlangote09@gmail.com>
wrote:

One cannot currently add partitioned tables to a publication.

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

create publication publish_p for table p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.

One can do this instead:

create publication publish_p1 for table p1;
create publication publish_p2 for table p2;
create publication publish_p3 for table p3;

but maybe that's too much code to maintain for users.

I propose that we make this command:

create publication publish_p for table p;

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.

Added this to the next CF: https://commitfest.postgresql.org/25/2301/

Hi Amit,

Lately I have been exploring the logical replication feature of PostgreSQL,
and I found this addition to its scope for partitioned tables a useful one.

To understand the working of your patch a bit more, I performed an
experiment: I created a partitioned table with several children and a
default partition on the publisher side, and normal tables with the same
names as the parent, children, and default partition on the subscriber
side. Next I established the logical replication connection, and to my
surprise the data was successfully replicated from the partitioned table
to the normal tables. Then this error filled the logs:
LOG: logical replication table synchronization worker for subscription
"my_subscription", table "parent" has started
ERROR: table "public.parent" not found on publisher

Here, parent is the name of the partitioned table on the publisher side,
and it is present as a normal table on the subscriber side as well. Which
is understandable: the sync worker is trying to find a normal table of the
same name but couldn't find one. Maybe it should not worry about that, or
should at least catch it earlier than replication time.
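
For reference, the mismatch scenario above can be set up with something like this (a hypothetical sketch; object names and the connection string are illustrative):

```sql
-- publisher
create table parent (a int) partition by list (a);
create table child partition of parent for values in (1);
create table def partition of parent default;
create publication my_publication for table parent;

-- subscriber: tables with the same names, but plain instead of partitioned
create table parent (a int);
create table child (a int);
create table def (a int);
create subscription my_subscription
  connection 'host=... dbname=...'
  publication my_publication;
```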

Please let me know if this is expected, because in my opinion it is not
desirable; there should be some check on the table type for replication.
This wasn't important till now, maybe because only normal tables could be
replicated, but with the extension of the scope of logical replication to
more objects such checks would be helpful.

On a separate note, I was thinking: for partitioned tables, wouldn't it
be cleaner to create only the partitioned table at the subscriber, and
have the child tables be created automatically when logical replication
starts? Or would that be too much for the future...?

--
Regards,
Rafia Sabih

#4Amit Langote
amitlangote09@gmail.com
In reply to: Rafia Sabih (#3)
1 attachment(s)
Re: adding partitioned tables to publications

Hello Rafia,

Great to hear that you are interested in this feature and thanks for
testing the patch.

On Thu, Oct 10, 2019 at 10:13 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:

Lately I was exploring logical replication feature of postgresql and I found this addition in the scope of feature for partitioned tables a useful one.

In order to understand the working of your patch a bit more, I performed an experiment wherein I created a partitioned table with several children and a default partition at the publisher side and normal tables of the same name as parent, children, and default partition of the publisher side at the subscriber side. Next I established the logical replication connection and to my surprise the data was successfully replicated from partitioned tables to normal tables and then this error filled the logs,
LOG: logical replication table synchronization worker for subscription "my_subscription", table "parent" has started
ERROR: table "public.parent" not found on publisher

here parent is the name of the partitioned table at the publisher side and it is present as normal table at subscriber side as well. Which is understandable, it is trying to find a normal table of the same name but couldn't find one, maybe it should not worry about that now also if not at replication time.

Please let me know if this is something expected because in my opinion this is not desirable, there should be some check to check the table type for replication. This wasn't important till now maybe because only normal tables were to be replicated, but with the extension of the scope of logical replication to more objects such checks would be helpful.

Thanks for sharing this case. I hadn't considered it, but you're
right that it should be handled sensibly.  I have fixed the table sync
code to handle this case properly. Could you please check your case
with the attached updated patch?

On a separate note was thinking for partitioned tables, wouldn't it be cleaner to have something like you create only partition table at the subscriber and then when logical replication starts it creates the child tables accordingly. Or would that be too much in future...?

Hmm, we'd first need to build the "automatic partition creation"
feature to consider doing something like that. I'm sure you'd agree
that we should undertake that project separately from this tiny
logical replication usability improvement project. :)

Thanks again.

Regards,
Amit

Attachments:

v2-0001-Support-adding-partitioned-tables-to-publication.patchapplication/octet-stream; name=v2-0001-Support-adding-partitioned-tables-to-publication.patchDownload
From 110244aa38bc27da051c0b13ee3a79d689ccaa2c Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Wed, 2 Oct 2019 18:52:49 +0900
Subject: [PATCH v2] Support adding partitioned tables to publication

Adding a partitioned table to a publication in turn adds all of its
existing and future partitions.  Detaching a partition doesn't remove
it from the publication, but its membership is dissociated from
the parent's membership, that is, it becomes a standalone member.
---
 doc/src/sgml/logical-replication.sgml       |  22 +-
 doc/src/sgml/ref/alter_publication.sgml     |  11 +-
 doc/src/sgml/ref/create_publication.sgml    |  12 +-
 src/backend/catalog/pg_publication.c        |  89 +++++--
 src/backend/commands/publicationcmds.c      | 394 +++++++++++++++++++---------
 src/backend/commands/tablecmds.c            |  13 +-
 src/backend/executor/execReplication.c      |  19 +-
 src/backend/replication/logical/tablesync.c |  21 +-
 src/bin/pg_dump/pg_dump.c                   |  20 +-
 src/include/catalog/pg_publication.h        |   6 +-
 src/include/catalog/pg_publication_rel.h    |   1 +
 src/include/commands/publicationcmds.h      |   2 +
 src/include/replication/logicalproto.h      |   1 +
 src/test/regress/expected/publication.out   | 152 ++++++++++-
 src/test/regress/sql/publication.sql        |  80 +++++-
 15 files changed, 655 insertions(+), 188 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f657d1d06e..c14861ddfb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,13 +402,21 @@
 
    <listitem>
     <para>
-     Replication is only possible from base tables to base tables.  That is,
-     the tables on the publication and on the subscription side must be normal
-     tables, not views, materialized views, partition root tables, or foreign
-     tables.  In the case of partitions, you can therefore replicate a
-     partition hierarchy one-to-one, but you cannot currently replicate to a
-     differently partitioned setup.  Attempts to replicate tables other than
-     base tables will result in an error.
+     Replication is only possible between regular and partitioned tables, in
+     any combination.  That is, the tables on the publication and on the
+     subscription side must be normal or partitioned tables, not views,
+     materialized views, or foreign tables.  Attempts to replicate tables other
+     than regular and partitioned tables will result in an error.
+    </para>
+
+    <para>
+     When a partitioned table is added to a publication, all of its
+     existing and future partitions are automatically added to the publication.
+     Any changes made to the leaf partitions are sent to the subscription server
+     which must contain a partitioned table with partition hierarchy matching
+     one-to-one with the publication side partitioned table.  For partitioned
+     tables on the two sides to match one-to-one, each partition with a given
+     partition constraint must have the same name on both sides.
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index 534e598d93..e9db773d9b 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -46,7 +46,11 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
    tables from the publication.  Note that adding tables to a publication that
    is already subscribed to will require a <literal>ALTER SUBSCRIPTION
    ... REFRESH PUBLICATION</literal> action on the subscribing side in order
-   to become effective.
+   to become effective.  Using <literal>DROP TABLE</literal> to remove a
+   partitioned table from a publication will also remove all of its partitions
+   from the publication unless <literal>ONLY</literal> is specified.  However,
+   removing a partition from a publication without first removing its parent
+   will result in an error.
   </para>
 
   <para>
@@ -91,7 +95,10 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
       table name, only that table is affected.  If <literal>ONLY</literal> is not
       specified, the table and all its descendant tables (if any) are
       affected.  Optionally, <literal>*</literal> can be specified after the table
-      name to explicitly indicate that descendant tables are included.
+      name to explicitly indicate that descendant tables are included.  Specifying
+      <literal>ONLY</literal> with <literal>SET TABLE</literal> will result in an
+      error for a partitioned table if it contains partitions, because partitions
+      must be added to the publication too.
      </para>
     </listitem>
    </varlistentry>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..7354665e47 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -72,11 +72,13 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
      </para>
 
      <para>
-      Only persistent base tables can be part of a publication.  Temporary
-      tables, unlogged tables, foreign tables, materialized views, regular
-      views, and partitioned tables cannot be part of a publication.  To
-      replicate a partitioned table, add the individual partitions to the
-      publication.
+      Only persistent base and partitioned tables can be part of a publication.
+      Temporary tables, unlogged tables, foreign tables, materialized views,
+      and regular views cannot be part of a publication.  Specifying
+      <literal>ONLY</literal> results in an error for a partitioned table if
+      it contains partitions, because partitions must be added to the
+      publication too.  See <xref linkend="logical-replication-publication"/>
+      for details about how partitioned tables are replicated.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fd5da7d5f7..2547cb71f8 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -50,17 +50,9 @@
 static void
 check_publication_add_relation(Relation targetrel)
 {
-	/* Give more specific error for partitioned tables */
-	if (RelationGetForm(targetrel)->relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("\"%s\" is a partitioned table",
-						RelationGetRelationName(targetrel)),
-				 errdetail("Adding partitioned tables to publications is not supported."),
-				 errhint("You can add the table partitions individually.")));
-
 	/* Must be table */
-	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION)
+	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("\"%s\" is not a table",
@@ -106,7 +98,8 @@ check_publication_add_relation(Relation targetrel)
 static bool
 is_publishable_class(Oid relid, Form_pg_class reltuple)
 {
-	return reltuple->relkind == RELKIND_RELATION &&
+	return (reltuple->relkind == RELKIND_RELATION ||
+			reltuple->relkind == RELKIND_PARTITIONED_TABLE) &&
 		!IsCatalogRelationOid(relid) &&
 		reltuple->relpersistence == RELPERSISTENCE_PERMANENT &&
 		relid >= FirstNormalObjectId;
@@ -144,13 +137,56 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(result);
 }
 
+/*
+ * Update prinh flag for a given relation's pg_publication_rel entry
+ */
+void
+publication_rel_update_inheritance(Relation pubrelCatalog, Oid pubid,
+								   Relation rel, bool inh)
+{
+	Oid			relid = RelationGetRelid(rel);
+	HeapTuple	tuple;
+	Form_pg_publication_rel prform;
+
+	Assert(pubrelCatalog != NULL);
+
+	tuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
+							ObjectIdGetDatum(pubid));
+	Assert(tuple != NULL);
+
+	prform = (Form_pg_publication_rel) GETSTRUCT(tuple);
+	if (prform->prinh != inh)
+	{
+		Datum		newValues[Natts_pg_publication_rel];
+		bool		newNulls[Natts_pg_publication_rel];
+		bool		replaces[Natts_pg_publication_rel];
+		HeapTuple	newTuple;
+
+		MemSet(newValues, 0, sizeof(newValues));
+		MemSet(newNulls, false, sizeof(newNulls));
+		MemSet(replaces, false, sizeof(replaces));
+		newValues[Anum_pg_publication_rel_prinh - 1] = BoolGetDatum(inh);
+		newNulls[Anum_pg_publication_rel_prinh - 1] = false;
+		replaces[Anum_pg_publication_rel_prinh - 1] = true;
+
+		newTuple = heap_modify_tuple(tuple,
+									 RelationGetDescr(pubrelCatalog),
+									 newValues, newNulls,
+									 replaces);
+		CatalogTupleUpdate(pubrelCatalog, &newTuple->t_self, newTuple);
+		heap_freetuple(newTuple);
+	}
+
+	ReleaseSysCache(tuple);
+}
+
 
 /*
  * Insert new publication / relation mapping.
  */
 ObjectAddress
 publication_add_relation(Oid pubid, Relation targetrel,
-						 bool if_not_exists)
+						 bool inh, bool if_not_exists)
 {
 	Relation	rel;
 	HeapTuple	tup;
@@ -172,10 +208,20 @@ publication_add_relation(Oid pubid, Relation targetrel,
 	if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
 							  ObjectIdGetDatum(pubid)))
 	{
-		table_close(rel, RowExclusiveLock);
-
-		if (if_not_exists)
+		if (if_not_exists || inh)
+		{
+			/*
+			 * It's possible that the target relation is being re-added to the
+			 * publication due to inheritance recursion.  In that case, simply
+			 * set the inheritance flag of the found entry.  Note that the
+			 * flag is turned off when the partition is detached from the
+			 * parent.
+			 */
+			if (inh)
+				publication_rel_update_inheritance(rel, pubid, targetrel, inh);
+			table_close(rel, RowExclusiveLock);
 			return InvalidObjectAddress;
+		}
 
 		ereport(ERROR,
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
@@ -196,6 +242,9 @@ publication_add_relation(Oid pubid, Relation targetrel,
 		ObjectIdGetDatum(pubid);
 	values[Anum_pg_publication_rel_prrelid - 1] =
 		ObjectIdGetDatum(relid);
+	/* Set inheritance only for partitions. */
+	values[Anum_pg_publication_rel_prinh - 1] =
+		BoolGetDatum(inh && targetrel->rd_rel->relispartition);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -254,9 +303,12 @@ GetRelationPublications(Oid relid)
  *
  * This should only be used for normal publications, the FOR ALL TABLES
  * should use GetAllTablesPublicationRelations().
+ *
+ * Partitions that were added to the publication via their parent are
+ * returned only if 'get_children' is true.
  */
 List *
-GetPublicationRelations(Oid pubid)
+GetPublicationRelations(Oid pubid, bool get_children)
 {
 	List	   *result;
 	Relation	pubrelsrel;
@@ -282,7 +334,8 @@ GetPublicationRelations(Oid pubid)
 
 		pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
 
-		result = lappend_oid(result, pubrel->prrelid);
+		if (!pubrel->prinh || get_children)
+			result = lappend_oid(result, pubrel->prrelid);
 	}
 
 	systable_endscan(scan);
@@ -497,7 +550,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		if (publication->alltables)
 			tables = GetAllTablesPublicationRelations();
 		else
-			tables = GetPublicationRelations(publication->oid);
+			tables = GetPublicationRelations(publication->oid, true);
 		funcctx->user_fctx = (void *) tables;
 
 		MemoryContextSwitchTo(oldcontext);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f115d4bf80..9bd85b13de 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -37,6 +37,10 @@
 #include "commands/event_trigger.h"
 #include "commands/publicationcmds.h"
 
+#include "nodes/makefuncs.h"
+
+#include "partitioning/partdesc.h"
+
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/catcache.h"
@@ -50,11 +54,16 @@
 /* Same as MAXNUMMESSAGES in sinvaladt.c */
 #define MAX_RELCACHE_INVAL_MSGS 4096
 
-static List *OpenTableList(List *tables);
-static void CloseTableList(List *rels);
-static void PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
+static void PublicationAddTables(Oid pubid, List *tables, bool if_not_exists,
 								 AlterPublicationStmt *stmt);
+static void PublicationAddTable(Oid pubid, Relation rel, bool if_not_exists,
+					AlterPublicationStmt *stmt,
+					bool recurse, bool recursing,
+					List **processed_relids);
 static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
+static void PublicationDropTable(Oid pubid, Relation rel, bool missing_ok,
+					 bool recurse, bool recursing,
+					 List **processed_relids);
 
 static void
 parse_publication_options(List *options,
@@ -219,13 +228,8 @@ CreatePublication(CreatePublicationStmt *stmt)
 
 	if (stmt->tables)
 	{
-		List	   *rels;
-
 		Assert(list_length(stmt->tables) > 0);
-
-		rels = OpenTableList(stmt->tables);
-		PublicationAddTables(puboid, rels, true, NULL);
-		CloseTableList(rels);
+		PublicationAddTables(puboid, stmt->tables, true, NULL);
 	}
 
 	table_close(rel, RowExclusiveLock);
@@ -303,7 +307,7 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	}
 	else
 	{
-		List	   *relids = GetPublicationRelations(pubform->oid);
+		List	   *relids = GetPublicationRelations(pubform->oid, true);
 
 		/*
 		 * We don't want to send too many individual messages, at some point
@@ -338,7 +342,6 @@ static void
 AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 					   HeapTuple tup)
 {
-	List	   *rels = NIL;
 	Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
 	Oid			pubid = pubform->oid;
 
@@ -352,15 +355,18 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 
 	Assert(list_length(stmt->tables) > 0);
 
-	rels = OpenTableList(stmt->tables);
-
 	if (stmt->tableAction == DEFELEM_ADD)
-		PublicationAddTables(pubid, rels, false, stmt);
+		PublicationAddTables(pubid, stmt->tables, false, stmt);
 	else if (stmt->tableAction == DEFELEM_DROP)
-		PublicationDropTables(pubid, rels, false);
+		PublicationDropTables(pubid, stmt->tables, false);
 	else						/* DEFELEM_SET */
 	{
-		List	   *oldrelids = GetPublicationRelations(pubid);
+		/*
+		 * Fetch only directly-added relations, because partitions that
+		 * were added via their parent cannot be dropped on their own,
+		 * which the logic below might otherwise attempt.
+		 */
+		List	   *oldrelids = GetPublicationRelations(pubid, false);
 		List	   *delrels = NIL;
 		ListCell   *oldlc;
 
@@ -371,11 +377,13 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 			ListCell   *newlc;
 			bool		found = false;
 
-			foreach(newlc, rels)
+			foreach(newlc, stmt->tables)
 			{
-				Relation	newrel = (Relation) lfirst(newlc);
+				RangeVar *newrel = (RangeVar *) lfirst(newlc);
 
-				if (RelationGetRelid(newrel) == oldrelid)
+				if (RangeVarGetRelid(newrel,
+									 ShareUpdateExclusiveLock,
+									 false) == oldrelid)
 				{
 					found = true;
 					break;
@@ -384,10 +392,10 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 
 			if (!found)
 			{
-				Relation	oldrel = table_open(oldrelid,
-												ShareUpdateExclusiveLock);
+				RangeVar *oldrelrv = makeRangeVar(get_namespace_name(get_rel_namespace(oldrelid)),
+												  get_rel_name(oldrelid), -1);
 
-				delrels = lappend(delrels, oldrel);
+				delrels = lappend(delrels, oldrelrv);
 			}
 		}
 
@@ -398,12 +406,8 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 		 * Don't bother calculating the difference for adding, we'll catch and
 		 * skip existing ones when doing catalog update.
 		 */
-		PublicationAddTables(pubid, rels, true, stmt);
-
-		CloseTableList(delrels);
+		PublicationAddTables(pubid, stmt->tables, true, stmt);
 	}
-
-	CloseTableList(rels);
 }
 
 /*
@@ -501,19 +505,17 @@ RemovePublicationRelById(Oid proid)
 }
 
 /*
- * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode.
+ * Add listed tables to the publication.
  */
-static List *
-OpenTableList(List *tables)
+static void
+PublicationAddTables(Oid pubid, List *tables, bool if_not_exists,
+					 AlterPublicationStmt *stmt)
 {
 	List	   *relids = NIL;
-	List	   *rels = NIL;
 	ListCell   *lc;
 
-	/*
-	 * Open, share-lock, and check all the explicitly-specified relations
-	 */
+	Assert(!stmt || !stmt->for_all_tables);
+
 	foreach(lc, tables)
 	{
 		RangeVar   *rv = castNode(RangeVar, lfirst(lc));
@@ -540,129 +542,221 @@ OpenTableList(List *tables)
 			continue;
 		}
 
-		rels = lappend(rels, rel);
+		/* Add to processed list. */
 		relids = lappend_oid(relids, myrelid);
 
-		/* Add children of this rel, if requested */
-		if (recurse)
-		{
-			List	   *children;
-			ListCell   *child;
+		PublicationAddTable(pubid, rel, if_not_exists,
+							stmt, recurse, false, &relids);
 
-			children = find_all_inheritors(myrelid, ShareUpdateExclusiveLock,
-										   NULL);
-
-			foreach(child, children)
-			{
-				Oid			childrelid = lfirst_oid(child);
-
-				/* Allow query cancel in case this takes a long time */
-				CHECK_FOR_INTERRUPTS();
-
-				/*
-				 * Skip duplicates if user specified both parent and child
-				 * tables.
-				 */
-				if (list_member_oid(relids, childrelid))
-					continue;
-
-				/* find_all_inheritors already got lock */
-				rel = table_open(childrelid, NoLock);
-				rels = lappend(rels, rel);
-				relids = lappend_oid(relids, childrelid);
-			}
-		}
-	}
-
-	list_free(relids);
-
-	return rels;
-}
-
-/*
- * Close all relations in the list.
- */
-static void
-CloseTableList(List *rels)
-{
-	ListCell   *lc;
-
-	foreach(lc, rels)
-	{
-		Relation	rel = (Relation) lfirst(lc);
-
-		table_close(rel, NoLock);
+		table_close(rel, ShareUpdateExclusiveLock);
 	}
 }
 
 /*
- * Add listed tables to the publication.
+ * Add given table and children (if any) to the publication.
  */
 static void
-PublicationAddTables(Oid pubid, List *rels, bool if_not_exists,
-					 AlterPublicationStmt *stmt)
+PublicationAddTable(Oid pubid, Relation rel, bool if_not_exists,
+					AlterPublicationStmt *stmt,
+					bool recurse, bool recursing,
+					List **processed_relids)
 {
-	ListCell   *lc;
+	ObjectAddress obj;
+	Oid			relid = RelationGetRelid(rel);
 
-	Assert(!stmt || !stmt->for_all_tables);
+	/* Must be owner of the table or superuser. */
+	if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
+					   RelationGetRelationName(rel));
 
-	foreach(lc, rels)
+	obj = publication_add_relation(pubid, rel, recursing, if_not_exists);
+	if (stmt)
 	{
-		Relation	rel = (Relation) lfirst(lc);
-		ObjectAddress obj;
+		EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
+										 (Node *) stmt);
 
-		/* Must be owner of the table or superuser. */
-		if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
-			aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
-						   RelationGetRelationName(rel));
+		InvokeObjectPostCreateHook(PublicationRelRelationId,
+								   obj.objectId, 0);
+	}
 
-		obj = publication_add_relation(pubid, rel, if_not_exists);
-		if (stmt)
+	/* Process children of this rel, if requested */
+	if (recurse)
+	{
+		List	   *children;
+		ListCell   *child;
+
+		children = find_all_inheritors(relid, ShareUpdateExclusiveLock,
+									   NULL);
+
+		foreach(child, children)
 		{
-			EventTriggerCollectSimpleCommand(obj, InvalidObjectAddress,
-											 (Node *) stmt);
+			Oid			childrelid = lfirst_oid(child);
 
-			InvokeObjectPostCreateHook(PublicationRelRelationId,
-									   obj.objectId, 0);
+			/* Allow query cancel in case this takes a long time */
+			CHECK_FOR_INTERRUPTS();
+
+			/*
+			 * Skip duplicates if user specified both parent and child
+			 * tables.
+			 */
+			if (list_member_oid(*processed_relids, childrelid))
+				continue;
+
+			/* find_all_inheritors already got lock */
+			rel = table_open(childrelid, NoLock);
+
+			/* Add to processed list. */
+			*processed_relids = lappend_oid(*processed_relids, childrelid);
+
+			/* Recursively add this child to the publication. */
+			PublicationAddTable(pubid, rel, if_not_exists, stmt,
+								recurse, true, processed_relids);
+			table_close(rel, NoLock);
 		}
 	}
+	else if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+			 rel->rd_partdesc->nparts > 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+				 errmsg("cannot add only partitioned table to publication when partitions exist"),
+				 errhint("Do not specify the ONLY keyword.")));
 }
 
 /*
  * Remove listed tables from the publication.
  */
 static void
-PublicationDropTables(Oid pubid, List *rels, bool missing_ok)
+PublicationDropTables(Oid pubid, List *tables, bool missing_ok)
 {
-	ObjectAddress obj;
-	ListCell   *lc;
-	Oid			prid;
+	ListCell *lc;
+	List	 *relids = NIL;
 
-	foreach(lc, rels)
+	foreach(lc, tables)
 	{
-		Relation	rel = (Relation) lfirst(lc);
-		Oid			relid = RelationGetRelid(rel);
+		RangeVar   *rv = castNode(RangeVar, lfirst(lc));
+		bool		recurse = rv->inh;
+		Relation	rel;
+		Oid			myrelid;
 
-		prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid,
-							   ObjectIdGetDatum(relid),
-							   ObjectIdGetDatum(pubid));
-		if (!OidIsValid(prid))
+		/* Allow query cancel in case this takes a long time */
+		CHECK_FOR_INTERRUPTS();
+
+		rel = table_openrv(rv, ShareUpdateExclusiveLock);
+		myrelid = RelationGetRelid(rel);
+
+		/*
+		 * Filter out duplicates if user specifies "foo, foo".
+		 *
+		 * Note that this algorithm is known to not be very efficient (O(N^2))
+		 * but given that it only works on list of tables given to us by user
+		 * it's deemed acceptable.
+		 */
+		if (list_member_oid(relids, myrelid))
 		{
-			if (missing_ok)
-				continue;
-
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("relation \"%s\" is not part of the publication",
-							RelationGetRelationName(rel))));
+			table_close(rel, ShareUpdateExclusiveLock);
+			continue;
 		}
 
-		ObjectAddressSet(obj, PublicationRelRelationId, prid);
-		performDeletion(&obj, DROP_CASCADE, 0);
+		/* Add to processed list. */
+		relids = lappend_oid(relids, myrelid);
+
+		PublicationDropTable(pubid, rel, missing_ok, recurse, false, &relids);
+
+		table_close(rel, ShareUpdateExclusiveLock);
 	}
 }
 
 /*
+ * Remove given table and children (if any) from the publication.
+ */
+static void
+PublicationDropTable(Oid pubid, Relation rel, bool missing_ok,
+					 bool recurse, bool recursing,
+					 List **processed_relids)
+{
+	ObjectAddress obj;
+	Oid			relid = RelationGetRelid(rel);
+	Relation	pubrelCatalog;
+	HeapTuple	tuple;
+	Form_pg_publication_rel prform;
+	List	   *children;
+	ListCell   *child;
+
+	pubrelCatalog = table_open(PublicationRelRelationId, RowExclusiveLock);
+	tuple = SearchSysCache2(PUBLICATIONRELMAP,
+							ObjectIdGetDatum(relid),
+							ObjectIdGetDatum(pubid));
+	if (!HeapTupleIsValid(tuple))
+	{
+		if (!missing_ok)
+			ereport(ERROR,
+					(errcode(ERRCODE_UNDEFINED_OBJECT),
+					 errmsg("relation \"%s\" is not part of the publication",
+							RelationGetRelationName(rel))));
+		table_close(pubrelCatalog, RowExclusiveLock);
+		return;
+	}
+
+	prform = (Form_pg_publication_rel) GETSTRUCT(tuple);
+
+	/*
+	 * For a partition, check if we can really drop it from the
+	 * publication.
+	 */
+	if (rel->rd_rel->relispartition && prform->prinh && !recursing)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+				 errmsg("cannot drop partition \"%s\" from an inherited publication",
+						RelationGetRelationName(rel)),
+				 errhint("Drop the parent from publication instead.")));
+
+	ObjectAddressSet(obj, PublicationRelRelationId, prform->oid);
+	performDeletion(&obj, DROP_CASCADE, 0);
+	ReleaseSysCache(tuple);
+
+	/* Process children of this rel, if requested */
+	children = find_all_inheritors(relid, ShareUpdateExclusiveLock,
+								   NULL);
+
+	foreach(child, children)
+	{
+		Oid			childrelid = lfirst_oid(child);
+		Relation	childrel;
+
+		/* Allow query cancel in case this takes a long time */
+		CHECK_FOR_INTERRUPTS();
+
+		/*
+		 * Skip duplicates if user specified both parent and child
+		 * tables.
+		 */
+		if (list_member_oid(*processed_relids, childrelid))
+			continue;
+
+		/* find_all_inheritors already got lock */
+		childrel = table_open(childrelid, NoLock);
+
+		/* Add to processed list. */
+		*processed_relids = lappend_oid(*processed_relids, childrelid);
+
+		/*
+		 * If requested, recursively drop this child from the publication.
+		 * Otherwise, simply reset the inheritance flag of child's publication
+		 * membership because the parent is no longer part of the publication.
+		 */
+		if (recurse)
+			PublicationDropTable(pubid, childrel, missing_ok, recurse, true,
+								 processed_relids);
+		else
+			publication_rel_update_inheritance(pubrelCatalog, pubid, childrel,
+											   false);
+		table_close(childrel, NoLock);
+	}
+
+	table_close(pubrelCatalog, NoLock);
+}
+
+/*
  * Internal workhorse for changing a publication owner
  */
 static void
@@ -772,3 +866,59 @@ AlterPublicationOwner_oid(Oid subid, Oid newOwnerId)
 
 	table_close(rel, RowExclusiveLock);
 }
+
+/*
+ * This adds the partition and its sub-partitions (if any) to all of the
+ * parent's publications.  The partition's membership in those publications
+ * is attached to the parent's membership; for example, the partition cannot
+ * be removed from a publication unless the parent is also removed.
+ */
+void
+ClonePublicationsForPartition(Relation parent, Relation partition)
+{
+	ListCell   *lc;
+	List	   *parentPubs = GetRelationPublications(RelationGetRelid(parent));
+
+	/* Add partition and its sub-partitions (if any) to each publication. */
+	foreach(lc, parentPubs)
+	{
+		Oid		pubid = lfirst_oid(lc);
+		List   *relids = list_make1_oid(RelationGetRelid(partition));
+
+		PublicationAddTable(pubid, partition, true, NULL, true, true,
+							&relids);
+	}
+}
+
+/*
+ * This marks the partition's membership in the parent's publications as
+ * standalone, that is, no longer attached to the parent's membership.
+ * This function is called when detaching a partition from its parent.
+ */
+void
+DetachPartitionPublications(Relation parent, Relation partition)
+{
+	ListCell   *lc;
+	List	   *parentPubs = GetRelationPublications(RelationGetRelid(parent));
+	Relation	pubrelCatalog = NULL;
+
+	if (parentPubs != NIL)
+		pubrelCatalog = table_open(PublicationRelRelationId,
+								   RowExclusiveLock);
+
+	/*
+	 * For each publication, mark the partition's membership as no longer
+	 * being inherited from the parent.  Note that we don't recurse to the
+	 * partition's own partitions.
+	 */
+	foreach(lc, parentPubs)
+	{
+		Oid		pubid = lfirst_oid(lc);
+
+		publication_rel_update_inheritance(pubrelCatalog, pubid, partition,
+										   false);
+	}
+
+	if (pubrelCatalog)
+		table_close(pubrelCatalog, NoLock);
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index ba8f4459f3..39cf0d40d5 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -53,6 +53,7 @@
 #include "commands/defrem.h"
 #include "commands/event_trigger.h"
 #include "commands/policy.h"
+#include "commands/publicationcmds.h"
 #include "commands/sequence.h"
 #include "commands/tablecmds.h"
 #include "commands/tablespace.h"
@@ -1044,7 +1045,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 
 	/*
 	 * If we're creating a partition, create now all the indexes, triggers,
-	 * FKs defined in the parent.
+	 * FKs defined in the parent.  Also, add the partition to all
+	 * publications that the parent is part of.
 	 *
 	 * We can't do it earlier, because DefineIndex wants to know the partition
 	 * key which we just stored.
@@ -1117,6 +1119,9 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		 */
 		CloneForeignKeyConstraints(NULL, parent, rel);
 
+		/* Add the partition to any publications that parent is part of. */
+		ClonePublicationsForPartition(parent, rel);
+
 		table_close(parent, NoLock);
 	}
 
@@ -15673,6 +15678,9 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd)
 	 */
 	CloneForeignKeyConstraints(wqueue, rel, attachrel);
 
+	/* Add the partition to any publications that parent is part of. */
+	ClonePublicationsForPartition(rel, attachrel);
+
 	/*
 	 * Generate partition constraint from the partition bound specification.
 	 * If the parent itself is a partition, make sure to include its
@@ -16253,6 +16261,9 @@ ATExecDetachPartition(Relation rel, RangeVar *name)
 	}
 	CommandCounterIncrement();
 
+	/* Mark partition's membership in parent's publications as local. */
+	DetachPartitionPublications(rel, partRel);
+
 	/*
 	 * Invalidate the parent's relcache so that the partition is no longer
 	 * included in its partition descriptor.
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 95e027c970..f05f44c99f 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -591,17 +591,10 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * We currently only support writing to regular and partitioned tables.
+	 * However, give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -609,7 +602,11 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	/*
+	 * Subscriptions for partitioned tables are really placeholder objects, as
+	 * replication itself occurs at the individual partition level.
+	 */
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7881079e96..ed081743a9 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -646,7 +646,7 @@ fetch_remote_table_info(char *nspname, char *relname,
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {OIDOID, CHAROID};
+	Oid			tableRow[3] = {OIDOID, CHAROID, CHAROID};
 	Oid			attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
 	bool		isnull;
 	int			natt;
@@ -656,16 +656,16 @@ fetch_remote_table_info(char *nspname, char *relname,
 
 	/* First fetch Oid and replica identity. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident"
+	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident, c.relkind"
 					 "  FROM pg_catalog.pg_class c"
 					 "  INNER JOIN pg_catalog.pg_namespace n"
 					 "        ON (c.relnamespace = n.oid)"
 					 " WHERE n.nspname = %s"
 					 "   AND c.relname = %s"
-					 "   AND c.relkind = 'r'",
+					 "   AND pg_relation_is_publishable(c.oid)",
 					 quote_literal_cstr(nspname),
 					 quote_literal_cstr(relname));
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 
 	if (res->status != WALRCV_OK_TUPLES)
 		ereport(ERROR,
@@ -682,6 +682,8 @@ fetch_remote_table_info(char *nspname, char *relname,
 	Assert(!isnull);
 	lrel->replident = DatumGetChar(slot_getattr(slot, 2, &isnull));
 	Assert(!isnull);
+	lrel->relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+	Assert(!isnull);
 
 	ExecDropSingleTupleTableSlot(slot);
 	walrcv_clear_result(res);
@@ -769,6 +771,17 @@ copy_table(Relation rel)
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
 
+	/*
+	 * Can't copy if either the local or the remote relation is a
+	 * partitioned table.
+	 */
+	if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE ||
+		lrel.relkind == RELKIND_PARTITIONED_TABLE)
+	{
+		logicalrep_rel_close(relmapentry, NoLock);
+		return;
+	}
+
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
 	appendStringInfo(&cmd, "COPY %s TO STDOUT",
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f01fea5b91..5ef01faeed 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3972,8 +3972,9 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 	{
 		TableInfo  *tbinfo = &tblinfo[i];
 
-		/* Only plain tables can be aded to publications. */
-		if (tbinfo->relkind != RELKIND_RELATION)
+		/* Only plain and partitioned tables can be added to publications. */
+		if (tbinfo->relkind != RELKIND_RELATION &&
+			tbinfo->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
 
 		/*
@@ -3989,12 +3990,16 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 
 		resetPQExpBuffer(query);
 
-		/* Get the publication membership for the table. */
+		/*
+		 * Get the publication membership for the table.  Skip publications
+		 * to which the table was added as a child via inheritance.
+		 */
 		appendPQExpBuffer(query,
 						  "SELECT pr.tableoid, pr.oid, p.pubname "
 						  "FROM pg_publication_rel pr, pg_publication p "
 						  "WHERE pr.prrelid = '%u'"
-						  "  AND p.oid = pr.prpubid",
+						  "  AND p.oid = pr.prpubid"
+						  "  AND NOT prinh",
 						  tbinfo->dobj.catId.oid);
 		res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
 
@@ -4053,7 +4058,12 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
 
 	query = createPQExpBuffer();
 
-	appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
+	/* For partitioned tables, recurse by default to add partitions. */
+	if (pubrinfo->pubtable->relkind == RELKIND_PARTITIONED_TABLE)
+		appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE",
+						  fmtId(pubrinfo->pubname));
+	else
+		appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
 					  fmtId(pubrinfo->pubname));
 	appendPQExpBuffer(query, " %s;\n",
 					  fmtQualifiedDumpable(tbinfo));
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 20a2f0ac1b..4a0b17806f 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -81,13 +81,15 @@ typedef struct Publication
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
 extern List *GetRelationPublications(Oid relid);
-extern List *GetPublicationRelations(Oid pubid);
+extern List *GetPublicationRelations(Oid pubid, bool get_children);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
-											  bool if_not_exists);
+											  bool inh, bool if_not_exists);
+extern void publication_rel_update_inheritance(Relation pubrelCatalog, Oid pubid,
+											   Relation rel, bool inh);
 
 extern Oid	get_publication_oid(const char *pubname, bool missing_ok);
 extern char *get_publication_name(Oid pubid, bool missing_ok);
diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h
index 5f5bc92ab3..46e5039a12 100644
--- a/src/include/catalog/pg_publication_rel.h
+++ b/src/include/catalog/pg_publication_rel.h
@@ -31,6 +31,7 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
 	Oid			oid;			/* oid */
 	Oid			prpubid;		/* Oid of the publication */
 	Oid			prrelid;		/* Oid of the relation */
+	bool		prinh;			/* Is relation added due to inheritance? */
 } FormData_pg_publication_rel;
 
 /* ----------------
diff --git a/src/include/commands/publicationcmds.h b/src/include/commands/publicationcmds.h
index c536b648f8..915b1e0056 100644
--- a/src/include/commands/publicationcmds.h
+++ b/src/include/commands/publicationcmds.h
@@ -25,5 +25,7 @@ extern void RemovePublicationRelById(Oid proid);
 
 extern ObjectAddress AlterPublicationOwner(const char *name, Oid newOwnerId);
 extern void AlterPublicationOwner_oid(Oid pubid, Oid newOwnerId);
+extern void ClonePublicationsForPartition(Relation parent, Relation partition);
+extern void DetachPartitionPublications(Relation parent, Relation partition);
 
 #endif							/* PUBLICATIONCMDS_H */
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 3fc430af01..0fea368d99 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -45,6 +45,7 @@ typedef struct LogicalRepRelation
 	LogicalRepRelId remoteid;	/* unique id of the relation */
 	char	   *nspname;		/* schema name */
 	char	   *relname;		/* relation name */
+	char		relkind;		/* relation kind */
 	int			natts;			/* number of columns */
 	char	  **attnames;		/* column names */
 	Oid		   *atttyps;		/* column types */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..bf378782f8 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -47,7 +47,6 @@ CREATE SCHEMA pub_test;
 CREATE TABLE testpub_tbl1 (id serial primary key, data text);
 CREATE TABLE pub_test.testpub_nopk (foo int, bar int);
 CREATE VIEW testpub_view AS SELECT 1;
-CREATE TABLE testpub_parted (a int) PARTITION BY LIST (a);
 SET client_min_messages = 'ERROR';
 CREATE PUBLICATION testpub_foralltables FOR ALL TABLES WITH (publish = 'insert');
 RESET client_min_messages;
@@ -142,11 +141,151 @@ Tables:
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
 DETAIL:  Only tables can be added to publications.
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
-ERROR:  "testpub_parted" is a partitioned table
-DETAIL:  Adding partitioned tables to publications is not supported.
-HINT:  You can add the table partitions individually.
+--
+-- Tests for partitioned tables
+--
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+CREATE TABLE testpub_parted (a int) PARTITION BY LIST (a);
+-- Can add "only" partitioned table when there are no partitions
+ALTER PUBLICATION testpub_forparted ADD TABLE ONLY testpub_parted;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
+CREATE TABLE testpub_parted1 partition of testpub_parted FOR VALUES IN (1) PARTITION BY LIST (a);
+CREATE TABLE testpub_parted11 partition of testpub_parted1 FOR VALUES IN (1);
+-- fail - cannot add "only" partitioned table
+ALTER PUBLICATION testpub_forparted ADD TABLE ONLY testpub_parted;
+ERROR:  cannot add only partitioned table to publication when partitions exist
+HINT:  Do not specify the ONLY keyword.
+-- ok, should add partition testpub_parted1, sub-partition testpub_parted11
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted1"
+    "public.testpub_parted11"
+
+-- New partitions should automatically get added to the publications that
+-- the parent is in.
+CREATE TABLE testpub_parted2 PARTITION OF testpub_parted FOR VALUES IN (2);
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted1"
+    "public.testpub_parted11"
+    "public.testpub_parted2"
+
+-- Create a table that will later be attached to the parent and add it to
+-- the same publication as the one that the parent is in
+CREATE TABLE testpub_parted3 (LIKE testpub_parted);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted3;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted1"
+    "public.testpub_parted11"
+    "public.testpub_parted2"
+    "public.testpub_parted3"
+
+-- Attaching to parent should not result in the table being duplicatively
+-- added to the publication, nor an error
+ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted3 FOR VALUES IN (3);
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted1"
+    "public.testpub_parted11"
+    "public.testpub_parted2"
+    "public.testpub_parted3"
+
+-- cannot drop a partition from publication which parent is still part of
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted1;
+ERROR:  cannot drop partition "testpub_parted1" from an inherited publication
+HINT:  Drop the parent from publication instead.
+-- When the partition is detached, it and sub-partitions continue to be
+-- members of the publication
+ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted1"
+    "public.testpub_parted11"
+    "public.testpub_parted2"
+    "public.testpub_parted3"
+
+-- sub-partition's membership is still inherited, so can't drop
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted11;
+ERROR:  cannot drop partition "testpub_parted11" from an inherited publication
+HINT:  Drop the parent from publication instead.
+-- dropping the now-detached partition should work though
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted1;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted2"
+    "public.testpub_parted3"
+
+-- Some tests for SET TABLE
+-- no changes to the membership, because testpub_parted and testpub_parted3
+-- are already in the publication
+ALTER PUBLICATION testpub_forparted SET TABLE testpub_parted, testpub_parted3;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+    "public.testpub_parted2"
+    "public.testpub_parted3"
+
+-- This should drop testpub_parted (hence, testpub_parted2, testpub_parted3)
+-- and add testpub_parted2, testpub_parted1 (hence testpub_parted11)
+ALTER PUBLICATION testpub_forparted SET TABLE testpub_parted2, testpub_parted1;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted1"
+    "public.testpub_parted11"
+    "public.testpub_parted2"
+
+DROP PUBLICATION testpub_forparted;
+DROP TABLE testpub_parted, testpub_parted1;
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default ADD TABLE pub_test.testpub_nopk;
@@ -219,7 +358,6 @@ ALTER PUBLICATION testpub2 ADD TABLE testpub_tbl1;  -- ok
 DROP PUBLICATION testpub2;
 SET ROLE regress_publication_user;
 REVOKE CREATE ON DATABASE regression FROM regress_publication_user2;
-DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a755cf..06ece9e338 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -35,7 +35,6 @@ CREATE SCHEMA pub_test;
 CREATE TABLE testpub_tbl1 (id serial primary key, data text);
 CREATE TABLE pub_test.testpub_nopk (foo int, bar int);
 CREATE VIEW testpub_view AS SELECT 1;
-CREATE TABLE testpub_parted (a int) PARTITION BY LIST (a);
 
 SET client_min_messages = 'ERROR';
 CREATE PUBLICATION testpub_foralltables FOR ALL TABLES WITH (publish = 'insert');
@@ -83,8 +82,82 @@ CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 
 -- fail - view
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
+
+--
+-- Tests for partitioned tables
+--
+
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+
+CREATE TABLE testpub_parted (a int) PARTITION BY LIST (a);
+
+-- Can add "only" partitioned table when there are no partitions
+ALTER PUBLICATION testpub_forparted ADD TABLE ONLY testpub_parted;
+
+\dRp+ testpub_forparted
+
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
+
+CREATE TABLE testpub_parted1 partition of testpub_parted FOR VALUES IN (1) PARTITION BY LIST (a);
+CREATE TABLE testpub_parted11 partition of testpub_parted1 FOR VALUES IN (1);
+
+-- fail - cannot add "only" partitioned table
+ALTER PUBLICATION testpub_forparted ADD TABLE ONLY testpub_parted;
+
+-- ok, should add partition testpub_parted1, sub-partition testpub_parted11
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+
+-- New partitions should automatically get added to the publications that
+-- the parent is in.
+CREATE TABLE testpub_parted2 PARTITION OF testpub_parted FOR VALUES IN (2);
+\dRp+ testpub_forparted
+
+-- Create a table that will later be attached to the parent and add it to
+-- the same publication as the one that the parent is in
+CREATE TABLE testpub_parted3 (LIKE testpub_parted);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted3;
+\dRp+ testpub_forparted
+
+-- Attaching to parent should not result in the table being duplicatively
+-- added to the publication, nor an error
+ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted3 FOR VALUES IN (3);
+\dRp+ testpub_forparted
+
+-- cannot drop a partition from publication which parent is still part of
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted1;
+
+-- When the partition is detached, it and sub-partitions continue to be
+-- members of the publication
+ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
+
+\dRp+ testpub_forparted
+
+-- sub-partition's membership is still inherited, so can't drop
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted11;
+
+-- dropping the now-detached partition should work though
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted1;
+
+\dRp+ testpub_forparted
+
+-- Some tests for SET TABLE
+
+-- no changes to the membership, because testpub_parted and testpub_parted3
+-- are already in the publication
+ALTER PUBLICATION testpub_forparted SET TABLE testpub_parted, testpub_parted3;
+
+\dRp+ testpub_forparted
+
+-- This should drop testpub_parted (hence, testpub_parted2, testpub_parted3)
+-- and add testpub_parted2, testpub_parted1 (hence testpub_parted11)
+ALTER PUBLICATION testpub_forparted SET TABLE testpub_parted2, testpub_parted1;
+\dRp+ testpub_forparted
+
+DROP PUBLICATION testpub_forparted;
+DROP TABLE testpub_parted, testpub_parted1;
 
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
@@ -125,7 +198,6 @@ DROP PUBLICATION testpub2;
 SET ROLE regress_publication_user;
 REVOKE CREATE ON DATABASE regression FROM regress_publication_user2;
 
-DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 
-- 
2.11.0

#5Petr Jelinek
petr@2ndquadrant.com
In reply to: Amit Langote (#1)
Re: adding partitioned tables to publications

Hi,

On 07/10/2019 02:55, Amit Langote wrote:

One cannot currently add partitioned tables to a publication.

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

create publication publish_p for table p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.

One can do this instead:

create publication publish_p1 for table p1;
create publication publish_p2 for table p2;
create publication publish_p3 for table p3;

Or just create publication publish_p for table p1, p2, p3;

but maybe that's too much code to maintain for users.

I propose that we make this command:

create publication publish_p for table p;

+1

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.

Yeah, for that to work the subscription side would also need to be able
to write to partitioned tables, so both sides need to add support for
this. I think if we do both what you did and the transparent handling of
the root only, we'll need a new keyword to differentiate the two. It
might make sense to think about whether your way needs the extra keyword
or the transparent one does.

One issue that I see reading the patch is following set of commands:

CREATE TABLE foo ...;
CREATE PUBLICATION mypub FOR TABLE foo;

CREATE TABLE bar ...;
ALTER PUBLICATION mypub ADD TABLE bar;

ALTER TABLE foo ATTACH PARTITION bar ...;
ALTER TABLE foo DETACH PARTITION bar ...;

This will end up with bar not being in any publication even though it
was explicitly added. That might be an acceptable caveat, but it should
at least be clearly documented (IMHO with a warning).

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/

#6David Fetter
david@fetter.org
In reply to: Amit Langote (#1)
Re: adding partitioned tables to publications

On Mon, Oct 07, 2019 at 09:55:23AM +0900, Amit Langote wrote:

One cannot currently add partitioned tables to a publication.

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

create publication publish_p for table p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.

One can do this instead:

create publication publish_p1 for table p1;
create publication publish_p2 for table p2;
create publication publish_p3 for table p3;

but maybe that's too much code to maintain for users.

I propose that we make this command:

create publication publish_p for table p;

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.

With this patch, is it possible to remove a partition manually from a
subscription, or will it just get automatically re-added at some
point?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#7Amit Langote
amitlangote09@gmail.com
In reply to: David Fetter (#6)
Re: adding partitioned tables to publications

Hi David,

On Sun, Oct 13, 2019 at 4:55 PM David Fetter <david@fetter.org> wrote:

On Mon, Oct 07, 2019 at 09:55:23AM +0900, Amit Langote wrote:

I propose that we make this command:

create publication publish_p for table p;

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.

With this patch, is it possible to remove a partition manually from a
subscription, or will it just get automatically re-added at some
point?

Hmm, I don't think there is any command to manually remove tables from
a subscription. Testing shows that if you drop a table on the
subscription server that is currently being fed data via a
subscription, then a subscription worker will complain and quit if it
receives a row targeting the dropped table, and workers that are
subsequently started will do the same thing. Interestingly, this
behavior prevents replication for any other tables in the subscription
from proceeding, which seems unfortunate.

If you were asking if the patch extends the subscription side
functionality to re-add needed partitions that were manually removed
likely by accident, then no.

Thanks,
Amit

#8Amit Langote
amitlangote09@gmail.com
In reply to: Petr Jelinek (#5)
Re: adding partitioned tables to publications

Hi Petr,

Thanks for your comments.

On Sun, Oct 13, 2019 at 5:01 AM Petr Jelinek <petr@2ndquadrant.com> wrote:

On 07/10/2019 02:55, Amit Langote wrote:

One cannot currently add partitioned tables to a publication.

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

create publication publish_p for table p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.

One can do this instead:

create publication publish_p1 for table p1;
create publication publish_p2 for table p2;
create publication publish_p3 for table p3;

Or just create publication publish_p for table p1, p2, p3;

Yep, facepalm! :)

So, one doesn't really need as many publication objects as there are
partitions, as my example suggested, which is good. Although, as you
can tell, a user would still need to manually keep the set of
published partitions up to date, for example when new partitions are
added.
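
For example, keeping such a publication current means remembering a
second command for every new partition. With a hypothetical
list-partitioned table q published via a publication publish_q (both
names are made up for illustration), that maintenance looks like:

```sql
-- q and publish_q are hypothetical; without automatic publishing, a new
-- partition is silently left out of the publication until added by hand.
CREATE TABLE q_new PARTITION OF q FOR VALUES IN (42);
ALTER PUBLICATION publish_q ADD TABLE q_new;
```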

but maybe that's too much code to maintain for users.

I propose that we make this command:

create publication publish_p for table p;

+1

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.

Yeah, for that to work the subscription side would also need to be able
to write to partitioned tables, so both sides need to add support for
this.

Ah, I didn't know that the subscription code doesn't support tuple
routing out of the box. Indeed, we will need to fix that.

I think if we do both what you did and the transparent handling of
the root only, we'll need a new keyword to differentiate the two. It
might make sense to think about whether your way needs the extra
keyword or the transparent one does.

I didn't think about that but maybe you are right.

One issue that I see reading the patch is following set of commands:

CREATE TABLE foo ...;
CREATE PUBLICATION mypub FOR TABLE foo;

CREATE TABLE bar ...;
ALTER PUBLICATION mypub ADD TABLE bar;

ALTER TABLE foo ATTACH PARTITION bar ...;
ALTER TABLE foo DETACH PARTITION bar ...;

This will end up with bar not being in any publication even though it
was explicitly added.

I tested this, and bar continues to be in the publication with the above steps:

create table foo (a int) partition by list (a);
create publication mypub for table foo;
create table bar (a int);
alter publication mypub add table bar;
\d bar
Table "public.bar"
Column │ Type │ Collation │ Nullable │ Default
────────┼─────────┼───────────┼──────────┼─────────
a │ integer │ │ │
Publications:
"mypub"

alter table foo attach partition bar for values in (1);
\d bar
Table "public.bar"
Column │ Type │ Collation │ Nullable │ Default
────────┼─────────┼───────────┼──────────┼─────────
a │ integer │ │ │
Partition of: foo FOR VALUES IN (1)
Publications:
"mypub"

-- can't now drop bar from mypub (its membership is no longer standalone)
alter publication mypub drop table bar;
ERROR: cannot drop partition "bar" from an inherited publication
HINT: Drop the parent from publication instead.

alter table foo detach partition bar;

-- bar is still in mypub (now a standalone member)
\d bar
Table "public.bar"
Column │ Type │ Collation │ Nullable │ Default
────────┼─────────┼───────────┼──────────┼─────────
a │ integer │ │ │
Publications:
"mypub"

-- ok to drop now from mypub
alter publication mypub drop table bar;

Thanks,
Amit

#9Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#4)
Re: adding partitioned tables to publications

This patch seems excessively complicated to me. Why don't you just add
the actual partitioned table to pg_publication_rel and then expand the
partition hierarchy in pgoutput (get_rel_sync_entry() or
GetRelationPublications() or somewhere around there). Then you don't
need to do any work in table DDL to keep the list of published tables up
to date.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#10Rafia Sabih
rafia.pghackers@gmail.com
In reply to: Amit Langote (#4)
1 attachment(s)
Re: adding partitioned tables to publications

Hi Amit,

On Fri, 11 Oct 2019 at 08:06, Amit Langote <amitlangote09@gmail.com> wrote:

Thanks for sharing this case. I hadn't considered it, but you're
right that it should be handled sensibly. I have fixed the table sync
code to handle this case properly. Could you please check your case
with the attached updated patch?

I was checking this today and found that the behavior doesn't change
much with the updated patch. The tables are still replicated; it's just
that a select count from the parent table shows 0, while the rest of the
partitions, including the default one, have the data from the publisher.
I was expecting more like an error at the subscriber saying the table
types are not the same.

Please find the attached file for the test case, in case something is
unclear.

--
Regards,
Rafia Sabih

Attachments:

lr_part_test.txt (text/plain)
#11Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#9)
2 attachment(s)
Re: adding partitioned tables to publications

Sorry about the delay.

On Mon, Nov 4, 2019 at 8:00 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

This patch seems excessively complicated to me. Why don't you just add
the actual partitioned table to pg_publication_rel and then expand the
partition hierarchy in pgoutput (get_rel_sync_entry() or
GetRelationPublications() or somewhere around there). Then you don't
need to do any work in table DDL to keep the list of published tables up
to date.

I tend to agree that having to manage this at the DDL level would be
bug-prone, not to mention require pretty complicated code to implement.

I have tried to implement it the way you suggested. So every decoded
change to a leaf partition will now be published not only via its own
publication but also via the publications of its ancestors, if any.
That is irrespective of whether a row is directly inserted into the
leaf partition or routed to it via an insert done on an ancestor. In
this implementation, the only pg_publication_rel entry is the one
corresponding to the partitioned table.
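
In pgoutput terms, that expansion could look roughly like the sketch
below. This is not the patch itself: GetDirectRelationPublications is a
hypothetical name standing in for the plain pg_publication_rel syscache
lookup, while get_rel_relispartition, get_partition_ancestors, and
list_concat_unique_oid are existing backend functions.

```c
/*
 * Sketch (not runnable outside the backend tree): a partition is
 * published via its own publications plus those of all its ancestors.
 */
List *
GetRelationPublications(Oid relid)
{
	/* hypothetical helper: the direct pg_publication_rel lookup */
	List	   *result = GetDirectRelationPublications(relid);

	if (get_rel_relispartition(relid))
	{
		List	   *ancestors = get_partition_ancestors(relid);
		ListCell   *lc;

		/* add each ancestor's publications, avoiding duplicates */
		foreach(lc, ancestors)
			result = list_concat_unique_oid(result,
							GetDirectRelationPublications(lfirst_oid(lc)));
	}

	return result;
}
```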

On the subscription side, when creating pg_subscription_rel entries
for a publication containing a partitioned table, all of its
partitions must also be fetched as being included in the publication.
That is necessary because the initial sync copy and subsequently
received changes must be applied to the individual partitions. That
could be changed in the future by publishing leaf partition changes as
changes to the actually published partitioned table. Such a future
implementation will also hopefully take care of the concern that Rafia
mentioned on this thread: even with this patch, one must make sure
that tables match one-to-one when they're in a publish-subscribe
relationship, which currently requires baking low-level details like
the table's relkind into the protocol exchanges.

Anyway, I've attached two patches -- 0001 is a refactoring patch. 0002
implements the feature.

Thanks,
Amit

Attachments:

0001-Some-refactoring-of-publication-and-subscription-cod.patch (application/octet-stream)
From 813a27ed2a96b0096d64950ff749513302f67148 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 7 Nov 2019 18:19:33 +0900
Subject: [PATCH 1/2] Some refactoring of publication and subscription code

---
 src/backend/catalog/pg_publication.c        |  5 +-
 src/backend/commands/subscriptioncmds.c     | 79 ++++++++++++++++++++---------
 src/backend/commands/tablecmds.c            |  2 +-
 src/backend/replication/pgoutput/pgoutput.c | 11 ++--
 src/backend/utils/cache/relcache.c          |  2 +-
 src/include/catalog/pg_publication.h        |  2 +-
 6 files changed, 67 insertions(+), 34 deletions(-)

diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fd5da7d5f7..80b98e2c3c 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -224,11 +224,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
 
 
 /*
- * Gets list of publication oids for a relation oid.
+ * Finds all publications associated with the relation.
  */
 List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Relation rel)
 {
+	Oid			relid = RelationGetRelid(rel);
 	List	   *result = NIL;
 	CatCList   *pubrellist;
 	int			i;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 1419195766..11c0f305ff 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -52,7 +52,19 @@
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 
+/*
+ * Structure for fetch_table_list() to store the information about
+ * a given published table.
+ */
+typedef struct PublicationTable
+{
+	char	   *nspname;
+	char	   *relname;
+	char		relkind;
+} PublicationTable;
+
 static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
+static Oid ValidateSubscriptionRel(PublicationTable *pt);
 
 /*
  * Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -464,15 +476,10 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
 			tables = fetch_table_list(wrconn, publications);
 			foreach(lc, tables)
 			{
-				RangeVar   *rv = (RangeVar *) lfirst(lc);
+				PublicationTable *pt = lfirst(lc);
 				Oid			relid;
 
-				relid = RangeVarGetRelid(rv, AccessShareLock, false);
-
-				/* Check for supported relkind. */
-				CheckSubscriptionRelkind(get_rel_relkind(relid),
-										 rv->schemaname, rv->relname);
-
+				relid = ValidateSubscriptionRel(pt);
 				AddSubscriptionRelState(subid, relid, table_state,
 										InvalidXLogRecPtr);
 			}
@@ -573,14 +580,11 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 
 	foreach(lc, pubrel_names)
 	{
-		RangeVar   *rv = (RangeVar *) lfirst(lc);
+		PublicationTable *pt = lfirst(lc);
 		Oid			relid;
 
-		relid = RangeVarGetRelid(rv, AccessShareLock, false);
-
-		/* Check for supported relkind. */
-		CheckSubscriptionRelkind(get_rel_relkind(relid),
-								 rv->schemaname, rv->relname);
+		/* Check that there's an appropriate relation present locally. */
+		relid = ValidateSubscriptionRel(pt);
 
 		pubrel_local_oids[off++] = relid;
 
@@ -592,7 +596,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 									InvalidXLogRecPtr);
 			ereport(DEBUG1,
 					(errmsg("table \"%s.%s\" added to subscription \"%s\"",
-							rv->schemaname, rv->relname, sub->name)));
+							pt->nspname, pt->relname, sub->name)));
 		}
 	}
 
@@ -1137,7 +1141,7 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {TEXTOID, TEXTOID};
+	Oid			tableRow[3] = {TEXTOID, TEXTOID, CHAROID};
 	ListCell   *lc;
 	bool		first;
 	List	   *tablelist = NIL;
@@ -1145,9 +1149,12 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	Assert(list_length(publications) > 0);
 
 	initStringInfo(&cmd);
-	appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename\n"
+	appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename, c.relkind\n"
 						   "  FROM pg_catalog.pg_publication_tables t\n"
+						   "  JOIN pg_class c ON t.schemaname = c.relnamespace::regnamespace::name\n"
+						   "  AND t.tablename = c.relname\n"
 						   " WHERE t.pubname IN (");
+
 	first = true;
 	foreach(lc, publications)
 	{
@@ -1162,7 +1169,7 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	}
 	appendStringInfoChar(&cmd, ')');
 
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 	pfree(cmd.data);
 
 	if (res->status != WALRCV_OK_TUPLES)
@@ -1174,18 +1181,17 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
 	while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
 	{
-		char	   *nspname;
-		char	   *relname;
+		PublicationTable *pt = palloc0(sizeof(PublicationTable));
 		bool		isnull;
-		RangeVar   *rv;
 
-		nspname = TextDatumGetCString(slot_getattr(slot, 1, &isnull));
+		pt->nspname = TextDatumGetCString(slot_getattr(slot, 1, &isnull));
 		Assert(!isnull);
-		relname = TextDatumGetCString(slot_getattr(slot, 2, &isnull));
+		pt->relname = TextDatumGetCString(slot_getattr(slot, 2, &isnull));
+		Assert(!isnull);
+		pt->relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
 		Assert(!isnull);
 
-		rv = makeRangeVar(pstrdup(nspname), pstrdup(relname), -1);
-		tablelist = lappend(tablelist, rv);
+		tablelist = lappend(tablelist, pt);
 
 		ExecClearTuple(slot);
 	}
@@ -1195,3 +1201,28 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 
 	return tablelist;
 }
+
+/*
+ * Looks up a local relation matching the given publication table and
+ * checks that it's appropriate to use as replication target, erroring
+ * out if not.
+ *
+ * Oid of the successfully validated local relation is returned.
+ */
+static Oid
+ValidateSubscriptionRel(PublicationTable *pt)
+{
+	Oid		relid;
+	RangeVar *rv;
+	char	local_relkind;
+
+	rv = makeRangeVar(pstrdup(pt->nspname), pstrdup(pt->relname), -1);
+	relid = RangeVarGetRelid(rv, AccessShareLock, false);
+	Assert(OidIsValid(relid));
+
+	/* Check for supported relkind. */
+	local_relkind = get_rel_relkind(relid);
+	CheckSubscriptionRelkind(local_relkind, rv->schemaname, rv->relname);
+
+	return relid;
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 5597be6e3d..270e76ad73 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14157,7 +14157,7 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 	 * UNLOGGED as UNLOGGED tables can't be published.
 	 */
 	if (!toLogged &&
-		list_length(GetRelationPublications(RelationGetRelid(rel))) > 0)
+		list_length(GetRelationPublications(rel)) > 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("cannot change table \"%s\" to unlogged because it is part of a publication",
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..20856fa33c 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -66,7 +66,7 @@ typedef struct RelationSyncEntry
 static HTAB *RelationSyncCache = NULL;
 
 static void init_rel_sync_cache(MemoryContext decoding_context);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
 static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
 static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
 										  uint32 hashvalue);
@@ -314,7 +314,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	if (!is_publishable_relation(relation))
 		return;
 
-	relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+	relentry = get_rel_sync_entry(data, relation);
 
 	/* First check the table filter */
 	switch (change->action)
@@ -404,7 +404,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		if (!is_publishable_relation(relation))
 			continue;
 
-		relentry = get_rel_sync_entry(data, relid);
+		relentry = get_rel_sync_entry(data, relation);
 
 		if (!relentry->pubactions.pubtruncate)
 			continue;
@@ -529,8 +529,9 @@ init_rel_sync_cache(MemoryContext cachectx)
  * Find or create entry in the relation schema cache.
  */
 static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
 {
+	Oid			relid = RelationGetRelid(rel);
 	RelationSyncEntry *entry;
 	bool		found;
 	MemoryContext oldctx;
@@ -548,7 +549,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	/* Not found means schema wasn't sent */
 	if (!found || !entry->replicate_valid)
 	{
-		List	   *pubids = GetRelationPublications(relid);
+		List	   *pubids = GetRelationPublications(rel);
 		ListCell   *lc;
 
 		/* Reload publications if needed before use. */
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 585dcee5db..161fe95fe6 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -5105,7 +5105,7 @@ GetRelationPublicationActions(Relation relation)
 					  sizeof(PublicationActions));
 
 	/* Fetch the publication membership info. */
-	puboids = GetRelationPublications(RelationGetRelid(relation));
+	puboids = GetRelationPublications(relation);
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 20a2f0ac1b..2981f61db1 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -80,7 +80,7 @@ typedef struct Publication
 
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
-extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationPublications(Relation rel);
 extern List *GetPublicationRelations(Oid pubid);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
-- 
2.11.0

0002-Support-adding-partitioned-tables-to-publication.patch
From f8301073b332c896c8d356bdea5d5d1534c21a6f Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 7 Nov 2019 18:10:44 +0900
Subject: [PATCH 2/2] Support adding partitioned tables to publication

---
 doc/src/sgml/logical-replication.sgml       | 23 +++++++++----
 doc/src/sgml/ref/create_publication.sgml    | 13 ++++----
 src/backend/catalog/pg_publication.c        | 51 ++++++++++++++++-------------
 src/backend/commands/publicationcmds.c      | 12 +++++--
 src/backend/commands/subscriptioncmds.c     | 27 +++++++++++++--
 src/backend/executor/execReplication.c      | 19 +++++------
 src/backend/replication/logical/relation.c  |  1 +
 src/backend/replication/logical/tablesync.c | 24 +++++++++++---
 src/include/replication/logicalproto.h      |  1 +
 src/test/regress/expected/publication.out   |  3 --
 10 files changed, 115 insertions(+), 59 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f657d1d06e..d67015e160 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,13 +402,22 @@
 
    <listitem>
     <para>
-     Replication is only possible from base tables to base tables.  That is,
-     the tables on the publication and on the subscription side must be normal
-     tables, not views, materialized views, partition root tables, or foreign
-     tables.  In the case of partitions, you can therefore replicate a
-     partition hierarchy one-to-one, but you cannot currently replicate to a
-     differently partitioned setup.  Attempts to replicate tables other than
-     base tables will result in an error.
+     Replication is only supported by regular and partitioned tables.
+     Attempts to replicate relations other than regular and partitioned tables,
+     such as views, materialized views, or foreign tables, will result in an
+     error.  However, note that replicating from a regular table to partitioned
+     table or vice versa is not supported.
+    </para>
+
+    <para>
+     When a partitioned table is added to a publication, all of its existing
+     and future partitions are implicitly considered to be part of the
+     publication.  Any changes made to the leaf partitions are sent to the
+     subscription server which must contain a partitioned table with partition
+     hierarchy matching one-to-one with the publication side partitioned
+     table.  For partitioned tables on the two sides to match one-to-one, each
+     partition with a given partition constraint must have the same name on
+     both sides.
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..5e11868989 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -68,15 +68,16 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
       that table is added to the publication.  If <literal>ONLY</literal> is not
       specified, the table and all its descendant tables (if any) are added.
       Optionally, <literal>*</literal> can be specified after the table name to
-      explicitly indicate that descendant tables are included.
+      explicitly indicate that descendant tables are included.  However, adding
+      a partitioned table to a publication never explicitly adds its partitions,
+      because partitions are implicitly published due to the partitioned table
+      being added to the publication.
      </para>
 
      <para>
-      Only persistent base tables can be part of a publication.  Temporary
-      tables, unlogged tables, foreign tables, materialized views, regular
-      views, and partitioned tables cannot be part of a publication.  To
-      replicate a partitioned table, add the individual partitions to the
-      publication.
+      Only persistent base tables and partitioned tables can be part of a
+      publication. Temporary tables, unlogged tables, foreign tables,
+      materialized views, regular views cannot be part of a publication.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 80b98e2c3c..d4cc805499 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -30,6 +30,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_type.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -50,17 +51,9 @@
 static void
 check_publication_add_relation(Relation targetrel)
 {
-	/* Give more specific error for partitioned tables */
-	if (RelationGetForm(targetrel)->relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("\"%s\" is a partitioned table",
-						RelationGetRelationName(targetrel)),
-				 errdetail("Adding partitioned tables to publications is not supported."),
-				 errhint("You can add the table partitions individually.")));
-
-	/* Must be table */
-	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION)
+	/* Must be a regular or partitioned table */
+	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("\"%s\" is not a table",
@@ -106,7 +99,8 @@ check_publication_add_relation(Relation targetrel)
 static bool
 is_publishable_class(Oid relid, Form_pg_class reltuple)
 {
-	return reltuple->relkind == RELKIND_RELATION &&
+	return (reltuple->relkind == RELKIND_RELATION ||
+			reltuple->relkind == RELKIND_PARTITIONED_TABLE) &&
 		!IsCatalogRelationOid(relid) &&
 		reltuple->relpersistence == RELPERSISTENCE_PERMANENT &&
 		relid >= FirstNormalObjectId;
@@ -224,7 +218,8 @@ publication_add_relation(Oid pubid, Relation targetrel,
 
 
 /*
- * Finds all publications associated with the relation.
+ * Finds all publications associated with the relation and if it's a
+ * partition, also with any of its ancestors.
  */
 List *
 GetRelationPublications(Relation rel)
@@ -233,20 +228,32 @@ GetRelationPublications(Relation rel)
 	List	   *result = NIL;
 	CatCList   *pubrellist;
 	int			i;
+	ListCell   *lc;
+	List	   *target_rels = NIL;
 
-	/* Find all publications associated with the relation. */
-	pubrellist = SearchSysCacheList1(PUBLICATIONRELMAP,
-									 ObjectIdGetDatum(relid));
-	for (i = 0; i < pubrellist->n_members; i++)
+	/* For a partition, include its ancestors' publications, if any. */
+	if (rel->rd_rel->relispartition)
+		target_rels = get_partition_ancestors(RelationGetRelid(rel));
+
+	target_rels = lappend_oid(target_rels, relid);
+
+	foreach(lc, target_rels)
 	{
-		HeapTuple	tup = &pubrellist->members[i]->tuple;
-		Oid			pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid;
+		Oid		relid = lfirst_oid(lc);
 
-		result = lappend_oid(result, pubid);
+		pubrellist = SearchSysCacheList1(PUBLICATIONRELMAP,
+										 ObjectIdGetDatum(relid));
+		for (i = 0; i < pubrellist->n_members; i++)
+		{
+			HeapTuple	tup = &pubrellist->members[i]->tuple;
+			Oid			pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid;
+
+			result = lappend_oid(result, pubid);
+		}
+
+		ReleaseSysCacheList(pubrellist);
 	}
 
-	ReleaseSysCacheList(pubrellist);
-
 	return result;
 }
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f115d4bf80..db17c47495 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -502,7 +502,8 @@ RemovePublicationRelById(Oid proid)
 
 /*
  * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode.
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
  */
 static List *
 OpenTableList(List *tables)
@@ -543,8 +544,13 @@ OpenTableList(List *tables)
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 
-		/* Add children of this rel, if requested */
-		if (recurse)
+		/*
+		 * Add children of this rel, if requested, so that they too are added
+		 * to the publication.  A partitioned table can't have any inheritance
+		 * children other than its partitions, which need not be explicitly
+		 * added to the publication.
+		 */
+		if (recurse && rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
 		{
 			List	   *children;
 			ListCell   *child;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 11c0f305ff..eb555f1d64 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -1149,11 +1149,20 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	Assert(list_length(publications) > 0);
 
 	initStringInfo(&cmd);
-	appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename, c.relkind\n"
+	appendStringInfoString(&cmd, "SELECT s.schemaname, s.tablename, s.relkind FROM (\n"
+						   "  SELECT DISTINCT t.pubname, t.schemaname, t.tablename, c.relkind\n"
 						   "  FROM pg_catalog.pg_publication_tables t\n"
 						   "  JOIN pg_class c ON t.schemaname = c.relnamespace::regnamespace::name\n"
 						   "  AND t.tablename = c.relname\n"
-						   " WHERE t.pubname IN (");
+						   "  UNION\n"
+						   "  SELECT DISTINCT t.pubname, s.schemaname, s.tablename, s.relkind\n"
+						   "  FROM pg_catalog.pg_publication_tables t,\n"
+						   "  LATERAL (SELECT c.relnamespace::regnamespace::name, c.relname, c.relkind\n"
+						   "		   FROM pg_class c\n"
+						   "		   JOIN pg_partition_tree(t.schemaname || '.' || t.tablename) p\n"
+						   "		   ON p.relid = c.oid\n"
+						   "		   WHERE p.level > 0) AS s(schemaname, tablename, relkind)) s\n"
+						   " WHERE s.pubname IN (");
 
 	first = true;
 	foreach(lc, publications)
@@ -1224,5 +1233,19 @@ ValidateSubscriptionRel(PublicationTable *pt)
 	local_relkind = get_rel_relkind(relid);
 	CheckSubscriptionRelkind(local_relkind, rv->schemaname, rv->relname);
 
+	/*
+	 * Cannot replicate from a regular to a partitioned table or vice
+	 * versa.
+	 */
+	if (local_relkind != pt->relkind)
+		ereport(ERROR,
+				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
+						rv->schemaname, rv->relname),
+				 errdetail("\"%s.%s\" is a %s on subscription side whereas a %s on publication side",
+						   rv->schemaname, rv->relname,
+						   local_relkind == RELKIND_RELATION ? "regular table" : "partitioned table",
+						   pt->relkind == RELKIND_RELATION ? "regular table" : "partitioned table")));
+
 	return relid;
 }
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 95e027c970..f05f44c99f 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -591,17 +591,10 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * We currently only support writing to regular and partitioned tables.
+	 * However, give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -609,7 +602,11 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	/*
+	 * Subscription for partitioned tables are really placeholder objects, as
+	 * replication itself occurs on the individual partition level.
+	 */
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index f938d1fa48..6a10593e79 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -177,6 +177,7 @@ logicalrep_relmap_update(LogicalRepRelation *remoterel)
 	entry->remoterel.remoteid = remoterel->remoteid;
 	entry->remoterel.nspname = pstrdup(remoterel->nspname);
 	entry->remoterel.relname = pstrdup(remoterel->relname);
+	entry->remoterel.relkind = remoterel->relkind;
 	entry->remoterel.natts = remoterel->natts;
 	entry->remoterel.attnames = palloc(remoterel->natts * sizeof(char *));
 	entry->remoterel.atttyps = palloc(remoterel->natts * sizeof(Oid));
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7881079e96..fa469c8c67 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -637,7 +637,8 @@ copy_read_data(void *outbuf, int minread, int maxread)
 
 /*
  * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication.  XXX - while we fetch relkind too
+ * here, the RELATION message doesn't provide it
  */
 static void
 fetch_remote_table_info(char *nspname, char *relname,
@@ -646,7 +647,7 @@ fetch_remote_table_info(char *nspname, char *relname,
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {OIDOID, CHAROID};
+	Oid			tableRow[3] = {OIDOID, CHAROID, CHAROID};
 	Oid			attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
 	bool		isnull;
 	int			natt;
@@ -656,16 +657,16 @@ fetch_remote_table_info(char *nspname, char *relname,
 
 	/* First fetch Oid and replica identity. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident"
+	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident, c.relkind"
 					 "  FROM pg_catalog.pg_class c"
 					 "  INNER JOIN pg_catalog.pg_namespace n"
 					 "        ON (c.relnamespace = n.oid)"
 					 " WHERE n.nspname = %s"
 					 "   AND c.relname = %s"
-					 "   AND c.relkind = 'r'",
+					 "   AND pg_relation_is_publishable(c.oid)",
 					 quote_literal_cstr(nspname),
 					 quote_literal_cstr(relname));
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 
 	if (res->status != WALRCV_OK_TUPLES)
 		ereport(ERROR,
@@ -682,6 +683,8 @@ fetch_remote_table_info(char *nspname, char *relname,
 	Assert(!isnull);
 	lrel->replident = DatumGetChar(slot_getattr(slot, 2, &isnull));
 	Assert(!isnull);
+	lrel->relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+	Assert(!isnull);
 
 	ExecDropSingleTupleTableSlot(slot);
 	walrcv_clear_result(res);
@@ -769,6 +772,17 @@ copy_table(Relation rel)
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
 
+	/*
+	 * If either table is partitioned, skip copying.  Individual partitions
+	 * will be copied instead.
+	 */
+	if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE ||
+		lrel.relkind == RELKIND_PARTITIONED_TABLE)
+	{
+		logicalrep_rel_close(relmapentry, NoLock);
+		return;
+	}
+
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
 	appendStringInfo(&cmd, "COPY %s TO STDOUT",
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 3fc430af01..0fea368d99 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -45,6 +45,7 @@ typedef struct LogicalRepRelation
 	LogicalRepRelId remoteid;	/* unique id of the relation */
 	char	   *nspname;		/* schema name */
 	char	   *relname;		/* relation name */
+	char		relkind;		/* relation kind */
 	int			natts;			/* number of columns */
 	char	  **attnames;		/* column names */
 	Oid		   *atttyps;		/* column types */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..ee0db9b07b 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -144,9 +144,6 @@ ERROR:  "testpub_view" is not a table
 DETAIL:  Only tables can be added to publications.
 -- fail - partitioned table
 ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
-ERROR:  "testpub_parted" is a partitioned table
-DETAIL:  Adding partitioned tables to publications is not supported.
-HINT:  You can add the table partitions individually.
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default ADD TABLE pub_test.testpub_nopk;
-- 
2.11.0

#12Amit Langote
amitlangote09@gmail.com
In reply to: Rafia Sabih (#10)
Re: adding partitioned tables to publications

Hello Rafia,

On Tue, Nov 5, 2019 at 12:41 AM Rafia Sabih <rafia.pghackers@gmail.com> wrote:

On Fri, 11 Oct 2019 at 08:06, Amit Langote <amitlangote09@gmail.com> wrote:

Thanks for sharing this case. I hadn't considered it, but you're
right that it should be handled sensibly. I have fixed table sync
code to handle this case properly. Could you please check your case
with the attached updated patch?

I was checking this today and found that the behavior doesn't change much with the updated patch. The tables are still replicated, just that a select count from parent table shows 0, rest of the partitions including default one has the data from the publisher. I was expecting more like an error at subscriber saying the table type is not same.

Please find the attached file for the test case, in case something is unclear.

Thanks for the test case.

With the latest patch I posted, you'll get the following error on subscriber:

create subscription mysub connection 'host=localhost port=5432
dbname=postgres' publication mypub;
ERROR: cannot use relation "public.t" as logical replication target
DETAIL: "public.t" is a regular table on subscription side whereas a
partitioned table on publication side

Although to be honest, I'd rather not see the error. As I mentioned
in my email earlier, it'd be nice to be able to sync a partitioned table
and a regular table (or vice versa) via replication.
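
For reference, a minimal sketch of the kind of setup that triggers the error above (the table and publication names here are assumed for illustration; Rafia's attached test case isn't reproduced in this message):

```sql
-- Publisher: a partitioned table added to a publication
create table t (a int) partition by range (a);
create table t_1 partition of t for values from (0) to (10);
create publication mypub for table t;

-- Subscriber: "t" exists, but as a regular (non-partitioned) table
create table t (a int);
create subscription mysub
    connection 'host=localhost port=5432 dbname=postgres'
    publication mypub;
-- With the latest patch, this fails with:
-- ERROR:  cannot use relation "public.t" as logical replication target
```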

Thanks,
Amit

#13Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#11)
2 attachment(s)
Re: adding partitioned tables to publications

On Fri, Nov 8, 2019 at 1:27 PM Amit Langote <amitlangote09@gmail.com> wrote:

Anyway, I've attached two patches -- 0001 is a refactoring patch. 0002
implements the feature.

0002 didn't contain the necessary pg_dump changes; that is fixed in the
attached new version.

Thanks,
Amit

Attachments:

v4-0001-Some-refactoring-of-publication-and-subscription-.patch
From b645e3ccef7106dc57e576bf8a62a41767469bb9 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 7 Nov 2019 18:19:33 +0900
Subject: [PATCH v4 1/2] Some refactoring of publication and subscription code

---
 src/backend/catalog/pg_publication.c        |  5 +-
 src/backend/commands/subscriptioncmds.c     | 79 ++++++++++++++++++++---------
 src/backend/commands/tablecmds.c            |  2 +-
 src/backend/replication/pgoutput/pgoutput.c | 11 ++--
 src/backend/utils/cache/relcache.c          |  2 +-
 src/include/catalog/pg_publication.h        |  2 +-
 6 files changed, 67 insertions(+), 34 deletions(-)

diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fd5da7d5f7..80b98e2c3c 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -224,11 +224,12 @@ publication_add_relation(Oid pubid, Relation targetrel,
 
 
 /*
- * Gets list of publication oids for a relation oid.
+ * Finds all publications associated with the relation.
  */
 List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Relation rel)
 {
+	Oid			relid = RelationGetRelid(rel);
 	List	   *result = NIL;
 	CatCList   *pubrellist;
 	int			i;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 1419195766..11c0f305ff 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -52,7 +52,19 @@
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 
+/*
+ * Structure for fetch_table_list() to store the information about
+ * a given published table.
+ */
+typedef struct PublicationTable
+{
+	char	   *nspname;
+	char	   *relname;
+	char		relkind;
+} PublicationTable;
+
 static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
+static Oid ValidateSubscriptionRel(PublicationTable *pt);
 
 /*
  * Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -464,15 +476,10 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
 			tables = fetch_table_list(wrconn, publications);
 			foreach(lc, tables)
 			{
-				RangeVar   *rv = (RangeVar *) lfirst(lc);
+				PublicationTable *pt = lfirst(lc);
 				Oid			relid;
 
-				relid = RangeVarGetRelid(rv, AccessShareLock, false);
-
-				/* Check for supported relkind. */
-				CheckSubscriptionRelkind(get_rel_relkind(relid),
-										 rv->schemaname, rv->relname);
-
+				relid = ValidateSubscriptionRel(pt);
 				AddSubscriptionRelState(subid, relid, table_state,
 										InvalidXLogRecPtr);
 			}
@@ -573,14 +580,11 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 
 	foreach(lc, pubrel_names)
 	{
-		RangeVar   *rv = (RangeVar *) lfirst(lc);
+		PublicationTable *pt = lfirst(lc);
 		Oid			relid;
 
-		relid = RangeVarGetRelid(rv, AccessShareLock, false);
-
-		/* Check for supported relkind. */
-		CheckSubscriptionRelkind(get_rel_relkind(relid),
-								 rv->schemaname, rv->relname);
+		/* Check that there's an appropriate relation present locally. */
+		relid = ValidateSubscriptionRel(pt);
 
 		pubrel_local_oids[off++] = relid;
 
@@ -592,7 +596,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 									InvalidXLogRecPtr);
 			ereport(DEBUG1,
 					(errmsg("table \"%s.%s\" added to subscription \"%s\"",
-							rv->schemaname, rv->relname, sub->name)));
+							pt->nspname, pt->relname, sub->name)));
 		}
 	}
 
@@ -1137,7 +1141,7 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {TEXTOID, TEXTOID};
+	Oid			tableRow[3] = {TEXTOID, TEXTOID, CHAROID};
 	ListCell   *lc;
 	bool		first;
 	List	   *tablelist = NIL;
@@ -1145,9 +1149,12 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	Assert(list_length(publications) > 0);
 
 	initStringInfo(&cmd);
-	appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename\n"
+	appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename, c.relkind\n"
 						   "  FROM pg_catalog.pg_publication_tables t\n"
+						   "  JOIN pg_class c ON t.schemaname = c.relnamespace::regnamespace::name\n"
+						   "  AND t.tablename = c.relname\n"
 						   " WHERE t.pubname IN (");
+
 	first = true;
 	foreach(lc, publications)
 	{
@@ -1162,7 +1169,7 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	}
 	appendStringInfoChar(&cmd, ')');
 
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 	pfree(cmd.data);
 
 	if (res->status != WALRCV_OK_TUPLES)
@@ -1174,18 +1181,17 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
 	while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
 	{
-		char	   *nspname;
-		char	   *relname;
+		PublicationTable *pt = palloc0(sizeof(PublicationTable));
 		bool		isnull;
-		RangeVar   *rv;
 
-		nspname = TextDatumGetCString(slot_getattr(slot, 1, &isnull));
+		pt->nspname = TextDatumGetCString(slot_getattr(slot, 1, &isnull));
 		Assert(!isnull);
-		relname = TextDatumGetCString(slot_getattr(slot, 2, &isnull));
+		pt->relname = TextDatumGetCString(slot_getattr(slot, 2, &isnull));
+		Assert(!isnull);
+		pt->relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
 		Assert(!isnull);
 
-		rv = makeRangeVar(pstrdup(nspname), pstrdup(relname), -1);
-		tablelist = lappend(tablelist, rv);
+		tablelist = lappend(tablelist, pt);
 
 		ExecClearTuple(slot);
 	}
@@ -1195,3 +1201,28 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 
 	return tablelist;
 }
+
+/*
+ * Looks up a local relation matching the given publication table and
+ * checks that it's appropriate to use as replication target, erroring
+ * out if not.
+ *
+ * Oid of the successfully validated local relation is returned.
+ */
+static Oid
+ValidateSubscriptionRel(PublicationTable *pt)
+{
+	Oid		relid;
+	RangeVar *rv;
+	char	local_relkind;
+
+	rv = makeRangeVar(pstrdup(pt->nspname), pstrdup(pt->relname), -1);
+	relid = RangeVarGetRelid(rv, AccessShareLock, false);
+	Assert(OidIsValid(relid));
+
+	/* Check for supported relkind. */
+	local_relkind = get_rel_relkind(relid);
+	CheckSubscriptionRelkind(local_relkind, rv->schemaname, rv->relname);
+
+	return relid;
+}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 5597be6e3d..270e76ad73 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14157,7 +14157,7 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 	 * UNLOGGED as UNLOGGED tables can't be published.
 	 */
 	if (!toLogged &&
-		list_length(GetRelationPublications(RelationGetRelid(rel))) > 0)
+		list_length(GetRelationPublications(rel)) > 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("cannot change table \"%s\" to unlogged because it is part of a publication",
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..20856fa33c 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -66,7 +66,7 @@ typedef struct RelationSyncEntry
 static HTAB *RelationSyncCache = NULL;
 
 static void init_rel_sync_cache(MemoryContext decoding_context);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
 static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
 static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
 										  uint32 hashvalue);
@@ -314,7 +314,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	if (!is_publishable_relation(relation))
 		return;
 
-	relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+	relentry = get_rel_sync_entry(data, relation);
 
 	/* First check the table filter */
 	switch (change->action)
@@ -404,7 +404,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		if (!is_publishable_relation(relation))
 			continue;
 
-		relentry = get_rel_sync_entry(data, relid);
+		relentry = get_rel_sync_entry(data, relation);
 
 		if (!relentry->pubactions.pubtruncate)
 			continue;
@@ -529,8 +529,9 @@ init_rel_sync_cache(MemoryContext cachectx)
  * Find or create entry in the relation schema cache.
  */
 static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
 {
+	Oid			relid = RelationGetRelid(rel);
 	RelationSyncEntry *entry;
 	bool		found;
 	MemoryContext oldctx;
@@ -548,7 +549,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	/* Not found means schema wasn't sent */
 	if (!found || !entry->replicate_valid)
 	{
-		List	   *pubids = GetRelationPublications(relid);
+		List	   *pubids = GetRelationPublications(rel);
 		ListCell   *lc;
 
 		/* Reload publications if needed before use. */
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 585dcee5db..161fe95fe6 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -5105,7 +5105,7 @@ GetRelationPublicationActions(Relation relation)
 					  sizeof(PublicationActions));
 
 	/* Fetch the publication membership info. */
-	puboids = GetRelationPublications(RelationGetRelid(relation));
+	puboids = GetRelationPublications(relation);
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 20a2f0ac1b..2981f61db1 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -80,7 +80,7 @@ typedef struct Publication
 
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
-extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationPublications(Relation rel);
 extern List *GetPublicationRelations(Oid pubid);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
-- 
2.11.0

Attachment: v4-0002-Support-adding-partitioned-tables-to-publication.patch (application/octet-stream)
From cd232d91618010f332dfd8e6b265c769204d70da Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 7 Nov 2019 18:10:44 +0900
Subject: [PATCH v4 2/2] Support adding partitioned tables to publication

---
 doc/src/sgml/logical-replication.sgml       | 23 +++++++++----
 doc/src/sgml/ref/create_publication.sgml    | 13 ++++----
 src/backend/catalog/pg_publication.c        | 51 ++++++++++++++++-------------
 src/backend/commands/publicationcmds.c      | 12 +++++--
 src/backend/commands/subscriptioncmds.c     | 27 +++++++++++++--
 src/backend/executor/execReplication.c      | 19 +++++------
 src/backend/replication/logical/relation.c  |  1 +
 src/backend/replication/logical/tablesync.c | 24 +++++++++++---
 src/bin/pg_dump/pg_dump.c                   |  5 +--
 src/include/replication/logicalproto.h      |  1 +
 src/test/regress/expected/publication.out   |  3 --
 11 files changed, 118 insertions(+), 61 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f657d1d06e..d67015e160 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,13 +402,22 @@
 
    <listitem>
     <para>
-     Replication is only possible from base tables to base tables.  That is,
-     the tables on the publication and on the subscription side must be normal
-     tables, not views, materialized views, partition root tables, or foreign
-     tables.  In the case of partitions, you can therefore replicate a
-     partition hierarchy one-to-one, but you cannot currently replicate to a
-     differently partitioned setup.  Attempts to replicate tables other than
-     base tables will result in an error.
+     Replication is only supported by regular and partitioned tables.
+     Attempts to replicate relations other than regular and partitioned tables,
+     such as views, materialized views, or foreign tables, will result in an
+     error.  However, note that replicating from a regular table to a
+     partitioned table or vice versa is not supported.
+    </para>
+
+    <para>
+     When a partitioned table is added to a publication, all of its existing
+     and future partitions are implicitly considered to be part of the
+     publication.  Any changes made to the leaf partitions are sent to the
+     subscription server which must contain a partitioned table with partition
+     hierarchy matching one-to-one with the publication side partitioned
+     table.  For partitioned tables on the two sides to match one-to-one, each
+     partition with a given partition constraint must have the same name on
+     both sides.
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..5e11868989 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -68,15 +68,16 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
       that table is added to the publication.  If <literal>ONLY</literal> is not
       specified, the table and all its descendant tables (if any) are added.
       Optionally, <literal>*</literal> can be specified after the table name to
-      explicitly indicate that descendant tables are included.
+      explicitly indicate that descendant tables are included.  However, adding
+      a partitioned table to a publication never explicitly adds its partitions,
+      because partitions are implicitly published due to the partitioned table
+      being added to the publication.
      </para>
 
      <para>
-      Only persistent base tables can be part of a publication.  Temporary
-      tables, unlogged tables, foreign tables, materialized views, regular
-      views, and partitioned tables cannot be part of a publication.  To
-      replicate a partitioned table, add the individual partitions to the
-      publication.
+      Only persistent base tables and partitioned tables can be part of a
+      publication. Temporary tables, unlogged tables, foreign tables,
+      materialized views, and regular views cannot be part of a publication.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 80b98e2c3c..d4cc805499 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -30,6 +30,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_type.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -50,17 +51,9 @@
 static void
 check_publication_add_relation(Relation targetrel)
 {
-	/* Give more specific error for partitioned tables */
-	if (RelationGetForm(targetrel)->relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("\"%s\" is a partitioned table",
-						RelationGetRelationName(targetrel)),
-				 errdetail("Adding partitioned tables to publications is not supported."),
-				 errhint("You can add the table partitions individually.")));
-
-	/* Must be table */
-	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION)
+	/* Must be a regular or partitioned table */
+	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("\"%s\" is not a table",
@@ -106,7 +99,8 @@ check_publication_add_relation(Relation targetrel)
 static bool
 is_publishable_class(Oid relid, Form_pg_class reltuple)
 {
-	return reltuple->relkind == RELKIND_RELATION &&
+	return (reltuple->relkind == RELKIND_RELATION ||
+			reltuple->relkind == RELKIND_PARTITIONED_TABLE) &&
 		!IsCatalogRelationOid(relid) &&
 		reltuple->relpersistence == RELPERSISTENCE_PERMANENT &&
 		relid >= FirstNormalObjectId;
@@ -224,7 +218,8 @@ publication_add_relation(Oid pubid, Relation targetrel,
 
 
 /*
- * Finds all publications associated with the relation.
+ * Finds all publications associated with the relation and if it's a
+ * partition, also with any of its ancestors.
  */
 List *
 GetRelationPublications(Relation rel)
@@ -233,20 +228,32 @@ GetRelationPublications(Relation rel)
 	List	   *result = NIL;
 	CatCList   *pubrellist;
 	int			i;
+	ListCell   *lc;
+	List	   *target_rels = NIL;
 
-	/* Find all publications associated with the relation. */
-	pubrellist = SearchSysCacheList1(PUBLICATIONRELMAP,
-									 ObjectIdGetDatum(relid));
-	for (i = 0; i < pubrellist->n_members; i++)
+	/* For a partition, include its ancestors' publications, if any. */
+	if (rel->rd_rel->relispartition)
+		target_rels = get_partition_ancestors(RelationGetRelid(rel));
+
+	target_rels = lappend_oid(target_rels, relid);
+
+	foreach(lc, target_rels)
 	{
-		HeapTuple	tup = &pubrellist->members[i]->tuple;
-		Oid			pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid;
+		Oid		relid = lfirst_oid(lc);
 
-		result = lappend_oid(result, pubid);
+		pubrellist = SearchSysCacheList1(PUBLICATIONRELMAP,
+										 ObjectIdGetDatum(relid));
+		for (i = 0; i < pubrellist->n_members; i++)
+		{
+			HeapTuple	tup = &pubrellist->members[i]->tuple;
+			Oid			pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid;
+
+			result = lappend_oid(result, pubid);
+		}
+
+		ReleaseSysCacheList(pubrellist);
 	}
 
-	ReleaseSysCacheList(pubrellist);
-
 	return result;
 }
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f115d4bf80..db17c47495 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -502,7 +502,8 @@ RemovePublicationRelById(Oid proid)
 
 /*
  * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode.
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
  */
 static List *
 OpenTableList(List *tables)
@@ -543,8 +544,13 @@ OpenTableList(List *tables)
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 
-		/* Add children of this rel, if requested */
-		if (recurse)
+		/*
+		 * Add children of this rel, if requested, so that they too are added
+		 * to the publication.  A partitioned table can't have any inheritance
+		 * children other than its partitions, which need not be explicitly
+		 * added to the publication.
+		 */
+		if (recurse && rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
 		{
 			List	   *children;
 			ListCell   *child;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 11c0f305ff..eb555f1d64 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -1149,11 +1149,20 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	Assert(list_length(publications) > 0);
 
 	initStringInfo(&cmd);
-	appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename, c.relkind\n"
+	appendStringInfoString(&cmd, "SELECT s.schemaname, s.tablename, s.relkind FROM (\n"
+						   "  SELECT DISTINCT t.pubname, t.schemaname, t.tablename, c.relkind\n"
 						   "  FROM pg_catalog.pg_publication_tables t\n"
 						   "  JOIN pg_class c ON t.schemaname = c.relnamespace::regnamespace::name\n"
 						   "  AND t.tablename = c.relname\n"
-						   " WHERE t.pubname IN (");
+						   "  UNION\n"
+						   "  SELECT DISTINCT t.pubname, s.schemaname, s.tablename, s.relkind\n"
+						   "  FROM pg_catalog.pg_publication_tables t,\n"
+						   "  LATERAL (SELECT c.relnamespace::regnamespace::name, c.relname, c.relkind\n"
+						   "		   FROM pg_class c\n"
+						   "		   JOIN pg_partition_tree(t.schemaname || '.' || t.tablename) p\n"
+						   "		   ON p.relid = c.oid\n"
+						   "		   WHERE p.level > 0) AS s(schemaname, tablename, relkind)) s\n"
+						   " WHERE s.pubname IN (");
 
 	first = true;
 	foreach(lc, publications)
@@ -1224,5 +1233,19 @@ ValidateSubscriptionRel(PublicationTable *pt)
 	local_relkind = get_rel_relkind(relid);
 	CheckSubscriptionRelkind(local_relkind, rv->schemaname, rv->relname);
 
+	/*
+	 * Cannot replicate from a regular to a partitioned table or vice
+	 * versa.
+	 */
+	if (local_relkind != pt->relkind)
+		ereport(ERROR,
+				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
+						rv->schemaname, rv->relname),
+				 errdetail("\"%s.%s\" is a %s on subscription side whereas a %s on publication side.",
+						   rv->schemaname, rv->relname,
+						   local_relkind == RELKIND_RELATION ? "regular table" : "partitioned table",
+						   pt->relkind == RELKIND_RELATION ? "regular table" : "partitioned table")));
+
 	return relid;
 }
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 95e027c970..f05f44c99f 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -591,17 +591,10 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * We currently only support writing to regular and partitioned tables.
+	 * However, give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -609,7 +602,11 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	/*
+	 * Subscriptions for partitioned tables are really placeholder objects, as
+	 * replication itself occurs on the individual partition level.
+	 */
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index b386f8460d..aac2af71b7 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -177,6 +177,7 @@ logicalrep_relmap_update(LogicalRepRelation *remoterel)
 	entry->remoterel.remoteid = remoterel->remoteid;
 	entry->remoterel.nspname = pstrdup(remoterel->nspname);
 	entry->remoterel.relname = pstrdup(remoterel->relname);
+	entry->remoterel.relkind = remoterel->relkind;
 	entry->remoterel.natts = remoterel->natts;
 	entry->remoterel.attnames = palloc(remoterel->natts * sizeof(char *));
 	entry->remoterel.atttyps = palloc(remoterel->natts * sizeof(Oid));
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 7881079e96..fa469c8c67 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -637,7 +637,8 @@ copy_read_data(void *outbuf, int minread, int maxread)
 
 /*
  * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication.  XXX - while we fetch relkind too
+ * here, the RELATION message doesn't provide it
  */
 static void
 fetch_remote_table_info(char *nspname, char *relname,
@@ -646,7 +647,7 @@ fetch_remote_table_info(char *nspname, char *relname,
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {OIDOID, CHAROID};
+	Oid			tableRow[3] = {OIDOID, CHAROID, CHAROID};
 	Oid			attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
 	bool		isnull;
 	int			natt;
@@ -656,16 +657,16 @@ fetch_remote_table_info(char *nspname, char *relname,
 
 	/* First fetch Oid and replica identity. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident"
+	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident, c.relkind"
 					 "  FROM pg_catalog.pg_class c"
 					 "  INNER JOIN pg_catalog.pg_namespace n"
 					 "        ON (c.relnamespace = n.oid)"
 					 " WHERE n.nspname = %s"
 					 "   AND c.relname = %s"
-					 "   AND c.relkind = 'r'",
+					 "   AND pg_relation_is_publishable(c.oid)",
 					 quote_literal_cstr(nspname),
 					 quote_literal_cstr(relname));
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 
 	if (res->status != WALRCV_OK_TUPLES)
 		ereport(ERROR,
@@ -682,6 +683,8 @@ fetch_remote_table_info(char *nspname, char *relname,
 	Assert(!isnull);
 	lrel->replident = DatumGetChar(slot_getattr(slot, 2, &isnull));
 	Assert(!isnull);
+	lrel->relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+	Assert(!isnull);
 
 	ExecDropSingleTupleTableSlot(slot);
 	walrcv_clear_result(res);
@@ -769,6 +772,17 @@ copy_table(Relation rel)
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
 
+	/*
+	 * If either table is partitioned, skip copying.  Individual partitions
+	 * will be copied instead.
+	 */
+	if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE ||
+		lrel.relkind == RELKIND_PARTITIONED_TABLE)
+	{
+		logicalrep_rel_close(relmapentry, NoLock);
+		return;
+	}
+
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
 	appendStringInfo(&cmd, "COPY %s TO STDOUT",
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index bf69adc2f4..b0e87b9075 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3969,8 +3969,9 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 	{
 		TableInfo  *tbinfo = &tblinfo[i];
 
-		/* Only plain tables can be aded to publications. */
-		if (tbinfo->relkind != RELKIND_RELATION)
+		/* Only plain and partitioned tables can be added to publications. */
+		if (tbinfo->relkind != RELKIND_RELATION &&
+			tbinfo->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
 
 		/*
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 3fc430af01..0fea368d99 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -45,6 +45,7 @@ typedef struct LogicalRepRelation
 	LogicalRepRelId remoteid;	/* unique id of the relation */
 	char	   *nspname;		/* schema name */
 	char	   *relname;		/* relation name */
+	char		relkind;		/* relation kind */
 	int			natts;			/* number of columns */
 	char	  **attnames;		/* column names */
 	Oid		   *atttyps;		/* column types */
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..ee0db9b07b 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -144,9 +144,6 @@ ERROR:  "testpub_view" is not a table
 DETAIL:  Only tables can be added to publications.
 -- fail - partitioned table
 ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
-ERROR:  "testpub_parted" is a partitioned table
-DETAIL:  Adding partitioned tables to publications is not supported.
-HINT:  You can add the table partitions individually.
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default ADD TABLE pub_test.testpub_nopk;
-- 
2.11.0

#14Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#13)
Re: adding partitioned tables to publications

On 2019-11-11 08:59, Amit Langote wrote:

On Fri, Nov 8, 2019 at 1:27 PM Amit Langote <amitlangote09@gmail.com> wrote:

Anyway, I've attached two patches -- 0001 is a refactoring patch. 0002
implements the feature.

0002 didn't contain the necessary pg_dump changes, which is fixed in
the attached new version.

That looks more pleasant.

I don't understand why you go through great lengths to ensure that the
relkinds match between publisher and subscriber. We already ensure that
only regular tables are published and only regular tables are allowed as
subscription targets. In the future, we may want to allow further
combinations. What situation are you trying to address here?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#14)
Re: adding partitioned tables to publications

On Mon, Nov 11, 2019 at 9:49 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2019-11-11 08:59, Amit Langote wrote:

On Fri, Nov 8, 2019 at 1:27 PM Amit Langote <amitlangote09@gmail.com> wrote:

Anyway, I've attached two patches -- 0001 is a refactoring patch. 0002
implements the feature.

0002 didn't contain the necessary pg_dump changes, which is fixed in
the attached new version.

That looks more pleasant.

Thanks for looking.

I don't understand why you go through great lengths to ensure that the
relkinds match between publisher and subscriber. We already ensure that
only regular tables are published and only regular tables are allowed as
subscription target. In the future, we may want to allow further
combinations. What situation are you trying to address here?

I'd really like to see the requirement that the relkinds match go
away, but as you can see, this patch doesn't modify enough of
pgoutput.c and worker.c to make that possible. Both the code for the
initital syncing and that for the subsequent real-time replication
assume that both source and target are regular tables. So even if
partitioned tables can now be in a publication, they're never sent in
the protocol messages, only their leaf partitions are. Initial
syncing code can be easily modified to support any combination of
source and target relations, but changes needed for real-time
replication seem non-trivial. Do you think we should do that before
we can say partitioned tables support logical replication?
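
To make the current behavior concrete, here is a minimal sketch
(reusing the hash-partitioned table from the first message; the
subscription's connection string is hypothetical):

```sql
-- publisher: publishing the root now implicitly publishes every partition
create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);
create publication publish_p for table p;

-- subscriber: changes still arrive as changes to p1/p2/p3, so the same
-- partition hierarchy, with matching partition names, must exist here
create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);
create subscription sub_p connection 'dbname=src' publication publish_p;
```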

Thanks,
Amit

#16Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#15)
1 attachment(s)
Re: adding partitioned tables to publications

On Tue, Nov 12, 2019 at 10:11 AM Amit Langote <amitlangote09@gmail.com> wrote:

Initial
syncing code can be easily modified to support any combination of
source and target relations, but changes needed for real-time
replication seem non-trivial.

I have spent some time hacking on this. With the attached updated
patch, adding a partitioned table to publication results in publishing
the inserts, updates, deletes of the table's leaf partitions as
inserts, updates, deletes of the table itself (it all happens inside
pgoutput). So, the replication target table doesn't necessarily have
to be a partitioned table, and even if it is partitioned, its partitions
don't have to match one-to-one.

One restriction remains though: partitioned tables on a subscriber
can't accept updates and deletes, because we'd need to map those to
updates and deletes of their partitions, including handling a tuple
possibly moving from one partition to another during an update.
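
As a sketch of what this allows (the subscriber-side table definitions
below are illustrative, not from the patch's tests):

```sql
-- publisher: same partitioned table p as in the earlier example
create publication publish_p for table p;

-- subscriber, option 1: a plain table works, since changes are now
-- published as changes to p itself rather than to its leaf partitions
create table p (a int, b int);

-- subscriber, option 2: a differently partitioned table also works,
-- subject to the restriction mentioned above: a partitioned target
-- currently accepts only inserts, not updates or deletes
create table p (a int, b int) partition by range (a);
create table p_low partition of p for values from (minvalue) to (100);
create table p_high partition of p for values from (100) to (maxvalue);
```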

Also, I haven't added subscription tests yet.

Attached updated patch. The previous division into a refactoring
patch and a feature patch no longer made sense to me, so there is
only one this time.

Thanks,
Amit

Attachments:

Attachment: v5-0001-Support-adding-partitioned-tables-to-publications.patch (application/octet-stream)
From d04d5a49de338042241a2ac86a608bf6dbe8a02c Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 7 Nov 2019 18:19:33 +0900
Subject: [PATCH v5] Support adding partitioned tables to publications

When a partitioned table is added to a publication, any direct and
indirect changes of its leaf partitions are published as if they were
its own.
---
 doc/src/sgml/logical-replication.sgml       |  12 +-
 doc/src/sgml/ref/create_publication.sgml    |  19 ++-
 src/backend/catalog/pg_publication.c        |  50 +++++--
 src/backend/commands/publicationcmds.c      |  12 +-
 src/backend/commands/subscriptioncmds.c     |  89 +++++++++----
 src/backend/executor/execReplication.c      |  17 +--
 src/backend/replication/logical/tablesync.c |  27 ++--
 src/backend/replication/logical/worker.c    |  89 ++++++++++++-
 src/backend/replication/pgoutput/pgoutput.c | 198 ++++++++++++++++++++++------
 src/bin/pg_dump/pg_dump.c                   |   5 +-
 src/include/catalog/pg_publication.h        |   2 +
 src/test/regress/expected/publication.out   |  19 ++-
 src/test/regress/sql/publication.sql        |  12 +-
 13 files changed, 428 insertions(+), 123 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f657d1d06e..87c950b9c8 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,13 +402,11 @@
 
    <listitem>
     <para>
-     Replication is only possible from base tables to base tables.  That is,
-     the tables on the publication and on the subscription side must be normal
-     tables, not views, materialized views, partition root tables, or foreign
-     tables.  In the case of partitions, you can therefore replicate a
-     partition hierarchy one-to-one, but you cannot currently replicate to a
-     differently partitioned setup.  Attempts to replicate tables other than
-     base tables will result in an error.
+     Replication is only supported by regular and partitioned tables, although
+     when using partitioned tables as replication <quote>target</quote>, only
+     inserts can be replicated.  Attempts to replicate relations other than
+     regular and partitioned tables, such as views, materialized views, or
+     foreign tables, will result in an error.
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..9a4efc06f8 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -68,14 +68,18 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
       that table is added to the publication.  If <literal>ONLY</literal> is not
       specified, the table and all its descendant tables (if any) are added.
       Optionally, <literal>*</literal> can be specified after the table name to
-      explicitly indicate that descendant tables are included.
+      explicitly indicate that descendant tables are included.  However, adding
+      a partitioned table to a publication never explicitly adds its partitions,
+      because partitions are implicitly published due to the partitioned table
+      being added to the publication.
      </para>
 
      <para>
-      Only persistent base tables can be part of a publication.  Temporary
-      tables, unlogged tables, foreign tables, materialized views, regular
-      views, and partitioned tables cannot be part of a publication.  To
-      replicate a partitioned table, add the individual partitions to the
+      Only persistent base tables and partitioned tables can be part of a
+      publication. Temporary tables, unlogged tables, foreign tables,
+      materialized views, and regular views cannot be part of a publication.
+      When a partitioned table is added to a publication, all of its existing
+      and future partitions are also implicitly considered to be part of the
       publication.
      </para>
     </listitem>
@@ -133,6 +137,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
   </para>
 
   <para>
+   Partitioned tables are not considered when <literal>FOR ALL TABLES</literal>
+   is specified.
+  </para>
+
+  <para>
    The creation of a publication does not start replication.  It only defines
    a grouping and filtering logic for future subscribers.
   </para>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d442c8e0bb..de8f1afc65 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -26,6 +26,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
 #include "catalog/pg_type.h"
@@ -47,17 +48,9 @@
 static void
 check_publication_add_relation(Relation targetrel)
 {
-	/* Give more specific error for partitioned tables */
-	if (RelationGetForm(targetrel)->relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("\"%s\" is a partitioned table",
-						RelationGetRelationName(targetrel)),
-				 errdetail("Adding partitioned tables to publications is not supported."),
-				 errhint("You can add the table partitions individually.")));
-
-	/* Must be table */
-	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION)
+	/* Must be a regular or partitioned table */
+	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("\"%s\" is not a table",
@@ -103,7 +96,8 @@ check_publication_add_relation(Relation targetrel)
 static bool
 is_publishable_class(Oid relid, Form_pg_class reltuple)
 {
-	return reltuple->relkind == RELKIND_RELATION &&
+	return (reltuple->relkind == RELKIND_RELATION ||
+			reltuple->relkind == RELKIND_PARTITIONED_TABLE) &&
 		!IsCatalogRelationOid(relid) &&
 		reltuple->relpersistence == RELPERSISTENCE_PERMANENT &&
 		relid >= FirstNormalObjectId;
@@ -247,6 +241,38 @@ GetRelationPublications(Oid relid)
 }
 
 /*
+ * Finds all publications that publish changes to the input relation's
+ * ancestors.
+ *
+ * *published_ancestors will contain the OIDs of ancestors, one for each
+ * publication returned. The ancestor OIDs can be repeated, because a given
+ * ancestor may be published via multiple publications.
+ */
+List *
+GetRelationAncestorPublications(Relation rel,
+								List **published_ancestors)
+{
+	List	   *ancestors = get_partition_ancestors(RelationGetRelid(rel));
+	List	   *ancestor_pubids = NIL;
+	ListCell   *lc;
+
+	*published_ancestors = NIL;
+	foreach(lc, ancestors)
+	{
+		Oid		relid = lfirst_oid(lc);
+		List   *rel_publishers = GetRelationPublications(relid);
+		int		n = list_length(rel_publishers),
+				i;
+
+		ancestor_pubids = list_concat_copy(ancestor_pubids, rel_publishers);
+		for (i = 0; i < n; i++)
+			*published_ancestors = lappend_oid(*published_ancestors, relid);
+	}
+
+	return ancestor_pubids;
+}
+
+/*
  * Gets list of relation oids for a publication.
  *
  * This should only be used for normal publications, the FOR ALL TABLES
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index fbf11c86aa..ee56acf3f3 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -498,7 +498,8 @@ RemovePublicationRelById(Oid proid)
 
 /*
  * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode.
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
  */
 static List *
 OpenTableList(List *tables)
@@ -539,8 +540,13 @@ OpenTableList(List *tables)
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 
-		/* Add children of this rel, if requested */
-		if (recurse)
+		/*
+		 * Add children of this rel, if requested, so that they too are added
+		 * to the publication.  A partitioned table can't have any inheritance
+		 * children other than its partitions, which need not be explicitly
+		 * added to the publication.
+		 */
+		if (recurse && rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
 		{
 			List	   *children;
 			ListCell   *child;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5408edcfc2..9056409fee 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -44,7 +44,20 @@
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 
+/*
+ * Structure for fetch_table_list() to store the information about
+ * a given published table.
+ */
+typedef struct PublicationTable
+{
+	char	   *schemaname;
+	char	   *relname;
+	bool		pubupdate;
+	bool		pubdelete;
+} PublicationTable;
+
 static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
+static Oid ValidateSubscriptionRel(PublicationTable *pt);
 
 /*
  * Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -456,15 +469,10 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
 			tables = fetch_table_list(wrconn, publications);
 			foreach(lc, tables)
 			{
-				RangeVar   *rv = (RangeVar *) lfirst(lc);
+				PublicationTable *pt = (PublicationTable *) lfirst(lc);
 				Oid			relid;
 
-				relid = RangeVarGetRelid(rv, AccessShareLock, false);
-
-				/* Check for supported relkind. */
-				CheckSubscriptionRelkind(get_rel_relkind(relid),
-										 rv->schemaname, rv->relname);
-
+				relid = ValidateSubscriptionRel(pt);
 				AddSubscriptionRelState(subid, relid, table_state,
 										InvalidXLogRecPtr);
 			}
@@ -565,14 +573,11 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 
 	foreach(lc, pubrel_names)
 	{
-		RangeVar   *rv = (RangeVar *) lfirst(lc);
+		PublicationTable *pt = (PublicationTable *) lfirst(lc);
 		Oid			relid;
 
-		relid = RangeVarGetRelid(rv, AccessShareLock, false);
-
-		/* Check for supported relkind. */
-		CheckSubscriptionRelkind(get_rel_relkind(relid),
-								 rv->schemaname, rv->relname);
+		/* Check that there's an appropriate relation present locally. */
+		relid = ValidateSubscriptionRel(pt);
 
 		pubrel_local_oids[off++] = relid;
 
@@ -584,7 +589,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 									InvalidXLogRecPtr);
 			ereport(DEBUG1,
 					(errmsg("table \"%s.%s\" added to subscription \"%s\"",
-							rv->schemaname, rv->relname, sub->name)));
+							pt->schemaname, pt->relname, sub->name)));
 		}
 	}
 
@@ -1129,7 +1134,7 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {TEXTOID, TEXTOID};
+	Oid			tableRow[4] = {TEXTOID, TEXTOID, BOOLOID, BOOLOID};
 	ListCell   *lc;
 	bool		first;
 	List	   *tablelist = NIL;
@@ -1137,9 +1142,11 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	Assert(list_length(publications) > 0);
 
 	initStringInfo(&cmd);
-	appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename\n"
+	appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename, p.pubupdate, p.pubdelete\n"
 						   "  FROM pg_catalog.pg_publication_tables t\n"
+						   "  JOIN pg_catalog.pg_publication p ON t.pubname = p.pubname\n"
 						   " WHERE t.pubname IN (");
+
 	first = true;
 	foreach(lc, publications)
 	{
@@ -1154,7 +1161,7 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	}
 	appendStringInfoChar(&cmd, ')');
 
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 4, tableRow);
 	pfree(cmd.data);
 
 	if (res->status != WALRCV_OK_TUPLES)
@@ -1166,18 +1173,19 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	slot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
 	while (tuplestore_gettupleslot(res->tuplestore, true, false, slot))
 	{
-		char	   *nspname;
-		char	   *relname;
+		PublicationTable *pt = palloc0(sizeof(PublicationTable));
 		bool		isnull;
-		RangeVar   *rv;
 
-		nspname = TextDatumGetCString(slot_getattr(slot, 1, &isnull));
+		pt->schemaname = TextDatumGetCString(slot_getattr(slot, 1, &isnull));
 		Assert(!isnull);
-		relname = TextDatumGetCString(slot_getattr(slot, 2, &isnull));
+		pt->relname = TextDatumGetCString(slot_getattr(slot, 2, &isnull));
+		Assert(!isnull);
+		pt->pubupdate = DatumGetBool(slot_getattr(slot, 3, &isnull));
+		Assert(!isnull);
+		pt->pubdelete = DatumGetBool(slot_getattr(slot, 4, &isnull));
 		Assert(!isnull);
 
-		rv = makeRangeVar(pstrdup(nspname), pstrdup(relname), -1);
-		tablelist = lappend(tablelist, rv);
+		tablelist = lappend(tablelist, pt);
 
 		ExecClearTuple(slot);
 	}
@@ -1187,3 +1195,36 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 
 	return tablelist;
 }
+
+/*
+ * Looks up a local relation matching the given publication table and
+ * checks that it's appropriate to use as a replication target, erroring
+ * out if not.
+ *
+ * The OID of the successfully validated local relation is returned.
+ */
+static Oid
+ValidateSubscriptionRel(PublicationTable *pt)
+{
+	Oid		relid;
+	char	local_relkind;
+	RangeVar *rv;
+
+	rv = makeRangeVar(pstrdup(pt->schemaname), pstrdup(pt->relname), -1);
+	relid = RangeVarGetRelid(rv, AccessShareLock, false);
+	Assert(OidIsValid(relid));
+
+	/* Check for supported relkind. */
+	local_relkind = get_rel_relkind(relid);
+	CheckSubscriptionRelkind(local_relkind, rv->schemaname, rv->relname);
+
+	if (local_relkind == RELKIND_PARTITIONED_TABLE &&
+		(pt->pubupdate || pt->pubdelete))
+		ereport(ERROR,
+				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+				 errmsg("cannot use partitioned table \"%s.%s\" as logical replication target",
+						pt->schemaname, pt->relname),
+				 errdetail("Partitioned tables can accept only insert and truncate operations via logical replication.")));
+
+	return relid;
+}
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 95e027c970..11a2293b56 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -396,7 +396,7 @@ ExecSimpleRelationInsert(EState *estate, TupleTableSlot *slot)
 	ResultRelInfo *resultRelInfo = estate->es_result_relation_info;
 	Relation	rel = resultRelInfo->ri_RelationDesc;
 
-	/* For now we support only tables. */
+	/* For now we support only regular tables. */
 	Assert(rel->rd_rel->relkind == RELKIND_RELATION);
 
 	CheckCmdReplicaIdentity(rel, CMD_INSERT);
@@ -591,17 +591,10 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * We currently only support writing to regular and partitioned tables.
+	 * However, give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -609,7 +602,7 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e01d18c3a1..cccbe0e9c1 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -630,16 +630,17 @@ copy_read_data(void *outbuf, int minread, int maxread)
 
 /*
  * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication.  XXX - While we fetch relkind too
+ * here, the RELATION message doesn't provide it.
  */
 static void
 fetch_remote_table_info(char *nspname, char *relname,
-						LogicalRepRelation *lrel)
+						LogicalRepRelation *lrel, char *relkind)
 {
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {OIDOID, CHAROID};
+	Oid			tableRow[3] = {OIDOID, CHAROID, CHAROID};
 	Oid			attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
 	bool		isnull;
 	int			natt;
@@ -649,16 +650,16 @@ fetch_remote_table_info(char *nspname, char *relname,
 
 	/* First fetch Oid and replica identity. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident"
+	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident, c.relkind"
 					 "  FROM pg_catalog.pg_class c"
 					 "  INNER JOIN pg_catalog.pg_namespace n"
 					 "        ON (c.relnamespace = n.oid)"
 					 " WHERE n.nspname = %s"
 					 "   AND c.relname = %s"
-					 "   AND c.relkind = 'r'",
+					 "   AND pg_relation_is_publishable(c.oid)",
 					 quote_literal_cstr(nspname),
 					 quote_literal_cstr(relname));
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 
 	if (res->status != WALRCV_OK_TUPLES)
 		ereport(ERROR,
@@ -675,6 +676,8 @@ fetch_remote_table_info(char *nspname, char *relname,
 	Assert(!isnull);
 	lrel->replident = DatumGetChar(slot_getattr(slot, 2, &isnull));
 	Assert(!isnull);
+	*relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+	Assert(!isnull);
 
 	ExecDropSingleTupleTableSlot(slot);
 	walrcv_clear_result(res);
@@ -750,10 +753,12 @@ copy_table(Relation rel)
 	CopyState	cstate;
 	List	   *attnamelist;
 	ParseState *pstate;
+	char		remote_relkind;
 
 	/* Get the publisher relation info. */
 	fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
-							RelationGetRelationName(rel), &lrel);
+							RelationGetRelationName(rel), &lrel,
+							&remote_relkind);
 
 	/* Put the relation into relmap. */
 	logicalrep_relmap_update(&lrel);
@@ -764,8 +769,12 @@ copy_table(Relation rel)
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "COPY %s TO STDOUT",
-					 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	if (remote_relkind == RELKIND_PARTITIONED_TABLE)
+		appendStringInfo(&cmd, "COPY (SELECT * FROM %s) TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	else
+		appendStringInfo(&cmd, "COPY %s TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
 	res = walrcv_exec(wrconn, cmd.data, 0, NULL);
 	pfree(cmd.data);
 	if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ff62303638..78fcb4ffe9 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -29,11 +29,13 @@
 #include "access/xlog_internal.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_subscription.h"
 #include "catalog/pg_subscription_rel.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "executor/nodeModifyTable.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
@@ -139,6 +141,21 @@ should_apply_changes_for_rel(LogicalRepRelMapEntry *rel)
 }
 
 /*
+ * Different interface to use when a LogicalRepRelMapEntry is not present
+ * for a given local target relation.
+ */
+static bool
+should_apply_changes_for_relid(Oid localreloid, char state,
+							   XLogRecPtr statelsn)
+{
+	if (am_tablesync_worker())
+		return MyLogicalRepWorker->relid == localreloid;
+	else
+		return (state == SUBREL_STATE_READY ||
+				(state == SUBREL_STATE_SYNCDONE &&
+				 statelsn <= remote_final_lsn));
+}
+/*
  * Make sure that we started local transaction.
  *
  * Also switches to ApplyMessageContext as necessary.
@@ -573,6 +590,8 @@ apply_handle_insert(StringInfo s)
 	EState	   *estate;
 	TupleTableSlot *remoteslot;
 	MemoryContext oldctx;
+	ModifyTableState *mtstate = NULL;
+	PartitionTupleRouting *proute = NULL;
 
 	ensure_transaction();
 
@@ -601,6 +620,36 @@ apply_handle_insert(StringInfo s)
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
 	slot_store_cstrings(remoteslot, rel, newtup.values);
 	slot_fill_defaults(rel, estate, remoteslot);
+
+	/* Tuple routing for a partitioned table. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+	{
+		ResultRelInfo	 *partrelinfo;
+		PartitionRoutingInfo *partinfo;
+		TupleConversionMap *map;
+
+		mtstate = makeNode(ModifyTableState);
+		mtstate->ps.plan = NULL;
+		mtstate->ps.state = estate;
+		mtstate->operation = CMD_INSERT;
+		mtstate->resultRelInfo = estate->es_result_relations;
+		proute = ExecSetupPartitionTupleRouting(estate, mtstate,
+												rel->localrel);
+		partrelinfo = ExecFindPartition(mtstate,
+										estate->es_result_relation_info,
+										proute, remoteslot, estate);
+		estate->es_result_relation_info = partrelinfo;
+		partinfo = partrelinfo->ri_PartitionInfo;
+		map = partinfo->pi_RootToPartitionMap;
+		if (map != NULL)
+		{
+			TupleTableSlot *new_slot = partinfo->pi_PartitionTupleSlot;
+
+			remoteslot = execute_attr_map_slot(map->attrMap, remoteslot,
+											   new_slot);
+		}
+	}
+
 	MemoryContextSwitchTo(oldctx);
 
 	ExecOpenIndices(estate->es_result_relation_info, false);
@@ -610,6 +659,8 @@ apply_handle_insert(StringInfo s)
 
 	/* Cleanup. */
 	ExecCloseIndices(estate->es_result_relation_info);
+	if (proute)
+		ExecCleanupTupleRouting(mtstate, proute);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
@@ -906,14 +957,48 @@ apply_handle_truncate(StringInfo s)
 		LogicalRepRelMapEntry *rel;
 
 		rel = logicalrep_rel_open(relid, RowExclusiveLock);
+
 		if (!should_apply_changes_for_rel(rel))
 		{
+			bool	really_skip = true;
+
+			/*
+			 * If we appear to have been sent a leaf partition because an
+			 * ancestor was truncated, check whether any ancestor has a
+			 * valid subscription state before deciding to skip truncating
+			 * the partition.
+			 */
+			if (rel->state == SUBREL_STATE_UNKNOWN &&
+				rel->localrel->rd_rel->relispartition)
+			{
+				List   *ancestors = get_partition_ancestors(rel->localreloid);
+				ListCell *lc1;
+
+				foreach(lc1, ancestors)
+				{
+					Oid			ancestor = lfirst_oid(lc1);
+					XLogRecPtr	statelsn;
+					char		state;
+
+					/* Check using the ancestor's subscription state. */
+					state = GetSubscriptionRelState(MySubscription->oid,
+													ancestor, &statelsn,
+													false);
+					really_skip &= !should_apply_changes_for_relid(ancestor,
+																   state,
+																   statelsn);
+				}
+			}
+
 			/*
 			 * The relation can't become interesting in the middle of the
 			 * transaction so it's safe to unlock it.
 			 */
-			logicalrep_rel_close(rel, RowExclusiveLock);
-			continue;
+			if (really_skip)
+			{
+				logicalrep_rel_close(rel, RowExclusiveLock);
+				continue;
+			}
 		}
 
 		remote_rels = lappend(remote_rels, rel);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3483c1b877..a0fc37542b 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -49,21 +50,34 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /* Entry in the map used to remember which relation schemas we sent. */
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema -- either own or ancestor's?
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * Valid if the relation's changes are published as those of some
+	 * ancestor, i.e., if the relation is a partition.  The map, if any,
+	 * converts tuples from the partition's rowtype to the ancestor's.
+	 */
+	Oid			replicate_as_relid;
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
 static HTAB *RelationSyncCache = NULL;
 
 static void init_rel_sync_cache(MemoryContext decoding_context);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
 static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
 static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
 										  uint32 hashvalue);
@@ -254,47 +268,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation ancestor =
+			RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc 	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
 
-		desc = RelationGetDescr(relation);
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+/*
+ * Sends the schema of the given relation, preceded by type info for its
+ * user-defined column types, if any.
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created
+	 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
+	 * consider objects with hand-assigned OIDs to be "built in", not for
+	 * instance any function or type defined in the information_schema.
+	 * This is important because only hand-assigned OIDs can be expected
+	 * to remain stable across major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->attisdropped || att->attgenerated)
+			continue;
+
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -311,7 +350,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	if (!is_publishable_relation(relation))
 		return;
 
-	relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+	relentry = get_rel_sync_entry(data, relation);
 
 	/* First check the table filter */
 	switch (change->action)
@@ -341,28 +380,56 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					if (relentry->map)
+					{
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -401,7 +468,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		if (!is_publishable_relation(relation))
 			continue;
 
-		relentry = get_rel_sync_entry(data, relid);
+		relentry = get_rel_sync_entry(data, relation);
 
 		if (!relentry->pubactions.pubtruncate)
 			continue;
@@ -524,10 +591,16 @@ init_rel_sync_cache(MemoryContext cachectx)
 
 /*
  * Find or create entry in the relation schema cache.
+ *
+ * For a partition, the schema of the top-most ancestor that is published
+ * will be used instead of that of the partition itself, so the information
+ * about ancestor's publications is looked up here and saved in the schema
+ * cache entry.
  */
 static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
 {
+	Oid			relid = RelationGetRelid(rel);
 	RelationSyncEntry *entry;
 	bool		found;
 	MemoryContext oldctx;
@@ -585,6 +658,51 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 				break;
 		}
 
+		/*
+		 * For partitions, we prefer to publish their changes using an
+		 * ancestor's schema (usually the top-most ancestor) if it is
+		 * published, but only if a publication explicitly lists the ancestor
+		 * as its member (that is, not a FOR ALL TABLES publication).
+		 */
+		if (rel->rd_rel->relispartition)
+		{
+			List	   *ancestor_pubids;
+			List	   *published_ancestors = NIL;
+			Oid			topmost_published_ancestor = InvalidOid;
+
+			ancestor_pubids =
+				GetRelationAncestorPublications(rel, &published_ancestors);
+
+			foreach(lc, data->publications)
+			{
+				Publication *pub = lfirst(lc);
+					ListCell *lc1,
+							 *lc2;
+
+				forboth(lc1, ancestor_pubids, lc2, published_ancestors)
+				{
+					if (lfirst_oid(lc1) == pub->oid)
+					{
+						entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
+						entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
+						entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
+						entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+						topmost_published_ancestor = lfirst_oid(lc2);
+					}
+				}
+
+				if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
+					entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
+					break;
+			}
+
+			if (OidIsValid(topmost_published_ancestor))
+				entry->replicate_as_relid = topmost_published_ancestor;
+
+			list_free(ancestor_pubids);
+			list_free(published_ancestors);
+		}
+
 		list_free(pubids);
 
 		entry->replicate_valid = true;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index bf69adc2f4..57b4d1a8c1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3969,8 +3969,9 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 	{
 		TableInfo  *tbinfo = &tblinfo[i];
 
-		/* Only plain tables can be aded to publications. */
-		if (tbinfo->relkind != RELKIND_RELATION)
+		/* Only plain and partitioned tables can be added to publications. */
+		if (tbinfo->relkind != RELKIND_RELATION &&
+			tbinfo->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
 
 		/*
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 20a2f0ac1b..a67a626a71 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -81,6 +81,8 @@ typedef struct Publication
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
 extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationAncestorPublications(Relation rel,
+								List **published_ancestors);
 extern List *GetPublicationRelations(Oid pubid);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..b41e90e8ad 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -116,6 +116,20 @@ Tables:
 
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+-- should add only the parent to publication, not the partition
+CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+SELECT tablename FROM pg_publication_tables WHERE pubname = 'testpub_forparted';
+   tablename    
+----------------
+ testpub_parted
+(1 row)
+
+DROP PUBLICATION testpub_forparted;
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
@@ -142,11 +156,6 @@ Tables:
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
 DETAIL:  Only tables can be added to publications.
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
-ERROR:  "testpub_parted" is a partitioned table
-DETAIL:  Adding partitioned tables to publications is not supported.
-HINT:  You can add the table partitions individually.
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default ADD TABLE pub_test.testpub_nopk;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a755cf..bed6e7d54c 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -69,6 +69,16 @@ RESET client_min_messages;
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
 
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+-- should add only the parent to publication, not the partition
+CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+SELECT tablename FROM pg_publication_tables WHERE pubname = 'testpub_forparted';
+DROP PUBLICATION testpub_forparted;
+
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 SET client_min_messages = 'ERROR';
@@ -83,8 +93,6 @@ CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 
 -- fail - view
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
 
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
-- 
2.11.0

#17Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#16)
Re: adding partitioned tables to publications

On 2019-11-18 09:53, Amit Langote wrote:

I have spent some time hacking on this. With the attached updated
patch, adding a partitioned table to publication results in publishing
the inserts, updates, deletes of the table's leaf partitions as
inserts, updates, deletes of the table itself (it all happens inside
pgoutput). So, the replication target table doesn't necessarily have
to be a partitioned table and even if it is partitioned its partitions
don't have to match one-to-one.

One restriction remains though: partitioned tables on a subscriber
can't accept updates and deletes, because we'd need to map those to
updates and deletes of their partitions, including handling a tuple
possibly moving from one partition to another during an update.
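
To illustrate the behavior described above, here is a minimal sketch
(hypothetical object and connection names, not taken from the patch):

```sql
-- publisher: publishing the root publishes all existing and future partitions
CREATE TABLE measurement (city_id int, logdate date) PARTITION BY RANGE (logdate);
CREATE TABLE measurement_y2019 PARTITION OF measurement
    FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');
CREATE PUBLICATION pub_measurement FOR TABLE measurement;

-- subscriber: the target need not be partitioned, or partitioned the same way
CREATE TABLE measurement (city_id int, logdate date);
CREATE SUBSCRIPTION sub_measurement
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub_measurement;
-- a partitioned target, however, would accept only INSERT and TRUNCATE for now
```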

Right. Without that second part, the first part isn't really that
useful yet, is it?

I'm not sure what your intent with this patch is now. I thought the
previous behavior -- add a partitioned table to a publication and its
leaf tables appear in the replication output -- was pretty welcome. Do
we not want that anymore?

There should probably be an option to pick the behavior, like we do in
pg_dump.

What happens when you add a leaf table directly to a publication? Is it
replicated under its own identity or under its ancestor partitioned
table? (What if both the leaf table and a partitioned table are
publication members?)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#18Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#15)
Re: adding partitioned tables to publications

On 2019-11-12 02:11, Amit Langote wrote:

I don't understand why you go through great lengths to ensure that the
relkinds match between publisher and subscriber. We already ensure that
only regular tables are published and only regular tables are allowed as
subscription target. In the future, we may want to allow further
combinations. What situation are you trying to address here?

I'd really want to see the requirement for relkinds to have to match
go away, but as you can see, this patch doesn't modify enough of
pgoutput.c and worker.c to make that possible. Both the code for the
initital syncing and that for the subsequent real-time replication
assume that both source and target are regular tables. So even if
partitioned tables can now be in a publication, they're never sent in
the protocol messages, only their leaf partitions are. Initial
syncing code can be easily modified to support any combination of
source and target relations, but changes needed for real-time
replication seem non-trivial. Do you think we should do that before
we can say partitioned tables support logical replication?

My question was more simply why you have this check:

+   /*
+    * Cannot replicate from a regular to a partitioned table or vice
+    * versa.
+    */
+   if (local_relkind != pt->relkind)
+       ereport(ERROR,
+               (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+                errmsg("cannot use relation \"%s.%s\" as logical 
replication target",
+                       rv->schemaname, rv->relname),

It doesn't seem necessary. What happens if you remove it?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#19Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#17)
Re: adding partitioned tables to publications

Hi Peter,

On Wed, Nov 20, 2019 at 4:55 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2019-11-18 09:53, Amit Langote wrote:

I have spent some time hacking on this. With the attached updated
patch, adding a partitioned table to publication results in publishing
the inserts, updates, deletes of the table's leaf partitions as
inserts, updates, deletes of the table itself (it all happens inside
pgoutput). So, the replication target table doesn't necessarily have
to be a partitioned table and even if it is partitioned its partitions
don't have to match one-to-one.

One restriction remains though: partitioned tables on a subscriber
can't accept updates and deletes, because we'd need to map those to
updates and deletes of their partitions, including handling a tuple
possibly moving from one partition to another during an update.

Right. Without that second part, the first part isn't really that
useful yet, is it?

I would say yes.

I'm not sure what your intent with this patch is now. I thought the
previous behavior -- add a partitioned table to a publication and its
leaf tables appear in the replication output -- was pretty welcome. Do
we not want that anymore?

Hmm, I thought it would be more desirable to not expose a published
partitioned table's leaf partitions to a subscriber, because it allows
the target table to be defined more flexibly.

There should probably be an option to pick the behavior, like we do in
pg_dump.

I don't understand which existing behavior. Can you clarify?

Regarding allowing users to choose between publishing partitioned
table changes using leaf tables' schema vs as using own schema, I tend
to agree that there would be value in that. Users who choose the
former will have to ensure that target leaf partitions match exactly.
Users who want flexibility in how the target table is defined can use
the latter.

What happens when you add a leaf table directly to a publication? Is it
replicated under its own identity or under its ancestor partitioned
table? (What if both the leaf table and a partitioned table are
publication members?)

If both a leaf partition and an ancestor belong to the same
publication, then leaf partition changes are replicated using the
ancestor's schema. For a leaf partition to be replicated using its
own schema it must be published via a separate publication that
doesn't contain the ancestor. At least that's what the current patch
does.

Thanks,
Amit

#20Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#19)
Re: adding partitioned tables to publications

On 2019-11-22 07:28, Amit Langote wrote:

Hmm, I thought it would be more desirable to not expose a published
partitioned table's leaf partitions to a subscriber, because it allows
the target table to be defined more flexibly.

There are multiple different variants that we probably eventually want
to support. But I think there is value in exposing the partition
structure to the subscriber. Most notably, it allows the subscriber to
run the initial table sync per partition rather than in one big chunk --
which ultimately reflects one of the reasons partitioning exists.

The other way, exposing only the partitioned table, is also useful,
especially if you want to partition differently on the subscriber. But
without the ability to target a partitioned table on the subscriber,
this would right now only allow you to replicate a partitioned table
into a non-partitioned table. Which is valid but probably not often useful.

What happens when you add a leaf table directly to a publication? Is it
replicated under its own identity or under its ancestor partitioned
table? (What if both the leaf table and a partitioned table are
publication members?)

If both a leaf partition and an ancestor belong to the same
publication, then leaf partition changes are replicated using the
ancestor's schema. For a leaf partition to be replicated using its
own schema it must be published via a separate publication that
doesn't contain the ancestor. At least that's what the current patch
does.

Hmm, that seems confusing. This would mean that if you add a
partitioned table to a publication that already contains leaf tables,
the publication behavior of the leaf tables would change. So again, I
think this alternative behavior of publishing partitions under the name
of their root table should be an explicit option on a publication, and
then it should be ensured somehow that individual partitions are not
added to the publication in confusing ways.

So, it's up to you which aspect of this you want to tackle, but I
thought your original goal of being able to add partitioned tables to
publications and have that implicitly expand to all member partitions on
the publication side seemed quite useful, self-contained, and
uncontroversial.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#21Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Eisentraut (#20)
Re: adding partitioned tables to publications

On Fri, Nov 22, 2019 at 4:16 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2019-11-22 07:28, Amit Langote wrote:

What happens when you add a leaf table directly to a publication? Is it
replicated under its own identity or under its ancestor partitioned
table? (What if both the leaf table and a partitioned table are
publication members?)

If both a leaf partition and an ancestor belong to the same
publication, then leaf partition changes are replicated using the
ancestor's schema. For a leaf partition to be replicated using its
own schema it must be published via a separate publication that
doesn't contain the ancestor. At least that's what the current patch
does.

Hmm, that seems confusing. This would mean that if you add a
partitioned table to a publication that already contains leaf tables,
the publication behavior of the leaf tables would change. So again, I
think this alternative behavior of publishing partitions under the name
of their root table should be an explicit option on a publication, and
then it should be ensured somehow that individual partitions are not
added to the publication in confusing ways.

Yeah, it can probably detect and throw an error for such cases.

So, it's up to you which aspect of this you want to tackle, but I
thought your original goal of being able to add partitioned tables to
publications and have that implicitly expand to all member partitions on
the publication side seemed quite useful, self-contained, and
uncontroversial.

+1.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#22Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#20)
Re: adding partitioned tables to publications

On Fri, Nov 22, 2019 at 7:46 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2019-11-22 07:28, Amit Langote wrote:

Hmm, I thought it would be more desirable to not expose a published
partitioned table's leaf partitions to a subscriber, because it allows
the target table to be defined more flexibly.

There are multiple different variants that we probably eventually want
to support. But I think there is value in exposing the partition
structure to the subscriber. Most notably, it allows the subscriber to
run the initial table sync per partition rather than in one big chunk --
which ultimately reflects one of the reasons partitioning exists.

I agree that replicating leaf-to-leaf has the least overhead.

The other way, exposing only the partitioned table, is also useful,
especially if you want to partition differently on the subscriber. But
without the ability to target a partitioned table on the subscriber,
this would right now only allow you to replicate a partitioned table
into a non-partitioned table. Which is valid but probably not often useful.

Handling non-partitioned target tables was the main reason for me to
make publishing using the root parent's schema the default behavior.
But given that replicating from partitioned tables into
non-partitioned ones would be rare, I agree to replicating using the
leaf schema by default.

What happens when you add a leaf table directly to a publication? Is it
replicated under its own identity or under its ancestor partitioned
table? (What if both the leaf table and a partitioned table are
publication members?)

If both a leaf partition and an ancestor belong to the same
publication, then leaf partition changes are replicated using the
ancestor's schema. For a leaf partition to be replicated using its
own schema it must be published via a separate publication that
doesn't contain the ancestor. At least that's what the current patch
does.

Hmm, that seems confusing. This would mean that if you add a
partitioned table to a publication that already contains leaf tables,
the publication behavior of the leaf tables would change. So again, I
think this alternative behavior of publishing partitions under the name
of their root table should be an explicit option on a publication, and
then it should be ensured somehow that individual partitions are not
added to the publication in confusing ways.

So, it's up to you which aspect of this you want to tackle, but I
thought your original goal of being able to add partitioned tables to
publications and have that implicitly expand to all member partitions on
the publication side seemed quite useful, self-contained, and
uncontroversial.

OK, let's make whether to publish with root or leaf schema an option,
with the latter being the default. I will see about updating the
patch that way.

Thanks,
Amit

#23Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#22)
4 attachment(s)
Re: adding partitioned tables to publications

On Mon, Nov 25, 2019 at 6:37 PM Amit Langote <amitlangote09@gmail.com> wrote:

OK, let's make whether to publish with root or leaf schema an option,
with the latter being the default. I will see about updating the
patch that way.

Here are the updated patches.

0001: Adding a partitioned table to a publication implicitly adds all
its partitions. The receiving side must have tables matching the
published partitions, which is typically the case, because the same
partition tree is defined on both nodes.
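
For example (a sketch with hypothetical table names; under 0001 the partitions are published implicitly):

```sql
CREATE TABLE p (a int) PARTITION BY LIST (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES IN (1);

-- Adding the root publishes the existing partition p1 as well,
-- and any partitions created later:
CREATE PUBLICATION pub_p FOR TABLE p;

SELECT tablename FROM pg_publication_tables WHERE pubname = 'pub_p';
```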

0002: Add a new Boolean publication parameter
'publish_using_root_schema'. If true, a partitioned table's
partitions are not exposed to the subscriber; that is, changes to its
partitions are published as the table's own. This allows replicating
partitioned table changes into a non-partitioned table (seldom useful)
or into a partitioned table that has a different set of partitions than
the one on the publisher (a reasonable use case). This patch only adds the
parameter and doesn't implement any of that behavior.
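
The new parameter would be used like this (a sketch assuming the parameter name from the 0002 patch):

```sql
-- Publish changes to p's partitions using p's own schema:
CREATE PUBLICATION pub_root FOR TABLE p
    WITH (publish_using_root_schema = true);

-- The setting can also be toggled afterwards:
ALTER PUBLICATION pub_root SET (publish_using_root_schema = false);
```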

0003: A refactoring patch for worker.c to make handling partitioned
tables as targets of logical replication commands a bit easier.

0004: This implements the 'publish_using_root_schema = true' behavior
described above. (An unintended benefit of making partitioned tables
an accepted relation type in worker.c is that it allows partitions on
subscriber to be sub-partitioned even if they are not on the
publisher, that is, when replicating partition-to-partition!)

Thanks,
Amit

Attachments:

v6-0002-Add-publish_using_root_schema-parameter-for-publi.patchapplication/octet-stream; name=v6-0002-Add-publish_using_root_schema-parameter-for-publi.patchDownload
From be83ae09f426939aca69ea4abfcd6164e1141044 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v6 2/4] Add publish_using_root_schema parameter for
 publications

It dictates whether (leaf) partition changes are published using
the schema of the root parent table.
---
 doc/src/sgml/ref/create_publication.sgml  |  15 +++++
 src/backend/catalog/pg_publication.c      |   1 +
 src/backend/commands/publicationcmds.c    |  94 ++++++++++++++++-----------
 src/bin/pg_dump/pg_dump.c                 |  22 ++++++-
 src/bin/pg_dump/pg_dump.h                 |   1 +
 src/bin/psql/describe.c                   |  17 ++++-
 src/include/catalog/pg_publication.h      |   3 +
 src/test/regress/expected/publication.out | 103 +++++++++++++++++-------------
 src/test/regress/sql/publication.sql      |   3 +
 9 files changed, 171 insertions(+), 88 deletions(-)

diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 848779a00f..a8cf2c4629 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -124,6 +124,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_using_root_schema</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table contained in the publication will be published using its own
+          schema rather than that of the individual partitions that are
+          changed; the latter is the default.  Setting it to
+          <literal>true</literal> allows the changes to be replicated into a
+          non-partitioned table or a partitioned table consisting of
+          a different set of partitions.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9e14a8216e..5ef77f1014 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -404,6 +404,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->publish_using_root_schema = pubform->pubasroot;
 
 	ReleaseSysCache(tup);
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index ee56acf3f3..06e833fe57 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -55,20 +55,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_using_root_schema_given,
+						  bool *publish_using_root_schema)
 {
 	ListCell   *lc;
 
+	*publish_using_root_schema_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* By default, changes are published using the relation's own schema. */
+	*publish_using_root_schema = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -90,10 +93,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -109,19 +112,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_using_root_schema") == 0)
+		{
+			if (*publish_using_root_schema_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_using_root_schema_given = true;
+			*publish_using_root_schema = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -142,10 +154,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -182,9 +193,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -192,13 +203,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubasroot - 1] =
+		BoolGetDatum(publish_using_root_schema);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -250,17 +263,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -269,19 +281,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_using_root_schema_given)
+	{
+		values[Anum_pg_publication_pubasroot - 1] = BoolGetDatum(publish_using_root_schema);
+		replaces[Anum_pg_publication_pubasroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index b5e91771e4..cfe89d4e09 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3780,6 +3780,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubasroot;
 	int			i,
 				ntups;
 
@@ -3791,11 +3792,18 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubasroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false as pubasroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
@@ -3819,6 +3827,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubasroot = PQfnumber(res, "pubasroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3841,6 +3850,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubasroot =
+			(strcmp(PQgetvalue(res, i, i_pubasroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -3917,7 +3928,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubasroot)
+		appendPQExpBufferStr(query, ", publish_using_root_schema = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 7b2c1524a5..99b0c1611d 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -600,6 +600,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubasroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index b3b9313b36..ce1321e17f 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5706,7 +5706,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5737,6 +5737,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubasroot AS \"%s\"",
+						  gettext_noop("Publishes Using Root Schema"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5778,6 +5782,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubasroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5790,6 +5795,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubasroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5800,6 +5806,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubasroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubasroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5849,6 +5858,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubasroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5861,6 +5872,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubasroot)
+			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5869,6 +5882,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubasroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 5ee7091472..61d338b110 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubasroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,6 +76,7 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		publish_using_root_schema;
 	PublicationActions pubactions;
 } Publication;
 
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index e3fabe70f9..da22ca3c6a 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                                       Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -124,10 +126,19 @@ RESET client_min_messages;
 CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
+Tables:
+    "public.testpub_parted"
+
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | t
 Tables:
     "public.testpub_parted"
 
@@ -146,10 +157,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                          Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -187,10 +198,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -234,10 +245,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -247,20 +258,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                                    List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                                       List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index b79a3f8f8f..7ddca1b974 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
 
 \dRp
 
@@ -77,6 +78,8 @@ RESET client_min_messages;
 CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
 DROP PUBLICATION testpub_forparted;
 
 -- fail - view
-- 
2.11.0

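For reference, a minimal sketch of the behavior the v6 patch enables (table and publication names below are illustrative only, not taken from the patch):

```sql
-- With the patch applied, a partitioned table can be published directly;
-- its existing and future partitions are published implicitly.
create table measurement (city_id int, logdate date)
  partition by range (logdate);
create table measurement_y2019 partition of measurement
  for values from ('2019-01-01') to ('2020-01-01');

-- Previously this errored with "measurement is a partitioned table";
-- now it succeeds and covers all partitions, current and future.
create publication pub_measurement for table measurement;

-- A change applied directly to a partition is still replicated as a
-- change to that partition, via the ancestor's publication.
insert into measurement values (1, '2019-06-01');
```

As the patch notes, the subscriber must still have a matching partition (here, a local `measurement_y2019`) because changes are replicated per partition, not against the root table.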
Attachment: v6-0001-Support-adding-partitioned-tables-to-publication.patch (application/octet-stream)
From e015211927787d3ef104fc8e6217a11ce50a6115 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 7 Nov 2019 18:19:33 +0900
Subject: [PATCH v6 1/4] Support adding partitioned tables to publication

---
 doc/src/sgml/logical-replication.sgml       |  10 +-
 doc/src/sgml/ref/create_publication.sgml    |  27 +++--
 src/backend/catalog/pg_publication.c        |  42 +++++---
 src/backend/commands/copy.c                 |   2 +-
 src/backend/commands/publicationcmds.c      |  12 ++-
 src/backend/commands/subscriptioncmds.c     |  63 +++++++----
 src/backend/executor/execMain.c             |   7 +-
 src/backend/executor/execPartition.c        |   5 +-
 src/backend/executor/execReplication.c      |  43 ++++----
 src/backend/executor/nodeModifyTable.c      |   6 +-
 src/backend/replication/logical/tablesync.c |  30 ++++--
 src/backend/replication/pgoutput/pgoutput.c |  41 +++++--
 src/bin/pg_dump/pg_dump.c                   |   5 +-
 src/include/catalog/pg_publication.h        |   1 +
 src/include/executor/executor.h             |   8 +-
 src/test/regress/expected/publication.out   |  21 +++-
 src/test/regress/sql/publication.sql        |  12 ++-
 src/test/subscription/t/013_partition.pl    | 159 ++++++++++++++++++++++++++++
 18 files changed, 399 insertions(+), 95 deletions(-)
 create mode 100644 src/test/subscription/t/013_partition.pl

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f657d1d06e..cbf33d73c6 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,13 +402,9 @@
 
    <listitem>
     <para>
-     Replication is only possible from base tables to base tables.  That is,
-     the tables on the publication and on the subscription side must be normal
-     tables, not views, materialized views, partition root tables, or foreign
-     tables.  In the case of partitions, you can therefore replicate a
-     partition hierarchy one-to-one, but you cannot currently replicate to a
-     differently partitioned setup.  Attempts to replicate tables other than
-     base tables will result in an error.
+     Replication is only supported for regular and partitioned tables.
+     Attempts to replicate other types of relations, such as views,
+     materialized views, or foreign tables, will result in an error.
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..848779a00f 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -68,15 +68,25 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
       that table is added to the publication.  If <literal>ONLY</literal> is not
       specified, the table and all its descendant tables (if any) are added.
       Optionally, <literal>*</literal> can be specified after the table name to
-      explicitly indicate that descendant tables are included.
+      explicitly indicate that descendant tables are included.  However, adding
+      a partitioned table to a publication never explicitly adds its partitions,
+      because partitions are implicitly published due to the partitioned table
+      being added to the publication.
      </para>
 
      <para>
-      Only persistent base tables can be part of a publication.  Temporary
-      tables, unlogged tables, foreign tables, materialized views, regular
-      views, and partitioned tables cannot be part of a publication.  To
-      replicate a partitioned table, add the individual partitions to the
-      publication.
+      Only persistent base tables and partitioned tables can be part of a
+      publication.  Temporary tables, unlogged tables, foreign tables,
+      materialized views, and regular views cannot be part of a publication.
+     </para>
+
+     <para>
+      When a partitioned table is added to a publication, all of its existing
+      and future partitions are also implicitly considered to be part of the
+      publication.  So, any <command>INSERT</command>, <command>UPDATE</command>,
+      <command>DELETE</command>, and <command>TRUNCATE</command> operations
+      that are directly applied to a partition are also published via its
+      ancestors' publications.
      </para>
     </listitem>
    </varlistentry>
@@ -133,6 +143,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
   </para>
 
   <para>
+   Partitioned tables are not considered when <literal>FOR ALL TABLES</literal>
+   is specified.
+  </para>
+
+  <para>
    The creation of a publication does not start replication.  It only defines
    a grouping and filtering logic for future subscribers.
   </para>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d442c8e0bb..9e14a8216e 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -26,6 +26,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
 #include "catalog/pg_type.h"
@@ -47,17 +48,9 @@
 static void
 check_publication_add_relation(Relation targetrel)
 {
-	/* Give more specific error for partitioned tables */
-	if (RelationGetForm(targetrel)->relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("\"%s\" is a partitioned table",
-						RelationGetRelationName(targetrel)),
-				 errdetail("Adding partitioned tables to publications is not supported."),
-				 errhint("You can add the table partitions individually.")));
-
-	/* Must be table */
-	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION)
+	/* Must be a regular or partitioned table */
+	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("\"%s\" is not a table",
@@ -103,7 +96,8 @@ check_publication_add_relation(Relation targetrel)
 static bool
 is_publishable_class(Oid relid, Form_pg_class reltuple)
 {
-	return reltuple->relkind == RELKIND_RELATION &&
+	return (reltuple->relkind == RELKIND_RELATION ||
+			reltuple->relkind == RELKIND_PARTITIONED_TABLE) &&
 		!IsCatalogRelationOid(relid) &&
 		reltuple->relpersistence == RELPERSISTENCE_PERMANENT &&
 		relid >= FirstNormalObjectId;
@@ -230,7 +224,7 @@ GetRelationPublications(Oid relid)
 	CatCList   *pubrellist;
 	int			i;
 
-	/* Find all publications associated with the relation. */
+	/* Finds all publications associated with the relation. */
 	pubrellist = SearchSysCacheList1(PUBLICATIONRELMAP,
 									 ObjectIdGetDatum(relid));
 	for (i = 0; i < pubrellist->n_members; i++)
@@ -247,6 +241,28 @@ GetRelationPublications(Oid relid)
 }
 
 /*
+ * Finds all publications that publish changes to the input relation's
+ * ancestors.
+ */
+List *
+GetRelationAncestorPublications(Oid relid)
+{
+	List	   *ancestors = get_partition_ancestors(relid);
+	List	   *ancestor_pubids = NIL;
+	ListCell   *lc;
+
+	foreach(lc, ancestors)
+	{
+		Oid			ancestor = lfirst_oid(lc);
+		List	   *rel_publishers = GetRelationPublications(ancestor);
+
+		ancestor_pubids = list_concat_copy(ancestor_pubids, rel_publishers);
+	}
+
+	return ancestor_pubids;
+}
+
+/*
  * Gets list of relation oids for a publication.
  *
  * This should only be used for normal publications, the FOR ALL TABLES
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 42a147b67d..bb8c926659 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2835,7 +2835,7 @@ CopyFrom(CopyState cstate)
 	target_resultRelInfo = resultRelInfo;
 
 	/* Verify the named relation is a valid target for INSERT */
-	CheckValidResultRel(resultRelInfo, CMD_INSERT);
+	CheckValidResultRel(resultRelInfo, NULL, CMD_INSERT);
 
 	ExecOpenIndices(resultRelInfo, false);
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index fbf11c86aa..ee56acf3f3 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -498,7 +498,8 @@ RemovePublicationRelById(Oid proid)
 
 /*
  * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode.
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
  */
 static List *
 OpenTableList(List *tables)
@@ -539,8 +540,13 @@ OpenTableList(List *tables)
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 
-		/* Add children of this rel, if requested */
-		if (recurse)
+		/*
+		 * Add children of this rel, if requested, so that they too are added
+		 * to the publication.  A partitioned table can't have any inheritance
+		 * children other than its partitions, which need not be explicitly
+		 * added to the publication.
+		 */
+		if (recurse && rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
 		{
 			List	   *children;
 			ListCell   *child;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5408edcfc2..f65cad4ac0 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -44,7 +44,8 @@
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 
-static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
+static List *fetch_publication_tables(WalReceiverConn *wrconn, List *publications);
+static Oid	ValidateSubscriptionRel(RangeVar *rv);
 
 /*
  * Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -453,18 +454,13 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
 			 * Get the table list from publisher and build local table status
 			 * info.
 			 */
-			tables = fetch_table_list(wrconn, publications);
+			tables = fetch_publication_tables(wrconn, publications);
 			foreach(lc, tables)
 			{
 				RangeVar   *rv = (RangeVar *) lfirst(lc);
 				Oid			relid;
 
-				relid = RangeVarGetRelid(rv, AccessShareLock, false);
-
-				/* Check for supported relkind. */
-				CheckSubscriptionRelkind(get_rel_relkind(relid),
-										 rv->schemaname, rv->relname);
-
+				relid = ValidateSubscriptionRel(rv);
 				AddSubscriptionRelState(subid, relid, table_state,
 										InvalidXLogRecPtr);
 			}
@@ -530,7 +526,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 				(errmsg("could not connect to the publisher: %s", err)));
 
 	/* Get the table list from publisher. */
-	pubrel_names = fetch_table_list(wrconn, sub->publications);
+	pubrel_names = fetch_publication_tables(wrconn, sub->publications);
 
 	/* We are done with the remote side, close connection. */
 	walrcv_disconnect(wrconn);
@@ -568,11 +564,8 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 		RangeVar   *rv = (RangeVar *) lfirst(lc);
 		Oid			relid;
 
-		relid = RangeVarGetRelid(rv, AccessShareLock, false);
-
-		/* Check for supported relkind. */
-		CheckSubscriptionRelkind(get_rel_relkind(relid),
-								 rv->schemaname, rv->relname);
+		/* Check that there's an appropriate relation present locally. */
+		relid = ValidateSubscriptionRel(rv);
 
 		pubrel_local_oids[off++] = relid;
 
@@ -1121,10 +1114,12 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
 
 /*
  * Get the list of tables which belong to specified publications on the
- * publisher connection.
+ * publisher connection to create a subscription state (pg_subscription_rel
+ * entry) for each.  For partitioned tables, subscription state is maintained
+ * per partition, so partitions are fetched too.
  */
 static List *
-fetch_table_list(WalReceiverConn *wrconn, List *publications)
+fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 {
 	WalRcvExecResult *res;
 	StringInfoData cmd;
@@ -1137,9 +1132,19 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	Assert(list_length(publications) > 0);
 
 	initStringInfo(&cmd);
-	appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename\n"
+	appendStringInfoString(&cmd, "SELECT DISTINCT s.schemaname, s.tablename FROM (\n"
+						   "  SELECT DISTINCT t.pubname, t.schemaname, t.tablename \n"
 						   "  FROM pg_catalog.pg_publication_tables t\n"
-						   " WHERE t.pubname IN (");
+						   "  UNION\n"
+						   "  SELECT DISTINCT t.pubname, s.schemaname, s.tablename\n"
+						   "  FROM pg_catalog.pg_publication_tables t,\n"
+						   "  LATERAL (SELECT c.relnamespace::regnamespace::name, c.relname\n"
+						   "		   FROM pg_class c\n"
+						   "		   JOIN pg_partition_tree(t.schemaname || '.' || t.tablename) p\n"
+						   "		   ON p.relid = c.oid\n"
+						   "		   WHERE p.level > 0) AS s(schemaname, tablename)) s\n"
+						   " WHERE s.pubname IN (");
+
 	first = true;
 	foreach(lc, publications)
 	{
@@ -1187,3 +1192,25 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 
 	return tablelist;
 }
+
+/*
+ * Looks up a local relation matching the given publication table and
+ * checks that it's appropriate to use as a replication target, erroring
+ * out if not.
+ *
+ * Oid of the successfully validated local relation is returned.
+ */
+static Oid
+ValidateSubscriptionRel(RangeVar *rv)
+{
+	Oid			relid;
+
+	relid = RangeVarGetRelid(rv, AccessShareLock, false);
+	Assert(OidIsValid(relid));
+
+	/* Check for supported relkind. */
+	CheckSubscriptionRelkind(get_rel_relkind(relid),
+							 rv->schemaname, rv->relname);
+
+	return relid;
+}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index c46eb8d646..416970393c 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1073,7 +1073,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
  * CheckValidRowMarkRel.
  */
 void
-CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation)
+CheckValidResultRel(ResultRelInfo *resultRelInfo,
+					ResultRelInfo *rootResultRelInfo,
+					CmdType operation)
 {
 	Relation	resultRel = resultRelInfo->ri_RelationDesc;
 	TriggerDesc *trigDesc = resultRel->trigdesc;
@@ -1083,7 +1085,8 @@ CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation)
 	{
 		case RELKIND_RELATION:
 		case RELKIND_PARTITIONED_TABLE:
-			CheckCmdReplicaIdentity(resultRel, operation);
+			CheckCmdReplicaIdentity(resultRelInfo, rootResultRelInfo,
+									operation);
 			break;
 		case RELKIND_SEQUENCE:
 			ereport(ERROR,
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d23f292cb0..06f6923966 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -385,7 +385,8 @@ ExecFindPartition(ModifyTableState *mtstate,
 						rri = elem->rri;
 
 						/* Verify this ResultRelInfo allows INSERTs */
-						CheckValidResultRel(rri, CMD_INSERT);
+						CheckValidResultRel(rri, rootResultRelInfo,
+											CMD_INSERT);
 
 						/* Set up the PartitionRoutingInfo for it */
 						ExecInitRoutingInfo(mtstate, estate, proute, dispatch,
@@ -530,7 +531,7 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, EState *estate,
 	 * partition-key becomes a DELETE+INSERT operation, so this check is still
 	 * required when the operation is CMD_UPDATE.
 	 */
-	CheckValidResultRel(leaf_part_rri, CMD_INSERT);
+	CheckValidResultRel(leaf_part_rri, rootResultRelInfo, CMD_INSERT);
 
 	/*
 	 * Open partition indices.  The user may have asked to check for conflicts
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 95e027c970..22f613beed 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -396,10 +396,10 @@ ExecSimpleRelationInsert(EState *estate, TupleTableSlot *slot)
 	ResultRelInfo *resultRelInfo = estate->es_result_relation_info;
 	Relation	rel = resultRelInfo->ri_RelationDesc;
 
-	/* For now we support only tables. */
+	/* For now we support only regular tables. */
 	Assert(rel->rd_rel->relkind == RELKIND_RELATION);
 
-	CheckCmdReplicaIdentity(rel, CMD_INSERT);
+	CheckCmdReplicaIdentity(resultRelInfo, NULL, CMD_INSERT);
 
 	/* BEFORE ROW INSERT Triggers */
 	if (resultRelInfo->ri_TrigDesc &&
@@ -463,7 +463,7 @@ ExecSimpleRelationUpdate(EState *estate, EPQState *epqstate,
 	/* For now we support only tables. */
 	Assert(rel->rd_rel->relkind == RELKIND_RELATION);
 
-	CheckCmdReplicaIdentity(rel, CMD_UPDATE);
+	CheckCmdReplicaIdentity(resultRelInfo, NULL, CMD_UPDATE);
 
 	/* BEFORE ROW UPDATE Triggers */
 	if (resultRelInfo->ri_TrigDesc &&
@@ -521,7 +521,7 @@ ExecSimpleRelationDelete(EState *estate, EPQState *epqstate,
 	Relation	rel = resultRelInfo->ri_RelationDesc;
 	ItemPointer tid = &searchslot->tts_tid;
 
-	CheckCmdReplicaIdentity(rel, CMD_DELETE);
+	CheckCmdReplicaIdentity(resultRelInfo, NULL, CMD_DELETE);
 
 	/* BEFORE ROW DELETE Triggers */
 	if (resultRelInfo->ri_TrigDesc &&
@@ -544,12 +544,17 @@ ExecSimpleRelationDelete(EState *estate, EPQState *epqstate,
 }
 
 /*
- * Check if command can be executed with current replica identity.
+ * Check if command can be executed on 'target_rel' with its (or the
+ * ancestor's) current replica identity.
  */
 void
-CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
+CheckCmdReplicaIdentity(ResultRelInfo *target_rel,
+						ResultRelInfo *root_target_rel,
+						CmdType cmd)
 {
 	PublicationActions *pubactions;
+	Relation	rel = target_rel->ri_RelationDesc;
+	Relation	rootrel = root_target_rel ? root_target_rel->ri_RelationDesc : NULL;
 
 	/* We only need to do checks for UPDATE and DELETE. */
 	if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
@@ -563,9 +568,18 @@ CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
 	/*
 	 * This is either UPDATE OR DELETE and there is no replica identity.
 	 *
-	 * Check if the table publishes UPDATES or DELETES.
+	 * Check if the table or its root ancestor publishes UPDATES or DELETES.
 	 */
 	pubactions = GetRelationPublicationActions(rel);
+	if (rootrel)
+	{
+		PublicationActions *root_pubactions;
+
+		root_pubactions = GetRelationPublicationActions(rootrel);
+		pubactions->pubupdate |= root_pubactions->pubupdate;
+		pubactions->pubdelete |= root_pubactions->pubdelete;
+	}
+
 	if (cmd == CMD_UPDATE && pubactions->pubupdate)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -591,17 +605,10 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * We currently only support writing to regular and partitioned tables.
+	 * However, give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -609,7 +616,7 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index e3eb9d7b90..fb97d24f3a 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2268,6 +2268,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	int			nplans = list_length(node->plans);
 	ResultRelInfo *saved_resultRelInfo;
 	ResultRelInfo *resultRelInfo;
+	ResultRelInfo *rootResultRelInfo = NULL;
 	Plan	   *subplan;
 	ListCell   *l;
 	int			i;
@@ -2295,8 +2296,11 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		rootResultRelInfo = mtstate->rootResultRelInfo;
+	}
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
@@ -2330,7 +2334,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 		/*
 		 * Verify result relation is a valid target for the current operation
 		 */
-		CheckValidResultRel(resultRelInfo, operation);
+		CheckValidResultRel(resultRelInfo, rootResultRelInfo, operation);
 
 		/*
 		 * If there are indices on the result relation, open them and save
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e01d18c3a1..554bdb10d3 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -630,16 +630,17 @@ copy_read_data(void *outbuf, int minread, int maxread)
 
 /*
  * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication.  XXX - while we fetch relkind too
+ * here, the RELATION message doesn't provide it
  */
 static void
 fetch_remote_table_info(char *nspname, char *relname,
-						LogicalRepRelation *lrel)
+						LogicalRepRelation *lrel, char *relkind)
 {
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {OIDOID, CHAROID};
+	Oid			tableRow[3] = {OIDOID, CHAROID, CHAROID};
 	Oid			attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
 	bool		isnull;
 	int			natt;
@@ -649,16 +650,16 @@ fetch_remote_table_info(char *nspname, char *relname,
 
 	/* First fetch Oid and replica identity. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident"
+	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident, c.relkind"
 					 "  FROM pg_catalog.pg_class c"
 					 "  INNER JOIN pg_catalog.pg_namespace n"
 					 "        ON (c.relnamespace = n.oid)"
 					 " WHERE n.nspname = %s"
 					 "   AND c.relname = %s"
-					 "   AND c.relkind = 'r'",
+					 "   AND pg_relation_is_publishable(c.oid)",
 					 quote_literal_cstr(nspname),
 					 quote_literal_cstr(relname));
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 
 	if (res->status != WALRCV_OK_TUPLES)
 		ereport(ERROR,
@@ -675,6 +676,8 @@ fetch_remote_table_info(char *nspname, char *relname,
 	Assert(!isnull);
 	lrel->replident = DatumGetChar(slot_getattr(slot, 2, &isnull));
 	Assert(!isnull);
+	*relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+	Assert(!isnull);
 
 	ExecDropSingleTupleTableSlot(slot);
 	walrcv_clear_result(res);
@@ -750,10 +753,12 @@ copy_table(Relation rel)
 	CopyState	cstate;
 	List	   *attnamelist;
 	ParseState *pstate;
+	char		remote_relkind;
 
 	/* Get the publisher relation info. */
 	fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
-							RelationGetRelationName(rel), &lrel);
+							RelationGetRelationName(rel), &lrel,
+							&remote_relkind);
 
 	/* Put the relation into relmap. */
 	logicalrep_relmap_update(&lrel);
@@ -762,6 +767,17 @@ copy_table(Relation rel)
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
 
+	/*
+	 * If either table is partitioned, skip copying.  Individual partitions
+	 * will be copied instead.
+	 */
+	if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE ||
+		remote_relkind == RELKIND_PARTITIONED_TABLE)
+	{
+		logicalrep_rel_close(relmapentry, NoLock);
+		return;
+	}
+
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
 	appendStringInfo(&cmd, "COPY %s TO STDOUT",
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3483c1b877..8dc78f1779 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -50,7 +50,12 @@ static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
 
-/* Entry in the map used to remember which relation schemas we sent. */
+/*
+ * Entry in the map used to remember which relation schemas we sent.
+ *
+ * For partitions, 'pubactions' considers not only the table's own
+ * publications, but also those of all of its ancestors.
+ */
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
@@ -63,7 +68,7 @@ typedef struct RelationSyncEntry
 static HTAB *RelationSyncCache = NULL;
 
 static void init_rel_sync_cache(MemoryContext decoding_context);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
 static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
 static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
 										  uint32 hashvalue);
@@ -311,7 +316,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	if (!is_publishable_relation(relation))
 		return;
 
-	relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+	relentry = get_rel_sync_entry(data, relation);
 
 	/* First check the table filter */
 	switch (change->action)
@@ -401,7 +406,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		if (!is_publishable_relation(relation))
 			continue;
 
-		relentry = get_rel_sync_entry(data, relid);
+		relentry = get_rel_sync_entry(data, relation);
 
 		if (!relentry->pubactions.pubtruncate)
 			continue;
@@ -526,8 +531,9 @@ init_rel_sync_cache(MemoryContext cachectx)
  * Find or create entry in the relation schema cache.
  */
 static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
 {
+	Oid			relid = RelationGetRelid(rel);
 	RelationSyncEntry *entry;
 	bool		found;
 	MemoryContext oldctx;
@@ -546,7 +552,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	if (!found || !entry->replicate_valid)
 	{
 		List	   *pubids = GetRelationPublications(relid);
-		ListCell   *lc;
+		ListCell   *lc,
+				   *lc1;
+		List	   *ancestor_pubids = NIL;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -568,6 +576,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		entry->pubactions.pubinsert = entry->pubactions.pubupdate =
 			entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
 
+		/* For partitions, also consider publications of ancestors. */
+		if (rel->rd_rel->relispartition)
+			ancestor_pubids =
+				GetRelationAncestorPublications(RelationGetRelid(rel));
+
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
@@ -583,9 +596,25 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
 				entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
 				break;
+
+			foreach(lc1, ancestor_pubids)
+			{
+				if (lfirst_oid(lc1) == pub->oid)
+				{
+					entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
+					entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
+					entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
+					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				}
+			}
+
+			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
+				entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
+				break;
 		}
 
 		list_free(pubids);
+		list_free(ancestor_pubids);
 
 		entry->replicate_valid = true;
 	}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 08658c8e86..b5e91771e4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3969,8 +3969,9 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 	{
 		TableInfo  *tbinfo = &tblinfo[i];
 
-		/* Only plain tables can be aded to publications. */
-		if (tbinfo->relkind != RELKIND_RELATION)
+		/* Only plain and partitioned tables can be added to publications. */
+		if (tbinfo->relkind != RELKIND_RELATION &&
+			tbinfo->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
 
 		/*
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index ea22aa6563..5ee7091472 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -80,6 +80,7 @@ typedef struct Publication
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
 extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationAncestorPublications(Oid relid);
 extern List *GetPublicationRelations(Oid pubid);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 6298c7c8ca..698a57d0cd 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -179,7 +179,9 @@ extern void ExecutorEnd(QueryDesc *queryDesc);
 extern void standard_ExecutorEnd(QueryDesc *queryDesc);
 extern void ExecutorRewind(QueryDesc *queryDesc);
 extern bool ExecCheckRTPerms(List *rangeTable, bool ereport_on_violation);
-extern void CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation);
+extern void CheckValidResultRel(ResultRelInfo *resultRelInfo,
+								ResultRelInfo *rootResultRelInfo,
+								CmdType operation);
 extern void InitResultRelInfo(ResultRelInfo *resultRelInfo,
 							  Relation resultRelationDesc,
 							  Index resultRelationIndex,
@@ -592,7 +594,9 @@ extern void ExecSimpleRelationUpdate(EState *estate, EPQState *epqstate,
 									 TupleTableSlot *searchslot, TupleTableSlot *slot);
 extern void ExecSimpleRelationDelete(EState *estate, EPQState *epqstate,
 									 TupleTableSlot *searchslot);
-extern void CheckCmdReplicaIdentity(Relation rel, CmdType cmd);
+extern void CheckCmdReplicaIdentity(ResultRelInfo *target_rel,
+									ResultRelInfo *root_target_rel,
+									CmdType cmd);
 
 extern void CheckSubscriptionRelkind(char relkind, const char *nspname,
 									 const char *relname);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..e3fabe70f9 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -116,6 +116,22 @@ Tables:
 
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+-- should add only the parent to publication, not the partition
+CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+
+DROP PUBLICATION testpub_forparted;
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
@@ -142,11 +158,6 @@ Tables:
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
 DETAIL:  Only tables can be added to publications.
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
-ERROR:  "testpub_parted" is a partitioned table
-DETAIL:  Adding partitioned tables to publications is not supported.
-HINT:  You can add the table partitions individually.
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default ADD TABLE pub_test.testpub_nopk;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a755cf..b79a3f8f8f 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -69,6 +69,16 @@ RESET client_min_messages;
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
 
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+-- should add only the parent to publication, not the partition
+CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+DROP PUBLICATION testpub_forparted;
+
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 SET client_min_messages = 'ERROR';
@@ -83,8 +93,6 @@ CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 
 -- fail - view
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
 
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
new file mode 100644
index 0000000000..2b8a5025dc
--- /dev/null
+++ b/src/test/subscription/t/013_partition.pl
@@ -0,0 +1,159 @@
+# Tests for logical replication with partitioned tables
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 10;
+
+# setup
+
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+my $node_subscriber1 = get_new_node('subscriber1');
+$node_subscriber1->init(allows_streaming => 'logical');
+$node_subscriber1->start;
+
+my $node_subscriber2 = get_new_node('subscriber2');
+$node_subscriber2->init(allows_streaming => 'logical');
+$node_subscriber2->start;
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY) PARTITION BY LIST (a)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_1 PARTITION OF tab1 FOR VALUES IN (1, 2, 3)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (b text DEFAULT 'sub1_tab1', a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (b DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_2 (a int PRIMARY KEY, b text DEFAULT 'sub2_tab1_2')");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub1 FOR TABLE tab1, tab1_1");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub2 FOR TABLE tab1_2");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
+
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub2");
+
+# Wait for initial sync of all subscriptions
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber2->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# insert data (some into the root parent and some directly into partitions)
+
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_1 VALUES (3)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+my $result = $node_subscriber1->safe_psql('postgres',
+	"SELECT b, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT b, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
+is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
+
+# update a row (no partition change)
+
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 2 WHERE a = 1");
+
+$node_publisher->wait_for_catchup('sub1');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT b, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
+
+# update a row (partition changes)
+
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT b, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|3|6), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT b, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
+is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
+
+# delete rows (some from the root parent, some directly from the partition)
+
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1 WHERE a IN (3, 5)");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1_2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(0||), 'delete from tab1_2 replicated');
+
+# truncate (root parent and partition directly)
+
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1_2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(2|1|2), 'truncate of tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(0||), 'truncate of tab1_2 replicated');
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1");
+
+$node_publisher->wait_for_catchup('sub1');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1_1 replicated');
-- 
2.11.0
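A quick illustration of what the patch above enables, reusing the example from the opening message (a sketch; run against a server with the patch applied):

```sql
create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

-- With the patch, this no longer raises "Adding partitioned tables to
-- publications is not supported"; p's existing and future partitions
-- are published automatically.
create publication publish_p for table p;
```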

Attachment: v6-0004-Publish-partitioned-table-inserts-as-its-own.patch (application/octet-stream)
From d07cccd57cbc915f2b1b5a7e9329bf1a6147070e Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Wed, 13 Nov 2019 17:18:51 +0900
Subject: [PATCH v6 4/4] Publish partitioned table inserts as its own

---
 src/backend/catalog/pg_publication.c        |  11 +-
 src/backend/commands/subscriptioncmds.c     |  85 ++++++--
 src/backend/executor/nodeModifyTable.c      |   2 +
 src/backend/replication/logical/tablesync.c |  19 +-
 src/backend/replication/logical/worker.c    | 298 ++++++++++++++++++++++++++--
 src/backend/replication/pgoutput/pgoutput.c | 182 +++++++++++++----
 src/include/catalog/pg_publication.h        |   2 +-
 src/test/subscription/t/013_partition.pl    |  48 ++++-
 8 files changed, 560 insertions(+), 87 deletions(-)

diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 5ef77f1014..84fc302592 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -243,20 +243,29 @@ GetRelationPublications(Oid relid)
 /*
  * Finds all publications that publish changes to the input relation's
  * ancestors.
+ *
+ * *published_ancestors will contain the OID of the ancestor published
+ * through each publication returned.  Values in this list can be repeated,
+ * because a given ancestor may belong to multiple publications.
  */
 List *
-GetRelationAncestorPublications(Oid relid)
+GetRelationAncestorPublications(Oid relid, List **published_ancestors)
 {
 	List	   *ancestors = get_partition_ancestors(relid);
 	List	   *ancestor_pubids = NIL;
 	ListCell   *lc;
 
+	*published_ancestors = NIL;
 	foreach(lc, ancestors)
 	{
 		Oid			ancestor = lfirst_oid(lc);
 		List	   *rel_publishers = GetRelationPublications(ancestor);
+		int			n = list_length(rel_publishers),
+					i;
 
 		ancestor_pubids = list_concat_copy(ancestor_pubids, rel_publishers);
+		for (i = 0; i < n; i++)
+			*published_ancestors = lappend_oid(*published_ancestors, ancestor);
 	}
 
 	return ancestor_pubids;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f65cad4ac0..143d572702 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -44,6 +44,27 @@
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 
+/*
+ * Structure used by fetch_publication_tables to describe a published table.
+ * The information is used by the callers of fetch_publication_tables to
+ * generate a pg_subscription_rel catalog entry for the table.
+ */
+typedef struct PublishedTable
+{
+	RangeVar   *rv;
+
+	char		relkind;
+
+	/*
+	 * If the published table is partitioned, the following being true means
+	 * its changes are published using its own schema rather than the schema
+	 * of its individual partitions.  In the latter case, a separate
+	 * PublishedTable instance (and hence pg_subscription_rel entry) will be
+	 * needed for each partition.
+	 */
+	bool		published_using_root_schema;
+}			PublishedTable;
+
 static List *fetch_publication_tables(WalReceiverConn *wrconn, List *publications);
 static Oid	ValidateSubscriptionRel(RangeVar *rv);
 
@@ -457,10 +478,21 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
 			tables = fetch_publication_tables(wrconn, publications);
 			foreach(lc, tables)
 			{
-				RangeVar   *rv = (RangeVar *) lfirst(lc);
+				PublishedTable *pt = (PublishedTable *) lfirst(lc);
+				RangeVar   *rv = pt->rv;
 				Oid			relid;
 
 				relid = ValidateSubscriptionRel(rv);
+
+				/*
+				 * If a partitioned table is published using the schema of its
+				 * partitions, the initial sync will be performed by copying
+				 * from the partitions, so mark the partitioned table itself
+				 * as ready.
+				 */
+				if (pt->relkind == RELKIND_PARTITIONED_TABLE &&
+					!pt->published_using_root_schema)
+					table_state = SUBREL_STATE_READY;
 				AddSubscriptionRelState(subid, relid, table_state,
 										InvalidXLogRecPtr);
 			}
@@ -561,19 +593,31 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 
 	foreach(lc, pubrel_names)
 	{
-		RangeVar   *rv = (RangeVar *) lfirst(lc);
+		PublishedTable *pt = (PublishedTable *) lfirst(lc);
+		RangeVar   *rv = pt->rv;
 		Oid			relid;
+		char		table_state = copy_data ? SUBREL_STATE_INIT :
+		SUBREL_STATE_READY;
 
 		/* Check that there's an appropriate relation present locally. */
 		relid = ValidateSubscriptionRel(rv);
 
 		pubrel_local_oids[off++] = relid;
 
+		/*
+		 * If a partitioned table is published using the schema of its
+		 * partitions, the initial sync will be performed by copying from the
+		 * partitions, so mark the partitioned table itself as ready.
+		 */
+		if (pt->relkind == RELKIND_PARTITIONED_TABLE &&
+			!pt->published_using_root_schema)
+			table_state = SUBREL_STATE_READY;
+
 		if (!bsearch(&relid, subrel_local_oids,
 					 list_length(subrel_states), sizeof(Oid), oid_cmp))
 		{
 			AddSubscriptionRelState(sub->oid, relid,
-									copy_data ? SUBREL_STATE_INIT : SUBREL_STATE_READY,
+									table_state,
 									InvalidXLogRecPtr);
 			ereport(DEBUG1,
 					(errmsg("table \"%s.%s\" added to subscription \"%s\"",
@@ -1124,7 +1168,7 @@ fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {TEXTOID, TEXTOID};
+	Oid			tableRow[4] = {TEXTOID, TEXTOID, CHAROID, BOOLOID};
 	ListCell   *lc;
 	bool		first;
 	List	   *tablelist = NIL;
@@ -1132,17 +1176,23 @@ fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 	Assert(list_length(publications) > 0);
 
 	initStringInfo(&cmd);
-	appendStringInfoString(&cmd, "SELECT DISTINCT s.schemaname, s.tablename FROM (\n"
-						   "  SELECT DISTINCT t.pubname, t.schemaname, t.tablename \n"
+	appendStringInfoString(&cmd, "SELECT DISTINCT s.schemaname, s.tablename, s.relkind, s.pubasroot FROM (\n"
+						   "  SELECT DISTINCT t.pubname, t.schemaname, t.tablename, c.relkind, p.pubasroot \n"
 						   "  FROM pg_catalog.pg_publication_tables t\n"
+						   "  JOIN pg_catalog.pg_publication p ON t.pubname = p.pubname\n"
+						   "  JOIN pg_catalog.pg_class c ON t.schemaname = c.relnamespace::pg_catalog.regnamespace::pg_catalog.name\n"
+						   "  AND t.tablename = c.relname\n"
 						   "  UNION\n"
-						   "  SELECT DISTINCT t.pubname, s.schemaname, s.tablename\n"
-						   "  FROM pg_catalog.pg_publication_tables t,\n"
-						   "  LATERAL (SELECT c.relnamespace::regnamespace::name, c.relname\n"
-						   "		   FROM pg_class c\n"
-						   "		   JOIN pg_partition_tree(t.schemaname || '.' || t.tablename) p\n"
+						   "  SELECT DISTINCT t.pubname, s.schemaname, s.tablename, c.relkind, p.pubasroot\n"
+						   "  FROM pg_catalog.pg_publication_tables t\n"
+						   "  JOIN pg_catalog.pg_publication p ON t.pubname = p.pubname AND NOT p.pubasroot,\n"
+						   "  LATERAL (SELECT c.relnamespace::pg_catalog.regnamespace::pg_catalog.name, c.relname\n"
+						   "		   FROM pg_catalog.pg_class c\n"
+						   "		   JOIN pg_catalog.pg_partition_tree(t.schemaname || '.' || t.tablename) p\n"
 						   "		   ON p.relid = c.oid\n"
-						   "		   WHERE p.level > 0) AS s(schemaname, tablename)) s\n"
+						   "		   WHERE p.level > 0) AS s(schemaname, tablename)\n"
+						   "  JOIN pg_catalog.pg_class c ON s.schemaname = c.relnamespace::pg_catalog.regnamespace::pg_catalog.name\n"
+						   "  AND s.tablename = c.relname) s\n"
 						   " WHERE s.pubname IN (");
 
 	first = true;
@@ -1159,7 +1209,7 @@ fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 	}
 	appendStringInfoChar(&cmd, ')');
 
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 4, tableRow);
 	pfree(cmd.data);
 
 	if (res->status != WALRCV_OK_TUPLES)
@@ -1174,15 +1224,18 @@ fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 		char	   *nspname;
 		char	   *relname;
 		bool		isnull;
-		RangeVar   *rv;
+		PublishedTable *pt = palloc(sizeof(PublishedTable));
 
 		nspname = TextDatumGetCString(slot_getattr(slot, 1, &isnull));
 		Assert(!isnull);
 		relname = TextDatumGetCString(slot_getattr(slot, 2, &isnull));
 		Assert(!isnull);
+		pt->rv = makeRangeVar(pstrdup(nspname), pstrdup(relname), -1);
+		pt->relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+		pt->published_using_root_schema = DatumGetBool(slot_getattr(slot, 4, &isnull));
+		Assert(!isnull);
 
-		rv = makeRangeVar(pstrdup(nspname), pstrdup(relname), -1);
-		tablelist = lappend(tablelist, rv);
+		tablelist = lappend(tablelist, pt);
 
 		ExecClearTuple(slot);
 	}
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index fb97d24f3a..4e22b7b382 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2299,6 +2299,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		CheckValidResultRel(mtstate->rootResultRelInfo,
+							mtstate->rootResultRelInfo, operation);
 		rootResultRelInfo = mtstate->rootResultRelInfo;
 	}
 
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 554bdb10d3..56c1e28e1b 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -767,21 +767,14 @@ copy_table(Relation rel)
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
 
-	/*
-	 * If either table is partitioned, skip copying.  Individual partitions
-	 * will be copied instead.
-	 */
-	if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE ||
-		remote_relkind == RELKIND_PARTITIONED_TABLE)
-	{
-		logicalrep_rel_close(relmapentry, NoLock);
-		return;
-	}
-
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "COPY %s TO STDOUT",
-					 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	if (remote_relkind == RELKIND_PARTITIONED_TABLE)
+		appendStringInfo(&cmd, "COPY (SELECT * FROM %s) TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	else
+		appendStringInfo(&cmd, "COPY %s TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
 	res = walrcv_exec(wrconn, cmd.data, 0, NULL);
 	pfree(cmd.data);
 	if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 2686fccdc2..3d6bb37f89 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -29,11 +29,14 @@
 #include "access/xlog_internal.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_subscription.h"
 #include "catalog/pg_subscription_rel.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "executor/nodeModifyTable.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
@@ -140,6 +143,22 @@ should_apply_changes_for_rel(LogicalRepRelMapEntry *rel)
 }
 
 /*
+ * Variant of should_apply_changes_for_rel() to use when no
+ * LogicalRepRelMapEntry is available for the given local target relation.
+ */
+static bool
+should_apply_changes_for_relid(Oid localreloid, char state,
+							   XLogRecPtr statelsn)
+{
+	if (am_tablesync_worker())
+		return MyLogicalRepWorker->relid == localreloid;
+	else
+		return (state == SUBREL_STATE_READY ||
+				(state == SUBREL_STATE_SYNCDONE &&
+				 statelsn <= remote_final_lsn));
+}
+
+/*
  * Make sure that we started local transaction.
  *
  * Also switches to ApplyMessageContext as necessary.
@@ -722,6 +741,168 @@ apply_handle_do_delete(ResultRelInfo *relinfo, EState *estate,
 }
 
 /*
+ * This handles insert, update, delete on a partitioned table.
+ */
+static void
+apply_handle_tuple_routing(ResultRelInfo *relinfo,
+						   LogicalRepRelMapEntry *relmapentry,
+						   EState *estate, CmdType operation,
+						   TupleTableSlot *remoteslot,
+						   LogicalRepTupleData *newtup)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	ModifyTableState *mtstate = NULL;
+	PartitionTupleRouting *proute = NULL;
+	ResultRelInfo *partrelinfo,
+			   *partrelinfo1;
+	TupleTableSlot *localslot;
+	PartitionRoutingInfo *partinfo;
+	TupleConversionMap *map;
+	MemoryContext oldctx;
+
+	/* ModifyTableState is needed for ExecFindPartition(). */
+	mtstate = makeNode(ModifyTableState);
+	mtstate->ps.plan = NULL;
+	mtstate->ps.state = estate;
+	mtstate->operation = operation;
+	mtstate->resultRelInfo = relinfo;
+	proute = ExecSetupPartitionTupleRouting(estate, mtstate, rel);
+
+	/*
+	 * Find a partition for the tuple contained in remoteslot.
+	 *
+	 * For insert, remoteslot is the tuple to insert.  For update and delete,
+	 * it is the tuple to be replaced or deleted, respectively.
+	 */
+	Assert(remoteslot != NULL);
+	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+	/* The following throws error if a suitable partition is not found. */
+	partrelinfo = ExecFindPartition(mtstate, relinfo, proute,
+									remoteslot, estate);
+	Assert(partrelinfo != NULL);
+	/* Convert the tuple to match the partition's rowtype. */
+	partinfo = partrelinfo->ri_PartitionInfo;
+	map = partinfo->pi_RootToPartitionMap;
+	if (map != NULL)
+	{
+		TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+		remoteslot = execute_attr_map_slot(map->attrMap, remoteslot,
+										   part_slot);
+	}
+	MemoryContextSwitchTo(oldctx);
+
+	switch (operation)
+	{
+		case CMD_INSERT:
+			/* Just insert into the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_do_insert(partrelinfo, estate, remoteslot);
+			break;
+
+		case CMD_DELETE:
+			/* Just delete from the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_do_delete(partrelinfo, estate, remoteslot,
+								   &relmapentry->remoterel);
+			break;
+
+		case CMD_UPDATE:
+
+			/*
+			 * partrelinfo computed above is the partition which might contain
+			 * the search tuple.  Now find the partition for the replacement
+			 * tuple, which might not be the same as partrelinfo.
+			 */
+			localslot = table_slot_create(rel, &estate->es_tupleTable);
+			oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+			slot_modify_cstrings(localslot, remoteslot,
+								 newtup->values, newtup->changed,
+								 relmapentry->attrmap, &relmapentry->remoterel,
+								 RelationGetRelid(rel));
+			partrelinfo1 = ExecFindPartition(mtstate, relinfo, proute,
+											 localslot, estate);
+			MemoryContextSwitchTo(oldctx);
+
+			/*
+		 * If the search and replacement tuples fall in the same
+		 * partition, we can apply this as an UPDATE on the partition.
+			 */
+			if (partrelinfo == partrelinfo1)
+			{
+				AttrNumber *attrmap = relmapentry->attrmap;
+
+				/*
+				 * If the partition's attributes don't match the root
+				 * relation's, we'll need to make a new attrmap mapping
+				 * partition attribute numbers to remoterel's.
+				 */
+				if (map)
+				{
+					TupleDesc	partdesc = RelationGetDescr(partrelinfo1->ri_RelationDesc);
+					TupleDesc	rootdesc = RelationGetDescr(rel);
+					AttrNumber *partToRootMap,
+								attno;
+
+					/* Need the reverse map here */
+					partToRootMap = convert_tuples_by_name_map(partdesc, rootdesc);
+					attrmap = palloc(partdesc->natts * sizeof(AttrNumber));
+					memset(attrmap, -1, partdesc->natts * sizeof(AttrNumber));
+					for (attno = 0; attno < partdesc->natts; attno++)
+					{
+						AttrNumber	root_attno = partToRootMap[attno];
+
+						attrmap[attno] = relmapentry->attrmap[root_attno - 1];
+					}
+				}
+
+				/* UPDATE partition. */
+				estate->es_result_relation_info = partrelinfo;
+				apply_handle_do_update(partrelinfo, estate, remoteslot,
+									   newtup, attrmap,
+									   &relmapentry->remoterel);
+				if (attrmap != relmapentry->attrmap)
+					pfree(attrmap);
+			}
+			else
+			{
+				/* Different, so handle this as DELETE followed by INSERT. */
+
+				/* DELETE from partition partrelinfo. */
+				estate->es_result_relation_info = partrelinfo;
+				apply_handle_do_delete(partrelinfo, estate, remoteslot,
+									   &relmapentry->remoterel);
+
+				/*
+				 * Convert the replacement tuple to match the destination
+				 * partition rowtype.
+				 */
+				oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+				partinfo = partrelinfo1->ri_PartitionInfo;
+				map = partinfo->pi_RootToPartitionMap;
+				if (map != NULL)
+				{
+					TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+					localslot = execute_attr_map_slot(map->attrMap, localslot,
+													  part_slot);
+				}
+				MemoryContextSwitchTo(oldctx);
+				/* INSERT into partition partrelinfo1. */
+				estate->es_result_relation_info = partrelinfo1;
+				apply_handle_do_insert(partrelinfo1, estate, localslot);
+			}
+			break;
+
+		default:
+			elog(ERROR, "unrecognized CmdType: %d", (int) operation);
+			break;
+	}
+
+	ExecCleanupTupleRouting(mtstate, proute);
+}
+
+/*
  * Handle INSERT message.
  */
 static void
@@ -763,9 +944,13 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_insert(estate->es_result_relation_info, estate,
-						   remoteslot);
+	/* For a partitioned table, insert the tuple into a partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, rel,
+								   estate, CMD_INSERT, remoteslot, NULL);
+	else
+		apply_handle_do_insert(estate->es_result_relation_info, estate,
+							   remoteslot);
 
 	PopActiveSnapshot();
 
@@ -863,10 +1048,14 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_update(estate->es_result_relation_info, estate,
-						   remoteslot, &newtup, rel->attrmap,
-						   &rel->remoterel);
+	/* For a partitioned table, apply update to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, rel,
+								   estate, CMD_UPDATE, remoteslot, &newtup);
+	else
+		apply_handle_do_update(estate->es_result_relation_info, estate,
+							   remoteslot, &newtup, rel->attrmap,
+							   &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -928,9 +1117,13 @@ apply_handle_delete(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_delete(estate->es_result_relation_info, estate,
-						   remoteslot, &rel->remoterel);
+	/* For a partitioned table, apply delete to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, rel,
+								   estate, CMD_DELETE, remoteslot, NULL);
+	else
+		apply_handle_do_delete(estate->es_result_relation_info, estate,
+							   remoteslot, &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -972,14 +1165,48 @@ apply_handle_truncate(StringInfo s)
 		LogicalRepRelMapEntry *rel;
 
 		rel = logicalrep_rel_open(relid, RowExclusiveLock);
+
 		if (!should_apply_changes_for_rel(rel))
 		{
+			bool		really_skip = true;
+
+			/*
+			 * If we appear to have been sent a leaf partition because an
+			 * ancestor was truncated, confirm that some ancestor indeed
+			 * has a valid subscription state before proceeding to truncate
+			 * the partition.
+			 */
+			if (rel->state == SUBREL_STATE_UNKNOWN &&
+				rel->localrel->rd_rel->relispartition)
+			{
+				List	   *ancestors = get_partition_ancestors(rel->localreloid);
+				ListCell   *lc1;
+
+				foreach(lc1, ancestors)
+				{
+					Oid			ancestor = lfirst_oid(lc1);
+					XLogRecPtr	statelsn;
+					char		state;
+
+					/* Check using the ancestor's subscription state. */
+					state = GetSubscriptionRelState(MySubscription->oid,
+													ancestor, &statelsn,
+													false);
+					really_skip &= !should_apply_changes_for_relid(ancestor,
+																   state,
+																   statelsn);
+				}
+			}
+
 			/*
 			 * The relation can't become interesting in the middle of the
 			 * transaction so it's safe to unlock it.
 			 */
-			logicalrep_rel_close(rel, RowExclusiveLock);
-			continue;
+			if (really_skip)
+			{
+				logicalrep_rel_close(rel, RowExclusiveLock);
+				continue;
+			}
 		}
 
 		remote_rels = lappend(remote_rels, rel);
@@ -987,6 +1214,47 @@ apply_handle_truncate(StringInfo s)
 		relids = lappend_oid(relids, rel->localreloid);
 		if (RelationIsLogicallyLogged(rel->localrel))
 			relids_logged = lappend_oid(relids_logged, rel->localreloid);
+
+		if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		{
+			ListCell   *child;
+			List	   *children = find_all_inheritors(rel->localreloid,
+													   RowExclusiveLock,
+													   NULL);
+
+			foreach(child, children)
+			{
+				Oid			childrelid = lfirst_oid(child);
+				Relation	childrel;
+
+				if (list_member_oid(relids, childrelid))
+					continue;
+
+				/* find_all_inheritors already got lock */
+				childrel = table_open(childrelid, NoLock);
+
+				/*
+				 * It is possible that the parent table has children that are
+				 * temp tables of other backends.  We cannot safely access
+				 * such tables (because of buffering issues), and the best
+				 * thing to do is to silently ignore them.  Note that this
+				 * check is the same as one of the checks done in
+				 * truncate_check_activity() called below, still it is kept
+				 * here for simplicity.
+				 */
+				if (RELATION_IS_OTHER_TEMP(childrel))
+				{
+					table_close(childrel, RowExclusiveLock);
+					continue;
+				}
+
+				rels = lappend(rels, childrel);
+				relids = lappend_oid(relids, childrelid);
+				/* Log this relation only if needed for logical decoding */
+				if (RelationIsLogicallyLogged(childrel))
+					relids_logged = lappend_oid(relids_logged, childrelid);
+			}
+		}
 	}
 
 	/*
@@ -996,11 +1264,11 @@ apply_handle_truncate(StringInfo s)
 	 */
 	ExecuteTruncateGuts(rels, relids, relids_logged, DROP_RESTRICT, restart_seqs);
 
-	foreach(lc, remote_rels)
+	foreach(lc, rels)
 	{
-		LogicalRepRelMapEntry *rel = lfirst(lc);
+		Relation	rel = lfirst(lc);
 
-		logicalrep_rel_close(rel, NoLock);
+		table_close(rel, NoLock);
 	}
 
 	CommandCounterIncrement();
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 8dc78f1779..4784a3c587 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -49,6 +50,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +61,22 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If ancestor relid is set, its schema must also
+	 * have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * Set if the relation's changes are published as changes to some
+	 * ancestor, i.e., if the relation is a partition.  The map, if any,
+	 * converts tuples from the partition's rowtype to the ancestor's.
+	 */
+	Oid			replicate_as_relid;
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +274,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
 
-		desc = RelationGetDescr(relation);
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+/*
+ * Send the schema of the given relation, preceded by any needed type info.
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->attisdropped || att->attgenerated)
+			continue;
+
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +386,56 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					if (relentry->map)
+					{
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -411,6 +479,28 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		if (!relentry->pubactions.pubtruncate)
 			continue;
 
+		/*
+		 * If this partition was not *directly* truncated, don't bother
+		 * sending it to the subscriber.
+		 */
+		if (OidIsValid(relentry->replicate_as_relid))
+		{
+			int			j;
+			bool		can_skip_part_trunc = false;
+
+			for (j = 0; j < nrelids; j++)
+			{
+				if (relentry->replicate_as_relid == relids[j])
+				{
+					can_skip_part_trunc = true;
+					break;
+				}
+			}
+
+			if (can_skip_part_trunc)
+				continue;
+		}
+
 		relids[nrelids++] = relid;
 		maybe_send_schema(ctx, relation, relentry);
 	}
@@ -529,6 +619,11 @@ init_rel_sync_cache(MemoryContext cachectx)
 
 /*
  * Find or create entry in the relation schema cache.
+ *
+ * For a partition, the schema of the top-most published ancestor may be
+ * used instead of the partition's own, so information about the
+ * ancestors' publications is also looked up here and saved in the schema
+ * cache entry.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Relation rel)
@@ -553,8 +648,11 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 	{
 		List	   *pubids = GetRelationPublications(relid);
 		ListCell   *lc,
-				   *lc1;
+				   *lc1,
+				   *lc2;
 		List	   *ancestor_pubids = NIL;
+		List	   *published_ancestors = NIL;
+		Oid			topmost_published_ancestor = InvalidOid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -579,7 +677,9 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 		/* For partitions, also consider publications of ancestors. */
 		if (rel->rd_rel->relispartition)
 			ancestor_pubids =
-				GetRelationAncestorPublications(RelationGetRelid(rel));
+				GetRelationAncestorPublications(RelationGetRelid(rel),
+												&published_ancestors);
+		Assert(list_length(ancestor_pubids) == list_length(published_ancestors));
 
 		foreach(lc, data->publications)
 		{
@@ -597,7 +697,7 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 				entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
 				break;
 
-			foreach(lc1, ancestor_pubids)
+			forboth(lc1, ancestor_pubids, lc2, published_ancestors)
 			{
 				if (lfirst_oid(lc1) == pub->oid)
 				{
@@ -605,6 +705,8 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 					entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
 					entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
 					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+					if (pub->publish_using_root_schema)
+						topmost_published_ancestor = lfirst_oid(lc2);
 				}
 			}
 
@@ -615,7 +717,9 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 
 		list_free(pubids);
 		list_free(ancestor_pubids);
+		list_free(published_ancestors);
 
+		entry->replicate_as_relid = topmost_published_ancestor;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 61d338b110..15bf4a7d4c 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,7 +83,7 @@ typedef struct Publication
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
 extern List *GetRelationPublications(Oid relid);
-extern List *GetRelationAncestorPublications(Oid relid);
+extern List *GetRelationAncestorPublications(Oid relid, List **published_ancestors);
 extern List *GetPublicationRelations(Oid pubid);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 2b8a5025dc..2e3c7991f8 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 10;
+use Test::More tests => 16;
 
 # setup
 
@@ -39,21 +39,38 @@ $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
 
 $node_subscriber1->safe_psql('postgres',
-	"CREATE TABLE tab1_2 PARTITION OF tab1 (b DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (b DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6) PARTITION BY LIST (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_1 PARTITION OF tab1_2 FOR VALUES IN (5)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
 
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_2 (a int PRIMARY KEY, b text DEFAULT 'sub2_tab1_2')");
 
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, b text DEFAULT 'sub2_tab1') PARTITION BY HASH (a)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
+
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub1 FOR TABLE tab1, tab1_1");
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub2 FOR TABLE tab1_2");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub3 FOR TABLE tab1 WITH (publish_using_root_schema = true)");
 
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
 
 $node_subscriber2->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub2");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub3");
 
 # Wait for initial sync of all subscriptions
 my $synced_query =
@@ -83,17 +100,26 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT b, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT b, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|1|5), 'inserts into tab1 replicated');
+
 # update a row (no partition change)
 
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 2 WHERE a = 1");
 
 $node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub3');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT b, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT b, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|2|5), 'update of tab1_1 replicated');
+
 # update a row (partition changes)
 
 $node_publisher->safe_psql('postgres',
@@ -110,6 +136,10 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT b, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT b, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|3|6), 'update of tab1 replicated');
+
 # delete rows (some from the root parent, some directly from the partition)
 
 $node_publisher->safe_psql('postgres',
@@ -128,12 +158,18 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'delete from tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
+
 # truncate (root parent and partition directly)
 
 $node_subscriber1->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1), (2), (5)");
 $node_subscriber2->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (5)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
 
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1_2");
@@ -149,6 +185,10 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'truncate of tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(3|1|5), 'no change, because only truncate of tab1 will be replicated');
+
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1");
 
@@ -157,3 +197,7 @@ $node_publisher->wait_for_catchup('sub1');
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'truncate of tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1 replicated');
-- 
2.11.0

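For anyone skimming the apply-side hunks above: when the replacement tuple of an UPDATE no longer routes to the partition holding the old tuple, apply_handle_tuple_routing() applies the change as a DELETE from the old partition followed by an INSERT into the new one. A toy Python model of just that routing decision (the hash scheme, table layout, and names here are illustrative, not PostgreSQL code):

```python
# Toy model of subscriber-side tuple routing for UPDATE, as in
# apply_handle_tuple_routing(): find the partitions for the old and
# new tuples, then either UPDATE in place or DELETE + INSERT across
# partitions.  The partitioning scheme is a made-up hash on "a".

NUM_PARTS = 3

def find_partition(row):
    """Route by hash of the partition key, like PARTITION BY HASH (a)."""
    return row["a"] % NUM_PARTS

def apply_update(partitions, old_row, new_row):
    src = find_partition(old_row)
    dst = find_partition(new_row)
    if src == dst:
        # Same partition: apply as a plain UPDATE.
        part = partitions[src]
        part[part.index(old_row)] = new_row
    else:
        # Partition changes: DELETE from the source partition,
        # then INSERT into the destination partition.
        partitions[src].remove(old_row)
        partitions[dst].append(new_row)

partitions = {p: [] for p in range(NUM_PARTS)}
row = {"a": 1, "b": "x"}
partitions[find_partition(row)].append(row)

apply_update(partitions, row, {"a": 4, "b": "x"})                 # same partition
apply_update(partitions, {"a": 4, "b": "x"}, {"a": 5, "b": "x"})  # moves partition
```

The real code additionally converts tuples between the root's and each partition's rowtype via attribute maps (execute_attr_map_slot), which this sketch omits.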
v6-0003-Some-refactoring-of-logical-worker.c.patch (application/octet-stream)
From dd9dc5c8877535d35502222a5b9031fa393a674c Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 5 Dec 2019 09:17:06 +0900
Subject: [PATCH v6 3/4] Some refactoring of logical/worker.c

---
 src/backend/replication/logical/worker.c | 290 ++++++++++++++++++-------------
 1 file changed, 170 insertions(+), 120 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ced0d191c2..2686fccdc2 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -89,7 +89,8 @@ static dlist_head lsn_mapping = DLIST_STATIC_INIT(lsn_mapping);
 
 typedef struct SlotErrCallbackArg
 {
-	LogicalRepRelMapEntry *rel;
+	LogicalRepRelation *remoterel;
+	Oid			local_reloid;
 	int			local_attnum;
 	int			remote_attnum;
 } SlotErrCallbackArg;
@@ -269,7 +270,6 @@ static void
 slot_store_error_callback(void *arg)
 {
 	SlotErrCallbackArg *errarg = (SlotErrCallbackArg *) arg;
-	LogicalRepRelMapEntry *rel;
 	char	   *remotetypname;
 	Oid			remotetypoid,
 				localtypoid;
@@ -278,19 +278,18 @@ slot_store_error_callback(void *arg)
 	if (errarg->remote_attnum < 0)
 		return;
 
-	rel = errarg->rel;
-	remotetypoid = rel->remoterel.atttyps[errarg->remote_attnum];
+	remotetypoid = errarg->remoterel->atttyps[errarg->remote_attnum];
 
 	/* Fetch remote type name from the LogicalRepTypMap cache */
 	remotetypname = logicalrep_typmap_gettypname(remotetypoid);
 
 	/* Fetch local type OID from the local sys cache */
-	localtypoid = get_atttype(rel->localreloid, errarg->local_attnum + 1);
+	localtypoid = get_atttype(errarg->local_reloid, errarg->local_attnum + 1);
 
 	errcontext("processing remote data for replication target relation \"%s.%s\" column \"%s\", "
 			   "remote type %s, local type %s",
-			   rel->remoterel.nspname, rel->remoterel.relname,
-			   rel->remoterel.attnames[errarg->remote_attnum],
+			   errarg->remoterel->nspname, errarg->remoterel->relname,
+			   errarg->remoterel->attnames[errarg->remote_attnum],
 			   remotetypname,
 			   format_type_be(localtypoid));
 }
@@ -312,7 +311,8 @@ slot_store_cstrings(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
 	ExecClearTuple(slot);
 
 	/* Push callback + info on the error context stack */
-	errarg.rel = rel;
+	errarg.remoterel = &rel->remoterel;
+	errarg.local_reloid = rel->localreloid;
 	errarg.local_attnum = -1;
 	errarg.remote_attnum = -1;
 	errcallback.callback = slot_store_error_callback;
@@ -375,8 +375,9 @@ slot_store_cstrings(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
  */
 static void
 slot_modify_cstrings(TupleTableSlot *slot, TupleTableSlot *srcslot,
-					 LogicalRepRelMapEntry *rel,
-					 char **values, bool *replaces)
+					 char **values, bool *replaces,
+					 AttrNumber *attrmap, LogicalRepRelation *remoterel,
+					 Oid local_reloid)
 {
 	int			natts = slot->tts_tupleDescriptor->natts;
 	int			i;
@@ -396,7 +397,8 @@ slot_modify_cstrings(TupleTableSlot *slot, TupleTableSlot *srcslot,
 	memcpy(slot->tts_isnull, srcslot->tts_isnull, natts * sizeof(bool));
 
 	/* For error reporting, push callback + info on the error context stack */
-	errarg.rel = rel;
+	errarg.remoterel = remoterel;
+	errarg.local_reloid = local_reloid;
 	errarg.local_attnum = -1;
 	errarg.remote_attnum = -1;
 	errcallback.callback = slot_store_error_callback;
@@ -408,7 +410,7 @@ slot_modify_cstrings(TupleTableSlot *slot, TupleTableSlot *srcslot,
 	for (i = 0; i < natts; i++)
 	{
 		Form_pg_attribute att = TupleDescAttr(slot->tts_tupleDescriptor, i);
-		int			remoteattnum = rel->attrmap[i];
+		int			remoteattnum = attrmap[i];
 
 		if (remoteattnum < 0)
 			continue;
@@ -577,6 +579,148 @@ GetRelationIdentityOrPK(Relation rel)
 	return idxoid;
 }
 
+/* Workhorse for apply_handle_insert() */
+static void
+apply_handle_do_insert(ResultRelInfo *relinfo,
+					   EState *estate, TupleTableSlot *localslot)
+{
+	ExecOpenIndices(relinfo, false);
+
+	/* Do the insert. */
+	ExecSimpleRelationInsert(estate, localslot);
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+}
+
+/* Workhorse for apply_handle_update() */
+static void
+apply_handle_do_update(ResultRelInfo *relinfo,
+					   EState *estate, TupleTableSlot *remoteslot,
+					   LogicalRepTupleData *newtup,
+					   AttrNumber *attrmap, LogicalRepRelation *remoterel)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+	MemoryContext oldctx;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	ExecOpenIndices(relinfo, false);
+
+	/*
+	 * Try to find tuple using either replica identity index, primary key or
+	 * if needed, sequential scan.
+	 */
+	idxoid = GetRelationIdentityOrPK(rel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(rel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, localslot);
+	else
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
+										 remoteslot, localslot);
+
+	ExecClearTuple(remoteslot);
+
+	/*
+	 * Tuple found.
+	 *
+	 * Note this will fail if there are other conflicting unique indexes.
+	 */
+	if (found)
+	{
+		/* Process and store remote tuple in the slot */
+		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+		slot_modify_cstrings(remoteslot, localslot,
+							 newtup->values, newtup->changed,
+							 attrmap, remoterel, RelationGetRelid(rel));
+		MemoryContextSwitchTo(oldctx);
+
+		EvalPlanQualSetSlot(&epqstate, remoteslot);
+
+		/* Do the actual update. */
+		ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
+	}
+	else
+	{
+		/*
+		 * The tuple to be updated could not be found.
+		 *
+		 * TODO what to do here, change the log level to LOG perhaps?
+		 */
+		elog(DEBUG1,
+			 "logical replication did not find row for update "
+			 "in replication target relation \"%s\"",
+			 RelationGetRelationName(rel));
+	}
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+	EvalPlanQualEnd(&epqstate);
+}
+
+/* Workhorse for apply_handle_delete() */
+static void
+apply_handle_do_delete(ResultRelInfo *relinfo, EState *estate,
+					   TupleTableSlot *remoteslot,
+					   LogicalRepRelation *remoterel)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	/*
+	 * Try to find tuple using either replica identity index, primary key or
+	 * if needed, sequential scan.
+	 */
+	idxoid = GetRelationIdentityOrPK(rel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(rel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, localslot);
+	else
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
+										 remoteslot, localslot);
+	ExecOpenIndices(relinfo, false);
+
+	/* If found delete it. */
+	if (found)
+	{
+		EvalPlanQualSetSlot(&epqstate, localslot);
+
+		/* Do the actual delete. */
+		ExecSimpleRelationDelete(estate, &epqstate, localslot);
+	}
+	else
+	{
+		/* The tuple to be deleted could not be found. */
+		elog(DEBUG1,
+			 "logical replication could not find row for delete "
+			 "in replication target relation \"%s\"",
+			 RelationGetRelationName(rel));
+	}
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+	EvalPlanQualEnd(&epqstate);
+}
+
 /*
  * Handle INSERT message.
  */
@@ -619,13 +763,10 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	ExecOpenIndices(estate->es_result_relation_info, false);
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_insert(estate->es_result_relation_info, estate,
+						   remoteslot);
 
-	/* Do the insert. */
-	ExecSimpleRelationInsert(estate, remoteslot);
-
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
@@ -682,15 +823,11 @@ apply_handle_update(StringInfo s)
 {
 	LogicalRepRelMapEntry *rel;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	LogicalRepTupleData oldtup;
 	LogicalRepTupleData newtup;
 	bool		has_oldtup;
-	TupleTableSlot *localslot;
 	TupleTableSlot *remoteslot;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -716,12 +853,9 @@ apply_handle_update(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
+	/* Input functions may need an active snapshot, so get one */
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
 	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
@@ -729,63 +863,16 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL && has_oldtup));
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_update(estate->es_result_relation_info, estate,
+						   remoteslot, &newtup, rel->attrmap,
+						   &rel->remoterel);
 
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-
-	ExecClearTuple(remoteslot);
-
-	/*
-	 * Tuple found.
-	 *
-	 * Note this will fail if there are other conflicting unique indexes.
-	 */
-	if (found)
-	{
-		/* Process and store remote tuple in the slot */
-		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
-		slot_modify_cstrings(remoteslot, localslot, rel,
-							 newtup.values, newtup.changed);
-		MemoryContextSwitchTo(oldctx);
-
-		EvalPlanQualSetSlot(&epqstate, remoteslot);
-
-		/* Do the actual update. */
-		ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
-	}
-	else
-	{
-		/*
-		 * The tuple to be updated could not be found.
-		 *
-		 * TODO what to do here, change the log level to LOG perhaps?
-		 */
-		elog(DEBUG1,
-			 "logical replication did not find row for update "
-			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
-	}
-
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
 	AfterTriggerEndQuery(estate);
 
-	EvalPlanQualEnd(&epqstate);
 	ExecResetTupleTable(estate->es_tupleTable, false);
 	FreeExecutorState(estate);
 
@@ -805,12 +892,8 @@ apply_handle_delete(StringInfo s)
 	LogicalRepRelMapEntry *rel;
 	LogicalRepTupleData oldtup;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	TupleTableSlot *remoteslot;
-	TupleTableSlot *localslot;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -835,58 +918,25 @@ apply_handle_delete(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
+	/* Input functions may need an active snapshot, so get one */
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
-	/* Find the tuple using the replica identity index. */
+	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
+	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL));
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_delete(estate->es_result_relation_info, estate,
+						   remoteslot, &rel->remoterel);
 
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-	/* If found delete it. */
-	if (found)
-	{
-		EvalPlanQualSetSlot(&epqstate, localslot);
-
-		/* Do the actual delete. */
-		ExecSimpleRelationDelete(estate, &epqstate, localslot);
-	}
-	else
-	{
-		/* The tuple to be deleted could not be found. */
-		elog(DEBUG1,
-			 "logical replication could not find row for delete "
-			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
-	}
-
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
 	AfterTriggerEndQuery(estate);
 
-	EvalPlanQualEnd(&epqstate);
 	ExecResetTupleTable(estate->es_tupleTable, false);
 	FreeExecutorState(estate);
 
-- 
2.11.0

#24Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#23)
Re: adding partitioned tables to publications

On 2019-12-06 08:48, Amit Langote wrote:

0001: Adding a partitioned table to a publication implicitly adds all
its partitions. The receiving side must have tables matching the
published partitions, which is typically the case, because the same
partition tree is defined on both nodes.

This looks pretty good to me now. But you need to make all the changed
queries version-aware so that you can still replicate from and to older
versions. (For example, pg_partition_tree is not very old.)

This part looks a bit fishy:

+       /*
+        * If either table is partitioned, skip copying.  Individual 
partitions
+        * will be copied instead.
+        */
+       if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE ||
+               remote_relkind == RELKIND_PARTITIONED_TABLE)
+       {
+               logicalrep_rel_close(relmapentry, NoLock);
+               return;
+       }

I don't think you want to filter out a partitioned table on the local
side, since (a) COPY can handle that, and (b) it's (as of this patch) an
error to have a partitioned table in the subscription table set.
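
To illustrate point (a) (example mine, not from the patch): plain COPY
already routes rows into a partitioned table's partitions, so the initial
table sync could target the partitioned table directly:

```sql
CREATE TABLE p (a int, b int) PARTITION BY HASH (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 0);
CREATE TABLE p2 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 1);
CREATE TABLE p3 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 2);

-- COPY targets the root; each row is routed to the matching partition.
COPY p FROM stdin;
1	10
2	20
\.
```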

I'm not a fan of the new ValidateSubscriptionRel() function. It's too
obscure, especially the return value. Doesn't seem worth it.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#25Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#24)
4 attachment(s)
Re: adding partitioned tables to publications

Thanks for checking.

On Thu, Dec 12, 2019 at 12:48 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2019-12-06 08:48, Amit Langote wrote:

0001: Adding a partitioned table to a publication implicitly adds all
its partitions. The receiving side must have tables matching the
published partitions, which is typically the case, because the same
partition tree is defined on both nodes.

This looks pretty good to me now. But you need to make all the changed
queries version-aware so that you can still replicate from and to older
versions. (For example, pg_partition_tree is not very old.)

True, fixed that.

This part looks a bit fishy:

+       /*
+        * If either table is partitioned, skip copying.  Individual
partitions
+        * will be copied instead.
+        */
+       if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE ||
+               remote_relkind == RELKIND_PARTITIONED_TABLE)
+       {
+               logicalrep_rel_close(relmapentry, NoLock);
+               return;
+       }

I don't think you want to filter out a partitioned table on the local
side, since (a) COPY can handle that, and (b) it's (as of this patch) an
error to have a partitioned table in the subscription table set.

Yeah, (b) is true, so copy_table() should only ever see regular tables
with only patch 0001 applied.

I'm not a fan of the new ValidateSubscriptionRel() function. It's too
obscure, especially the return value. Doesn't seem worth it.

It went through many variants since I first introduced it, but yeah I
agree we don't need it if only because of the weird interface.

It occurred to me that, *as of 0001*, we should indeed disallow
replicating from a regular table on the publisher node into a
partitioned table of the same name on the subscriber node (as the
earlier patches did), because 0001 doesn't implement the tuple routing
support that would be needed to apply such changes.

Attached updated patches.

Thanks,
Amit

Attachments:

v7-0001-Support-adding-partitioned-tables-to-publication.patchapplication/octet-stream; name=v7-0001-Support-adding-partitioned-tables-to-publication.patchDownload
From 436cd39191f55bb8e0095d613187f071084bd111 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 7 Nov 2019 18:19:33 +0900
Subject: [PATCH v7 1/4] Support adding partitioned tables to publication

---
 doc/src/sgml/logical-replication.sgml       |  14 +--
 doc/src/sgml/ref/create_publication.sgml    |  27 +++--
 src/backend/catalog/pg_publication.c        |  42 +++++---
 src/backend/commands/copy.c                 |   2 +-
 src/backend/commands/publicationcmds.c      |  12 ++-
 src/backend/commands/subscriptioncmds.c     | 117 +++++++++++++++++---
 src/backend/executor/execMain.c             |   7 +-
 src/backend/executor/execPartition.c        |   5 +-
 src/backend/executor/execReplication.c      |  47 ++++----
 src/backend/executor/nodeModifyTable.c      |   6 +-
 src/backend/replication/logical/tablesync.c |   1 +
 src/backend/replication/pgoutput/pgoutput.c |  41 +++++--
 src/bin/pg_dump/pg_dump.c                   |   5 +-
 src/include/catalog/pg_publication.h        |   1 +
 src/include/executor/executor.h             |   8 +-
 src/test/regress/expected/publication.out   |  21 +++-
 src/test/regress/sql/publication.sql        |  12 ++-
 src/test/subscription/t/013_partition.pl    | 161 ++++++++++++++++++++++++++++
 18 files changed, 443 insertions(+), 86 deletions(-)
 create mode 100644 src/test/subscription/t/013_partition.pl

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f657d1d06e..3d8cb0895d 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,13 +402,13 @@
 
    <listitem>
     <para>
-     Replication is only possible from base tables to base tables.  That is,
-     the tables on the publication and on the subscription side must be normal
-     tables, not views, materialized views, partition root tables, or foreign
-     tables.  In the case of partitions, you can therefore replicate a
-     partition hierarchy one-to-one, but you cannot currently replicate to a
-     differently partitioned setup.  Attempts to replicate tables other than
-     base tables will result in an error.
+     Replication is only supported for regular and partitioned tables, and
+     the table kind must match between the two servers; that is, one cannot
+     replicate from a regular table into a partitioned table or vice versa.
+     Also, when replicating between partitioned tables, the actual replication
+     occurs between leaf partitions, so the partitions on the two servers must
+     match one-to-one.  Attempts to replicate other types of relations, such as
+     views, materialized views, or foreign tables, will result in an error.
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..848779a00f 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -68,15 +68,25 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
       that table is added to the publication.  If <literal>ONLY</literal> is not
       specified, the table and all its descendant tables (if any) are added.
       Optionally, <literal>*</literal> can be specified after the table name to
-      explicitly indicate that descendant tables are included.
+      explicitly indicate that descendant tables are included.  However, adding
+      a partitioned table to a publication never explicitly adds its partitions,
+      because they are published implicitly by virtue of the partitioned table
+      itself being part of the publication.
      </para>
 
      <para>
-      Only persistent base tables can be part of a publication.  Temporary
-      tables, unlogged tables, foreign tables, materialized views, regular
-      views, and partitioned tables cannot be part of a publication.  To
-      replicate a partitioned table, add the individual partitions to the
-      publication.
+      Only persistent base tables and partitioned tables can be part of a
+      publication.  Temporary tables, unlogged tables, foreign tables,
+      materialized views, and regular views cannot be part of a publication.
+     </para>
+
+     <para>
+      When a partitioned table is added to a publication, all of its existing
+      and future partitions are also implicitly considered to be part of the
+      publication.  So, any <command>INSERT</command>, <command>UPDATE</command>,
+      <command>DELETE</command>, and <command>TRUNCATE</command> operations
+      that are directly applied to a partition are also published via its
+      ancestors' publications.
      </para>
     </listitem>
    </varlistentry>
@@ -132,6 +142,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
    empty set of tables.  That is useful if tables are to be added later.
   </para>
 
+  <para>
+   Partitioned tables are not considered when <literal>FOR ALL TABLES</literal>
+   is specified.
+  </para>
+
   <para>
    The creation of a publication does not start replication.  It only defines
    a grouping and filtering logic for future subscribers.
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d442c8e0bb..9e14a8216e 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -26,6 +26,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
 #include "catalog/pg_type.h"
@@ -47,17 +48,9 @@
 static void
 check_publication_add_relation(Relation targetrel)
 {
-	/* Give more specific error for partitioned tables */
-	if (RelationGetForm(targetrel)->relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("\"%s\" is a partitioned table",
-						RelationGetRelationName(targetrel)),
-				 errdetail("Adding partitioned tables to publications is not supported."),
-				 errhint("You can add the table partitions individually.")));
-
-	/* Must be table */
-	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION)
+	/* Must be a regular or partitioned table */
+	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("\"%s\" is not a table",
@@ -103,7 +96,8 @@ check_publication_add_relation(Relation targetrel)
 static bool
 is_publishable_class(Oid relid, Form_pg_class reltuple)
 {
-	return reltuple->relkind == RELKIND_RELATION &&
+	return (reltuple->relkind == RELKIND_RELATION ||
+			reltuple->relkind == RELKIND_PARTITIONED_TABLE) &&
 		!IsCatalogRelationOid(relid) &&
 		reltuple->relpersistence == RELPERSISTENCE_PERMANENT &&
 		relid >= FirstNormalObjectId;
@@ -230,7 +224,7 @@ GetRelationPublications(Oid relid)
 	CatCList   *pubrellist;
 	int			i;
 
-	/* Find all publications associated with the relation. */
+	/* Finds all publications associated with the relation. */
 	pubrellist = SearchSysCacheList1(PUBLICATIONRELMAP,
 									 ObjectIdGetDatum(relid));
 	for (i = 0; i < pubrellist->n_members; i++)
@@ -246,6 +240,28 @@ GetRelationPublications(Oid relid)
 	return result;
 }
 
+/*
+ * Finds all publications that publish changes to the input relation's
+ * ancestors.
+ */
+List *
+GetRelationAncestorPublications(Oid relid)
+{
+	List	   *ancestors = get_partition_ancestors(relid);
+	List	   *ancestor_pubids = NIL;
+	ListCell   *lc;
+
+	foreach(lc, ancestors)
+	{
+		Oid			ancestor = lfirst_oid(lc);
+		List	   *rel_publishers = GetRelationPublications(ancestor);
+
+		ancestor_pubids = list_concat_copy(ancestor_pubids, rel_publishers);
+	}
+
+	return ancestor_pubids;
+}
+
 /*
  * Gets list of relation oids for a publication.
  *
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 42a147b67d..bb8c926659 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2835,7 +2835,7 @@ CopyFrom(CopyState cstate)
 	target_resultRelInfo = resultRelInfo;
 
 	/* Verify the named relation is a valid target for INSERT */
-	CheckValidResultRel(resultRelInfo, CMD_INSERT);
+	CheckValidResultRel(resultRelInfo, NULL, CMD_INSERT);
 
 	ExecOpenIndices(resultRelInfo, false);
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index fbf11c86aa..ee56acf3f3 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -498,7 +498,8 @@ RemovePublicationRelById(Oid proid)
 
 /*
  * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode.
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
  */
 static List *
 OpenTableList(List *tables)
@@ -539,8 +540,13 @@ OpenTableList(List *tables)
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 
-		/* Add children of this rel, if requested */
-		if (recurse)
+		/*
+		 * Add children of this rel, if requested, so that they too are added
+		 * to the publication.  A partitioned table can't have any inheritance
+		 * children other than its partitions, which need not be explicitly
+		 * added to the publication.
+		 */
+		if (recurse && rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
 		{
 			List	   *children;
 			ListCell   *child;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5408edcfc2..5c5c8ebe3b 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -44,7 +44,19 @@
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 
-static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
+/*
+ * Structure used by fetch_publication_tables to describe a published table.
+ * The information is used by the callers of fetch_publication_tables to
+ * generate a pg_subscription_rel catalog entry for the table.
+ */
+typedef struct PublishedTable
+{
+	RangeVar   *rv;
+
+	char		relkind;
+}			PublishedTable;
+
+static List *fetch_publication_tables(WalReceiverConn *wrconn, List *publications);
 
 /*
  * Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -453,18 +465,42 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
 			 * Get the table list from publisher and build local table status
 			 * info.
 			 */
-			tables = fetch_table_list(wrconn, publications);
+			tables = fetch_publication_tables(wrconn, publications);
 			foreach(lc, tables)
 			{
-				RangeVar   *rv = (RangeVar *) lfirst(lc);
+				PublishedTable *pt = (PublishedTable *) lfirst(lc);
+				RangeVar   *rv = pt->rv;
 				Oid			relid;
+				char		local_relkind;
 
 				relid = RangeVarGetRelid(rv, AccessShareLock, false);
+				local_relkind = get_rel_relkind(relid);
 
 				/* Check for supported relkind. */
-				CheckSubscriptionRelkind(get_rel_relkind(relid),
+				CheckSubscriptionRelkind(local_relkind,
 										 rv->schemaname, rv->relname);
 
+				/*
+				 * Currently, partitioned table replication occurs between leaf
+				 * partitions, so both the source and the target tables must be
+				 * partitioned.
+				 */
+				if (pt->relkind == RELKIND_RELATION &&
+					local_relkind == RELKIND_PARTITIONED_TABLE)
+					ereport(ERROR,
+							(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+							 errmsg("cannot use relation \"%s.%s\" as logical replication target",
+									rv->schemaname, rv->relname),
+							 errdetail("\"%s.%s\" is a partitioned table whereas it is a regular table on publication server.",
+									   rv->schemaname, rv->relname)));
+
+				/*
+				 * A partitioned table doesn't need local state, because the
+				 * state is managed for individual partitions instead.
+				 */
+				if (pt->relkind == RELKIND_PARTITIONED_TABLE)
+					continue;
+
 				AddSubscriptionRelState(subid, relid, table_state,
 										InvalidXLogRecPtr);
 			}
@@ -530,7 +566,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 				(errmsg("could not connect to the publisher: %s", err)));
 
 	/* Get the table list from publisher. */
-	pubrel_names = fetch_table_list(wrconn, sub->publications);
+	pubrel_names = fetch_publication_tables(wrconn, sub->publications);
 
 	/* We are done with the remote side, close connection. */
 	walrcv_disconnect(wrconn);
@@ -565,15 +601,39 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 
 	foreach(lc, pubrel_names)
 	{
-		RangeVar   *rv = (RangeVar *) lfirst(lc);
+		PublishedTable *pt = (PublishedTable *) lfirst(lc);
+		RangeVar   *rv = pt->rv;
 		Oid			relid;
+		char		local_relkind;
 
 		relid = RangeVarGetRelid(rv, AccessShareLock, false);
+		local_relkind = get_rel_relkind(relid);
 
 		/* Check for supported relkind. */
-		CheckSubscriptionRelkind(get_rel_relkind(relid),
+		CheckSubscriptionRelkind(local_relkind,
 								 rv->schemaname, rv->relname);
 
+		/*
+		 * Currently, partitioned table replication occurs between leaf
+		 * partitions, so both the source and the target tables must be
+		 * partitioned.
+		 */
+		if (pt->relkind == RELKIND_RELATION &&
+			local_relkind == RELKIND_PARTITIONED_TABLE)
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot use relation \"%s.%s\" as logical replication target",
+							rv->schemaname, rv->relname),
+					 errdetail("\"%s.%s\" is a partitioned table whereas it is a regular table on publication server.",
+							   rv->schemaname, rv->relname)));
+
+		/*
+		 * A partitioned table doesn't need local state, because the
+		 * state is managed for individual partitions instead.
+		 */
+		if (pt->relkind == RELKIND_PARTITIONED_TABLE)
+			continue;
+
 		pubrel_local_oids[off++] = relid;
 
 		if (!bsearch(&relid, subrel_local_oids,
@@ -1121,15 +1181,17 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
 
 /*
  * Get the list of tables which belong to specified publications on the
- * publisher connection.
+ * publisher connection to create a subscription state (pg_subscription_rel
+ * entry) for each.  For partitioned tables, subscription state is maintained
+ * per partition, so partitions are fetched too.
  */
 static List *
-fetch_table_list(WalReceiverConn *wrconn, List *publications)
+fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 {
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {TEXTOID, TEXTOID};
+	Oid			tableRow[3] = {TEXTOID, TEXTOID, CHAROID};
 	ListCell   *lc;
 	bool		first;
 	List	   *tablelist = NIL;
@@ -1137,9 +1199,30 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	Assert(list_length(publications) > 0);
 
 	initStringInfo(&cmd);
-	appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename\n"
+	appendStringInfoString(&cmd, "SELECT DISTINCT s.schemaname, s.tablename, s.relkind FROM (\n"
+						   "  SELECT t.pubname, t.schemaname, t.tablename, c.relkind\n"
 						   "  FROM pg_catalog.pg_publication_tables t\n"
-						   " WHERE t.pubname IN (");
+						   "  JOIN pg_catalog.pg_class c \n"
+						   "  ON t.schemaname = c.relnamespace::pg_catalog.regnamespace::name\n"
+						   "  AND t.tablename = c.relname \n");
+
+	/*
+	 * As of v13, partitioned tables can be published, although their changes
+	 * are published as their partitions', so we will need the partitions in
+	 * the result.
+	 */
+	if (walrcv_server_version(wrconn) >= 130000)
+		appendStringInfoString(&cmd, "  UNION\n"
+						   "  SELECT t.pubname, s.schemaname, s.tablename, s.relkind\n"
+						   "  FROM pg_catalog.pg_publication_tables t,\n"
+						   "  LATERAL (SELECT c.relnamespace::regnamespace::name, c.relname, c.relkind\n"
+						   "		   FROM pg_class c\n"
+						   "		   JOIN pg_partition_tree(t.schemaname || '.' || t.tablename) p\n"
+						   "		   ON p.relid = c.oid\n"
+						   "		   WHERE p.level > 0) AS s(schemaname, tablename, relkind)\n");
+
+	appendStringInfoString(&cmd, ") s WHERE s.pubname IN (");
+
 	first = true;
 	foreach(lc, publications)
 	{
@@ -1154,7 +1237,7 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	}
 	appendStringInfoChar(&cmd, ')');
 
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 	pfree(cmd.data);
 
 	if (res->status != WALRCV_OK_TUPLES)
@@ -1169,15 +1252,17 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 		char	   *nspname;
 		char	   *relname;
 		bool		isnull;
-		RangeVar   *rv;
+		PublishedTable *pt = palloc(sizeof(PublishedTable));
 
 		nspname = TextDatumGetCString(slot_getattr(slot, 1, &isnull));
 		Assert(!isnull);
 		relname = TextDatumGetCString(slot_getattr(slot, 2, &isnull));
 		Assert(!isnull);
+		pt->rv = makeRangeVar(pstrdup(nspname), pstrdup(relname), -1);
+		pt->relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+		Assert(!isnull);
 
-		rv = makeRangeVar(pstrdup(nspname), pstrdup(relname), -1);
-		tablelist = lappend(tablelist, rv);
+		tablelist = lappend(tablelist, pt);
 
 		ExecClearTuple(slot);
 	}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index c46eb8d646..416970393c 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1073,7 +1073,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
  * CheckValidRowMarkRel.
  */
 void
-CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation)
+CheckValidResultRel(ResultRelInfo *resultRelInfo,
+					ResultRelInfo *rootResultRelInfo,
+					CmdType operation)
 {
 	Relation	resultRel = resultRelInfo->ri_RelationDesc;
 	TriggerDesc *trigDesc = resultRel->trigdesc;
@@ -1083,7 +1085,8 @@ CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation)
 	{
 		case RELKIND_RELATION:
 		case RELKIND_PARTITIONED_TABLE:
-			CheckCmdReplicaIdentity(resultRel, operation);
+			CheckCmdReplicaIdentity(resultRelInfo, rootResultRelInfo,
+									operation);
 			break;
 		case RELKIND_SEQUENCE:
 			ereport(ERROR,
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d23f292cb0..06f6923966 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -385,7 +385,8 @@ ExecFindPartition(ModifyTableState *mtstate,
 						rri = elem->rri;
 
 						/* Verify this ResultRelInfo allows INSERTs */
-						CheckValidResultRel(rri, CMD_INSERT);
+						CheckValidResultRel(rri, rootResultRelInfo,
+											CMD_INSERT);
 
 						/* Set up the PartitionRoutingInfo for it */
 						ExecInitRoutingInfo(mtstate, estate, proute, dispatch,
@@ -530,7 +531,7 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, EState *estate,
 	 * partition-key becomes a DELETE+INSERT operation, so this check is still
 	 * required when the operation is CMD_UPDATE.
 	 */
-	CheckValidResultRel(leaf_part_rri, CMD_INSERT);
+	CheckValidResultRel(leaf_part_rri, rootResultRelInfo, CMD_INSERT);
 
 	/*
 	 * Open partition indices.  The user may have asked to check for conflicts
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 95e027c970..bd27912bc7 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -396,10 +396,10 @@ ExecSimpleRelationInsert(EState *estate, TupleTableSlot *slot)
 	ResultRelInfo *resultRelInfo = estate->es_result_relation_info;
 	Relation	rel = resultRelInfo->ri_RelationDesc;
 
-	/* For now we support only tables. */
+	/* For now we support only regular tables. */
 	Assert(rel->rd_rel->relkind == RELKIND_RELATION);
 
-	CheckCmdReplicaIdentity(rel, CMD_INSERT);
+	CheckCmdReplicaIdentity(resultRelInfo, NULL, CMD_INSERT);
 
 	/* BEFORE ROW INSERT Triggers */
 	if (resultRelInfo->ri_TrigDesc &&
@@ -463,7 +463,7 @@ ExecSimpleRelationUpdate(EState *estate, EPQState *epqstate,
 	/* For now we support only tables. */
 	Assert(rel->rd_rel->relkind == RELKIND_RELATION);
 
-	CheckCmdReplicaIdentity(rel, CMD_UPDATE);
+	CheckCmdReplicaIdentity(resultRelInfo, NULL, CMD_UPDATE);
 
 	/* BEFORE ROW UPDATE Triggers */
 	if (resultRelInfo->ri_TrigDesc &&
@@ -521,7 +521,7 @@ ExecSimpleRelationDelete(EState *estate, EPQState *epqstate,
 	Relation	rel = resultRelInfo->ri_RelationDesc;
 	ItemPointer tid = &searchslot->tts_tid;
 
-	CheckCmdReplicaIdentity(rel, CMD_DELETE);
+	CheckCmdReplicaIdentity(resultRelInfo, NULL, CMD_DELETE);
 
 	/* BEFORE ROW DELETE Triggers */
 	if (resultRelInfo->ri_TrigDesc &&
@@ -544,12 +544,17 @@ ExecSimpleRelationDelete(EState *estate, EPQState *epqstate,
 }
 
 /*
- * Check if command can be executed with current replica identity.
+ * Check if command can be executed on 'target_rel' with its (or the
+ * ancestor's) current replica identity.
  */
 void
-CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
+CheckCmdReplicaIdentity(ResultRelInfo *target_rel,
+						ResultRelInfo *root_target_rel,
+						CmdType cmd)
 {
 	PublicationActions *pubactions;
+	Relation	rel = target_rel->ri_RelationDesc;
+	Relation	rootrel = root_target_rel ? root_target_rel->ri_RelationDesc : NULL;
 
 	/* We only need to do checks for UPDATE and DELETE. */
 	if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
@@ -563,9 +568,18 @@ CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
 	/*
 	 * This is either UPDATE OR DELETE and there is no replica identity.
 	 *
-	 * Check if the table publishes UPDATES or DELETES.
+	 * Check if the table or its root ancestor publishes UPDATES or DELETES.
 	 */
 	pubactions = GetRelationPublicationActions(rel);
+	if (rootrel)
+	{
+		PublicationActions *root_pubactions;
+
+		root_pubactions = GetRelationPublicationActions(rootrel);
+		pubactions->pubupdate |= root_pubactions->pubupdate;
+		pubactions->pubdelete |= root_pubactions->pubdelete;
+	}
+
 	if (cmd == CMD_UPDATE && pubactions->pubupdate)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -591,17 +605,10 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * We currently only support writing to regular and partitioned tables.
+	 * However, give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -609,7 +616,11 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	/*
+	 * There are some unsupported cases with partitioned tables, but we leave
+	 * it for the caller to report them.
+	 */
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 9ba1d78344..2676ae281e 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2268,6 +2268,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	int			nplans = list_length(node->plans);
 	ResultRelInfo *saved_resultRelInfo;
 	ResultRelInfo *resultRelInfo;
+	ResultRelInfo *rootResultRelInfo = NULL;
 	Plan	   *subplan;
 	ListCell   *l;
 	int			i;
@@ -2295,8 +2296,11 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		rootResultRelInfo = mtstate->rootResultRelInfo;
+	}
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
@@ -2330,7 +2334,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 		/*
 		 * Verify result relation is a valid target for the current operation
 		 */
-		CheckValidResultRel(resultRelInfo, operation);
+		CheckValidResultRel(resultRelInfo, rootResultRelInfo, operation);
 
 		/*
 		 * If there are indices on the result relation, open them and save
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e01d18c3a1..ec387ba768 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -761,6 +761,7 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
+	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 3483c1b877..8dc78f1779 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -50,7 +50,12 @@ static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
 
-/* Entry in the map used to remember which relation schemas we sent. */
+/*
+ * Entry in the map used to remember which relation schemas we sent.
+ *
+ * For partitions, 'pubactions' considers not only the table's own
+ * publications, but also those of all of its ancestors.
+ */
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
@@ -63,7 +68,7 @@ typedef struct RelationSyncEntry
 static HTAB *RelationSyncCache = NULL;
 
 static void init_rel_sync_cache(MemoryContext decoding_context);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
 static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
 static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
 										  uint32 hashvalue);
@@ -311,7 +316,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	if (!is_publishable_relation(relation))
 		return;
 
-	relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+	relentry = get_rel_sync_entry(data, relation);
 
 	/* First check the table filter */
 	switch (change->action)
@@ -401,7 +406,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		if (!is_publishable_relation(relation))
 			continue;
 
-		relentry = get_rel_sync_entry(data, relid);
+		relentry = get_rel_sync_entry(data, relation);
 
 		if (!relentry->pubactions.pubtruncate)
 			continue;
@@ -526,8 +531,9 @@ init_rel_sync_cache(MemoryContext cachectx)
  * Find or create entry in the relation schema cache.
  */
 static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
 {
+	Oid			relid = RelationGetRelid(rel);
 	RelationSyncEntry *entry;
 	bool		found;
 	MemoryContext oldctx;
@@ -546,7 +552,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	if (!found || !entry->replicate_valid)
 	{
 		List	   *pubids = GetRelationPublications(relid);
-		ListCell   *lc;
+		ListCell   *lc,
+				   *lc1;
+		List	   *ancestor_pubids = NIL;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -568,6 +576,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		entry->pubactions.pubinsert = entry->pubactions.pubupdate =
 			entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
 
+		/* For partitions, also consider publications of ancestors. */
+		if (rel->rd_rel->relispartition)
+			ancestor_pubids =
+				GetRelationAncestorPublications(RelationGetRelid(rel));
+
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
@@ -580,12 +593,28 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 				entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
 			}
 
+			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
+				entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
+				break;
+
+			foreach(lc1, ancestor_pubids)
+			{
+				if (lfirst_oid(lc1) == pub->oid)
+				{
+					entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
+					entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
+					entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
+					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				}
+			}
+
 			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
 				entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
 				break;
 		}
 
 		list_free(pubids);
+		list_free(ancestor_pubids);
 
 		entry->replicate_valid = true;
 	}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 08658c8e86..b5e91771e4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3969,8 +3969,9 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 	{
 		TableInfo  *tbinfo = &tblinfo[i];
 
-		/* Only plain tables can be aded to publications. */
-		if (tbinfo->relkind != RELKIND_RELATION)
+		/* Only plain and partitioned tables can be added to publications. */
+		if (tbinfo->relkind != RELKIND_RELATION &&
+			tbinfo->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
 
 		/*
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index ea22aa6563..5ee7091472 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -80,6 +80,7 @@ typedef struct Publication
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
 extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationAncestorPublications(Oid relid);
 extern List *GetPublicationRelations(Oid pubid);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 6298c7c8ca..698a57d0cd 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -179,7 +179,9 @@ extern void ExecutorEnd(QueryDesc *queryDesc);
 extern void standard_ExecutorEnd(QueryDesc *queryDesc);
 extern void ExecutorRewind(QueryDesc *queryDesc);
 extern bool ExecCheckRTPerms(List *rangeTable, bool ereport_on_violation);
-extern void CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation);
+extern void CheckValidResultRel(ResultRelInfo *resultRelInfo,
+								ResultRelInfo *rootResultRelInfo,
+								CmdType operation);
 extern void InitResultRelInfo(ResultRelInfo *resultRelInfo,
 							  Relation resultRelationDesc,
 							  Index resultRelationIndex,
@@ -592,7 +594,9 @@ extern void ExecSimpleRelationUpdate(EState *estate, EPQState *epqstate,
 									 TupleTableSlot *searchslot, TupleTableSlot *slot);
 extern void ExecSimpleRelationDelete(EState *estate, EPQState *epqstate,
 									 TupleTableSlot *searchslot);
-extern void CheckCmdReplicaIdentity(Relation rel, CmdType cmd);
+extern void CheckCmdReplicaIdentity(ResultRelInfo *target_rel,
+									ResultRelInfo *root_target_rel,
+									CmdType cmd);
 
 extern void CheckSubscriptionRelkind(char relkind, const char *nspname,
 									 const char *relname);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..e3fabe70f9 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -116,6 +116,22 @@ Tables:
 
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+-- should add only the parent to publication, not the partition
+CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+
+DROP PUBLICATION testpub_forparted;
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
@@ -142,11 +158,6 @@ Tables:
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
 DETAIL:  Only tables can be added to publications.
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
-ERROR:  "testpub_parted" is a partitioned table
-DETAIL:  Adding partitioned tables to publications is not supported.
-HINT:  You can add the table partitions individually.
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default ADD TABLE pub_test.testpub_nopk;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a755cf..b79a3f8f8f 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -69,6 +69,16 @@ RESET client_min_messages;
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
 
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+-- should add only the parent to publication, not the partition
+CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+DROP PUBLICATION testpub_forparted;
+
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 SET client_min_messages = 'ERROR';
@@ -83,8 +93,6 @@ CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 
 -- fail - view
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
 
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
new file mode 100644
index 0000000000..eb0f1cd6a8
--- /dev/null
+++ b/src/test/subscription/t/013_partition.pl
@@ -0,0 +1,161 @@
+# Test PARTITION
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 10;
+
+# setup
+
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+my $node_subscriber1 = get_new_node('subscriber1');
+$node_subscriber1->init(allows_streaming => 'logical');
+$node_subscriber1->start;
+
+my $node_subscriber2 = get_new_node('subscriber2');
+$node_subscriber2->init(allows_streaming => 'logical');
+$node_subscriber2->start;
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub1 FOR TABLE tab1, tab1_1");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub2 FOR TABLE tab1_2");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
+
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub2");
+
+# Wait for initial sync of all subscriptions
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber2->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# insert data (some into the root parent and some directly into partitions)
+
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_1 (a) VALUES (3)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+my $result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
+is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
+
+# update a row (no partition change)
+
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 2 WHERE a = 1");
+
+$node_publisher->wait_for_catchup('sub1');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
+
+# update a row (partition changes)
+
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|3|6), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
+is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
+
+# delete rows (some from the root parent, some directly from the partition)
+
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1 WHERE a IN (3, 5)");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1_2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1_1, tab_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(0||), 'delete from tab1_2 replicated');
+
+# truncate (root parent and partition directly)
+
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1_2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(2|1|2), 'truncate of tab_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(0||), 'truncate of tab1_2 replicated');
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1");
+
+$node_publisher->wait_for_catchup('sub1');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1_1 replicated');
-- 
2.16.5

Attachment: v7-0003-Some-refactoring-of-logical-worker.c.patch (application/octet-stream)
From fb50b06602f4a17135096851859e148deb0a0578 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 5 Dec 2019 09:17:06 +0900
Subject: [PATCH v7 3/4] Some refactoring of logical/worker.c

---
 src/backend/replication/logical/worker.c | 290 ++++++++++++++++++-------------
 1 file changed, 170 insertions(+), 120 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ced0d191c2..2686fccdc2 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -89,7 +89,8 @@ static dlist_head lsn_mapping = DLIST_STATIC_INIT(lsn_mapping);
 
 typedef struct SlotErrCallbackArg
 {
-	LogicalRepRelMapEntry *rel;
+	LogicalRepRelation *remoterel;
+	Oid			local_reloid;
 	int			local_attnum;
 	int			remote_attnum;
 } SlotErrCallbackArg;
@@ -269,7 +270,6 @@ static void
 slot_store_error_callback(void *arg)
 {
 	SlotErrCallbackArg *errarg = (SlotErrCallbackArg *) arg;
-	LogicalRepRelMapEntry *rel;
 	char	   *remotetypname;
 	Oid			remotetypoid,
 				localtypoid;
@@ -278,19 +278,18 @@ slot_store_error_callback(void *arg)
 	if (errarg->remote_attnum < 0)
 		return;
 
-	rel = errarg->rel;
-	remotetypoid = rel->remoterel.atttyps[errarg->remote_attnum];
+	remotetypoid = errarg->remoterel->atttyps[errarg->remote_attnum];
 
 	/* Fetch remote type name from the LogicalRepTypMap cache */
 	remotetypname = logicalrep_typmap_gettypname(remotetypoid);
 
 	/* Fetch local type OID from the local sys cache */
-	localtypoid = get_atttype(rel->localreloid, errarg->local_attnum + 1);
+	localtypoid = get_atttype(errarg->local_reloid, errarg->local_attnum + 1);
 
 	errcontext("processing remote data for replication target relation \"%s.%s\" column \"%s\", "
 			   "remote type %s, local type %s",
-			   rel->remoterel.nspname, rel->remoterel.relname,
-			   rel->remoterel.attnames[errarg->remote_attnum],
+			   errarg->remoterel->nspname, errarg->remoterel->relname,
+			   errarg->remoterel->attnames[errarg->remote_attnum],
 			   remotetypname,
 			   format_type_be(localtypoid));
 }
@@ -312,7 +311,8 @@ slot_store_cstrings(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
 	ExecClearTuple(slot);
 
 	/* Push callback + info on the error context stack */
-	errarg.rel = rel;
+	errarg.remoterel = &rel->remoterel;
+	errarg.local_reloid = rel->localreloid;
 	errarg.local_attnum = -1;
 	errarg.remote_attnum = -1;
 	errcallback.callback = slot_store_error_callback;
@@ -375,8 +375,9 @@ slot_store_cstrings(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
  */
 static void
 slot_modify_cstrings(TupleTableSlot *slot, TupleTableSlot *srcslot,
-					 LogicalRepRelMapEntry *rel,
-					 char **values, bool *replaces)
+					 char **values, bool *replaces,
+					 AttrNumber *attrmap, LogicalRepRelation *remoterel,
+					 Oid local_reloid)
 {
 	int			natts = slot->tts_tupleDescriptor->natts;
 	int			i;
@@ -396,7 +397,8 @@ slot_modify_cstrings(TupleTableSlot *slot, TupleTableSlot *srcslot,
 	memcpy(slot->tts_isnull, srcslot->tts_isnull, natts * sizeof(bool));
 
 	/* For error reporting, push callback + info on the error context stack */
-	errarg.rel = rel;
+	errarg.remoterel = remoterel;
+	errarg.local_reloid = local_reloid;
 	errarg.local_attnum = -1;
 	errarg.remote_attnum = -1;
 	errcallback.callback = slot_store_error_callback;
@@ -408,7 +410,7 @@ slot_modify_cstrings(TupleTableSlot *slot, TupleTableSlot *srcslot,
 	for (i = 0; i < natts; i++)
 	{
 		Form_pg_attribute att = TupleDescAttr(slot->tts_tupleDescriptor, i);
-		int			remoteattnum = rel->attrmap[i];
+		int			remoteattnum = attrmap[i];
 
 		if (remoteattnum < 0)
 			continue;
@@ -577,6 +579,148 @@ GetRelationIdentityOrPK(Relation rel)
 	return idxoid;
 }
 
+/* Workhorse for apply_handle_insert() */
+static void
+apply_handle_do_insert(ResultRelInfo *relinfo,
+					   EState *estate, TupleTableSlot *localslot)
+{
+	ExecOpenIndices(relinfo, false);
+
+	/* Do the insert. */
+	ExecSimpleRelationInsert(estate, localslot);
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+}
+
+/* Workhorse for apply_handle_update() */
+static void
+apply_handle_do_update(ResultRelInfo *relinfo,
+					   EState *estate, TupleTableSlot *remoteslot,
+					   LogicalRepTupleData *newtup,
+					   AttrNumber *attrmap, LogicalRepRelation *remoterel)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+	MemoryContext oldctx;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	ExecOpenIndices(relinfo, false);
+
+	/*
+	 * Try to find tuple using either replica identity index, primary key or
+	 * if needed, sequential scan.
+	 */
+	idxoid = GetRelationIdentityOrPK(rel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(rel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, localslot);
+	else
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
+										 remoteslot, localslot);
+
+	ExecClearTuple(remoteslot);
+
+	/*
+	 * Tuple found.
+	 *
+	 * Note this will fail if there are other conflicting unique indexes.
+	 */
+	if (found)
+	{
+		/* Process and store remote tuple in the slot */
+		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+		slot_modify_cstrings(remoteslot, localslot,
+							 newtup->values, newtup->changed,
+							 attrmap, remoterel, RelationGetRelid(rel));
+		MemoryContextSwitchTo(oldctx);
+
+		EvalPlanQualSetSlot(&epqstate, remoteslot);
+
+		/* Do the actual update. */
+		ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
+	}
+	else
+	{
+		/*
+		 * The tuple to be updated could not be found.
+		 *
+		 * TODO what to do here, change the log level to LOG perhaps?
+		 */
+		elog(DEBUG1,
+			 "logical replication did not find row for update "
+			 "in replication target relation \"%s\"",
+			 RelationGetRelationName(rel));
+	}
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+	EvalPlanQualEnd(&epqstate);
+}
+
+/* Workhorse for apply_handle_delete() */
+static void
+apply_handle_do_delete(ResultRelInfo *relinfo, EState *estate,
+					   TupleTableSlot *remoteslot,
+					   LogicalRepRelation *remoterel)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	/*
+	 * Try to find tuple using either replica identity index, primary key or
+	 * if needed, sequential scan.
+	 */
+	idxoid = GetRelationIdentityOrPK(rel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(rel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, localslot);
+	else
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
+										 remoteslot, localslot);
+	ExecOpenIndices(relinfo, false);
+
+	/* If found delete it. */
+	if (found)
+	{
+		EvalPlanQualSetSlot(&epqstate, localslot);
+
+		/* Do the actual delete. */
+		ExecSimpleRelationDelete(estate, &epqstate, localslot);
+	}
+	else
+	{
+		/* The tuple to be deleted could not be found. */
+		elog(DEBUG1,
+			 "logical replication could not find row for delete "
+			 "in replication target relation \"%s\"",
+			 RelationGetRelationName(rel));
+	}
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+	EvalPlanQualEnd(&epqstate);
+}
+
 /*
  * Handle INSERT message.
  */
@@ -619,13 +763,10 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	ExecOpenIndices(estate->es_result_relation_info, false);
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_insert(estate->es_result_relation_info, estate,
+						   remoteslot);
 
-	/* Do the insert. */
-	ExecSimpleRelationInsert(estate, remoteslot);
-
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
@@ -682,15 +823,11 @@ apply_handle_update(StringInfo s)
 {
 	LogicalRepRelMapEntry *rel;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	LogicalRepTupleData oldtup;
 	LogicalRepTupleData newtup;
 	bool		has_oldtup;
-	TupleTableSlot *localslot;
 	TupleTableSlot *remoteslot;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -716,12 +853,9 @@ apply_handle_update(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
+	/* Input functions may need an active snapshot, so get one */
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
 	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
@@ -729,63 +863,16 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL && has_oldtup));
-
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-
-	ExecClearTuple(remoteslot);
-
-	/*
-	 * Tuple found.
-	 *
-	 * Note this will fail if there are other conflicting unique indexes.
-	 */
-	if (found)
-	{
-		/* Process and store remote tuple in the slot */
-		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
-		slot_modify_cstrings(remoteslot, localslot, rel,
-							 newtup.values, newtup.changed);
-		MemoryContextSwitchTo(oldctx);
-
-		EvalPlanQualSetSlot(&epqstate, remoteslot);
-
-		/* Do the actual update. */
-		ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
-	}
-	else
-	{
-		/*
-		 * The tuple to be updated could not be found.
-		 *
-		 * TODO what to do here, change the log level to LOG perhaps?
-		 */
-		elog(DEBUG1,
-			 "logical replication did not find row for update "
-			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
-	}
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_update(estate->es_result_relation_info, estate,
+						   remoteslot, &newtup, rel->attrmap,
+						   &rel->remoterel);
 
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
 	AfterTriggerEndQuery(estate);
 
-	EvalPlanQualEnd(&epqstate);
 	ExecResetTupleTable(estate->es_tupleTable, false);
 	FreeExecutorState(estate);
 
@@ -805,12 +892,8 @@ apply_handle_delete(StringInfo s)
 	LogicalRepRelMapEntry *rel;
 	LogicalRepTupleData oldtup;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	TupleTableSlot *remoteslot;
-	TupleTableSlot *localslot;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -835,58 +918,25 @@ apply_handle_delete(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
+	/* Input functions may need an active snapshot, so get one */
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
-	/* Find the tuple using the replica identity index. */
+	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
+	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL));
-
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-	/* If found delete it. */
-	if (found)
-	{
-		EvalPlanQualSetSlot(&epqstate, localslot);
-
-		/* Do the actual delete. */
-		ExecSimpleRelationDelete(estate, &epqstate, localslot);
-	}
-	else
-	{
-		/* The tuple to be deleted could not be found. */
-		elog(DEBUG1,
-			 "logical replication could not find row for delete "
-			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
-	}
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_delete(estate->es_result_relation_info, estate,
+						   remoteslot, &rel->remoterel);
 
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
 	AfterTriggerEndQuery(estate);
 
-	EvalPlanQualEnd(&epqstate);
 	ExecResetTupleTable(estate->es_tupleTable, false);
 	FreeExecutorState(estate);
 
-- 
2.16.5

Attachment: v7-0002-Add-publish_using_root_schema-parameter-for-publi.patch (application/octet-stream)
From e9775f0ca7f3195a2aed3a4a19ed202b67c21dd8 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v7 2/4] Add publish_using_root_schema parameter for
 publications

It dictates whether changes to leaf partitions are published using
the schema of the root parent table.
---
 doc/src/sgml/ref/create_publication.sgml  |  15 +++++
 src/backend/catalog/pg_publication.c      |   1 +
 src/backend/commands/publicationcmds.c    |  94 ++++++++++++++++-----------
 src/bin/pg_dump/pg_dump.c                 |  22 ++++++-
 src/bin/pg_dump/pg_dump.h                 |   1 +
 src/bin/psql/describe.c                   |  17 ++++-
 src/include/catalog/pg_publication.h      |   3 +
 src/test/regress/expected/publication.out | 103 +++++++++++++++++-------------
 src/test/regress/sql/publication.sql      |   3 +
 9 files changed, 171 insertions(+), 88 deletions(-)

diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 848779a00f..a8cf2c4629 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -124,6 +124,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_using_root_schema</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table contained in the publication are published using its own
+          schema rather than that of the individual partitions that are
+          actually changed; the latter is the default.  Setting it to
+          <literal>true</literal> allows the changes to be replicated into
+          a non-partitioned table or into a partitioned table consisting of
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9e14a8216e..5ef77f1014 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -404,6 +404,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->publish_using_root_schema = pubform->pubasroot;
 
 	ReleaseSysCache(tup);
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index ee56acf3f3..06e833fe57 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -55,20 +55,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_using_root_schema_given,
+						  bool *publish_using_root_schema)
 {
 	ListCell   *lc;
 
+	*publish_using_root_schema_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* By default, a relation's changes are published using its own schema. */
+	*publish_using_root_schema = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -90,10 +93,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -109,19 +112,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_using_root_schema") == 0)
+		{
+			if (*publish_using_root_schema_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_using_root_schema_given = true;
+			*publish_using_root_schema = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -142,10 +154,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -182,9 +193,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -192,13 +203,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubasroot - 1] =
+		BoolGetDatum(publish_using_root_schema);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -250,17 +263,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -269,19 +281,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_using_root_schema_given)
+	{
+		values[Anum_pg_publication_pubasroot - 1] = BoolGetDatum(publish_using_root_schema);
+		replaces[Anum_pg_publication_pubasroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index b5e91771e4..cfe89d4e09 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3780,6 +3780,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubasroot;
 	int			i,
 				ntups;
 
@@ -3791,11 +3792,18 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubasroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false as pubasroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
@@ -3819,6 +3827,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubasroot = PQfnumber(res, "pubasroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3841,6 +3850,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubasroot =
+			(strcmp(PQgetvalue(res, i, i_pubasroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -3917,7 +3928,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubasroot)
+		appendPQExpBufferStr(query, ", publish_using_root_schema = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 7b2c1524a5..99b0c1611d 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -600,6 +600,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubasroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index b3b9313b36..ce1321e17f 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5706,7 +5706,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5737,6 +5737,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubasroot AS \"%s\"",
+						  gettext_noop("Publishes Using Root Schema"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5778,6 +5782,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubasroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5790,6 +5795,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubasroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5800,6 +5806,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubasroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubasroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5849,6 +5858,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubasroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5861,6 +5872,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubasroot)
+			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5869,6 +5882,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubasroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 5ee7091472..61d338b110 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubasroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,6 +76,7 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		publish_using_root_schema;
 	PublicationActions pubactions;
 } Publication;
 
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index e3fabe70f9..da22ca3c6a 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                                       Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -124,10 +126,19 @@ RESET client_min_messages;
 CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
+Tables:
+    "public.testpub_parted"
+
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | t
 Tables:
     "public.testpub_parted"
 
@@ -146,10 +157,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                          Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -187,10 +198,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -234,10 +245,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -247,20 +258,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                                    List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                                       List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index b79a3f8f8f..7ddca1b974 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
 
 \dRp
 
@@ -77,6 +78,8 @@ RESET client_min_messages;
 CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
 DROP PUBLICATION testpub_forparted;
 
 -- fail - view
-- 
2.16.5
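For illustration (not part of the patch itself), the parameter added by 0002 would be used roughly as follows; this is a sketch based on the regression tests above, with illustrative object names, and assumes a server built with these patches:

```sql
-- Publish changes to all partitions of p using the schema of p itself,
-- so the subscriber side may use a differently partitioned table, or
-- even a non-partitioned one.
CREATE TABLE p (a int, b int) PARTITION BY HASH (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 0);

CREATE PUBLICATION publish_p FOR TABLE p
  WITH (publish_using_root_schema = true);

-- The parameter can also be changed after the fact:
ALTER PUBLICATION publish_p SET (publish_using_root_schema = false);
```

The default (false) preserves the existing behavior of publishing changes as being made to the individual leaf partitions.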

Attachment: v7-0004-Publish-partitioned-table-inserts-as-its-own.patch (application/octet-stream)
From a3068a630548574d29f42f2b297b3131cf5f56b5 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Wed, 13 Nov 2019 17:18:51 +0900
Subject: [PATCH v7 4/4] Publish partitioned table inserts as its own

---
 doc/src/sgml/logical-replication.sgml       |   8 +-
 src/backend/catalog/pg_publication.c        |  11 +-
 src/backend/commands/subscriptioncmds.c     | 103 ++++++-----
 src/backend/executor/nodeModifyTable.c      |   2 +
 src/backend/replication/logical/tablesync.c |  28 ++-
 src/backend/replication/logical/worker.c    | 277 ++++++++++++++++++++++++++--
 src/backend/replication/pgoutput/pgoutput.c | 191 +++++++++++++++----
 src/include/catalog/pg_publication.h        |   2 +-
 src/test/subscription/t/013_partition.pl    |  48 ++++-
 9 files changed, 545 insertions(+), 125 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 3d8cb0895d..1a4d5a9d25 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,12 +402,8 @@
 
    <listitem>
     <para>
-     Replication is only supported by regular and partitioned tables, although
-     the table kind must match between the two servers, that is, one cannot
-     replicate from a regular table into a partitioned able or vice versa.
-     Also, when replicating between partitioned tables, the actual replication
-     occurs between leaf partitions, so the partitions on the two servers must
-     match one-to-one.  Attempts to replicate other types of relations such as
+     Replication is only supported by regular and partitioned tables.
+     Attempts to replicate other types of relations such as
      views, materialized views, or foreign tables, will result in an error.
     </para>
    </listitem>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 5ef77f1014..19f16b3d43 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -243,20 +243,29 @@ GetRelationPublications(Oid relid)
 /*
  * Finds all publications that publish changes to the input relation's
  * ancestors.
+ *
+ * For each publication returned, *published_ancestors will contain the OID
+ * of the ancestor that belongs to it.  Values in this list can be repeated,
+ * because a given ancestor may belong to multiple publications.
  */
 List *
-GetRelationAncestorPublications(Oid relid)
+GetRelationAncestorPublications(Oid relid, List **published_ancestors)
 {
 	List	   *ancestors = get_partition_ancestors(relid);
 	List	   *ancestor_pubids = NIL;
 	ListCell   *lc;
 
+	*published_ancestors = NIL;
 	foreach(lc, ancestors)
 	{
 		Oid			ancestor = lfirst_oid(lc);
 		List	   *rel_publishers = GetRelationPublications(ancestor);
+		int			n = list_length(rel_publishers),
+					i;
 
 		ancestor_pubids = list_concat_copy(ancestor_pubids, rel_publishers);
+		for (i = 0; i < n; i++)
+			*published_ancestors = lappend_oid(*published_ancestors, ancestor);
 	}
 
 	return ancestor_pubids;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5c5c8ebe3b..6b29d18706 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -54,6 +54,15 @@ typedef struct PublishedTable
 	RangeVar   *rv;
 
 	char		relkind;
+
+	/*
+	 * If the published table is partitioned, the following being true means
+	 * its changes are published using its own schema rather than the schema
+	 * its individual partitions.  In the latter case, a separate
+	 * PublicationTable instance (and hence pg_subscription_rel entry) for
+	 * each partition will be needed.
+	 */
+	bool		published_using_root_schema;
 }			PublishedTable;
 
 static List *fetch_publication_tables(WalReceiverConn *wrconn, List *publications);
@@ -481,24 +490,13 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
 										 rv->schemaname, rv->relname);
 
 				/*
-				 * Currently, partitioned table replication occurs between leaf
-				 * partitions, so both the source and the target tables must be
-				 * partitioned.
+				 * A partitioned table doesn't need local state if the state
+				 * is managed for individual partitions, which is the case if
+				 * the partitioned table is published using the schema of its
+				 * partitions.
 				 */
-				if (pt->relkind == RELKIND_RELATION &&
-					local_relkind == RELKIND_PARTITIONED_TABLE)
-					ereport(ERROR,
-							(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-							 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-									rv->schemaname, rv->relname),
-							 errdetail("\"%s.%s\" is a partitioned table whereas it is a regular table on publication server.",
-									   rv->schemaname, rv->relname)));
-
-				/*
-				 * A partitioned table doesn't need local state, because the
-				 * state is managed for individual partitions instead.
-				 */
-				if (pt->relkind == RELKIND_PARTITIONED_TABLE)
+				if (pt->relkind == RELKIND_PARTITIONED_TABLE &&
+					!pt->published_using_root_schema)
 					continue;
 
 				AddSubscriptionRelState(subid, relid, table_state,
@@ -614,24 +612,12 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 								 rv->schemaname, rv->relname);
 
 		/*
-		 * Currently, partitioned table replication occurs between leaf
-		 * partitions, so both the source and the target tables must be
-		 * partitioned.
+		 * A partitioned table doesn't need local state if the state is
+		 * managed for individual partitions, which is the case if the
+		 * partitioned table is published using the schema of its partitions.
 		 */
-		if (pt->relkind == RELKIND_RELATION &&
-			local_relkind == RELKIND_PARTITIONED_TABLE)
-			ereport(ERROR,
-					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-					 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-							rv->schemaname, rv->relname),
-					 errdetail("\"%s.%s\" is a partitioned table whereas it is a regular table on publication server.",
-							   rv->schemaname, rv->relname)));
-
-		/*
-		 * A partitioned table doesn't need local state, because the
-		 * state is managed for individual partitions instead.
-		 */
-		if (pt->relkind == RELKIND_PARTITIONED_TABLE)
+		if (pt->relkind == RELKIND_PARTITIONED_TABLE &&
+			!pt->published_using_root_schema)
 			continue;
 
 		pubrel_local_oids[off++] = relid;
@@ -1191,7 +1177,7 @@ fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[3] = {TEXTOID, TEXTOID, CHAROID};
+	Oid			tableRow[4] = {TEXTOID, TEXTOID, CHAROID, BOOLOID};
 	ListCell   *lc;
 	bool		first;
 	List	   *tablelist = NIL;
@@ -1199,27 +1185,41 @@ fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 	Assert(list_length(publications) > 0);
 
 	initStringInfo(&cmd);
-	appendStringInfoString(&cmd, "SELECT DISTINCT s.schemaname, s.tablename, s.relkind FROM (\n"
-						   "  SELECT t.pubname, t.schemaname, t.tablename, c.relkind\n"
-						   "  FROM pg_catalog.pg_publication_tables t\n"
-						   "  JOIN pg_catalog.pg_class c \n"
-						   "  ON t.schemaname = c.relnamespace::pg_catalog.regnamespace::name\n"
-						   "  AND t.tablename = c.relname \n");
+	appendStringInfoString(&cmd, "SELECT DISTINCT s.schemaname, s.tablename, s.relkind, s.pubasroot FROM (\n");
 
 	/*
 	 * As of v13, partitioned tables can be published, although their changes
-	 * are published as their partitions', so we will need the partitions in
-	 * the result.
+	 * may be published either as their own or as their partitions', which is
+	 * determined by pg_publication.pubasroot (whether the publication
+	 * publishes using the root partitioned table's schema).
+	 */
+	if (walrcv_server_version(wrconn) >= 130000)
+		appendStringInfoString(&cmd, "  SELECT t.pubname, t.schemaname, t.tablename, c.relkind, p.pubasroot\n");
+	else
+		appendStringInfoString(&cmd, "  SELECT t.pubname, t.schemaname, t.tablename, c.relkind, false AS pubasroot\n");
+
+	appendStringInfoString(&cmd, "  FROM pg_catalog.pg_publication_tables t\n"
+						   "  JOIN pg_catalog.pg_publication p ON t.pubname = p.pubname\n"
+						   "  JOIN pg_catalog.pg_class c\n"
+						   "  ON t.schemaname = c.relnamespace::pg_catalog.regnamespace::pg_catalog.name\n"
+						   "  AND t.tablename = c.relname\n");
+
+	/*
+	 * If a publication doesn't publish using the root table's schema, we will
+	 * need its partitions in the result.
 	 */
 	if (walrcv_server_version(wrconn) >= 130000)
 		appendStringInfoString(&cmd, "  UNION\n"
-						   "  SELECT t.pubname, s.schemaname, s.tablename, s.relkind\n"
-						   "  FROM pg_catalog.pg_publication_tables t,\n"
-						   "  LATERAL (SELECT c.relnamespace::regnamespace::name, c.relname, c.relkind\n"
-						   "		   FROM pg_class c\n"
-						   "		   JOIN pg_partition_tree(t.schemaname || '.' || t.tablename) p\n"
-						   "		   ON p.relid = c.oid\n"
-						   "		   WHERE p.level > 0) AS s(schemaname, tablename, relkind)\n");
+							   "  SELECT DISTINCT t.pubname, s.schemaname, s.tablename, c.relkind, false AS pubasroot\n"
+							   "  FROM pg_catalog.pg_publication_tables t\n"
+							   "  JOIN pg_catalog.pg_publication p ON t.pubname = p.pubname AND NOT p.pubasroot,\n"
+							   "  LATERAL (SELECT c.relnamespace::pg_catalog.regnamespace::pg_catalog.name, c.relname\n"
+							   "		   FROM pg_catalog.pg_class c\n"
+							   "		   JOIN pg_catalog.pg_partition_tree(t.schemaname || '.' || t.tablename) p\n"
+							   "		   ON p.relid = c.oid\n"
+							   "		   WHERE p.level > 0) AS s(schemaname, tablename)\n"
+							   "  JOIN pg_catalog.pg_class c ON s.schemaname = c.relnamespace::pg_catalog.regnamespace::pg_catalog.name\n"
+							   "  AND s.tablename = c.relname\n");
 
 	appendStringInfoString(&cmd, ") s WHERE s.pubname IN (");
 
@@ -1237,7 +1237,7 @@ fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 	}
 	appendStringInfoChar(&cmd, ')');
 
-	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 4, tableRow);
 	pfree(cmd.data);
 
 	if (res->status != WALRCV_OK_TUPLES)
@@ -1260,6 +1260,7 @@ fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 		Assert(!isnull);
 		pt->rv = makeRangeVar(pstrdup(nspname), pstrdup(relname), -1);
 		pt->relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+		pt->published_using_root_schema = DatumGetBool(slot_getattr(slot, 4, &isnull));
 		Assert(!isnull);
 
 		tablelist = lappend(tablelist, pt);
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 2676ae281e..ad27fed388 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2299,6 +2299,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		CheckValidResultRel(mtstate->rootResultRelInfo,
+							mtstate->rootResultRelInfo, operation);
 		rootResultRelInfo = mtstate->rootResultRelInfo;
 	}
 
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index ec387ba768..56c1e28e1b 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -630,16 +630,17 @@ copy_read_data(void *outbuf, int minread, int maxread)
 
 /*
  * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication.  XXX - we also fetch relkind here,
+ * which the RELATION message doesn't provide.
  */
 static void
 fetch_remote_table_info(char *nspname, char *relname,
-						LogicalRepRelation *lrel)
+						LogicalRepRelation *lrel, char *relkind)
 {
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {OIDOID, CHAROID};
+	Oid			tableRow[3] = {OIDOID, CHAROID, CHAROID};
 	Oid			attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
 	bool		isnull;
 	int			natt;
@@ -649,16 +650,16 @@ fetch_remote_table_info(char *nspname, char *relname,
 
 	/* First fetch Oid and replica identity. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident"
+	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident, c.relkind"
 					 "  FROM pg_catalog.pg_class c"
 					 "  INNER JOIN pg_catalog.pg_namespace n"
 					 "        ON (c.relnamespace = n.oid)"
 					 " WHERE n.nspname = %s"
 					 "   AND c.relname = %s"
-					 "   AND c.relkind = 'r'",
+					 "   AND pg_relation_is_publishable(c.oid)",
 					 quote_literal_cstr(nspname),
 					 quote_literal_cstr(relname));
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 
 	if (res->status != WALRCV_OK_TUPLES)
 		ereport(ERROR,
@@ -675,6 +676,8 @@ fetch_remote_table_info(char *nspname, char *relname,
 	Assert(!isnull);
 	lrel->replident = DatumGetChar(slot_getattr(slot, 2, &isnull));
 	Assert(!isnull);
+	*relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+	Assert(!isnull);
 
 	ExecDropSingleTupleTableSlot(slot);
 	walrcv_clear_result(res);
@@ -750,10 +753,12 @@ copy_table(Relation rel)
 	CopyState	cstate;
 	List	   *attnamelist;
 	ParseState *pstate;
+	char		remote_relkind;
 
 	/* Get the publisher relation info. */
 	fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
-							RelationGetRelationName(rel), &lrel);
+							RelationGetRelationName(rel), &lrel,
+							&remote_relkind);
 
 	/* Put the relation into relmap. */
 	logicalrep_relmap_update(&lrel);
@@ -761,12 +766,15 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
-	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "COPY %s TO STDOUT",
-					 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	if (remote_relkind == RELKIND_PARTITIONED_TABLE)
+		appendStringInfo(&cmd, "COPY (SELECT * FROM %s) TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	else
+		appendStringInfo(&cmd, "COPY %s TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
 	res = walrcv_exec(wrconn, cmd.data, 0, NULL);
 	pfree(cmd.data);
 	if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 2686fccdc2..728adaa612 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -29,11 +29,14 @@
 #include "access/xlog_internal.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_subscription.h"
 #include "catalog/pg_subscription_rel.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "executor/nodeModifyTable.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
@@ -721,6 +724,168 @@ apply_handle_do_delete(ResultRelInfo *relinfo, EState *estate,
 	EvalPlanQualEnd(&epqstate);
 }
 
+/*
+ * This handles insert, update, delete on a partitioned table.
+ */
+static void
+apply_handle_tuple_routing(ResultRelInfo *relinfo,
+						   LogicalRepRelMapEntry *relmapentry,
+						   EState *estate, CmdType operation,
+						   TupleTableSlot *remoteslot,
+						   LogicalRepTupleData *newtup)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	ModifyTableState *mtstate = NULL;
+	PartitionTupleRouting *proute = NULL;
+	ResultRelInfo *partrelinfo,
+			   *partrelinfo1;
+	TupleTableSlot *localslot;
+	PartitionRoutingInfo *partinfo;
+	TupleConversionMap *map;
+	MemoryContext oldctx;
+
+	/* ModifyTableState is needed for ExecFindPartition(). */
+	mtstate = makeNode(ModifyTableState);
+	mtstate->ps.plan = NULL;
+	mtstate->ps.state = estate;
+	mtstate->operation = operation;
+	mtstate->resultRelInfo = relinfo;
+	proute = ExecSetupPartitionTupleRouting(estate, mtstate, rel);
+
+	/*
+	 * Find a partition for the tuple contained in remoteslot.
+	 *
+	 * For insert, remoteslot is tuple to insert.  For update and delete, it
+	 * is the tuple to be replaced and deleted, respectively.
+	 */
+	Assert(remoteslot != NULL);
+	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+	/* The following throws error if a suitable partition is not found. */
+	partrelinfo = ExecFindPartition(mtstate, relinfo, proute,
+									remoteslot, estate);
+	Assert(partrelinfo != NULL);
+	/* Convert the tuple to match the partition's rowtype. */
+	partinfo = partrelinfo->ri_PartitionInfo;
+	map = partinfo->pi_RootToPartitionMap;
+	if (map != NULL)
+	{
+		TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+		remoteslot = execute_attr_map_slot(map->attrMap, remoteslot,
+										   part_slot);
+	}
+	MemoryContextSwitchTo(oldctx);
+
+	switch (operation)
+	{
+		case CMD_INSERT:
+			/* Just insert into the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_do_insert(partrelinfo, estate, remoteslot);
+			break;
+
+		case CMD_DELETE:
+			/* Just delete from the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_do_delete(partrelinfo, estate, remoteslot,
+								   &relmapentry->remoterel);
+			break;
+
+		case CMD_UPDATE:
+
+			/*
+			 * partrelinfo computed above is the partition which might contain
+			 * the search tuple.  Now find the partition for the replacement
+			 * tuple, which might not be the same as partrelinfo.
+			 */
+			localslot = table_slot_create(rel, &estate->es_tupleTable);
+			oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+			slot_modify_cstrings(localslot, remoteslot,
+								 newtup->values, newtup->changed,
+								 relmapentry->attrmap, &relmapentry->remoterel,
+								 RelationGetRelid(rel));
+			partrelinfo1 = ExecFindPartition(mtstate, relinfo, proute,
+											 localslot, estate);
+			MemoryContextSwitchTo(oldctx);
+
+			/*
+			 * If both the search and the replacement tuple would be in the
+			 * same partition, we can apply this as an UPDATE on the partition.
+			 */
+			if (partrelinfo == partrelinfo1)
+			{
+				AttrNumber *attrmap = relmapentry->attrmap;
+
+				/*
+				 * If the partition's attributes don't match the root
+				 * relation's, we'll need to make a new attrmap mapping
+				 * partition attribute numbers to remoterel's.
+				 */
+				if (map)
+				{
+					TupleDesc	partdesc = RelationGetDescr(partrelinfo1->ri_RelationDesc);
+					TupleDesc	rootdesc = RelationGetDescr(rel);
+					AttrNumber *partToRootMap,
+								attno;
+
+					/* Need the reverse map here */
+					partToRootMap = convert_tuples_by_name_map(partdesc, rootdesc);
+					attrmap = palloc(partdesc->natts * sizeof(AttrNumber));
+					memset(attrmap, -1, partdesc->natts * sizeof(AttrNumber));
+					for (attno = 0; attno < partdesc->natts; attno++)
+					{
+						AttrNumber	root_attno = partToRootMap[attno];
+
+						attrmap[attno] = relmapentry->attrmap[root_attno - 1];
+					}
+				}
+
+				/* UPDATE partition. */
+				estate->es_result_relation_info = partrelinfo;
+				apply_handle_do_update(partrelinfo, estate, remoteslot,
+									   newtup, attrmap,
+									   &relmapentry->remoterel);
+				if (attrmap != relmapentry->attrmap)
+					pfree(attrmap);
+			}
+			else
+			{
+				/* Different, so handle this as DELETE followed by INSERT. */
+
+				/* DELETE from partition partrelinfo. */
+				estate->es_result_relation_info = partrelinfo;
+				apply_handle_do_delete(partrelinfo, estate, remoteslot,
+									   &relmapentry->remoterel);
+
+				/*
+				 * Convert the replacement tuple to match the destination
+				 * partition rowtype.
+				 */
+				oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+				partinfo = partrelinfo1->ri_PartitionInfo;
+				map = partinfo->pi_RootToPartitionMap;
+				if (map != NULL)
+				{
+					TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+					localslot = execute_attr_map_slot(map->attrMap, localslot,
+													  part_slot);
+				}
+				MemoryContextSwitchTo(oldctx);
+				/* INSERT into partition partrelinfo1. */
+				estate->es_result_relation_info = partrelinfo1;
+				apply_handle_do_insert(partrelinfo1, estate, localslot);
+			}
+			break;
+
+		default:
+			elog(ERROR, "unrecognized CmdType: %d", (int) operation);
+			break;
+	}
+
+	ExecCleanupTupleRouting(mtstate, proute);
+}
+
 /*
  * Handle INSERT message.
  */
@@ -763,9 +928,13 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_insert(estate->es_result_relation_info, estate,
-						   remoteslot);
+	/* For a partitioned table, insert the tuple into a partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, rel,
+								   estate, CMD_INSERT, remoteslot, NULL);
+	else
+		apply_handle_do_insert(estate->es_result_relation_info, estate,
+							   remoteslot);
 
 	PopActiveSnapshot();
 
@@ -863,10 +1032,14 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_update(estate->es_result_relation_info, estate,
-						   remoteslot, &newtup, rel->attrmap,
-						   &rel->remoterel);
+	/* For a partitioned table, apply update to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, rel,
+								   estate, CMD_UPDATE, remoteslot, &newtup);
+	else
+		apply_handle_do_update(estate->es_result_relation_info, estate,
+							   remoteslot, &newtup, rel->attrmap,
+							   &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -928,9 +1101,13 @@ apply_handle_delete(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_delete(estate->es_result_relation_info, estate,
-						   remoteslot, &rel->remoterel);
+	/* For a partitioned table, apply delete to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, rel,
+								   estate, CMD_DELETE, remoteslot, NULL);
+	else
+		apply_handle_do_delete(estate->es_result_relation_info, estate,
+							   remoteslot, &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -972,14 +1149,43 @@ apply_handle_truncate(StringInfo s)
 		LogicalRepRelMapEntry *rel;
 
 		rel = logicalrep_rel_open(relid, RowExclusiveLock);
+
 		if (!should_apply_changes_for_rel(rel))
 		{
+			bool		really_skip = true;
+
+			/*
+			 * If we seem to have been sent a leaf partition because an
+			 * ancestor was truncated, confirm that an ancestor indeed has a
+			 * valid subscription state before proceeding to truncate the
+			 * partition.
+			 */
+			if (rel->state == SUBREL_STATE_UNKNOWN &&
+				rel->localrel->rd_rel->relispartition)
+			{
+				List	   *ancestors = get_partition_ancestors(rel->localreloid);
+				ListCell   *lc1;
+
+				foreach(lc1, ancestors)
+				{
+					Oid			anc_oid = lfirst_oid(lc1);
+					LogicalRepRelMapEntry *anc_rel;
+
+					anc_rel = logicalrep_rel_open(anc_oid, RowExclusiveLock);
+					really_skip &= !should_apply_changes_for_rel(anc_rel);
+					logicalrep_rel_close(anc_rel, RowExclusiveLock);
+				}
+			}
+
 			/*
 			 * The relation can't become interesting in the middle of the
 			 * transaction so it's safe to unlock it.
 			 */
-			logicalrep_rel_close(rel, RowExclusiveLock);
-			continue;
+			if (really_skip)
+			{
+				logicalrep_rel_close(rel, RowExclusiveLock);
+				continue;
+			}
 		}
 
 		remote_rels = lappend(remote_rels, rel);
@@ -987,6 +1193,47 @@ apply_handle_truncate(StringInfo s)
 		relids = lappend_oid(relids, rel->localreloid);
 		if (RelationIsLogicallyLogged(rel->localrel))
 			relids_logged = lappend_oid(relids_logged, rel->localreloid);
+
+		if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		{
+			ListCell   *child;
+			List	   *children = find_all_inheritors(rel->localreloid,
+													   RowExclusiveLock,
+													   NULL);
+
+			foreach(child, children)
+			{
+				Oid			childrelid = lfirst_oid(child);
+				Relation	childrel;
+
+				if (list_member_oid(relids, childrelid))
+					continue;
+
+				/* find_all_inheritors already got lock */
+				childrel = table_open(childrelid, NoLock);
+
+				/*
+				 * It is possible that the parent table has children that are
+				 * temp tables of other backends.  We cannot safely access
+				 * such tables (because of buffering issues), and the best
+				 * thing to do is to silently ignore them.  Note that this
+				 * check is the same as one of the checks done in
+				 * truncate_check_activity() called below, still it is kept
+				 * here for simplicity.
+				 */
+				if (RELATION_IS_OTHER_TEMP(childrel))
+				{
+					table_close(childrel, RowExclusiveLock);
+					continue;
+				}
+
+				rels = lappend(rels, childrel);
+				relids = lappend_oid(relids, childrelid);
+				/* Log this relation only if needed for logical decoding */
+				if (RelationIsLogicallyLogged(childrel))
+					relids_logged = lappend_oid(relids_logged, childrelid);
+			}
+		}
 	}
 
 	/*
@@ -996,11 +1243,11 @@ apply_handle_truncate(StringInfo s)
 	 */
 	ExecuteTruncateGuts(rels, relids, relids_logged, DROP_RESTRICT, restart_seqs);
 
-	foreach(lc, remote_rels)
+	foreach(lc, rels)
 	{
-		LogicalRepRelMapEntry *rel = lfirst(lc);
+		Relation	rel = lfirst(lc);
 
-		logicalrep_rel_close(rel, NoLock);
+		table_close(rel, NoLock);
 	}
 
 	CommandCounterIncrement();
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 8dc78f1779..5e07b3776e 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -49,6 +50,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +61,22 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If ancestor relid is set, its schema must also
+	 * have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * Valid if publishing the relation's changes as changes to some ancestor,
+	 * that is, if the relation is a partition.  The map, if any, will be used
+	 * to convert the tuples from the partition's rowtype to the ancestor's.
+	 */
+	Oid			replicate_as_relid;
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +274,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
+
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		desc = RelationGetDescr(relation);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+/*
+ * Send a relation's schema, including type info for user-defined columns.
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+		if (att->attisdropped || att->attgenerated)
+			continue;
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +386,65 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						oldtuple = oldtuple ? execute_attr_map_tuple(oldtuple, relentry->map) : NULL;
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -411,6 +488,28 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		if (!relentry->pubactions.pubtruncate)
 			continue;
 
+		/*
+		 * If this partition was not *directly* truncated, don't bother
+		 * sending it to the subscriber.
+		 */
+		if (OidIsValid(relentry->replicate_as_relid))
+		{
+			int			j;
+			bool		can_skip_part_trunc = false;
+
+			for (j = 0; j < nrelids; j++)
+			{
+				if (relentry->replicate_as_relid == relids[j])
+				{
+					can_skip_part_trunc = true;
+					break;
+				}
+			}
+
+			if (can_skip_part_trunc)
+				continue;
+		}
+
 		relids[nrelids++] = relid;
 		maybe_send_schema(ctx, relation, relentry);
 	}
@@ -529,6 +628,11 @@ init_rel_sync_cache(MemoryContext cachectx)
 
 /*
  * Find or create entry in the relation schema cache.
+ *
+ * For a partition, the schema of the topmost published ancestor will be
+ * used in some cases instead of that of the partition itself, so information
+ * about the ancestors' publications is looked up here and saved in the
+ * schema cache entry.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Relation rel)
@@ -553,8 +657,11 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 	{
 		List	   *pubids = GetRelationPublications(relid);
 		ListCell   *lc,
-				   *lc1;
+				   *lc1,
+				   *lc2;
 		List	   *ancestor_pubids = NIL;
+		List	   *published_ancestors = NIL;
+		Oid			topmost_published_ancestor = InvalidOid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -579,7 +686,9 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 		/* For partitions, also consider publications of ancestors. */
 		if (rel->rd_rel->relispartition)
 			ancestor_pubids =
-				GetRelationAncestorPublications(RelationGetRelid(rel));
+				GetRelationAncestorPublications(RelationGetRelid(rel),
+												&published_ancestors);
+		Assert(list_length(ancestor_pubids) == list_length(published_ancestors));
 
 		foreach(lc, data->publications)
 		{
@@ -597,7 +706,7 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 				entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
 				break;
 
-			foreach(lc1, ancestor_pubids)
+			forboth(lc1, ancestor_pubids, lc2, published_ancestors)
 			{
 				if (lfirst_oid(lc1) == pub->oid)
 				{
@@ -605,6 +714,8 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 					entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
 					entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
 					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+					if (pub->publish_using_root_schema)
+						topmost_published_ancestor = lfirst_oid(lc2);
 				}
 			}
 
@@ -615,7 +726,9 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 
 		list_free(pubids);
 		list_free(ancestor_pubids);
+		list_free(published_ancestors);
 
+		entry->replicate_as_relid = topmost_published_ancestor;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 61d338b110..15bf4a7d4c 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,7 +83,7 @@ typedef struct Publication
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
 extern List *GetRelationPublications(Oid relid);
-extern List *GetRelationAncestorPublications(Oid relid);
+extern List *GetRelationAncestorPublications(Oid relid, List **published_ancestors);
 extern List *GetPublicationRelations(Oid pubid);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index eb0f1cd6a8..957c7b4be1 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 10;
+use Test::More tests => 16;
 
 # setup
 
@@ -41,21 +41,38 @@ $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
 
 $node_subscriber1->safe_psql('postgres',
-	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6) PARTITION BY LIST (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_1 PARTITION OF tab1_2 FOR VALUES IN (5)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
 
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
 
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
+
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub1 FOR TABLE tab1, tab1_1");
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub2 FOR TABLE tab1_2");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub3 FOR TABLE tab1 WITH (publish_using_root_schema = true)");
 
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
 
 $node_subscriber2->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub2");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub3");
 
 # Wait for initial sync of all subscriptions
 my $synced_query =
@@ -85,17 +102,26 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|1|5), 'inserts into tab1 replicated');
+
 # update a row (no partition change)
 
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 2 WHERE a = 1");
 
 $node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|2|5), 'update of tab1_1 replicated');
+
 # update a row (partition changes)
 
 $node_publisher->safe_psql('postgres',
@@ -112,6 +138,10 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|3|6), 'delete from tab1_1 replicated');
+
 # delete rows (some from the root parent, some directly from the partition)
 
 $node_publisher->safe_psql('postgres',
@@ -130,12 +160,18 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'delete from tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
+
 # truncate (root parent and partition directly)
 
 $node_subscriber1->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1), (2), (5)");
 $node_subscriber2->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (5)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
 
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1_2");
@@ -151,6 +187,10 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'truncate of tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(3|1|5), 'no change, because only truncate of tab1 will be replicated');
+
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1");
 
@@ -159,3 +199,7 @@ $node_publisher->wait_for_catchup('sub1');
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'truncate of tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1_1 replicated');
-- 
2.16.5

#26Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Langote (#25)
Re: adding partitioned tables to publications

On Mon, Dec 16, 2019 at 2:50 PM Amit Langote <amitlangote09@gmail.com> wrote:

Thanks for checking.

On Thu, Dec 12, 2019 at 12:48 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2019-12-06 08:48, Amit Langote wrote:

0001: Adding a partitioned table to a publication implicitly adds all
its partitions. The receiving side must have tables matching the
published partitions, which is typically the case, because the same
partition tree is defined on both nodes.
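
The behavior described for 0001 can be sketched in SQL (table and publication names here are hypothetical, chosen only for illustration):

```sql
-- Publisher: adding the partitioned root implicitly publishes every
-- existing and future partition.
CREATE TABLE meas (ts timestamptz, val int) PARTITION BY RANGE (ts);
CREATE TABLE meas_2020 PARTITION OF meas
  FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');
CREATE PUBLICATION pub_meas FOR TABLE meas;

-- Subscriber: the same partition tree must exist, because changes are
-- still replicated as changes to the individual leaf partitions.
CREATE TABLE meas (ts timestamptz, val int) PARTITION BY RANGE (ts);
CREATE TABLE meas_2020 PARTITION OF meas
  FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');
CREATE SUBSCRIPTION sub_meas
  CONNECTION 'host=publisher dbname=postgres' PUBLICATION pub_meas;
```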

This looks pretty good to me now. But you need to make all the changed
queries version-aware so that you can still replicate from and to older
versions. (For example, pg_partition_tree is not very old.)

True, fixed that.

This part looks a bit fishy:

+       /*
+        * If either table is partitioned, skip copying.  Individual
partitions
+        * will be copied instead.
+        */
+       if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE ||
+               remote_relkind == RELKIND_PARTITIONED_TABLE)
+       {
+               logicalrep_rel_close(relmapentry, NoLock);
+               return;
+       }

I don't think you want to filter out a partitioned table on the local
side, since (a) COPY can handle that, and (b) it's (as of this patch) an
error to have a partitioned table in the subscription table set.

Yeah, (b) is true, so copy_table() should only ever see regular tables
with only patch 0001 applied.

I'm not a fan of the new ValidateSubscriptionRel() function. It's too
obscure, especially the return value. Doesn't seem worth it.

It went through many variants since I first introduced it, but yeah I
agree we don't need it if only because of the weird interface.

It occurred to me that, *as of 0001*, we should indeed disallow
replicating from a regular table on publisher node into a partitioned
table of the same name on subscriber node (as the earlier patches
did), because 0001 doesn't implement tuple routing support that would
be needed to apply such changes.
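
As a sketch of that restriction (names hypothetical): with only 0001 applied, a subscriber-side partitioned table cannot stand in for a regular published table, since applying the changes would require tuple routing:

```sql
-- Publisher: a regular (non-partitioned) table.
CREATE TABLE t (a int PRIMARY KEY);
CREATE PUBLICATION pub_t FOR TABLE t;

-- Subscriber: same name, but partitioned.  With 0001 alone this is
-- rejected when the subscription is created or refreshed.
CREATE TABLE t (a int PRIMARY KEY) PARTITION BY HASH (a);
CREATE TABLE t_p0 PARTITION OF t FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE t_p1 PARTITION OF t FOR VALUES WITH (MODULUS 2, REMAINDER 1);
CREATE SUBSCRIPTION sub_t
  CONNECTION 'host=publisher dbname=postgres' PUBLICATION pub_t;
-- ERROR:  cannot use relation "public.t" as logical replication target
```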

Attached updated patches.

I am planning to review this patch. Currently, it does not apply on
HEAD, so could you rebase it?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#27Rafia Sabih
rafia.pghackers@gmail.com
In reply to: Amit Langote (#25)
Re: adding partitioned tables to publications

Hi Amit,

I went through this patch set once again today and here are my two cents.

On Mon, 16 Dec 2019 at 10:19, Amit Langote <amitlangote09@gmail.com> wrote:

Attached updated patches.

-     differently partitioned setup.  Attempts to replicate tables other than
-     base tables will result in an error.
+     Replication is only supported by regular and partitioned tables, although
+     the table kind must match between the two servers, that is, one cannot

I find the phrase 'table kind' a bit odd; how about something like
'the type of the table'?

/* Only plain tables can be added to publications. */
- if (tbinfo->relkind != RELKIND_RELATION)
+ /* Only plain and partitioned tables can be added to publications. */
IMHO using regular instead of plain would be more consistent.
+ /*
+ * Find a partition for the tuple contained in remoteslot.
+ *
+ * For insert, remoteslot is tuple to insert.  For update and delete, it
+ * is the tuple to be replaced and deleted, repectively.
+ */
Misspelled 'respectively'.
+static void
+apply_handle_tuple_routing(ResultRelInfo *relinfo,
+    LogicalRepRelMapEntry *relmapentry,
+    EState *estate, CmdType operation,
+    TupleTableSlot *remoteslot,
+    LogicalRepTupleData *newtup)
+{
+ Relation rel = relinfo->ri_RelationDesc;
+ ModifyTableState *mtstate = NULL;
+ PartitionTupleRouting *proute = NULL;
+ ResultRelInfo *partrelinfo,
+    *partrelinfo1;

IMHO, partrelinfo1 can be better named to improve readability.

Otherwise, as Dilip already mentioned, there is a rebase required
particularly for 0003 and 0004.

--
Regards,
Rafia Sabih

#28Amit Langote
amitlangote09@gmail.com
In reply to: Rafia Sabih (#27)
4 attachment(s)
Re: adding partitioned tables to publications

On Mon, Jan 6, 2020 at 8:25 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:

Hi Amit,

I went through this patch set once again today and here are my two cents.

Thanks Rafia.

Rebased and updated to address your comments.

Regards,
Amit

Attachments:

v8-0001-Support-adding-partitioned-tables-to-publication.patchtext/plain; charset=US-ASCII; name=v8-0001-Support-adding-partitioned-tables-to-publication.patchDownload
From 8a1b409f3217e53d557288fac3c0843e53d0710e Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 7 Nov 2019 18:19:33 +0900
Subject: [PATCH v8 1/4] Support adding partitioned tables to publication

---
 doc/src/sgml/logical-replication.sgml       |  15 +--
 doc/src/sgml/ref/create_publication.sgml    |  27 +++--
 src/backend/catalog/pg_publication.c        |  42 +++++---
 src/backend/commands/copy.c                 |   2 +-
 src/backend/commands/publicationcmds.c      |  12 ++-
 src/backend/commands/subscriptioncmds.c     | 117 +++++++++++++++++---
 src/backend/executor/execMain.c             |   7 +-
 src/backend/executor/execPartition.c        |   5 +-
 src/backend/executor/execReplication.c      |  47 ++++----
 src/backend/executor/nodeModifyTable.c      |   6 +-
 src/backend/replication/logical/tablesync.c |   1 +
 src/backend/replication/pgoutput/pgoutput.c |  41 +++++--
 src/bin/pg_dump/pg_dump.c                   |   8 +-
 src/include/catalog/pg_publication.h        |   1 +
 src/include/executor/executor.h             |   8 +-
 src/test/regress/expected/publication.out   |  21 +++-
 src/test/regress/sql/publication.sql        |  12 ++-
 src/test/subscription/t/013_partition.pl    | 161 ++++++++++++++++++++++++++++
 18 files changed, 447 insertions(+), 86 deletions(-)
 create mode 100644 src/test/subscription/t/013_partition.pl

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f657d1d06e..4584cb82f6 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,13 +402,14 @@
 
    <listitem>
     <para>
-     Replication is only possible from base tables to base tables.  That is,
-     the tables on the publication and on the subscription side must be normal
-     tables, not views, materialized views, partition root tables, or foreign
-     tables.  In the case of partitions, you can therefore replicate a
-     partition hierarchy one-to-one, but you cannot currently replicate to a
-     differently partitioned setup.  Attempts to replicate tables other than
-     base tables will result in an error.
+     Replication is only supported by regular and partitioned tables, although
+     the type of the table must match between the two servers, that is, one
+     cannot replicate from a regular table into a partitioned table or vice
+     versa. Also, when replicating between partitioned tables, the actual
+     replication occurs between leaf partitions, so the partitions on the two
+     servers must match one-to-one.  Attempts to replicate other types of
+     relations such as views, materialized views, or foreign tables, will
+     result in an error.
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..848779a00f 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -68,15 +68,25 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
       that table is added to the publication.  If <literal>ONLY</literal> is not
       specified, the table and all its descendant tables (if any) are added.
       Optionally, <literal>*</literal> can be specified after the table name to
-      explicitly indicate that descendant tables are included.
+      explicitly indicate that descendant tables are included.  However, adding
+      a partitioned table to a publication never explicitly adds its partitions,
+      because partitions are implicitly published due to the partitioned table
+      being added to the publication.
      </para>
 
      <para>
-      Only persistent base tables can be part of a publication.  Temporary
-      tables, unlogged tables, foreign tables, materialized views, regular
-      views, and partitioned tables cannot be part of a publication.  To
-      replicate a partitioned table, add the individual partitions to the
-      publication.
+      Only persistent base tables and partitioned tables can be part of a
+      publication. Temporary tables, unlogged tables, foreign tables,
+      materialized views, and regular views cannot be part of a publication.
+     </para>
+
+     <para>
+      When a partitioned table is added to a publication, all of its existing
+      and future partitions are also implicitly considered to be part of the
+      publication.  So, any <command>INSERT</command>, <command>UPDATE</command>,
+      <command>DELETE</command>, and <command>TRUNCATE</command> operations
+      that are directly applied to a partition are also published via its
+      ancestors' publications.
      </para>
     </listitem>
    </varlistentry>
@@ -132,6 +142,11 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
    empty set of tables.  That is useful if tables are to be added later.
   </para>
 
+  <para>
+   Partitioned tables are not considered when <literal>FOR ALL TABLES</literal>
+   is specified.
+  </para>
+
   <para>
    The creation of a publication does not start replication.  It only defines
    a grouping and filtering logic for future subscribers.
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index c5eea7af3f..fb369dbe17 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -26,6 +26,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
 #include "catalog/pg_type.h"
@@ -47,17 +48,9 @@
 static void
 check_publication_add_relation(Relation targetrel)
 {
-	/* Give more specific error for partitioned tables */
-	if (RelationGetForm(targetrel)->relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("\"%s\" is a partitioned table",
-						RelationGetRelationName(targetrel)),
-				 errdetail("Adding partitioned tables to publications is not supported."),
-				 errhint("You can add the table partitions individually.")));
-
-	/* Must be table */
-	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION)
+	/* Must be a regular or partitioned table */
+	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("\"%s\" is not a table",
@@ -103,7 +96,8 @@ check_publication_add_relation(Relation targetrel)
 static bool
 is_publishable_class(Oid relid, Form_pg_class reltuple)
 {
-	return reltuple->relkind == RELKIND_RELATION &&
+	return (reltuple->relkind == RELKIND_RELATION ||
+			reltuple->relkind == RELKIND_PARTITIONED_TABLE) &&
 		!IsCatalogRelationOid(relid) &&
 		reltuple->relpersistence == RELPERSISTENCE_PERMANENT &&
 		relid >= FirstNormalObjectId;
@@ -230,7 +224,7 @@ GetRelationPublications(Oid relid)
 	CatCList   *pubrellist;
 	int			i;
 
-	/* Find all publications associated with the relation. */
+	/* Finds all publications associated with the relation. */
 	pubrellist = SearchSysCacheList1(PUBLICATIONRELMAP,
 									 ObjectIdGetDatum(relid));
 	for (i = 0; i < pubrellist->n_members; i++)
@@ -246,6 +240,28 @@ GetRelationPublications(Oid relid)
 	return result;
 }
 
+/*
+ * Finds all publications that publish changes to the input relation's
+ * ancestors.
+ */
+List *
+GetRelationAncestorPublications(Oid relid)
+{
+	List	   *ancestors = get_partition_ancestors(relid);
+	List	   *ancestor_pubids = NIL;
+	ListCell   *lc;
+
+	foreach(lc, ancestors)
+	{
+		Oid			ancestor = lfirst_oid(lc);
+		List	   *rel_publishers = GetRelationPublications(ancestor);
+
+		ancestor_pubids = list_concat_copy(ancestor_pubids, rel_publishers);
+	}
+
+	return ancestor_pubids;
+}
+
 /*
  * Gets list of relation oids for a publication.
  *
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index c93a788798..5a75419caf 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2837,7 +2837,7 @@ CopyFrom(CopyState cstate)
 	target_resultRelInfo = resultRelInfo;
 
 	/* Verify the named relation is a valid target for INSERT */
-	CheckValidResultRel(resultRelInfo, CMD_INSERT);
+	CheckValidResultRel(resultRelInfo, NULL, CMD_INSERT);
 
 	ExecOpenIndices(resultRelInfo, false);
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f96cb42adc..8f38c63ad2 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -498,7 +498,8 @@ RemovePublicationRelById(Oid proid)
 
 /*
  * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode.
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
  */
 static List *
 OpenTableList(List *tables)
@@ -539,8 +540,13 @@ OpenTableList(List *tables)
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 
-		/* Add children of this rel, if requested */
-		if (recurse)
+		/*
+		 * Add children of this rel, if requested, so that they too are added
+		 * to the publication.  A partitioned table can't have any inheritance
+		 * children other than its partitions, which need not be explicitly
+		 * added to the publication.
+		 */
+		if (recurse && rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
 		{
 			List	   *children;
 			ListCell   *child;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 95962b4a3e..786b15eb27 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -44,7 +44,19 @@
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 
-static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
+/*
+ * Structure used by fetch_publication_tables to describe a published table.
+ * The information is used by the callers of fetch_publication_tables to
+ * generate a pg_subscription_rel catalog entry for the table.
+ */
+typedef struct PublishedTable
+{
+	RangeVar   *rv;
+
+	char		relkind;
+}			PublishedTable;
+
+static List *fetch_publication_tables(WalReceiverConn *wrconn, List *publications);
 
 /*
  * Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -453,18 +465,42 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
 			 * Get the table list from publisher and build local table status
 			 * info.
 			 */
-			tables = fetch_table_list(wrconn, publications);
+			tables = fetch_publication_tables(wrconn, publications);
 			foreach(lc, tables)
 			{
-				RangeVar   *rv = (RangeVar *) lfirst(lc);
+				PublishedTable *pt = (PublishedTable *) lfirst(lc);
+				RangeVar   *rv = pt->rv;
 				Oid			relid;
+				char		local_relkind;
 
 				relid = RangeVarGetRelid(rv, AccessShareLock, false);
+				local_relkind = get_rel_relkind(relid);
 
 				/* Check for supported relkind. */
-				CheckSubscriptionRelkind(get_rel_relkind(relid),
+				CheckSubscriptionRelkind(local_relkind,
 										 rv->schemaname, rv->relname);
 
+				/*
+				 * Currently, partitioned table replication occurs between leaf
+				 * partitions, so both the source and the target tables must be
+				 * partitioned.
+				 */
+				if (pt->relkind == RELKIND_RELATION &&
+					local_relkind == RELKIND_PARTITIONED_TABLE)
+					ereport(ERROR,
+							(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+							 errmsg("cannot use relation \"%s.%s\" as logical replication target",
+									rv->schemaname, rv->relname),
+							 errdetail("\"%s.%s\" is a partitioned table whereas it is a regular table on publication server.",
+									   rv->schemaname, rv->relname)));
+
+				/*
+				 * A partitioned table doesn't need local state, because the
+				 * state is managed for individual partitions instead.
+				 */
+				if (pt->relkind == RELKIND_PARTITIONED_TABLE)
+					continue;
+
 				AddSubscriptionRelState(subid, relid, table_state,
 										InvalidXLogRecPtr);
 			}
@@ -530,7 +566,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 				(errmsg("could not connect to the publisher: %s", err)));
 
 	/* Get the table list from publisher. */
-	pubrel_names = fetch_table_list(wrconn, sub->publications);
+	pubrel_names = fetch_publication_tables(wrconn, sub->publications);
 
 	/* We are done with the remote side, close connection. */
 	walrcv_disconnect(wrconn);
@@ -565,15 +601,39 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 
 	foreach(lc, pubrel_names)
 	{
-		RangeVar   *rv = (RangeVar *) lfirst(lc);
+		PublishedTable *pt = (PublishedTable *) lfirst(lc);
+		RangeVar   *rv = pt->rv;
 		Oid			relid;
+		char		local_relkind;
 
 		relid = RangeVarGetRelid(rv, AccessShareLock, false);
+		local_relkind = get_rel_relkind(relid);
 
 		/* Check for supported relkind. */
-		CheckSubscriptionRelkind(get_rel_relkind(relid),
+		CheckSubscriptionRelkind(local_relkind,
 								 rv->schemaname, rv->relname);
 
+		/*
+		 * Currently, partitioned table replication occurs between leaf
+		 * partitions, so both the source and the target tables must be
+		 * partitioned.
+		 */
+		if (pt->relkind == RELKIND_RELATION &&
+			local_relkind == RELKIND_PARTITIONED_TABLE)
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("cannot use relation \"%s.%s\" as logical replication target",
+							rv->schemaname, rv->relname),
+					 errdetail("\"%s.%s\" is a partitioned table whereas it is a regular table on publication server.",
+							   rv->schemaname, rv->relname)));
+
+		/*
+		 * A partitioned table doesn't need local state, because the
+		 * state is managed for individual partitions instead.
+		 */
+		if (pt->relkind == RELKIND_PARTITIONED_TABLE)
+			continue;
+
 		pubrel_local_oids[off++] = relid;
 
 		if (!bsearch(&relid, subrel_local_oids,
@@ -1121,15 +1181,17 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
 
 /*
  * Get the list of tables which belong to specified publications on the
- * publisher connection.
+ * publisher connection to create a subscription state (pg_subscription_rel
+ * entry) for each.  For partitioned tables, subscription state is maintained
+ * per partition, so partitions are fetched too.
  */
 static List *
-fetch_table_list(WalReceiverConn *wrconn, List *publications)
+fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 {
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {TEXTOID, TEXTOID};
+	Oid			tableRow[3] = {TEXTOID, TEXTOID, CHAROID};
 	ListCell   *lc;
 	bool		first;
 	List	   *tablelist = NIL;
@@ -1137,9 +1199,30 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	Assert(list_length(publications) > 0);
 
 	initStringInfo(&cmd);
-	appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename\n"
+	appendStringInfoString(&cmd, "SELECT DISTINCT s.schemaname, s.tablename, s.relkind FROM (\n"
+						   "  SELECT t.pubname, t.schemaname, t.tablename, c.relkind\n"
 						   "  FROM pg_catalog.pg_publication_tables t\n"
-						   " WHERE t.pubname IN (");
+						   "  JOIN pg_catalog.pg_class c \n"
+						   "  ON t.schemaname = c.relnamespace::pg_catalog.regnamespace::name\n"
+						   "  AND t.tablename = c.relname \n");
+
+	/*
+	 * As of v13, partitioned tables can be published, although their changes
+	 * are published as their partitions', so we will need the partitions in
+	 * the result.
+	 */
+	if (walrcv_server_version(wrconn) >= 130000)
+		appendStringInfoString(&cmd, "  UNION\n"
+						   "  SELECT t.pubname, s.schemaname, s.tablename, s.relkind\n"
+						   "  FROM pg_catalog.pg_publication_tables t,\n"
+						   "  LATERAL (SELECT c.relnamespace::regnamespace::name, c.relname, c.relkind\n"
+						   "		   FROM pg_class c\n"
+						   "		   JOIN pg_partition_tree(t.schemaname || '.' || t.tablename) p\n"
+						   "		   ON p.relid = c.oid\n"
+						   "		   WHERE p.level > 0) AS s(schemaname, tablename, relkind)\n");
+
+	appendStringInfoString(&cmd, ") s WHERE s.pubname IN (");
+
 	first = true;
 	foreach(lc, publications)
 	{
@@ -1154,7 +1237,7 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 	}
 	appendStringInfoChar(&cmd, ')');
 
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 	pfree(cmd.data);
 
 	if (res->status != WALRCV_OK_TUPLES)
@@ -1169,15 +1252,17 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
 		char	   *nspname;
 		char	   *relname;
 		bool		isnull;
-		RangeVar   *rv;
+		PublishedTable *pt = palloc(sizeof(PublishedTable));
 
 		nspname = TextDatumGetCString(slot_getattr(slot, 1, &isnull));
 		Assert(!isnull);
 		relname = TextDatumGetCString(slot_getattr(slot, 2, &isnull));
 		Assert(!isnull);
+		pt->rv = makeRangeVar(pstrdup(nspname), pstrdup(relname), -1);
+		pt->relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+		Assert(!isnull);
 
-		rv = makeRangeVar(pstrdup(nspname), pstrdup(relname), -1);
-		tablelist = lappend(tablelist, rv);
+		tablelist = lappend(tablelist, pt);
 
 		ExecClearTuple(slot);
 	}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 4181a7e343..96671ca49e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1073,7 +1073,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
  * CheckValidRowMarkRel.
  */
 void
-CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation)
+CheckValidResultRel(ResultRelInfo *resultRelInfo,
+					ResultRelInfo *rootResultRelInfo,
+					CmdType operation)
 {
 	Relation	resultRel = resultRelInfo->ri_RelationDesc;
 	TriggerDesc *trigDesc = resultRel->trigdesc;
@@ -1083,7 +1085,8 @@ CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation)
 	{
 		case RELKIND_RELATION:
 		case RELKIND_PARTITIONED_TABLE:
-			CheckCmdReplicaIdentity(resultRel, operation);
+			CheckCmdReplicaIdentity(resultRelInfo, rootResultRelInfo,
+									operation);
 			break;
 		case RELKIND_SEQUENCE:
 			ereport(ERROR,
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index c13b1d3501..2a639011b8 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -384,7 +384,8 @@ ExecFindPartition(ModifyTableState *mtstate,
 						rri = elem->rri;
 
 						/* Verify this ResultRelInfo allows INSERTs */
-						CheckValidResultRel(rri, CMD_INSERT);
+						CheckValidResultRel(rri, rootResultRelInfo,
+											CMD_INSERT);
 
 						/* Set up the PartitionRoutingInfo for it */
 						ExecInitRoutingInfo(mtstate, estate, proute, dispatch,
@@ -529,7 +530,7 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, EState *estate,
 	 * partition-key becomes a DELETE+INSERT operation, so this check is still
 	 * required when the operation is CMD_UPDATE.
 	 */
-	CheckValidResultRel(leaf_part_rri, CMD_INSERT);
+	CheckValidResultRel(leaf_part_rri, rootResultRelInfo, CMD_INSERT);
 
 	/*
 	 * Open partition indices.  The user may have asked to check for conflicts
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 582b0cb017..65bfb05df5 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -396,10 +396,10 @@ ExecSimpleRelationInsert(EState *estate, TupleTableSlot *slot)
 	ResultRelInfo *resultRelInfo = estate->es_result_relation_info;
 	Relation	rel = resultRelInfo->ri_RelationDesc;
 
-	/* For now we support only tables. */
+	/* For now we support only regular tables. */
 	Assert(rel->rd_rel->relkind == RELKIND_RELATION);
 
-	CheckCmdReplicaIdentity(rel, CMD_INSERT);
+	CheckCmdReplicaIdentity(resultRelInfo, NULL, CMD_INSERT);
 
 	/* BEFORE ROW INSERT Triggers */
 	if (resultRelInfo->ri_TrigDesc &&
@@ -463,7 +463,7 @@ ExecSimpleRelationUpdate(EState *estate, EPQState *epqstate,
 	/* For now we support only tables. */
 	Assert(rel->rd_rel->relkind == RELKIND_RELATION);
 
-	CheckCmdReplicaIdentity(rel, CMD_UPDATE);
+	CheckCmdReplicaIdentity(resultRelInfo, NULL, CMD_UPDATE);
 
 	/* BEFORE ROW UPDATE Triggers */
 	if (resultRelInfo->ri_TrigDesc &&
@@ -521,7 +521,7 @@ ExecSimpleRelationDelete(EState *estate, EPQState *epqstate,
 	Relation	rel = resultRelInfo->ri_RelationDesc;
 	ItemPointer tid = &searchslot->tts_tid;
 
-	CheckCmdReplicaIdentity(rel, CMD_DELETE);
+	CheckCmdReplicaIdentity(resultRelInfo, NULL, CMD_DELETE);
 
 	/* BEFORE ROW DELETE Triggers */
 	if (resultRelInfo->ri_TrigDesc &&
@@ -544,12 +544,17 @@ ExecSimpleRelationDelete(EState *estate, EPQState *epqstate,
 }
 
 /*
- * Check if command can be executed with current replica identity.
+ * Check if command can be executed on 'target_rel' with its (or the
+ * ancestor's) current replica identity.
  */
 void
-CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
+CheckCmdReplicaIdentity(ResultRelInfo *target_rel,
+						ResultRelInfo *root_target_rel,
+						CmdType cmd)
 {
 	PublicationActions *pubactions;
+	Relation	rel = target_rel->ri_RelationDesc;
+	Relation	rootrel = root_target_rel ? root_target_rel->ri_RelationDesc : NULL;
 
 	/* We only need to do checks for UPDATE and DELETE. */
 	if (cmd != CMD_UPDATE && cmd != CMD_DELETE)
@@ -563,9 +568,18 @@ CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
 	/*
 	 * This is either UPDATE OR DELETE and there is no replica identity.
 	 *
-	 * Check if the table publishes UPDATES or DELETES.
+	 * Check if the table or its root ancestor publishes UPDATES or DELETES.
 	 */
 	pubactions = GetRelationPublicationActions(rel);
+	if (rootrel)
+	{
+		PublicationActions *root_pubactions;
+
+		root_pubactions = GetRelationPublicationActions(rootrel);
+		pubactions->pubupdate |= root_pubactions->pubupdate;
+		pubactions->pubdelete |= root_pubactions->pubdelete;
+	}
+
 	if (cmd == CMD_UPDATE && pubactions->pubupdate)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -591,17 +605,10 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * We currently only support writing to regular and partitioned tables.
+	 * However, give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -609,7 +616,11 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	/*
+	 * There are some unsupported cases with partitioned tables, but we leave
+	 * it for the caller to report them.
+	 */
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 59d1a31c97..63e108bb56 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2268,6 +2268,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	int			nplans = list_length(node->plans);
 	ResultRelInfo *saved_resultRelInfo;
 	ResultRelInfo *resultRelInfo;
+	ResultRelInfo *rootResultRelInfo = NULL;
 	Plan	   *subplan;
 	ListCell   *l;
 	int			i;
@@ -2295,8 +2296,11 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		rootResultRelInfo = mtstate->rootResultRelInfo;
+	}
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
@@ -2330,7 +2334,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 		/*
 		 * Verify result relation is a valid target for the current operation
 		 */
-		CheckValidResultRel(resultRelInfo, operation);
+		CheckValidResultRel(resultRelInfo, rootResultRelInfo, operation);
 
 		/*
 		 * If there are indices on the result relation, open them and save
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f8183cd488..98825f01e9 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -761,6 +761,7 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
+	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 752508213a..059d2c9194 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -50,7 +50,12 @@ static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
 
-/* Entry in the map used to remember which relation schemas we sent. */
+/*
+ * Entry in the map used to remember which relation schemas we sent.
+ *
+ * For partitions, 'pubactions' considers not only the table's own
+ * publications, but also those of all of its ancestors.
+ */
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
@@ -63,7 +68,7 @@ typedef struct RelationSyncEntry
 static HTAB *RelationSyncCache = NULL;
 
 static void init_rel_sync_cache(MemoryContext decoding_context);
-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);
+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);
 static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
 static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
 										  uint32 hashvalue);
@@ -311,7 +316,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	if (!is_publishable_relation(relation))
 		return;
 
-	relentry = get_rel_sync_entry(data, RelationGetRelid(relation));
+	relentry = get_rel_sync_entry(data, relation);
 
 	/* First check the table filter */
 	switch (change->action)
@@ -401,7 +406,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		if (!is_publishable_relation(relation))
 			continue;
 
-		relentry = get_rel_sync_entry(data, relid);
+		relentry = get_rel_sync_entry(data, relation);
 
 		if (!relentry->pubactions.pubtruncate)
 			continue;
@@ -526,8 +531,9 @@ init_rel_sync_cache(MemoryContext cachectx)
  * Find or create entry in the relation schema cache.
  */
 static RelationSyncEntry *
-get_rel_sync_entry(PGOutputData *data, Oid relid)
+get_rel_sync_entry(PGOutputData *data, Relation rel)
 {
+	Oid			relid = RelationGetRelid(rel);
 	RelationSyncEntry *entry;
 	bool		found;
 	MemoryContext oldctx;
@@ -546,7 +552,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	if (!found || !entry->replicate_valid)
 	{
 		List	   *pubids = GetRelationPublications(relid);
-		ListCell   *lc;
+		ListCell   *lc,
+				   *lc1;
+		List	   *ancestor_pubids = NIL;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -568,6 +576,11 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		entry->pubactions.pubinsert = entry->pubactions.pubupdate =
 			entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
 
+		/* For partitions, also consider publications of ancestors. */
+		if (rel->rd_rel->relispartition)
+			ancestor_pubids =
+				GetRelationAncestorPublications(RelationGetRelid(rel));
+
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
@@ -580,12 +593,28 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 				entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
 			}
 
+			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
+				entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
+				break;
+
+			foreach(lc1, ancestor_pubids)
+			{
+				if (lfirst_oid(lc1) == pub->oid)
+				{
+					entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
+					entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
+					entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
+					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				}
+			}
+
 			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
 				entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
 				break;
 		}
 
 		list_free(pubids);
+		list_free(ancestor_pubids);
 
 		entry->replicate_valid = true;
 	}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 799b6988b7..dc33c20048 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3969,8 +3969,12 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 	{
 		TableInfo  *tbinfo = &tblinfo[i];
 
-		/* Only plain tables can be aded to publications. */
-		if (tbinfo->relkind != RELKIND_RELATION)
+		/*
+		 * Only regular and partitioned tables can be added to
+		 * publications.
+		 */
+		if (tbinfo->relkind != RELKIND_RELATION &&
+			tbinfo->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
 
 		/*
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 6cdc2b1197..3cfb31c2e6 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -80,6 +80,7 @@ typedef struct Publication
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
 extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationAncestorPublications(Oid relid);
 extern List *GetPublicationRelations(Oid pubid);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 6ef3e1fe06..5b97bb5d57 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -179,7 +179,9 @@ extern void ExecutorEnd(QueryDesc *queryDesc);
 extern void standard_ExecutorEnd(QueryDesc *queryDesc);
 extern void ExecutorRewind(QueryDesc *queryDesc);
 extern bool ExecCheckRTPerms(List *rangeTable, bool ereport_on_violation);
-extern void CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation);
+extern void CheckValidResultRel(ResultRelInfo *resultRelInfo,
+								ResultRelInfo *rootResultRelInfo,
+								CmdType operation);
 extern void InitResultRelInfo(ResultRelInfo *resultRelInfo,
 							  Relation resultRelationDesc,
 							  Index resultRelationIndex,
@@ -592,7 +594,9 @@ extern void ExecSimpleRelationUpdate(EState *estate, EPQState *epqstate,
 									 TupleTableSlot *searchslot, TupleTableSlot *slot);
 extern void ExecSimpleRelationDelete(EState *estate, EPQState *epqstate,
 									 TupleTableSlot *searchslot);
-extern void CheckCmdReplicaIdentity(Relation rel, CmdType cmd);
+extern void CheckCmdReplicaIdentity(ResultRelInfo *target_rel,
+									ResultRelInfo *root_target_rel,
+									CmdType cmd);
 
 extern void CheckSubscriptionRelkind(char relkind, const char *nspname,
 									 const char *relname);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..e3fabe70f9 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -116,6 +116,22 @@ Tables:
 
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+-- should add only the parent to publication, not the partition
+CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+
+DROP PUBLICATION testpub_forparted;
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
@@ -142,11 +158,6 @@ Tables:
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
 DETAIL:  Only tables can be added to publications.
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
-ERROR:  "testpub_parted" is a partitioned table
-DETAIL:  Adding partitioned tables to publications is not supported.
-HINT:  You can add the table partitions individually.
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default ADD TABLE pub_test.testpub_nopk;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a755cf..b79a3f8f8f 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -69,6 +69,16 @@ RESET client_min_messages;
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
 
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+-- should add only the parent to publication, not the partition
+CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+DROP PUBLICATION testpub_forparted;
+
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 SET client_min_messages = 'ERROR';
@@ -83,8 +93,6 @@ CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 
 -- fail - view
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
 
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
new file mode 100644
index 0000000000..eb0f1cd6a8
--- /dev/null
+++ b/src/test/subscription/t/013_partition.pl
@@ -0,0 +1,161 @@
+# Test logical replication with partitioned tables
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 10;
+
+# setup
+
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+my $node_subscriber1 = get_new_node('subscriber1');
+$node_subscriber1->init(allows_streaming => 'logical');
+$node_subscriber1->start;
+
+my $node_subscriber2 = get_new_node('subscriber2');
+$node_subscriber2->init(allows_streaming => 'logical');
+$node_subscriber2->start;
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub1 FOR TABLE tab1, tab1_1");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub2 FOR TABLE tab1_2");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
+
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub2");
+
+# Wait for initial sync of all subscriptions
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber2->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# insert data (some into the root parent and some directly into partitions)
+
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_1 (a) VALUES (3)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+my $result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
+is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
+
+# update a row (no partition change)
+
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 2 WHERE a = 1");
+
+$node_publisher->wait_for_catchup('sub1');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
+
+# update a row (partition changes)
+
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|3|6), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
+is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
+
+# delete rows (some from the root parent, some directly from the partition)
+
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1 WHERE a IN (3, 5)");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1_2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(0||), 'delete from tab1_2 replicated');
+
+# truncate (root parent and partition directly)
+
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1_2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(2|1|2), 'truncate of tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(0||), 'truncate of tab1_2 replicated');
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1");
+
+$node_publisher->wait_for_catchup('sub1');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1_1 replicated');
-- 
2.16.5

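To recap what 0001 enables, here is the example from the opening message as it is expected to behave with the patch applied (a sketch; note that changes are still replicated as changes to the individual partitions, so the subscriber's partitions must match one-to-one):

```sql
-- With 0001 applied, the partitioned table itself can be published;
-- all existing and future partitions are published through it.
CREATE TABLE p (a int, b int) PARTITION BY HASH (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 0);
CREATE TABLE p2 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 1);

CREATE PUBLICATION publish_p FOR TABLE p;  -- no longer an error

-- A partition created later is picked up automatically:
CREATE TABLE p3 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 2);
```

The follow-up patch 0002 below then adds a publication option to publish partition changes using the root table's schema instead.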
Attachment: v8-0002-Add-publish_using_root_schema-parameter-for-publi.patch (text/plain, US-ASCII)
From ccc1d855b4fd62b80fdae745c2dee541608cf740 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v8 2/4] Add publish_using_root_schema parameter for
 publications

It dictates whether (leaf) partition changes are published using
the schema of the root parent table.
---
 doc/src/sgml/ref/create_publication.sgml  |  15 +++++
 src/backend/catalog/pg_publication.c      |   1 +
 src/backend/commands/publicationcmds.c    |  94 ++++++++++++++++-----------
 src/bin/pg_dump/pg_dump.c                 |  22 ++++++-
 src/bin/pg_dump/pg_dump.h                 |   1 +
 src/bin/psql/describe.c                   |  17 ++++-
 src/include/catalog/pg_publication.h      |   3 +
 src/test/regress/expected/publication.out | 103 +++++++++++++++++-------------
 src/test/regress/sql/publication.sql      |   3 +
 9 files changed, 171 insertions(+), 88 deletions(-)

diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 848779a00f..a8cf2c4629 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -124,6 +124,21 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_using_root_schema</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table contained in the publication will be published using its own
+          schema rather than that of the individual partitions that are
+          actually changed; the latter is the default.  Setting it to
+          <literal>true</literal> allows the changes to be replicated into a
+          non-partitioned table or a partitioned table consisting of
+          a different set of partitions.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index fb369dbe17..6d2911d18f 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -403,6 +403,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->publish_using_root_schema = pubform->pubasroot;
 
 	ReleaseSysCache(tup);
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 8f38c63ad2..e48815534c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -55,20 +55,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_using_root_schema_given,
+						  bool *publish_using_root_schema)
 {
 	ListCell   *lc;
 
+	*publish_using_root_schema_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* By default, a relation's changes are published using its own schema. */
+	*publish_using_root_schema = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -90,10 +93,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -109,19 +112,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_using_root_schema") == 0)
+		{
+			if (*publish_using_root_schema_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_using_root_schema_given = true;
+			*publish_using_root_schema = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -142,10 +154,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -182,9 +193,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -192,13 +203,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubasroot - 1] =
+		BoolGetDatum(publish_using_root_schema);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -250,17 +263,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -269,19 +281,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_using_root_schema_given)
+	{
+		values[Anum_pg_publication_pubasroot - 1] = BoolGetDatum(publish_using_root_schema);
+		replaces[Anum_pg_publication_pubasroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index dc33c20048..bdbd1f823b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3780,6 +3780,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubasroot;
 	int			i,
 				ntups;
 
@@ -3791,11 +3792,18 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubasroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false as pubasroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
@@ -3819,6 +3827,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubasroot = PQfnumber(res, "pubasroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3841,6 +3850,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubasroot =
+			(strcmp(PQgetvalue(res, i, i_pubasroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -3917,7 +3928,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubasroot)
+		appendPQExpBufferStr(query, ", publish_using_root_schema = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 21004e5078..90e47dd1f3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -600,6 +600,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubasroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index f3c7eb96fa..3f6ce713af 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5706,7 +5706,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5737,6 +5737,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubasroot AS \"%s\"",
+						  gettext_noop("Publishes Using Root Schema"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5778,6 +5782,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubasroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5790,6 +5795,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubasroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5800,6 +5806,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubasroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubasroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5849,6 +5858,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubasroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5861,6 +5872,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubasroot)
+			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5869,6 +5882,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubasroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 3cfb31c2e6..9d13e5c735 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubasroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,6 +76,7 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		publish_using_root_schema;
 	PublicationActions pubactions;
 } Publication;
 
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index e3fabe70f9..da22ca3c6a 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                                       Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -124,10 +126,19 @@ RESET client_min_messages;
 CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
+Tables:
+    "public.testpub_parted"
+
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | t
 Tables:
     "public.testpub_parted"
 
@@ -146,10 +157,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                          Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -187,10 +198,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -234,10 +245,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -247,20 +258,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                                    List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                                       List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index b79a3f8f8f..7ddca1b974 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
 
 \dRp
 
@@ -77,6 +78,8 @@ RESET client_min_messages;
 CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
 DROP PUBLICATION testpub_forparted;
 
 -- fail - view
-- 
2.16.5

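For reference, a minimal SQL sketch of the option this patch adds, with names taken from the regress tests above (behavior as described in the patch, not independently verified):

```sql
-- With the patch, adding a partitioned table to a publication
-- implicitly publishes its current and future partitions.
CREATE TABLE p (a int, b int) PARTITION BY HASH (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 0);

CREATE PUBLICATION publish_p FOR TABLE p;

-- New option: publish changes using the root table's schema instead
-- of each leaf partition's schema.
ALTER PUBLICATION publish_p SET (publish_using_root_schema = true);

-- The flag is stored in pg_publication.pubasroot and shown by \dRp+.
SELECT pubname, pubasroot FROM pg_publication;
```

The option also defaults to false, and per the pg_dump changes above it is dumped as `publish_using_root_schema = true` only when set.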
Attachment: v8-0003-Some-refactoring-of-logical-worker.c.patch (text/plain)
From 8126a6bb784506180ba1d9c4985aabe124ffc63e Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 5 Dec 2019 09:17:06 +0900
Subject: [PATCH v8 3/4] Some refactoring of logical/worker.c

---
 src/backend/replication/logical/worker.c | 291 ++++++++++++++++++-------------
 1 file changed, 170 insertions(+), 121 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 7a5471f95c..34b0ac78cc 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -90,7 +90,8 @@ static dlist_head lsn_mapping = DLIST_STATIC_INIT(lsn_mapping);
 
 typedef struct SlotErrCallbackArg
 {
-	LogicalRepRelMapEntry *rel;
+	LogicalRepRelation *remoterel;
+	Oid			local_reloid;
 	int			local_attnum;
 	int			remote_attnum;
 } SlotErrCallbackArg;
@@ -268,7 +269,6 @@ static void
 slot_store_error_callback(void *arg)
 {
 	SlotErrCallbackArg *errarg = (SlotErrCallbackArg *) arg;
-	LogicalRepRelMapEntry *rel;
 	char	   *remotetypname;
 	Oid			remotetypoid,
 				localtypoid;
@@ -277,19 +277,18 @@ slot_store_error_callback(void *arg)
 	if (errarg->remote_attnum < 0)
 		return;
 
-	rel = errarg->rel;
-	remotetypoid = rel->remoterel.atttyps[errarg->remote_attnum];
+	remotetypoid = errarg->remoterel->atttyps[errarg->remote_attnum];
 
 	/* Fetch remote type name from the LogicalRepTypMap cache */
 	remotetypname = logicalrep_typmap_gettypname(remotetypoid);
 
 	/* Fetch local type OID from the local sys cache */
-	localtypoid = get_atttype(rel->localreloid, errarg->local_attnum + 1);
+	localtypoid = get_atttype(errarg->local_reloid, errarg->local_attnum + 1);
 
 	errcontext("processing remote data for replication target relation \"%s.%s\" column \"%s\", "
 			   "remote type %s, local type %s",
-			   rel->remoterel.nspname, rel->remoterel.relname,
-			   rel->remoterel.attnames[errarg->remote_attnum],
+			   errarg->remoterel->nspname, errarg->remoterel->relname,
+			   errarg->remoterel->attnames[errarg->remote_attnum],
 			   remotetypname,
 			   format_type_be(localtypoid));
 }
@@ -311,7 +310,8 @@ slot_store_cstrings(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
 	ExecClearTuple(slot);
 
 	/* Push callback + info on the error context stack */
-	errarg.rel = rel;
+	errarg.remoterel = &rel->remoterel;
+	errarg.local_reloid = rel->localreloid;
 	errarg.local_attnum = -1;
 	errarg.remote_attnum = -1;
 	errcallback.callback = slot_store_error_callback;
@@ -375,8 +375,9 @@ slot_store_cstrings(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
  */
 static void
 slot_modify_cstrings(TupleTableSlot *slot, TupleTableSlot *srcslot,
-					 LogicalRepRelMapEntry *rel,
-					 char **values, bool *replaces)
+					 char **values, bool *replaces,
+					 AttrMap *attrmap, LogicalRepRelation *remoterel,
+					 Oid local_reloid)
 {
 	int			natts = slot->tts_tupleDescriptor->natts;
 	int			i;
@@ -396,7 +397,8 @@ slot_modify_cstrings(TupleTableSlot *slot, TupleTableSlot *srcslot,
 	memcpy(slot->tts_isnull, srcslot->tts_isnull, natts * sizeof(bool));
 
 	/* For error reporting, push callback + info on the error context stack */
-	errarg.rel = rel;
+	errarg.remoterel = remoterel;
+	errarg.local_reloid = local_reloid;
 	errarg.local_attnum = -1;
 	errarg.remote_attnum = -1;
 	errcallback.callback = slot_store_error_callback;
@@ -405,11 +407,11 @@ slot_modify_cstrings(TupleTableSlot *slot, TupleTableSlot *srcslot,
 	error_context_stack = &errcallback;
 
 	/* Call the "in" function for each replaced attribute */
-	Assert(natts == rel->attrmap->maplen);
+	Assert(natts == attrmap->maplen);
 	for (i = 0; i < natts; i++)
 	{
 		Form_pg_attribute att = TupleDescAttr(slot->tts_tupleDescriptor, i);
-		int			remoteattnum = rel->attrmap->attnums[i];
+		int			remoteattnum = attrmap->attnums[i];
 
 		if (remoteattnum < 0)
 			continue;
@@ -578,6 +580,148 @@ GetRelationIdentityOrPK(Relation rel)
 	return idxoid;
 }
 
+/* Workhorse for apply_handle_insert() */
+static void
+apply_handle_do_insert(ResultRelInfo *relinfo,
+					   EState *estate, TupleTableSlot *localslot)
+{
+	ExecOpenIndices(relinfo, false);
+
+	/* Do the insert. */
+	ExecSimpleRelationInsert(estate, localslot);
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+}
+
+/* Workhorse for apply_handle_update() */
+static void
+apply_handle_do_update(ResultRelInfo *relinfo,
+					   EState *estate, TupleTableSlot *remoteslot,
+					   LogicalRepTupleData *newtup,
+					   AttrMap *attrmap, LogicalRepRelation *remoterel)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+	MemoryContext oldctx;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	ExecOpenIndices(relinfo, false);
+
+	/*
+	 * Try to find tuple using either replica identity index, primary key or
+	 * if needed, sequential scan.
+	 */
+	idxoid = GetRelationIdentityOrPK(rel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(rel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, localslot);
+	else
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
+										 remoteslot, localslot);
+
+	ExecClearTuple(remoteslot);
+
+	/*
+	 * Tuple found.
+	 *
+	 * Note this will fail if there are other conflicting unique indexes.
+	 */
+	if (found)
+	{
+		/* Process and store remote tuple in the slot */
+		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+		slot_modify_cstrings(remoteslot, localslot,
+							 newtup->values, newtup->changed,
+							 attrmap, remoterel, RelationGetRelid(rel));
+		MemoryContextSwitchTo(oldctx);
+
+		EvalPlanQualSetSlot(&epqstate, remoteslot);
+
+		/* Do the actual update. */
+		ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
+	}
+	else
+	{
+		/*
+		 * The tuple to be updated could not be found.
+		 *
+		 * TODO what to do here, change the log level to LOG perhaps?
+		 */
+		elog(DEBUG1,
+			 "logical replication did not find row for update "
+			 "in replication target relation \"%s\"",
+			 RelationGetRelationName(rel));
+	}
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+	EvalPlanQualEnd(&epqstate);
+}
+
+/* Workhorse for apply_handle_delete() */
+static void
+apply_handle_do_delete(ResultRelInfo *relinfo, EState *estate,
+					   TupleTableSlot *remoteslot,
+					   LogicalRepRelation *remoterel)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	/*
+	 * Try to find tuple using either replica identity index, primary key or
+	 * if needed, sequential scan.
+	 */
+	idxoid = GetRelationIdentityOrPK(rel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(rel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, localslot);
+	else
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
+										 remoteslot, localslot);
+	ExecOpenIndices(relinfo, false);
+
+	/* If found delete it. */
+	if (found)
+	{
+		EvalPlanQualSetSlot(&epqstate, localslot);
+
+		/* Do the actual delete. */
+		ExecSimpleRelationDelete(estate, &epqstate, localslot);
+	}
+	else
+	{
+		/* The tuple to be deleted could not be found. */
+		elog(DEBUG1,
+			 "logical replication could not find row for delete "
+			 "in replication target relation \"%s\"",
+			 RelationGetRelationName(rel));
+	}
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+	EvalPlanQualEnd(&epqstate);
+}
+
 /*
  * Handle INSERT message.
  */
@@ -620,13 +764,10 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	ExecOpenIndices(estate->es_result_relation_info, false);
-
-	/* Do the insert. */
-	ExecSimpleRelationInsert(estate, remoteslot);
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_insert(estate->es_result_relation_info, estate,
+						   remoteslot);
 
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
@@ -683,16 +824,12 @@ apply_handle_update(StringInfo s)
 {
 	LogicalRepRelMapEntry *rel;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	LogicalRepTupleData oldtup;
 	LogicalRepTupleData newtup;
 	bool		has_oldtup;
-	TupleTableSlot *localslot;
 	TupleTableSlot *remoteslot;
 	RangeTblEntry *target_rte;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -718,9 +855,6 @@ apply_handle_update(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
 	/*
 	 * Populate updatedCols so that per-column triggers can fire.  This could
@@ -738,7 +872,6 @@ apply_handle_update(StringInfo s)
 	}
 
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
 	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
@@ -746,63 +879,16 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL && has_oldtup));
-
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-
-	ExecClearTuple(remoteslot);
-
-	/*
-	 * Tuple found.
-	 *
-	 * Note this will fail if there are other conflicting unique indexes.
-	 */
-	if (found)
-	{
-		/* Process and store remote tuple in the slot */
-		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
-		slot_modify_cstrings(remoteslot, localslot, rel,
-							 newtup.values, newtup.changed);
-		MemoryContextSwitchTo(oldctx);
-
-		EvalPlanQualSetSlot(&epqstate, remoteslot);
-
-		/* Do the actual update. */
-		ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
-	}
-	else
-	{
-		/*
-		 * The tuple to be updated could not be found.
-		 *
-		 * TODO what to do here, change the log level to LOG perhaps?
-		 */
-		elog(DEBUG1,
-			 "logical replication did not find row for update "
-			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
-	}
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_update(estate->es_result_relation_info, estate,
+						   remoteslot, &newtup, rel->attrmap,
+						   &rel->remoterel);
 
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
 	AfterTriggerEndQuery(estate);
 
-	EvalPlanQualEnd(&epqstate);
 	ExecResetTupleTable(estate->es_tupleTable, false);
 	FreeExecutorState(estate);
 
@@ -822,12 +908,8 @@ apply_handle_delete(StringInfo s)
 	LogicalRepRelMapEntry *rel;
 	LogicalRepTupleData oldtup;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	TupleTableSlot *remoteslot;
-	TupleTableSlot *localslot;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -852,58 +934,25 @@ apply_handle_delete(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
+	/* Input functions may need an active snapshot, so get one */
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
-	/* Find the tuple using the replica identity index. */
+	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
+	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL));
-
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-	/* If found delete it. */
-	if (found)
-	{
-		EvalPlanQualSetSlot(&epqstate, localslot);
-
-		/* Do the actual delete. */
-		ExecSimpleRelationDelete(estate, &epqstate, localslot);
-	}
-	else
-	{
-		/* The tuple to be deleted could not be found. */
-		elog(DEBUG1,
-			 "logical replication could not find row for delete "
-			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
-	}
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_delete(estate->es_result_relation_info, estate,
+						   remoteslot, &rel->remoterel);
 
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
 	AfterTriggerEndQuery(estate);
 
-	EvalPlanQualEnd(&epqstate);
 	ExecResetTupleTable(estate->es_tupleTable, false);
 	FreeExecutorState(estate);
 
-- 
2.16.5

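The refactoring above extracts `apply_handle_do_insert/update/delete` workhorses that take a `ResultRelInfo` and an explicit `LogicalRepRelation`/`AttrMap` instead of assuming `rel->localrel`; that is what lets a later patch invoke them against a tuple-routed partition. As a hedged sketch of the end-to-end behavior the follow-up patch (v8-0004, below) aims at, using the option name from the patch (an illustrative setup, not a verified one):

```sql
-- Publisher: a partitioned table, published using the root schema.
CREATE TABLE t (a int) PARTITION BY RANGE (a);
CREATE TABLE t1 PARTITION OF t FOR VALUES FROM (1) TO (10);
CREATE PUBLICATION pub FOR TABLE t
    WITH (publish_using_root_schema = true);

-- Subscriber: changes arrive as changes to "t" and are routed locally,
-- so the target's partitioning need not match the publisher's; per the
-- patch, even a plain table named "t" should work.
CREATE TABLE t (a int);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=src' PUBLICATION pub;
```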
Attachment: v8-0004-Publish-partitioned-table-inserts-as-its-own.patch (text/plain)
From 365b81efed48fb0b1cdb6708fc3f4a9b82a84a22 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Wed, 13 Nov 2019 17:18:51 +0900
Subject: [PATCH v8 4/4] Publish partitioned table inserts as its own

---
 doc/src/sgml/logical-replication.sgml       |  11 +-
 src/backend/catalog/pg_publication.c        |  11 +-
 src/backend/commands/subscriptioncmds.c     | 103 +++++-----
 src/backend/executor/nodeModifyTable.c      |   2 +
 src/backend/replication/logical/tablesync.c |  28 ++-
 src/backend/replication/logical/worker.c    | 289 ++++++++++++++++++++++++++--
 src/backend/replication/pgoutput/pgoutput.c | 191 ++++++++++++++----
 src/include/catalog/pg_publication.h        |   2 +-
 src/test/subscription/t/013_partition.pl    |  48 ++++-
 9 files changed, 558 insertions(+), 127 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 4584cb82f6..1a4d5a9d25 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,14 +402,9 @@
 
    <listitem>
     <para>
-     Replication is only supported by regular and partitioned tables, although
-     the type of the table must match between the two servers, that is, one
-     cannot replicate from a regular table into a partitioned able or vice
-     versa. Also, when replicating between partitioned tables, the actual
-     replication occurs between leaf partitions, so the partitions on the two
-     servers must match one-to-one.  Attempts to replicate other types of
-     relations such as views, materialized views, or foreign tables, will
-     result in an error.
+     Replication is only supported by regular and partitioned tables.
+     Attempts to replicate other types of relations such as
+     views, materialized views, or foreign tables, will result in an error.
     </para>
    </listitem>
   </itemizedlist>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 6d2911d18f..d47461f763 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -243,20 +243,29 @@ GetRelationPublications(Oid relid)
 /*
  * Finds all publications that publish changes to the input relation's
  * ancestors.
+ *
+ * *published_ancestors will contain, for each publication returned, the OID
+ * of the ancestor that belongs to it.  Values in this list can be repeated,
+ * because a given ancestor may belong to multiple publications.
  */
 List *
-GetRelationAncestorPublications(Oid relid)
+GetRelationAncestorPublications(Oid relid, List **published_ancestors)
 {
 	List	   *ancestors = get_partition_ancestors(relid);
 	List	   *ancestor_pubids = NIL;
 	ListCell   *lc;
 
+	*published_ancestors = NIL;
 	foreach(lc, ancestors)
 	{
 		Oid			ancestor = lfirst_oid(lc);
 		List	   *rel_publishers = GetRelationPublications(ancestor);
+		int			n = list_length(rel_publishers),
+					i;
 
 		ancestor_pubids = list_concat_copy(ancestor_pubids, rel_publishers);
+		for (i = 0; i < n; i++)
+			*published_ancestors = lappend_oid(*published_ancestors, ancestor);
 	}
 
 	return ancestor_pubids;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 786b15eb27..2a45aff445 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -54,6 +54,15 @@ typedef struct PublishedTable
 	RangeVar   *rv;
 
 	char		relkind;
+
+	/*
+	 * If the published table is partitioned, the following being true means
+	 * its changes are published using own schema rather than the schema of
+	 * its individual partitions.  In the latter case, a separate
+	 * PublicationTable instance (and hence pg_subscription_rel entry) for
+	 * each partition will be needed.
+	 */
+	bool		published_using_root_schema;
 }			PublishedTable;
 
 static List *fetch_publication_tables(WalReceiverConn *wrconn, List *publications);
@@ -481,24 +490,13 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
 										 rv->schemaname, rv->relname);
 
 				/*
-				 * Currently, partitioned table replication occurs between leaf
-				 * partitions, so both the source and the target tables must be
-				 * partitioned.
+				 * A partitioned table doesn't need local state if the state
+				 * is managed for individual partitions, which is the case if
+				 * the partitioned table is published using the schema of its
+				 * partitions.
 				 */
-				if (pt->relkind == RELKIND_RELATION &&
-					local_relkind == RELKIND_PARTITIONED_TABLE)
-					ereport(ERROR,
-							(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-							 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-									rv->schemaname, rv->relname),
-							 errdetail("\"%s.%s\" is a partitioned table whereas it is a regular table on publication server.",
-									   rv->schemaname, rv->relname)));
-
-				/*
-				 * A partitioned table doesn't need local state, because the
-				 * state is managed for individual partitions instead.
-				 */
-				if (pt->relkind == RELKIND_PARTITIONED_TABLE)
+				if (pt->relkind == RELKIND_PARTITIONED_TABLE &&
+					!pt->published_using_root_schema)
 					continue;
 
 				AddSubscriptionRelState(subid, relid, table_state,
@@ -614,24 +612,12 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 								 rv->schemaname, rv->relname);
 
 		/*
-		 * Currently, partitioned table replication occurs between leaf
-		 * partitions, so both the source and the target tables must be
-		 * partitioned.
+		 * A partitioned table doesn't need local state if the state is
+		 * managed for individual partitions, which is the case if the
+		 * partitioned table is published using the schema of its partitions.
 		 */
-		if (pt->relkind == RELKIND_RELATION &&
-			local_relkind == RELKIND_PARTITIONED_TABLE)
-			ereport(ERROR,
-					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-					 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-							rv->schemaname, rv->relname),
-					 errdetail("\"%s.%s\" is a partitioned table whereas it is a regular table on publication server.",
-							   rv->schemaname, rv->relname)));
-
-		/*
-		 * A partitioned table doesn't need local state, because the
-		 * state is managed for individual partitions instead.
-		 */
-		if (pt->relkind == RELKIND_PARTITIONED_TABLE)
+		if (pt->relkind == RELKIND_PARTITIONED_TABLE &&
+			!pt->published_using_root_schema)
 			continue;
 
 		pubrel_local_oids[off++] = relid;
@@ -1191,7 +1177,7 @@ fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[3] = {TEXTOID, TEXTOID, CHAROID};
+	Oid			tableRow[4] = {TEXTOID, TEXTOID, CHAROID, BOOLOID};
 	ListCell   *lc;
 	bool		first;
 	List	   *tablelist = NIL;
@@ -1199,27 +1185,41 @@ fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 	Assert(list_length(publications) > 0);
 
 	initStringInfo(&cmd);
-	appendStringInfoString(&cmd, "SELECT DISTINCT s.schemaname, s.tablename, s.relkind FROM (\n"
-						   "  SELECT t.pubname, t.schemaname, t.tablename, c.relkind\n"
-						   "  FROM pg_catalog.pg_publication_tables t\n"
-						   "  JOIN pg_catalog.pg_class c \n"
-						   "  ON t.schemaname = c.relnamespace::pg_catalog.regnamespace::name\n"
-						   "  AND t.tablename = c.relname \n");
+	appendStringInfoString(&cmd, "SELECT DISTINCT s.schemaname, s.tablename, s.relkind, s.pubasroot FROM (\n");
 
 	/*
 	 * As of v13, partitioned tables can be published, although their changes
-	 * are published as their partitions', so we will need the partitions in
-	 * the result.
+	 * may be published either as their own or as their partitions', which is
+	 * checked with pg_publication.pubasroot (whether the publication publishes
+	 * using the root partitioned table's schema).
+	 */
+	if (walrcv_server_version(wrconn) >= 130000)
+		appendStringInfoString(&cmd, "  SELECT t.pubname, t.schemaname, t.tablename, c.relkind, p.pubasroot\n");
+	else
+		appendStringInfoString(&cmd, "  SELECT t.pubname, t.schemaname, t.tablename, c.relkind, false AS pubasroot\n");
+
+	appendStringInfoString(&cmd, "  FROM pg_catalog.pg_publication_tables t\n"
+						   "  JOIN pg_catalog.pg_publication p ON t.pubname = p.pubname\n"
+						   "  JOIN pg_catalog.pg_class c\n"
+						   "  ON t.schemaname = c.relnamespace::pg_catalog.regnamespace::pg_catalog.name\n"
+						   "  AND t.tablename = c.relname\n");
+
+	/*
+	 * If the publication doesn't publish using the root table's schema, we
+	 * will need the partitions in the result.
 	 */
 	if (walrcv_server_version(wrconn) >= 130000)
 		appendStringInfoString(&cmd, "  UNION\n"
-						   "  SELECT t.pubname, s.schemaname, s.tablename, s.relkind\n"
-						   "  FROM pg_catalog.pg_publication_tables t,\n"
-						   "  LATERAL (SELECT c.relnamespace::regnamespace::name, c.relname, c.relkind\n"
-						   "		   FROM pg_class c\n"
-						   "		   JOIN pg_partition_tree(t.schemaname || '.' || t.tablename) p\n"
-						   "		   ON p.relid = c.oid\n"
-						   "		   WHERE p.level > 0) AS s(schemaname, tablename, relkind)\n");
+							   "  SELECT DISTINCT t.pubname, s.schemaname, s.tablename, c.relkind, false AS pubasroot\n"
+							   "  FROM pg_catalog.pg_publication_tables t\n"
+							   "  JOIN pg_catalog.pg_publication p ON t.pubname = p.pubname AND NOT p.pubasroot,\n"
+							   "  LATERAL (SELECT c.relnamespace::pg_catalog.regnamespace::pg_catalog.name, c.relname\n"
+							   "		   FROM pg_catalog.pg_class c\n"
+							   "		   JOIN pg_catalog.pg_partition_tree(t.schemaname || '.' || t.tablename) p\n"
+							   "		   ON p.relid = c.oid\n"
+							   "		   WHERE p.level > 0) AS s(schemaname, tablename)\n"
+							   "  JOIN pg_catalog.pg_class c ON s.schemaname = c.relnamespace::pg_catalog.regnamespace::pg_catalog.name\n"
+							   "  AND s.tablename = c.relname\n");
 
 	appendStringInfoString(&cmd, ") s WHERE s.pubname IN (");
 
@@ -1237,7 +1237,7 @@ fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 	}
 	appendStringInfoChar(&cmd, ')');
 
-	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 4, tableRow);
 	pfree(cmd.data);
 
 	if (res->status != WALRCV_OK_TUPLES)
@@ -1260,6 +1260,7 @@ fetch_publication_tables(WalReceiverConn *wrconn, List *publications)
 		Assert(!isnull);
 		pt->rv = makeRangeVar(pstrdup(nspname), pstrdup(relname), -1);
 		pt->relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+		pt->published_using_root_schema = DatumGetBool(slot_getattr(slot, 4, &isnull));
 		Assert(!isnull);
 
 		tablelist = lappend(tablelist, pt);
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 63e108bb56..5b7265939f 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2299,6 +2299,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		CheckValidResultRel(mtstate->rootResultRelInfo,
+							mtstate->rootResultRelInfo, operation);
 		rootResultRelInfo = mtstate->rootResultRelInfo;
 	}
 
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 98825f01e9..6a18b78f22 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -630,16 +630,17 @@ copy_read_data(void *outbuf, int minread, int maxread)
 
 /*
  * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication.  XXX - we also fetch the relkind
+ * here, though the RELATION message doesn't provide it.
  */
 static void
 fetch_remote_table_info(char *nspname, char *relname,
-						LogicalRepRelation *lrel)
+						LogicalRepRelation *lrel, char *relkind)
 {
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {OIDOID, CHAROID};
+	Oid			tableRow[3] = {OIDOID, CHAROID, CHAROID};
 	Oid			attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
 	bool		isnull;
 	int			natt;
@@ -649,16 +650,16 @@ fetch_remote_table_info(char *nspname, char *relname,
 
 	/* First fetch Oid and replica identity. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident"
+	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident, c.relkind"
 					 "  FROM pg_catalog.pg_class c"
 					 "  INNER JOIN pg_catalog.pg_namespace n"
 					 "        ON (c.relnamespace = n.oid)"
 					 " WHERE n.nspname = %s"
 					 "   AND c.relname = %s"
-					 "   AND c.relkind = 'r'",
+					 "   AND pg_relation_is_publishable(c.oid)",
 					 quote_literal_cstr(nspname),
 					 quote_literal_cstr(relname));
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 
 	if (res->status != WALRCV_OK_TUPLES)
 		ereport(ERROR,
@@ -675,6 +676,8 @@ fetch_remote_table_info(char *nspname, char *relname,
 	Assert(!isnull);
 	lrel->replident = DatumGetChar(slot_getattr(slot, 2, &isnull));
 	Assert(!isnull);
+	*relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+	Assert(!isnull);
 
 	ExecDropSingleTupleTableSlot(slot);
 	walrcv_clear_result(res);
@@ -750,10 +753,12 @@ copy_table(Relation rel)
 	CopyState	cstate;
 	List	   *attnamelist;
 	ParseState *pstate;
+	char		remote_relkind;
 
 	/* Get the publisher relation info. */
 	fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
-							RelationGetRelationName(rel), &lrel);
+							RelationGetRelationName(rel), &lrel,
+							&remote_relkind);
 
 	/* Put the relation into relmap. */
 	logicalrep_relmap_update(&lrel);
@@ -761,12 +766,15 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
-	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "COPY %s TO STDOUT",
-					 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	if (remote_relkind == RELKIND_PARTITIONED_TABLE)
+		appendStringInfo(&cmd, "COPY (SELECT * FROM %s) TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	else
+		appendStringInfo(&cmd, "COPY %s TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
 	res = walrcv_exec(wrconn, cmd.data, 0, NULL);
 	pfree(cmd.data);
 	if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 34b0ac78cc..ec34418f75 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -29,11 +29,14 @@
 #include "access/xlog_internal.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_subscription.h"
 #include "catalog/pg_subscription_rel.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "executor/nodeModifyTable.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
@@ -722,6 +725,180 @@ apply_handle_do_delete(ResultRelInfo *relinfo, EState *estate,
 	EvalPlanQualEnd(&epqstate);
 }
 
+/*
+ * This handles insert, update, delete on a partitioned table.
+ */
+static void
+apply_handle_tuple_routing(ResultRelInfo *relinfo,
+						   LogicalRepRelMapEntry *relmapentry,
+						   EState *estate, CmdType operation,
+						   TupleTableSlot *remoteslot,
+						   LogicalRepTupleData *newtup)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	ModifyTableState *mtstate = NULL;
+	PartitionTupleRouting *proute = NULL;
+	ResultRelInfo *partrelinfo;
+	TupleTableSlot *localslot;
+	PartitionRoutingInfo *partinfo;
+	TupleConversionMap *map;
+	MemoryContext oldctx;
+
+	/* ModifyTableState is needed for ExecFindPartition(). */
+	mtstate = makeNode(ModifyTableState);
+	mtstate->ps.plan = NULL;
+	mtstate->ps.state = estate;
+	mtstate->operation = operation;
+	mtstate->resultRelInfo = relinfo;
+	proute = ExecSetupPartitionTupleRouting(estate, mtstate, rel);
+
+	/*
+	 * Find a partition for the tuple contained in remoteslot.
+	 *
+	 * For insert, remoteslot is tuple to insert.  For update and delete, it
+	 * is the tuple to be replaced and deleted, respectively.
+	 */
+	Assert(remoteslot != NULL);
+	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+	/* The following throws error if a suitable partition is not found. */
+	partrelinfo = ExecFindPartition(mtstate, relinfo, proute,
+									remoteslot, estate);
+	Assert(partrelinfo != NULL);
+	/* Convert the tuple to match the partition's rowtype. */
+	partinfo = partrelinfo->ri_PartitionInfo;
+	map = partinfo->pi_RootToPartitionMap;
+	if (map != NULL)
+	{
+		TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+		remoteslot = execute_attr_map_slot(map->attrMap, remoteslot,
+										   part_slot);
+	}
+	MemoryContextSwitchTo(oldctx);
+
+	switch (operation)
+	{
+		case CMD_INSERT:
+			/* Just insert into the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_do_insert(partrelinfo, estate, remoteslot);
+			break;
+
+		case CMD_DELETE:
+			/* Just delete from the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_do_delete(partrelinfo, estate, remoteslot,
+								   &relmapentry->remoterel);
+			break;
+
+		case CMD_UPDATE:
+			{
+				ResultRelInfo *partrelinfo_new;
+
+				/*
+				 * partrelinfo computed above is the partition which might
+				 * contain the search tuple.  Now find the partition for the
+				 * replacement tuple, which might not be the same as
+				 * partrelinfo.
+				 */
+				localslot = table_slot_create(rel, &estate->es_tupleTable);
+				oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+				slot_modify_cstrings(localslot, remoteslot,
+									 newtup->values, newtup->changed,
+									 relmapentry->attrmap,
+									 &relmapentry->remoterel,
+									 RelationGetRelid(rel));
+				partrelinfo_new = ExecFindPartition(mtstate, relinfo, proute,
+													localslot, estate);
+				MemoryContextSwitchTo(oldctx);
+
+				/*
+				 * If both the search and replacement tuples would be in the
+				 * same partition, we can apply this as an UPDATE on the partition.
+				 */
+				if (partrelinfo == partrelinfo_new)
+				{
+					AttrMap *attrmap = relmapentry->attrmap,
+							*new_attrmap = NULL;
+
+					/*
+					 * If the partition's attributes don't match the root
+					 * relation's, we'll need to make a new attrmap that maps
+					 * partition attribute numbers to remoterel's, instead of
+					 * the original, which maps root relation's attribute
+					 * numbers to remoterel's.
+					 */
+					if (map)
+					{
+						TupleDesc	partdesc = RelationGetDescr(partrelinfo->ri_RelationDesc);
+						TupleDesc	rootdesc = RelationGetDescr(rel);
+						AttrMap	   *partToRootMap;
+						AttrNumber	attno;
+
+						/* Need the reverse map here */
+						partToRootMap =  build_attrmap_by_name(partdesc, rootdesc);
+						new_attrmap = make_attrmap(partdesc->natts);
+						memset(new_attrmap->attnums, -1,
+							   new_attrmap->maplen * sizeof(AttrNumber));
+						for (attno = 0; attno < new_attrmap->maplen; attno++)
+						{
+							AttrNumber	root_attno = partToRootMap->attnums[attno];
+
+							new_attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
+						}
+						attrmap = new_attrmap;
+					}
+
+					/* UPDATE partition. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_do_update(partrelinfo, estate, remoteslot,
+										   newtup, attrmap,
+										   &relmapentry->remoterel);
+					if (new_attrmap)
+						free_attrmap(new_attrmap);
+				}
+				else
+				{
+					/*
+					 * Different, so handle this as DELETE followed by INSERT.
+					 */
+
+					/* DELETE from partition partrelinfo. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_do_delete(partrelinfo, estate, remoteslot,
+										   &relmapentry->remoterel);
+
+					/*
+					 * Convert the replacement tuple to match the destination
+					 * partition rowtype.
+					 */
+					oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+					partinfo = partrelinfo_new->ri_PartitionInfo;
+					map = partinfo->pi_RootToPartitionMap;
+					if (map != NULL)
+					{
+						TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+						localslot = execute_attr_map_slot(map->attrMap, localslot,
+														  part_slot);
+					}
+					MemoryContextSwitchTo(oldctx);
+					/* INSERT into partition partrelinfo_new. */
+					estate->es_result_relation_info = partrelinfo_new;
+					apply_handle_do_insert(partrelinfo_new, estate,
+										   localslot);
+				}
+			}
+			break;
+
+		default:
+			elog(ERROR, "unrecognized CmdType: %d", (int) operation);
+			break;
+	}
+
+	ExecCleanupTupleRouting(mtstate, proute);
+}
+
 /*
  * Handle INSERT message.
  */
@@ -764,9 +941,13 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_insert(estate->es_result_relation_info, estate,
-						   remoteslot);
+	/* For a partitioned table, insert the tuple into a partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, rel,
+								   estate, CMD_INSERT, remoteslot, NULL);
+	else
+		apply_handle_do_insert(estate->es_result_relation_info, estate,
+							   remoteslot);
 
 	PopActiveSnapshot();
 
@@ -879,10 +1060,14 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_update(estate->es_result_relation_info, estate,
-						   remoteslot, &newtup, rel->attrmap,
-						   &rel->remoterel);
+	/* For a partitioned table, apply update to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, rel,
+								   estate, CMD_UPDATE, remoteslot, &newtup);
+	else
+		apply_handle_do_update(estate->es_result_relation_info, estate,
+							   remoteslot, &newtup, rel->attrmap,
+							   &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -944,9 +1129,13 @@ apply_handle_delete(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_delete(estate->es_result_relation_info, estate,
-						   remoteslot, &rel->remoterel);
+	/* For a partitioned table, apply delete to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, rel,
+								   estate, CMD_DELETE, remoteslot, NULL);
+	else
+		apply_handle_do_delete(estate->es_result_relation_info, estate,
+							   remoteslot, &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -988,14 +1177,43 @@ apply_handle_truncate(StringInfo s)
 		LogicalRepRelMapEntry *rel;
 
 		rel = logicalrep_rel_open(relid, RowExclusiveLock);
+
 		if (!should_apply_changes_for_rel(rel))
 		{
+			bool		really_skip = true;
+
+			/*
+			 * If we seem to have been sent a leaf partition because an
+			 * ancestor was truncated, confirm that an ancestor indeed has
+			 * a valid subscription state before proceeding to truncate the
+			 * partition.
+			 */
+			if (rel->state == SUBREL_STATE_UNKNOWN &&
+				rel->localrel->rd_rel->relispartition)
+			{
+				List	   *ancestors = get_partition_ancestors(rel->localreloid);
+				ListCell   *lc1;
+
+				foreach(lc1, ancestors)
+				{
+					Oid			anc_oid = lfirst_oid(lc1);
+					LogicalRepRelMapEntry *anc_rel;
+
+					anc_rel = logicalrep_rel_open(anc_oid, RowExclusiveLock);
+					really_skip &= !should_apply_changes_for_rel(anc_rel);
+					logicalrep_rel_close(anc_rel, RowExclusiveLock);
+				}
+			}
+
 			/*
 			 * The relation can't become interesting in the middle of the
 			 * transaction so it's safe to unlock it.
 			 */
-			logicalrep_rel_close(rel, RowExclusiveLock);
-			continue;
+			if (really_skip)
+			{
+				logicalrep_rel_close(rel, RowExclusiveLock);
+				continue;
+			}
 		}
 
 		remote_rels = lappend(remote_rels, rel);
@@ -1003,6 +1221,47 @@ apply_handle_truncate(StringInfo s)
 		relids = lappend_oid(relids, rel->localreloid);
 		if (RelationIsLogicallyLogged(rel->localrel))
 			relids_logged = lappend_oid(relids_logged, rel->localreloid);
+
+		if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		{
+			ListCell   *child;
+			List	   *children = find_all_inheritors(rel->localreloid,
+													   RowExclusiveLock,
+													   NULL);
+
+			foreach(child, children)
+			{
+				Oid			childrelid = lfirst_oid(child);
+				Relation	childrel;
+
+				if (list_member_oid(relids, childrelid))
+					continue;
+
+				/* find_all_inheritors already got lock */
+				childrel = table_open(childrelid, NoLock);
+
+				/*
+				 * It is possible that the parent table has children that are
+				 * temp tables of other backends.  We cannot safely access
+				 * such tables (because of buffering issues), and the best
+				 * thing to do is to silently ignore them.  Note that this
+				 * check is the same as one of the checks done in
+				 * truncate_check_activity() called below, still it is kept
+				 * here for simplicity.
+				 */
+				if (RELATION_IS_OTHER_TEMP(childrel))
+				{
+					table_close(childrel, RowExclusiveLock);
+					continue;
+				}
+
+				rels = lappend(rels, childrel);
+				relids = lappend_oid(relids, childrelid);
+				/* Log this relation only if needed for logical decoding */
+				if (RelationIsLogicallyLogged(childrel))
+					relids_logged = lappend_oid(relids_logged, childrelid);
+			}
+		}
 	}
 
 	/*
@@ -1012,11 +1271,11 @@ apply_handle_truncate(StringInfo s)
 	 */
 	ExecuteTruncateGuts(rels, relids, relids_logged, DROP_RESTRICT, restart_seqs);
 
-	foreach(lc, remote_rels)
+	foreach(lc, rels)
 	{
-		LogicalRepRelMapEntry *rel = lfirst(lc);
+		Relation	rel = lfirst(lc);
 
-		logicalrep_rel_close(rel, NoLock);
+		table_close(rel, NoLock);
 	}
 
 	CommandCounterIncrement();
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 059d2c9194..99ceae0d5f 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -49,6 +50,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +61,22 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If replicate_as_relid is set, the ancestor's
+	 * schema must also have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * Valid if publishing the relation's changes as changes to some ancestor,
+	 * that is, if the relation is a partition.  The map, if any, is used to
+	 * convert tuples from the partition's rowtype to the ancestor's.
+	 */
+	Oid			replicate_as_relid;
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +274,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
+
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		desc = RelationGetDescr(relation);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+/*
+ * Send a RELATION message for the given relation, plus TYPE messages as needed
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+		if (att->attisdropped || att->attgenerated)
+			continue;
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +386,65 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -411,6 +488,28 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		if (!relentry->pubactions.pubtruncate)
 			continue;
 
+		/*
+		 * If this partition was not *directly* truncated, don't bother
+		 * sending it to the subscriber.
+		 */
+		if (OidIsValid(relentry->replicate_as_relid))
+		{
+			int			j;
+			bool		can_skip_part_trunc = false;
+
+			for (j = 0; j < nrelids; j++)
+			{
+				if (relentry->replicate_as_relid == relids[j])
+				{
+					can_skip_part_trunc = true;
+					break;
+				}
+			}
+
+			if (can_skip_part_trunc)
+				continue;
+		}
+
 		relids[nrelids++] = relid;
 		maybe_send_schema(ctx, relation, relentry);
 	}
@@ -529,6 +628,11 @@ init_rel_sync_cache(MemoryContext cachectx)
 
 /*
  * Find or create entry in the relation schema cache.
+ *
+ * For a partition, the schema of the topmost published ancestor will be
+ * used in some cases instead of the partition's own, so information about
+ * the ancestors' publications is looked up here and saved in the schema
+ * cache entry.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Relation rel)
@@ -553,8 +657,11 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 	{
 		List	   *pubids = GetRelationPublications(relid);
 		ListCell   *lc,
-				   *lc1;
+				   *lc1,
+				   *lc2;
 		List	   *ancestor_pubids = NIL;
+		List	   *published_ancestors = NIL;
+		Oid			topmost_published_ancestor = InvalidOid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -579,7 +686,9 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 		/* For partitions, also consider publications of ancestors. */
 		if (rel->rd_rel->relispartition)
 			ancestor_pubids =
-				GetRelationAncestorPublications(RelationGetRelid(rel));
+				GetRelationAncestorPublications(RelationGetRelid(rel),
+												&published_ancestors);
+		Assert(list_length(ancestor_pubids) == list_length(published_ancestors));
 
 		foreach(lc, data->publications)
 		{
@@ -597,7 +706,7 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 				entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
 				break;
 
-			foreach(lc1, ancestor_pubids)
+			forboth(lc1, ancestor_pubids, lc2, published_ancestors)
 			{
 				if (lfirst_oid(lc1) == pub->oid)
 				{
@@ -605,6 +714,8 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 					entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
 					entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
 					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+					if (pub->publish_using_root_schema)
+						topmost_published_ancestor = lfirst_oid(lc2);
 				}
 			}
 
@@ -615,7 +726,9 @@ get_rel_sync_entry(PGOutputData *data, Relation rel)
 
 		list_free(pubids);
 		list_free(ancestor_pubids);
+		list_free(published_ancestors);
 
+		entry->replicate_as_relid = topmost_published_ancestor;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 9d13e5c735..0a45c11d7d 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -83,7 +83,7 @@ typedef struct Publication
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
 extern List *GetRelationPublications(Oid relid);
-extern List *GetRelationAncestorPublications(Oid relid);
+extern List *GetRelationAncestorPublications(Oid relid, List **published_ancestors);
 extern List *GetPublicationRelations(Oid pubid);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index eb0f1cd6a8..957c7b4be1 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 10;
+use Test::More tests => 16;
 
 # setup
 
@@ -41,21 +41,38 @@ $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
 
 $node_subscriber1->safe_psql('postgres',
-	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6) PARTITION BY LIST (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_1 PARTITION OF tab1_2 FOR VALUES IN (5)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
 
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
 
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
+
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub1 FOR TABLE tab1, tab1_1");
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub2 FOR TABLE tab1_2");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub3 FOR TABLE tab1 WITH (publish_using_root_schema = true)");
 
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
 
 $node_subscriber2->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub2");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub3");
 
 # Wait for initial sync of all subscriptions
 my $synced_query =
@@ -85,17 +102,26 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|1|5), 'inserts into tab1_2 replicated');
+
 # update a row (no partition change)
 
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 2 WHERE a = 1");
 
 $node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|2|5), 'update of tab1_1 replicated');
+
 # update a row (partition changes)
 
 $node_publisher->safe_psql('postgres',
@@ -112,6 +138,10 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|3|6), 'delete from tab1_1 replicated');
+
 # delete rows (some from the root parent, some directly from the partition)
 
 $node_publisher->safe_psql('postgres',
@@ -130,12 +160,18 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'delete from tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
+
 # truncate (root parent and partition directly)
 
 $node_subscriber1->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1), (2), (5)");
 $node_subscriber2->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (5)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
 
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1_2");
@@ -151,6 +187,10 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'truncate of tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(3|1|5), 'no change, because only truncate of tab1 will be replicated');
+
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1");
 
@@ -159,3 +199,7 @@ $node_publisher->wait_for_catchup('sub1');
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'truncate of tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1_1 replicated');
-- 
2.16.5
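The ancestor selection in the patch's get_rel_sync_entry() changes can be sketched as follows (a minimal Python sketch; the names and data shapes are illustrative, not the actual C structures). Because get_partition_ancestors() returns ancestors nearest parent first, letting later matches overwrite earlier ones leaves the topmost published ancestor, just as the forboth() loop repeatedly assigns topmost_published_ancestor:

```python
def topmost_published_ancestor(ancestors, rel_pubs, subscribed_pubs):
    """Pick the relation whose identity a partition's changes should be
    published as, or None to publish as the partition itself.

    ancestors: the partition's ancestors, nearest parent first, as
        get_partition_ancestors() would return them (illustrative).
    rel_pubs: maps each ancestor to the set of publication names it is
        directly part of.
    subscribed_pubs: maps each publication the walsender subscribes to
        to whether it was created with publish_using_root_schema.
    """
    result = None
    # Walk from the nearest parent upward; a later (higher) match
    # overwrites an earlier one, so the topmost published ancestor wins.
    for anc in ancestors:
        for pub in rel_pubs.get(anc, ()):
            if subscribed_pubs.get(pub):
                result = anc
    return result
```

When no ancestor is published with publish_using_root_schema, the function returns None, matching the InvalidOid case in the patch.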

#29Rafia Sabih
rafia.pghackers@gmail.com
In reply to: Amit Langote (#28)
Re: adding partitioned tables to publications

On Tue, 7 Jan 2020 at 06:02, Amit Langote <amitlangote09@gmail.com> wrote:

Rebased and updated to address your comments.

+  <para>
+   Partitioned tables are not considered when <literal>FOR ALL TABLES</literal>
+   is specified.
+  </para>
+
What is the reason for the above? I mean not for the comment itself,
but for not including partitioned tables in the FOR ALL TABLES option.

--
Regards,
Rafia Sabih

#30Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#28)
Re: adding partitioned tables to publications

On 2020-01-07 06:01, Amit Langote wrote:

On Mon, Jan 6, 2020 at 8:25 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:

Hi Amit,

I went through this patch set once again today and here are my two cents.

Thanks Rafia.

Rebased and updated to address your comments.

Looking through 0001, I think perhaps there is a better way to structure
some of the API changes.

Instead of passing the root_target_rel to CheckValidResultRel() and
CheckCmdReplicaIdentity(), which we only need to check the publication
actions of the root table, how about changing
GetRelationPublicationActions() to automatically include the publication
information of the root table. Then we have that information in the
relcache once and don't need to check the base table and the partition
root separately at each call site (of which there is only one right
now). (Would that work correctly with relcache invalidation?)

Similarly, couldn't GetRelationPublications() just automatically take
partitioning into account? We don't need the separation between
GetRelationPublications() and GetRelationAncestorPublications(). This
would also avoid errors of omission, for example the
GetRelationPublications() call in ATPrepChangePersistence() doesn't take
GetRelationAncestorPublications() into account.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
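The suggested behavior, with GetRelationPublications() folding in ancestor publications by itself, might look roughly like this (a hedged Python sketch under assumed data shapes, not PostgreSQL code):

```python
def get_relation_publications(relid, direct_pubs, parent_of):
    """Publications of a relation plus those of all its partitioning
    ancestors, so callers need no separate ancestor lookup.

    direct_pubs: relation -> set of publications it was added to.
    parent_of: partition -> its immediate parent (absent for roots).
    """
    pubs = set(direct_pubs.get(relid, set()))
    # Climb the partitioning hierarchy, unioning in each ancestor's
    # publications along the way.
    while relid in parent_of:
        relid = parent_of[relid]
        pubs |= direct_pubs.get(relid, set())
    return pubs
```

A call site like ATPrepChangePersistence() would then see ancestor publications automatically, which is the error of omission the separate-function design invites.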

#31Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Rafia Sabih (#29)
Re: adding partitioned tables to publications

On 2020-01-07 15:18, Rafia Sabih wrote:

On Tue, 7 Jan 2020 at 06:02, Amit Langote <amitlangote09@gmail.com> wrote:

Rebased and updated to address your comments.

+  <para>
+   Partitioned tables are not considered when <literal>FOR ALL TABLES</literal>
+   is specified.
+  </para>
+
What is the reason for the above? I mean not for the comment itself,
but for not including partitioned tables in the FOR ALL TABLES option.

This comment is kind of a noop, because the leaf partitions are already
included in FOR ALL TABLES, so whether partitioned tables are considered
included in FOR ALL TABLES is irrelevant. I suggest removing the
comment to avoid any confusion.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#32Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#31)
Re: adding partitioned tables to publications

On Wed, Jan 8, 2020 at 7:57 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-01-07 15:18, Rafia Sabih wrote:

On Tue, 7 Jan 2020 at 06:02, Amit Langote <amitlangote09@gmail.com> wrote:

Rebased and updated to address your comments.

+  <para>
+   Partitioned tables are not considered when <literal>FOR ALL TABLES</literal>
+   is specified.
+  </para>
+
What is the reason for the above? I mean not for the comment itself,
but for not including partitioned tables in the FOR ALL TABLES option.

This comment is kind of a noop, because the leaf partitions are already
included in FOR ALL TABLES, so whether partitioned tables are considered
included in FOR ALL TABLES is irrelevant. I suggest removing the
comment to avoid any confusion.

I agree. I had written that comment with the other feature in mind,
where changes are published using the root table's identity, but even
in that case it'd be wrong to do what it says -- partitioned tables
*should* be included there.

I will fix the patches accordingly.

Thanks,
Amit

#33Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#30)
1 attachment(s)
Re: adding partitioned tables to publications

Hi Peter,
Thanks for the review and sorry it took me a while to get back.

On Wed, Jan 8, 2020 at 7:54 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Looking through 0001, I think perhaps there is a better way to structure
some of the API changes.

Instead of passing the root_target_rel to CheckValidResultRel() and
CheckCmdReplicaIdentity(), which we only need to check the publication
actions of the root table, how about changing
GetRelationPublicationActions() to automatically include the publication
information of the root table. Then we have that information in the
relcache once and don't need to check the base table and the partition
root separately at each call site (of which there is only one right
now). (Would that work correctly with relcache invalidation?)

Similarly, couldn't GetRelationPublications() just automatically take
partitioning into account? We don't need the separation between
GetRelationPublications() and GetRelationAncestorPublications(). This
would also avoid errors of omission, for example the
GetRelationPublications() call in ATPrepChangePersistence() doesn't take
GetRelationAncestorPublications() into account.

I have addressed these comments in the attached updated patch.

Other than that, the updated patch contains following significant changes:

* Changed pg_publication.c: GetPublicationRelations() so that any
published partitioned tables are expanded as needed

* Since the pg_publication_tables view is backed by
GetPublicationRelations(), subscriptioncmds.c: fetch_table_list() no
longer needs to craft a query to include partitions, because partitions
are now included at the source. That seems better, because it confines
the complexity surrounding publication of partitioned tables to the
publication side.

* Fixed the publication table DDL to spot more cases of tables being
added to a publication in a duplicative manner: for example, a
partition being added to a publication that already contains its
ancestor, or a partitioned table being added to a publication
(implying that all of its partitions are added) that already contains
one of its partitions
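The partition expansion described in the first point can be sketched like so (illustrative Python, assuming a simple parent-to-partitions mapping in place of a pg_inherits scan; not the actual implementation):

```python
def get_publication_relations(pub_rels, partitions_of, include_partitions):
    """Sketch of an expanding GetPublicationRelations(): a published
    partitioned table contributes itself plus all of its partitions,
    transitively, when include_partitions is true.

    partitions_of: partitioned table -> list of its direct partitions.
    """
    result = []
    for rel in pub_rels:
        if include_partitions and rel in partitions_of:
            # Depth-first walk over the partition tree, collecting the
            # table itself and every descendant partition.
            stack = [rel]
            while stack:
                r = stack.pop()
                result.append(r)
                stack.extend(partitions_of.get(r, []))
        else:
            result.append(rel)
    return result
```

Consumers such as pg_publication_tables would then filter out the partitioned tables themselves, keeping only the leaf partitions that changes are actually replicated as.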

Only attaching 0001. Will send the rest after polishing them a bit more.

Thanks,
Amit

Attachments:

v9-0001-Support-adding-partitioned-tables-to-publication.patchtext/plain; charset=US-ASCII; name=v9-0001-Support-adding-partitioned-tables-to-publication.patchDownload
From 72eb76b32daa384074beaa3b3b1946db8fd154a8 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 7 Nov 2019 18:19:33 +0900
Subject: [PATCH v9] Support adding partitioned tables to publication

---
 doc/src/sgml/logical-replication.sgml       |  18 +--
 doc/src/sgml/ref/create_publication.sgml    |  20 +++-
 src/backend/catalog/pg_publication.c        | 164 ++++++++++++++++++++++---
 src/backend/commands/publicationcmds.c      |  16 ++-
 src/backend/replication/logical/tablesync.c |   1 +
 src/backend/replication/pgoutput/pgoutput.c |  19 ++-
 src/bin/pg_dump/pg_dump.c                   |   8 +-
 src/include/catalog/pg_publication.h        |   2 +-
 src/test/regress/expected/publication.out   |  30 ++++-
 src/test/regress/sql/publication.sql        |  18 ++-
 src/test/subscription/t/013_partition.pl    | 178 ++++++++++++++++++++++++++++
 11 files changed, 428 insertions(+), 46 deletions(-)
 create mode 100644 src/test/subscription/t/013_partition.pl

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f657d1d06e..fa30ac27f7 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,13 +402,17 @@
 
    <listitem>
     <para>
-     Replication is only possible from base tables to base tables.  That is,
-     the tables on the publication and on the subscription side must be normal
-     tables, not views, materialized views, partition root tables, or foreign
-     tables.  In the case of partitions, you can therefore replicate a
-     partition hierarchy one-to-one, but you cannot currently replicate to a
-     differently partitioned setup.  Attempts to replicate tables other than
-     base tables will result in an error.
+     Replication is only supported for regular and partitioned tables,
+     although the type of the table must match between the two servers; that
+     is, one cannot replicate from a regular table into a partitioned table
+     or vice versa.  Also, when replicating between partitioned tables, the
+     actual replication occurs between leaf partitions, so the partitions on
+     the two servers must match one-to-one.
+    </para>
+
+    <para>
+     Attempts to replicate other types of relations such as views, materialized
+     views, or foreign tables, will result in an error.
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..a304f9b8c3 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -68,15 +68,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
       that table is added to the publication.  If <literal>ONLY</literal> is not
       specified, the table and all its descendant tables (if any) are added.
       Optionally, <literal>*</literal> can be specified after the table name to
-      explicitly indicate that descendant tables are included.
+      explicitly indicate that descendant tables are included.  However, adding
+      a partitioned table to a publication never explicitly adds its partitions,
+      because partitions are implicitly published due to the partitioned table
+      being added to the publication.
      </para>
 
      <para>
-      Only persistent base tables can be part of a publication.  Temporary
-      tables, unlogged tables, foreign tables, materialized views, regular
-      views, and partitioned tables cannot be part of a publication.  To
-      replicate a partitioned table, add the individual partitions to the
-      publication.
+      Only persistent base tables and partitioned tables can be part of a
+      publication. Temporary tables, unlogged tables, foreign tables,
+      materialized views, and regular views cannot be part of a publication.
+     </para>
+
+     <para>
+      When a partitioned table is added to a publication, all of its existing
+      and future partitions are also implicitly considered to be part of the
+      publication.  So, even operations that are performed directly on a
+      partition are also published via its ancestors' publications.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index c5eea7af3f..c05617dec9 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -24,8 +24,10 @@
 #include "catalog/index.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
 #include "catalog/pg_type.h"
@@ -40,6 +42,8 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
+static List *get_rel_publications(Oid relid);
+
 /*
  * Check if relation can be in given publication and throws appropriate
  * error if not.
@@ -47,17 +51,9 @@
 static void
 check_publication_add_relation(Relation targetrel)
 {
-	/* Give more specific error for partitioned tables */
-	if (RelationGetForm(targetrel)->relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("\"%s\" is a partitioned table",
-						RelationGetRelationName(targetrel)),
-				 errdetail("Adding partitioned tables to publications is not supported."),
-				 errhint("You can add the table partitions individually.")));
-
-	/* Must be table */
-	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION)
+	/* Must be a regular or partitioned table */
+	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("\"%s\" is not a table",
@@ -103,7 +99,8 @@ check_publication_add_relation(Relation targetrel)
 static bool
 is_publishable_class(Oid relid, Form_pg_class reltuple)
 {
-	return reltuple->relkind == RELKIND_RELATION &&
+	return (reltuple->relkind == RELKIND_RELATION ||
+			reltuple->relkind == RELKIND_PARTITIONED_TABLE) &&
 		!IsCatalogRelationOid(relid) &&
 		reltuple->relpersistence == RELPERSISTENCE_PERMANENT &&
 		relid >= FirstNormalObjectId;
@@ -165,6 +162,10 @@ publication_add_relation(Oid pubid, Relation targetrel,
 	 * Check for duplicates. Note that this does not really prevent
 	 * duplicates, it's here just to provide nicer error message in common
 	 * case. The real protection is the unique key on the catalog.
+	 *
+	 * We give special messages for when a partition is found to be implicitly
+	 * published via an ancestor and when a partitioned table's partitions
+	 * are found to be published on their own.
 	 */
 	if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
 							  ObjectIdGetDatum(pubid)))
@@ -179,6 +180,71 @@ publication_add_relation(Oid pubid, Relation targetrel,
 				 errmsg("relation \"%s\" is already member of publication \"%s\"",
 						RelationGetRelationName(targetrel), pub->name)));
 	}
+	else if (targetrel->rd_rel->relispartition)
+	{
+		List   *ancestors = get_partition_ancestors(relid);
+		ListCell *lc;
+		Oid		ancestor;
+		bool	found = false;
+
+		foreach(lc, ancestors)
+		{
+			ancestor = lfirst_oid(lc);
+			if (SearchSysCacheExists2(PUBLICATIONRELMAP,
+									  ObjectIdGetDatum(ancestor),
+									  ObjectIdGetDatum(pubid)))
+			{
+				found = true;
+				break;
+			}
+		}
+
+		if (found)
+		{
+			Assert(OidIsValid(ancestor));
+			table_close(rel, RowExclusiveLock);
+
+			if (if_not_exists)
+				return InvalidObjectAddress;
+
+			ereport(ERROR,
+					(errcode(ERRCODE_DUPLICATE_OBJECT),
+					 errmsg("relation \"%s\" is already member of publication \"%s\" via ancestor \"%s\"",
+							RelationGetRelationName(targetrel), pub->name,
+							get_rel_name(ancestor))));
+		}
+	}
+	else if (targetrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+	{
+		List   *pub_rels = GetPublicationRelations(pubid, true);
+		List   *parts = find_all_inheritors(relid, NoLock, NULL);
+		ListCell *lc;
+		Oid		partition;
+		bool	found = false;
+
+		foreach(lc, parts)
+		{
+			partition = lfirst_oid(lc);
+			if (list_member_oid(pub_rels, partition))
+			{
+				found = true;
+				break;
+			}
+		}
+
+		if (found)
+		{
+			Assert(OidIsValid(partition));
+			table_close(rel, RowExclusiveLock);
+			ereport(ERROR,
+					(errcode(ERRCODE_DUPLICATE_OBJECT),
+					 errmsg("descendant table \"%s\" of \"%s\" is already member of publication \"%s\"",
+							get_rel_name(partition),
+							RelationGetRelationName(targetrel), pub->name),
+					 errhint("Remove descendant tables of \"%s\" from the publication before adding it to the publication.",
+							 RelationGetRelationName(targetrel))));
+		}
+	}
 
 	check_publication_add_relation(targetrel);
 
@@ -221,10 +287,35 @@ publication_add_relation(Oid pubid, Relation targetrel,
 
 
 /*
- * Gets list of publication oids for a relation oid.
+ * Gets list of publication oids for a relation, plus, if the relation is
+ * a partition, those of its ancestors.
  */
 List *
 GetRelationPublications(Oid relid)
+{
+	List	   *result = NIL;
+
+	result = get_rel_publications(relid);
+	if (get_rel_relispartition(relid))
+	{
+		List	   *ancestors = get_partition_ancestors(relid);
+		ListCell   *lc;
+
+		foreach(lc, ancestors)
+		{
+			Oid			ancestor = lfirst_oid(lc);
+			List	   *ancestor_pubs = get_rel_publications(ancestor);
+
+			result = list_concat(result, ancestor_pubs);
+		}
+	}
+
+	return result;
+}
+
+/* Workhorse of GetRelationPublications() */
+static List *
+get_rel_publications(Oid relid)
 {
 	List	   *result = NIL;
 	CatCList   *pubrellist;
@@ -251,9 +342,14 @@ GetRelationPublications(Oid relid)
  *
  * This should only be used for normal publications, the FOR ALL TABLES
  * should use GetAllTablesPublicationRelations().
+ *
+ * Callers should pass true for 'include_partitions' if the operation to
+ * be performed on the returned relations needs to see all relations
+ * affected by the publication; the partitions of any published
+ * partitioned tables are then included in the result.
  */
 List *
-GetPublicationRelations(Oid pubid)
+GetPublicationRelations(Oid pubid, bool include_partitions)
 {
 	List	   *result;
 	Relation	pubrelsrel;
@@ -278,8 +374,12 @@ GetPublicationRelations(Oid pubid)
 		Form_pg_publication_rel pubrel;
 
 		pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
-
-		result = lappend_oid(result, pubrel->prrelid);
+		if (get_rel_relkind(pubrel->prrelid) == RELKIND_PARTITIONED_TABLE &&
+			include_partitions)
+			result = list_concat(result, find_all_inheritors(pubrel->prrelid,
+															 NoLock, NULL));
+		else
+			result = lappend_oid(result, pubrel->prrelid);
 	}
 
 	systable_endscan(scan);
@@ -480,10 +580,40 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
 
 		publication = GetPublicationByName(pubname, false);
+
+		/*
+		 * Publications support partitioned tables, although we need to filter
+		 * them out from the result, because all changes are replicated using
+		 * the leaf partition identity and schema.
+		 */
 		if (publication->alltables)
+		{
+			/*
+			 * GetAllTablesPublicationRelations() only ever returns leaf
+			 * partitions.
+			 */
 			tables = GetAllTablesPublicationRelations();
+		}
 		else
-			tables = GetPublicationRelations(publication->oid);
+		{
+			List   *all_tables;
+			ListCell *lc;
+
+			/*
+			 * GetPublicationRelations() includes partitioned tables in its
+			 * result, as required by its other internal callers, but they
+			 * must be filtered out here.
+			 */
+			all_tables = GetPublicationRelations(publication->oid, true);
+			tables = NIL;
+			foreach(lc, all_tables)
+			{
+				Oid		relid = lfirst_oid(lc);
+
+				if (get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE)
+					tables = lappend_oid(tables, relid);
+			}
+		}
 		funcctx->user_fctx = (void *) tables;
 
 		MemoryContextSwitchTo(oldcontext);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f96cb42adc..d4b43e7662 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -299,7 +299,7 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	}
 	else
 	{
-		List	   *relids = GetPublicationRelations(pubform->oid);
+		List	   *relids = GetPublicationRelations(pubform->oid, true);
 
 		/*
 		 * We don't want to send too many individual messages, at some point
@@ -356,7 +356,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 		PublicationDropTables(pubid, rels, false);
 	else						/* DEFELEM_SET */
 	{
-		List	   *oldrelids = GetPublicationRelations(pubid);
+		List	   *oldrelids = GetPublicationRelations(pubid, false);
 		List	   *delrels = NIL;
 		ListCell   *oldlc;
 
@@ -498,7 +498,8 @@ RemovePublicationRelById(Oid proid)
 
 /*
  * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode.
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
  */
 static List *
 OpenTableList(List *tables)
@@ -539,8 +540,13 @@ OpenTableList(List *tables)
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 
-		/* Add children of this rel, if requested */
-		if (recurse)
+		/*
+		 * Add children of this rel, if requested, so that they too are added
+		 * to the publication.  A partitioned table can't have any inheritance
+		 * children other than its partitions, which need not be explicitly
+		 * added to the publication.
+		 */
+		if (recurse && rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
 		{
 			List	   *children;
 			ListCell   *child;
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f8183cd488..98825f01e9 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -761,6 +761,7 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
+	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 752508213a..d6b9cbe1bd 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -50,7 +50,12 @@ static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
 
-/* Entry in the map used to remember which relation schemas we sent. */
+/*
+ * Entry in the map used to remember which relation schemas we sent.
+ *
+ * For partitions, 'pubactions' considers not only the table's own
+ * publications, but also those of all of its ancestors.
+ */
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
@@ -406,6 +411,13 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		if (!relentry->pubactions.pubtruncate)
 			continue;
 
+		/*
+		 * Don't send partitioned tables, because their partitions are
+		 * sent instead.
+		 */
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+			continue;
+
 		relids[nrelids++] = relid;
 		maybe_send_schema(ctx, relation, relentry);
 	}
@@ -524,6 +536,11 @@ init_rel_sync_cache(MemoryContext cachectx)
 
 /*
  * Find or create entry in the relation schema cache.
+ *
+ * This looks up publications that the given relation is directly or
+ * indirectly part of (the latter when an ancestor of the relation is part
+ * of a publication) and fills the found entry with information about
+ * which operations to publish.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 799b6988b7..dc33c20048 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3969,8 +3969,12 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 	{
 		TableInfo  *tbinfo = &tblinfo[i];
 
-		/* Only plain tables can be aded to publications. */
-		if (tbinfo->relkind != RELKIND_RELATION)
+		/*
+		 * Only regular and partitioned tables can be added to
+		 * publications.
+		 */
+		if (tbinfo->relkind != RELKIND_RELATION &&
+			tbinfo->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
 
 		/*
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 6cdc2b1197..04a8b87e78 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -80,7 +80,7 @@ typedef struct Publication
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
 extern List *GetRelationPublications(Oid relid);
-extern List *GetPublicationRelations(Oid pubid);
+extern List *GetPublicationRelations(Oid pubid, bool include_partitions);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
 
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..d1d9b90c50 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -116,6 +116,31 @@ Tables:
 
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+-- should add only the parent to publication, not the partition
+CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+
+-- fail - can't re-add partition
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted1;
+ERROR:  relation "testpub_parted1" is already member of publication "testpub_forparted" via ancestor "testpub_parted"
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted1;
+-- fail - can't re-add partition
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+ERROR:  descendent table "testpub_parted1" of "testpub_parted"is already member of publication "testpub_forparted"
+HINT:  Remove descendent tables of "testpub_parted" from publication before adding it to the publication.
+DROP PUBLICATION testpub_forparted;
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
@@ -142,11 +167,6 @@ Tables:
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
 DETAIL:  Only tables can be added to publications.
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
-ERROR:  "testpub_parted" is a partitioned table
-DETAIL:  Adding partitioned tables to publications is not supported.
-HINT:  You can add the table partitions individually.
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default ADD TABLE pub_test.testpub_nopk;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a755cf..7074c08efd 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -69,6 +69,22 @@ RESET client_min_messages;
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
 
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+-- should add only the parent to publication, not the partition
+CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+-- fail - can't re-add partition
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted1;
+ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted1;
+-- fail - can't re-add partition
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+DROP PUBLICATION testpub_forparted;
+
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 SET client_min_messages = 'ERROR';
@@ -83,8 +99,6 @@ CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 
 -- fail - view
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
 
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
new file mode 100644
index 0000000000..1fa392b618
--- /dev/null
+++ b/src/test/subscription/t/013_partition.pl
@@ -0,0 +1,178 @@
+# Test PARTITION
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 15;
+
+# setup
+
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+my $node_subscriber1 = get_new_node('subscriber1');
+$node_subscriber1->init(allows_streaming => 'logical');
+$node_subscriber1->start;
+
+my $node_subscriber2 = get_new_node('subscriber2');
+$node_subscriber2->init(allows_streaming => 'logical');
+$node_subscriber2->start;
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# publisher
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub1");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub_all FOR ALL TABLES");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub1 ADD TABLE tab1, tab1_1");
+
+# subscriber1
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
+
+# subscriber 2
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub_all");
+
+# Wait for initial sync of all subscriptions
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber2->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# insert
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_1 (a) VALUES (3)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+my $result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|2|1|3), 'inserts into tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
+is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
+
+# update (no partition change)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 2 WHERE a = 1");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|2|2|3), 'update of tab1_1 replicated');
+
+# update (partition changes)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|3|6), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
+is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
+
+# delete
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1 WHERE a IN (3, 5)");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1_2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(0||), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(0||), 'delete from tab1_2 replicated');
+
+# truncate
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (2)");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1_2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(2|1|2), 'truncate of tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(0||), 'truncate of tab1_2 replicated');
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1_1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1_1 replicated');
-- 
2.16.5

#34Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#33)
4 attachment(s)
Re: adding partitioned tables to publications

On Wed, Jan 22, 2020 at 2:38 PM Amit Langote <amitlangote09@gmail.com> wrote:

Other than that, the updated patch contains following significant changes:

* Changed pg_publication.c: GetPublicationRelations() so that any
published partitioned tables are expanded as needed

* Since the pg_publication_tables view is backed by
GetPublicationRelations(), that means subscriptioncmds.c:
fetch_table_list() no longer needs to craft a query to include
partitions when needed, because partitions are included at source.
That seems better, because it allows to limit the complexity
surrounding publication of partitioned tables to the publication side.

* Fixed the publication table DDL to spot more cases of tables being
added to a publication in a duplicative manner. For example,
partition being added to a publication which already contains its
ancestor, and a partitioned table being added to a publication
(implying all of its partitions are added) which already contains a
partition

On second thought, this seems like overkill. It might be OK after
all for both a partitioned table and its partitions to be explicitly
added to a publication without complaining of duplication. IOW, it's
the user's call whether it makes sense to do that or not.
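
For example, under that relaxed behavior the following would simply be
accepted (a sketch; table and publication names are made up for
illustration):

```sql
create table p (a int) partition by list (a);
create table p1 partition of p for values in (1);

create publication pub for table p;
-- with the duplication checks removed, this is accepted even though
-- p1 is already published via its ancestor p
alter publication pub add table p1;
```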

Only attaching 0001.

Attached is an updated 0001 that takes the above into account, along
with the rest of the patches that add support for replicating
partitioned tables using their own identity and schema. I have
reorganized the other patches as follows:

0002: refactoring of logical/worker.c without any functionality
changes (contains much less churn than in earlier versions)

0003: support logical replication into partitioned tables on the
subscription side (allows replicating from a non-partitioned table on
publisher node into a partitioned table on subscriber node)

0004: support optionally replicating partitioned table changes (and
changes directly made to partitions) using root partitioned table
identity and schema
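
As a rough sketch of what 0004 would enable (the option name shown
follows the publication parameter eventually committed as
publish_via_partition_root; the exact spelling in this patch version may
differ):

```sql
-- changes to p's partitions are published using p's own identity and
-- schema, so the subscriber's table need not be partitioned the same
-- way, or partitioned at all
create publication pub_root for table p
  with (publish_via_partition_root = true);
```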

Thanks,
Amit

Attachments:

v10-0003-Add-subscription-support-to-replicate-into-parti.patchtext/plain; charset=US-ASCII; name=v10-0003-Add-subscription-support-to-replicate-into-parti.patchDownload
From d7545b676e39edf77a6944d2b63c6dea25bb7569 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlangote09@gmail.com>
Date: Thu, 23 Jan 2020 11:49:01 +0900
Subject: [PATCH v10 3/4] Add subscription support to replicate into
 partitioned tables

Mainly, this adds support code in logical/worker.c for applying
replicated operations whose target is a partitioned table to its
relevant partitions.
---
 src/backend/executor/execReplication.c      |  14 +-
 src/backend/replication/logical/relation.c  | 161 +++++++++++++++++++
 src/backend/replication/logical/tablesync.c |  28 ++--
 src/backend/replication/logical/worker.c    | 232 ++++++++++++++++++++++++++--
 src/include/replication/logicalrelation.h   |   2 +
 src/test/subscription/t/013_partition.pl    |   7 +-
 6 files changed, 413 insertions(+), 31 deletions(-)

diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 582b0cb017..635b29d050 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -591,17 +591,9 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * Give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -609,7 +601,7 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 3d7291b970..54189d7965 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -34,6 +34,7 @@ static MemoryContext LogicalRepRelMapContext = NULL;
 
 static HTAB *LogicalRepRelMap = NULL;
 static HTAB *LogicalRepTypMap = NULL;
+static HTAB *LogicalRepPartMap = NULL;
 
 
 /*
@@ -472,3 +473,163 @@ logicalrep_typmap_gettypname(Oid remoteid)
 	Assert(OidIsValid(entry->remoteid));
 	return psprintf("%s.%s", entry->nspname, entry->typname);
 }
+
+/*
+ * Partition cache: look up partition LogicalRepRelMapEntry's
+ *
+ * Unlike relation map cache, this is keyed by partition OID, not remote
+ * relation OID, because we only have to use this cache in the case where
+ * partitions are not directly mapped to any remote relation, such as when
+ * replication is occurring with one of their ancestors as target.
+ */
+
+/*
+ * Relcache invalidation callback
+ */
+static void
+logicalrep_partmap_invalidate_cb(Datum arg, Oid reloid)
+{
+	LogicalRepRelMapEntry *entry;
+
+	/* Just to be sure. */
+	if (LogicalRepPartMap == NULL)
+		return;
+
+	if (reloid != InvalidOid)
+	{
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		/* TODO, use inverse lookup hashtable? */
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+		{
+			if (entry->localreloid == reloid)
+			{
+				entry->localreloid = InvalidOid;
+				hash_seq_term(&status);
+				break;
+			}
+		}
+	}
+	else
+	{
+		/* invalidate all cache entries */
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+			entry->localreloid = InvalidOid;
+	}
+}
+
+/*
+ * Initialize the partition map cache.
+ */
+static void
+logicalrep_partmap_init(void)
+{
+	HASHCTL		ctl;
+
+	if (!LogicalRepRelMapContext)
+		LogicalRepRelMapContext =
+			AllocSetContextCreate(CacheMemoryContext,
+								  "LogicalRepPartMapContext",
+								  ALLOCSET_DEFAULT_SIZES);
+
+	/* Initialize the relation hash table. */
+	MemSet(&ctl, 0, sizeof(ctl));
+	ctl.keysize = sizeof(Oid);	/* partition OID */
+	ctl.entrysize = sizeof(LogicalRepRelMapEntry);
+	ctl.hcxt = LogicalRepRelMapContext;
+
+	LogicalRepPartMap = hash_create("logicalrep partition map cache", 64, &ctl,
+								   HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+	/* Watch for invalidation events. */
+	CacheRegisterRelcacheCallback(logicalrep_partmap_invalidate_cb,
+								  (Datum) 0);
+}
+
+/*
+ * logicalrep_partition_open
+ *
+ * Returned entry reuses most of the values of the root table's entry, save
+ * the attribute map, which can be different for the partition.
+ */
+LogicalRepRelMapEntry *
+logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map)
+{
+	LogicalRepRelMapEntry *entry;
+	LogicalRepRelation *remoterel = &root->remoterel;
+	Oid			partOid = RelationGetRelid(partrel);
+	AttrMap	   *attrmap = root->attrmap;
+	bool		found;
+	int			i;
+	MemoryContext oldctx;
+
+	if (LogicalRepPartMap == NULL)
+		logicalrep_partmap_init();
+
+	/* Search for existing entry. */
+	entry = hash_search(LogicalRepPartMap, (void *) &partOid,
+						HASH_ENTER, &found);
+
+	if (found)
+		return entry;
+
+	memset(entry, 0, sizeof(LogicalRepRelMapEntry));
+
+	/* Make cached copy of the data */
+	oldctx = MemoryContextSwitchTo(LogicalRepRelMapContext);
+
+	/* Remote relation is used as-is from the root's entry. */
+	entry->remoterel.remoteid = remoterel->remoteid;
+	entry->remoterel.nspname = pstrdup(remoterel->nspname);
+	entry->remoterel.relname = pstrdup(remoterel->relname);
+	entry->remoterel.natts = remoterel->natts;
+	entry->remoterel.attnames = palloc(remoterel->natts * sizeof(char *));
+	entry->remoterel.atttyps = palloc(remoterel->natts * sizeof(Oid));
+	for (i = 0; i < remoterel->natts; i++)
+	{
+		entry->remoterel.attnames[i] = pstrdup(remoterel->attnames[i]);
+		entry->remoterel.atttyps[i] = remoterel->atttyps[i];
+	}
+	entry->remoterel.replident = remoterel->replident;
+	entry->remoterel.attkeys = bms_copy(remoterel->attkeys);
+
+	entry->localrel = partrel;
+	entry->localreloid = partOid;
+
+	/*
+	 * If the partition's attributes don't match the root relation's, we'll
+	 * need to make a new attrmap which maps partition attribute numbers to
+	 * remoterel's, instead of the original, which maps the root relation's
+	 * attribute numbers to remoterel's.
+	 */
+	if (map)
+	{
+		AttrNumber	attno;
+
+		entry->attrmap = make_attrmap(map->maplen);
+		memset(entry->attrmap->attnums, -1,
+			   entry->attrmap->maplen * sizeof(AttrNumber));
+		for (attno = 0; attno < entry->attrmap->maplen; attno++)
+		{
+			AttrNumber	root_attno = map->attnums[attno];
+
+			entry->attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
+		}
+	}
+	else
+		entry->attrmap = attrmap;
+
+	entry->updatable = root->updatable;
+
+	/* state and statelsn are left set to 0. */
+	MemoryContextSwitchTo(oldctx);
+
+	return entry;
+}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 98825f01e9..6a18b78f22 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -630,16 +630,17 @@ copy_read_data(void *outbuf, int minread, int maxread)
 
 /*
  * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication.  XXX - while we fetch relkind too
+ * here, the RELATION message doesn't provide it
  */
 static void
 fetch_remote_table_info(char *nspname, char *relname,
-						LogicalRepRelation *lrel)
+						LogicalRepRelation *lrel, char *relkind)
 {
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {OIDOID, CHAROID};
+	Oid			tableRow[3] = {OIDOID, CHAROID, CHAROID};
 	Oid			attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
 	bool		isnull;
 	int			natt;
@@ -649,16 +650,16 @@ fetch_remote_table_info(char *nspname, char *relname,
 
 	/* First fetch Oid and replica identity. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident"
+	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident, c.relkind"
 					 "  FROM pg_catalog.pg_class c"
 					 "  INNER JOIN pg_catalog.pg_namespace n"
 					 "        ON (c.relnamespace = n.oid)"
 					 " WHERE n.nspname = %s"
 					 "   AND c.relname = %s"
-					 "   AND c.relkind = 'r'",
+					 "   AND pg_relation_is_publishable(c.oid)",
 					 quote_literal_cstr(nspname),
 					 quote_literal_cstr(relname));
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 
 	if (res->status != WALRCV_OK_TUPLES)
 		ereport(ERROR,
@@ -675,6 +676,8 @@ fetch_remote_table_info(char *nspname, char *relname,
 	Assert(!isnull);
 	lrel->replident = DatumGetChar(slot_getattr(slot, 2, &isnull));
 	Assert(!isnull);
+	*relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+	Assert(!isnull);
 
 	ExecDropSingleTupleTableSlot(slot);
 	walrcv_clear_result(res);
@@ -750,10 +753,12 @@ copy_table(Relation rel)
 	CopyState	cstate;
 	List	   *attnamelist;
 	ParseState *pstate;
+	char		remote_relkind;
 
 	/* Get the publisher relation info. */
 	fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
-							RelationGetRelationName(rel), &lrel);
+							RelationGetRelationName(rel), &lrel,
+							&remote_relkind);
 
 	/* Put the relation into relmap. */
 	logicalrep_relmap_update(&lrel);
@@ -761,12 +766,15 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
-	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "COPY %s TO STDOUT",
-					 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	if (remote_relkind == RELKIND_PARTITIONED_TABLE)
+		appendStringInfo(&cmd, "COPY (SELECT * FROM %s) TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	else
+		appendStringInfo(&cmd, "COPY %s TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
 	res = walrcv_exec(wrconn, cmd.data, 0, NULL);
 	pfree(cmd.data);
 	if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 86601f6e8f..a48537db0c 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -29,11 +29,14 @@
 #include "access/xlog_internal.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_subscription.h"
 #include "catalog/pg_subscription_rel.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "executor/nodeModifyTable.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
@@ -720,6 +723,152 @@ apply_handle_do_delete(ResultRelInfo *relinfo, EState *estate,
 	EvalPlanQualEnd(&epqstate);
 }
 
+/*
+ * This handles insert, update, delete on a partitioned table.
+ */
+static void
+apply_handle_tuple_routing(ResultRelInfo *relinfo,
+						   EState *estate,
+						   TupleTableSlot *remoteslot,
+						   LogicalRepTupleData *newtup,
+						   LogicalRepRelMapEntry *relmapentry,
+						   CmdType operation)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	ModifyTableState *mtstate = NULL;
+	PartitionTupleRouting *proute = NULL;
+	ResultRelInfo *partrelinfo;
+	TupleTableSlot *localslot;
+	PartitionRoutingInfo *partinfo;
+	TupleConversionMap *map;
+	MemoryContext oldctx;
+
+	/* ModifyTableState is needed for ExecFindPartition(). */
+	mtstate = makeNode(ModifyTableState);
+	mtstate->ps.plan = NULL;
+	mtstate->ps.state = estate;
+	mtstate->operation = operation;
+	mtstate->resultRelInfo = relinfo;
+	proute = ExecSetupPartitionTupleRouting(estate, mtstate, rel);
+
+	/*
+	 * Find a partition for the tuple contained in remoteslot.
+	 *
+	 * For insert, remoteslot is tuple to insert.  For update and delete, it
+	 * is the tuple to be replaced and deleted, respectively.
+	 */
+	Assert(remoteslot != NULL);
+	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+	/* The following throws an error if a suitable partition is not found. */
+	partrelinfo = ExecFindPartition(mtstate, relinfo, proute,
+									remoteslot, estate);
+	Assert(partrelinfo != NULL);
+	/* Convert the tuple to match the partition's rowtype. */
+	partinfo = partrelinfo->ri_PartitionInfo;
+	map = partinfo->pi_RootToPartitionMap;
+	if (map != NULL)
+	{
+		TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+		remoteslot = execute_attr_map_slot(map->attrMap, remoteslot,
+										   part_slot);
+	}
+	MemoryContextSwitchTo(oldctx);
+
+	switch (operation)
+	{
+		case CMD_INSERT:
+			/* Just insert into the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_do_insert(partrelinfo, estate, remoteslot);
+			break;
+
+		case CMD_DELETE:
+			/* Just delete from the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_do_delete(partrelinfo, estate, remoteslot,
+								   &relmapentry->remoterel);
+			break;
+
+		case CMD_UPDATE:
+			{
+				ResultRelInfo *partrelinfo_new;
+
+				/*
+				 * partrelinfo computed above is the partition which might
+				 * contain the search tuple.  Now find the partition for the
+				 * replacement tuple, which might not be the same as
+				 * partrelinfo.
+				 */
+				localslot = table_slot_create(rel, &estate->es_tupleTable);
+				oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+				slot_modify_cstrings(localslot, remoteslot, relmapentry,
+									 newtup->values, newtup->changed);
+				partrelinfo_new = ExecFindPartition(mtstate, relinfo, proute,
+													localslot, estate);
+
+				MemoryContextSwitchTo(oldctx);
+
+				/*
+				 * If both the search and replacement tuples would be in the
+				 * same partition, we can apply this as an UPDATE on the partition.
+				 */
+				if (partrelinfo == partrelinfo_new)
+				{
+					Relation	partrel = partrelinfo->ri_RelationDesc;
+					AttrMap	   *attrmap = map ? map->attrMap : NULL;
+					LogicalRepRelMapEntry *part_entry;
+
+					part_entry = logicalrep_partition_open(relmapentry,
+														   partrel, attrmap);
+
+					/* UPDATE partition. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_do_update(partrelinfo, estate, remoteslot,
+										   newtup, part_entry);
+				}
+				else
+				{
+					/*
+					 * Different, so handle this as DELETE followed by INSERT.
+					 */
+
+					/* DELETE from partition partrelinfo. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_do_delete(partrelinfo, estate, remoteslot,
+										   &relmapentry->remoterel);
+
+					/*
+					 * Convert the replacement tuple to match the destination
+					 * partition rowtype.
+					 */
+					oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+					partinfo = partrelinfo_new->ri_PartitionInfo;
+					map = partinfo->pi_RootToPartitionMap;
+					if (map != NULL)
+					{
+						TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+						localslot = execute_attr_map_slot(map->attrMap, localslot,
+														  part_slot);
+					}
+					MemoryContextSwitchTo(oldctx);
+					/* INSERT into partition partrelinfo_new. */
+					estate->es_result_relation_info = partrelinfo_new;
+					apply_handle_do_insert(partrelinfo_new, estate,
+										   localslot);
+				}
+			}
+			break;
+
+		default:
+			elog(ERROR, "unrecognized CmdType: %d", (int) operation);
+			break;
+	}
+
+	ExecCleanupTupleRouting(mtstate, proute);
+}
+
 /*
  * Handle INSERT message.
  */
@@ -762,9 +911,13 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_insert(estate->es_result_relation_info, estate,
-						   remoteslot);
+	/* For a partitioned table, insert the tuple into a partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_INSERT);
+	else
+		apply_handle_do_insert(estate->es_result_relation_info, estate,
+							   remoteslot);
 
 	PopActiveSnapshot();
 
@@ -877,9 +1030,13 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_update(estate->es_result_relation_info, estate,
-						   remoteslot, &newtup, rel);
+	/* For a partitioned table, apply update to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, &newtup, rel, CMD_UPDATE);
+	else
+		apply_handle_do_update(estate->es_result_relation_info, estate,
+							   remoteslot, &newtup, rel);
 
 	PopActiveSnapshot();
 
@@ -940,9 +1097,13 @@ apply_handle_delete(StringInfo s)
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_delete(estate->es_result_relation_info, estate,
-						   remoteslot, &rel->remoterel);
+	/* For a partitioned table, apply delete to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_DELETE);
+	else
+		apply_handle_do_delete(estate->es_result_relation_info, estate,
+							   remoteslot, &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -970,6 +1131,7 @@ apply_handle_truncate(StringInfo s)
 	List	   *remote_relids = NIL;
 	List	   *remote_rels = NIL;
 	List	   *rels = NIL;
+	List	   *part_rels = NIL;
 	List	   *relids = NIL;
 	List	   *relids_logged = NIL;
 	ListCell   *lc;
@@ -999,6 +1161,52 @@ apply_handle_truncate(StringInfo s)
 		relids = lappend_oid(relids, rel->localreloid);
 		if (RelationIsLogicallyLogged(rel->localrel))
 			relids_logged = lappend_oid(relids_logged, rel->localreloid);
+
+		/*
+		 * Truncate partitions if we got a message to truncate a partitioned
+		 * table.
+		 */
+		if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		{
+			ListCell   *child;
+			List	   *children = find_all_inheritors(rel->localreloid,
+													   RowExclusiveLock,
+													   NULL);
+
+			foreach(child, children)
+			{
+				Oid			childrelid = lfirst_oid(child);
+				Relation	childrel;
+
+				if (list_member_oid(relids, childrelid))
+					continue;
+
+				/* find_all_inheritors already got lock */
+				childrel = table_open(childrelid, NoLock);
+
+				/*
+				 * It is possible that the parent table has children that are
+				 * temp tables of other backends.  We cannot safely access
+				 * such tables (because of buffering issues), and the best
+				 * thing to do is to silently ignore them.  Note that this
+				 * check is the same as one of the checks done in
+				 * truncate_check_activity() called below, still it is kept
+				 * here for simplicity.
+				 */
+				if (RELATION_IS_OTHER_TEMP(childrel))
+				{
+					table_close(childrel, RowExclusiveLock);
+					continue;
+				}
+
+				rels = lappend(rels, childrel);
+				part_rels = lappend(part_rels, childrel);
+				relids = lappend_oid(relids, childrelid);
+				/* Log this relation only if needed for logical decoding */
+				if (RelationIsLogicallyLogged(childrel))
+					relids_logged = lappend_oid(relids_logged, childrelid);
+			}
+		}
 	}
 
 	/*
@@ -1014,6 +1222,12 @@ apply_handle_truncate(StringInfo s)
 
 		logicalrep_rel_close(rel, NoLock);
 	}
+	foreach(lc, part_rels)
+	{
+		Relation rel = lfirst(lc);
+
+		table_close(rel, NoLock);
+	}
 
 	CommandCounterIncrement();
 }
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 9971a8028c..4650b4f9e1 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -34,6 +34,8 @@ extern void logicalrep_relmap_update(LogicalRepRelation *remoterel);
 
 extern LogicalRepRelMapEntry *logicalrep_rel_open(LogicalRepRelId remoteid,
 												  LOCKMODE lockmode);
+extern LogicalRepRelMapEntry *logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map);
 extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
 								 LOCKMODE lockmode);
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 1fa392b618..1ec487154b 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -42,10 +42,15 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
+
 $node_subscriber1->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
 $node_subscriber1->safe_psql('postgres',
-	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6) PARTITION BY LIST (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_1 PARTITION OF tab1_2 FOR VALUES IN (5)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
 
-- 
2.16.5

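To illustrate what the subscriber-side tuple routing in the patch above enables, the subscription side may now carve up a replicated partitioned table differently from the publisher; the new regression test does exactly this by sub-partitioning tab1_2 on the subscriber only (a sketch using the test's own table names):

-- subscriber: tab1_2 is itself partitioned, unlike on the publisher;
-- rows replicated for tab1_2 are routed into its sub-partitions
create table tab1_2 partition of tab1 (c default 'sub1_tab1')
    for values in (5, 6) partition by list (a);
create table tab1_2_1 partition of tab1_2 for values in (5);
create table tab1_2_2 partition of tab1_2 for values in (6);

Changes arriving for tab1_2 are applied by routing each tuple through the local partition tree, rather than requiring a one-to-one partition match.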
Attachment: v10-0004-Publish-partitioned-table-inserts-as-its-own.patch (text/plain, US-ASCII)
From 61e983f85ccd309fd8aea2ae2119db278812b119 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v10 4/4] Publish partitioned table inserts as its own

To control whether partition changes are replicated using their
own identity (and schema) or an ancestor's, add a new parameter
that can be set per publication, named 'publish_using_root_schema'.
---
 doc/src/sgml/logical-replication.sgml       |  12 +-
 doc/src/sgml/ref/create_publication.sgml    |  17 +++
 src/backend/catalog/partition.c             |   9 ++
 src/backend/catalog/pg_publication.c        |  98 ++++++++++---
 src/backend/commands/publicationcmds.c      |  95 ++++++++-----
 src/backend/commands/tablecmds.c            |   2 +-
 src/backend/executor/nodeModifyTable.c      |   4 +
 src/backend/replication/pgoutput/pgoutput.c | 211 ++++++++++++++++++++++------
 src/backend/utils/cache/relcache.c          |   7 +-
 src/bin/pg_dump/pg_dump.c                   |  22 ++-
 src/bin/pg_dump/pg_dump.h                   |   1 +
 src/bin/psql/describe.c                     |  17 ++-
 src/include/catalog/partition.h             |   1 +
 src/include/catalog/pg_publication.h        |   7 +-
 src/test/regress/expected/publication.out   | 103 ++++++++------
 src/test/regress/sql/publication.sql        |   3 +
 src/test/subscription/t/013_partition.pl    | 170 +++++++++++++++++++++-
 17 files changed, 611 insertions(+), 168 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index fa30ac27f7..98da594eeb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,16 +402,8 @@
 
    <listitem>
     <para>
-     Replication is only supported by regular and partitioned tables, although
-     the type of the table must match between the two servers, that is, one
-     cannot replicate from a regular table into a partitioned able or vice
-     versa. Also, when replicating between partitioned tables, the actual
-     replication occurs between leaf partitions, so the partitions on the two
-     servers must match one-to-one.
-    </para>
-
-    <para>
-     Attempts to replicate other types of relations such as views, materialized
+     Replication is only supported by regular and partitioned tables.
+     Attempts to replicate other types of relations such as views, materialized
      views, or foreign tables, will result in an error.
     </para>
    </listitem>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index a304f9b8c3..b51701a623 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -122,6 +122,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_using_root_schema</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table (or on its partitions) contained in the publication will be
+          published using its own schema rather than of the individual
+          partitions which are actually changed; the latter is the default.
+          Setting it to <literal>true</literal> allows the changes to be
+          replicated into a non-partitioned table or a partitioned table
+          consisting of a different set of partitions.  However,
+          <literal>TRUNCATE</literal> operations performed directly on
+          partitions are not replicated.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 239ac017fa..07853b85d5 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -28,6 +28,7 @@
 #include "partitioning/partbounds.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/partcache.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
@@ -126,6 +127,14 @@ get_partition_ancestors(Oid relid)
 	return result;
 }
 
+/* Is given relation a leaf partition? */
+bool
+is_leaf_partition(Oid relid)
+{
+	return	get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE &&
+			get_rel_relispartition(relid);
+}
+
 /*
  * get_partition_ancestors_worker
  *		recursive worker for get_partition_ancestors
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 14d4ad3abd..6fac401ed4 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -224,13 +224,30 @@ publication_add_relation(Oid pubid, Relation targetrel,
 /*
  * Gets list of publication oids for a relation, plus those of ancestors,
  * if any, if the relation is a partition.
+ *
+ * *published_rels, if asked for, will contain the OID of the relation for
+ * each publication returned, that is, of the relation that is actually
+ * published.  Examining this list allows the caller, for instance, to
+ * distinguish publications that it is directly part of from those that it is
+ * indirectly part of via an ancestor.
  */
 List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Oid relid, List **published_rels)
 {
 	List	   *result = NIL;
+	int			i,
+				num;
+
+	if (published_rels)
+		*published_rels = NIL;
 
 	result = get_rel_publications(relid);
+	if (published_rels)
+	{
+		num = list_length(result);
+		for (i = 0; i < num; i++)
+			*published_rels = lappend_oid(*published_rels, relid);
+	}
 	if (get_rel_relispartition(relid))
 	{
 		List	   *ancestors = get_partition_ancestors(relid);
@@ -242,6 +259,12 @@ GetRelationPublications(Oid relid)
 			List	   *ancestor_pubs = get_rel_publications(ancestor);
 
 			result = list_concat(result, ancestor_pubs);
+			if (published_rels)
+			{
+				num = list_length(ancestor_pubs);
+				for (i = 0; i < num; i++)
+					*published_rels = lappend_oid(*published_rels, ancestor);
+			}
 		}
 	}
 
@@ -362,9 +385,13 @@ GetAllTablesPublications(void)
 
 /*
  * Gets list of all relation published by FOR ALL TABLES publication(s).
+ *
+ * If the publication publishes partition changes via their respective root
+ * partitioned tables, we must exclude partitions in favor of including the
+ * root partitioned tables.
  */
 List *
-GetAllTablesPublicationRelations(void)
+GetAllTablesPublicationRelations(bool pubasroot)
 {
 	Relation	classRel;
 	ScanKeyData key[1];
@@ -386,12 +413,35 @@ GetAllTablesPublicationRelations(void)
 		Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
 		Oid			relid = relForm->oid;
 
-		if (is_publishable_class(relid, relForm))
+		if (is_publishable_class(relid, relForm) &&
+			!(relForm->relispartition && pubasroot))
 			result = lappend_oid(result, relid);
 	}
 
 	table_endscan(scan);
-	table_close(classRel, AccessShareLock);
+
+	if (pubasroot)
+	{
+		ScanKeyInit(&key[0],
+					Anum_pg_class_relkind,
+					BTEqualStrategyNumber, F_CHAREQ,
+					CharGetDatum(RELKIND_PARTITIONED_TABLE));
+
+		scan = table_beginscan_catalog(classRel, 1, key);
+
+		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+		{
+			Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
+			Oid			relid = relForm->oid;
+
+			if (is_publishable_class(relid, relForm) &&
+				!relForm->relispartition)
+				result = lappend_oid(result, relid);
+		}
+
+		table_endscan(scan);
+		table_close(classRel, AccessShareLock);
+	}
 
 	return result;
 }
@@ -422,6 +472,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->pubasroot = pubform->pubasroot;
 
 	ReleaseSysCache(tup);
 
@@ -518,36 +569,45 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 
 		/*
 		 * Publications support partitioned tables, although we need to filter
-		 * them out from the result, because all changes are replicated using
-		 * the leaf partition identity and schema.
+		 * them out from the result unless the publication replicates changes
+		 * using the root schema.  In other cases, we return only their leaf
+		 * partitions, because all changes are replicated using the leaf
+		 * partition identity and schema.
 		 */
 		if (publication->alltables)
 		{
-			/*
-			 * GetAllTablesPublicationRelations() only ever returns leaf
-			 * partitions.
-			 */
-			tables = GetAllTablesPublicationRelations();
+			tables = GetAllTablesPublicationRelations(publication->pubasroot);
 		}
 		else
 		{
 			List   *all_tables;
 			ListCell *lc;
 
+			/*
+			 * Only need partitions if not replicating partitioned table
+			 * changes using the root schema.
+			 */
+			all_tables = GetPublicationRelations(publication->oid,
+												 !publication->pubasroot);
+
 			/*
 			 * GetPublicationRelations() includes partitioned tables in its
 			 * result which is required by other internal users of that
-			 * function, which must be filtered out.
+			 * function, which must be filtered out if needed.
 			 */
-			all_tables = GetPublicationRelations(publication->oid, true);
-			tables = NIL;
-			foreach(lc, all_tables)
+			if (!publication->pubasroot)
 			{
-				Oid		relid = lfirst_oid(lc);
-
-				if (get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE)
-					tables = lappend_oid(tables, relid);
+				tables = NIL;
+				foreach(lc, all_tables)
+				{
+					Oid		relid = lfirst_oid(lc);
+
+					if (get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE)
+						tables = lappend_oid(tables, relid);
+				}
 			}
+			else
+				tables = all_tables;
 		}
 		funcctx->user_fctx = (void *) tables;
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index d4b43e7662..309ee77650 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -23,6 +23,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -55,20 +56,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_using_root_schema_given,
+						  bool *publish_using_root_schema)
 {
 	ListCell   *lc;
 
+	*publish_using_root_schema_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* Relation changes published as of itself by default. */
+	*publish_using_root_schema = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -90,10 +94,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -109,19 +113,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_using_root_schema") == 0)
+		{
+			if (*publish_using_root_schema_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_using_root_schema_given = true;
+			*publish_using_root_schema = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -142,10 +155,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -182,9 +194,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -192,13 +204,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubasroot - 1] =
+		BoolGetDatum(publish_using_root_schema);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -250,17 +264,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -269,19 +282,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_using_root_schema_given)
+	{
+		values[Anum_pg_publication_pubasroot - 1] = BoolGetDatum(publish_using_root_schema);
+		replaces[Anum_pg_publication_pubasroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7c23968f2d..5ba7dde845 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14625,7 +14625,7 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 	 * UNLOGGED as UNLOGGED tables can't be published.
 	 */
 	if (!toLogged &&
-		list_length(GetRelationPublications(RelationGetRelid(rel))) > 0)
+		list_length(GetRelationPublications(RelationGetRelid(rel), NULL)) > 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("cannot change table \"%s\" to unlogged because it is part of a publication",
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 59d1a31c97..f88377a0c2 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2295,8 +2295,12 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		/* Only necessary to check replication identity. */
+		CheckValidResultRel(mtstate->rootResultRelInfo, operation);
+	}
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index d6b9cbe1bd..ac88ba4f83 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,8 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -20,6 +22,7 @@
 #include "replication/pgoutput.h"
 #include "utils/int8.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -49,6 +52,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +63,33 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If ancestor relid is set, its schema must also
+	 * have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * True when publication that is matched by get_rel_sync_entry for this
+	 * relation is configured as such.
+	 */
+	bool		pubasroot;
+
+	/*
+	 * OID of the ancestor whose schema will be used when replicating changes
+	 * to a partition; InvalidOid if pubasroot is false.
+	 */
+	Oid			replicate_as_relid;
+
+	/*
+	 * Map, if any, used when replicating using an ancestor's schema to
+	 * convert the tuples from partition's type to the ancestor's; NULL if
+	 * pubasroot is false.
+	 */
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +287,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
+
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		desc = RelationGetDescr(relation);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+/*
+ * Send the schema of the given relation, including type info for its attributes.
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+		if (att->attisdropped || att->attgenerated)
+			continue;
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +399,68 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -413,9 +506,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 
 		/*
 		 * Don't send partitioned tables, because partitions would be
-		 * sent instead.
+		 * sent instead, unless the user asked to publish the former.
 		 */
-		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+			!relentry->pubasroot)
 			continue;
 
 		relids[nrelids++] = relid;
@@ -540,7 +634,8 @@ init_rel_sync_cache(MemoryContext cachectx)
  * This looks up publications that given relation is directly or indirectly
  * part of (latter if it's really the relation's ancestor that is part of a
  * publication) and fills up the found entry with the information about
- * which operations to publish.
+ * which operations to publish and whether to use an ancestor's schema
+ * when publishing.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
@@ -562,8 +657,10 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	/* Not found means schema wasn't sent */
 	if (!found || !entry->replicate_valid)
 	{
-		List	   *pubids = GetRelationPublications(relid);
+		List	   *published_rels = NIL;
+		List	   *pubids = GetRelationPublications(relid, &published_rels);
 		ListCell   *lc;
+		Oid			ancestor = InvalidOid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -588,13 +685,42 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
+			bool		publish = false;
+
+			if (pub->alltables)
+			{
+				publish = true;
+				if (pub->pubasroot && get_rel_relispartition(relid))
+					ancestor = llast_oid(get_partition_ancestors(relid));
+			}
+
+			if (!publish)
+			{
+				ListCell *lc1,
+						 *lc2;
+
+				forboth(lc1, pubids, lc2, published_rels)
+				{
+					Oid		pubid = lfirst_oid(lc1);
+					Oid		pub_relid = lfirst_oid(lc2);
+					if (pubid == pub->oid)
+					{
+						publish = true;
+						if (pub->pubasroot && pub_relid != relid)
+							ancestor = pub_relid;
+						break;
+					}
+				}
+			}
 
-			if (pub->alltables || list_member_oid(pubids, pub->oid))
+			if (publish)
 			{
 				entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
 				entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
 				entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
-				entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				if (!OidIsValid(ancestor))
+					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				entry->pubasroot = pub->pubasroot;
 			}
 
 			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
@@ -604,6 +730,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 
 		list_free(pubids);
 
+		entry->replicate_as_relid = ancestor;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index df025a5a30..cf5736b311 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -43,6 +43,7 @@
 #include "catalog/catalog.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
@@ -5138,7 +5139,7 @@ GetRelationPublicationActions(Relation relation)
 					  sizeof(PublicationActions));
 
 	/* Fetch the publication membership info. */
-	puboids = GetRelationPublications(RelationGetRelid(relation));
+	puboids = GetRelationPublications(RelationGetRelid(relation), NULL);
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
@@ -5157,7 +5158,9 @@ GetRelationPublicationActions(Relation relation)
 		pubactions->pubinsert |= pubform->pubinsert;
 		pubactions->pubupdate |= pubform->pubupdate;
 		pubactions->pubdelete |= pubform->pubdelete;
-		pubactions->pubtruncate |= pubform->pubtruncate;
+		if (!pubform->pubasroot ||
+			!is_leaf_partition(RelationGetRelid(relation)))
+			pubactions->pubtruncate |= pubform->pubtruncate;
 
 		ReleaseSysCache(tup);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index dc33c20048..bdbd1f823b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3780,6 +3780,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubasroot;
 	int			i,
 				ntups;
 
@@ -3791,11 +3792,18 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubasroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false AS pubasroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
@@ -3819,6 +3827,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubasroot = PQfnumber(res, "pubasroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3841,6 +3850,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubasroot =
+			(strcmp(PQgetvalue(res, i, i_pubasroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -3917,7 +3928,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubasroot)
+		appendPQExpBufferStr(query, ", publish_using_root_schema = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 21004e5078..90e47dd1f3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -600,6 +600,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubasroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index f3c7eb96fa..3f6ce713af 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5706,7 +5706,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5737,6 +5737,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubasroot AS \"%s\"",
+						  gettext_noop("Publishes Using Root Schema"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5778,6 +5782,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubasroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5790,6 +5795,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubasroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5800,6 +5806,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubasroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubasroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5849,6 +5858,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubasroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5861,6 +5872,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubasroot)
+			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5869,6 +5882,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubasroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 27873aff6e..c6c19119ca 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -21,6 +21,7 @@
 
 extern Oid	get_partition_parent(Oid relid);
 extern List *get_partition_ancestors(Oid relid);
+extern bool is_leaf_partition(Oid relid);
 extern Oid	index_get_partition(Relation partition, Oid indexId);
 extern List *map_partition_varattnos(List *expr, int fromrel_varno,
 									 Relation to_rel, Relation from_rel);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 04a8b87e78..ea4210c1c2 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubasroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,15 +76,16 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		pubasroot;
 	PublicationActions pubactions;
 } Publication;
 
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
-extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationPublications(Oid relid, List **published_rels);
 extern List *GetPublicationRelations(Oid pubid, bool include_partitions);
 extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(void);
+extern List *GetAllTablesPublicationRelations(bool pubasroot);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index e3fabe70f9..da22ca3c6a 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                                       Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -124,10 +126,19 @@ RESET client_min_messages;
 CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
+Tables:
+    "public.testpub_parted"
+
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | t
 Tables:
     "public.testpub_parted"
 
@@ -146,10 +157,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                          Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -187,10 +198,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -234,10 +245,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -247,20 +258,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                                    List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                                       List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index b79a3f8f8f..7ddca1b974 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
 
 \dRp
 
@@ -77,6 +78,8 @@ RESET client_min_messages;
 CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
 DROP PUBLICATION testpub_forparted;
 
 -- fail - view
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 1ec487154b..6cb484aded 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 15;
+use Test::More tests => 34;
 
 # setup
 
@@ -25,7 +25,11 @@ my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub1");
 $node_publisher->safe_psql('postgres',
-	"CREATE PUBLICATION pub_all FOR ALL TABLES");
+	"CREATE PUBLICATION pub_all FOR ALL TABLES WITH (publish_using_root_schema = true)");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub2");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub3 WITH (publish_using_root_schema = true)");
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_publisher->safe_psql('postgres',
@@ -34,8 +38,24 @@ $node_publisher->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES IN (1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_2 PARTITION OF tab2 FOR VALUES IN (5, 6)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3_1 PARTITION OF tab3 FOR VALUES IN (1, 2, 3, 5, 6)");
 $node_publisher->safe_psql('postgres',
 	"ALTER PUBLICATION pub1 ADD TABLE tab1, tab1_1");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub2 ADD TABLE tab1_1, tab1_2");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub3 ADD TABLE tab2, tab3_1");
 
 # subscriber1
 $node_subscriber1->safe_psql('postgres',
@@ -51,18 +71,42 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_2_1 PARTITION OF tab1_2 FOR VALUES IN (5)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub1_tab2', b text) PARTITION BY RANGE (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (c text DEFAULT 'sub1_tab2', b text, a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (c text DEFAULT 'sub1_tab3_1', b text, a int NOT NULL PRIMARY KEY)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES FROM (1) TO (10)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub4 CONNECTION '$publisher_connstr' PUBLICATION pub3");
 
 # subscriber 2
 $node_subscriber2->safe_psql('postgres',
-	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text)");
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub_all");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub2");
 
 # Wait for initial sync of all subscriptions
 my $synced_query =
@@ -79,14 +123,28 @@ $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_1 (a) VALUES (3)");
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (3), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab3 VALUES (1), (3), (5)");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 my $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|1|5), 'insert into tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|1|5), 'insert into tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|1|3), 'inserts into tab1_1 replicated');
@@ -95,32 +153,68 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|1|5), 'inserts into tab1 replicated');
+
 # update (no partition change)
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 2 WHERE a = 1");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 2 WHERE a = 1");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 2 WHERE a = 1");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|2|5), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|2|5), 'update of tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|2|3), 'update of tab1_1 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|2|5), 'update of tab1 replicated');
+
 # update (partition changes)
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 6 WHERE a = 2");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 6 WHERE a = 2");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 6 WHERE a = 2");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|3|6), 'update of tab1 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|3|6), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|3|6), 'update of tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
@@ -129,19 +223,41 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|3|6), 'update of tab1 replicated');
+
 # delete
 $node_publisher->safe_psql('postgres',
 	"DELETE FROM tab1 WHERE a IN (3, 5)");
 $node_publisher->safe_psql('postgres',
 	"DELETE FROM tab1_2");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab2 WHERE a IN (3, 5)");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab3 WHERE a IN (3, 5)");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(1|6|6), 'delete from tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(1|6|6), 'delete from tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_1");
 is($result, qq(0||), 'delete from tab1_1 replicated');
@@ -150,34 +266,80 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'delete from tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1 replicated');
+
 # truncate
 $node_subscriber1->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (2), (5)");
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab3_1 (a) VALUES (1), (2), (5)");
 $node_subscriber2->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (2)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_1 VALUES (1)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1_2");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab2_1");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(2|1|2), 'truncate of tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(4|1|6), 'truncate of tab2_1 NOT replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'truncate of tab1_2 replicated');
 
+$node_subscriber2->safe_psql('postgres',
+	"DROP SUBSCRIPTION sub3");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (2)");
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab2");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab3");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
-is($result, qq(0||), 'truncate of tab1_1 replicated');
+is($result, qq(0||), 'truncate of tab1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(1|1|1), 'tab1_1 unchanged');
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'truncate of tab2 replicated');
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(0||), 'truncate of tab3_1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(1|2|2), 'tab1_2 unchanged');
-- 
2.16.5

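For readers following along, here is a minimal sketch of the behavior the 0001 patch enables, runnable against a build with the patch applied (the table and publication names are illustrative, following the example in the original message):

```sql
create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 2, remainder 0);
create table p2 partition of p for values with (modulus 2, remainder 1);

-- With the patch, this succeeds and implicitly publishes p1 and p2,
-- along with any partitions attached to p in the future.
create publication publish_p for table p;

-- pg_publication_tables lists only the leaf partitions, because changes
-- are replicated using the leaf partition identity and schema.
select * from pg_publication_tables where pubname = 'publish_p';
```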
Attachment: v10-0001-Support-adding-partitioned-tables-to-publication.patch (text/plain)
From 21c3b1345d6ad98967caadd3804cca7755ac2cd9 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 7 Nov 2019 18:19:33 +0900
Subject: [PATCH v10 1/4] Support adding partitioned tables to publication

---
 doc/src/sgml/logical-replication.sgml       |  18 +--
 doc/src/sgml/ref/create_publication.sgml    |  20 +++-
 src/backend/catalog/pg_publication.c        |  99 +++++++++++++---
 src/backend/commands/publicationcmds.c      |  16 ++-
 src/backend/replication/logical/tablesync.c |   1 +
 src/backend/replication/pgoutput/pgoutput.c |  19 ++-
 src/bin/pg_dump/pg_dump.c                   |   8 +-
 src/include/catalog/pg_publication.h        |   2 +-
 src/test/regress/expected/publication.out   |  21 +++-
 src/test/regress/sql/publication.sql        |  12 +-
 src/test/subscription/t/013_partition.pl    | 178 ++++++++++++++++++++++++++++
 11 files changed, 348 insertions(+), 46 deletions(-)
 create mode 100644 src/test/subscription/t/013_partition.pl

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f657d1d06e..fa30ac27f7 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,13 +402,17 @@
 
    <listitem>
     <para>
-     Replication is only possible from base tables to base tables.  That is,
-     the tables on the publication and on the subscription side must be normal
-     tables, not views, materialized views, partition root tables, or foreign
-     tables.  In the case of partitions, you can therefore replicate a
-     partition hierarchy one-to-one, but you cannot currently replicate to a
-     differently partitioned setup.  Attempts to replicate tables other than
-     base tables will result in an error.
+     Replication is only supported for regular and partitioned tables.  The
+     type of the table must match between the two servers; that is, one
+     cannot replicate from a regular table into a partitioned table or vice
+     versa.  Also, when replicating between partitioned tables, the actual
+     replication occurs between the leaf partitions, so the partitions on
+     the two servers must match one-to-one.
+    </para>
+
+    <para>
+     Attempts to replicate other types of relations such as views, materialized
+     views, or foreign tables, will result in an error.
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..a304f9b8c3 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -68,15 +68,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
       that table is added to the publication.  If <literal>ONLY</literal> is not
       specified, the table and all its descendant tables (if any) are added.
       Optionally, <literal>*</literal> can be specified after the table name to
-      explicitly indicate that descendant tables are included.
+      explicitly indicate that descendant tables are included.  However, adding
+      a partitioned table to a publication never explicitly adds its partitions,
+      because the partitions are implicitly published by virtue of the
+      partitioned table being added to the publication.
      </para>
 
      <para>
-      Only persistent base tables can be part of a publication.  Temporary
-      tables, unlogged tables, foreign tables, materialized views, regular
-      views, and partitioned tables cannot be part of a publication.  To
-      replicate a partitioned table, add the individual partitions to the
-      publication.
+      Only persistent base tables and partitioned tables can be part of a
+      publication.  Temporary tables, unlogged tables, foreign tables,
+      materialized views, and regular views cannot be part of a publication.
+     </para>
+
+     <para>
+      When a partitioned table is added to a publication, all of its existing
+      and future partitions are implicitly considered to be part of the
+      publication.  So even operations that are performed directly on a
+      partition are published via its ancestors' publications.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index c5eea7af3f..14d4ad3abd 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -24,8 +24,10 @@
 #include "catalog/index.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
 #include "catalog/pg_type.h"
@@ -40,6 +42,8 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
+static List *get_rel_publications(Oid relid);
+
 /*
  * Check if relation can be in given publication and throws appropriate
  * error if not.
@@ -47,17 +51,9 @@
 static void
 check_publication_add_relation(Relation targetrel)
 {
-	/* Give more specific error for partitioned tables */
-	if (RelationGetForm(targetrel)->relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("\"%s\" is a partitioned table",
-						RelationGetRelationName(targetrel)),
-				 errdetail("Adding partitioned tables to publications is not supported."),
-				 errhint("You can add the table partitions individually.")));
-
-	/* Must be table */
-	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION)
+	/* Must be a regular or partitioned table */
+	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("\"%s\" is not a table",
@@ -103,7 +99,8 @@ check_publication_add_relation(Relation targetrel)
 static bool
 is_publishable_class(Oid relid, Form_pg_class reltuple)
 {
-	return reltuple->relkind == RELKIND_RELATION &&
+	return (reltuple->relkind == RELKIND_RELATION ||
+			reltuple->relkind == RELKIND_PARTITIONED_TABLE) &&
 		!IsCatalogRelationOid(relid) &&
 		reltuple->relpersistence == RELPERSISTENCE_PERMANENT &&
 		relid >= FirstNormalObjectId;
@@ -165,6 +162,10 @@ publication_add_relation(Oid pubid, Relation targetrel,
 	 * Check for duplicates. Note that this does not really prevent
 	 * duplicates, it's here just to provide nicer error message in common
 	 * case. The real protection is the unique key on the catalog.
+	 *
+	 * We give special messages for when a partition is found to be implicitly
+	 * published via an ancestor and when a partitioned table's partitions
+	 * are found to be published on their own.
 	 */
 	if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
 							  ObjectIdGetDatum(pubid)))
@@ -221,10 +222,35 @@ publication_add_relation(Oid pubid, Relation targetrel,
 
 
 /*
- * Gets list of publication oids for a relation oid.
+ * Gets the list of publication oids for a relation, plus those of its
+ * ancestors, if any, when the relation is a partition.
  */
 List *
 GetRelationPublications(Oid relid)
+{
+	List	   *result = NIL;
+
+	result = get_rel_publications(relid);
+	if (get_rel_relispartition(relid))
+	{
+		List	   *ancestors = get_partition_ancestors(relid);
+		ListCell   *lc;
+
+		foreach(lc, ancestors)
+		{
+			Oid			ancestor = lfirst_oid(lc);
+			List	   *ancestor_pubs = get_rel_publications(ancestor);
+
+			result = list_concat(result, ancestor_pubs);
+		}
+	}
+
+	return result;
+}
+
+/* Workhorse of GetRelationPublications() */
+static List *
+get_rel_publications(Oid relid)
 {
 	List	   *result = NIL;
 	CatCList   *pubrellist;
@@ -251,9 +277,14 @@ GetRelationPublications(Oid relid)
  *
  * This should only be used for normal publications, the FOR ALL TABLES
  * should use GetAllTablesPublicationRelations().
+ *
+ * Caller should pass true for 'include_partitions' if the operation to be
+ * performed on the returned relations expects to see all relations that
+ * are affected by the publication; the partitions of any partitioned
+ * tables in the publication are then included in the result.
  */
 List *
-GetPublicationRelations(Oid pubid)
+GetPublicationRelations(Oid pubid, bool include_partitions)
 {
 	List	   *result;
 	Relation	pubrelsrel;
@@ -278,8 +309,12 @@ GetPublicationRelations(Oid pubid)
 		Form_pg_publication_rel pubrel;
 
 		pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
-
-		result = lappend_oid(result, pubrel->prrelid);
+		if (get_rel_relkind(pubrel->prrelid) == RELKIND_PARTITIONED_TABLE &&
+			include_partitions)
+			result = list_concat(result, find_all_inheritors(pubrel->prrelid,
+															 NoLock, NULL));
+		else
+			result = lappend_oid(result, pubrel->prrelid);
 	}
 
 	systable_endscan(scan);
@@ -480,10 +515,40 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
 
 		publication = GetPublicationByName(pubname, false);
+
+		/*
+		 * Publications can contain partitioned tables, but we must filter
+		 * them out of the result, because all changes are replicated using
+		 * the identity and schema of the leaf partitions.
+		 */
 		if (publication->alltables)
+		{
+			/*
+			 * GetAllTablesPublicationRelations() only ever returns leaf
+			 * partitions.
+			 */
 			tables = GetAllTablesPublicationRelations();
+		}
 		else
-			tables = GetPublicationRelations(publication->oid);
+		{
+			List   *all_tables;
+			ListCell *lc;
+
+			/*
+			 * GetPublicationRelations() includes partitioned tables in its
+			 * result, as required by its other internal callers, but they
+			 * must be filtered out here.
+			 */
+			all_tables = GetPublicationRelations(publication->oid, true);
+			tables = NIL;
+			foreach(lc, all_tables)
+			{
+				Oid		relid = lfirst_oid(lc);
+
+				if (get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE)
+					tables = lappend_oid(tables, relid);
+			}
+		}
 		funcctx->user_fctx = (void *) tables;
 
 		MemoryContextSwitchTo(oldcontext);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f96cb42adc..d4b43e7662 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -299,7 +299,7 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	}
 	else
 	{
-		List	   *relids = GetPublicationRelations(pubform->oid);
+		List	   *relids = GetPublicationRelations(pubform->oid, true);
 
 		/*
 		 * We don't want to send too many individual messages, at some point
@@ -356,7 +356,7 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 		PublicationDropTables(pubid, rels, false);
 	else						/* DEFELEM_SET */
 	{
-		List	   *oldrelids = GetPublicationRelations(pubid);
+		List	   *oldrelids = GetPublicationRelations(pubid, false);
 		List	   *delrels = NIL;
 		ListCell   *oldlc;
 
@@ -498,7 +498,8 @@ RemovePublicationRelById(Oid proid)
 
 /*
  * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode.
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
  */
 static List *
 OpenTableList(List *tables)
@@ -539,8 +540,13 @@ OpenTableList(List *tables)
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 
-		/* Add children of this rel, if requested */
-		if (recurse)
+		/*
+		 * Add children of this rel, if requested, so that they too are added
+		 * to the publication.  A partitioned table can't have any inheritance
+		 * children other than its partitions, which need not be explicitly
+		 * added to the publication.
+		 */
+		if (recurse && rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
 		{
 			List	   *children;
 			ListCell   *child;
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f8183cd488..98825f01e9 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -761,6 +761,7 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
+	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 752508213a..d6b9cbe1bd 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -50,7 +50,12 @@ static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
 
-/* Entry in the map used to remember which relation schemas we sent. */
+/*
+ * Entry in the map used to remember which relation schemas we sent.
+ *
+ * For partitions, 'pubactions' considers not only the table's own
+ * publications, but also those of all of its ancestors.
+ */
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
@@ -406,6 +411,13 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		if (!relentry->pubactions.pubtruncate)
 			continue;
 
+		/*
+		 * Don't send partitioned tables, because their partitions are
+		 * sent instead.
+		 */
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+			continue;
+
 		relids[nrelids++] = relid;
 		maybe_send_schema(ctx, relation, relentry);
 	}
@@ -524,6 +536,11 @@ init_rel_sync_cache(MemoryContext cachectx)
 
 /*
  * Find or create entry in the relation schema cache.
+ *
+ * This looks up the publications that the given relation is directly or
+ * indirectly part of (the latter when an ancestor of the relation is part
+ * of a publication) and fills the entry with information about which
+ * operations to publish.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 799b6988b7..dc33c20048 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3969,8 +3969,12 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 	{
 		TableInfo  *tbinfo = &tblinfo[i];
 
-		/* Only plain tables can be aded to publications. */
-		if (tbinfo->relkind != RELKIND_RELATION)
+		/*
+		 * Only regular and partitioned tables can be added to
+		 * publications.
+		 */
+		if (tbinfo->relkind != RELKIND_RELATION &&
+			tbinfo->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
 
 		/*
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 6cdc2b1197..04a8b87e78 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -80,7 +80,7 @@ typedef struct Publication
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
 extern List *GetRelationPublications(Oid relid);
-extern List *GetPublicationRelations(Oid pubid);
+extern List *GetPublicationRelations(Oid pubid, bool include_partitions);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
 
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..e3fabe70f9 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -116,6 +116,22 @@ Tables:
 
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+-- should add only the parent to publication, not the partition
+CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+
+DROP PUBLICATION testpub_forparted;
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
@@ -142,11 +158,6 @@ Tables:
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
 DETAIL:  Only tables can be added to publications.
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
-ERROR:  "testpub_parted" is a partitioned table
-DETAIL:  Adding partitioned tables to publications is not supported.
-HINT:  You can add the table partitions individually.
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default ADD TABLE pub_test.testpub_nopk;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a755cf..b79a3f8f8f 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -69,6 +69,16 @@ RESET client_min_messages;
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
 
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+RESET client_min_messages;
+-- should add only the parent to publication, not the partition
+CREATE TABLE testpub_parted1 PARTITION OF testpub_parted FOR VALUES IN (1);
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+DROP PUBLICATION testpub_forparted;
+
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 SET client_min_messages = 'ERROR';
@@ -83,8 +93,6 @@ CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 
 -- fail - view
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
 
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
new file mode 100644
index 0000000000..1fa392b618
--- /dev/null
+++ b/src/test/subscription/t/013_partition.pl
@@ -0,0 +1,178 @@
+# Test PARTITION
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 15;
+
+# setup
+
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+my $node_subscriber1 = get_new_node('subscriber1');
+$node_subscriber1->init(allows_streaming => 'logical');
+$node_subscriber1->start;
+
+my $node_subscriber2 = get_new_node('subscriber2');
+$node_subscriber2->init(allows_streaming => 'logical');
+$node_subscriber2->start;
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# publisher
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub1");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub_all FOR ALL TABLES");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub1 ADD TABLE tab1, tab1_1");
+
+# subscriber1
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
+
+# subscriber 2
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub_all");
+
+# Wait for initial sync of all subscriptions
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber2->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# insert
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_1 (a) VALUES (3)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+my $result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|2|1|3), 'inserts into tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
+is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
+
+# update (no partition change)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 2 WHERE a = 1");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|2|2|3), 'update of tab1_1 replicated');
+
+# update (partition changes)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|3|6), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
+is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
+
+# delete
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1 WHERE a IN (3, 5)");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1_2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(0||), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(0||), 'delete from tab1_2 replicated');
+
+# truncate
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (2)");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1_2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(2|1|2), 'truncate of tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(0||), 'truncate of tab1_2 replicated');
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1_1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1_1 replicated');
-- 
2.16.5

Attachment: v10-0002-Some-refactoring-of-logical-worker.c.patch (text/plain)
From f4360963fdc2d9ff76cb89b4324f94e04bd79724 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 5 Dec 2019 09:17:06 +0900
Subject: [PATCH v10 2/4] Some refactoring of logical/worker.c

This moves the main operation of each of apply_handle_{insert|update|delete},
namely inserting, updating, or deleting a tuple into/from a given
relation, into a corresponding apply_handle_do_{insert|update|delete}
function, so that those operations can also be performed on relations
that are not direct targets of replication.

An example of that is replicating changes into a partitioned table,
some of which must be applied to its partitions.
---
 src/backend/replication/logical/worker.c | 261 ++++++++++++++++++-------------
 1 file changed, 153 insertions(+), 108 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 7a5471f95c..86601f6e8f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -578,6 +578,148 @@ GetRelationIdentityOrPK(Relation rel)
 	return idxoid;
 }
 
+/* Workhorse for apply_handle_insert() */
+static void
+apply_handle_do_insert(ResultRelInfo *relinfo,
+					   EState *estate, TupleTableSlot *localslot)
+{
+	ExecOpenIndices(relinfo, false);
+
+	/* Do the insert. */
+	ExecSimpleRelationInsert(estate, localslot);
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+}
+
+/* Workhorse for apply_handle_update() */
+static void
+apply_handle_do_update(ResultRelInfo *relinfo,
+					   EState *estate, TupleTableSlot *remoteslot,
+					   LogicalRepTupleData *newtup,
+					   LogicalRepRelMapEntry *relmapentry)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	LogicalRepRelation *remoterel = &relmapentry->remoterel;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+	MemoryContext oldctx;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	ExecOpenIndices(relinfo, false);
+
+	/*
+	 * Try to find tuple using either replica identity index, primary key or
+	 * if needed, sequential scan.
+	 */
+	idxoid = GetRelationIdentityOrPK(rel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(rel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, localslot);
+	else
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
+										 remoteslot, localslot);
+
+	ExecClearTuple(remoteslot);
+
+	/*
+	 * Tuple found.
+	 *
+	 * Note this will fail if there are other conflicting unique indexes.
+	 */
+	if (found)
+	{
+		/* Process and store remote tuple in the slot */
+		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+		slot_modify_cstrings(remoteslot, localslot, relmapentry,
+							 newtup->values, newtup->changed);
+		MemoryContextSwitchTo(oldctx);
+
+		EvalPlanQualSetSlot(&epqstate, remoteslot);
+
+		/* Do the actual update. */
+		ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
+	}
+	else
+	{
+		/*
+		 * The tuple to be updated could not be found.
+		 *
+		 * TODO what to do here, change the log level to LOG perhaps?
+		 */
+		elog(DEBUG1,
+			 "logical replication did not find row for update "
+			 "in replication target relation \"%s\"",
+			 RelationGetRelationName(rel));
+	}
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+	EvalPlanQualEnd(&epqstate);
+}
+
+/* Workhorse for apply_handle_delete() */
+static void
+apply_handle_do_delete(ResultRelInfo *relinfo, EState *estate,
+					   TupleTableSlot *remoteslot,
+					   LogicalRepRelation *remoterel)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	/*
+	 * Try to find tuple using either replica identity index, primary key or
+	 * if needed, sequential scan.
+	 */
+	idxoid = GetRelationIdentityOrPK(rel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(rel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, localslot);
+	else
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
+										 remoteslot, localslot);
+	ExecOpenIndices(relinfo, false);
+
+	/* If found delete it. */
+	if (found)
+	{
+		EvalPlanQualSetSlot(&epqstate, localslot);
+
+		/* Do the actual delete. */
+		ExecSimpleRelationDelete(estate, &epqstate, localslot);
+	}
+	else
+	{
+		/* The tuple to be deleted could not be found. */
+		elog(DEBUG1,
+			 "logical replication could not find row for delete "
+			 "in replication target relation \"%s\"",
+			 RelationGetRelationName(rel));
+	}
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+	EvalPlanQualEnd(&epqstate);
+}
+
 /*
  * Handle INSERT message.
  */
@@ -620,13 +762,10 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	ExecOpenIndices(estate->es_result_relation_info, false);
-
-	/* Do the insert. */
-	ExecSimpleRelationInsert(estate, remoteslot);
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_insert(estate->es_result_relation_info, estate,
+						   remoteslot);
 
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
@@ -683,16 +822,12 @@ apply_handle_update(StringInfo s)
 {
 	LogicalRepRelMapEntry *rel;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	LogicalRepTupleData oldtup;
 	LogicalRepTupleData newtup;
 	bool		has_oldtup;
-	TupleTableSlot *localslot;
 	TupleTableSlot *remoteslot;
 	RangeTblEntry *target_rte;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -718,9 +853,6 @@ apply_handle_update(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
 	/*
 	 * Populate updatedCols so that per-column triggers can fire.  This could
@@ -738,7 +870,6 @@ apply_handle_update(StringInfo s)
 	}
 
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
 	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
@@ -746,63 +877,15 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL && has_oldtup));
-
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-
-	ExecClearTuple(remoteslot);
-
-	/*
-	 * Tuple found.
-	 *
-	 * Note this will fail if there are other conflicting unique indexes.
-	 */
-	if (found)
-	{
-		/* Process and store remote tuple in the slot */
-		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
-		slot_modify_cstrings(remoteslot, localslot, rel,
-							 newtup.values, newtup.changed);
-		MemoryContextSwitchTo(oldctx);
-
-		EvalPlanQualSetSlot(&epqstate, remoteslot);
-
-		/* Do the actual update. */
-		ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
-	}
-	else
-	{
-		/*
-		 * The tuple to be updated could not be found.
-		 *
-		 * TODO what to do here, change the log level to LOG perhaps?
-		 */
-		elog(DEBUG1,
-			 "logical replication did not find row for update "
-			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
-	}
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_update(estate->es_result_relation_info, estate,
+						   remoteslot, &newtup, rel);
 
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
 	AfterTriggerEndQuery(estate);
 
-	EvalPlanQualEnd(&epqstate);
 	ExecResetTupleTable(estate->es_tupleTable, false);
 	FreeExecutorState(estate);
 
@@ -822,12 +905,8 @@ apply_handle_delete(StringInfo s)
 	LogicalRepRelMapEntry *rel;
 	LogicalRepTupleData oldtup;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	TupleTableSlot *remoteslot;
-	TupleTableSlot *localslot;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -852,58 +931,24 @@ apply_handle_delete(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
+	/* Input functions may need an active snapshot, so get one */
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
-	/* Find the tuple using the replica identity index. */
+	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL));
-
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-	/* If found delete it. */
-	if (found)
-	{
-		EvalPlanQualSetSlot(&epqstate, localslot);
-
-		/* Do the actual delete. */
-		ExecSimpleRelationDelete(estate, &epqstate, localslot);
-	}
-	else
-	{
-		/* The tuple to be deleted could not be found. */
-		elog(DEBUG1,
-			 "logical replication could not find row for delete "
-			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
-	}
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_delete(estate->es_result_relation_info, estate,
+						   remoteslot, &rel->remoterel);
 
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
 	AfterTriggerEndQuery(estate);
 
-	EvalPlanQualEnd(&epqstate);
 	ExecResetTupleTable(estate->es_tupleTable, false);
 	FreeExecutorState(estate);
 
-- 
2.16.5

#35Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#34)
Re: adding partitioned tables to publications

On 2020-01-23 11:10, Amit Langote wrote:

On Wed, Jan 22, 2020 at 2:38 PM Amit Langote<amitlangote09@gmail.com> wrote:

Other than that, the updated patch contains following significant changes:

* Changed pg_publication.c: GetPublicationRelations() so that any
published partitioned tables are expanded as needed

* Since the pg_publication_tables view is backed by
GetPublicationRelations(), that means subscriptioncmds.c:
fetch_table_list() no longer needs to craft a query to include
partitions when needed, because partitions are included at source.
That seems better, because it allows to limit the complexity
surrounding publication of partitioned tables to the publication side.

* Fixed the publication table DDL to spot more cases of tables being
added to a publication in a duplicative manner. For example,
partition being added to a publication which already contains its
ancestor and a partitioned tables being added to a publication
(implying all of its partitions are added) which already contains a
partition

On second thought, this seems like overkill. It might be OK after
all for both a partitioned table and its partitions to be explicitly
added to a publication without complaining of duplication. IOW, it's
the user's call whether it makes sense to do that or not.

This structure looks good now.

However, it does seem unfortunate that in pg_get_publication_tables() we
need to postprocess the result of GetPublicationRelations(). Since
we're already changing the API of GetPublicationRelations(), couldn't we
also make it optionally not include partitioned tables?

For the test, perhaps add test cases where partitions are attached and
detached so that we can see whether their publication relcache
information is properly updated. (I'm not doubting that it works, but
it would be good to have a test for, in case of future restructuring.)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#36Rafia Sabih
rafia.pghackers@gmail.com
In reply to: Amit Langote (#34)
Re: adding partitioned tables to publications

Hi Amit,

Once again I went through this patch set and here are my few comments,

On Thu, 23 Jan 2020 at 11:10, Amit Langote <amitlangote09@gmail.com> wrote:

On Wed, Jan 22, 2020 at 2:38 PM Amit Langote <amitlangote09@gmail.com> wrote:

Other than that, the updated patch contains following significant changes:

* Changed pg_publication.c: GetPublicationRelations() so that any
published partitioned tables are expanded as needed

* Since the pg_publication_tables view is backed by
GetPublicationRelations(), that means subscriptioncmds.c:
fetch_table_list() no longer needs to craft a query to include
partitions when needed, because partitions are included at source.
That seems better, because it allows to limit the complexity
surrounding publication of partitioned tables to the publication side.

* Fixed the publication table DDL to spot more cases of tables being
added to a publication in a duplicative manner. For example,
partition being added to a publication which already contains its
ancestor and a partitioned tables being added to a publication
(implying all of its partitions are added) which already contains a
partition

On second thought, this seems like overkill. It might be OK after
all for both a partitioned table and its partitions to be explicitly
added to a publication without complaining of duplication. IOW, it's
the user's call whether it makes sense to do that or not.

Only attaching 0001.

Attached updated 0001 considering the above and the rest of the
patches that add support for replicating partitioned tables using
their own identity and schema. I have reorganized the other patches
as follows:

0002: refactoring of logical/worker.c without any functionality
changes (contains much less churn than in earlier versions)

0003: support logical replication into partitioned tables on the
subscription side (allows replicating from a non-partitioned table on
publisher node into a partitioned table on subscriber node)

0004: support optionally replicating partitioned table changes (and
changes directly made to partitions) using root partitioned table
identity and schema

+ cannot replicate from a regular table into a partitioned able or vice
There is a 't' missing from "table" here.

+     <para>
+      When a partitioned table is added to a publication, all of its existing
+      and future partitions are also implicitly considered to be part of the
+      publication.  So, even operations that are performed directly on a
+      partition are also published via its ancestors' publications.

Now this is confusing. Does it mean that partitions added to the table
later will be replicated too? I think not, because you need to first
create them manually on the subscriber side, don't you?

+ /* Must be a regular or partitioned table */
+ if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+ RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
  ereport(ERROR,
  (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
  errmsg("\"%s\" is not a table",

IMHO the error message and details should be modified here to
something along the lines of 'is neither a regular nor a partitioned
table'.

+ * published via an ancestor and when a partitioned tables's partitions
tables's --> tables'

+ if (get_rel_relispartition(relid))
+ {
+ List    *ancestors = get_partition_ancestors(relid);

Now, this is just for my understanding: why do the ancestors have to
be a list? I always assumed that a partition could only have one
ancestor, the root table. Is there something more to it that I am
totally missing here, or is it to cover the scenario of having
partitions of partitions?

Here I also want to clarify one thing: if a partitioned table is
dropped from a publication, are all of its partitions also implicitly
dropped? As far as my understanding goes, that doesn't happen, so
shouldn't there be some notice about it?

-GetPublicationRelations(Oid pubid)
+GetPublicationRelations(Oid pubid, bool include_partitions)

How about having an enum here with INCLUDE_PARTITIONS,
INCLUDE_PARTITIONED_REL, and SKIP_PARTITIONS to address the three
possibilities and avoid reiterating through the list in
pg_get_publication_tables()?

--
Regards,
Rafia Sabih

#37Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#35)
Re: adding partitioned tables to publications

On Tue, Jan 28, 2020 at 6:11 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

This structure looks good now.

Thanks for taking a look.

However, it does seem unfortunate that in pg_get_publication_tables() we
need to postprocess the result of GetPublicationRelations(). Since
we're already changing the API of GetPublicationRelations(), couldn't we
also make it optionally not include partitioned tables?

Hmm, okay. We really need GetPublicationRelations() to handle
partitioned tables in 3 ways:

1. Don't expand and return them as-is
2. Expand and return only leaf partitions
3. Expand and return all partitions

I will try that in the new patch.
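To sketch the idea with a toy model (the mode names are provisional, and this deliberately ignores locking and the catalogs; the real function would walk the partition tree with find_all_inheritors() on OIDs):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of the three expansion modes; not the real
 * GetPublicationRelations(), which works on catalog OIDs.
 */
typedef enum
{
	PUBLICATION_PART_ROOT,		/* don't expand partitioned tables */
	PUBLICATION_PART_LEAF,		/* expand to leaf partitions only */
	PUBLICATION_PART_ALL		/* expand to all partitions */
} PublicationPartOpt;

typedef struct Rel
{
	const char *name;
	bool		is_partitioned;
	struct Rel *children[4];
	int			nchildren;
} Rel;

/* Collect the relations a published table contributes, per mode. */
static int
collect(Rel *rel, PublicationPartOpt mode, Rel **out, int n)
{
	if (!rel->is_partitioned)
		out[n++] = rel;			/* leaf partition: always included */
	else if (mode == PUBLICATION_PART_ROOT)
		out[n++] = rel;			/* return the table as-is, unexpanded */
	else
	{
		if (mode == PUBLICATION_PART_ALL)
			out[n++] = rel;		/* include intermediate parents too */
		for (int i = 0; i < rel->nchildren; i++)
			n = collect(rel->children[i], mode, out, n);
	}
	return n;
}
```

With that, pg_get_publication_tables() could simply request the leaf mode instead of postprocessing the result.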

For the test, perhaps add test cases where partitions are attached and
detached so that we can see whether their publication relcache
information is properly updated. (I'm not doubting that it works, but
it would be good to have a test for, in case of future restructuring.)

Okay, I will add some to publication.sql.

Will send updated patches after addressing Rafia's comments.

Thanks,
Amit

#38Amit Langote
amitlangote09@gmail.com
In reply to: Rafia Sabih (#36)
4 attachment(s)
Re: adding partitioned tables to publications

Thanks, Rafia, for the review.

On Wed, Jan 29, 2020 at 3:55 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:

On Thu, 23 Jan 2020 at 11:10, Amit Langote <amitlangote09@gmail.com> wrote:

v10 patches

+ cannot replicate from a regular table into a partitioned able or vice
There is a 't' missing from "table" here.

Oops, fixed.

+     <para>
+      When a partitioned table is added to a publication, all of its existing
+      and future partitions are also implicitly considered to be part of the
+      publication.  So, even operations that are performed directly on a
+      partition are also published via its ancestors' publications.

Now this is confusing. Does it mean that partitions added to the table
later will be replicated too? I think not, because you need to first
create them manually on the subscriber side, don't you?

Yes, it's up to the user to make sure that they have set up the
partitions correctly on the subscriber. I don't see how that's very
different from what needs to be done when tables are added to a
publication after the fact. Did I misunderstand you?
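To be concrete, the intended behavior is as follows (an illustrative sketch; names made up):

```sql
create table p (a int) partition by range (a);
create table p1 partition of p for values from (1) to (10);
create publication pub for table p;

-- A partition created later is published automatically on the
-- publisher side; the subscriber must still have a matching table
-- before its changes can be applied.
create table p2 partition of p for values from (10) to (20);
```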

+ /* Must be a regular or partitioned table */
+ if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+ RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("\"%s\" is not a table",

IMHO the error message and details should be modified here to
something along the lines of 'is neither a regular nor a partitioned
table'.

Hmm, this is simply following a convention that's used in most places
around the code, although I'm not really a fan of these "not a
<whatever>"-style messages to begin with. It's less ambiguous with a
"cannot perform <action> on <relkind>"-style message, which some
places already use.

In that view, I have changed the documentation too to say this:

+     Replication is only supported by tables, partitioned or not, although a
+     given table must either be partitioned on both servers or not partitioned
+     at all.  Also, when replicating between partitioned tables, the actual
+     replication occurs between leaf partitions, so partitions on the two
+     servers must match one-to-one.

In retrospect, the confusion surrounding how we communicate the
various operations and properties that cannot be supported on a table
if partitioned, both in the error messages and the documentation,
could have been avoided if it wasn't based on relkind. I guess it's
too late now though. :(

+ * published via an ancestor and when a partitioned tables's partitions
tables's --> tables'

+ if (get_rel_relispartition(relid))
+ {
+ List    *ancestors = get_partition_ancestors(relid);

Now, this is just for my understanding: why do the ancestors have to
be a list? I always assumed that a partition could only have one
ancestor, the root table. Is there something more to it that I am
totally missing here, or is it to cover the scenario of having
partitions of partitions?

Yes, with multi-level partitioning.
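For example (an illustrative sketch):

```sql
create table root (a int, b int) partition by range (a);
create table mid partition of root for values from (1) to (100)
    partition by hash (b);
create table leaf partition of mid
    for values with (modulus 2, remainder 0);
```

Here the ancestors of leaf are both mid and root, which is why get_partition_ancestors() returns a List.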

Here I also want to clarify one thing: if a partitioned table is
dropped from a publication, are all of its partitions also implicitly
dropped? As far as my understanding goes, that doesn't happen, so
shouldn't there be some notice about it?

Actually, that is what happens, unless partitions were explicitly
added to the publication, in which case they will continue to be
published.
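For example (names illustrative):

```sql
create publication pub for table p, p1;  -- p1 also added explicitly
alter publication pub drop table p;
-- p2 and p3 are no longer published, but p1 still is, because it was
-- added to the publication in its own right.
```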

-GetPublicationRelations(Oid pubid)
+GetPublicationRelations(Oid pubid, bool include_partitions)

How about having an enum here with INCLUDE_PARTITIONS,
INCLUDE_PARTITIONED_REL, and SKIP_PARTITIONS to address the three
possibilities and avoid reiterating through the list in
pg_get_publication_tables()?

I have done something similar in the updated patch, as I mentioned in
my earlier reply.

Please check the updated patches.

Thanks,
Amit

Attachments:

v11-0001-Support-adding-partitioned-tables-to-publication.patch (text/plain)
From 1ffd5327b0950a70ab1b8d65e8946c8a2882ac0a Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 7 Nov 2019 18:19:33 +0900
Subject: [PATCH v11 1/4] Support adding partitioned tables to publication

When a partitioned table is added to a publication, changes to all
of its current and future partitions are published via that
publication.
---
 doc/src/sgml/logical-replication.sgml       |  17 +--
 doc/src/sgml/ref/create_publication.sgml    |  20 +++-
 src/backend/catalog/pg_publication.c        |  92 +++++++++++---
 src/backend/commands/publicationcmds.c      |  23 +++-
 src/backend/replication/logical/tablesync.c |   1 +
 src/backend/replication/pgoutput/pgoutput.c |  19 ++-
 src/bin/pg_dump/pg_dump.c                   |   8 +-
 src/include/catalog/pg_publication.h        |  15 ++-
 src/test/regress/expected/publication.out   |  34 +++++-
 src/test/regress/sql/publication.sql        |  23 +++-
 src/test/subscription/t/013_partition.pl    | 178 ++++++++++++++++++++++++++++
 11 files changed, 385 insertions(+), 45 deletions(-)
 create mode 100644 src/test/subscription/t/013_partition.pl

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f657d1d06e..8bd7c9c8ac 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,13 +402,16 @@
 
    <listitem>
     <para>
-     Replication is only possible from base tables to base tables.  That is,
-     the tables on the publication and on the subscription side must be normal
-     tables, not views, materialized views, partition root tables, or foreign
-     tables.  In the case of partitions, you can therefore replicate a
-     partition hierarchy one-to-one, but you cannot currently replicate to a
-     differently partitioned setup.  Attempts to replicate tables other than
-     base tables will result in an error.
+     Replication is only supported by tables, partitioned or not, although a
+     given table must either be partitioned on both servers or not partitioned
+     at all.  Also, when replicating between partitioned tables, the actual
+     replication occurs between leaf partitions, so partitions on the two
+     servers must match one-to-one.
+    </para>
+
+    <para>
+     Attempts to replicate other types of relations such as views, materialized
+     views, or foreign tables, will result in an error.
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 99f87ca393..a304f9b8c3 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -68,15 +68,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
       that table is added to the publication.  If <literal>ONLY</literal> is not
       specified, the table and all its descendant tables (if any) are added.
       Optionally, <literal>*</literal> can be specified after the table name to
-      explicitly indicate that descendant tables are included.
+      explicitly indicate that descendant tables are included.  However, adding
+      a partitioned table to a publication never explicitly adds its partitions,
+      because partitions are implicitly published due to the partitioned table
+      being added to the publication.
      </para>
 
      <para>
-      Only persistent base tables can be part of a publication.  Temporary
-      tables, unlogged tables, foreign tables, materialized views, regular
-      views, and partitioned tables cannot be part of a publication.  To
-      replicate a partitioned table, add the individual partitions to the
-      publication.
+      Only persistent base tables and partitioned tables can be part of a
+      publication. Temporary tables, unlogged tables, foreign tables,
+      materialized views, and regular views cannot be part of a publication.
+     </para>
+
+     <para>
+      When a partitioned table is added to a publication, all of its existing
+      and future partitions are also implicitly considered to be part of the
+      publication.  So, even operations that are performed directly on a
+      partition are also published via its ancestors' publications.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index c5eea7af3f..ea13cced79 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -24,8 +24,10 @@
 #include "catalog/index.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
 #include "catalog/pg_type.h"
@@ -40,6 +42,8 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
+static List *get_rel_publications(Oid relid);
+
 /*
  * Check if relation can be in given publication and throws appropriate
  * error if not.
@@ -47,17 +51,9 @@
 static void
 check_publication_add_relation(Relation targetrel)
 {
-	/* Give more specific error for partitioned tables */
-	if (RelationGetForm(targetrel)->relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("\"%s\" is a partitioned table",
-						RelationGetRelationName(targetrel)),
-				 errdetail("Adding partitioned tables to publications is not supported."),
-				 errhint("You can add the table partitions individually.")));
-
-	/* Must be table */
-	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION)
+	/* Must be a regular or partitioned table */
+	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
+		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("\"%s\" is not a table",
@@ -103,7 +99,8 @@ check_publication_add_relation(Relation targetrel)
 static bool
 is_publishable_class(Oid relid, Form_pg_class reltuple)
 {
-	return reltuple->relkind == RELKIND_RELATION &&
+	return (reltuple->relkind == RELKIND_RELATION ||
+			reltuple->relkind == RELKIND_PARTITIONED_TABLE) &&
 		!IsCatalogRelationOid(relid) &&
 		reltuple->relpersistence == RELPERSISTENCE_PERMANENT &&
 		relid >= FirstNormalObjectId;
@@ -165,6 +162,10 @@ publication_add_relation(Oid pubid, Relation targetrel,
 	 * Check for duplicates. Note that this does not really prevent
 	 * duplicates, it's here just to provide nicer error message in common
 	 * case. The real protection is the unique key on the catalog.
+	 *
+	 * We give special messages for when a partition is found to be implicitly
+	 * published via an ancestor and when a partitioned table's partitions
+	 * are found to be published on their own.
 	 */
 	if (SearchSysCacheExists2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
 							  ObjectIdGetDatum(pubid)))
@@ -221,10 +222,35 @@ publication_add_relation(Oid pubid, Relation targetrel,
 
 
 /*
- * Gets list of publication oids for a relation oid.
+ * Gets list of publication oids for a relation, plus those of ancestors,
+ * if any, if the relation is a partition.
  */
 List *
 GetRelationPublications(Oid relid)
+{
+	List	   *result = NIL;
+
+	result = get_rel_publications(relid);
+	if (get_rel_relispartition(relid))
+	{
+		List	   *ancestors = get_partition_ancestors(relid);
+		ListCell   *lc;
+
+		foreach(lc, ancestors)
+		{
+			Oid			ancestor = lfirst_oid(lc);
+			List	   *ancestor_pubs = get_rel_publications(ancestor);
+
+			result = list_concat(result, ancestor_pubs);
+		}
+	}
+
+	return result;
+}
+
+/* Workhorse of GetRelationPublications() */
+static List *
+get_rel_publications(Oid relid)
 {
 	List	   *result = NIL;
 	CatCList   *pubrellist;
@@ -251,9 +277,12 @@ GetRelationPublications(Oid relid)
  *
  * This should only be used for normal publications, the FOR ALL TABLES
  * should use GetAllTablesPublicationRelations().
+ *
+ * See catalog/pg_publication.h for the values that are appropriate for
+ * 'pub_partopt'.
  */
 List *
-GetPublicationRelations(Oid pubid)
+GetPublicationRelations(Oid pubid, int pub_partopt)
 {
 	List	   *result;
 	Relation	pubrelsrel;
@@ -279,7 +308,31 @@ GetPublicationRelations(Oid pubid)
 
 		pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
 
-		result = lappend_oid(result, pubrel->prrelid);
+		if (get_rel_relkind(pubrel->prrelid) == RELKIND_PARTITIONED_TABLE &&
+			pub_partopt != PUBLICATION_PART_ROOT)
+		{
+			List   *all_parts = find_all_inheritors(pubrel->prrelid, NoLock,
+													NULL);
+
+			if (pub_partopt == PUBLICATION_PART_ALL)
+				result = list_concat(result, all_parts);
+			else if (pub_partopt == PUBLICATION_PART_LEAF)
+			{
+				ListCell   *lc;
+
+				foreach(lc, all_parts)
+				{
+					Oid		partOid = lfirst_oid(lc);
+
+					if (get_rel_relkind(partOid) != RELKIND_PARTITIONED_TABLE)
+						result = lappend_oid(result, partOid);
+				}
+			}
+			else
+				Assert(false);
+		}
+		else
+			result = lappend_oid(result, pubrel->prrelid);
 	}
 
 	systable_endscan(scan);
@@ -480,10 +533,17 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
 
 		publication = GetPublicationByName(pubname, false);
+
+		/*
+		 * Publications support partitioned tables, although all changes are
+		 * replicated using leaf partition identity and schema, so we only
+		 * need those.
+		 */
 		if (publication->alltables)
 			tables = GetAllTablesPublicationRelations();
 		else
-			tables = GetPublicationRelations(publication->oid);
+			tables = GetPublicationRelations(publication->oid,
+											 PUBLICATION_PART_LEAF);
 		funcctx->user_fctx = (void *) tables;
 
 		MemoryContextSwitchTo(oldcontext);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f96cb42adc..23b9e1a5ae 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -299,7 +299,13 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	}
 	else
 	{
-		List	   *relids = GetPublicationRelations(pubform->oid);
+		/*
+		 * For any partitioned tables contained in the publication, we must
+		 * invalidate all partitions contained in the respective partition
+		 * trees, not just those explicitly mentioned in the publication.
+		 */
+		List   *relids = GetPublicationRelations(pubform->oid,
+												 PUBLICATION_PART_ALL);
 
 		/*
 		 * We don't want to send too many individual messages, at some point
@@ -356,7 +362,8 @@ AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 		PublicationDropTables(pubid, rels, false);
 	else						/* DEFELEM_SET */
 	{
-		List	   *oldrelids = GetPublicationRelations(pubid);
+		List   *oldrelids = GetPublicationRelations(pubid,
+													PUBLICATION_PART_ROOT);
 		List	   *delrels = NIL;
 		ListCell   *oldlc;
 
@@ -498,7 +505,8 @@ RemovePublicationRelById(Oid proid)
 
 /*
  * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode.
+ * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
+ * add them to a publication.
  */
 static List *
 OpenTableList(List *tables)
@@ -539,8 +547,13 @@ OpenTableList(List *tables)
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 
-		/* Add children of this rel, if requested */
-		if (recurse)
+		/*
+		 * Add children of this rel, if requested, so that they too are added
+		 * to the publication.  A partitioned table can't have any inheritance
+		 * children other than its partitions, which need not be explicitly
+		 * added to the publication.
+		 */
+		if (recurse && rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
 		{
 			List	   *children;
 			ListCell   *child;
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f8183cd488..98825f01e9 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -761,6 +761,7 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
+	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 752508213a..d6b9cbe1bd 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -50,7 +50,12 @@ static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
 
-/* Entry in the map used to remember which relation schemas we sent. */
+/*
+ * Entry in the map used to remember which relation schemas we sent.
+ *
+ * For partitions, 'pubactions' considers not only the table's own
+ * publications, but also those of all of its ancestors.
+ */
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
@@ -406,6 +411,13 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		if (!relentry->pubactions.pubtruncate)
 			continue;
 
+		/*
+		 * Don't send partitioned tables, because partitions would be
+		 * sent instead.
+		 */
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+			continue;
+
 		relids[nrelids++] = relid;
 		maybe_send_schema(ctx, relation, relentry);
 	}
@@ -524,6 +536,11 @@ init_rel_sync_cache(MemoryContext cachectx)
 
 /*
  * Find or create entry in the relation schema cache.
+ *
+ * This looks up the publications that the given relation is directly or
+ * indirectly part of (the latter when one of the relation's ancestors is
+ * part of a publication) and fills the found entry with information about
+ * which operations to publish.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 799b6988b7..dc33c20048 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3969,8 +3969,12 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 	{
 		TableInfo  *tbinfo = &tblinfo[i];
 
-		/* Only plain tables can be aded to publications. */
-		if (tbinfo->relkind != RELKIND_RELATION)
+		/*
+		 * Only regular and partitioned tables can be added to
+		 * publications.
+		 */
+		if (tbinfo->relkind != RELKIND_RELATION &&
+			tbinfo->relkind != RELKIND_PARTITIONED_TABLE)
 			continue;
 
 		/*
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 6cdc2b1197..20d95e5914 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -80,7 +80,20 @@ typedef struct Publication
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
 extern List *GetRelationPublications(Oid relid);
-extern List *GetPublicationRelations(Oid pubid);
+
+/*---------
+ * Expected values for the pub_partopt parameter of GetPublicationRelations(),
+ * which allows callers to specify which partitions of the partitioned tables
+ * mentioned in the publication they expect to see.
+ *
+ *	ROOT:	only the table explicitly mentioned in the publication
+ *	LEAF:	only leaf partitions in the given tree
+ *	ALL:	all partitions in the given tree
+ */
+#define	PUBLICATION_PART_ROOT	0
+#define	PUBLICATION_PART_LEAF	1
+#define	PUBLICATION_PART_ALL	2
+extern List *GetPublicationRelations(Oid pubid, int pub_partopt);
 extern List *GetAllTablesPublications(void);
 extern List *GetAllTablesPublicationRelations(void);
 
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index feb51e4add..2634d2c1e1 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -116,6 +116,35 @@ Tables:
 
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+CREATE PUBLICATION testpub_forparted1;
+RESET client_min_messages;
+CREATE TABLE testpub_parted1 (LIKE testpub_parted);
+ALTER PUBLICATION testpub_forparted1 SET (publish='insert');
+-- works despite missing REPLICA IDENTITY, because updates are not replicated
+UPDATE testpub_parted1 SET a = 1;
+ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
+-- only parent is listed as being in publication, not the partition
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+                          Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
+--------------------------+------------+---------+---------+---------+-----------
+ regress_publication_user | f          | t       | t       | t       | t
+Tables:
+    "public.testpub_parted"
+
+-- should now fail, because parent's publication replicates updates
+UPDATE testpub_parted1 SET a = 1;
+ERROR:  cannot update table "testpub_parted1" because it does not have a replica identity and publishes updates
+HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
+ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
+-- works again, because parent's publication is no longer considered
+UPDATE testpub_parted1 SET a = 1;
+DROP TABLE testpub_parted1;
+DROP PUBLICATION testpub_forparted, testpub_forparted1;
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
@@ -142,11 +171,6 @@ Tables:
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
 ERROR:  "testpub_view" is not a table
 DETAIL:  Only tables can be added to publications.
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
-ERROR:  "testpub_parted" is a partitioned table
-DETAIL:  Adding partitioned tables to publications is not supported.
-HINT:  You can add the table partitions individually.
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default ADD TABLE pub_test.testpub_nopk;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 5773a755cf..219e04129d 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -69,6 +69,27 @@ RESET client_min_messages;
 DROP TABLE testpub_tbl3, testpub_tbl3a;
 DROP PUBLICATION testpub3, testpub4;
 
+-- Tests for partitioned tables
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forparted;
+CREATE PUBLICATION testpub_forparted1;
+RESET client_min_messages;
+CREATE TABLE testpub_parted1 (LIKE testpub_parted);
+ALTER PUBLICATION testpub_forparted1 SET (publish='insert');
+-- works despite missing REPLICA IDENTITY, because updates are not replicated
+UPDATE testpub_parted1 SET a = 1;
+ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
+-- only parent is listed as being in publication, not the partition
+ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
+\dRp+ testpub_forparted
+-- should now fail, because parent's publication replicates updates
+UPDATE testpub_parted1 SET a = 1;
+ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
+-- works again, because parent's publication is no longer considered
+UPDATE testpub_parted1 SET a = 1;
+DROP TABLE testpub_parted1;
+DROP PUBLICATION testpub_forparted, testpub_forparted1;
+
 -- fail - view
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_view;
 SET client_min_messages = 'ERROR';
@@ -83,8 +104,6 @@ CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 
 -- fail - view
 ALTER PUBLICATION testpub_default ADD TABLE testpub_view;
--- fail - partitioned table
-ALTER PUBLICATION testpub_fortbl ADD TABLE testpub_parted;
 
 ALTER PUBLICATION testpub_default ADD TABLE testpub_tbl1;
 ALTER PUBLICATION testpub_default SET TABLE testpub_tbl1;
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
new file mode 100644
index 0000000000..1fa392b618
--- /dev/null
+++ b/src/test/subscription/t/013_partition.pl
@@ -0,0 +1,178 @@
+# Tests for logical replication with partitioned tables
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 15;
+
+# setup
+
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+my $node_subscriber1 = get_new_node('subscriber1');
+$node_subscriber1->init(allows_streaming => 'logical');
+$node_subscriber1->start;
+
+my $node_subscriber2 = get_new_node('subscriber2');
+$node_subscriber2->init(allows_streaming => 'logical');
+$node_subscriber2->start;
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# publisher
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub1");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub_all FOR ALL TABLES");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub1 ADD TABLE tab1, tab1_1");
+
+# subscriber1
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
+
+# subscriber 2
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub_all");
+
+# Wait for initial sync of all subscriptions
+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber2->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# insert
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_1 (a) VALUES (3)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+my $result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|2|1|3), 'inserts into tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
+is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
+
+# update (no partition change)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 2 WHERE a = 1");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|2|2|3), 'update of tab1_1 replicated');
+
+# update (partition changes)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub1_tab1|3|3|6), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
+is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
+
+# delete
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1 WHERE a IN (3, 5)");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1_2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(0||), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(0||), 'delete from tab1_2 replicated');
+
+# truncate
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (2)");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1_2");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(2|1|2), 'truncate of tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(0||), 'truncate of tab1_2 replicated');
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1");
+
+$node_publisher->wait_for_catchup('sub1');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1_1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1_1 replicated');
-- 
2.16.5

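Incidentally, with the patch applied, the PUBLICATION_PART_LEAF behavior of
pg_get_publication_tables() can be observed directly (a sketch, reusing the
hash-partitioned table p from my first message; I haven't pasted actual
output here):

create publication publish_p for table p;
select tablename from pg_publication_tables where pubname = 'publish_p';

This should list the leaf partitions p1, p2, p3 rather than p itself,
because changes are replicated using the leaf partitions' identity and
schema.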
Attachment: v11-0003-Add-subscription-support-to-replicate-into-parti.patch (text/plain; charset=US-ASCII)
From 70d968d8edf7529d24bad0978dcf6fd66abf083e Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlangote09@gmail.com>
Date: Thu, 23 Jan 2020 11:49:01 +0900
Subject: [PATCH v11 3/4] Add subscription support to replicate into
 partitioned tables

Mainly, this adds support code in logical/worker.c for applying
replicated operations whose target is a partitioned table to its
relevant partitions.
---
 src/backend/executor/execReplication.c      |  14 +-
 src/backend/replication/logical/relation.c  | 161 +++++++++++++++++++
 src/backend/replication/logical/tablesync.c |  28 ++--
 src/backend/replication/logical/worker.c    | 232 ++++++++++++++++++++++++++--
 src/include/replication/logicalrelation.h   |   2 +
 src/test/subscription/t/013_partition.pl    |   7 +-
 6 files changed, 413 insertions(+), 31 deletions(-)

diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 582b0cb017..635b29d050 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -591,17 +591,9 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * Give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -609,7 +601,7 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 3d7291b970..54189d7965 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -34,6 +34,7 @@ static MemoryContext LogicalRepRelMapContext = NULL;
 
 static HTAB *LogicalRepRelMap = NULL;
 static HTAB *LogicalRepTypMap = NULL;
+static HTAB *LogicalRepPartMap = NULL;
 
 
 /*
@@ -472,3 +473,163 @@ logicalrep_typmap_gettypname(Oid remoteid)
 	Assert(OidIsValid(entry->remoteid));
 	return psprintf("%s.%s", entry->nspname, entry->typname);
 }
+
+/*
+ * Partition cache: look up partition LogicalRepRelMapEntry's
+ *
+ * Unlike the relation map cache, this is keyed by partition OID, not remote
+ * relation OID, because we only need this cache when partitions are not
+ * directly mapped to any remote relation, such as when replication is
+ * occurring with one of their ancestors as the target.
+ */
+
+/*
+ * Relcache invalidation callback
+ */
+static void
+logicalrep_partmap_invalidate_cb(Datum arg, Oid reloid)
+{
+	LogicalRepRelMapEntry *entry;
+
+	/* Just to be sure. */
+	if (LogicalRepPartMap == NULL)
+		return;
+
+	if (reloid != InvalidOid)
+	{
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		/* TODO, use inverse lookup hashtable? */
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+		{
+			if (entry->localreloid == reloid)
+			{
+				entry->localreloid = InvalidOid;
+				hash_seq_term(&status);
+				break;
+			}
+		}
+	}
+	else
+	{
+		/* invalidate all cache entries */
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+			entry->localreloid = InvalidOid;
+	}
+}
+
+/*
+ * Initialize the partition map cache.
+ */
+static void
+logicalrep_partmap_init(void)
+{
+	HASHCTL		ctl;
+
+	if (!LogicalRepRelMapContext)
+		LogicalRepRelMapContext =
+			AllocSetContextCreate(CacheMemoryContext,
+								  "LogicalRepPartMapContext",
+								  ALLOCSET_DEFAULT_SIZES);
+
+	/* Initialize the relation hash table. */
+	MemSet(&ctl, 0, sizeof(ctl));
+	ctl.keysize = sizeof(Oid);	/* partition OID */
+	ctl.entrysize = sizeof(LogicalRepRelMapEntry);
+	ctl.hcxt = LogicalRepRelMapContext;
+
+	LogicalRepPartMap = hash_create("logicalrep partition map cache", 64, &ctl,
+								   HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+	/* Watch for invalidation events. */
+	CacheRegisterRelcacheCallback(logicalrep_partmap_invalidate_cb,
+								  (Datum) 0);
+}
+
+/*
+ * logicalrep_partition_open
+ *
+ * Returned entry reuses most of the values of the root table's entry, save
+ * the attribute map, which can be different for the partition.
+ */
+LogicalRepRelMapEntry *
+logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map)
+{
+	LogicalRepRelMapEntry *entry;
+	LogicalRepRelation *remoterel = &root->remoterel;
+	Oid			partOid = RelationGetRelid(partrel);
+	AttrMap	   *attrmap = root->attrmap;
+	bool		found;
+	int			i;
+	MemoryContext oldctx;
+
+	if (LogicalRepPartMap == NULL)
+		logicalrep_partmap_init();
+
+	/* Search for existing entry. */
+	entry = hash_search(LogicalRepPartMap, (void *) &partOid,
+						HASH_ENTER, &found);
+
+	if (found)
+		return entry;
+
+	memset(entry, 0, sizeof(LogicalRepRelMapEntry));
+
+	/* Make cached copy of the data */
+	oldctx = MemoryContextSwitchTo(LogicalRepRelMapContext);
+
+	/* Remote relation is used as-is from the root's entry. */
+	entry->remoterel.remoteid = remoterel->remoteid;
+	entry->remoterel.nspname = pstrdup(remoterel->nspname);
+	entry->remoterel.relname = pstrdup(remoterel->relname);
+	entry->remoterel.natts = remoterel->natts;
+	entry->remoterel.attnames = palloc(remoterel->natts * sizeof(char *));
+	entry->remoterel.atttyps = palloc(remoterel->natts * sizeof(Oid));
+	for (i = 0; i < remoterel->natts; i++)
+	{
+		entry->remoterel.attnames[i] = pstrdup(remoterel->attnames[i]);
+		entry->remoterel.atttyps[i] = remoterel->atttyps[i];
+	}
+	entry->remoterel.replident = remoterel->replident;
+	entry->remoterel.attkeys = bms_copy(remoterel->attkeys);
+
+	entry->localrel = partrel;
+	entry->localreloid = partOid;
+
+	/*
+	 * If the partition's attributes don't match the root relation's, we'll
+	 * need to make a new attrmap which maps partition attribute numbers to
+	 * remoterel's, instead of the original, which maps the root relation's
+	 * attribute numbers to remoterel's.
+	 */
+	if (map)
+	{
+		AttrNumber	attno;
+
+		entry->attrmap = make_attrmap(map->maplen);
+		memset(entry->attrmap->attnums, -1,
+			   entry->attrmap->maplen * sizeof(AttrNumber));
+		for (attno = 0; attno < entry->attrmap->maplen; attno++)
+		{
+			AttrNumber	root_attno = map->attnums[attno];
+
+			entry->attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
+		}
+	}
+	else
+		entry->attrmap = attrmap;
+
+	entry->updatable = root->updatable;
+
+	/* state and statelsn are left set to 0. */
+	MemoryContextSwitchTo(oldctx);
+
+	return entry;
+}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 98825f01e9..6a18b78f22 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -630,16 +630,17 @@ copy_read_data(void *outbuf, int minread, int maxread)
 
 /*
  * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication.  XXX - we also fetch the relkind
+ * here, which the RELATION message doesn't provide.
  */
 static void
 fetch_remote_table_info(char *nspname, char *relname,
-						LogicalRepRelation *lrel)
+						LogicalRepRelation *lrel, char *relkind)
 {
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {OIDOID, CHAROID};
+	Oid			tableRow[3] = {OIDOID, CHAROID, CHAROID};
 	Oid			attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
 	bool		isnull;
 	int			natt;
@@ -649,16 +650,16 @@ fetch_remote_table_info(char *nspname, char *relname,
 
 	/* First fetch Oid and replica identity. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident"
+	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident, c.relkind"
 					 "  FROM pg_catalog.pg_class c"
 					 "  INNER JOIN pg_catalog.pg_namespace n"
 					 "        ON (c.relnamespace = n.oid)"
 					 " WHERE n.nspname = %s"
 					 "   AND c.relname = %s"
-					 "   AND c.relkind = 'r'",
+					 "   AND pg_relation_is_publishable(c.oid)",
 					 quote_literal_cstr(nspname),
 					 quote_literal_cstr(relname));
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, 3, tableRow);
 
 	if (res->status != WALRCV_OK_TUPLES)
 		ereport(ERROR,
@@ -675,6 +676,8 @@ fetch_remote_table_info(char *nspname, char *relname,
 	Assert(!isnull);
 	lrel->replident = DatumGetChar(slot_getattr(slot, 2, &isnull));
 	Assert(!isnull);
+	*relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+	Assert(!isnull);
 
 	ExecDropSingleTupleTableSlot(slot);
 	walrcv_clear_result(res);
@@ -750,10 +753,12 @@ copy_table(Relation rel)
 	CopyState	cstate;
 	List	   *attnamelist;
 	ParseState *pstate;
+	char		remote_relkind;
 
 	/* Get the publisher relation info. */
 	fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
-							RelationGetRelationName(rel), &lrel);
+							RelationGetRelationName(rel), &lrel,
+							&remote_relkind);
 
 	/* Put the relation into relmap. */
 	logicalrep_relmap_update(&lrel);
@@ -761,12 +766,15 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
-	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "COPY %s TO STDOUT",
-					 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	if (remote_relkind == RELKIND_PARTITIONED_TABLE)
+		appendStringInfo(&cmd, "COPY (SELECT * FROM %s) TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	else
+		appendStringInfo(&cmd, "COPY %s TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
 	res = walrcv_exec(wrconn, cmd.data, 0, NULL);
 	pfree(cmd.data);
 	if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 86601f6e8f..a48537db0c 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -29,11 +29,14 @@
 #include "access/xlog_internal.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_subscription.h"
 #include "catalog/pg_subscription_rel.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "executor/nodeModifyTable.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
@@ -720,6 +723,152 @@ apply_handle_do_delete(ResultRelInfo *relinfo, EState *estate,
 	EvalPlanQualEnd(&epqstate);
 }
 
+/*
+ * This handles insert, update, delete on a partitioned table.
+ */
+static void
+apply_handle_tuple_routing(ResultRelInfo *relinfo,
+						   EState *estate,
+						   TupleTableSlot *remoteslot,
+						   LogicalRepTupleData *newtup,
+						   LogicalRepRelMapEntry *relmapentry,
+						   CmdType operation)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	ModifyTableState *mtstate = NULL;
+	PartitionTupleRouting *proute = NULL;
+	ResultRelInfo *partrelinfo;
+	TupleTableSlot *localslot;
+	PartitionRoutingInfo *partinfo;
+	TupleConversionMap *map;
+	MemoryContext oldctx;
+
+	/* ModifyTableState is needed for ExecFindPartition(). */
+	mtstate = makeNode(ModifyTableState);
+	mtstate->ps.plan = NULL;
+	mtstate->ps.state = estate;
+	mtstate->operation = operation;
+	mtstate->resultRelInfo = relinfo;
+	proute = ExecSetupPartitionTupleRouting(estate, mtstate, rel);
+
+	/*
+	 * Find a partition for the tuple contained in remoteslot.
+	 *
+	 * For insert, remoteslot is the tuple to insert.  For update and
+	 * delete, it is the tuple to be replaced or deleted, respectively.
+	 */
+	Assert(remoteslot != NULL);
+	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+	/* The following throws an error if a suitable partition is not found. */
+	partrelinfo = ExecFindPartition(mtstate, relinfo, proute,
+									remoteslot, estate);
+	Assert(partrelinfo != NULL);
+	/* Convert the tuple to match the partition's rowtype. */
+	partinfo = partrelinfo->ri_PartitionInfo;
+	map = partinfo->pi_RootToPartitionMap;
+	if (map != NULL)
+	{
+		TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+		remoteslot = execute_attr_map_slot(map->attrMap, remoteslot,
+										   part_slot);
+	}
+	MemoryContextSwitchTo(oldctx);
+
+	switch (operation)
+	{
+		case CMD_INSERT:
+			/* Just insert into the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_do_insert(partrelinfo, estate, remoteslot);
+			break;
+
+		case CMD_DELETE:
+			/* Just delete from the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_do_delete(partrelinfo, estate, remoteslot,
+								   &relmapentry->remoterel);
+			break;
+
+		case CMD_UPDATE:
+			{
+				ResultRelInfo *partrelinfo_new;
+
+				/*
+				 * partrelinfo computed above is the partition which might
+				 * contain the search tuple.  Now find the partition for the
+				 * replacement tuple, which might not be the same as
+				 * partrelinfo.
+				 */
+				localslot = table_slot_create(rel, &estate->es_tupleTable);
+				oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+				slot_modify_cstrings(localslot, remoteslot, relmapentry,
+									 newtup->values, newtup->changed);
+				partrelinfo_new = ExecFindPartition(mtstate, relinfo, proute,
+													localslot, estate);
+
+				MemoryContextSwitchTo(oldctx);
+
+				/*
+				 * If both the search and replacement tuples are in the same
+				 * partition, we can apply this as an UPDATE on the partition.
+				 */
+				if (partrelinfo == partrelinfo_new)
+				{
+					Relation	partrel = partrelinfo->ri_RelationDesc;
+					AttrMap	   *attrmap = map ? map->attrMap : NULL;
+					LogicalRepRelMapEntry *part_entry;
+
+					part_entry = logicalrep_partition_open(relmapentry,
+														   partrel, attrmap);
+
+					/* UPDATE partition. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_do_update(partrelinfo, estate, remoteslot,
+										   newtup, part_entry);
+				}
+				else
+				{
+					/*
+					 * Different, so handle this as DELETE followed by INSERT.
+					 */
+
+					/* DELETE from partition partrelinfo. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_do_delete(partrelinfo, estate, remoteslot,
+										   &relmapentry->remoterel);
+
+					/*
+					 * Convert the replacement tuple to match the destination
+					 * partition rowtype.
+					 */
+					oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+					partinfo = partrelinfo_new->ri_PartitionInfo;
+					map = partinfo->pi_RootToPartitionMap;
+					if (map != NULL)
+					{
+						TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+						localslot = execute_attr_map_slot(map->attrMap, localslot,
+														  part_slot);
+					}
+					MemoryContextSwitchTo(oldctx);
+					/* INSERT into partition partrelinfo_new. */
+					estate->es_result_relation_info = partrelinfo_new;
+					apply_handle_do_insert(partrelinfo_new, estate,
+										   localslot);
+				}
+			}
+			break;
+
+		default:
+			elog(ERROR, "unrecognized CmdType: %d", (int) operation);
+			break;
+	}
+
+	ExecCleanupTupleRouting(mtstate, proute);
+}
+
 /*
  * Handle INSERT message.
  */
@@ -762,9 +911,13 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_insert(estate->es_result_relation_info, estate,
-						   remoteslot);
+	/* For a partitioned table, insert the tuple into a partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_INSERT);
+	else
+		apply_handle_do_insert(estate->es_result_relation_info, estate,
+							   remoteslot);
 
 	PopActiveSnapshot();
 
@@ -877,9 +1030,13 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_update(estate->es_result_relation_info, estate,
-						   remoteslot, &newtup, rel);
+	/* For a partitioned table, apply update to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, &newtup, rel, CMD_UPDATE);
+	else
+		apply_handle_do_update(estate->es_result_relation_info, estate,
+							   remoteslot, &newtup, rel);
 
 	PopActiveSnapshot();
 
@@ -940,9 +1097,13 @@ apply_handle_delete(StringInfo s)
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_delete(estate->es_result_relation_info, estate,
-						   remoteslot, &rel->remoterel);
+	/* For a partitioned table, apply delete to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_DELETE);
+	else
+		apply_handle_do_delete(estate->es_result_relation_info, estate,
+							   remoteslot, &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -970,6 +1131,7 @@ apply_handle_truncate(StringInfo s)
 	List	   *remote_relids = NIL;
 	List	   *remote_rels = NIL;
 	List	   *rels = NIL;
+	List	   *part_rels = NIL;
 	List	   *relids = NIL;
 	List	   *relids_logged = NIL;
 	ListCell   *lc;
@@ -999,6 +1161,52 @@ apply_handle_truncate(StringInfo s)
 		relids = lappend_oid(relids, rel->localreloid);
 		if (RelationIsLogicallyLogged(rel->localrel))
 			relids_logged = lappend_oid(relids_logged, rel->localreloid);
+
+		/*
+		 * Truncate partitions if we got a message to truncate a partitioned
+		 * table.
+		 */
+		if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		{
+			ListCell   *child;
+			List	   *children = find_all_inheritors(rel->localreloid,
+													   RowExclusiveLock,
+													   NULL);
+
+			foreach(child, children)
+			{
+				Oid			childrelid = lfirst_oid(child);
+				Relation	childrel;
+
+				if (list_member_oid(relids, childrelid))
+					continue;
+
+				/* find_all_inheritors already got lock */
+				childrel = table_open(childrelid, NoLock);
+
+				/*
+				 * It is possible that the parent table has children that are
+				 * temp tables of other backends.  We cannot safely access
+				 * such tables (because of buffering issues), and the best
+				 * thing to do is to silently ignore them.  Note that this
+				 * check is the same as one of the checks done in
+				 * truncate_check_activity() called below, still it is kept
+				 * here for simplicity.
+				 */
+				if (RELATION_IS_OTHER_TEMP(childrel))
+				{
+					table_close(childrel, RowExclusiveLock);
+					continue;
+				}
+
+				rels = lappend(rels, childrel);
+				part_rels = lappend(part_rels, childrel);
+				relids = lappend_oid(relids, childrelid);
+				/* Log this relation only if needed for logical decoding */
+				if (RelationIsLogicallyLogged(childrel))
+					relids_logged = lappend_oid(relids_logged, childrelid);
+			}
+		}
 	}
 
 	/*
@@ -1014,6 +1222,12 @@ apply_handle_truncate(StringInfo s)
 
 		logicalrep_rel_close(rel, NoLock);
 	}
+	foreach(lc, part_rels)
+	{
+		Relation rel = lfirst(lc);
+
+		table_close(rel, NoLock);
+	}
 
 	CommandCounterIncrement();
 }
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 9971a8028c..4650b4f9e1 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -34,6 +34,8 @@ extern void logicalrep_relmap_update(LogicalRepRelation *remoterel);
 
 extern LogicalRepRelMapEntry *logicalrep_rel_open(LogicalRepRelId remoteid,
 												  LOCKMODE lockmode);
+extern LogicalRepRelMapEntry *logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map);
 extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
 								 LOCKMODE lockmode);
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 1fa392b618..1ec487154b 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -42,10 +42,15 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
+
 $node_subscriber1->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
 $node_subscriber1->safe_psql('postgres',
-	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6) PARTITION BY LIST (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_1 PARTITION OF tab1_2 FOR VALUES IN (5)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
 
-- 
2.16.5

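For readers following the `CMD_UPDATE` branch of `apply_handle_tuple_routing()` in the patch above, the routing decision it makes can be sketched in isolation. This is a toy model, not PostgreSQL code: `route_partition()` stands in for `ExecFindPartition()`, hash partitioning is reduced to a plain modulus, and both function names are invented for illustration.

```c
#include <stdio.h>

/* Toy stand-in for ExecFindPartition(): hash partitioning by modulus. */
static int
route_partition(int key, int nparts)
{
	return key % nparts;
}

/*
 * Mirrors the CMD_UPDATE branch above: if the search (old) tuple and the
 * replacement (new) tuple route to the same partition, the change can be
 * applied as a plain UPDATE on that partition; otherwise it must be
 * handled as a DELETE from the old partition followed by an INSERT into
 * the new one.
 */
static void
apply_update(int old_key, int new_key, int nparts)
{
	int			oldpart = route_partition(old_key, nparts);
	int			newpart = route_partition(new_key, nparts);

	if (oldpart == newpart)
		printf("UPDATE in partition %d\n", oldpart);
	else
		printf("DELETE from partition %d; INSERT into partition %d\n",
			   oldpart, newpart);
}
```

For example, with three partitions, `apply_update(4, 5, 3)` takes the cross-partition path (partition 1 to partition 2), while `apply_update(1, 4, 3)` stays within partition 1 and is applied as a plain UPDATE.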
v11-0002-Some-refactoring-of-logical-worker.c.patchtext/plain; charset=US-ASCII; name=v11-0002-Some-refactoring-of-logical-worker.c.patchDownload
From 41afa4b8d8207ed9a2c8dd1fa674bc734741f34f Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 5 Dec 2019 09:17:06 +0900
Subject: [PATCH v11 2/4] Some refactoring of logical/worker.c

This moves the main operations of apply_handle_{insert|update|delete},
namely inserting, updating, or deleting a tuple into/from a given
relation, into corresponding apply_handle_do_{insert|update|delete}
functions, so that those operations can also be performed on relations
that are not direct targets of replication.

An example of that is when replicating changes into a partitioned
table, some of which must be applied to its partitions.
---
 src/backend/replication/logical/worker.c | 261 ++++++++++++++++++-------------
 1 file changed, 153 insertions(+), 108 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 7a5471f95c..86601f6e8f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -578,6 +578,148 @@ GetRelationIdentityOrPK(Relation rel)
 	return idxoid;
 }
 
+/* Workhorse for apply_handle_insert() */
+static void
+apply_handle_do_insert(ResultRelInfo *relinfo,
+					   EState *estate, TupleTableSlot *localslot)
+{
+	ExecOpenIndices(relinfo, false);
+
+	/* Do the insert. */
+	ExecSimpleRelationInsert(estate, localslot);
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+}
+
+/* Workhorse for apply_handle_update() */
+static void
+apply_handle_do_update(ResultRelInfo *relinfo,
+					   EState *estate, TupleTableSlot *remoteslot,
+					   LogicalRepTupleData *newtup,
+					   LogicalRepRelMapEntry *relmapentry)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	LogicalRepRelation *remoterel = &relmapentry->remoterel;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+	MemoryContext oldctx;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	ExecOpenIndices(relinfo, false);
+
+	/*
+	 * Try to find tuple using either replica identity index, primary key or
+	 * if needed, sequential scan.
+	 */
+	idxoid = GetRelationIdentityOrPK(rel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(rel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, localslot);
+	else
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
+										 remoteslot, localslot);
+
+	ExecClearTuple(remoteslot);
+
+	/*
+	 * Tuple found.
+	 *
+	 * Note this will fail if there are other conflicting unique indexes.
+	 */
+	if (found)
+	{
+		/* Process and store remote tuple in the slot */
+		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+		slot_modify_cstrings(remoteslot, localslot, relmapentry,
+							 newtup->values, newtup->changed);
+		MemoryContextSwitchTo(oldctx);
+
+		EvalPlanQualSetSlot(&epqstate, remoteslot);
+
+		/* Do the actual update. */
+		ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
+	}
+	else
+	{
+		/*
+		 * The tuple to be updated could not be found.
+		 *
+		 * TODO what to do here, change the log level to LOG perhaps?
+		 */
+		elog(DEBUG1,
+			 "logical replication did not find row for update "
+			 "in replication target relation \"%s\"",
+			 RelationGetRelationName(rel));
+	}
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+	EvalPlanQualEnd(&epqstate);
+}
+
+/* Workhorse for apply_handle_delete() */
+static void
+apply_handle_do_delete(ResultRelInfo *relinfo, EState *estate,
+					   TupleTableSlot *remoteslot,
+					   LogicalRepRelation *remoterel)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	/*
+	 * Try to find tuple using either replica identity index, primary key or
+	 * if needed, sequential scan.
+	 */
+	idxoid = GetRelationIdentityOrPK(rel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(rel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, localslot);
+	else
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
+										 remoteslot, localslot);
+	ExecOpenIndices(relinfo, false);
+
+	/* If found delete it. */
+	if (found)
+	{
+		EvalPlanQualSetSlot(&epqstate, localslot);
+
+		/* Do the actual delete. */
+		ExecSimpleRelationDelete(estate, &epqstate, localslot);
+	}
+	else
+	{
+		/* The tuple to be deleted could not be found. */
+		elog(DEBUG1,
+			 "logical replication could not find row for delete "
+			 "in replication target relation \"%s\"",
+			 RelationGetRelationName(rel));
+	}
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+	EvalPlanQualEnd(&epqstate);
+}
+
 /*
  * Handle INSERT message.
  */
@@ -620,13 +762,10 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	ExecOpenIndices(estate->es_result_relation_info, false);
-
-	/* Do the insert. */
-	ExecSimpleRelationInsert(estate, remoteslot);
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_insert(estate->es_result_relation_info, estate,
+						   remoteslot);
 
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
@@ -683,16 +822,12 @@ apply_handle_update(StringInfo s)
 {
 	LogicalRepRelMapEntry *rel;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	LogicalRepTupleData oldtup;
 	LogicalRepTupleData newtup;
 	bool		has_oldtup;
-	TupleTableSlot *localslot;
 	TupleTableSlot *remoteslot;
 	RangeTblEntry *target_rte;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -718,9 +853,6 @@ apply_handle_update(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
 	/*
 	 * Populate updatedCols so that per-column triggers can fire.  This could
@@ -738,7 +870,6 @@ apply_handle_update(StringInfo s)
 	}
 
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
 	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
@@ -746,63 +877,15 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL && has_oldtup));
-
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-
-	ExecClearTuple(remoteslot);
-
-	/*
-	 * Tuple found.
-	 *
-	 * Note this will fail if there are other conflicting unique indexes.
-	 */
-	if (found)
-	{
-		/* Process and store remote tuple in the slot */
-		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
-		slot_modify_cstrings(remoteslot, localslot, rel,
-							 newtup.values, newtup.changed);
-		MemoryContextSwitchTo(oldctx);
-
-		EvalPlanQualSetSlot(&epqstate, remoteslot);
-
-		/* Do the actual update. */
-		ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
-	}
-	else
-	{
-		/*
-		 * The tuple to be updated could not be found.
-		 *
-		 * TODO what to do here, change the log level to LOG perhaps?
-		 */
-		elog(DEBUG1,
-			 "logical replication did not find row for update "
-			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
-	}
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_update(estate->es_result_relation_info, estate,
+						   remoteslot, &newtup, rel);
 
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
 	AfterTriggerEndQuery(estate);
 
-	EvalPlanQualEnd(&epqstate);
 	ExecResetTupleTable(estate->es_tupleTable, false);
 	FreeExecutorState(estate);
 
@@ -822,12 +905,8 @@ apply_handle_delete(StringInfo s)
 	LogicalRepRelMapEntry *rel;
 	LogicalRepTupleData oldtup;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	TupleTableSlot *remoteslot;
-	TupleTableSlot *localslot;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -852,58 +931,24 @@ apply_handle_delete(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
+	/* Input functions may need an active snapshot, so get one */
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
-	/* Find the tuple using the replica identity index. */
+	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL));
-
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-	/* If found delete it. */
-	if (found)
-	{
-		EvalPlanQualSetSlot(&epqstate, localslot);
-
-		/* Do the actual delete. */
-		ExecSimpleRelationDelete(estate, &epqstate, localslot);
-	}
-	else
-	{
-		/* The tuple to be deleted could not be found. */
-		elog(DEBUG1,
-			 "logical replication could not find row for delete "
-			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
-	}
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_delete(estate->es_result_relation_info, estate,
+						   remoteslot, &rel->remoterel);
 
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
 	AfterTriggerEndQuery(estate);
 
-	EvalPlanQualEnd(&epqstate);
 	ExecResetTupleTable(estate->es_tupleTable, false);
 	FreeExecutorState(estate);
 
-- 
2.16.5

v11-0004-Publish-partitioned-table-inserts-as-its-own.patchtext/plain; charset=US-ASCII; name=v11-0004-Publish-partitioned-table-inserts-as-its-own.patchDownload
From 57681a73e14126bffd89e5acc9f0ac23b858b779 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v11 4/4] Publish partitioned table inserts as its own

To control whether partition changes are replicated using their
own identity (and schema) or an ancestor's, add a new parameter
that can be set per publication, named 'publish_using_root_schema'.
---
 doc/src/sgml/logical-replication.sgml       |  11 +-
 doc/src/sgml/ref/create_publication.sgml    |  17 +++
 src/backend/catalog/partition.c             |   9 ++
 src/backend/catalog/pg_publication.c        |  63 ++++++++-
 src/backend/commands/publicationcmds.c      |  95 ++++++++-----
 src/backend/commands/tablecmds.c            |   2 +-
 src/backend/executor/nodeModifyTable.c      |   4 +
 src/backend/replication/pgoutput/pgoutput.c | 211 ++++++++++++++++++++++------
 src/backend/utils/cache/relcache.c          |   7 +-
 src/bin/pg_dump/pg_dump.c                   |  22 ++-
 src/bin/pg_dump/pg_dump.h                   |   1 +
 src/bin/psql/describe.c                     |  17 ++-
 src/include/catalog/partition.h             |   1 +
 src/include/catalog/pg_publication.h        |   7 +-
 src/test/regress/expected/publication.out   | 103 ++++++++------
 src/test/regress/sql/publication.sql        |   3 +
 src/test/subscription/t/013_partition.pl    | 170 +++++++++++++++++++++-
 17 files changed, 590 insertions(+), 153 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8bd7c9c8ac..a99e90b331 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,15 +402,8 @@
 
    <listitem>
     <para>
-     Replication is only supported by tables, partitioned or not, although a
-     given table must either be partitioned on both servers or not partitioned
-     at all.  Also, when replicating between partitioned tables, the actual
-     replication occurs between leaf partitions, so partitions on the two
-     servers must match one-to-one.
-    </para>
-
-    <para>
-     Attempts to replicate other types of relations such as views, materialized
+     Replication is only supported by tables, partitioned or not.
+     Attempts to replicate other types of relations such as views, materialized
      views, or foreign tables, will result in an error.
     </para>
    </listitem>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index a304f9b8c3..b51701a623 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -122,6 +122,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_using_root_schema</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table (or on its partitions) contained in the publication will be
+          published using its own schema rather than that of the individual
+          partitions that are actually changed; the latter is the default.
+          Setting it to <literal>true</literal> allows the changes to be
+          replicated into a non-partitioned table or a partitioned table
+          consisting of a different set of partitions.  However,
+          <literal>TRUNCATE</literal> operations performed directly on
+          partitions are not replicated.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 239ac017fa..07853b85d5 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -28,6 +28,7 @@
 #include "partitioning/partbounds.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/partcache.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
@@ -126,6 +127,14 @@ get_partition_ancestors(Oid relid)
 	return result;
 }
 
+/* Is given relation a leaf partition? */
+bool
+is_leaf_partition(Oid relid)
+{
+	return	get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE &&
+			get_rel_relispartition(relid);
+}
+
 /*
  * get_partition_ancestors_worker
  *		recursive worker for get_partition_ancestors
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index ea13cced79..7dc23ecf70 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -224,13 +224,30 @@ publication_add_relation(Oid pubid, Relation targetrel,
 /*
  * Gets list of publication oids for a relation, plus those of ancestors,
  * if any, if the relation is a partition.
+ *
+ * *published_rels, if asked for, will contain the OID of the relation for
+ * each publication returned, that is, of the relation that is actually
+ * published.  Examining this list allows the caller, for instance, to
+ * distinguish publications that it is directly part from those that it is
+ * indirectly part of via an ancestor.
  */
 List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Oid relid, List **published_rels)
 {
 	List	   *result = NIL;
+	int			i,
+				num;
+
+	if (published_rels)
+		*published_rels = NIL;
 
 	result = get_rel_publications(relid);
+	if (published_rels)
+	{
+		num = list_length(result);
+		for (i = 0; i < num; i++)
+			*published_rels = lappend_oid(*published_rels, relid);
+	}
 	if (get_rel_relispartition(relid))
 	{
 		List	   *ancestors = get_partition_ancestors(relid);
@@ -242,6 +259,12 @@ GetRelationPublications(Oid relid)
 			List	   *ancestor_pubs = get_rel_publications(ancestor);
 
 			result = list_concat(result, ancestor_pubs);
+			if (published_rels)
+			{
+				num = list_length(ancestor_pubs);
+				for (i = 0; i < num; i++)
+					*published_rels = lappend_oid(*published_rels, ancestor);
+			}
 		}
 	}
 
@@ -380,9 +403,13 @@ GetAllTablesPublications(void)
 
 /*
  * Gets list of all relation published by FOR ALL TABLES publication(s).
+ *
+ * If the publication publishes partition changes via their respective root
+ * partitioned tables, we must exclude partitions in favor of including the
+ * root partitioned tables.
  */
 List *
-GetAllTablesPublicationRelations(void)
+GetAllTablesPublicationRelations(bool pubasroot)
 {
 	Relation	classRel;
 	ScanKeyData key[1];
@@ -404,12 +431,35 @@ GetAllTablesPublicationRelations(void)
 		Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
 		Oid			relid = relForm->oid;
 
-		if (is_publishable_class(relid, relForm))
+		if (is_publishable_class(relid, relForm) &&
+			!(relForm->relispartition && pubasroot))
 			result = lappend_oid(result, relid);
 	}
 
 	table_endscan(scan);
-	table_close(classRel, AccessShareLock);
+
+	if (pubasroot)
+	{
+		ScanKeyInit(&key[0],
+					Anum_pg_class_relkind,
+					BTEqualStrategyNumber, F_CHAREQ,
+					CharGetDatum(RELKIND_PARTITIONED_TABLE));
+
+		scan = table_beginscan_catalog(classRel, 1, key);
+
+		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+		{
+			Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
+			Oid			relid = relForm->oid;
+
+			if (is_publishable_class(relid, relForm) &&
+				!relForm->relispartition)
+				result = lappend_oid(result, relid);
+		}
+
+		table_endscan(scan);
+		table_close(classRel, AccessShareLock);
+	}
 
 	return result;
 }
@@ -440,6 +490,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->pubasroot = pubform->pubasroot;
 
 	ReleaseSysCache(tup);
 
@@ -540,9 +591,11 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		 * need those.
 		 */
 		if (publication->alltables)
-			tables = GetAllTablesPublicationRelations();
+			tables = GetAllTablesPublicationRelations(publication->pubasroot);
 		else
 			tables = GetPublicationRelations(publication->oid,
+											 publication->pubasroot ?
+											 PUBLICATION_PART_ROOT :
 											 PUBLICATION_PART_LEAF);
 		funcctx->user_fctx = (void *) tables;
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 23b9e1a5ae..d54e439f4b 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -23,6 +23,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -55,20 +56,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_using_root_schema_given,
+						  bool *publish_using_root_schema)
 {
 	ListCell   *lc;
 
+	*publish_using_root_schema_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* By default, a relation's changes are published using its own schema. */
+	*publish_using_root_schema = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -90,10 +94,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -109,19 +113,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_using_root_schema") == 0)
+		{
+			if (*publish_using_root_schema_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_using_root_schema_given = true;
+			*publish_using_root_schema = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -142,10 +155,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -182,9 +194,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -192,13 +204,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubasroot - 1] =
+		BoolGetDatum(publish_using_root_schema);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -250,17 +264,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -269,19 +282,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_using_root_schema_given)
+	{
+		values[Anum_pg_publication_pubasroot - 1] = BoolGetDatum(publish_using_root_schema);
+		replaces[Anum_pg_publication_pubasroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 70589dd1dc..46366c095b 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14630,7 +14630,7 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 	 * UNLOGGED as UNLOGGED tables can't be published.
 	 */
 	if (!toLogged &&
-		list_length(GetRelationPublications(RelationGetRelid(rel))) > 0)
+		list_length(GetRelationPublications(RelationGetRelid(rel), NULL)) > 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("cannot change table \"%s\" to unlogged because it is part of a publication",
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 59d1a31c97..f88377a0c2 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2295,8 +2295,12 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		/* This is needed only to check the replica identity. */
+		CheckValidResultRel(mtstate->rootResultRelInfo, operation);
+	}
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index d6b9cbe1bd..ac88ba4f83 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,8 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -20,6 +22,7 @@
 #include "replication/pgoutput.h"
 #include "utils/int8.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -49,6 +52,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +63,33 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If ancestor relid is set, its schema must also
+	 * have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * True if the publication matched by get_rel_sync_entry for this
+	 * relation has publish_using_root_schema (pubasroot) set.
+	 */
+	bool		pubasroot;
+
+	/*
+	 * OID of the ancestor whose schema will be used when replicating changes
+	 * to a partition; InvalidOid if pubasroot is false.
+	 */
+	Oid			replicate_as_relid;
+
+	/*
+	 * Map, if any, used when replicating using an ancestor's schema to
+	 * convert tuples from the partition's rowtype to the ancestor's;
+	 * NULL if pubasroot is false.
+	 */
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +287,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
+
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		desc = RelationGetDescr(relation);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+/*
+ * Send the schema of the given relation, preceded by its user-defined types
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+		if (att->attisdropped || att->attgenerated)
+			continue;
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +399,68 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -413,9 +506,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 
 		/*
 		 * Don't send partitioned tables, because partitions would be
-		 * sent instead.
+		 * sent instead, unless the user requested publishing via the root.
 		 */
-		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+			!relentry->pubasroot)
 			continue;
 
 		relids[nrelids++] = relid;
@@ -540,7 +634,8 @@ init_rel_sync_cache(MemoryContext cachectx)
  * This looks up publications that given relation is directly or indirectly
  * part of (latter if it's really the relation's ancestor that is part of a
  * publication) and fills up the found entry with the information about
- * which operations to publish.
+ * which operations to publish and whether to use an ancestor's schema
+ * when publishing.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
@@ -562,8 +657,10 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	/* Not found means schema wasn't sent */
 	if (!found || !entry->replicate_valid)
 	{
-		List	   *pubids = GetRelationPublications(relid);
+		List	   *published_rels = NIL;
+		List	   *pubids = GetRelationPublications(relid, &published_rels);
 		ListCell   *lc;
+		Oid			ancestor = InvalidOid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -588,13 +685,42 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
+			bool		publish = false;
+
+			if (pub->alltables)
+			{
+				publish = true;
+				if (pub->pubasroot && get_rel_relispartition(relid))
+					ancestor = llast_oid(get_partition_ancestors(relid));
+			}
+
+			if (!publish)
+			{
+				ListCell *lc1,
+						 *lc2;
+
+				forboth(lc1, pubids, lc2, published_rels)
+				{
+					Oid		pubid = lfirst_oid(lc1);
+					Oid		pub_relid = lfirst_oid(lc2);
+					if (pubid == pub->oid)
+					{
+						publish = true;
+						if (pub->pubasroot && pub_relid != relid)
+							ancestor = pub_relid;
+						break;
+					}
+				}
+			}
 
-			if (pub->alltables || list_member_oid(pubids, pub->oid))
+			if (publish)
 			{
 				entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
 				entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
 				entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
-				entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				if (!OidIsValid(ancestor))
+					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				entry->pubasroot = pub->pubasroot;
 			}
 
 			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
@@ -604,6 +730,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 
 		list_free(pubids);
 
+		entry->replicate_as_relid = ancestor;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index df025a5a30..cf5736b311 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -43,6 +43,7 @@
 #include "catalog/catalog.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
@@ -5138,7 +5139,7 @@ GetRelationPublicationActions(Relation relation)
 					  sizeof(PublicationActions));
 
 	/* Fetch the publication membership info. */
-	puboids = GetRelationPublications(RelationGetRelid(relation));
+	puboids = GetRelationPublications(RelationGetRelid(relation), NULL);
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
@@ -5157,7 +5158,9 @@ GetRelationPublicationActions(Relation relation)
 		pubactions->pubinsert |= pubform->pubinsert;
 		pubactions->pubupdate |= pubform->pubupdate;
 		pubactions->pubdelete |= pubform->pubdelete;
-		pubactions->pubtruncate |= pubform->pubtruncate;
+		if (!pubform->pubasroot ||
+			!is_leaf_partition(RelationGetRelid(relation)))
+			pubactions->pubtruncate |= pubform->pubtruncate;
 
 		ReleaseSysCache(tup);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index dc33c20048..bdbd1f823b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3780,6 +3780,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubasroot;
 	int			i,
 				ntups;
 
@@ -3791,11 +3792,18 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubasroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false as pubasroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
@@ -3819,6 +3827,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubasroot = PQfnumber(res, "pubasroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3841,6 +3850,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubasroot =
+			(strcmp(PQgetvalue(res, i, i_pubasroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -3917,7 +3928,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubasroot)
+		appendPQExpBufferStr(query, ", publish_using_root_schema = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 21004e5078..90e47dd1f3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -600,6 +600,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubasroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index f3c7eb96fa..3f6ce713af 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5706,7 +5706,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5737,6 +5737,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubasroot AS \"%s\"",
+						  gettext_noop("Publishes Using Root Schema"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5778,6 +5782,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubasroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5790,6 +5795,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubasroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5800,6 +5806,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubasroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubasroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5849,6 +5858,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubasroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5861,6 +5872,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubasroot)
+			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5869,6 +5882,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubasroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 27873aff6e..c6c19119ca 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -21,6 +21,7 @@
 
 extern Oid	get_partition_parent(Oid relid);
 extern List *get_partition_ancestors(Oid relid);
+extern bool is_leaf_partition(Oid relid);
 extern Oid	index_get_partition(Relation partition, Oid indexId);
 extern List *map_partition_varattnos(List *expr, int fromrel_varno,
 									 Relation to_rel, Relation from_rel);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 20d95e5914..956ac383d8 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubasroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,12 +76,13 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		pubasroot;
 	PublicationActions pubactions;
 } Publication;
 
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
-extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationPublications(Oid relid, List **published_rels);
 
 /*---------
  * Expected values for pub_partopt parameter of GetRelationPublications(),
@@ -95,7 +98,7 @@ extern List *GetRelationPublications(Oid relid);
 #define	PUBLICATION_PART_ALL	2
 extern List *GetPublicationRelations(Oid pubid, int pub_partopt);
 extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(void);
+extern List *GetAllTablesPublicationRelations(bool pubasroot);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2634d2c1e1..d2d269b11b 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                                       Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -129,10 +131,10 @@ ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
 -- only parent is listed as being in publication, not the partition
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_parted"
 
@@ -143,6 +145,15 @@ HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | t
+Tables:
+    "public.testpub_parted"
+
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 -- fail - view
@@ -159,10 +170,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                          Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -200,10 +211,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -247,10 +258,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -260,20 +271,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                                    List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                                       List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 219e04129d..9742aef802 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
 
 \dRp
 
@@ -87,6 +88,8 @@ UPDATE testpub_parted1 SET a = 1;
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 1ec487154b..6cb484aded 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 15;
+use Test::More tests => 34;
 
 # setup
 
@@ -25,7 +25,11 @@ my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub1");
 $node_publisher->safe_psql('postgres',
-	"CREATE PUBLICATION pub_all FOR ALL TABLES");
+	"CREATE PUBLICATION pub_all FOR ALL TABLES WITH (publish_using_root_schema = true)");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub2");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub3 WITH (publish_using_root_schema = true)");
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_publisher->safe_psql('postgres',
@@ -34,8 +38,24 @@ $node_publisher->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES IN (1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_2 PARTITION OF tab2 FOR VALUES IN (5, 6)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3_1 PARTITION OF tab3 FOR VALUES IN (1, 2, 3, 5, 6)");
 $node_publisher->safe_psql('postgres',
 	"ALTER PUBLICATION pub1 ADD TABLE tab1, tab1_1");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub2 ADD TABLE tab1_1, tab1_2");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub3 ADD TABLE tab2, tab3_1");
 
 # subscriber1
 $node_subscriber1->safe_psql('postgres',
@@ -51,18 +71,42 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_2_1 PARTITION OF tab1_2 FOR VALUES IN (5)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub1_tab2', b text) PARTITION BY RANGE (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (c text DEFAULT 'sub1_tab2', b text, a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (c text DEFAULT 'sub1_tab3_1', b text, a int NOT NULL PRIMARY KEY)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES FROM (1) TO (10)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub4 CONNECTION '$publisher_connstr' PUBLICATION pub3");
 
 # subscriber 2
 $node_subscriber2->safe_psql('postgres',
-	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text)");
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub_all");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub2");
 
 # Wait for initial sync of all subscriptions
 my $synced_query =
@@ -79,14 +123,28 @@ $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_1 (a) VALUES (3)");
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (3), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab3 VALUES (1), (3), (5)");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 my $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|1|5), 'insert into tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|1|5), 'insert into tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|1|3), 'inserts into tab1_1 replicated');
@@ -95,32 +153,68 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|1|5), 'inserts into tab1 replicated');
+
 # update (no partition change)
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 2 WHERE a = 1");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 2 WHERE a = 1");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 2 WHERE a = 1");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|2|5), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|2|5), 'update of tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|2|3), 'update of tab1_1 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|2|5), 'update of tab1 replicated');
+
 # update (partition changes)
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 6 WHERE a = 2");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 6 WHERE a = 2");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 6 WHERE a = 2");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|3|6), 'update of tab1 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|3|6), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|3|6), 'update of tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
@@ -129,19 +223,41 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|3|6), 'update of tab1 replicated');
+
 # delete
 $node_publisher->safe_psql('postgres',
 	"DELETE FROM tab1 WHERE a IN (3, 5)");
 $node_publisher->safe_psql('postgres',
 	"DELETE FROM tab1_2");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab2 WHERE a IN (3, 5)");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab3 WHERE a IN (3, 5)");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(1|6|6), 'delete from tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(1|6|6), 'delete from tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_1");
 is($result, qq(0||), 'delete from tab1_1 replicated');
@@ -150,34 +266,80 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'delete from tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1 replicated');
+
 # truncate
 $node_subscriber1->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (2), (5)");
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab3_1 (a) VALUES (1), (2), (5)");
 $node_subscriber2->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (2)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_1 VALUES (1)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1_2");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab2_1");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(2|1|2), 'truncate of tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(4|1|6), 'truncate of tab2_1 NOT replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'truncate of tab1_2 replicated');
 
+$node_subscriber2->safe_psql('postgres',
+	"DROP SUBSCRIPTION sub3");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (2)");
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab2");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab3");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
-is($result, qq(0||), 'truncate of tab1_1 replicated');
+is($result, qq(0||), 'truncate of tab1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(1|1|1), 'tab1_1 unchanged');
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'truncate of tab2 replicated');
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(0||), 'truncate of tab3_1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(1|2|2), 'tab1_2 unchanged');
-- 
2.16.5

#39Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#38)
Re: adding partitioned tables to publications

I have committed the 0001 patch of this series (partitioned table member
of publication). I changed the new argument of
GetPublicationRelations() to an enum and reformatted some comments.
I'll continue looking through the subsequent patches.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#40Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#39)
Re: adding partitioned tables to publications

On Tue, Mar 10, 2020 at 5:52 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

I have committed the 0001 patch of this series (partitioned table member
of publication). I changed the new argument of
GetPublicationRelations() to an enum and reformatted some comments.
I'll continue looking through the subsequent patches.

Thank you.

- Amit

#41Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#40)
1 attachment(s)
Re: adding partitioned tables to publications

I was trying to extract some preparatory work from the remaining patches
and came up with the attached. This is part of your patch 0003, but
also relevant for part 0004. The problem was that COPY (SELECT *) is
not sufficient when the table has generated columns, so we need to build
the column list explicitly.

Thoughts?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-Prepare-to-support-non-tables-in-publications.patchtext/plain; charset=UTF-8; name=0001-Prepare-to-support-non-tables-in-publications.patch; x-mac-creator=0; x-mac-type=0Download
From c52672ff1de4d7c55c4b50eb7986f45801ea4c60 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Mon, 16 Mar 2020 13:27:51 +0100
Subject: [PATCH] Prepare to support non-tables in publications

This by itself doesn't change any functionality but prepares the way
for having relations other than base tables in publications.

Make arrangements for COPY handling the initial table sync.  For
non-tables we have to use COPY (SELECT ...) instead of directly
copying from the table, but then we have to take care to omit
generated columns from the column list.

Also, remove a hardcoded reference to relkind = 'r' and rely on
pg_relation_is_publishable(), which matches what the publisher can
actually publish and is correct even in cross-version scenarios.

Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
---
 src/backend/replication/logical/tablesync.c | 43 ++++++++++++++++-----
 src/include/replication/logicalproto.h      |  1 +
 2 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 98825f01e9..358b0a3726 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -639,26 +639,27 @@ fetch_remote_table_info(char *nspname, char *relname,
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {OIDOID, CHAROID};
-	Oid			attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
+	Oid			tableRow[] = {OIDOID, CHAROID, CHAROID, BOOLOID};
+	Oid			attrRow[] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
 	bool		isnull;
 	int			natt;
+	bool		remote_is_publishable;
 
 	lrel->nspname = nspname;
 	lrel->relname = relname;
 
 	/* First fetch Oid and replica identity. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident"
+	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident, c.relkind,"
+					 "    pg_relation_is_publishable(c.oid)"
 					 "  FROM pg_catalog.pg_class c"
 					 "  INNER JOIN pg_catalog.pg_namespace n"
 					 "        ON (c.relnamespace = n.oid)"
 					 " WHERE n.nspname = %s"
-					 "   AND c.relname = %s"
-					 "   AND c.relkind = 'r'",
+					 "   AND c.relname = %s",
 					 quote_literal_cstr(nspname),
 					 quote_literal_cstr(relname));
-	res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+	res = walrcv_exec(wrconn, cmd.data, lengthof(tableRow), tableRow);
 
 	if (res->status != WALRCV_OK_TUPLES)
 		ereport(ERROR,
@@ -675,6 +676,13 @@ fetch_remote_table_info(char *nspname, char *relname,
 	Assert(!isnull);
 	lrel->replident = DatumGetChar(slot_getattr(slot, 2, &isnull));
 	Assert(!isnull);
+	lrel->relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+	Assert(!isnull);
+	remote_is_publishable = DatumGetBool(slot_getattr(slot, 4, &isnull));
+	if (isnull || !remote_is_publishable)
+		ereport(ERROR,
+				(errmsg("table \"%s.%s\" on the publisher is not publishable",
+						nspname, relname)));
 
 	ExecDropSingleTupleTableSlot(slot);
 	walrcv_clear_result(res);
@@ -696,7 +704,7 @@ fetch_remote_table_info(char *nspname, char *relname,
 					 lrel->remoteid,
 					 (walrcv_server_version(wrconn) >= 120000 ? "AND a.attgenerated = ''" : ""),
 					 lrel->remoteid);
-	res = walrcv_exec(wrconn, cmd.data, 4, attrRow);
+	res = walrcv_exec(wrconn, cmd.data, lengthof(attrRow), attrRow);
 
 	if (res->status != WALRCV_OK_TUPLES)
 		ereport(ERROR,
@@ -765,8 +773,25 @@ copy_table(Relation rel)
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "COPY %s TO STDOUT",
-					 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	if (lrel.relkind == RELKIND_RELATION)
+		appendStringInfo(&cmd, "COPY %s TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	else
+	{
+		/*
+		 * For non-tables, we need to do COPY (SELECT ...), but we can't just
+		 * do SELECT * because we need to not copy generated columns.
+		 */
+		appendStringInfo(&cmd, "COPY (SELECT ");
+		for (int i = 0; i < lrel.natts; i++)
+		{
+			appendStringInfoString(&cmd, quote_identifier(lrel.attnames[i]));
+			if (i < lrel.natts - 1)
+				appendStringInfoString(&cmd, ", ");
+		}
+		appendStringInfo(&cmd, " FROM %s) TO STDOUT",
+						 quote_qualified_identifier(lrel.nspname, lrel.relname));
+	}
 	res = walrcv_exec(wrconn, cmd.data, 0, NULL);
 	pfree(cmd.data);
 	if (res->status != WALRCV_OK_COPY_OUT)
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 2cc2dc4db3..4860561be9 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -49,6 +49,7 @@ typedef struct LogicalRepRelation
 	char	  **attnames;		/* column names */
 	Oid		   *atttyps;		/* column types */
 	char		replident;		/* replica identity */
+	char		relkind;		/* remote relation kind */
 	Bitmapset  *attkeys;		/* Bitmap of key columns */
 } LogicalRepRelation;
 
-- 
2.25.0

#42Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#41)
Re: adding partitioned tables to publications

Hi Peter,

On Mon, Mar 16, 2020 at 9:49 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

I was trying to extract some preparatory work from the remaining patches
and came up with the attached. This is part of your patch 0003, but
also relevant for part 0004. The problem was that COPY (SELECT *) is
not sufficient when the table has generated columns, so we need to build
the column list explicitly.

Thoughts?

Thank you for that.

+   if (isnull || !remote_is_publishable)
+       ereport(ERROR,
+               (errmsg("table \"%s.%s\" on the publisher is not publishable",
+                       nspname, relname)));

Maybe add a one-line comment above this to say it's a "not supposed
to happen" error, or am I missing something? Wouldn't elog() suffice
for this?

Other than that, looks good.

--
Thank you,
Amit

#43Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#42)
3 attachment(s)
Re: adding partitioned tables to publications

On Wed, Mar 18, 2020 at 12:06 PM Amit Langote <amitlangote09@gmail.com> wrote:

Hi Peter,

On Mon, Mar 16, 2020 at 9:49 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

I was trying to extract some preparatory work from the remaining patches
and came up with the attached. This is part of your patch 0003, but
also relevant for part 0004. The problem was that COPY (SELECT *) is
not sufficient when the table has generated columns, so we need to build
the column list explicitly.

Thoughts?

Thank you for that.

+   if (isnull || !remote_is_publishable)
+       ereport(ERROR,
+               (errmsg("table \"%s.%s\" on the publisher is not publishable",
+                       nspname, relname)));

Maybe add a one-line comment above this to say it's a "not supposed
to happen" error, or am I missing something? Wouldn't elog() suffice
for this?

Other than that, looks good.

Wait, the following Assert in copy_table() should now be gone:

Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);

because just below it:

    /* Start copy on the publisher. */
    initStringInfo(&cmd);
-   appendStringInfo(&cmd, "COPY %s TO STDOUT",
-                    quote_qualified_identifier(lrel.nspname, lrel.relname));
+   if (lrel.relkind == RELKIND_RELATION)
+       appendStringInfo(&cmd, "COPY %s TO STDOUT",
+                        quote_qualified_identifier(lrel.nspname,
lrel.relname));
+   else
+   {
+       /*
+        * For non-tables, we need to do COPY (SELECT ...), but we can't just
+        * do SELECT * because we need to not copy generated columns.
+        */

By the way, I have rebased the patches, although maybe you've got your
own copies; attached.

--
Thank you,
Amit

Attachments:

v12-0003-Add-subscription-support-to-replicate-into-parti.patchapplication/octet-stream; name=v12-0003-Add-subscription-support-to-replicate-into-parti.patchDownload
From e726e37a73f1a070b9af5775c101f33acc1b5401 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlangote09@gmail.com>
Date: Thu, 23 Jan 2020 11:49:01 +0900
Subject: [PATCH v12 3/4] Add subscription support to replicate into
 partitioned tables

Mainly, this adds support code in logical/worker.c for applying
replicated operations whose target is a partitioned table to its
relevant partitions.
---
 src/backend/executor/execReplication.c     |  14 +-
 src/backend/replication/logical/relation.c | 161 ++++++++++++++
 src/backend/replication/logical/worker.c   | 232 ++++++++++++++++++++-
 src/include/replication/logicalrelation.h  |   2 +
 src/test/subscription/t/013_partition.pl   |   7 +-
 5 files changed, 395 insertions(+), 21 deletions(-)

diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 7194becfd9..dc8a01a5cd 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -594,17 +594,9 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * Give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -612,7 +604,7 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 3d7291b970..54189d7965 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -34,6 +34,7 @@ static MemoryContext LogicalRepRelMapContext = NULL;
 
 static HTAB *LogicalRepRelMap = NULL;
 static HTAB *LogicalRepTypMap = NULL;
+static HTAB *LogicalRepPartMap = NULL;
 
 
 /*
@@ -472,3 +473,163 @@ logicalrep_typmap_gettypname(Oid remoteid)
 	Assert(OidIsValid(entry->remoteid));
 	return psprintf("%s.%s", entry->nspname, entry->typname);
 }
+
+/*
+ * Partition cache: look up partition LogicalRepRelMapEntry's
+ *
+ * Unlike relation map cache, this is keyed by partition OID, not remote
+ * relation OID, because we only have to use this cache in the case where
+ * partitions are not directly mapped to any remote relation, such as when
+ * replication is occurring with one of their ancestors as target.
+ */
+
+/*
+ * Relcache invalidation callback
+ */
+static void
+logicalrep_partmap_invalidate_cb(Datum arg, Oid reloid)
+{
+	LogicalRepRelMapEntry *entry;
+
+	/* Just to be sure. */
+	if (LogicalRepPartMap == NULL)
+		return;
+
+	if (reloid != InvalidOid)
+	{
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		/* TODO, use inverse lookup hashtable? */
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+		{
+			if (entry->localreloid == reloid)
+			{
+				entry->localreloid = InvalidOid;
+				hash_seq_term(&status);
+				break;
+			}
+		}
+	}
+	else
+	{
+		/* invalidate all cache entries */
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+			entry->localreloid = InvalidOid;
+	}
+}
+
+/*
+ * Initialize the partition map cache.
+ */
+static void
+logicalrep_partmap_init(void)
+{
+	HASHCTL		ctl;
+
+	if (!LogicalRepRelMapContext)
+		LogicalRepRelMapContext =
+			AllocSetContextCreate(CacheMemoryContext,
+								  "LogicalRepPartMapContext",
+								  ALLOCSET_DEFAULT_SIZES);
+
+	/* Initialize the relation hash table. */
+	MemSet(&ctl, 0, sizeof(ctl));
+	ctl.keysize = sizeof(Oid);	/* partition OID */
+	ctl.entrysize = sizeof(LogicalRepRelMapEntry);
+	ctl.hcxt = LogicalRepRelMapContext;
+
+	LogicalRepPartMap = hash_create("logicalrep partition map cache", 64, &ctl,
+								   HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+	/* Watch for invalidation events. */
+	CacheRegisterRelcacheCallback(logicalrep_partmap_invalidate_cb,
+								  (Datum) 0);
+}
+
+/*
+ * logicalrep_partition_open
+ *
+ * Returned entry reuses most of the values of the root table's entry, save
+ * the attribute map, which can be different for the partition.
+ */
+LogicalRepRelMapEntry *
+logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map)
+{
+	LogicalRepRelMapEntry *entry;
+	LogicalRepRelation *remoterel = &root->remoterel;
+	Oid			partOid = RelationGetRelid(partrel);
+	AttrMap	   *attrmap = root->attrmap;
+	bool		found;
+	int			i;
+	MemoryContext oldctx;
+
+	if (LogicalRepPartMap == NULL)
+		logicalrep_partmap_init();
+
+	/* Search for existing entry. */
+	entry = hash_search(LogicalRepPartMap, (void *) &partOid,
+						HASH_ENTER, &found);
+
+	if (found)
+		return entry;
+
+	memset(entry, 0, sizeof(LogicalRepRelMapEntry));
+
+	/* Make cached copy of the data */
+	oldctx = MemoryContextSwitchTo(LogicalRepRelMapContext);
+
+	/* Remote relation is used as-is from the root's entry. */
+	entry->remoterel.remoteid = remoterel->remoteid;
+	entry->remoterel.nspname = pstrdup(remoterel->nspname);
+	entry->remoterel.relname = pstrdup(remoterel->relname);
+	entry->remoterel.natts = remoterel->natts;
+	entry->remoterel.attnames = palloc(remoterel->natts * sizeof(char *));
+	entry->remoterel.atttyps = palloc(remoterel->natts * sizeof(Oid));
+	for (i = 0; i < remoterel->natts; i++)
+	{
+		entry->remoterel.attnames[i] = pstrdup(remoterel->attnames[i]);
+		entry->remoterel.atttyps[i] = remoterel->atttyps[i];
+	}
+	entry->remoterel.replident = remoterel->replident;
+	entry->remoterel.attkeys = bms_copy(remoterel->attkeys);
+
+	entry->localrel = partrel;
+	entry->localreloid = partOid;
+
+	/*
+	 * If the partition's attributes don't match the root relation's, we'll
+	 * need to make a new attrmap which maps partition attribute numbers to
+	 * remoterel's, instead of the original, which maps the root relation's
+	 * attribute numbers to remoterel's.
+	 */
+	if (map)
+	{
+		AttrNumber	attno;
+
+		entry->attrmap = make_attrmap(map->maplen);
+		memset(entry->attrmap->attnums, -1,
+			   entry->attrmap->maplen * sizeof(AttrNumber));
+		for (attno = 0; attno < entry->attrmap->maplen; attno++)
+		{
+			AttrNumber	root_attno = map->attnums[attno];
+
+			entry->attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
+		}
+	}
+	else
+		entry->attrmap = attrmap;
+
+	entry->updatable = root->updatable;
+
+	/* state and statelsn are left set to 0. */
+	MemoryContextSwitchTo(oldctx);
+
+	return entry;
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 63c2c01511..26472125fe 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -29,11 +29,14 @@
 #include "access/xlog_internal.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_subscription.h"
 #include "catalog/pg_subscription_rel.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "executor/nodeModifyTable.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
@@ -721,6 +724,152 @@ apply_handle_do_delete(ResultRelInfo *relinfo, EState *estate,
 	EvalPlanQualEnd(&epqstate);
 }
 
+/*
+ * This handles insert, update, delete on a partitioned table.
+ */
+static void
+apply_handle_tuple_routing(ResultRelInfo *relinfo,
+						   EState *estate,
+						   TupleTableSlot *remoteslot,
+						   LogicalRepTupleData *newtup,
+						   LogicalRepRelMapEntry *relmapentry,
+						   CmdType operation)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	ModifyTableState *mtstate = NULL;
+	PartitionTupleRouting *proute = NULL;
+	ResultRelInfo *partrelinfo;
+	TupleTableSlot *localslot;
+	PartitionRoutingInfo *partinfo;
+	TupleConversionMap *map;
+	MemoryContext oldctx;
+
+	/* ModifyTableState is needed for ExecFindPartition(). */
+	mtstate = makeNode(ModifyTableState);
+	mtstate->ps.plan = NULL;
+	mtstate->ps.state = estate;
+	mtstate->operation = operation;
+	mtstate->resultRelInfo = relinfo;
+	proute = ExecSetupPartitionTupleRouting(estate, mtstate, rel);
+
+	/*
+	 * Find a partition for the tuple contained in remoteslot.
+	 *
+	 * For insert, remoteslot is the tuple to insert.  For update and delete,
+	 * it is the tuple to be replaced or deleted, respectively.
+	 */
+	Assert(remoteslot != NULL);
+	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+	/* The following throws an error if a suitable partition is not found. */
+	partrelinfo = ExecFindPartition(mtstate, relinfo, proute,
+									remoteslot, estate);
+	Assert(partrelinfo != NULL);
+	/* Convert the tuple to match the partition's rowtype. */
+	partinfo = partrelinfo->ri_PartitionInfo;
+	map = partinfo->pi_RootToPartitionMap;
+	if (map != NULL)
+	{
+		TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+		remoteslot = execute_attr_map_slot(map->attrMap, remoteslot,
+										   part_slot);
+	}
+	MemoryContextSwitchTo(oldctx);
+
+	switch (operation)
+	{
+		case CMD_INSERT:
+			/* Just insert into the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_do_insert(partrelinfo, estate, remoteslot);
+			break;
+
+		case CMD_DELETE:
+			/* Just delete from the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_do_delete(partrelinfo, estate, remoteslot,
+								   &relmapentry->remoterel);
+			break;
+
+		case CMD_UPDATE:
+			{
+				ResultRelInfo *partrelinfo_new;
+
+				/*
+				 * partrelinfo computed above is the partition which might
+				 * contain the search tuple.  Now find the partition for the
+				 * replacement tuple, which might not be the same as
+				 * partrelinfo.
+				 */
+				localslot = table_slot_create(rel, &estate->es_tupleTable);
+				oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+				slot_modify_cstrings(localslot, remoteslot, relmapentry,
+									 newtup->values, newtup->changed);
+				partrelinfo_new = ExecFindPartition(mtstate, relinfo, proute,
+													localslot, estate);
+
+				MemoryContextSwitchTo(oldctx);
+
+				/*
+				 * If both the search and replacement tuples fall in the same
+				 * partition, we can apply this as an UPDATE on that
+				 * partition.
+				 */
+				if (partrelinfo == partrelinfo_new)
+				{
+					Relation	partrel = partrelinfo->ri_RelationDesc;
+					AttrMap	   *attrmap = map ? map->attrMap : NULL;
+					LogicalRepRelMapEntry *part_entry;
+
+					part_entry = logicalrep_partition_open(relmapentry,
+														   partrel, attrmap);
+
+					/* UPDATE partition. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_do_update(partrelinfo, estate, remoteslot,
+										   newtup, part_entry);
+				}
+				else
+				{
+					/*
+					 * Different, so handle this as DELETE followed by INSERT.
+					 */
+
+					/* DELETE from partition partrelinfo. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_do_delete(partrelinfo, estate, remoteslot,
+										   &relmapentry->remoterel);
+
+					/*
+					 * Convert the replacement tuple to match the destination
+					 * partition rowtype.
+					 */
+					oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+					partinfo = partrelinfo_new->ri_PartitionInfo;
+					map = partinfo->pi_RootToPartitionMap;
+					if (map != NULL)
+					{
+						TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+						localslot = execute_attr_map_slot(map->attrMap, localslot,
+														  part_slot);
+					}
+					MemoryContextSwitchTo(oldctx);
+					/* INSERT into partition partrelinfo_new. */
+					estate->es_result_relation_info = partrelinfo_new;
+					apply_handle_do_insert(partrelinfo_new, estate,
+										   localslot);
+				}
+			}
+			break;
+
+		default:
+			elog(ERROR, "unrecognized CmdType: %d", (int) operation);
+			break;
+	}
+
+	ExecCleanupTupleRouting(mtstate, proute);
+}
+
 /*
  * Handle INSERT message.
  */
@@ -763,9 +912,13 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_insert(estate->es_result_relation_info, estate,
-						   remoteslot);
+	/* For a partitioned table, insert the tuple into a partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_INSERT);
+	else
+		apply_handle_do_insert(estate->es_result_relation_info, estate,
+							   remoteslot);
 
 	PopActiveSnapshot();
 
@@ -880,9 +1033,13 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_update(estate->es_result_relation_info, estate,
-						   remoteslot, &newtup, rel);
+	/* For a partitioned table, apply update to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, &newtup, rel, CMD_UPDATE);
+	else
+		apply_handle_do_update(estate->es_result_relation_info, estate,
+							   remoteslot, &newtup, rel);
 
 	PopActiveSnapshot();
 
@@ -943,9 +1100,13 @@ apply_handle_delete(StringInfo s)
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_do_delete(estate->es_result_relation_info, estate,
-						   remoteslot, &rel->remoterel);
+	/* For a partitioned table, apply delete to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_DELETE);
+	else
+		apply_handle_do_delete(estate->es_result_relation_info, estate,
+							   remoteslot, &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -973,6 +1134,7 @@ apply_handle_truncate(StringInfo s)
 	List	   *remote_relids = NIL;
 	List	   *remote_rels = NIL;
 	List	   *rels = NIL;
+	List	   *part_rels = NIL;
 	List	   *relids = NIL;
 	List	   *relids_logged = NIL;
 	ListCell   *lc;
@@ -1002,6 +1164,52 @@ apply_handle_truncate(StringInfo s)
 		relids = lappend_oid(relids, rel->localreloid);
 		if (RelationIsLogicallyLogged(rel->localrel))
 			relids_logged = lappend_oid(relids_logged, rel->localreloid);
+
+		/*
+		 * Truncate partitions if we got a message to truncate a partitioned
+		 * table.
+		 */
+		if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		{
+			ListCell   *child;
+			List	   *children = find_all_inheritors(rel->localreloid,
+													   RowExclusiveLock,
+													   NULL);
+
+			foreach(child, children)
+			{
+				Oid			childrelid = lfirst_oid(child);
+				Relation	childrel;
+
+				if (list_member_oid(relids, childrelid))
+					continue;
+
+				/* find_all_inheritors already got lock */
+				childrel = table_open(childrelid, NoLock);
+
+				/*
+				 * It is possible that the parent table has children that are
+				 * temp tables of other backends.  We cannot safely access
+				 * such tables (because of buffering issues), and the best
+				 * thing to do is to silently ignore them.  Note that this
+				 * check is the same as one of the checks done in
+				 * truncate_check_activity() called below, still it is kept
+				 * here for simplicity.
+				 */
+				if (RELATION_IS_OTHER_TEMP(childrel))
+				{
+					table_close(childrel, RowExclusiveLock);
+					continue;
+				}
+
+				rels = lappend(rels, childrel);
+				part_rels = lappend(part_rels, childrel);
+				relids = lappend_oid(relids, childrelid);
+				/* Log this relation only if needed for logical decoding */
+				if (RelationIsLogicallyLogged(childrel))
+					relids_logged = lappend_oid(relids_logged, childrelid);
+			}
+		}
 	}
 
 	/*
@@ -1017,6 +1225,12 @@ apply_handle_truncate(StringInfo s)
 
 		logicalrep_rel_close(rel, NoLock);
 	}
+	foreach(lc, part_rels)
+	{
+		Relation rel = lfirst(lc);
+
+		table_close(rel, NoLock);
+	}
 
 	CommandCounterIncrement();
 }
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 9971a8028c..4650b4f9e1 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -34,6 +34,8 @@ extern void logicalrep_relmap_update(LogicalRepRelation *remoterel);
 
 extern LogicalRepRelMapEntry *logicalrep_rel_open(LogicalRepRelId remoteid,
 												  LOCKMODE lockmode);
+extern LogicalRepRelMapEntry *logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map);
 extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
 								 LOCKMODE lockmode);
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index ea5812ce18..7c08a2e6ca 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -42,10 +42,15 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
+
 $node_subscriber1->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
 $node_subscriber1->safe_psql('postgres',
-	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6) PARTITION BY LIST (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_1 PARTITION OF tab1_2 FOR VALUES IN (5)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
 
-- 
2.20.1 (Apple Git-117)
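To make the apply-side logic in this patch easier to follow, here is a hypothetical, heavily simplified Python sketch (none of these names exist in PostgreSQL) of the decision apply_handle_tuple_routing() makes for an UPDATE: route the search tuple and the replacement tuple to partitions, then either apply an in-place UPDATE or turn the change into a DELETE plus an INSERT when the row moves between partitions:

```python
def route(tup, partitions):
    """Return the first partition whose predicate accepts the tuple
    (stands in for ExecFindPartition, which errors out if none match)."""
    for name, pred in partitions.items():
        if pred(tup):
            return name
    raise LookupError("no partition found for tuple")

def apply_update(old_tup, new_tup, partitions):
    """Mirror the CMD_UPDATE branch of apply_handle_tuple_routing()."""
    src = route(old_tup, partitions)   # partition holding the search tuple
    dst = route(new_tup, partitions)   # partition for the replacement tuple
    if src == dst:
        return [("UPDATE", src)]
    # Row moves between partitions: apply as DELETE followed by INSERT.
    return [("DELETE", src), ("INSERT", dst)]

parts = {"p1": lambda t: t["a"] < 10, "p2": lambda t: t["a"] >= 10}
assert apply_update({"a": 1}, {"a": 2}, parts) == [("UPDATE", "p1")]
assert apply_update({"a": 1}, {"a": 20}, parts) == \
    [("DELETE", "p1"), ("INSERT", "p2")]
```

This is only a sketch of the control flow; the real code additionally converts tuples between the root's and each partition's row types via attribute maps.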

Attachment: v12-0002-Some-refactoring-of-logical-worker.c.patch (application/octet-stream)
From c1c3588654c86a483f1acbda3797c187430d3f0e Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 5 Dec 2019 09:17:06 +0900
Subject: [PATCH v12 2/4] Some refactoring of logical/worker.c

This moves the core work of apply_handle_{insert|update|delete},
namely inserting, updating, or deleting a tuple in a given relation,
into corresponding apply_handle_do_{insert|update|delete} functions,
so that those operations can also be performed on relations that are
not direct targets of replication.

An example of that is when replicating changes into a partitioned
table, some of which must be applied to its partitions.
---
 src/backend/replication/logical/worker.c | 261 +++++++++++++----------
 1 file changed, 153 insertions(+), 108 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ad4a732fd2..63c2c01511 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -579,6 +579,148 @@ GetRelationIdentityOrPK(Relation rel)
 	return idxoid;
 }
 
+/* Workhorse for apply_handle_insert() */
+static void
+apply_handle_do_insert(ResultRelInfo *relinfo,
+					   EState *estate, TupleTableSlot *localslot)
+{
+	ExecOpenIndices(relinfo, false);
+
+	/* Do the insert. */
+	ExecSimpleRelationInsert(estate, localslot);
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+}
+
+/* Workhorse for apply_handle_update() */
+static void
+apply_handle_do_update(ResultRelInfo *relinfo,
+					   EState *estate, TupleTableSlot *remoteslot,
+					   LogicalRepTupleData *newtup,
+					   LogicalRepRelMapEntry *relmapentry)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	LogicalRepRelation *remoterel = &relmapentry->remoterel;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+	MemoryContext oldctx;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	ExecOpenIndices(relinfo, false);
+
+	/*
+	 * Try to find tuple using either replica identity index, primary key or
+	 * if needed, sequential scan.
+	 */
+	idxoid = GetRelationIdentityOrPK(rel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(rel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, localslot);
+	else
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
+										 remoteslot, localslot);
+
+	ExecClearTuple(remoteslot);
+
+	/*
+	 * Tuple found.
+	 *
+	 * Note this will fail if there are other conflicting unique indexes.
+	 */
+	if (found)
+	{
+		/* Process and store remote tuple in the slot */
+		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+		slot_modify_cstrings(remoteslot, localslot, relmapentry,
+							 newtup->values, newtup->changed);
+		MemoryContextSwitchTo(oldctx);
+
+		EvalPlanQualSetSlot(&epqstate, remoteslot);
+
+		/* Do the actual update. */
+		ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
+	}
+	else
+	{
+		/*
+		 * The tuple to be updated could not be found.
+		 *
+		 * TODO what to do here, change the log level to LOG perhaps?
+		 */
+		elog(DEBUG1,
+			 "logical replication did not find row for update "
+			 "in replication target relation \"%s\"",
+			 RelationGetRelationName(rel));
+	}
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+	EvalPlanQualEnd(&epqstate);
+}
+
+/* Workhorse for apply_handle_delete() */
+static void
+apply_handle_do_delete(ResultRelInfo *relinfo, EState *estate,
+					   TupleTableSlot *remoteslot,
+					   LogicalRepRelation *remoterel)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	/*
+	 * Try to find tuple using either replica identity index, primary key or
+	 * if needed, sequential scan.
+	 */
+	idxoid = GetRelationIdentityOrPK(rel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(rel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, localslot);
+	else
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
+										 remoteslot, localslot);
+	ExecOpenIndices(relinfo, false);
+
+	/* If found delete it. */
+	if (found)
+	{
+		EvalPlanQualSetSlot(&epqstate, localslot);
+
+		/* Do the actual delete. */
+		ExecSimpleRelationDelete(estate, &epqstate, localslot);
+	}
+	else
+	{
+		/* The tuple to be deleted could not be found. */
+		elog(DEBUG1,
+			 "logical replication could not find row for delete "
+			 "in replication target relation \"%s\"",
+			 RelationGetRelationName(rel));
+	}
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+	EvalPlanQualEnd(&epqstate);
+}
+
 /*
  * Handle INSERT message.
  */
@@ -621,13 +763,10 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	ExecOpenIndices(estate->es_result_relation_info, false);
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_insert(estate->es_result_relation_info, estate,
+						   remoteslot);
 
-	/* Do the insert. */
-	ExecSimpleRelationInsert(estate, remoteslot);
-
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
@@ -684,16 +823,12 @@ apply_handle_update(StringInfo s)
 {
 	LogicalRepRelMapEntry *rel;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	LogicalRepTupleData oldtup;
 	LogicalRepTupleData newtup;
 	bool		has_oldtup;
-	TupleTableSlot *localslot;
 	TupleTableSlot *remoteslot;
 	RangeTblEntry *target_rte;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -719,9 +854,6 @@ apply_handle_update(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
 	/*
 	 * Populate updatedCols so that per-column triggers can fire.  This could
@@ -741,7 +873,6 @@ apply_handle_update(StringInfo s)
 	fill_extraUpdatedCols(target_rte, RelationGetDescr(rel->localrel));
 
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
 	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
@@ -749,63 +880,15 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL && has_oldtup));
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_update(estate->es_result_relation_info, estate,
+						   remoteslot, &newtup, rel);
 
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-
-	ExecClearTuple(remoteslot);
-
-	/*
-	 * Tuple found.
-	 *
-	 * Note this will fail if there are other conflicting unique indexes.
-	 */
-	if (found)
-	{
-		/* Process and store remote tuple in the slot */
-		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
-		slot_modify_cstrings(remoteslot, localslot, rel,
-							 newtup.values, newtup.changed);
-		MemoryContextSwitchTo(oldctx);
-
-		EvalPlanQualSetSlot(&epqstate, remoteslot);
-
-		/* Do the actual update. */
-		ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
-	}
-	else
-	{
-		/*
-		 * The tuple to be updated could not be found.
-		 *
-		 * TODO what to do here, change the log level to LOG perhaps?
-		 */
-		elog(DEBUG1,
-			 "logical replication did not find row for update "
-			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
-	}
-
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
 	AfterTriggerEndQuery(estate);
 
-	EvalPlanQualEnd(&epqstate);
 	ExecResetTupleTable(estate->es_tupleTable, false);
 	FreeExecutorState(estate);
 
@@ -825,12 +908,8 @@ apply_handle_delete(StringInfo s)
 	LogicalRepRelMapEntry *rel;
 	LogicalRepTupleData oldtup;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	TupleTableSlot *remoteslot;
-	TupleTableSlot *localslot;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -855,58 +934,24 @@ apply_handle_delete(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
+	/* Input functions may need an active snapshot, so get one */
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
-	/* Find the tuple using the replica identity index. */
+	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL));
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_do_delete(estate->es_result_relation_info, estate,
+						   remoteslot, &rel->remoterel);
 
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-	/* If found delete it. */
-	if (found)
-	{
-		EvalPlanQualSetSlot(&epqstate, localslot);
-
-		/* Do the actual delete. */
-		ExecSimpleRelationDelete(estate, &epqstate, localslot);
-	}
-	else
-	{
-		/* The tuple to be deleted could not be found. */
-		elog(DEBUG1,
-			 "logical replication could not find row for delete "
-			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
-	}
-
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
 	AfterTriggerEndQuery(estate);
 
-	EvalPlanQualEnd(&epqstate);
 	ExecResetTupleTable(estate->es_tupleTable, false);
 	FreeExecutorState(estate);
 
-- 
2.20.1 (Apple Git-117)
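The apply_handle_do_update() and apply_handle_do_delete() workhorses above share the same lookup strategy: use the replica identity index (or primary key) when one exists, otherwise fall back to a whole-row sequential scan, which is only legal when the remote relation has REPLICA IDENTITY FULL. A hypothetical Python sketch of that decision, with invented data structures standing in for the relation and its indexes:

```python
def find_replica_tuple(rel, search_key):
    """Mimic the lookup in apply_handle_do_update()/do_delete():
    prefer the replica identity index, then the primary key
    (RelationFindReplTupleByIndex); otherwise scan every row and
    compare the full tuple (RelationFindReplTupleSeq)."""
    index = rel.get("replident_index") or rel.get("pkey_index")
    if index is not None:
        return index.get(search_key)
    # REPLICA IDENTITY FULL case: the search key is the whole old row.
    for row in rel["heap"]:
        if row == search_key:
            return row
    return None  # caller logs "did not find row for update/delete"

rel = {"pkey_index": {(1,): (1, "foo")}, "heap": [(1, "foo")]}
assert find_replica_tuple(rel, (1,)) == (1, "foo")

rel_full = {"heap": [(2, "bar")]}          # no usable index at all
assert find_replica_tuple(rel_full, (2, "bar")) == (2, "bar")
assert find_replica_tuple(rel_full, (3, "x")) is None
```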

Attachment: v12-0004-Publish-partitioned-table-inserts-as-its-own.patch (application/octet-stream)
From fdd0f1ef6e805c52b3b6012c9b5ba054fc0b0ade Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v12 4/4] Publish partitioned table inserts as its own

To control whether partition changes are replicated using their own
identity (and schema) or an ancestor's, add a new per-publication
parameter named 'publish_using_root_schema'.
---
 doc/src/sgml/logical-replication.sgml       |  11 +-
 doc/src/sgml/ref/create_publication.sgml    |  17 ++
 src/backend/catalog/partition.c             |   9 +
 src/backend/catalog/pg_publication.c        |  63 +++++-
 src/backend/commands/publicationcmds.c      |  95 +++++----
 src/backend/commands/tablecmds.c            |   2 +-
 src/backend/executor/nodeModifyTable.c      |   4 +
 src/backend/replication/pgoutput/pgoutput.c | 211 ++++++++++++++++----
 src/backend/utils/cache/relcache.c          |   7 +-
 src/bin/pg_dump/pg_dump.c                   |  22 +-
 src/bin/pg_dump/pg_dump.h                   |   1 +
 src/bin/psql/describe.c                     |  17 +-
 src/include/catalog/partition.h             |   1 +
 src/include/catalog/pg_publication.h        |   7 +-
 src/test/regress/expected/publication.out   | 103 +++++-----
 src/test/regress/sql/publication.sql        |   3 +
 src/test/subscription/t/013_partition.pl    | 170 +++++++++++++++-
 17 files changed, 590 insertions(+), 153 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8bd7c9c8ac..a99e90b331 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,15 +402,8 @@
 
    <listitem>
     <para>
-     Replication is only supported by tables, partitioned or not, although a
-     given table must either be partitioned on both servers or not partitioned
-     at all.  Also, when replicating between partitioned tables, the actual
-     replication occurs between leaf partitions, so partitions on the two
-     servers must match one-to-one.
-    </para>
-
-    <para>
-     Attempts to replicate other types of relations such as views, materialized
+     Replication is only supported by tables, partitioned or not.
+     Attempts to replicate other types of relations, such as views, materialized
      views, or foreign tables, will result in an error.
     </para>
    </listitem>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 597cb28f33..0ca6cffaba 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -123,6 +123,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_using_root_schema</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table (or on its partitions) contained in the publication are
+          published using the partitioned table's own identity and schema
+          rather than that of the individual partitions that are actually
+          changed; the latter is the default.  Setting it to
+          <literal>true</literal> allows the changes to be replicated into a
+          non-partitioned table or a partitioned table consisting of a
+          different set of partitions.  However, <literal>TRUNCATE</literal>
+          operations performed directly on partitions are not replicated.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 239ac017fa..07853b85d5 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -28,6 +28,7 @@
 #include "partitioning/partbounds.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/partcache.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
@@ -126,6 +127,14 @@ get_partition_ancestors(Oid relid)
 	return result;
 }
 
+/* Is given relation a leaf partition? */
+bool
+is_leaf_partition(Oid relid)
+{
+	return	get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE &&
+			get_rel_relispartition(relid);
+}
+
 /*
  * get_partition_ancestors_worker
  *		recursive worker for get_partition_ancestors
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 500a5ae1ee..0c534a29c0 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -220,13 +220,30 @@ publication_add_relation(Oid pubid, Relation targetrel,
 /*
  * Gets list of publication oids for a relation, plus those of ancestors,
  * if any, if the relation is a partition.
+ *
+ * *published_rels, if asked for, will contain the OID of the relation for
+ * each publication returned, that is, of the relation that is actually
+ * published.  Examining this list allows the caller, for instance, to
+ * distinguish publications that the relation is directly part of from
+ * those that it is part of only indirectly, via an ancestor.
  */
 List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Oid relid, List **published_rels)
 {
 	List	   *result = NIL;
+	int			i,
+				num;
+
+	if (published_rels)
+		*published_rels = NIL;
 
 	result = get_rel_publications(relid);
+	if (published_rels)
+	{
+		num = list_length(result);
+		for (i = 0; i < num; i++)
+			*published_rels = lappend_oid(*published_rels, relid);
+	}
 	if (get_rel_relispartition(relid))
 	{
 		List	   *ancestors = get_partition_ancestors(relid);
@@ -238,6 +255,12 @@ GetRelationPublications(Oid relid)
 			List	   *ancestor_pubs = get_rel_publications(ancestor);
 
 			result = list_concat(result, ancestor_pubs);
+			if (published_rels)
+			{
+				num = list_length(ancestor_pubs);
+				for (i = 0; i < num; i++)
+					*published_rels = lappend_oid(*published_rels, ancestor);
+			}
 		}
 	}
 
@@ -373,9 +396,13 @@ GetAllTablesPublications(void)
 
 /*
  * Gets list of all relation published by FOR ALL TABLES publication(s).
+ *
+ * If the publication publishes partition changes via their respective root
+ * partitioned tables, we must exclude partitions in favor of including the
+ * root partitioned tables.
  */
 List *
-GetAllTablesPublicationRelations(void)
+GetAllTablesPublicationRelations(bool pubasroot)
 {
 	Relation	classRel;
 	ScanKeyData key[1];
@@ -397,12 +424,35 @@ GetAllTablesPublicationRelations(void)
 		Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
 		Oid			relid = relForm->oid;
 
-		if (is_publishable_class(relid, relForm))
+		if (is_publishable_class(relid, relForm) &&
+			!(relForm->relispartition && pubasroot))
 			result = lappend_oid(result, relid);
 	}
 
 	table_endscan(scan);
-	table_close(classRel, AccessShareLock);
+
+	if (pubasroot)
+	{
+		ScanKeyInit(&key[0],
+					Anum_pg_class_relkind,
+					BTEqualStrategyNumber, F_CHAREQ,
+					CharGetDatum(RELKIND_PARTITIONED_TABLE));
+
+		scan = table_beginscan_catalog(classRel, 1, key);
+
+		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+		{
+			Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
+			Oid			relid = relForm->oid;
+
+			if (is_publishable_class(relid, relForm) &&
+				!relForm->relispartition)
+				result = lappend_oid(result, relid);
+		}
+
+		table_endscan(scan);
+	}
+	table_close(classRel, AccessShareLock);
 
 	return result;
 }
@@ -433,6 +483,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->pubasroot = pubform->pubasroot;
 
 	ReleaseSysCache(tup);
 
@@ -533,9 +584,11 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		 * need those.
 		 */
 		if (publication->alltables)
-			tables = GetAllTablesPublicationRelations();
+			tables = GetAllTablesPublicationRelations(publication->pubasroot);
 		else
 			tables = GetPublicationRelations(publication->oid,
+											 publication->pubasroot ?
+											 PUBLICATION_PART_ROOT :
 											 PUBLICATION_PART_LEAF);
 		funcctx->user_fctx = (void *) tables;
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 494c0bdc28..9e102a4b78 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -23,6 +23,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -56,20 +57,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_using_root_schema_given,
+						  bool *publish_using_root_schema)
 {
 	ListCell   *lc;
 
+	*publish_using_root_schema_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* By default, a relation's changes are published using its own schema. */
+	*publish_using_root_schema = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -91,10 +95,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -110,19 +114,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_using_root_schema") == 0)
+		{
+			if (*publish_using_root_schema_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_using_root_schema_given = true;
+			*publish_using_root_schema = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -143,10 +156,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -183,9 +195,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -193,13 +205,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubasroot - 1] =
+		BoolGetDatum(publish_using_root_schema);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -251,17 +265,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -270,19 +283,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_using_root_schema_given)
+	{
+		values[Anum_pg_publication_pubasroot - 1] = BoolGetDatum(publish_using_root_schema);
+		replaces[Anum_pg_publication_pubasroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 8c33b67c1b..8d40d2ec4c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14692,7 +14692,7 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 	 * UNLOGGED as UNLOGGED tables can't be published.
 	 */
 	if (!toLogged &&
-		list_length(GetRelationPublications(RelationGetRelid(rel))) > 0)
+		list_length(GetRelationPublications(RelationGetRelid(rel), NULL)) > 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("cannot change table \"%s\" to unlogged because it is part of a publication",
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d71c0a4322..f71fd98be2 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2320,8 +2320,12 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		/* Only necessary to check replication identity. */
+		CheckValidResultRel(mtstate->rootResultRelInfo, operation);
+	}
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 552a70cffa..f48a8fbb58 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,8 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -20,6 +22,7 @@
 #include "replication/pgoutput.h"
 #include "utils/int8.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -49,6 +52,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +63,33 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If ancestor relid is set, its schema must also
+	 * have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * True when publication that is matched by get_rel_sync_entry for this
+	 * relation is configured as such.
+	 */
+	bool		pubasroot;
+
+	/*
+	 * OID of the ancestor whose schema will be used when replicating changes
+	 * to a partition; InvalidOid if pubasroot is false.
+	 */
+	Oid			replicate_as_relid;
+
+	/*
+	 * Map, if any, used when replicating using an ancestor's schema to
+	 * convert the tuples from partition's type to the ancestor's; NULL if
+	 * pubasroot is false.
+	 */
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +287,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
 
-		desc = RelationGetDescr(relation);
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+/*
+ * Send the schema (relation and attribute type info) of a relation.
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->attisdropped || att->attgenerated)
+			continue;
+
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +399,68 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						oldtuple = oldtuple ? execute_attr_map_tuple(oldtuple, relentry->map) : NULL;
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -413,9 +506,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 
 		/*
 		 * Don't send partitioned tables, because partitions should be sent
-		 * instead.
+		 * instead, unless the user asked to publish the root tables.
 		 */
-		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+			!relentry->pubasroot)
 			continue;
 
 		relids[nrelids++] = relid;
@@ -540,7 +634,8 @@ init_rel_sync_cache(MemoryContext cachectx)
  * This looks up publications that the given relation is directly or
  * indirectly part of (the latter if it's really the relation's ancestor that
  * is part of a publication) and fills up the found entry with the information
- * about which operations to publish.
+ * about which operations to publish and whether to use an ancestor's schema
+ * when publishing.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
@@ -562,8 +657,10 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	/* Not found means schema wasn't sent */
 	if (!found || !entry->replicate_valid)
 	{
-		List	   *pubids = GetRelationPublications(relid);
+		List	   *published_rels = NIL;
+		List	   *pubids = GetRelationPublications(relid, &published_rels);
 		ListCell   *lc;
+		Oid			ancestor = InvalidOid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -588,13 +685,42 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
+			bool		publish = false;
 
-			if (pub->alltables || list_member_oid(pubids, pub->oid))
+			if (pub->alltables)
+			{
+				publish = true;
+				if (pub->pubasroot && get_rel_relispartition(relid))
+					ancestor = llast_oid(get_partition_ancestors(relid));
+			}
+
+			if (!publish)
+			{
+				ListCell *lc1,
+						 *lc2;
+
+				forboth(lc1, pubids, lc2, published_rels)
+				{
+					Oid		pubid = lfirst_oid(lc1);
+					Oid		pub_relid = lfirst_oid(lc2);
+					if (pubid == pub->oid)
+					{
+						publish = true;
+						if (pub->pubasroot && pub_relid != relid)
+							ancestor = pub_relid;
+						break;
+					}
+				}
+			}
+
+			if (publish)
 			{
 				entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
 				entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
 				entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
-				entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				if (!OidIsValid(ancestor))
+					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				entry->pubasroot = pub->pubasroot;
 			}
 
 			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
@@ -604,6 +730,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 
 		list_free(pubids);
 
+		entry->replicate_as_relid = ancestor;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index ff70326474..a5b595ad32 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -43,6 +43,7 @@
 #include "catalog/catalog.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
@@ -5141,7 +5142,7 @@ GetRelationPublicationActions(Relation relation)
 					  sizeof(PublicationActions));
 
 	/* Fetch the publication membership info. */
-	puboids = GetRelationPublications(RelationGetRelid(relation));
+	puboids = GetRelationPublications(RelationGetRelid(relation), NULL);
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
@@ -5160,7 +5161,9 @@ GetRelationPublicationActions(Relation relation)
 		pubactions->pubinsert |= pubform->pubinsert;
 		pubactions->pubupdate |= pubform->pubupdate;
 		pubactions->pubdelete |= pubform->pubdelete;
-		pubactions->pubtruncate |= pubform->pubtruncate;
+		if (!pubform->pubasroot ||
+			!is_leaf_partition(RelationGetRelid(relation)))
+			pubactions->pubtruncate |= pubform->pubtruncate;
 
 		ReleaseSysCache(tup);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index ced0681ec3..3bf3702709 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3792,6 +3792,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubasroot;
 	int			i,
 				ntups;
 
@@ -3803,11 +3804,18 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubasroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false as pubasroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
@@ -3831,6 +3839,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubasroot = PQfnumber(res, "pubasroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3853,6 +3862,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubasroot =
+			(strcmp(PQgetvalue(res, i, i_pubasroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -3929,7 +3940,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubasroot)
+		appendPQExpBufferStr(query, ", publish_using_root_schema = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e0c6444ef6..44e964fa24 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -601,6 +601,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubasroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 109245fea7..cbd69942f4 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5707,7 +5707,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5738,6 +5738,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubasroot AS \"%s\"",
+						  gettext_noop("Publishes Using Root Schema"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5779,6 +5783,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubasroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5791,6 +5796,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubasroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5801,6 +5807,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubasroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubasroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5850,6 +5859,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubasroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5862,6 +5873,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubasroot)
+			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5870,6 +5883,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubasroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 27873aff6e..c6c19119ca 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -21,6 +21,7 @@
 
 extern Oid	get_partition_parent(Oid relid);
 extern List *get_partition_ancestors(Oid relid);
+extern bool is_leaf_partition(Oid relid);
 extern Oid	index_get_partition(Relation partition, Oid indexId);
 extern List *map_partition_varattnos(List *expr, int fromrel_varno,
 									 Relation to_rel, Relation from_rel);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index bb52e8c5e0..a85a6c8991 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubasroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,12 +76,13 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		pubasroot;
 	PublicationActions pubactions;
 } Publication;
 
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
-extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationPublications(Oid relid, List **published_rels);
 
 /*---------
  * Expected values for pub_partopt parameter of GetRelationPublications(),
@@ -99,7 +102,7 @@ typedef enum PublicationPartOpt
 
 extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
 extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(void);
+extern List *GetAllTablesPublicationRelations(bool pubasroot);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2634d2c1e1..d2d269b11b 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                                       Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -129,10 +131,10 @@ ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
 -- only parent is listed as being in publication, not the partition
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_parted"
 
@@ -143,6 +145,15 @@ HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | t
+Tables:
+    "public.testpub_parted"
+
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 -- fail - view
@@ -159,10 +170,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                          Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -200,10 +211,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -247,10 +258,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -260,20 +271,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                                    List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                                       List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 219e04129d..9742aef802 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
 
 \dRp
 
@@ -87,6 +88,8 @@ UPDATE testpub_parted1 SET a = 1;
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 7c08a2e6ca..b0a038b54b 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 15;
+use Test::More tests => 34;
 
 # setup
 
@@ -25,7 +25,11 @@ my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub1");
 $node_publisher->safe_psql('postgres',
-	"CREATE PUBLICATION pub_all FOR ALL TABLES");
+	"CREATE PUBLICATION pub_all FOR ALL TABLES WITH (publish_using_root_schema = true)");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub2");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub3 WITH (publish_using_root_schema = true)");
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_publisher->safe_psql('postgres',
@@ -34,8 +38,24 @@ $node_publisher->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES IN (1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_2 PARTITION OF tab2 FOR VALUES IN (5, 6)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3_1 PARTITION OF tab3 FOR VALUES IN (1, 2, 3, 5, 6)");
 $node_publisher->safe_psql('postgres',
 	"ALTER PUBLICATION pub1 ADD TABLE tab1, tab1_1");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub2 ADD TABLE tab1_1, tab1_2");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub3 ADD TABLE tab2, tab3_1");
 
 # subscriber1
 $node_subscriber1->safe_psql('postgres',
@@ -51,18 +71,42 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_2_1 PARTITION OF tab1_2 FOR VALUES IN (5)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub1_tab2', b text) PARTITION BY RANGE (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (c text DEFAULT 'sub1_tab2', b text, a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (c text DEFAULT 'sub1_tab3_1', b text, a int NOT NULL PRIMARY KEY)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES FROM (1) TO (10)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub4 CONNECTION '$publisher_connstr' PUBLICATION pub3");
 
 # subscriber 2
 $node_subscriber2->safe_psql('postgres',
-	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text)");
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub_all");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub2");
 
 # Wait for initial sync of all subscriptions
 my $synced_query =
@@ -79,14 +123,28 @@ $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_1 (a) VALUES (3)");
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (3), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab3 VALUES (1), (3), (5)");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 my $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|1|5), 'insert into tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|1|5), 'insert into tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|1|3), 'inserts into tab1_1 replicated');
@@ -95,32 +153,68 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|1|5), 'inserts into tab1 replicated');
+
 # update (no partition change)
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 2 WHERE a = 1");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 2 WHERE a = 1");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 2 WHERE a = 1");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|2|5), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|2|5), 'update of tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|2|3), 'update of tab1_1 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|2|5), 'update of tab1 replicated');
+
 # update (partition changes)
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 6 WHERE a = 2");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 6 WHERE a = 2");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 6 WHERE a = 2");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|3|6), 'update of tab1 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|3|6), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|3|6), 'update of tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
@@ -129,19 +223,41 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|3|6), 'update of tab1 replicated');
+
 # delete
 $node_publisher->safe_psql('postgres',
 	"DELETE FROM tab1 WHERE a IN (3, 5)");
 $node_publisher->safe_psql('postgres',
 	"DELETE FROM tab1_2");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab2 WHERE a IN (3, 5)");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab3 WHERE a IN (3, 5)");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(1|6|6), 'delete from tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(1|6|6), 'delete from tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_1");
 is($result, qq(0||), 'delete from tab1_1 replicated');
@@ -150,34 +266,80 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'delete from tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1 replicated');
+
 # truncate
 $node_subscriber1->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (2), (5)");
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab3_1 (a) VALUES (1), (2), (5)");
 $node_subscriber2->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (2)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_1 VALUES (1)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1_2");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab2_1");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(2|1|2), 'truncate of tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(4|1|6), 'truncate of tab2_1 NOT replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'truncate of tab1_2 replicated');
 
+$node_subscriber2->safe_psql('postgres',
+	"DROP SUBSCRIPTION sub3");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (2)");
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab2");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab3");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
-is($result, qq(0||), 'truncate of tab1_1 replicated');
+is($result, qq(0||), 'truncate of tab1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(1|1|1), 'tab1_1 unchanged');
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'truncate of tab2 replicated');
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(0||), 'truncate of tab3_1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(1|2|2), 'tab1_2 unchanged');
-- 
2.20.1 (Apple Git-117)
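For readers following along, the behavior proposed in this patch series can be exercised roughly as follows. This is an illustrative sketch based on the patch's regression tests, not part of the patch itself; the connection string and table names are made up, and it assumes a running publisher and subscriber.

```sql
-- On the publisher: with publish_using_root_schema = true, changes to
-- any partition of p are published using the root table's identity.
CREATE TABLE p (a int, b int) PARTITION BY HASH (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE p2 PARTITION OF p FOR VALUES WITH (MODULUS 2, REMAINDER 1);
CREATE PUBLICATION publish_p FOR TABLE p
    WITH (publish_using_root_schema = true);

-- On the subscriber: the target need not be partitioned at all,
-- because changes arrive as changes to "p" itself.
CREATE TABLE p (a int, b int);
CREATE SUBSCRIPTION sub_p
    CONNECTION 'host=publisher dbname=postgres'  -- illustrative
    PUBLICATION publish_p;
```

Without `publish_using_root_schema` (the default), changes are still published under the individual partitions' identities, so the subscriber's partitions would have to match the publisher's one-to-one.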

#44 Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#42)
Re: adding partitioned tables to publications

On 2020-03-18 04:06, Amit Langote wrote:

+   if (isnull || !remote_is_publishable)
+       ereport(ERROR,
+               (errmsg("table \"%s.%s\" on the publisher is not publishable",
+                       nspname, relname)));

Maybe add a one-line comment above this to say it's a "not supposed
to happen" error, or am I missing something? Wouldn't elog() suffice
for this?

On second thought, maybe we should just drop this check. The list of
tables that is part of the publication was already filtered by the
publisher, so this query doesn't need to check it again. We just need
the relkind to be able to construct the COPY command, but we don't need
to second-guess it beyond that.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#45 Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#44)
Re: adding partitioned tables to publications

On Wed, Mar 18, 2020 at 8:16 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-03-18 04:06, Amit Langote wrote:

+   if (isnull || !remote_is_publishable)
+       ereport(ERROR,
+               (errmsg("table \"%s.%s\" on the publisher is not publishable",
+                       nspname, relname)));

Maybe add a one-line comment above this to say it's a "not supposed
to happen" error, or am I missing something? Wouldn't elog() suffice
for this?

On second thought, maybe we should just drop this check. The list of
tables that is part of the publication was already filtered by the
publisher, so this query doesn't need to check it again. We just need
the relkind to be able to construct the COPY command, but we don't need
to second-guess it beyond that.

Agreed.

--
Thank you,
Amit

#46 Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#45)
Re: adding partitioned tables to publications

On 2020-03-18 15:19, Amit Langote wrote:

On Wed, Mar 18, 2020 at 8:16 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-03-18 04:06, Amit Langote wrote:

+   if (isnull || !remote_is_publishable)
+       ereport(ERROR,
+               (errmsg("table \"%s.%s\" on the publisher is not publishable",
+                       nspname, relname)));

Maybe add a one-line comment above this to say it's a "not supposed
to happen" error, or am I missing something? Wouldn't elog() suffice
for this?

On second thought, maybe we should just drop this check. The list of
tables that is part of the publication was already filtered by the
publisher, so this query doesn't need to check it again. We just need
the relkind to be able to construct the COPY command, but we don't need
to second-guess it beyond that.

Agreed.

Committed with that change then.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#47 Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#43)
Re: adding partitioned tables to publications

On 2020-03-18 08:33, Amit Langote wrote:

By the way, I have rebased the patches, although maybe you've got your
own copies; attached.

Looking through 0002 and 0003 now.

The structure looks generally good.

In 0002, the naming of apply_handle_insert() vs.
apply_handle_do_insert() etc. seems a bit prone to confusion. How about
something like apply_handle_insert_internal()? Also, should we put each
of those internal functions next to their corresponding public
function instead of in a separate group like you have it?

In apply_handle_do_insert(), the argument localslot should probably be
remoteslot.

In apply_handle_do_delete(), the ExecOpenIndices() call was moved to a
different location relative to the rest of the code. That was probably
not intended.

In 0003, you have /* TODO, use inverse lookup hashtable? */. Is this
something you plan to address in this cycle, or is that more for future
generations?

0003 could use some more tests. The one test that you adjusted just
ensures the data goes somewhere instead of being rejected, but there are
no tests that check whether it ends up in the right partition, whether
cross-partition updates work etc.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#48 Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#47)
3 attachment(s)
Re: adding partitioned tables to publications

On Thu, Mar 19, 2020 at 11:18 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-03-18 08:33, Amit Langote wrote:

By the way, I have rebased the patches, although maybe you've got your
own copies; attached.

Looking through 0002 and 0003 now.

The structure looks generally good.

Thanks for the review.

In 0002, the naming of apply_handle_insert() vs.
apply_handle_do_insert() etc. seems a bit prone to confusion. How about
something like apply_handle_insert_internal()? Also, should we put each
of those internal functions next to their corresponding public
function instead of in a separate group like you have it?

Sure.

In apply_handle_do_insert(), the argument localslot should probably be
remoteslot.

You're right, fixed.

In apply_handle_do_delete(), the ExecOpenIndices() call was moved to a
different location relative to the rest of the code. That was probably
not intended.

Fixed.

In 0003, you have /* TODO, use inverse lookup hashtable? */. Is this
something you plan to address in this cycle, or is that more for future
generations?

Sorry, this is simply a copy-paste from logicalrep_relmap_invalidate_cb().

0003 could use some more tests. The one test that you adjusted just
ensures the data goes somewhere instead of being rejected, but there are
no tests that check whether it ends up in the right partition, whether
cross-partition updates work etc.

Okay, added some tests.
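
The kind of scenario such tests need to cover might look like the following on the publisher. This is an editorial sketch, not text from the thread; it assumes tab1 is list-partitioned on a (partitions for 1-3 and 5-6, as in 013_partition.pl) and is in a publication with publish_using_root_schema = true.

```sql
-- The row starts in tab1_1 (values 1-3); updating the key moves it
-- to tab1_2 (values 5-6).  On the wire this is a DELETE from tab1_1
-- plus an INSERT into tab1_2, but published under tab1's schema, so
-- the subscriber must route the new row into whichever of its own
-- partitions accepts a = 6 -- possibly a different partition layout.
INSERT INTO tab1 VALUES (2, 'x');
UPDATE tab1 SET a = 6 WHERE a = 2;
```

A test would then verify on the subscriber that the row landed in the correct local partition, not merely that it arrived somewhere.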

Attached updated patches.

--
Thank you,
Amit

Attachments:

v13-0003-Publish-partitioned-table-inserts-as-its-own.patch (application/octet-stream)
From 9f8cc8104a1440c9abdd2450acf63eef452c8004 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v13 3/3] Publish partitioned table inserts as its own

To control whether partition changes are replicated using their
own identity (and schema) or an ancestor's, add a new per-publication
parameter named 'publish_using_root_schema'.
---
 doc/src/sgml/logical-replication.sgml       |  11 +-
 doc/src/sgml/ref/create_publication.sgml    |  17 ++
 src/backend/catalog/partition.c             |   9 +
 src/backend/catalog/pg_publication.c        |  63 +++++-
 src/backend/commands/publicationcmds.c      |  95 +++++----
 src/backend/commands/tablecmds.c            |   2 +-
 src/backend/executor/nodeModifyTable.c      |   4 +
 src/backend/replication/pgoutput/pgoutput.c | 211 ++++++++++++++++----
 src/backend/utils/cache/relcache.c          |   7 +-
 src/bin/pg_dump/pg_dump.c                   |  22 +-
 src/bin/pg_dump/pg_dump.h                   |   1 +
 src/bin/psql/describe.c                     |  17 +-
 src/include/catalog/partition.h             |   1 +
 src/include/catalog/pg_publication.h        |   7 +-
 src/test/regress/expected/publication.out   | 103 +++++-----
 src/test/regress/sql/publication.sql        |   3 +
 src/test/subscription/t/013_partition.pl    | 170 +++++++++++++++-
 17 files changed, 590 insertions(+), 153 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8bd7c9c8ac..a99e90b331 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,15 +402,8 @@
 
    <listitem>
     <para>
-     Replication is only supported by tables, partitioned or not, although a
-     given table must either be partitioned on both servers or not partitioned
-     at all.  Also, when replicating between partitioned tables, the actual
-     replication occurs between leaf partitions, so partitions on the two
-     servers must match one-to-one.
-    </para>
-
-    <para>
-     Attempts to replicate other types of relations such as views, materialized
+     Replication is only supported by tables, partitioned or not.
+     Attempts to replicate other types of relations such as views, materialized
      views, or foreign tables, will result in an error.
     </para>
    </listitem>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 597cb28f33..0ca6cffaba 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -123,6 +123,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_using_root_schema</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table (or on its partitions) contained in the publication will be
+          published using the schema of the partitioned table itself rather
+          than that of the individual partitions that are actually changed;
+          the latter is the default.  Setting it to <literal>true</literal>
+          allows the changes to be replicated into a non-partitioned table
+          or into a partitioned table consisting of a different set of
+          partitions.  However, <literal>TRUNCATE</literal> operations
+          performed directly on partitions are not replicated.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
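A minimal sketch of the option documented above (assuming the partitioned table p from the opening message exists on the publisher):

```sql
-- Changes to p1, p2, p3 are published using p's schema, so the
-- subscriber's table need not be partitioned the same way (or at all).
CREATE PUBLICATION publish_p_root FOR TABLE p
    WITH (publish_using_root_schema = true);

-- The setting can also be changed later:
ALTER PUBLICATION publish_p_root SET (publish_using_root_schema = false);
```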
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 239ac017fa..07853b85d5 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -28,6 +28,7 @@
 #include "partitioning/partbounds.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/partcache.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
@@ -126,6 +127,14 @@ get_partition_ancestors(Oid relid)
 	return result;
 }
 
+/* Is given relation a leaf partition? */
+bool
+is_leaf_partition(Oid relid)
+{
+	return	get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE &&
+			get_rel_relispartition(relid);
+}
+
 /*
  * get_partition_ancestors_worker
  *		recursive worker for get_partition_ancestors
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 500a5ae1ee..0c534a29c0 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -220,13 +220,30 @@ publication_add_relation(Oid pubid, Relation targetrel,
 /*
  * Gets list of publication oids for a relation, plus those of ancestors,
  * if any, if the relation is a partition.
+ *
+ * *published_rels, if asked for, will contain, for each publication
+ * returned, the OID of the relation that is actually published in it.
+ * Examining this list allows the caller, for instance, to distinguish
+ * publications that the relation is directly part of from those that it
+ * is indirectly part of via an ancestor.
  */
 List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Oid relid, List **published_rels)
 {
 	List	   *result = NIL;
+	int			i,
+				num;
+
+	if (published_rels)
+		*published_rels = NIL;
 
 	result = get_rel_publications(relid);
+	if (published_rels)
+	{
+		num = list_length(result);
+		for (i = 0; i < num; i++)
+			*published_rels = lappend_oid(*published_rels, relid);
+	}
 	if (get_rel_relispartition(relid))
 	{
 		List	   *ancestors = get_partition_ancestors(relid);
@@ -238,6 +255,12 @@ GetRelationPublications(Oid relid)
 			List	   *ancestor_pubs = get_rel_publications(ancestor);
 
 			result = list_concat(result, ancestor_pubs);
+			if (published_rels)
+			{
+				num = list_length(ancestor_pubs);
+				for (i = 0; i < num; i++)
+					*published_rels = lappend_oid(*published_rels, ancestor);
+			}
 		}
 	}
 
@@ -373,9 +396,13 @@ GetAllTablesPublications(void)
 
 /*
  * Gets list of all relation published by FOR ALL TABLES publication(s).
+ *
+ * If the publication publishes partition changes via their respective root
+ * partitioned tables, we must exclude partitions in favor of including the
+ * root partitioned tables.
  */
 List *
-GetAllTablesPublicationRelations(void)
+GetAllTablesPublicationRelations(bool pubasroot)
 {
 	Relation	classRel;
 	ScanKeyData key[1];
@@ -397,12 +424,35 @@ GetAllTablesPublicationRelations(void)
 		Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
 		Oid			relid = relForm->oid;
 
-		if (is_publishable_class(relid, relForm))
+		if (is_publishable_class(relid, relForm) &&
+			!(relForm->relispartition && pubasroot))
 			result = lappend_oid(result, relid);
 	}
 
 	table_endscan(scan);
-	table_close(classRel, AccessShareLock);
+
+	if (pubasroot)
+	{
+		ScanKeyInit(&key[0],
+					Anum_pg_class_relkind,
+					BTEqualStrategyNumber, F_CHAREQ,
+					CharGetDatum(RELKIND_PARTITIONED_TABLE));
+
+		scan = table_beginscan_catalog(classRel, 1, key);
+
+		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+		{
+			Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
+			Oid			relid = relForm->oid;
+
+			if (is_publishable_class(relid, relForm) &&
+				!relForm->relispartition)
+				result = lappend_oid(result, relid);
+		}
+
+		table_endscan(scan);
+	}
+	table_close(classRel, AccessShareLock);
 
 	return result;
 }
@@ -433,6 +483,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->pubasroot = pubform->pubasroot;
 
 	ReleaseSysCache(tup);
 
@@ -533,9 +584,11 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		 * need those.
 		 */
 		if (publication->alltables)
-			tables = GetAllTablesPublicationRelations();
+			tables = GetAllTablesPublicationRelations(publication->pubasroot);
 		else
 			tables = GetPublicationRelations(publication->oid,
+											 publication->pubasroot ?
+											 PUBLICATION_PART_ROOT :
 											 PUBLICATION_PART_LEAF);
 		funcctx->user_fctx = (void *) tables;
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 494c0bdc28..9e102a4b78 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -23,6 +23,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -56,20 +57,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_using_root_schema_given,
+						  bool *publish_using_root_schema)
 {
 	ListCell   *lc;
 
+	*publish_using_root_schema_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* By default, changes are published using the relation's own schema. */
+	*publish_using_root_schema = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -91,10 +95,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -110,19 +114,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_using_root_schema") == 0)
+		{
+			if (*publish_using_root_schema_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_using_root_schema_given = true;
+			*publish_using_root_schema = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -143,10 +156,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -183,9 +195,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -193,13 +205,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubasroot - 1] =
+		BoolGetDatum(publish_using_root_schema);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -251,17 +265,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -270,19 +283,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_using_root_schema_given)
+	{
+		values[Anum_pg_publication_pubasroot - 1] = BoolGetDatum(publish_using_root_schema);
+		replaces[Anum_pg_publication_pubasroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 729025470d..c9e4214c73 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14692,7 +14692,7 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 	 * UNLOGGED as UNLOGGED tables can't be published.
 	 */
 	if (!toLogged &&
-		list_length(GetRelationPublications(RelationGetRelid(rel))) > 0)
+		list_length(GetRelationPublications(RelationGetRelid(rel), NULL)) > 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("cannot change table \"%s\" to unlogged because it is part of a publication",
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d71c0a4322..f71fd98be2 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2320,8 +2320,12 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		/* Needed here only to check the root table's replica identity. */
+		CheckValidResultRel(mtstate->rootResultRelInfo, operation);
+	}
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 552a70cffa..f48a8fbb58 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,8 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -20,6 +22,7 @@
 #include "replication/pgoutput.h"
 #include "utils/int8.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -49,6 +52,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +63,33 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If replicate_as_relid is set, the ancestor's
+	 * schema must also have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * True if the publication matched by get_rel_sync_entry for this
+	 * relation has publish_using_root_schema (pubasroot) set.
+	 */
+	bool		pubasroot;
+
+	/*
+	 * OID of the ancestor whose schema will be used when replicating changes
+	 * to a partition; InvalidOid if pubasroot is false.
+	 */
+	Oid			replicate_as_relid;
+
+	/*
+	 * Map, if any, used when replicating using an ancestor's schema to
+	 * convert tuples from the partition's row type to the ancestor's; NULL
+	 * if no conversion is needed.
+	 */
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +287,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
 
-		desc = RelationGetDescr(relation);
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+/*
+ * Send the given relation's schema, preceded by type info for its attributes.
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->attisdropped || att->attgenerated)
+			continue;
+
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +399,68 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						oldtuple = oldtuple ? execute_attr_map_tuple(oldtuple, relentry->map) : NULL;
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -413,9 +506,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 
 		/*
 		 * Don't send partitioned tables, because partitions should be sent
-		 * instead.
+		 * instead, unless the user asked to publish using the root schema.
 		 */
-		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+			!relentry->pubasroot)
 			continue;
 
 		relids[nrelids++] = relid;
@@ -540,7 +634,8 @@ init_rel_sync_cache(MemoryContext cachectx)
  * This looks up publications that the given relation is directly or
  * indirectly part of (the latter if it's really the relation's ancestor that
  * is part of a publication) and fills up the found entry with the information
- * about which operations to publish.
+ * about which operations to publish and whether to use an ancestor's schema
+ * when publishing.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
@@ -562,8 +657,10 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	/* Not found means schema wasn't sent */
 	if (!found || !entry->replicate_valid)
 	{
-		List	   *pubids = GetRelationPublications(relid);
+		List	   *published_rels = NIL;
+		List	   *pubids = GetRelationPublications(relid, &published_rels);
 		ListCell   *lc;
+		Oid			ancestor = InvalidOid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -588,13 +685,42 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
+			bool		publish = false;
 
-			if (pub->alltables || list_member_oid(pubids, pub->oid))
+			if (pub->alltables)
+			{
+				publish = true;
+				if (pub->pubasroot && get_rel_relispartition(relid))
+					ancestor = llast_oid(get_partition_ancestors(relid));
+			}
+
+			if (!publish)
+			{
+				ListCell *lc1,
+						 *lc2;
+
+				forboth(lc1, pubids, lc2, published_rels)
+				{
+					Oid		pubid = lfirst_oid(lc1);
+					Oid		pub_relid = lfirst_oid(lc2);
+					if (pubid == pub->oid)
+					{
+						publish = true;
+						if (pub->pubasroot && pub_relid != relid)
+							ancestor = pub_relid;
+						break;
+					}
+				}
+			}
+
+			if (publish)
 			{
 				entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
 				entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
 				entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
-				entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				if (!OidIsValid(ancestor))
+					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				entry->pubasroot = pub->pubasroot;
 			}
 
 			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
@@ -604,6 +730,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 
 		list_free(pubids);
 
+		entry->replicate_as_relid = ancestor;
 		entry->replicate_valid = true;
 	}
 
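The pgoutput changes above mean that, when the new option is enabled, the subscriber's target table only needs to match the root table's schema, not its partitioning. A hypothetical end-to-end sketch (the publication name, subscription name, and connection string here are placeholders, not from the patch):

```sql
-- Publisher side (hypothetical): publication with the new option
CREATE PUBLICATION pub_root FOR TABLE p
    WITH (publish_using_root_schema = true);

-- Subscriber side: the target can be a plain, non-partitioned table
-- with the same columns as p
CREATE TABLE p (a int, b int);
CREATE SUBSCRIPTION sub_p
    CONNECTION 'dbname=src host=...'   -- placeholder connection string
    PUBLICATION pub_root;
```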
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 76f41dbe36..8792217a26 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -43,6 +43,7 @@
 #include "catalog/catalog.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
@@ -5138,7 +5139,7 @@ GetRelationPublicationActions(Relation relation)
 					  sizeof(PublicationActions));
 
 	/* Fetch the publication membership info. */
-	puboids = GetRelationPublications(RelationGetRelid(relation));
+	puboids = GetRelationPublications(RelationGetRelid(relation), NULL);
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
@@ -5157,7 +5158,9 @@ GetRelationPublicationActions(Relation relation)
 		pubactions->pubinsert |= pubform->pubinsert;
 		pubactions->pubupdate |= pubform->pubupdate;
 		pubactions->pubdelete |= pubform->pubdelete;
-		pubactions->pubtruncate |= pubform->pubtruncate;
+		if (!pubform->pubasroot ||
+			!is_leaf_partition(RelationGetRelid(relation)))
+			pubactions->pubtruncate |= pubform->pubtruncate;
 
 		ReleaseSysCache(tup);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 959b36a95c..d703f17dc6 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3792,6 +3792,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubasroot;
 	int			i,
 				ntups;
 
@@ -3803,11 +3804,18 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubasroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false as pubasroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
@@ -3831,6 +3839,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubasroot = PQfnumber(res, "pubasroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3853,6 +3862,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubasroot =
+			(strcmp(PQgetvalue(res, i, i_pubasroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -3929,7 +3940,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubasroot)
+		appendPQExpBufferStr(query, ", publish_using_root_schema = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e0c6444ef6..44e964fa24 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -601,6 +601,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubasroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 109245fea7..cbd69942f4 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5707,7 +5707,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5738,6 +5738,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubasroot AS \"%s\"",
+						  gettext_noop("Publishes Using Root Schema"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5779,6 +5783,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubasroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5791,6 +5796,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubasroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5801,6 +5807,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubasroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubasroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5850,6 +5859,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubasroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5862,6 +5873,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubasroot)
+			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5870,6 +5883,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubasroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 27873aff6e..c6c19119ca 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -21,6 +21,7 @@
 
 extern Oid	get_partition_parent(Oid relid);
 extern List *get_partition_ancestors(Oid relid);
+extern bool is_leaf_partition(Oid relid);
 extern Oid	index_get_partition(Relation partition, Oid indexId);
 extern List *map_partition_varattnos(List *expr, int fromrel_varno,
 									 Relation to_rel, Relation from_rel);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index bb52e8c5e0..a85a6c8991 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubasroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,12 +76,13 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		pubasroot;
 	PublicationActions pubactions;
 } Publication;
 
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
-extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationPublications(Oid relid, List **published_rels);
 
 /*---------
  * Expected values for pub_partopt parameter of GetRelationPublications(),
@@ -99,7 +102,7 @@ typedef enum PublicationPartOpt
 
 extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
 extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(void);
+extern List *GetAllTablesPublicationRelations(bool pubasroot);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2634d2c1e1..d2d269b11b 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                                       Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -129,10 +131,10 @@ ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
 -- only parent is listed as being in publication, not the partition
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_parted"
 
@@ -143,6 +145,15 @@ HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | t
+Tables:
+    "public.testpub_parted"
+
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 -- fail - view
@@ -159,10 +170,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                          Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -200,10 +211,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -247,10 +258,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -260,20 +271,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                                    List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                                       List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 219e04129d..9742aef802 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
 
 \dRp
 
@@ -87,6 +88,8 @@ UPDATE testpub_parted1 SET a = 1;
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 8ac55102ec..2d033047a0 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 17;
+use Test::More tests => 36;
 
 # setup
 
@@ -25,7 +25,11 @@ my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub1");
 $node_publisher->safe_psql('postgres',
-	"CREATE PUBLICATION pub_all FOR ALL TABLES");
+	"CREATE PUBLICATION pub_all FOR ALL TABLES WITH (publish_using_root_schema = true)");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub2");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub3 WITH (publish_using_root_schema = true)");
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_publisher->safe_psql('postgres',
@@ -34,8 +38,24 @@ $node_publisher->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES IN (1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_2 PARTITION OF tab2 FOR VALUES IN (5, 6)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3_1 PARTITION OF tab3 FOR VALUES IN (1, 2, 3, 5, 6)");
 $node_publisher->safe_psql('postgres',
 	"ALTER PUBLICATION pub1 ADD TABLE tab1, tab1_1");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub2 ADD TABLE tab1_1, tab1_2");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub3 ADD TABLE tab2, tab3_1");
 
 # subscriber1
 $node_subscriber1->safe_psql('postgres',
@@ -51,18 +71,42 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_2_1 PARTITION OF tab1_2 FOR VALUES IN (5)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub1_tab2', b text) PARTITION BY RANGE (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (c text DEFAULT 'sub1_tab2', b text, a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (c text DEFAULT 'sub1_tab3_1', b text, a int NOT NULL PRIMARY KEY)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES FROM (1) TO (10)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub4 CONNECTION '$publisher_connstr' PUBLICATION pub3");
 
 # subscriber 2
 $node_subscriber2->safe_psql('postgres',
-	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text)");
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub_all");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub2");
 
 # Wait for initial sync of all subscriptions
 my $synced_query =
@@ -79,9 +123,15 @@ $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_1 (a) VALUES (3)");
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (3), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab3 VALUES (1), (3), (5)");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 my $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
@@ -91,6 +141,14 @@ $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT tableoid::regclass FROM tab1 WHERE a = 5");
 is($result, qq(tab1_2_1), 'inserts into tab1_2 replicated into correct partition');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|1|5), 'insert into tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|1|5), 'insert into tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|1|3), 'inserts into tab1_1 replicated');
@@ -99,27 +157,55 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|1|5), 'inserts into tab1 replicated');
+
 # update (no partition change)
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 2 WHERE a = 1");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 2 WHERE a = 1");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 2 WHERE a = 1");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|2|5), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|2|5), 'update of tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|2|3), 'update of tab1_1 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|2|5), 'update of tab1 replicated');
+
 # update (partition changes)
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 6 WHERE a = 2");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 6 WHERE a = 2");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 6 WHERE a = 2");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
@@ -129,6 +215,14 @@ $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT tableoid::regclass FROM tab1 WHERE a = 6");
 is($result, qq(tab1_2_2), 'update of tab1_2 correctly replicated as cross-partition update');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|3|6), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|3|6), 'update of tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
@@ -137,19 +231,41 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|3|6), 'update of tab1 replicated');
+
 # delete
 $node_publisher->safe_psql('postgres',
 	"DELETE FROM tab1 WHERE a IN (3, 5)");
 $node_publisher->safe_psql('postgres',
 	"DELETE FROM tab1_2");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab2 WHERE a IN (3, 5)");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab3 WHERE a IN (3, 5)");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(1|6|6), 'delete from tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(1|6|6), 'delete from tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_1");
 is($result, qq(0||), 'delete from tab1_1 replicated');
@@ -158,34 +274,80 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'delete from tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1 replicated');
+
 # truncate
 $node_subscriber1->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (2), (5)");
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab3_1 (a) VALUES (1), (2), (5)");
 $node_subscriber2->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (2)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_1 VALUES (1)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1_2");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab2_1");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(2|1|2), 'truncate of tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(4|1|6), 'truncate of tab2_1 NOT replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'truncate of tab1_2 replicated');
 
+$node_subscriber2->safe_psql('postgres',
+	"DROP SUBSCRIPTION sub3");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (2)");
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab2");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab3");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
-is($result, qq(0||), 'truncate of tab1_1 replicated');
+is($result, qq(0||), 'truncate of tab1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(1|1|1), 'tab1_1 unchanged');
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'truncate of tab2 replicated');
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(0||), 'truncate of tab3_1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(1|2|2), 'tab1_2 unchanged');
-- 
2.20.1 (Apple Git-117)
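
For readers following the thread, the behavior added by the patch above can be exercised roughly as follows. This is a sketch assembled from the regression and TAP tests in the patch, not a definitive recipe: the option name publish_using_root_schema and the table names are as they appear in this patch version, and the connection string is a placeholder.

```sql
-- Publisher: a partitioned table published using the root table's
-- schema, so changes to any leaf partition are replicated as changes
-- to "p" itself.
CREATE TABLE p (a int, b int) PARTITION BY HASH (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE p2 PARTITION OF p FOR VALUES WITH (MODULUS 2, REMAINDER 1);
CREATE PUBLICATION publish_p FOR TABLE p
    WITH (publish_using_root_schema = true);

-- Subscriber: the target no longer has to match the publisher's
-- partitions one-to-one; it can be partitioned differently, or be a
-- plain table.
CREATE TABLE p (a int, b int);
CREATE SUBSCRIPTION sub_p
    CONNECTION 'host=publisher dbname=postgres'  -- placeholder
    PUBLICATION publish_p;
```

With publish_using_root_schema = false (the default), changes are still replicated using each partition's own identity, so the subscriber's partitions must match the publisher's as before.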

v13-0001-Some-refactoring-of-logical-worker.c.patchapplication/octet-stream; name=v13-0001-Some-refactoring-of-logical-worker.c.patchDownload
From 6f383407db2afcf6271e9552768708e040d946ca Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 5 Dec 2019 09:17:06 +0900
Subject: [PATCH v13 1/3] Some refactoring of logical/worker.c

This moves the main operations of apply_handle_{insert|update|delete},
namely inserting, updating, or deleting a tuple in a given relation,
into corresponding apply_handle_{insert|update|delete}_internal
functions, so that those operations can also be performed on relations
that are not direct targets of replication.

An example of that is replicating changes into a partitioned table,
some of which must be applied to its partitions.
---
 src/backend/replication/logical/worker.c | 175 +++++++++++++++--------
 1 file changed, 116 insertions(+), 59 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ad4a732fd2..1ab0afc65e 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -113,6 +113,16 @@ static void store_flush_position(XLogRecPtr remote_lsn);
 
 static void maybe_reread_subscription(void);
 
+static void apply_handle_insert_internal(ResultRelInfo *relinfo,
+							 EState *estate, TupleTableSlot *remoteslot);
+static void apply_handle_update_internal(ResultRelInfo *relinfo,
+							 EState *estate, TupleTableSlot *remoteslot,
+							 LogicalRepTupleData *newtup,
+							 LogicalRepRelMapEntry *relmapentry);
+static void apply_handle_delete_internal(ResultRelInfo *relinfo, EState *estate,
+							 TupleTableSlot *remoteslot,
+							 LogicalRepRelation *remoterel);
+
 /*
  * Should this worker apply changes for given relation.
  *
@@ -582,6 +592,7 @@ GetRelationIdentityOrPK(Relation rel)
 /*
  * Handle INSERT message.
  */
+
 static void
 apply_handle_insert(StringInfo s)
 {
@@ -621,13 +632,10 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	ExecOpenIndices(estate->es_result_relation_info, false);
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_insert_internal(estate->es_result_relation_info, estate,
+								 remoteslot);
 
-	/* Do the insert. */
-	ExecSimpleRelationInsert(estate, remoteslot);
-
-	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
 	PopActiveSnapshot();
 
 	/* Handle queued AFTER triggers. */
@@ -641,6 +649,20 @@ apply_handle_insert(StringInfo s)
 	CommandCounterIncrement();
 }
 
+/* Workhorse for apply_handle_insert() */
+static void
+apply_handle_insert_internal(ResultRelInfo *relinfo,
+							 EState *estate, TupleTableSlot *remoteslot)
+{
+	ExecOpenIndices(relinfo, false);
+
+	/* Do the insert. */
+	ExecSimpleRelationInsert(estate, remoteslot);
+
+	/* Cleanup. */
+	ExecCloseIndices(relinfo);
+}
+
 /*
  * Check if the logical replication relation is updatable and throw
  * appropriate error if it isn't.
@@ -684,16 +706,12 @@ apply_handle_update(StringInfo s)
 {
 	LogicalRepRelMapEntry *rel;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	LogicalRepTupleData oldtup;
 	LogicalRepTupleData newtup;
 	bool		has_oldtup;
-	TupleTableSlot *localslot;
 	TupleTableSlot *remoteslot;
 	RangeTblEntry *target_rte;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -719,9 +737,6 @@ apply_handle_update(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
 	/*
 	 * Populate updatedCols so that per-column triggers can fire.  This could
@@ -741,7 +756,6 @@ apply_handle_update(StringInfo s)
 	fill_extraUpdatedCols(target_rte, RelationGetDescr(rel->localrel));
 
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
 	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
@@ -749,20 +763,57 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_update_internal(estate->es_result_relation_info, estate,
+								 remoteslot, &newtup, rel);
+
+	PopActiveSnapshot();
+
+	/* Handle queued AFTER triggers. */
+	AfterTriggerEndQuery(estate);
+
+	ExecResetTupleTable(estate->es_tupleTable, false);
+	FreeExecutorState(estate);
+
+	logicalrep_rel_close(rel, NoLock);
+
+	CommandCounterIncrement();
+}
+
+/* Workhorse for apply_handle_update() */
+static void
+apply_handle_update_internal(ResultRelInfo *relinfo,
+							 EState *estate, TupleTableSlot *remoteslot,
+							 LogicalRepTupleData *newtup,
+							 LogicalRepRelMapEntry *relmapentry)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	LogicalRepRelation *remoterel = &relmapentry->remoterel;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+	MemoryContext oldctx;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	ExecOpenIndices(relinfo, false);
+
 	/*
 	 * Try to find tuple using either replica identity index, primary key or
 	 * if needed, sequential scan.
 	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
+	idxoid = GetRelationIdentityOrPK(rel);
 	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL && has_oldtup));
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
 
 	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
+		found = RelationFindReplTupleByIndex(rel, idxoid,
 											 LockTupleExclusive,
 											 remoteslot, localslot);
 	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
 										 remoteslot, localslot);
 
 	ExecClearTuple(remoteslot);
@@ -776,8 +827,8 @@ apply_handle_update(StringInfo s)
 	{
 		/* Process and store remote tuple in the slot */
 		oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
-		slot_modify_cstrings(remoteslot, localslot, rel,
-							 newtup.values, newtup.changed);
+		slot_modify_cstrings(remoteslot, localslot, relmapentry,
+							 newtup->values, newtup->changed);
 		MemoryContextSwitchTo(oldctx);
 
 		EvalPlanQualSetSlot(&epqstate, remoteslot);
@@ -795,23 +846,12 @@ apply_handle_update(StringInfo s)
 		elog(DEBUG1,
 			 "logical replication did not find row for update "
 			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
+			 RelationGetRelationName(rel));
 	}
 
 	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
-	PopActiveSnapshot();
-
-	/* Handle queued AFTER triggers. */
-	AfterTriggerEndQuery(estate);
-
+	ExecCloseIndices(relinfo);
 	EvalPlanQualEnd(&epqstate);
-	ExecResetTupleTable(estate->es_tupleTable, false);
-	FreeExecutorState(estate);
-
-	logicalrep_rel_close(rel, NoLock);
-
-	CommandCounterIncrement();
 }
 
 /*
@@ -825,12 +865,8 @@ apply_handle_delete(StringInfo s)
 	LogicalRepRelMapEntry *rel;
 	LogicalRepTupleData oldtup;
 	LogicalRepRelId relid;
-	Oid			idxoid;
 	EState	   *estate;
-	EPQState	epqstate;
 	TupleTableSlot *remoteslot;
-	TupleTableSlot *localslot;
-	bool		found;
 	MemoryContext oldctx;
 
 	ensure_transaction();
@@ -855,33 +891,65 @@ apply_handle_delete(StringInfo s)
 	remoteslot = ExecInitExtraTupleSlot(estate,
 										RelationGetDescr(rel->localrel),
 										&TTSOpsVirtual);
-	localslot = table_slot_create(rel->localrel,
-								  &estate->es_tupleTable);
-	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
 
+	/* Input functions may need an active snapshot, so get one */
 	PushActiveSnapshot(GetTransactionSnapshot());
-	ExecOpenIndices(estate->es_result_relation_info, false);
 
-	/* Find the tuple using the replica identity index. */
+	/* Build the search tuple. */
 	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
 	MemoryContextSwitchTo(oldctx);
 
+	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
+	apply_handle_delete_internal(estate->es_result_relation_info, estate,
+								 remoteslot, &rel->remoterel);
+
+	PopActiveSnapshot();
+
+	/* Handle queued AFTER triggers. */
+	AfterTriggerEndQuery(estate);
+
+	ExecResetTupleTable(estate->es_tupleTable, false);
+	FreeExecutorState(estate);
+
+	logicalrep_rel_close(rel, NoLock);
+
+	CommandCounterIncrement();
+}
+
+/* Workhorse for apply_handle_delete() */
+static void
+apply_handle_delete_internal(ResultRelInfo *relinfo, EState *estate,
+							 TupleTableSlot *remoteslot,
+							 LogicalRepRelation *remoterel)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	Oid			idxoid;
+	EPQState	epqstate;
+	TupleTableSlot *localslot;
+	bool		found;
+
+	localslot = table_slot_create(rel, &estate->es_tupleTable);
+	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+
+	ExecOpenIndices(relinfo, false);
+
 	/*
 	 * Try to find tuple using either replica identity index, primary key or
 	 * if needed, sequential scan.
 	 */
-	idxoid = GetRelationIdentityOrPK(rel->localrel);
+	idxoid = GetRelationIdentityOrPK(rel);
 	Assert(OidIsValid(idxoid) ||
-		   (rel->remoterel.replident == REPLICA_IDENTITY_FULL));
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
 
 	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(rel->localrel, idxoid,
+		found = RelationFindReplTupleByIndex(rel, idxoid,
 											 LockTupleExclusive,
 											 remoteslot, localslot);
 	else
-		found = RelationFindReplTupleSeq(rel->localrel, LockTupleExclusive,
+		found = RelationFindReplTupleSeq(rel, LockTupleExclusive,
 										 remoteslot, localslot);
+
 	/* If found delete it. */
 	if (found)
 	{
@@ -896,23 +964,12 @@ apply_handle_delete(StringInfo s)
 		elog(DEBUG1,
 			 "logical replication could not find row for delete "
 			 "in replication target relation \"%s\"",
-			 RelationGetRelationName(rel->localrel));
+			 RelationGetRelationName(rel));
 	}
 
 	/* Cleanup. */
-	ExecCloseIndices(estate->es_result_relation_info);
-	PopActiveSnapshot();
-
-	/* Handle queued AFTER triggers. */
-	AfterTriggerEndQuery(estate);
-
+	ExecCloseIndices(relinfo);
 	EvalPlanQualEnd(&epqstate);
-	ExecResetTupleTable(estate->es_tupleTable, false);
-	FreeExecutorState(estate);
-
-	logicalrep_rel_close(rel, NoLock);
-
-	CommandCounterIncrement();
 }
 
 /*
-- 
2.20.1 (Apple Git-117)

v13-0002-Add-subscription-support-to-replicate-into-parti.patch (application/octet-stream)
From ba903f7c2b3ba1faf4ed884c6cc90c972aa0c9db Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlangote09@gmail.com>
Date: Thu, 23 Jan 2020 11:49:01 +0900
Subject: [PATCH v13 2/3] Add subscription support to replicate into
 partitioned tables

Mainly, this adds support code in logical/worker.c for applying
replicated operations whose target is a partitioned table to its
relevant partitions.
---
 src/backend/executor/execReplication.c      |  14 +-
 src/backend/replication/logical/relation.c  | 161 +++++++++++++
 src/backend/replication/logical/tablesync.c |   1 -
 src/backend/replication/logical/worker.c    | 240 +++++++++++++++++++-
 src/include/replication/logicalrelation.h   |   2 +
 src/test/subscription/t/013_partition.pl    |  17 +-
 6 files changed, 412 insertions(+), 23 deletions(-)

diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 7194becfd9..dc8a01a5cd 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -594,17 +594,9 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * Give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -612,7 +604,7 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 3d7291b970..54189d7965 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -34,6 +34,7 @@ static MemoryContext LogicalRepRelMapContext = NULL;
 
 static HTAB *LogicalRepRelMap = NULL;
 static HTAB *LogicalRepTypMap = NULL;
+static HTAB *LogicalRepPartMap = NULL;
 
 
 /*
@@ -472,3 +473,163 @@ logicalrep_typmap_gettypname(Oid remoteid)
 	Assert(OidIsValid(entry->remoteid));
 	return psprintf("%s.%s", entry->nspname, entry->typname);
 }
+
+/*
+ * Partition cache: look up partition LogicalRepRelMapEntry's
+ *
+ * Unlike relation map cache, this is keyed by partition OID, not remote
+ * relation OID, because we only have to use this cache in the case where
+ * partitions are not directly mapped to any remote relation, such as when
+ * replication is occurring with one of their ancestors as target.
+ */
+
+/*
+ * Relcache invalidation callback
+ */
+static void
+logicalrep_partmap_invalidate_cb(Datum arg, Oid reloid)
+{
+	LogicalRepRelMapEntry *entry;
+
+	/* Just to be sure. */
+	if (LogicalRepPartMap == NULL)
+		return;
+
+	if (reloid != InvalidOid)
+	{
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		/* TODO, use inverse lookup hashtable? */
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+		{
+			if (entry->localreloid == reloid)
+			{
+				entry->localreloid = InvalidOid;
+				hash_seq_term(&status);
+				break;
+			}
+		}
+	}
+	else
+	{
+		/* invalidate all cache entries */
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+			entry->localreloid = InvalidOid;
+	}
+}
+
+/*
+ * Initialize the partition map cache.
+ */
+static void
+logicalrep_partmap_init(void)
+{
+	HASHCTL		ctl;
+
+	if (!LogicalRepRelMapContext)
+		LogicalRepRelMapContext =
+			AllocSetContextCreate(CacheMemoryContext,
+								  "LogicalRepPartMapContext",
+								  ALLOCSET_DEFAULT_SIZES);
+
+	/* Initialize the relation hash table. */
+	MemSet(&ctl, 0, sizeof(ctl));
+	ctl.keysize = sizeof(Oid);	/* partition OID */
+	ctl.entrysize = sizeof(LogicalRepRelMapEntry);
+	ctl.hcxt = LogicalRepRelMapContext;
+
+	LogicalRepPartMap = hash_create("logicalrep partition map cache", 64, &ctl,
+								   HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+	/* Watch for invalidation events. */
+	CacheRegisterRelcacheCallback(logicalrep_partmap_invalidate_cb,
+								  (Datum) 0);
+}
+
+/*
+ * logicalrep_partition_open
+ *
+ * Returned entry reuses most of the values of the root table's entry, save
+ * the attribute map, which can be different for the partition.
+ */
+LogicalRepRelMapEntry *
+logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map)
+{
+	LogicalRepRelMapEntry *entry;
+	LogicalRepRelation *remoterel = &root->remoterel;
+	Oid			partOid = RelationGetRelid(partrel);
+	AttrMap	   *attrmap = root->attrmap;
+	bool		found;
+	int			i;
+	MemoryContext oldctx;
+
+	if (LogicalRepPartMap == NULL)
+		logicalrep_partmap_init();
+
+	/* Search for existing entry. */
+	entry = hash_search(LogicalRepPartMap, (void *) &partOid,
+						HASH_ENTER, &found);
+
+	if (found)
+		return entry;
+
+	memset(entry, 0, sizeof(LogicalRepRelMapEntry));
+
+	/* Make cached copy of the data */
+	oldctx = MemoryContextSwitchTo(LogicalRepRelMapContext);
+
+	/* Remote relation is used as-is from the root's entry. */
+	entry->remoterel.remoteid = remoterel->remoteid;
+	entry->remoterel.nspname = pstrdup(remoterel->nspname);
+	entry->remoterel.relname = pstrdup(remoterel->relname);
+	entry->remoterel.natts = remoterel->natts;
+	entry->remoterel.attnames = palloc(remoterel->natts * sizeof(char *));
+	entry->remoterel.atttyps = palloc(remoterel->natts * sizeof(Oid));
+	for (i = 0; i < remoterel->natts; i++)
+	{
+		entry->remoterel.attnames[i] = pstrdup(remoterel->attnames[i]);
+		entry->remoterel.atttyps[i] = remoterel->atttyps[i];
+	}
+	entry->remoterel.replident = remoterel->replident;
+	entry->remoterel.attkeys = bms_copy(remoterel->attkeys);
+
+	entry->localrel = partrel;
+	entry->localreloid = partOid;
+
+	/*
+	 * If the partition's attributes don't match the root relation's, we'll
+	 * need to make a new attrmap which maps partition attribute numbers to
+	 * remoterel's, instead the original which maps root relation's attribute
+	 * numbers to remoterel's.
+	 */
+	if (map)
+	{
+		AttrNumber	attno;
+
+		entry->attrmap = make_attrmap(map->maplen);
+		memset(entry->attrmap->attnums, -1,
+			   entry->attrmap->maplen * sizeof(AttrNumber));
+		for (attno = 0; attno < entry->attrmap->maplen; attno++)
+		{
+			AttrNumber	root_attno = map->attnums[attno];
+
+			entry->attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
+		}
+	}
+	else
+		entry->attrmap = attrmap;
+
+	entry->updatable = root->updatable;
+
+	/* state and statelsn are left set to 0. */
+	MemoryContextSwitchTo(oldctx);
+
+	return entry;
+}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index a60c666153..c27d970589 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -762,7 +762,6 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
-	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 1ab0afc65e..acf6a3ae01 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -29,11 +29,14 @@
 #include "access/xlog_internal.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_subscription.h"
 #include "catalog/pg_subscription_rel.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "executor/nodeModifyTable.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
@@ -122,6 +125,12 @@ static void apply_handle_update_internal(ResultRelInfo *relinfo,
 static void apply_handle_delete_internal(ResultRelInfo *relinfo, EState *estate,
 							 TupleTableSlot *remoteslot,
 							 LogicalRepRelation *remoterel);
+static void apply_handle_tuple_routing(ResultRelInfo *relinfo,
+						   EState *estate,
+						   TupleTableSlot *remoteslot,
+						   LogicalRepTupleData *newtup,
+						   LogicalRepRelMapEntry *relmapentry,
+						   CmdType operation);
 
 /*
  * Should this worker apply changes for given relation.
@@ -632,9 +641,13 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_insert_internal(estate->es_result_relation_info, estate,
-								 remoteslot);
+	/* For a partitioned table, insert the tuple into a partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_INSERT);
+	else
+		apply_handle_insert_internal(estate->es_result_relation_info, estate,
+									 remoteslot);
 
 	PopActiveSnapshot();
 
@@ -763,9 +776,13 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_update_internal(estate->es_result_relation_info, estate,
-								 remoteslot, &newtup, rel);
+	/* For a partitioned table, apply update to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, &newtup, rel, CMD_UPDATE);
+	else
+		apply_handle_update_internal(estate->es_result_relation_info, estate,
+									 remoteslot, &newtup, rel);
 
 	PopActiveSnapshot();
 
@@ -900,9 +917,13 @@ apply_handle_delete(StringInfo s)
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_delete_internal(estate->es_result_relation_info, estate,
-								 remoteslot, &rel->remoterel);
+	/* For a partitioned table, apply delete to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_DELETE);
+	else
+		apply_handle_delete_internal(estate->es_result_relation_info, estate,
+									 remoteslot, &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -972,6 +993,154 @@ apply_handle_delete_internal(ResultRelInfo *relinfo, EState *estate,
 	EvalPlanQualEnd(&epqstate);
 }
 
+/*
+ * This handles insert, update, delete on a partitioned table.
+ */
+static void
+apply_handle_tuple_routing(ResultRelInfo *relinfo,
+						   EState *estate,
+						   TupleTableSlot *remoteslot,
+						   LogicalRepTupleData *newtup,
+						   LogicalRepRelMapEntry *relmapentry,
+						   CmdType operation)
+{
+	Relation	rel = relinfo->ri_RelationDesc;
+	ModifyTableState *mtstate = NULL;
+	PartitionTupleRouting *proute = NULL;
+	ResultRelInfo *partrelinfo;
+	TupleTableSlot *localslot;
+	PartitionRoutingInfo *partinfo;
+	TupleConversionMap *map;
+	MemoryContext oldctx;
+
+	/* ModifyTableState is needed for ExecFindPartition(). */
+	mtstate = makeNode(ModifyTableState);
+	mtstate->ps.plan = NULL;
+	mtstate->ps.state = estate;
+	mtstate->operation = operation;
+	mtstate->resultRelInfo = relinfo;
+	proute = ExecSetupPartitionTupleRouting(estate, mtstate, rel);
+
+	/*
+	 * Find a partition for the tuple contained in remoteslot.
+	 *
+	 * For insert, remoteslot is tuple to insert.  For update and delete, it
+	 * is the tuple to be replaced and deleted, respectively.
+	 */
+	Assert(remoteslot != NULL);
+	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+	/* The following throws an error if a suitable partition is not found. */
+	partrelinfo = ExecFindPartition(mtstate, relinfo, proute,
+									remoteslot, estate);
+	Assert(partrelinfo != NULL);
+	/* Convert the tuple to match the partition's rowtype. */
+	partinfo = partrelinfo->ri_PartitionInfo;
+	map = partinfo->pi_RootToPartitionMap;
+	if (map != NULL)
+	{
+		TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+		remoteslot = execute_attr_map_slot(map->attrMap, remoteslot,
+										   part_slot);
+	}
+	MemoryContextSwitchTo(oldctx);
+
+	switch (operation)
+	{
+		case CMD_INSERT:
+			/* Just insert into the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_insert_internal(partrelinfo, estate, remoteslot);
+			break;
+
+		case CMD_DELETE:
+			/* Just delete from the partition. */
+			estate->es_result_relation_info = partrelinfo;
+			apply_handle_delete_internal(partrelinfo, estate, remoteslot,
+										 &relmapentry->remoterel);
+			break;
+
+		case CMD_UPDATE:
+			{
+				ResultRelInfo *partrelinfo_new;
+
+				/*
+				 * partrelinfo computed above is the partition which might
+				 * contain the search tuple.  Now find the partition for the
+				 * replacement tuple, which might not be the same as
+				 * partrelinfo.
+				 */
+				localslot = table_slot_create(rel, &estate->es_tupleTable);
+				oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+				slot_modify_cstrings(localslot, remoteslot, relmapentry,
+									 newtup->values, newtup->changed);
+				partrelinfo_new = ExecFindPartition(mtstate, relinfo, proute,
+													localslot, estate);
+
+				MemoryContextSwitchTo(oldctx);
+
+				/*
+				 * If both search and replacement tuple would be in the same
+				 * partition, we can apply this as an UPDATE on the parttion.
+				 */
+				if (partrelinfo == partrelinfo_new)
+				{
+					Relation	partrel = partrelinfo->ri_RelationDesc;
+					AttrMap	   *attrmap = map ? map->attrMap : NULL;
+					LogicalRepRelMapEntry *part_entry;
+
+					part_entry = logicalrep_partition_open(relmapentry,
+														   partrel, attrmap);
+
+					/* UPDATE partition. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_update_internal(partrelinfo, estate,
+												 remoteslot, newtup,
+												 part_entry);
+				}
+				else
+				{
+					/*
+					 * Different, so handle this as DELETE followed by INSERT.
+					 */
+
+					/* DELETE from partition partrelinfo. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_delete_internal(partrelinfo, estate,
+												 remoteslot,
+												 &relmapentry->remoterel);
+
+					/*
+					 * Convert the replacement tuple to match the destination
+					 * partition rowtype.
+					 */
+					oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+					partinfo = partrelinfo_new->ri_PartitionInfo;
+					map = partinfo->pi_RootToPartitionMap;
+					if (map != NULL)
+					{
+						TupleTableSlot *part_slot = partinfo->pi_PartitionTupleSlot;
+
+						localslot = execute_attr_map_slot(map->attrMap, localslot,
+														  part_slot);
+					}
+					MemoryContextSwitchTo(oldctx);
+					/* INSERT into partition partrelinfo_new. */
+					estate->es_result_relation_info = partrelinfo_new;
+					apply_handle_insert_internal(partrelinfo_new, estate,
+												 localslot);
+				}
+			}
+			break;
+
+		default:
+			elog(ERROR, "unrecognized CmdType: %d", (int) operation);
+			break;
+	}
+
+	ExecCleanupTupleRouting(mtstate, proute);
+}
+
 /*
  * Handle TRUNCATE message.
  *
@@ -985,6 +1154,7 @@ apply_handle_truncate(StringInfo s)
 	List	   *remote_relids = NIL;
 	List	   *remote_rels = NIL;
 	List	   *rels = NIL;
+	List	   *part_rels = NIL;
 	List	   *relids = NIL;
 	List	   *relids_logged = NIL;
 	ListCell   *lc;
@@ -1014,6 +1184,52 @@ apply_handle_truncate(StringInfo s)
 		relids = lappend_oid(relids, rel->localreloid);
 		if (RelationIsLogicallyLogged(rel->localrel))
 			relids_logged = lappend_oid(relids_logged, rel->localreloid);
+
+		/*
+		 * Truncate partitions if we got a message to truncate a partitioned
+		 * table.
+		 */
+		if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		{
+			ListCell   *child;
+			List	   *children = find_all_inheritors(rel->localreloid,
+													   RowExclusiveLock,
+													   NULL);
+
+			foreach(child, children)
+			{
+				Oid			childrelid = lfirst_oid(child);
+				Relation	childrel;
+
+				if (list_member_oid(relids, childrelid))
+					continue;
+
+				/* find_all_inheritors already got lock */
+				childrel = table_open(childrelid, NoLock);
+
+				/*
+				 * It is possible that the parent table has children that are
+				 * temp tables of other backends.  We cannot safely access
+				 * such tables (because of buffering issues), and the best
+				 * thing to do is to silently ignore them.  Note that this
+				 * check is the same as one of the checks done in
+				 * truncate_check_activity() called below, still it is kept
+				 * here for simplicity.
+				 */
+				if (RELATION_IS_OTHER_TEMP(childrel))
+				{
+					table_close(childrel, RowExclusiveLock);
+					continue;
+				}
+
+				rels = lappend(rels, childrel);
+				part_rels = lappend(part_rels, childrel);
+				relids = lappend_oid(relids, childrelid);
+				/* Log this relation only if needed for logical decoding */
+				if (RelationIsLogicallyLogged(childrel))
+					relids_logged = lappend_oid(relids_logged, childrelid);
+			}
+		}
 	}
 
 	/*
@@ -1029,6 +1245,12 @@ apply_handle_truncate(StringInfo s)
 
 		logicalrep_rel_close(rel, NoLock);
 	}
+	foreach(lc, part_rels)
+	{
+		Relation rel = lfirst(lc);
+
+		table_close(rel, NoLock);
+	}
 
 	CommandCounterIncrement();
 }
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 9971a8028c..4650b4f9e1 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -34,6 +34,8 @@ extern void logicalrep_relmap_update(LogicalRepRelation *remoterel);
 
 extern LogicalRepRelMapEntry *logicalrep_rel_open(LogicalRepRelId remoteid,
 												  LOCKMODE lockmode);
+extern LogicalRepRelMapEntry *logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map);
 extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
 								 LOCKMODE lockmode);
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index ea5812ce18..8ac55102ec 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 15;
+use Test::More tests => 17;
 
 # setup
 
@@ -42,10 +42,15 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
+
 $node_subscriber1->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
 $node_subscriber1->safe_psql('postgres',
-	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6) PARTITION BY LIST (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_1 PARTITION OF tab1_2 FOR VALUES IN (5)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
 
@@ -82,6 +87,10 @@ my $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT tableoid::regclass FROM tab1 WHERE a = 5");
+is($result, qq(tab1_2_1), 'inserts into tab1_2 replicated into correct partition');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|1|3), 'inserts into tab1_1 replicated');
@@ -116,6 +125,10 @@ $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|3|6), 'update of tab1 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT tableoid::regclass FROM tab1 WHERE a = 6");
+is($result, qq(tab1_2_2), 'update of tab1_2 correctly replicated as cross-partition update');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
-- 
2.20.1 (Apple Git-117)

#49 Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#48)
Re: adding partitioned tables to publications

On 2020-03-23 06:02, Amit Langote wrote:

Okay, added some tests.

Attached updated patches.

I have committed the worker.c refactoring patch.

"Add subscription support to replicate into partitioned tables" still
has insufficient test coverage. Your changes in relation.c are not exercised
at all because the partitioned table branch in apply_handle_update() is
never taken. This is critical and tricky code, so I would look for
significant testing.

The code looks okay to me. I would remove this code

+       memset(entry->attrmap->attnums, -1,
+              entry->attrmap->maplen * sizeof(AttrNumber));

because the entries are explicitly filled right after anyway, and
filling the bytes with -1 has an unclear effect. There is also
seemingly some fishiness in this code around whether attribute numbers
are zero- or one-based. Perhaps this could be documented briefly.
Maybe I'm misunderstanding something.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#50 Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#49)
Re: adding partitioned tables to publications

On Wed, Mar 25, 2020 at 9:29 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-03-23 06:02, Amit Langote wrote:

Okay, added some tests.

Attached updated patches.

I have committed the worker.c refactoring patch.

"Add subscription support to replicate into partitioned tables" still
has insufficient test coverage. Your changes in relation.c are not exercised
at all because the partitioned table branch in apply_handle_update() is
never taken. This is critical and tricky code, so I would look for
significant testing.

While trying some tests around the code you mentioned, I found what
looks like a bug, which I am looking into now.

The code looks okay to me. I would remove this code

+       memset(entry->attrmap->attnums, -1,
+              entry->attrmap->maplen * sizeof(AttrNumber));

because the entries are explicitly filled right after anyway, and
filling the bytes with -1 has an unclear effect. There is also
seemingly some fishiness in this code around whether attribute numbers
are zero- or one-based. Perhaps this could be documented briefly.
Maybe I'm misunderstanding something.

Will check and fix as necessary.

--
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

#51 Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#50)
3 attachment(s)
Re: adding partitioned tables to publications

On Thu, Mar 26, 2020 at 11:23 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Wed, Mar 25, 2020 at 9:29 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-03-23 06:02, Amit Langote wrote:

Okay, added some tests.

Attached updated patches.

I have committed the worker.c refactoring patch.

"Add subscription support to replicate into partitioned tables" still
has insufficient test coverage. Your changes in relation.c are not exercised
at all because the partitioned table branch in apply_handle_update() is
never taken. This is critical and tricky code, so I would look for
significant testing.

While trying some tests around the code you mentioned, I found what
looks like a bug, which I am looking into now.

Turns out the code in apply_handle_tuple_routing() for the UPDATE
message was somewhat bogus, which is fixed in the updated version.  I
ended up with another refactoring patch, attached here as 0001.

It appears to me that the tests now adequately cover
apply_handle_tuple_routing(), although more could still be added.
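
For anyone reproducing this locally, the scenario those tests need to hit
looks roughly like the following sketch (the table, publication, and
subscription names here are invented, not taken from the actual
regression tests):

```sql
-- Publisher: a plain table, published as usual.
CREATE TABLE tab (a int PRIMARY KEY, b text);
CREATE PUBLICATION pub FOR TABLE tab;

-- Subscriber: same table name, but partitioned, so the apply worker
-- has to route each incoming tuple to the right partition.
CREATE TABLE tab (a int PRIMARY KEY, b text) PARTITION BY RANGE (a);
CREATE TABLE tab_1 PARTITION OF tab FOR VALUES FROM (0) TO (10);
CREATE TABLE tab_2 PARTITION OF tab FOR VALUES FROM (10) TO (20);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=src' PUBLICATION pub;

-- Publisher: an UPDATE that changes the partition key; applying it on
-- the subscriber must remove the row from tab_1 and route the new
-- version into tab_2, exercising apply_handle_tuple_routing().
INSERT INTO tab VALUES (5, 'x');
UPDATE tab SET a = 15 WHERE a = 5;
```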

The code looks okay to me. I would remove this code

+       memset(entry->attrmap->attnums, -1,
+              entry->attrmap->maplen * sizeof(AttrNumber));

because the entries are explicitly filled right after anyway, and
filling the bytes with -1 has an unclear effect. There is also
seemingly some fishiness in this code around whether attribute numbers
are zero- or one-based. Perhaps this could be documented briefly.
Maybe I'm misunderstanding something.

Will check and fix as necessary.

Removed that memset.  I have also added a comment about the one- vs.
zero-based attribute indexes contained in the maps coming from the two
different modules, viz. tuple routing and logical replication,
respectively.

--
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v14-0003-Publish-partitioned-table-inserts-as-its-own.patchapplication/octet-stream; name=v14-0003-Publish-partitioned-table-inserts-as-its-own.patchDownload
From ae1a7f5736540078d60e51a7c9485bde81f323a2 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v14 3/3] Publish partitioned table inserts as its own

To control whether partition changes are replicated using their
own identity (and schema) or an ancestor's, add a new parameter
that can be set per publication named 'publish_using_root_schema'.
---
 doc/src/sgml/logical-replication.sgml       |  11 +-
 doc/src/sgml/ref/create_publication.sgml    |  17 +++
 src/backend/catalog/partition.c             |   9 ++
 src/backend/catalog/pg_publication.c        |  63 ++++++++-
 src/backend/commands/publicationcmds.c      |  95 ++++++++-----
 src/backend/commands/tablecmds.c            |   2 +-
 src/backend/executor/nodeModifyTable.c      |   4 +
 src/backend/replication/pgoutput/pgoutput.c | 211 ++++++++++++++++++++++------
 src/backend/utils/cache/relcache.c          |   7 +-
 src/bin/pg_dump/pg_dump.c                   |  22 ++-
 src/bin/pg_dump/pg_dump.h                   |   1 +
 src/bin/psql/describe.c                     |  17 ++-
 src/include/catalog/partition.h             |   1 +
 src/include/catalog/pg_publication.h        |   7 +-
 src/test/regress/expected/publication.out   | 103 ++++++++------
 src/test/regress/sql/publication.sql        |   3 +
 src/test/subscription/t/013_partition.pl    | 162 ++++++++++++++++++++-
 17 files changed, 582 insertions(+), 153 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8bd7c9c..a99e90b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,15 +402,8 @@
 
    <listitem>
     <para>
-     Replication is only supported by tables, partitioned or not, although a
-     given table must either be partitioned on both servers or not partitioned
-     at all.  Also, when replicating between partitioned tables, the actual
-     replication occurs between leaf partitions, so partitions on the two
-     servers must match one-to-one.
-    </para>
-
-    <para>
-     Attempts to replicate other types of relations such as views, materialized
+     Replication is only supported by tables, partitioned or not.
+     Attempts to replicate other types of relations such as views, materialized
      views, or foreign tables, will result in an error.
     </para>
    </listitem>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 597cb28..0ca6cff 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -123,6 +123,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_using_root_schema</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table (or on its partitions) contained in the publication will be
+          published using its own schema rather than of the individual
+          partitions which are actually changed; the latter is the default.
+          Setting it to <literal>true</literal> allows the changes to be
+          replicated into a non-partitioned table or a partitioned table
+          consisting of a different set of partitions.  However,
+          <literal>TRUNCATE</literal> operations performed directly on
+          partitions are not replicated.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 239ac01..07853b8 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -28,6 +28,7 @@
 #include "partitioning/partbounds.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/partcache.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
@@ -126,6 +127,14 @@ get_partition_ancestors(Oid relid)
 	return result;
 }
 
+/* Is given relation a leaf partition? */
+bool
+is_leaf_partition(Oid relid)
+{
+	return	get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE &&
+			get_rel_relispartition(relid);
+}
+
 /*
  * get_partition_ancestors_worker
  *		recursive worker for get_partition_ancestors
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 500a5ae..0c534a2 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -220,13 +220,30 @@ publication_add_relation(Oid pubid, Relation targetrel,
 /*
  * Gets list of publication oids for a relation, plus those of ancestors,
  * if any, if the relation is a partition.
+ *
+ * *published_rels, if asked for, will contain the OID of the relation for
+ * each publication returned, that is, of the relation that is actually
+ * published.  Examining this list allows the caller, for instance, to
+ * distinguish publications that it is directly part of from those that it is
+ * indirectly part of via an ancestor.
  */
 List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Oid relid, List **published_rels)
 {
 	List	   *result = NIL;
+	int			i,
+				num;
+
+	if (published_rels)
+		*published_rels = NIL;
 
 	result = get_rel_publications(relid);
+	if (published_rels)
+	{
+		num = list_length(result);
+		for (i = 0; i < num; i++)
+			*published_rels = lappend_oid(*published_rels, relid);
+	}
 	if (get_rel_relispartition(relid))
 	{
 		List	   *ancestors = get_partition_ancestors(relid);
@@ -238,6 +255,12 @@ GetRelationPublications(Oid relid)
 			List	   *ancestor_pubs = get_rel_publications(ancestor);
 
 			result = list_concat(result, ancestor_pubs);
+			if (published_rels)
+			{
+				num = list_length(ancestor_pubs);
+				for (i = 0; i < num; i++)
+					*published_rels = lappend_oid(*published_rels, ancestor);
+			}
 		}
 	}
 
@@ -373,9 +396,13 @@ GetAllTablesPublications(void)
 
 /*
  * Gets list of all relation published by FOR ALL TABLES publication(s).
+ *
+ * If the publication publishes partition changes via their respective root
+ * partitioned tables, we must exclude partitions in favor of including the
+ * root partitioned tables.
  */
 List *
-GetAllTablesPublicationRelations(void)
+GetAllTablesPublicationRelations(bool pubasroot)
 {
 	Relation	classRel;
 	ScanKeyData key[1];
@@ -397,12 +424,35 @@ GetAllTablesPublicationRelations(void)
 		Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
 		Oid			relid = relForm->oid;
 
-		if (is_publishable_class(relid, relForm))
+		if (is_publishable_class(relid, relForm) &&
+			!(relForm->relispartition && pubasroot))
 			result = lappend_oid(result, relid);
 	}
 
 	table_endscan(scan);
-	table_close(classRel, AccessShareLock);
+
+	if (pubasroot)
+	{
+		ScanKeyInit(&key[0],
+					Anum_pg_class_relkind,
+					BTEqualStrategyNumber, F_CHAREQ,
+					CharGetDatum(RELKIND_PARTITIONED_TABLE));
+
+		scan = table_beginscan_catalog(classRel, 1, key);
+
+		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+		{
+			Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
+			Oid			relid = relForm->oid;
+
+			if (is_publishable_class(relid, relForm) &&
+				!relForm->relispartition)
+				result = lappend_oid(result, relid);
+		}
+
+		table_endscan(scan);
+		table_close(classRel, AccessShareLock);
+	}
 
 	return result;
 }
@@ -433,6 +483,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->pubasroot = pubform->pubasroot;
 
 	ReleaseSysCache(tup);
 
@@ -533,9 +584,11 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		 * need those.
 		 */
 		if (publication->alltables)
-			tables = GetAllTablesPublicationRelations();
+			tables = GetAllTablesPublicationRelations(publication->pubasroot);
 		else
 			tables = GetPublicationRelations(publication->oid,
+											 publication->pubasroot ?
+											 PUBLICATION_PART_ROOT :
 											 PUBLICATION_PART_LEAF);
 		funcctx->user_fctx = (void *) tables;
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 494c0bd..9e102a4 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -23,6 +23,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -56,20 +57,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_using_root_schema_given,
+						  bool *publish_using_root_schema)
 {
 	ListCell   *lc;
 
+	*publish_using_root_schema_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* Relation changes published as of itself by default. */
+	*publish_using_root_schema = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -91,10 +95,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -110,19 +114,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_using_root_schema") == 0)
+		{
+			if (*publish_using_root_schema_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_using_root_schema_given = true;
+			*publish_using_root_schema = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -143,10 +156,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -183,9 +195,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -193,13 +205,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubasroot - 1] =
+		BoolGetDatum(publish_using_root_schema);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -251,17 +265,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -270,19 +283,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_using_root_schema_given)
+	{
+		values[Anum_pg_publication_pubasroot - 1] = BoolGetDatum(publish_using_root_schema);
+		replaces[Anum_pg_publication_pubasroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 8e35c5b..f7c1e17 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14694,7 +14694,7 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 	 * UNLOGGED as UNLOGGED tables can't be published.
 	 */
 	if (!toLogged &&
-		list_length(GetRelationPublications(RelationGetRelid(rel))) > 0)
+		list_length(GetRelationPublications(RelationGetRelid(rel), NULL)) > 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("cannot change table \"%s\" to unlogged because it is part of a publication",
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d71c0a4..f71fd98 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2320,8 +2320,12 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		/* Only necessary to check replication identity. */
+		CheckValidResultRel(mtstate->rootResultRelInfo, operation);
+	}
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 552a70c..f48a8fb 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,8 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -20,6 +22,7 @@
 #include "replication/pgoutput.h"
 #include "utils/int8.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -49,6 +52,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +63,33 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If ancestor relid is set, its schema must also
+	 * have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * True when publication that is matched by get_rel_sync_entry for this
+	 * relation is configured as such.
+	 */
+	bool		pubasroot;
+
+	/*
+	 * OID of the ancestor whose schema will be used when replicating changes
+	 * to a partition; InvalidOid if pubasroot is false.
+	 */
+	Oid			replicate_as_relid;
+
+	/*
+	 * Map, if any, used when replicating using an ancestor's schema to
+	 * convert the tuples from partition's type to the ancestor's; NULL if
+	 * pubasroot is false.
+	 */
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +287,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
+
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		desc = RelationGetDescr(relation);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+/*
+ * Send the schema of a relation, including type info for its attributes
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+		if (att->attisdropped || att->attgenerated)
+			continue;
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +399,68 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -413,9 +506,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 
 		/*
 		 * Don't send partitioned tables, because partitions should be sent
-		 * instead.
+		 * instead, unless the user specified to send the former.
 		 */
-		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+			!relentry->pubasroot)
 			continue;
 
 		relids[nrelids++] = relid;
@@ -540,7 +634,8 @@ init_rel_sync_cache(MemoryContext cachectx)
  * This looks up publications that the given relation is directly or
  * indirectly part of (the latter if it's really the relation's ancestor that
  * is part of a publication) and fills up the found entry with the information
- * about which operations to publish.
+ * about which operations to publish and whether to use an ancestor's schema
+ * when publishing.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
@@ -562,8 +657,10 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	/* Not found means schema wasn't sent */
 	if (!found || !entry->replicate_valid)
 	{
-		List	   *pubids = GetRelationPublications(relid);
+		List	   *published_rels = NIL;
+		List	   *pubids = GetRelationPublications(relid, &published_rels);
 		ListCell   *lc;
+		Oid			ancestor = InvalidOid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -588,13 +685,42 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
+			bool		publish = false;
+
+			if (pub->alltables)
+			{
+				publish = true;
+				if (pub->pubasroot && get_rel_relispartition(relid))
+					ancestor = llast_oid(get_partition_ancestors(relid));
+			}
+
+			if (!publish)
+			{
+				ListCell *lc1,
+						 *lc2;
+
+				forboth(lc1, pubids, lc2, published_rels)
+				{
+					Oid		pubid = lfirst_oid(lc1);
+					Oid		pub_relid = lfirst_oid(lc2);
+					if (pubid == pub->oid)
+					{
+						publish = true;
+						if (pub->pubasroot && pub_relid != relid)
+							ancestor = pub_relid;
+						break;
+					}
+				}
+			}
 
-			if (pub->alltables || list_member_oid(pubids, pub->oid))
+			if (publish)
 			{
 				entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
 				entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
 				entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
-				entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				if (!OidIsValid(ancestor))
+					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				entry->pubasroot = pub->pubasroot;
 			}
 
 			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
@@ -604,6 +730,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 
 		list_free(pubids);
 
+		entry->replicate_as_relid = ancestor;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 76f41db..8792217 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -43,6 +43,7 @@
 #include "catalog/catalog.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
@@ -5138,7 +5139,7 @@ GetRelationPublicationActions(Relation relation)
 					  sizeof(PublicationActions));
 
 	/* Fetch the publication membership info. */
-	puboids = GetRelationPublications(RelationGetRelid(relation));
+	puboids = GetRelationPublications(RelationGetRelid(relation), NULL);
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
@@ -5157,7 +5158,9 @@ GetRelationPublicationActions(Relation relation)
 		pubactions->pubinsert |= pubform->pubinsert;
 		pubactions->pubupdate |= pubform->pubupdate;
 		pubactions->pubdelete |= pubform->pubdelete;
-		pubactions->pubtruncate |= pubform->pubtruncate;
+		if (!pubform->pubasroot ||
+			!is_leaf_partition(RelationGetRelid(relation)))
+			pubactions->pubtruncate |= pubform->pubtruncate;
 
 		ReleaseSysCache(tup);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 1849dfe..efe3ee4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3868,6 +3868,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubasroot;
 	int			i,
 				ntups;
 
@@ -3879,11 +3880,18 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubasroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false as pubasroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
@@ -3907,6 +3915,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubasroot = PQfnumber(res, "pubasroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3929,6 +3938,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubasroot =
+			(strcmp(PQgetvalue(res, i, i_pubasroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -4005,7 +4016,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubasroot)
+		appendPQExpBufferStr(query, ", publish_using_root_schema = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 3e11166..d12c28b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -602,6 +602,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubasroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 109245f..cbd6994 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5707,7 +5707,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5738,6 +5738,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubasroot AS \"%s\"",
+						  gettext_noop("Publishes Using Root Schema"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5779,6 +5783,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubasroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5791,6 +5796,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubasroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5801,6 +5807,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubasroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubasroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5850,6 +5859,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubasroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5862,6 +5873,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubasroot)
+			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5870,6 +5883,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubasroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 27873af..c6c1911 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -21,6 +21,7 @@
 
 extern Oid	get_partition_parent(Oid relid);
 extern List *get_partition_ancestors(Oid relid);
+extern bool is_leaf_partition(Oid relid);
 extern Oid	index_get_partition(Relation partition, Oid indexId);
 extern List *map_partition_varattnos(List *expr, int fromrel_varno,
 									 Relation to_rel, Relation from_rel);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index bb52e8c..a85a6c8 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubasroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,12 +76,13 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		pubasroot;
 	PublicationActions pubactions;
 } Publication;
 
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
-extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationPublications(Oid relid, List **published_rels);
 
 /*---------
  * Expected values for pub_partopt parameter of GetRelationPublications(),
@@ -99,7 +102,7 @@ typedef enum PublicationPartOpt
 
 extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
 extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(void);
+extern List *GetAllTablesPublicationRelations(bool pubasroot);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2634d2c..d2d269b 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                                       Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -129,10 +131,10 @@ ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
 -- only parent is listed as being in publication, not the partition
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_parted"
 
@@ -143,6 +145,15 @@ HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | t
+Tables:
+    "public.testpub_parted"
+
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 -- fail - view
@@ -159,10 +170,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                          Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -200,10 +211,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -247,10 +258,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -260,20 +271,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                                    List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                                       List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 219e041..9742aef 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
 
 \dRp
 
@@ -87,6 +88,8 @@ UPDATE testpub_parted1 SET a = 1;
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 14ff9f4..6c75782 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 18;
+use Test::More tests => 35;
 
 # setup
 
@@ -25,7 +25,11 @@ my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION pub1");
 $node_publisher->safe_psql('postgres',
-	"CREATE PUBLICATION pub_all FOR ALL TABLES");
+	"CREATE PUBLICATION pub_all FOR ALL TABLES WITH (publish_using_root_schema = true)");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub2");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub3 WITH (publish_using_root_schema = true)");
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_publisher->safe_psql('postgres',
@@ -35,7 +39,23 @@ $node_publisher->safe_psql('postgres',
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
 $node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES IN (1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_2 PARTITION OF tab2 FOR VALUES IN (5, 6)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3_1 PARTITION OF tab3 FOR VALUES IN (1, 2, 3, 5, 6)");
+$node_publisher->safe_psql('postgres',
 	"ALTER PUBLICATION pub1 ADD TABLE tab1, tab1_1");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub2 ADD TABLE tab1_1, tab1_2");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub3 ADD TABLE tab2, tab3_1");
 
 # subscriber1
 $node_subscriber1->safe_psql('postgres',
@@ -52,17 +72,41 @@ $node_subscriber1->safe_psql('postgres',
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
 $node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub1_tab2', b text) PARTITION BY RANGE (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (c text DEFAULT 'sub1_tab2', b text, a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (c text DEFAULT 'sub1_tab3_1', b text, a int NOT NULL PRIMARY KEY)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES FROM (1) TO (10)");
+$node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub4 CONNECTION '$publisher_connstr' PUBLICATION pub3");
 
 # subscriber 2
 $node_subscriber2->safe_psql('postgres',
-	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text)");
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
 $node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
+$node_subscriber2->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub_all");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub2");
 
 # Wait for initial sync of all subscriptions
 my $synced_query =
@@ -79,9 +123,15 @@ $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_1 (a) VALUES (3)");
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (3), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab3 VALUES (1), (3), (5)");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 my $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
@@ -91,6 +141,14 @@ $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT tableoid::regclass FROM tab1 WHERE a = 5");
 is($result, qq(tab1_2_1), 'inserts into tab1_2 replicated into correct partition');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|1|5), 'insert into tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|1|5), 'insert into tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|1|3), 'inserts into tab1_1 replicated');
@@ -104,9 +162,15 @@ $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 2 WHERE a = 1");
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 2 WHERE a = 1");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 2 WHERE a = 1");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
@@ -116,6 +180,14 @@ $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT tableoid::regclass FROM tab1 WHERE a = 6");
 is($result, qq(tab1_2_2), 'update of tab1_2 correctly replicated as intra-partition update');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|2|5), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|2|5), 'update of tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|2|3), 'update of tab1_1 replicated');
@@ -123,9 +195,15 @@ is($result, qq(sub2_tab1_1|2|2|3), 'update of tab1_1 replicated');
 # update (replicated as delete+insert)
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 5 WHERE a = 2");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 6 WHERE a = 2");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 6 WHERE a = 2");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
@@ -135,6 +213,14 @@ $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT tableoid::regclass FROM tab1 WHERE a = 5");
 is($result, qq(tab1_2_1), 'update of tab1_2 correctly replicated as cross-partition update');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|3|3|6), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|3|3|6), 'update of tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
@@ -143,19 +229,41 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
+is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|3|3|6), 'update of tab1 replicated');
+
 # delete
 $node_publisher->safe_psql('postgres',
 	"DELETE FROM tab1 WHERE a IN (3, 5)");
 $node_publisher->safe_psql('postgres',
 	"DELETE FROM tab1_2");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab2 WHERE a IN (3, 5)");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab3 WHERE a IN (3, 5)");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'delete from tab1_1, tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(1|6|6), 'delete from tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(1|6|6), 'delete from tab3_1 replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_1");
 is($result, qq(0||), 'delete from tab1_1 replicated');
@@ -164,34 +272,80 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'delete from tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(0||), 'delete from tab1_1, tab_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1 replicated');
+
 # truncate
 $node_subscriber1->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (2), (5)");
+$node_subscriber1->safe_psql('postgres',
+	"INSERT INTO tab3_1 (a) VALUES (1), (2), (5)");
 $node_subscriber2->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (2)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_1 VALUES (1)");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1_2");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab2_1");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub3');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(2|1|2), 'truncate of tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(4|1|6), 'truncate of tab2_2 NOT replicated');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1_2");
 is($result, qq(0||), 'truncate of tab1_2 replicated');
 
+$node_subscriber2->safe_psql('postgres',
+	"DROP SUBSCRIPTION sub3");
+$node_subscriber2->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (2)");
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab2");
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab3");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
+$node_publisher->wait_for_catchup('sub4');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
-is($result, qq(0||), 'truncate of tab1_1 replicated');
+is($result, qq(0||), 'truncate of tab1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_1");
+is($result, qq(1|1|1), 'tab1_1 unchanged');
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'truncate of tab2 replicated');
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(0||), 'truncate of tab3_1 replicated');
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_2");
+is($result, qq(1|2|2), 'tab1_2 unchanged');
-- 
1.8.3.1

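For reference, a minimal sketch of how the option added by the patch above could be exercised, reusing the hash-partitioned table from the opening message. The option name `publish_using_root_schema` and the `pubasroot` catalog column are taken from the patch; the behavior described in the comments is what the patch's TAP tests exercise, not independently verified here:

```sql
-- Same setup as in the original message.
create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

-- With the patch, the partitioned table itself can be added to a
-- publication; publish_using_root_schema = true causes changes to
-- p1..p3 to be published using p's schema, so the subscriber's
-- partitioning (if any) need not match the publisher's.
create publication publish_p for table p
    with (publish_using_root_schema = true);

-- The new flag is stored in pg_publication.pubasroot and shown by
-- \dRp+ as "Publishes Using Root Schema".
select pubname, pubasroot from pg_publication where pubname = 'publish_p';
```

Future partitions of `p` are picked up automatically, since the publication membership is attached to the root table rather than to individual leaves.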
Attachment: v1-0001-worker.c-refactor-code-to-look-up-local-tuple.patch (application/octet-stream)
From 6fafe4b91233deba66dd81e7048fbefd52f52077 Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Fri, 27 Mar 2020 20:54:30 +0900
Subject: [PATCH v14 1/3] worker.c: refactor code to look up local tuple

---
 src/backend/replication/logical/worker.c | 78 +++++++++++++++++---------------
 1 file changed, 41 insertions(+), 37 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index fa38117..51c0278 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -122,6 +122,10 @@ static void apply_handle_update_internal(ResultRelInfo *relinfo,
 static void apply_handle_delete_internal(ResultRelInfo *relinfo, EState *estate,
 										 TupleTableSlot *remoteslot,
 										 LogicalRepRelation *remoterel);
+static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
+									LogicalRepRelation *remoterel,
+									TupleTableSlot *remoteslot,
+									TupleTableSlot **localslot);
 
 /*
  * Should this worker apply changes for given relation.
@@ -788,33 +792,17 @@ apply_handle_update_internal(ResultRelInfo *relinfo,
 							 LogicalRepRelMapEntry *relmapentry)
 {
 	Relation	localrel = relinfo->ri_RelationDesc;
-	Oid			idxoid;
 	EPQState	epqstate;
 	TupleTableSlot *localslot;
 	bool		found;
 	MemoryContext oldctx;
 
-	localslot = table_slot_create(localrel, &estate->es_tupleTable);
 	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
-
 	ExecOpenIndices(relinfo, false);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (relmapentry->remoterel.replident == REPLICA_IDENTITY_FULL));
-
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-
+	found = FindReplTupleInLocalRel(estate, localrel,
+									&relmapentry->remoterel,
+									remoteslot, &localslot);
 	ExecClearTuple(remoteslot);
 
 	/*
@@ -922,31 +910,15 @@ apply_handle_delete_internal(ResultRelInfo *relinfo, EState *estate,
 							 LogicalRepRelation *remoterel)
 {
 	Relation	localrel = relinfo->ri_RelationDesc;
-	Oid			idxoid;
 	EPQState	epqstate;
 	TupleTableSlot *localslot;
 	bool		found;
 
-	localslot = table_slot_create(localrel, &estate->es_tupleTable);
 	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
-
 	ExecOpenIndices(relinfo, false);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (remoterel->replident == REPLICA_IDENTITY_FULL));
-
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(localrel, LockTupleExclusive,
-										 remoteslot, localslot);
+	found = FindReplTupleInLocalRel(estate, localrel, remoterel,
+									remoteslot, &localslot);
 
 	/* If found delete it. */
 	if (found)
@@ -971,6 +943,38 @@ apply_handle_delete_internal(ResultRelInfo *relinfo, EState *estate,
 }
 
 /*
+ * Try to find a tuple received from the publication side (one in 'remoteslot')
+ * in the corresponding local relation using either replica identity index,
+ * primary key or if needed, sequential scan.
+ *
+ * Local tuple, if found, is returned in '*localslot'.
+ */
+static bool
+FindReplTupleInLocalRel(EState *estate, Relation localrel,
+						LogicalRepRelation *remoterel,
+						TupleTableSlot *remoteslot,
+						TupleTableSlot **localslot)
+{
+	Oid			idxoid;
+	bool		found;
+
+	idxoid = GetRelationIdentityOrPK(localrel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	*localslot = table_slot_create(localrel, &estate->es_tupleTable);
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(localrel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, *localslot);
+	else
+		found = RelationFindReplTupleSeq(localrel, LockTupleExclusive,
+										 remoteslot, *localslot);
+
+	return found;
+}
+
+/*
  * Handle TRUNCATE message.
  *
  * TODO: FDW support
-- 
1.8.3.1

v14-0002-Add-subscription-support-to-replicate-into-parti.patch (application/octet-stream)
From d5d2dbd9f28db3a66a38cba76710d2b5679b09c2 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlangote09@gmail.com>
Date: Thu, 23 Jan 2020 11:49:01 +0900
Subject: [PATCH v14 2/3] Add subscription support to replicate into
 partitioned tables

Mainly, this adds support code in logical/worker.c for applying
replicated operations whose target is a partitioned table to its
relevant partitions.
---
 src/backend/executor/execReplication.c      |  14 +-
 src/backend/replication/logical/relation.c  | 167 ++++++++++++++++
 src/backend/replication/logical/tablesync.c |   1 -
 src/backend/replication/logical/worker.c    | 298 +++++++++++++++++++++++++++-
 src/include/replication/logicalrelation.h   |   2 +
 src/test/subscription/t/013_partition.pl    |  31 ++-
 6 files changed, 486 insertions(+), 27 deletions(-)

diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 7194bec..dc8a01a 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -594,17 +594,9 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * Give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -612,7 +604,7 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 3d7291b..6b88bde 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -34,6 +34,7 @@ static MemoryContext LogicalRepRelMapContext = NULL;
 
 static HTAB *LogicalRepRelMap = NULL;
 static HTAB *LogicalRepTypMap = NULL;
+static HTAB *LogicalRepPartMap = NULL;
 
 
 /*
@@ -472,3 +473,169 @@ logicalrep_typmap_gettypname(Oid remoteid)
 	Assert(OidIsValid(entry->remoteid));
 	return psprintf("%s.%s", entry->nspname, entry->typname);
 }
+
+/*
+ * Partition cache: look up partition LogicalRepRelMapEntry's
+ *
+ * Unlike relation map cache, this is keyed by partition OID, not remote
+ * relation OID, because we only have to use this cache in the case where
+ * partitions are not directly mapped to any remote relation, such as when
+ * replication is occurring with one of their ancestors as target.
+ */
+
+/*
+ * Relcache invalidation callback
+ */
+static void
+logicalrep_partmap_invalidate_cb(Datum arg, Oid reloid)
+{
+	LogicalRepRelMapEntry *entry;
+
+	/* Just to be sure. */
+	if (LogicalRepPartMap == NULL)
+		return;
+
+	if (reloid != InvalidOid)
+	{
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		/* TODO, use inverse lookup hashtable? */
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+		{
+			if (entry->localreloid == reloid)
+			{
+				entry->localreloid = InvalidOid;
+				hash_seq_term(&status);
+				break;
+			}
+		}
+	}
+	else
+	{
+		/* invalidate all cache entries */
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+			entry->localreloid = InvalidOid;
+	}
+}
+
+/*
+ * Initialize the partition map cache.
+ */
+static void
+logicalrep_partmap_init(void)
+{
+	HASHCTL		ctl;
+
+	if (!LogicalRepRelMapContext)
+		LogicalRepRelMapContext =
+			AllocSetContextCreate(CacheMemoryContext,
+								  "LogicalRepPartMapContext",
+								  ALLOCSET_DEFAULT_SIZES);
+
+	/* Initialize the relation hash table. */
+	MemSet(&ctl, 0, sizeof(ctl));
+	ctl.keysize = sizeof(Oid);	/* partition OID */
+	ctl.entrysize = sizeof(LogicalRepRelMapEntry);
+	ctl.hcxt = LogicalRepRelMapContext;
+
+	LogicalRepPartMap = hash_create("logicalrep partition map cache", 64, &ctl,
+								   HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+	/* Watch for invalidation events. */
+	CacheRegisterRelcacheCallback(logicalrep_partmap_invalidate_cb,
+								  (Datum) 0);
+}
+
+/*
+ * logicalrep_partition_open
+ *
+ * Returned entry reuses most of the values of the root table's entry, save
+ * the attribute map, which can be different for the partition.
+ *
+ * Note there's no logicalrep_partition_close, because the caller closes the
+ * component relation.
+ */
+LogicalRepRelMapEntry *
+logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map)
+{
+	LogicalRepRelMapEntry *entry;
+	LogicalRepRelation *remoterel = &root->remoterel;
+	Oid			partOid = RelationGetRelid(partrel);
+	AttrMap	   *attrmap = root->attrmap;
+	bool		found;
+	int			i;
+	MemoryContext oldctx;
+
+	if (LogicalRepPartMap == NULL)
+		logicalrep_partmap_init();
+
+	/* Search for existing entry. */
+	entry = hash_search(LogicalRepPartMap, (void *) &partOid,
+						HASH_ENTER, &found);
+
+	if (found)
+		return entry;
+
+	memset(entry, 0, sizeof(LogicalRepRelMapEntry));
+
+	/* Make cached copy of the data */
+	oldctx = MemoryContextSwitchTo(LogicalRepRelMapContext);
+
+	/* Remote relation is used as-is from the root's entry. */
+	entry->remoterel.remoteid = remoterel->remoteid;
+	entry->remoterel.nspname = pstrdup(remoterel->nspname);
+	entry->remoterel.relname = pstrdup(remoterel->relname);
+	entry->remoterel.natts = remoterel->natts;
+	entry->remoterel.attnames = palloc(remoterel->natts * sizeof(char *));
+	entry->remoterel.atttyps = palloc(remoterel->natts * sizeof(Oid));
+	for (i = 0; i < remoterel->natts; i++)
+	{
+		entry->remoterel.attnames[i] = pstrdup(remoterel->attnames[i]);
+		entry->remoterel.atttyps[i] = remoterel->atttyps[i];
+	}
+	entry->remoterel.replident = remoterel->replident;
+	entry->remoterel.attkeys = bms_copy(remoterel->attkeys);
+
+	entry->localrel = partrel;
+	entry->localreloid = partOid;
+
+	/*
+	 * If the partition's attributes don't match the root relation's, we'll
+	 * need to make a new attrmap which maps partition attribute numbers to
+ * remoterel's, instead of the original, which maps the root relation's attribute
+	 * numbers to remoterel's.
+	 *
+	 * Note that 'map' which comes from the tuple routing data structure
+	 * contains 1-based attribute numbers (of the parent relation).  However,
+	 * the map in 'entry', a logical replication data structure, contains
+	 * 0-based attribute numbers (of the remote relation).
+	 */
+	if (map)
+	{
+		AttrNumber	attno;
+
+		entry->attrmap = make_attrmap(map->maplen);
+		for (attno = 0; attno < entry->attrmap->maplen; attno++)
+		{
+			AttrNumber	root_attno = map->attnums[attno];
+
+			entry->attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
+		}
+	}
+	else
+		entry->attrmap = attrmap;
+
+	entry->updatable = root->updatable;
+
+	/* state and statelsn are left set to 0. */
+	MemoryContextSwitchTo(oldctx);
+
+	return entry;
+}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index a60c666..c27d970 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -762,7 +762,6 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
-	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 51c0278..9871d1f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -29,11 +29,14 @@
 #include "access/xlog_internal.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_subscription.h"
 #include "catalog/pg_subscription_rel.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "executor/nodeModifyTable.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
@@ -126,6 +129,12 @@ static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
 									LogicalRepRelation *remoterel,
 									TupleTableSlot *remoteslot,
 									TupleTableSlot **localslot);
+static void apply_handle_tuple_routing(ResultRelInfo *relinfo,
+									   EState *estate,
+									   TupleTableSlot *remoteslot,
+									   LogicalRepTupleData *newtup,
+									   LogicalRepRelMapEntry *relmapentry,
+									   CmdType operation);
 
 /*
  * Should this worker apply changes for given relation.
@@ -636,9 +645,13 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_insert_internal(estate->es_result_relation_info, estate,
-								 remoteslot);
+	/* For a partitioned table, insert the tuple into a partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_INSERT);
+	else
+		apply_handle_insert_internal(estate->es_result_relation_info, estate,
+									 remoteslot);
 
 	PopActiveSnapshot();
 
@@ -767,9 +780,13 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_update_internal(estate->es_result_relation_info, estate,
-								 remoteslot, &newtup, rel);
+	/* For a partitioned table, apply update to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, &newtup, rel, CMD_UPDATE);
+	else
+		apply_handle_update_internal(estate->es_result_relation_info, estate,
+									 remoteslot, &newtup, rel);
 
 	PopActiveSnapshot();
 
@@ -886,9 +903,13 @@ apply_handle_delete(StringInfo s)
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_delete_internal(estate->es_result_relation_info, estate,
-								 remoteslot, &rel->remoterel);
+	/* For a partitioned table, apply delete to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_DELETE);
+	else
+		apply_handle_delete_internal(estate->es_result_relation_info, estate,
+									 remoteslot, &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -975,6 +996,212 @@ FindReplTupleInLocalRel(EState *estate, Relation localrel,
 }
 
 /*
+ * This handles insert, update, delete on a partitioned table.
+ */
+static void
+apply_handle_tuple_routing(ResultRelInfo *relinfo,
+						   EState *estate,
+						   TupleTableSlot *remoteslot,
+						   LogicalRepTupleData *newtup,
+						   LogicalRepRelMapEntry *relmapentry,
+						   CmdType operation)
+{
+	Relation	parentrel = relinfo->ri_RelationDesc;
+	ModifyTableState *mtstate = NULL;
+	PartitionTupleRouting *proute = NULL;
+	ResultRelInfo *partrelinfo;
+	Relation	partrel;
+	TupleTableSlot *remoteslot_part;
+	PartitionRoutingInfo *partinfo;
+	TupleConversionMap *map;
+	MemoryContext oldctx;
+
+	/* ModifyTableState is needed for ExecFindPartition(). */
+	mtstate = makeNode(ModifyTableState);
+	mtstate->ps.plan = NULL;
+	mtstate->ps.state = estate;
+	mtstate->operation = operation;
+	mtstate->resultRelInfo = relinfo;
+	proute = ExecSetupPartitionTupleRouting(estate, mtstate, parentrel);
+
+	/*
+	 * Find a partition for the tuple contained in remoteslot.
+	 */
+	Assert(remoteslot != NULL);
+	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+	partrelinfo = ExecFindPartition(mtstate, relinfo, proute,
+									remoteslot, estate);
+	Assert(partrelinfo != NULL);
+	partrel = partrelinfo->ri_RelationDesc;
+
+	/* Convert the tuple to match the partition's rowtype. */
+	partinfo = partrelinfo->ri_PartitionInfo;
+	remoteslot_part = partinfo->pi_PartitionTupleSlot;
+	if (remoteslot_part == NULL)
+		remoteslot_part = table_slot_create(partrel, &estate->es_tupleTable);
+	map = partinfo->pi_RootToPartitionMap;
+	if (map != NULL)
+		remoteslot_part = execute_attr_map_slot(map->attrMap, remoteslot,
+												remoteslot_part);
+	else
+	{
+		remoteslot_part = ExecCopySlot(remoteslot_part, remoteslot);
+		slot_getallattrs(remoteslot_part);
+	}
+	MemoryContextSwitchTo(oldctx);
+
+	estate->es_result_relation_info = partrelinfo;
+	switch (operation)
+	{
+		case CMD_INSERT:
+			apply_handle_insert_internal(partrelinfo, estate,
+										 remoteslot_part);
+			break;
+
+		case CMD_DELETE:
+			apply_handle_delete_internal(partrelinfo, estate,
+										 remoteslot_part,
+										 &relmapentry->remoterel);
+			break;
+
+		case CMD_UPDATE:
+			/*
+			 * For UPDATE, depending on whether or not the updated tuple
+			 * satisfies the partition's constraint, perform a simple
+			 * UPDATE of the partition or move the updated tuple into a
+			 * different suitable partition.
+			 */
+			{
+				AttrMap	   *attrmap = map ? map->attrMap : NULL;
+				LogicalRepRelMapEntry *part_entry;
+				TupleTableSlot *localslot;
+				ResultRelInfo *partrelinfo_new;
+				TupleTableSlot *remoteslot_new;
+				bool		found;
+
+				part_entry = logicalrep_partition_open(relmapentry, partrel,
+													   attrmap);
+
+				/* Get the matching local tuple from the partition. */
+				found = FindReplTupleInLocalRel(estate, partrel,
+												&part_entry->remoterel,
+												remoteslot_part, &localslot);
+
+				remoteslot_new = table_slot_create(partrel,
+												   &estate->es_tupleTable);
+				oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+				if (found)
+				{
+					/* Get the updated tuple.  */
+					slot_modify_cstrings(remoteslot_new, localslot,
+										 part_entry,
+										 newtup->values, newtup->changed);
+					MemoryContextSwitchTo(oldctx);
+				}
+				else
+				{
+					/*
+					 * The tuple to be updated could not be found.
+					 *
+					 * TODO what to do here, change the log level to LOG
+					 * perhaps?
+					 */
+					elog(DEBUG1,
+						 "logical replication did not find row for update "
+						 "in replication target relation \"%s\"",
+						 RelationGetRelationName(partrel));
+				}
+
+				/* Does the updated tuple satisfy the partition constraint? */
+				if (partrelinfo->ri_PartitionCheck == NULL ||
+					ExecPartitionCheck(partrelinfo, remoteslot_new, estate,
+									   false))
+				{
+					/*
+					 * Yes, so simply UPDATE the partition.  We don't call
+					 * apply_handle_update_internal() here, which could do this
+					 * work, to avoid repeating some work already done above,
+					 * such as finding the local tuple in the partition.
+					 */
+					EPQState epqstate;
+
+					EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+					ExecOpenIndices(partrelinfo, false);
+
+					EvalPlanQualSetSlot(&epqstate, remoteslot_new);
+					ExecSimpleRelationUpdate(estate, &epqstate, localslot,
+											 remoteslot_new);
+					ExecCloseIndices(partrelinfo);
+					EvalPlanQualEnd(&epqstate);
+				}
+				else
+				{
+					/* Move the tuple into the new partition. */
+
+					/* Convert the updated tuple back to the parent's rowtype. */
+					if (map)
+					{
+						TupleConversionMap *PartitionToRootMap =
+							convert_tuples_by_name(RelationGetDescr(partrel),
+												   RelationGetDescr(parentrel));
+						remoteslot =
+							execute_attr_map_slot(PartitionToRootMap->attrMap,
+												  remoteslot_new, remoteslot);
+					}
+					else
+					{
+						remoteslot = ExecCopySlot(remoteslot, remoteslot_new);
+						slot_getallattrs(remoteslot);
+					}
+
+
+					/* Find the new partition. */
+					oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+					partrelinfo_new = ExecFindPartition(mtstate, relinfo,
+														proute, remoteslot,
+														estate);
+					MemoryContextSwitchTo(oldctx);
+					Assert(partrelinfo_new != partrelinfo);
+
+					/* DELETE old tuple from the old partition. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_delete_internal(partrelinfo, estate,
+												 remoteslot_part,
+												 &relmapentry->remoterel);
+
+					/* INSERT new tuple into the new partition. */
+
+					/*
+					 * Convert the replacement tuple to match the destination
+					 * partition rowtype.
+					 */
+					oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+					partinfo = partrelinfo_new->ri_PartitionInfo;
+					map = partinfo->pi_RootToPartitionMap;
+					if (map != NULL)
+					{
+						remoteslot_new = partinfo->pi_PartitionTupleSlot;
+						remoteslot_new = execute_attr_map_slot(map->attrMap,
+															   remoteslot,
+															   remoteslot_new);
+					}
+					MemoryContextSwitchTo(oldctx);
+					estate->es_result_relation_info = partrelinfo_new;
+					apply_handle_insert_internal(partrelinfo_new, estate,
+												 remoteslot_new);
+				}
+			}
+			break;
+
+		default:
+			elog(ERROR, "unrecognized CmdType: %d", (int) operation);
+			break;
+	}
+
+	ExecCleanupTupleRouting(mtstate, proute);
+}
+
+/*
  * Handle TRUNCATE message.
  *
  * TODO: FDW support
@@ -987,6 +1214,7 @@ apply_handle_truncate(StringInfo s)
 	List	   *remote_relids = NIL;
 	List	   *remote_rels = NIL;
 	List	   *rels = NIL;
+	List	   *part_rels = NIL;
 	List	   *relids = NIL;
 	List	   *relids_logged = NIL;
 	ListCell   *lc;
@@ -1016,6 +1244,52 @@ apply_handle_truncate(StringInfo s)
 		relids = lappend_oid(relids, rel->localreloid);
 		if (RelationIsLogicallyLogged(rel->localrel))
 			relids_logged = lappend_oid(relids_logged, rel->localreloid);
+
+		/*
+		 * Truncate partitions if we got a message to truncate a partitioned
+		 * table.
+		 */
+		if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		{
+			ListCell   *child;
+			List	   *children = find_all_inheritors(rel->localreloid,
+													   RowExclusiveLock,
+													   NULL);
+
+			foreach(child, children)
+			{
+				Oid			childrelid = lfirst_oid(child);
+				Relation	childrel;
+
+				if (list_member_oid(relids, childrelid))
+					continue;
+
+				/* find_all_inheritors already got lock */
+				childrel = table_open(childrelid, NoLock);
+
+				/*
+				 * It is possible that the parent table has children that are
+				 * temp tables of other backends.  We cannot safely access
+				 * such tables (because of buffering issues), and the best
+				 * thing to do is to silently ignore them.  Note that this
+				 * check is the same as one of the checks done in
+				 * truncate_check_activity() called below, still it is kept
+				 * here for simplicity.
+				 */
+				if (RELATION_IS_OTHER_TEMP(childrel))
+				{
+					table_close(childrel, RowExclusiveLock);
+					continue;
+				}
+
+				rels = lappend(rels, childrel);
+				part_rels = lappend(part_rels, childrel);
+				relids = lappend_oid(relids, childrelid);
+				/* Log this relation only if needed for logical decoding */
+				if (RelationIsLogicallyLogged(childrel))
+					relids_logged = lappend_oid(relids_logged, childrelid);
+			}
+		}
 	}
 
 	/*
@@ -1031,6 +1305,12 @@ apply_handle_truncate(StringInfo s)
 
 		logicalrep_rel_close(rel, NoLock);
 	}
+	foreach(lc, part_rels)
+	{
+		Relation rel = lfirst(lc);
+
+		table_close(rel, NoLock);
+	}
 
 	CommandCounterIncrement();
 }
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 9971a80..4650b4f 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -34,6 +34,8 @@ extern void logicalrep_relmap_update(LogicalRepRelation *remoterel);
 
 extern LogicalRepRelMapEntry *logicalrep_rel_open(LogicalRepRelId remoteid,
 												  LOCKMODE lockmode);
+extern LogicalRepRelMapEntry *logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map);
 extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
 								 LOCKMODE lockmode);
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index ea5812c..14ff9f4 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 15;
+use Test::More tests => 18;
 
 # setup
 
@@ -42,10 +42,15 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
+
 $node_subscriber1->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
 $node_subscriber1->safe_psql('postgres',
-	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6) PARTITION BY LIST (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_1 PARTITION OF tab1_2 FOR VALUES IN (5)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (6)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
 
@@ -82,6 +87,10 @@ my $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT tableoid::regclass FROM tab1 WHERE a = 5");
+is($result, qq(tab1_2_1), 'inserts into tab1_2 replicated into correct partition');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|1|3), 'inserts into tab1_1 replicated');
@@ -90,24 +99,30 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
 
-# update (no partition change)
+# update (replicated as update)
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 2 WHERE a = 1");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 5");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
-is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
+is($result, qq(sub1_tab1|3|2|6), 'update of tab1_1 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT tableoid::regclass FROM tab1 WHERE a = 6");
+is($result, qq(tab1_2_2), 'update of tab1_2 correctly replicated as intra-partition update');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|2|3), 'update of tab1_1 replicated');
 
-# update (partition changes)
+# update (replicated as delete+insert)
 $node_publisher->safe_psql('postgres',
-	"UPDATE tab1 SET a = 6 WHERE a = 2");
+	"UPDATE tab1 SET a = 5 WHERE a = 2");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
@@ -116,6 +131,10 @@ $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
 is($result, qq(sub1_tab1|3|3|6), 'update of tab1 replicated');
 
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT tableoid::regclass FROM tab1 WHERE a = 5");
+is($result, qq(tab1_2_1), 'update of tab1_2 correctly replicated as cross-partition update');
+
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
-- 
1.8.3.1

#52Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#51)
3 attachment(s)
Re: adding partitioned tables to publications

I have updated the comments in apply_handle_tuple_routing() (see 0002)
to better explain what's going on with UPDATE handling. I also
rearranged the tests a bit for clarity.

Attached updated patches.

--
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v1-0001-worker.c-refactor-code-to-look-up-local-tuple.patch (application/octet-stream)
From 4715dd6c444ed5c5021f8330bc151277474397b1 Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Fri, 27 Mar 2020 20:54:30 +0900
Subject: [PATCH v15 1/3] worker.c: refactor code to look up local tuple

---
 src/backend/replication/logical/worker.c | 78 +++++++++++++++++---------------
 1 file changed, 41 insertions(+), 37 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index fa38117..51c0278 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -122,6 +122,10 @@ static void apply_handle_update_internal(ResultRelInfo *relinfo,
 static void apply_handle_delete_internal(ResultRelInfo *relinfo, EState *estate,
 										 TupleTableSlot *remoteslot,
 										 LogicalRepRelation *remoterel);
+static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
+									LogicalRepRelation *remoterel,
+									TupleTableSlot *remoteslot,
+									TupleTableSlot **localslot);
 
 /*
  * Should this worker apply changes for given relation.
@@ -788,33 +792,17 @@ apply_handle_update_internal(ResultRelInfo *relinfo,
 							 LogicalRepRelMapEntry *relmapentry)
 {
 	Relation	localrel = relinfo->ri_RelationDesc;
-	Oid			idxoid;
 	EPQState	epqstate;
 	TupleTableSlot *localslot;
 	bool		found;
 	MemoryContext oldctx;
 
-	localslot = table_slot_create(localrel, &estate->es_tupleTable);
 	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
-
 	ExecOpenIndices(relinfo, false);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (relmapentry->remoterel.replident == REPLICA_IDENTITY_FULL));
-
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(localrel, LockTupleExclusive,
-										 remoteslot, localslot);
-
+	found = FindReplTupleInLocalRel(estate, localrel,
+									&relmapentry->remoterel,
+									remoteslot, &localslot);
 	ExecClearTuple(remoteslot);
 
 	/*
@@ -922,31 +910,15 @@ apply_handle_delete_internal(ResultRelInfo *relinfo, EState *estate,
 							 LogicalRepRelation *remoterel)
 {
 	Relation	localrel = relinfo->ri_RelationDesc;
-	Oid			idxoid;
 	EPQState	epqstate;
 	TupleTableSlot *localslot;
 	bool		found;
 
-	localslot = table_slot_create(localrel, &estate->es_tupleTable);
 	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
-
 	ExecOpenIndices(relinfo, false);
 
-	/*
-	 * Try to find tuple using either replica identity index, primary key or
-	 * if needed, sequential scan.
-	 */
-	idxoid = GetRelationIdentityOrPK(localrel);
-	Assert(OidIsValid(idxoid) ||
-		   (remoterel->replident == REPLICA_IDENTITY_FULL));
-
-	if (OidIsValid(idxoid))
-		found = RelationFindReplTupleByIndex(localrel, idxoid,
-											 LockTupleExclusive,
-											 remoteslot, localslot);
-	else
-		found = RelationFindReplTupleSeq(localrel, LockTupleExclusive,
-										 remoteslot, localslot);
+	found = FindReplTupleInLocalRel(estate, localrel, remoterel,
+									remoteslot, &localslot);
 
 	/* If found delete it. */
 	if (found)
@@ -971,6 +943,38 @@ apply_handle_delete_internal(ResultRelInfo *relinfo, EState *estate,
 }
 
 /*
+ * Try to find a tuple received from the publication side (one in 'remoteslot')
+ * in the corresponding local relation using either the replica identity
+ * index, the primary key, or, if needed, a sequential scan.
+ *
+ * Local tuple, if found, is returned in '*localslot'.
+ */
+static bool
+FindReplTupleInLocalRel(EState *estate, Relation localrel,
+						LogicalRepRelation *remoterel,
+						TupleTableSlot *remoteslot,
+						TupleTableSlot **localslot)
+{
+	Oid			idxoid;
+	bool		found;
+
+	idxoid = GetRelationIdentityOrPK(localrel);
+	Assert(OidIsValid(idxoid) ||
+		   (remoterel->replident == REPLICA_IDENTITY_FULL));
+
+	*localslot = table_slot_create(localrel, &estate->es_tupleTable);
+	if (OidIsValid(idxoid))
+		found = RelationFindReplTupleByIndex(localrel, idxoid,
+											 LockTupleExclusive,
+											 remoteslot, *localslot);
+	else
+		found = RelationFindReplTupleSeq(localrel, LockTupleExclusive,
+										 remoteslot, *localslot);
+
+	return found;
+}
+
+/*
  * Handle TRUNCATE message.
  *
  * TODO: FDW support
-- 
1.8.3.1

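For illustration, the intended end-to-end use of the new option added by
v15-0003 would look roughly like this. This is a sketch, not part of the
patches; the connection string is a placeholder, and actually replicating
requires a running publisher/subscriber pair.

On the publisher:

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 2, remainder 0);
create table p2 partition of p for values with (modulus 2, remainder 1);
create publication publish_p for table p
    with (publish_using_root_schema = true);

On the subscriber, the target no longer needs matching partitions; a plain
table of the same shape suffices:

create table p (a int, b int);
create subscription sub_p
    connection 'host=publisher dbname=postgres'
    publication publish_p;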
v15-0003-Publish-partitioned-table-inserts-as-its-own.patch (application/octet-stream)
From a1ac1f87278a8b7b7be2b30eeb5797d55d901aad Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v15 3/3] Publish partitioned table inserts as its own

To control whether partition changes are replicated using their
own identity (and schema) or an ancestor's, add a new parameter
that can be set per publication named 'publish_using_root_schema'.
---
 doc/src/sgml/logical-replication.sgml       |  11 +-
 doc/src/sgml/ref/create_publication.sgml    |  17 ++
 src/backend/catalog/partition.c             |   9 +
 src/backend/catalog/pg_publication.c        |  63 ++++++-
 src/backend/commands/publicationcmds.c      |  95 ++++++-----
 src/backend/commands/tablecmds.c            |   2 +-
 src/backend/executor/nodeModifyTable.c      |   4 +
 src/backend/replication/pgoutput/pgoutput.c | 211 +++++++++++++++++++-----
 src/backend/utils/cache/relcache.c          |   7 +-
 src/bin/pg_dump/pg_dump.c                   |  22 ++-
 src/bin/pg_dump/pg_dump.h                   |   1 +
 src/bin/psql/describe.c                     |  17 +-
 src/include/catalog/partition.h             |   1 +
 src/include/catalog/pg_publication.h        |   7 +-
 src/test/regress/expected/publication.out   | 103 ++++++------
 src/test/regress/sql/publication.sql        |   3 +
 src/test/subscription/t/013_partition.pl    | 244 +++++++++++++++++++++++++++-
 17 files changed, 666 insertions(+), 151 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8bd7c9c..a99e90b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,15 +402,8 @@
 
    <listitem>
     <para>
-     Replication is only supported by tables, partitioned or not, although a
-     given table must either be partitioned on both servers or not partitioned
-     at all.  Also, when replicating between partitioned tables, the actual
-     replication occurs between leaf partitions, so partitions on the two
-     servers must match one-to-one.
-    </para>
-
-    <para>
-     Attempts to replicate other types of relations such as views, materialized
+     Replication is only supported by tables, partitioned or not.
+     Attempts to replicate other types of relations such as views, materialized
      views, or foreign tables, will result in an error.
     </para>
    </listitem>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 597cb28..0ca6cff 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -123,6 +123,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_using_root_schema</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table (or on its partitions) contained in the publication will be
+          published using the partitioned table's own schema rather than
+          that of the individual partitions that are actually changed; the
+          latter is the default.  Setting it to <literal>true</literal>
+          allows the changes to be replicated into a non-partitioned table
+          or into a partitioned table consisting of a different set of
+          partitions.  However, <literal>TRUNCATE</literal> operations
+          performed directly on partitions are not replicated.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 239ac01..15b8063 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -28,6 +28,7 @@
 #include "partitioning/partbounds.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/partcache.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
@@ -126,6 +127,14 @@ get_partition_ancestors(Oid relid)
 	return result;
 }
 
+/* Is given relation a leaf partition? */
+bool
+is_leaf_partition(Oid relid)
+{
+	return	get_rel_relispartition(relid) &&
+			get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE;
+}
+
 /*
  * get_partition_ancestors_worker
  *		recursive worker for get_partition_ancestors
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 500a5ae..0c534a2 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -220,13 +220,30 @@ publication_add_relation(Oid pubid, Relation targetrel,
 /*
  * Gets list of publication oids for a relation, plus those of ancestors,
  * if any, if the relation is a partition.
+ *
+ * *published_rels, if asked for, will contain the OID of the relation for
+ * each publication returned, that is, of the relation that is actually
+ * published.  Examining this list allows the caller, for instance, to
+ * distinguish publications that it is directly part of from those that it is
+ * indirectly part of via an ancestor.
  */
 List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Oid relid, List **published_rels)
 {
 	List	   *result = NIL;
+	int			i,
+				num;
+
+	if (published_rels)
+		*published_rels = NIL;
 
 	result = get_rel_publications(relid);
+	if (published_rels)
+	{
+		num = list_length(result);
+		for (i = 0; i < num; i++)
+			*published_rels = lappend_oid(*published_rels, relid);
+	}
 	if (get_rel_relispartition(relid))
 	{
 		List	   *ancestors = get_partition_ancestors(relid);
@@ -238,6 +255,12 @@ GetRelationPublications(Oid relid)
 			List	   *ancestor_pubs = get_rel_publications(ancestor);
 
 			result = list_concat(result, ancestor_pubs);
+			if (published_rels)
+			{
+				num = list_length(ancestor_pubs);
+				for (i = 0; i < num; i++)
+					*published_rels = lappend_oid(*published_rels, ancestor);
+			}
 		}
 	}
 
@@ -373,9 +396,13 @@ GetAllTablesPublications(void)
 
 /*
  * Gets list of all relation published by FOR ALL TABLES publication(s).
+ *
+ * If the publication publishes partition changes via their respective root
+ * partitioned tables, we must exclude partitions in favor of including the
+ * root partitioned tables.
  */
 List *
-GetAllTablesPublicationRelations(void)
+GetAllTablesPublicationRelations(bool pubasroot)
 {
 	Relation	classRel;
 	ScanKeyData key[1];
@@ -397,12 +424,36 @@
 		Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
 		Oid			relid = relForm->oid;
 
-		if (is_publishable_class(relid, relForm))
+		if (is_publishable_class(relid, relForm) &&
+			!(relForm->relispartition && pubasroot))
 			result = lappend_oid(result, relid);
 	}
 
 	table_endscan(scan);
-	table_close(classRel, AccessShareLock);
+
+	if (pubasroot)
+	{
+		ScanKeyInit(&key[0],
+					Anum_pg_class_relkind,
+					BTEqualStrategyNumber, F_CHAREQ,
+					CharGetDatum(RELKIND_PARTITIONED_TABLE));
+
+		scan = table_beginscan_catalog(classRel, 1, key);
+
+		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+		{
+			Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
+			Oid			relid = relForm->oid;
+
+			if (is_publishable_class(relid, relForm) &&
+				!relForm->relispartition)
+				result = lappend_oid(result, relid);
+		}
+
+		table_endscan(scan);
+	}
+
+	/* Close the relation in all cases, not only when pubasroot is set. */
+	table_close(classRel, AccessShareLock);
 
 	return result;
 }
@@ -433,6 +483,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->pubasroot = pubform->pubasroot;
 
 	ReleaseSysCache(tup);
 
@@ -533,9 +584,11 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		 * need those.
 		 */
 		if (publication->alltables)
-			tables = GetAllTablesPublicationRelations();
+			tables = GetAllTablesPublicationRelations(publication->pubasroot);
 		else
 			tables = GetPublicationRelations(publication->oid,
+											 publication->pubasroot ?
+											 PUBLICATION_PART_ROOT :
 											 PUBLICATION_PART_LEAF);
 		funcctx->user_fctx = (void *) tables;
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 494c0bd..9e102a4 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -23,6 +23,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -56,20 +57,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_using_root_schema_given,
+						  bool *publish_using_root_schema)
 {
 	ListCell   *lc;
 
+	*publish_using_root_schema_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* By default, relation changes are published using their own schema. */
+	*publish_using_root_schema = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -91,10 +95,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -110,19 +114,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_using_root_schema") == 0)
+		{
+			if (*publish_using_root_schema_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_using_root_schema_given = true;
+			*publish_using_root_schema = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -143,10 +156,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -183,9 +195,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -193,13 +205,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubasroot - 1] =
+		BoolGetDatum(publish_using_root_schema);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -251,17 +265,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_using_root_schema_given;
+	bool		publish_using_root_schema;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_using_root_schema_given,
+							  &publish_using_root_schema);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -270,19 +283,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_using_root_schema_given)
+	{
+		values[Anum_pg_publication_pubasroot - 1] = BoolGetDatum(publish_using_root_schema);
+		replaces[Anum_pg_publication_pubasroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 8e35c5b..f7c1e17 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14694,7 +14694,7 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 	 * UNLOGGED as UNLOGGED tables can't be published.
 	 */
 	if (!toLogged &&
-		list_length(GetRelationPublications(RelationGetRelid(rel))) > 0)
+		list_length(GetRelationPublications(RelationGetRelid(rel), NULL)) > 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("cannot change table \"%s\" to unlogged because it is part of a publication",
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d71c0a4..f71fd98 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2320,8 +2320,12 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		/* Only necessary to check replication identity. */
+		CheckValidResultRel(mtstate->rootResultRelInfo, operation);
+	}
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 552a70c..f48a8fb 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,8 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -20,6 +22,7 @@
 #include "replication/pgoutput.h"
 #include "utils/int8.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -49,6 +52,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +63,33 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If ancestor relid is set, its schema must also
+	 * have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * True if the publication that get_rel_sync_entry matched for this
+	 * relation has publish_using_root_schema set.
+	 */
+	bool		pubasroot;
+
+	/*
+	 * OID of the ancestor whose schema will be used when replicating changes
+	 * to a partition; InvalidOid if pubasroot is false.
+	 */
+	Oid			replicate_as_relid;
+
+	/*
+	 * Map, if any, used when replicating using an ancestor's schema to
+	 * convert tuples from the partition's rowtype to the ancestor's; NULL
+	 * pubasroot is false.
+	 */
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +287,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
+
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		desc = RelationGetDescr(relation);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+/*
+ * Send a relation's schema, preceded by type info for its attributes.
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+		if (att->attisdropped || att->attgenerated)
+			continue;
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +399,68 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						oldtuple = oldtuple ? execute_attr_map_tuple(oldtuple, relentry->map) : NULL;
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -413,9 +506,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 
 		/*
 		 * Don't send partitioned tables, because partitions should be sent
-		 * instead.
+		 * instead, unless the user asked to publish the partitioned table itself.
 		 */
-		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+			!relentry->pubasroot)
 			continue;
 
 		relids[nrelids++] = relid;
@@ -540,7 +634,8 @@ init_rel_sync_cache(MemoryContext cachectx)
  * This looks up publications that the given relation is directly or
  * indirectly part of (the latter if it's really the relation's ancestor that
  * is part of a publication) and fills up the found entry with the information
- * about which operations to publish.
+ * about which operations to publish and whether to use an ancestor's schema
+ * when publishing.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
@@ -562,8 +657,10 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	/* Not found means schema wasn't sent */
 	if (!found || !entry->replicate_valid)
 	{
-		List	   *pubids = GetRelationPublications(relid);
+		List	   *published_rels = NIL;
+		List	   *pubids = GetRelationPublications(relid, &published_rels);
 		ListCell   *lc;
+		Oid			ancestor = InvalidOid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -588,13 +685,42 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
+			bool		publish = false;
+
+			if (pub->alltables)
+			{
+				publish = true;
+				if (pub->pubasroot && get_rel_relispartition(relid))
+					ancestor = llast_oid(get_partition_ancestors(relid));
+			}
+
+			if (!publish)
+			{
+				ListCell *lc1,
+						 *lc2;
+
+				forboth(lc1, pubids, lc2, published_rels)
+				{
+					Oid		pubid = lfirst_oid(lc1);
+					Oid		pub_relid = lfirst_oid(lc2);
+					if (pubid == pub->oid)
+					{
+						publish = true;
+						if (pub->pubasroot && pub_relid != relid)
+							ancestor = pub_relid;
+						break;
+					}
+				}
+			}
 
-			if (pub->alltables || list_member_oid(pubids, pub->oid))
+			if (publish)
 			{
 				entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
 				entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
 				entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
-				entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				if (!OidIsValid(ancestor))
+					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				entry->pubasroot = pub->pubasroot;
 			}
 
 			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
@@ -604,6 +730,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 
 		list_free(pubids);
 
+		entry->replicate_as_relid = ancestor;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 782af9a..e99cab2 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -43,6 +43,7 @@
 #include "catalog/catalog.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
@@ -5138,7 +5139,7 @@ GetRelationPublicationActions(Relation relation)
 					  sizeof(PublicationActions));
 
 	/* Fetch the publication membership info. */
-	puboids = GetRelationPublications(RelationGetRelid(relation));
+	puboids = GetRelationPublications(RelationGetRelid(relation), NULL);
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
@@ -5157,7 +5158,9 @@ GetRelationPublicationActions(Relation relation)
 		pubactions->pubinsert |= pubform->pubinsert;
 		pubactions->pubupdate |= pubform->pubupdate;
 		pubactions->pubdelete |= pubform->pubdelete;
-		pubactions->pubtruncate |= pubform->pubtruncate;
+		if (!pubform->pubasroot ||
+			!is_leaf_partition(RelationGetRelid(relation)))
+			pubactions->pubtruncate |= pubform->pubtruncate;
 
 		ReleaseSysCache(tup);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 1849dfe..efe3ee4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3868,6 +3868,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubasroot;
 	int			i,
 				ntups;
 
@@ -3879,11 +3880,18 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubasroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false as pubasroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
@@ -3907,6 +3915,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubasroot = PQfnumber(res, "pubasroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3929,6 +3938,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubasroot =
+			(strcmp(PQgetvalue(res, i, i_pubasroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -4005,7 +4016,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubasroot)
+		appendPQExpBufferStr(query, ", publish_using_root_schema = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 3e11166..d12c28b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -602,6 +602,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubasroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 109245f..cbd6994 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5707,7 +5707,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5738,6 +5738,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubasroot AS \"%s\"",
+						  gettext_noop("Publishes Using Root Schema"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5779,6 +5783,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubasroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5791,6 +5796,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubasroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5801,6 +5807,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubasroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubasroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5850,6 +5859,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubasroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5862,6 +5873,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubasroot)
+			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5870,6 +5883,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubasroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 27873af..c6c1911 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -21,6 +21,7 @@
 
 extern Oid	get_partition_parent(Oid relid);
 extern List *get_partition_ancestors(Oid relid);
+extern bool is_leaf_partition(Oid relid);
 extern Oid	index_get_partition(Relation partition, Oid indexId);
 extern List *map_partition_varattnos(List *expr, int fromrel_varno,
 									 Relation to_rel, Relation from_rel);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index bb52e8c..a85a6c8 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubasroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,12 +76,13 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		pubasroot;
 	PublicationActions pubactions;
 } Publication;
 
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
-extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationPublications(Oid relid, List **published_rels);
 
 /*---------
  * Expected values for pub_partopt parameter of GetRelationPublications(),
@@ -99,7 +102,7 @@ typedef enum PublicationPartOpt
 
 extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
 extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(void);
+extern List *GetAllTablesPublicationRelations(bool pubasroot);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2634d2c..d2d269b 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                                       Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -129,10 +131,10 @@ ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
 -- only parent is listed as being in publication, not the partition
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_parted"
 
@@ -143,6 +145,15 @@ HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | t
+Tables:
+    "public.testpub_parted"
+
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 -- fail - view
@@ -159,10 +170,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                          Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -200,10 +211,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -247,10 +258,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -260,20 +271,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                                    List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                                       List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 219e041..9742aef 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_using_root_schema = 'true', publish_using_root_schema = '0');
 
 \dRp
 
@@ -87,6 +88,8 @@ UPDATE testpub_parted1 SET a = 1;
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_using_root_schema = true);
+\dRp+ testpub_forparted
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index fe3275a..82beb28 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 17;
+use Test::More tests => 44;
 
 # setup
 
@@ -44,7 +44,6 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
-
 $node_subscriber1->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
 $node_subscriber1->safe_psql('postgres',
@@ -82,6 +81,8 @@ $node_subscriber1->poll_query_until('postgres', $synced_query)
 $node_subscriber2->poll_query_until('postgres', $synced_query)
   or die "Timed out while waiting for subscriber to synchronize data";
 
+# Tests for replication using leaf partition identity and schema
+
 # insert
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1)");
@@ -206,3 +207,242 @@ is($result, qq(0||), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'truncate of tab1 replicated');
+
+# Tests for replication using root table identity and schema
+
+# Publisher
+$node_publisher->safe_psql('postgres',
+	"DROP PUBLICATION pub1");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES IN (0, 1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_2 PARTITION OF tab2 FOR VALUES IN (5, 6)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3_1 PARTITION OF tab3 FOR VALUES IN (0, 1, 2, 3, 5, 6)");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub_all SET (publish_using_root_schema = true)");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub_viaroot FOR TABLE tab2, tab3_1 WITH (publish_using_root_schema = true)");
+
+# Subscriber 1
+$node_subscriber1->safe_psql('postgres',
+	"DROP SUBSCRIPTION sub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub1_tab2', b text) PARTITION BY RANGE (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (c text DEFAULT 'sub1_tab2', b text, a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES FROM (0) TO (10)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (c text DEFAULT 'sub1_tab3_1', b text, a int NOT NULL PRIMARY KEY)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub_viaroot CONNECTION '$publisher_connstr' PUBLICATION pub_viaroot");
+
+# Subscriber 2
+$node_subscriber2->safe_psql('postgres',
+	"DROP TABLE tab1");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab3', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab3_1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER SUBSCRIPTION sub2 REFRESH PUBLICATION");
+
+# Wait for initial sync of all subscriptions
+$node_subscriber1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber2->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# insert
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (0)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_1 (a) VALUES (3)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (0), (3), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab3 VALUES (1), (0), (3), (5)");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|4|0|5), 'inserts into tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|4|0|5), 'inserts into tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|4|0|5), 'inserts into tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub2_tab2|4|0|5), 'inserts into tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3 GROUP BY 1");
+is($result, qq(sub2_tab3|4|0|5), 'inserts into tab3 replicated');
+
+# update (replicated as update)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 6 WHERE a = 5");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|4|0|6), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|4|0|6), 'update of tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|4|0|6), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub2_tab2|4|0|6), 'update of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3 GROUP BY 1");
+is($result, qq(sub2_tab3|4|0|6), 'update of tab3 replicated');
+
+# update (replicated as delete+insert)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 2 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 2 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 2 WHERE a = 6");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|4|0|3), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|4|0|3), 'update of tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|4|0|3), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub2_tab2|4|0|3), 'update of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3 GROUP BY 1");
+is($result, qq(sub2_tab3|4|0|3), 'update of tab3 replicated');
+
+# delete
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab2");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab3");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'delete from tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'delete from tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3");
+is($result, qq(0||), 'delete from tab3 replicated');
+
+# truncate
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (2), (5)");
+# these will NOT be replicated
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1_2, tab2_1, tab3_1");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(3|1|5), 'truncate of tab2_1 NOT replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(3|1|5), 'truncate of tab1_2 NOT replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(3|1|5), 'truncate of tab2_1 NOT replicated');
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1, tab2, tab3");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'truncate of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'truncate of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3");
+is($result, qq(0||), 'truncate of tab3 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(0||), 'truncate of tab3_1 replicated');
-- 
1.8.3.1

v15-0002-Add-subscription-support-to-replicate-into-parti.patch (application/octet-stream)
From 8b729f36b77584a3cf2bad48d3a88e13f493e8b5 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlangote09@gmail.com>
Date: Thu, 23 Jan 2020 11:49:01 +0900
Subject: [PATCH v15 2/3] Add subscription support to replicate into
 partitioned tables

Mainly, this adds support code in logical/worker.c for applying
replicated operations whose target is a partitioned table to its
relevant partitions.
---
 src/backend/executor/execReplication.c      |  14 +-
 src/backend/replication/logical/relation.c  | 167 +++++++++++++++
 src/backend/replication/logical/tablesync.c |   1 -
 src/backend/replication/logical/worker.c    | 312 +++++++++++++++++++++++++++-
 src/include/replication/logicalrelation.h   |   2 +
 src/test/subscription/t/013_partition.pl    |  52 ++++-
 6 files changed, 516 insertions(+), 32 deletions(-)

diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 7194bec..dc8a01a 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -594,17 +594,9 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * Give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -612,7 +604,7 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 3d7291b..6b88bde 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -34,6 +34,7 @@ static MemoryContext LogicalRepRelMapContext = NULL;
 
 static HTAB *LogicalRepRelMap = NULL;
 static HTAB *LogicalRepTypMap = NULL;
+static HTAB *LogicalRepPartMap = NULL;
 
 
 /*
@@ -472,3 +473,169 @@ logicalrep_typmap_gettypname(Oid remoteid)
 	Assert(OidIsValid(entry->remoteid));
 	return psprintf("%s.%s", entry->nspname, entry->typname);
 }
+
+/*
+ * Partition cache: look up partition LogicalRepRelMapEntry's
+ *
+ * Unlike relation map cache, this is keyed by partition OID, not remote
+ * relation OID, because we only have to use this cache in the case where
+ * partitions are not directly mapped to any remote relation, such as when
+ * replication is occurring with one of their ancestors as target.
+ */
+
+/*
+ * Relcache invalidation callback
+ */
+static void
+logicalrep_partmap_invalidate_cb(Datum arg, Oid reloid)
+{
+	LogicalRepRelMapEntry *entry;
+
+	/* Just to be sure. */
+	if (LogicalRepPartMap == NULL)
+		return;
+
+	if (reloid != InvalidOid)
+	{
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		/* TODO, use inverse lookup hashtable? */
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+		{
+			if (entry->localreloid == reloid)
+			{
+				entry->localreloid = InvalidOid;
+				hash_seq_term(&status);
+				break;
+			}
+		}
+	}
+	else
+	{
+		/* invalidate all cache entries */
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+			entry->localreloid = InvalidOid;
+	}
+}
+
+/*
+ * Initialize the partition map cache.
+ */
+static void
+logicalrep_partmap_init(void)
+{
+	HASHCTL		ctl;
+
+	if (!LogicalRepRelMapContext)
+		LogicalRepRelMapContext =
+			AllocSetContextCreate(CacheMemoryContext,
+								  "LogicalRepPartMapContext",
+								  ALLOCSET_DEFAULT_SIZES);
+
+	/* Initialize the relation hash table. */
+	MemSet(&ctl, 0, sizeof(ctl));
+	ctl.keysize = sizeof(Oid);	/* partition OID */
+	ctl.entrysize = sizeof(LogicalRepRelMapEntry);
+	ctl.hcxt = LogicalRepRelMapContext;
+
+	LogicalRepPartMap = hash_create("logicalrep partition map cache", 64, &ctl,
+								   HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+	/* Watch for invalidation events. */
+	CacheRegisterRelcacheCallback(logicalrep_partmap_invalidate_cb,
+								  (Datum) 0);
+}
+
+/*
+ * logicalrep_partition_open
+ *
+ * Returned entry reuses most of the values of the root table's entry, save
+ * the attribute map, which can be different for the partition.
+ *
+ * Note there's no logicalrep_partition_close, because the caller closes
+ * the component relation.
+ */
+LogicalRepRelMapEntry *
+logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map)
+{
+	LogicalRepRelMapEntry *entry;
+	LogicalRepRelation *remoterel = &root->remoterel;
+	Oid			partOid = RelationGetRelid(partrel);
+	AttrMap	   *attrmap = root->attrmap;
+	bool		found;
+	int			i;
+	MemoryContext oldctx;
+
+	if (LogicalRepPartMap == NULL)
+		logicalrep_partmap_init();
+
+	/* Search for existing entry. */
+	entry = hash_search(LogicalRepPartMap, (void *) &partOid,
+						HASH_ENTER, &found);
+
+	if (found)
+		return entry;
+
+	memset(entry, 0, sizeof(LogicalRepRelMapEntry));
+
+	/* Make cached copy of the data */
+	oldctx = MemoryContextSwitchTo(LogicalRepRelMapContext);
+
+	/* Remote relation is used as-is from the root's entry. */
+	entry->remoterel.remoteid = remoterel->remoteid;
+	entry->remoterel.nspname = pstrdup(remoterel->nspname);
+	entry->remoterel.relname = pstrdup(remoterel->relname);
+	entry->remoterel.natts = remoterel->natts;
+	entry->remoterel.attnames = palloc(remoterel->natts * sizeof(char *));
+	entry->remoterel.atttyps = palloc(remoterel->natts * sizeof(Oid));
+	for (i = 0; i < remoterel->natts; i++)
+	{
+		entry->remoterel.attnames[i] = pstrdup(remoterel->attnames[i]);
+		entry->remoterel.atttyps[i] = remoterel->atttyps[i];
+	}
+	entry->remoterel.replident = remoterel->replident;
+	entry->remoterel.attkeys = bms_copy(remoterel->attkeys);
+
+	entry->localrel = partrel;
+	entry->localreloid = partOid;
+
+	/*
+	 * If the partition's attributes don't match the root relation's, we'll
+	 * need to make a new attrmap which maps partition attribute numbers to
+	 * remoterel's, instead of the original which maps root relation's attribute
+	 * numbers to remoterel's.
+	 *
+	 * Note that 'map' which comes from the tuple routing data structure
+	 * contains 1-based attribute numbers (of the parent relation).  However,
+	 * the map in 'entry', a logical replication data structure, contains
+	 * 0-based attribute numbers (of the remote relation).
+	 */
+	if (map)
+	{
+		AttrNumber	attno;
+
+		entry->attrmap = make_attrmap(map->maplen);
+		for (attno = 0; attno < entry->attrmap->maplen; attno++)
+		{
+			AttrNumber	root_attno = map->attnums[attno];
+
+			entry->attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
+		}
+	}
+	else
+		entry->attrmap = attrmap;
+
+	entry->updatable = root->updatable;
+
+	/* state and statelsn are left set to 0. */
+	MemoryContextSwitchTo(oldctx);
+
+	return entry;
+}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index a60c666..c27d970 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -762,7 +762,6 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
-	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 51c0278..29a84b3 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -29,11 +29,14 @@
 #include "access/xlog_internal.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_subscription.h"
 #include "catalog/pg_subscription_rel.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "executor/nodeModifyTable.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
@@ -126,6 +129,12 @@ static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
 									LogicalRepRelation *remoterel,
 									TupleTableSlot *remoteslot,
 									TupleTableSlot **localslot);
+static void apply_handle_tuple_routing(ResultRelInfo *relinfo,
+									   EState *estate,
+									   TupleTableSlot *remoteslot,
+									   LogicalRepTupleData *newtup,
+									   LogicalRepRelMapEntry *relmapentry,
+									   CmdType operation);
 
 /*
  * Should this worker apply changes for given relation.
@@ -636,9 +645,13 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_insert_internal(estate->es_result_relation_info, estate,
-								 remoteslot);
+	/* For a partitioned table, insert the tuple into a partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_INSERT);
+	else
+		apply_handle_insert_internal(estate->es_result_relation_info, estate,
+									 remoteslot);
 
 	PopActiveSnapshot();
 
@@ -767,9 +780,13 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_update_internal(estate->es_result_relation_info, estate,
-								 remoteslot, &newtup, rel);
+	/* For a partitioned table, apply update to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, &newtup, rel, CMD_UPDATE);
+	else
+		apply_handle_update_internal(estate->es_result_relation_info, estate,
+									 remoteslot, &newtup, rel);
 
 	PopActiveSnapshot();
 
@@ -886,9 +903,13 @@ apply_handle_delete(StringInfo s)
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_delete_internal(estate->es_result_relation_info, estate,
-								 remoteslot, &rel->remoterel);
+	/* For a partitioned table, apply delete to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_DELETE);
+	else
+		apply_handle_delete_internal(estate->es_result_relation_info, estate,
+									 remoteslot, &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -975,6 +996,226 @@ FindReplTupleInLocalRel(EState *estate, Relation localrel,
 }
 
 /*
+ * This handles insert, update, delete on a partitioned table.
+ */
+static void
+apply_handle_tuple_routing(ResultRelInfo *relinfo,
+						   EState *estate,
+						   TupleTableSlot *remoteslot,
+						   LogicalRepTupleData *newtup,
+						   LogicalRepRelMapEntry *relmapentry,
+						   CmdType operation)
+{
+	Relation	parentrel = relinfo->ri_RelationDesc;
+	ModifyTableState *mtstate = NULL;
+	PartitionTupleRouting *proute = NULL;
+	ResultRelInfo *partrelinfo;
+	Relation	partrel;
+	TupleTableSlot *remoteslot_part;
+	PartitionRoutingInfo *partinfo;
+	TupleConversionMap *map;
+	MemoryContext oldctx;
+
+	/* ModifyTableState is needed for ExecFindPartition(). */
+	mtstate = makeNode(ModifyTableState);
+	mtstate->ps.plan = NULL;
+	mtstate->ps.state = estate;
+	mtstate->operation = operation;
+	mtstate->resultRelInfo = relinfo;
+	proute = ExecSetupPartitionTupleRouting(estate, mtstate, parentrel);
+
+	/*
+	 * Find the partition to which the "search tuple" belongs.
+	 */
+	Assert(remoteslot != NULL);
+	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+	partrelinfo = ExecFindPartition(mtstate, relinfo, proute,
+									remoteslot, estate);
+	Assert(partrelinfo != NULL);
+	partrel = partrelinfo->ri_RelationDesc;
+
+	/*
+	 * To perform any of the operations below, the tuple must match the
+	 * partition's rowtype. Convert if needed or just copy, using a dedicated
+	 * slot to store the tuple in any case.
+	 */
+	partinfo = partrelinfo->ri_PartitionInfo;
+	remoteslot_part = partinfo->pi_PartitionTupleSlot;
+	if (remoteslot_part == NULL)
+		remoteslot_part = table_slot_create(partrel, &estate->es_tupleTable);
+	map = partinfo->pi_RootToPartitionMap;
+	if (map != NULL)
+		remoteslot_part = execute_attr_map_slot(map->attrMap, remoteslot,
+												remoteslot_part);
+	else
+	{
+		remoteslot_part = ExecCopySlot(remoteslot_part, remoteslot);
+		slot_getallattrs(remoteslot_part);
+	}
+	MemoryContextSwitchTo(oldctx);
+
+	estate->es_result_relation_info = partrelinfo;
+	switch (operation)
+	{
+		case CMD_INSERT:
+			apply_handle_insert_internal(partrelinfo, estate,
+										 remoteslot_part);
+			break;
+
+		case CMD_DELETE:
+			apply_handle_delete_internal(partrelinfo, estate,
+										 remoteslot_part,
+										 &relmapentry->remoterel);
+			break;
+
+		case CMD_UPDATE:
+			/*
+			 * For UPDATE, depending on whether or not the updated tuple
+			 * satisfies the partition's constraint, perform a simple UPDATE
+			 * of the partition or move the updated tuple into a different
+			 * suitable partition.
+			 */
+			{
+				AttrMap	   *attrmap = map ? map->attrMap : NULL;
+				LogicalRepRelMapEntry *part_entry;
+				TupleTableSlot *localslot;
+				ResultRelInfo *partrelinfo_new;
+				TupleTableSlot *remoteslot_new;
+				bool		found;
+
+				part_entry = logicalrep_partition_open(relmapentry, partrel,
+													   attrmap);
+
+				/* Get the matching local tuple from the partition. */
+				found = FindReplTupleInLocalRel(estate, partrel,
+												&part_entry->remoterel,
+												remoteslot_part, &localslot);
+
+				remoteslot_new = table_slot_create(partrel,
+												   &estate->es_tupleTable);
+				oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+				if (found)
+				{
+					/* Apply the update.  */
+					slot_modify_cstrings(remoteslot_new, localslot,
+										 part_entry,
+										 newtup->values, newtup->changed);
+					MemoryContextSwitchTo(oldctx);
+				}
+				else
+				{
+					/*
+					 * The tuple to be updated could not be found.
+					 *
+					 * TODO what to do here, change the log level to LOG
+					 * perhaps?
+					 */
+					elog(DEBUG1,
+						 "logical replication did not find row for update "
+						 "in replication target relation \"%s\"",
+						 RelationGetRelationName(partrel));
+				}
+
+				/*
+				 * Does the updated tuple still satisfy the current
+				 * partition's constraint?
+				 */
+				if (partrelinfo->ri_PartitionCheck == NULL ||
+					ExecPartitionCheck(partrelinfo, remoteslot_new, estate,
+									   false))
+				{
+					/*
+					 * Yes, so simply UPDATE the partition.  We don't call
+					 * apply_handle_update_internal() here, which would
+					 * normally do the following work, to avoid repeating some
+					 * work already done above to find the local tuple in the
+					 * partition.
+					 */
+					EPQState epqstate;
+
+					EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+					ExecOpenIndices(partrelinfo, false);
+
+					EvalPlanQualSetSlot(&epqstate, remoteslot_new);
+					ExecSimpleRelationUpdate(estate, &epqstate, localslot,
+											 remoteslot_new);
+					ExecCloseIndices(partrelinfo);
+					EvalPlanQualEnd(&epqstate);
+				}
+				else
+				{
+					/* Move the tuple into the new partition. */
+
+					/*
+					 * New partition will be found using tuple routing, which
+					 * can only occur via the parent table.  We might need
+					 * to convert the tuple to the parent's rowtype.  Note that
+					 * this is the tuple found in the partition, not the
+					 * original search tuple received by this function.
+					 */
+					if (map)
+					{
+						TupleConversionMap *PartitionToRootMap =
+							convert_tuples_by_name(RelationGetDescr(partrel),
+												   RelationGetDescr(parentrel));
+						remoteslot =
+							execute_attr_map_slot(PartitionToRootMap->attrMap,
+												  remoteslot_new, remoteslot);
+					}
+					else
+					{
+						remoteslot = ExecCopySlot(remoteslot, remoteslot_new);
+						slot_getallattrs(remoteslot);
+					}
+
+
+					/* Find the new partition. */
+					oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+					partrelinfo_new = ExecFindPartition(mtstate, relinfo,
+														proute, remoteslot,
+														estate);
+					MemoryContextSwitchTo(oldctx);
+					Assert(partrelinfo_new != partrelinfo);
+
+					/* DELETE old tuple from the old partition. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_delete_internal(partrelinfo, estate,
+												 remoteslot_part,
+												 &relmapentry->remoterel);
+
+					/* INSERT new tuple into the new partition. */
+
+					/*
+					 * Convert the replacement tuple to match the destination
+					 * partition rowtype.
+					 */
+					oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+					partinfo = partrelinfo_new->ri_PartitionInfo;
+					map = partinfo->pi_RootToPartitionMap;
+					if (map != NULL)
+					{
+						remoteslot_new = partinfo->pi_PartitionTupleSlot;
+						remoteslot_new = execute_attr_map_slot(map->attrMap,
+															   remoteslot,
+															   remoteslot_new);
+					}
+					MemoryContextSwitchTo(oldctx);
+					estate->es_result_relation_info = partrelinfo_new;
+					apply_handle_insert_internal(partrelinfo_new, estate,
+												 remoteslot_new);
+				}
+			}
+			break;
+
+		default:
+			elog(ERROR, "unrecognized CmdType: %d", (int) operation);
+			break;
+	}
+
+	ExecCleanupTupleRouting(mtstate, proute);
+}
+
+/*
  * Handle TRUNCATE message.
  *
  * TODO: FDW support
@@ -987,6 +1228,7 @@ apply_handle_truncate(StringInfo s)
 	List	   *remote_relids = NIL;
 	List	   *remote_rels = NIL;
 	List	   *rels = NIL;
+	List	   *part_rels = NIL;
 	List	   *relids = NIL;
 	List	   *relids_logged = NIL;
 	ListCell   *lc;
@@ -1016,6 +1258,52 @@ apply_handle_truncate(StringInfo s)
 		relids = lappend_oid(relids, rel->localreloid);
 		if (RelationIsLogicallyLogged(rel->localrel))
 			relids_logged = lappend_oid(relids_logged, rel->localreloid);
+
+		/*
+		 * Truncate partitions if we got a message to truncate a partitioned
+		 * table.
+		 */
+		if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		{
+			ListCell   *child;
+			List	   *children = find_all_inheritors(rel->localreloid,
+													   RowExclusiveLock,
+													   NULL);
+
+			foreach(child, children)
+			{
+				Oid			childrelid = lfirst_oid(child);
+				Relation	childrel;
+
+				if (list_member_oid(relids, childrelid))
+					continue;
+
+				/* find_all_inheritors already got lock */
+				childrel = table_open(childrelid, NoLock);
+
+				/*
+				 * It is possible that the parent table has children that are
+				 * temp tables of other backends.  We cannot safely access
+				 * such tables (because of buffering issues), and the best
+				 * thing to do is to silently ignore them.  Note that this
+				 * check is the same as one of the checks done in
+				 * truncate_check_activity() called below, still it is kept
+				 * here for simplicity.
+				 */
+				if (RELATION_IS_OTHER_TEMP(childrel))
+				{
+					table_close(childrel, RowExclusiveLock);
+					continue;
+				}
+
+				rels = lappend(rels, childrel);
+				part_rels = lappend(part_rels, childrel);
+				relids = lappend_oid(relids, childrelid);
+				/* Log this relation only if needed for logical decoding */
+				if (RelationIsLogicallyLogged(childrel))
+					relids_logged = lappend_oid(relids_logged, childrelid);
+			}
+		}
 	}
 
 	/*
@@ -1031,6 +1319,12 @@ apply_handle_truncate(StringInfo s)
 
 		logicalrep_rel_close(rel, NoLock);
 	}
+	foreach(lc, part_rels)
+	{
+		Relation rel = lfirst(lc);
+
+		table_close(rel, NoLock);
+	}
 
 	CommandCounterIncrement();
 }
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 9971a80..4650b4f 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -34,6 +34,8 @@ extern void logicalrep_relmap_update(LogicalRepRelation *remoterel);
 
 extern LogicalRepRelMapEntry *logicalrep_rel_open(LogicalRepRelId remoteid,
 												  LOCKMODE lockmode);
+extern LogicalRepRelMapEntry *logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map);
 extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
 								 LOCKMODE lockmode);
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index ea5812c..fe3275a 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 15;
+use Test::More tests => 17;
 
 # setup
 
@@ -35,6 +35,8 @@ $node_publisher->safe_psql('postgres',
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
 $node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_def PARTITION OF tab1 DEFAULT");
+$node_publisher->safe_psql('postgres',
 	"ALTER PUBLICATION pub1 ADD TABLE tab1, tab1_1");
 
 # subscriber1
@@ -42,10 +44,21 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
+
 $node_subscriber1->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
 $node_subscriber1->safe_psql('postgres',
-	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6) PARTITION BY LIST (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_1 (c text, b text, a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab1_2 ATTACH PARTITION tab1_2_1 FOR VALUES IN (5)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_2 (a int NOT NULL, c text, b text)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab1_2 ATTACH PARTITION tab1_2_2 FOR VALUES IN (6)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_def PARTITION OF tab1 (c DEFAULT 'sub1_tab1') DEFAULT");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
 
@@ -57,6 +70,8 @@ $node_subscriber2->safe_psql('postgres',
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
 $node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_def (a int PRIMARY KEY, b text, c text DEFAULT 'sub2_tab1_def')");
+$node_subscriber2->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub_all");
 
 # Wait for initial sync of all subscriptions
@@ -74,13 +89,15 @@ $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_1 (a) VALUES (3)");
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (0)");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
 
 my $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
-is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
+is($result, qq(sub1_tab1|4|0|5), 'inserts into tab1 and its partitions replicated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
@@ -90,43 +107,56 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
 
-# update (no partition change)
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_def GROUP BY 1");
+is($result, qq(sub2_tab1_def|1|0|0), 'inserts into tab1_def replicated');
+
+# update (replicated as update)
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 2 WHERE a = 1");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 5");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
 
+# update of tab1_2 is in turn applied as delete from tab1_2_1 and insert into tab1_2_2
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
-is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
+is($result, qq(sub1_tab1|4|0|6), 'update of tab1_1, tab1_2 replicated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|2|3), 'update of tab1_1 replicated');
 
-# update (partition changes)
+# update (replicated as delete+insert)
 $node_publisher->safe_psql('postgres',
-	"UPDATE tab1 SET a = 6 WHERE a = 2");
+	"UPDATE tab1 SET a = 1 WHERE a = 0");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 5 WHERE a = 1");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
-is($result, qq(sub1_tab1|3|3|6), 'update of tab1 replicated');
+is($result, qq(sub1_tab1|4|2|6), 'update of tab1 (delete from tab1_def + insert into tab1_1) replicated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
-is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
+is($result, qq(sub2_tab1_1|2|2|3), 'delete from tab1_1 replicated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
 
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_def");
+is($result, qq(0||), 'delete from tab1_def replicated');
+
 # delete
 $node_publisher->safe_psql('postgres',
-	"DELETE FROM tab1 WHERE a IN (3, 5)");
+	"DELETE FROM tab1 WHERE a IN (2, 3, 5)");
 $node_publisher->safe_psql('postgres',
 	"DELETE FROM tab1_2");
 
@@ -175,4 +205,4 @@ $result = $node_subscriber1->safe_psql('postgres',
 is($result, qq(0||), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
-is($result, qq(0||), 'truncate of tab1_1 replicated');
+is($result, qq(0||), 'truncate of tab1 replicated');
-- 
1.8.3.1

#53 Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#52)
Re: adding partitioned tables to publications

On 2020-03-30 17:42, Amit Langote wrote:

I have updated the comments in apply_handle_tuple_routing() (see 0002)
to better explain what's going on with UPDATE handling. I also
rearranged the tests a bit for clarity.

Attached updated patches.

Test coverage for 0002 is still a bit lacking. Please do a coverage
build yourself and get at least one test case to exercise every branch
in apply_handle_tuple_routing(). Right now, I don't see any coverage
for updates without attribute remapping and updates that don't move to a
new partition.

Also, the coverage report reveals that in logicalrep_partmap_init(), the
patch is mistakenly initializing LogicalRepRelMapContext instead of
LogicalRepPartMapContext. (Hmm, how does it even work like that?)

I think apart from some of these details, this patch is okay, but I
don't have deep experience in the partitioning code, I can just see that
it looks like other code elsewhere. Perhaps someone with more knowledge
can give this a look as well.

About patch 0003, I was talking to some people offline about the name of
the option. There was some confusion about using the term "schema".
How about naming it "publish_via_partition_root", which also matches the
name of the analogous option in pg_dump.

Code coverage here could also be improved. A lot of the new code in
pgoutput.c is not tested.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#54 Petr Jelinek
petr@2ndquadrant.com
In reply to: Peter Eisentraut (#53)
Re: adding partitioned tables to publications

Hi,

On 02/04/2020 14:23, Peter Eisentraut wrote:

On 2020-03-30 17:42, Amit Langote wrote:

I have updated the comments in apply_handle_tuple_routing() (see 0002)
to better explain what's going on with UPDATE handling.  I also
rearranged the tests a bit for clarity.

Attached updated patches.

Also, the coverage report reveals that in logicalrep_partmap_init(), the
patch is mistakenly initializing LogicalRepRelMapContext instead of
LogicalRepPartMapContext.  (Hmm, how does it even work like that?)

It works because it's just a MemoryContext and it's long lived. I wonder
if the fix here is to simply remove the LogicalRepPartMapContext...
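
Petr's point — that nothing observable breaks as long as both contexts are
long-lived — can be sketched in plain C, with two process-lifetime bump
arenas standing in for MemoryContexts (the names and the Arena type here are
illustrative only, not PostgreSQL's API):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy stand-ins for two long-lived memory contexts.  In PostgreSQL both
 * LogicalRepRelMapContext and LogicalRepPartMapContext hang off
 * CacheMemoryContext and live as long as the backend, so allocating a
 * partition-map entry out of the "wrong" one is invisible at runtime:
 * the memory is never freed early either way.
 */
typedef struct Arena
{
	char		buf[4096];
	size_t		used;
} Arena;

static Arena relmap_ctx;		/* meant for the relation map  */
static Arena partmap_ctx;		/* meant for the partition map */

static void *
arena_alloc(Arena *a, size_t n)
{
	void	   *p = a->buf + a->used;

	a->used += n;
	return p;
}

/* Cache a name, mistakenly allocating from relmap_ctx. */
static char *
cache_partition_name(const char *name)
{
	char	   *copy = arena_alloc(&relmap_ctx, strlen(name) + 1);

	strcpy(copy, name);
	return copy;				/* valid for the process lifetime anyway */
}
```

The bug only shows up in accounting: everything charged to the partition map
lands in the relation map's context instead.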

I think apart from some of these details, this patch is okay, but I
don't have deep experience in the partitioning code, I can just see that
it looks like other code elsewhere.  Perhaps someone with more knowledge
can give this a look as well.

FWIW it looks okay to me as well from perspective of somebody who
implemented something similar outside of core.

About patch 0003, I was talking to some people offline about the name of
the option.  There was some confusion about using the term "schema". How
about naming it "publish_via_partition_root", which also matches the
name of the analogous option in pg_dump.

+1 (disclaimer: I was one of the people who discussed this offline)

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/

#55 Amit Langote
amitlangote09@gmail.com
In reply to: Petr Jelinek (#54)
Re: adding partitioned tables to publications

On Fri, Apr 3, 2020 at 4:52 PM Petr Jelinek <petr@2ndquadrant.com> wrote:

On 02/04/2020 14:23, Peter Eisentraut wrote:

On 2020-03-30 17:42, Amit Langote wrote:

I have updated the comments in apply_handle_tuple_routing() (see 0002)
to better explain what's going on with UPDATE handling. I also
rearranged the tests a bit for clarity.

Attached updated patches.

Also, the coverage report reveals that in logicalrep_partmap_init(), the
patch is mistakenly initializing LogicalRepRelMapContext instead of
LogicalRepPartMapContext. (Hmm, how does it even work like that?)

It works because it's just a MemoryContext and it's long lived. I wonder
if the fix here is to simply remove the LogicalRepPartMapContext...

Actually, there is no LogicalRepPartMapContext in the patches posted
so far, but I have decided to add it in the updated patch. One
advantage, besides avoiding confusion, is that it might help to tell
memory consumed by the partitions apart from that consumed by the
actual replication targets.

I think apart from some of these details, this patch is okay, but I
don't have deep experience in the partitioning code, I can just see that
it looks like other code elsewhere. Perhaps someone with more knowledge
can give this a look as well.

FWIW it looks okay to me as well from perspective of somebody who
implemented something similar outside of core.

Thanks for giving it a look.

About patch 0003, I was talking to some people offline about the name of
the option. There was some confusion about using the term "schema". How
about naming it "publish_via_partition_root", which also matches the
name of the analogous option in pg_dump.

+1 (disclaimer: I was one of the people who discussed this offline)

Okay, I like that too.

I am checking test coverage at the moment and should have the patches
ready by sometime later today.

--
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

#56 Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#55)
2 attachment(s)
Re: adding partitioned tables to publications

On Fri, Apr 3, 2020 at 6:34 PM Amit Langote <amitlangote09@gmail.com> wrote:

I am checking test coverage at the moment and should have the patches
ready by sometime later today.

Attached updated patches.

I confirmed using a coverage build that all the new code in
logical/worker.c due to 0002 is now covered. For some reason, the
coverage report for pgoutput.c doesn't say the same about 0003's
changes, although I doubt that result. It seems strange to believe that
*none* of the new code is tested. I even checked by adding debugging
elog()s next to the lines that the coverage report says aren't
exercised, and they do fire, which tells me that's not true. Perhaps my
coverage build is somehow getting messed up, so it would be nice if
someone with a reliable coverage setup could confirm one way or the
other. I will continue to check what's wrong.
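For reference, the usual way to produce such a report from a PostgreSQL
source tree (assuming a gcc/gcov/lcov toolchain is installed) is roughly
the following build-configuration sketch:

```
# Build with coverage instrumentation and run the tests of interest.
./configure --enable-coverage
make
make -C src/test/subscription check

# Generate the HTML report; it lands in ./coverage/index.html.
make coverage-html
```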

I fixed a couple of bugs in 0002. One of them was that the
"partition map" hash table in logical/relation.c didn't really work,
so logicalrep_partition_open() would always create a new entry.

In 0003, changed the publication parameter name to publish_via_partition_root.

--
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v16-0001-Add-subscription-support-to-replicate-into-parti.patchapplication/octet-stream; name=v16-0001-Add-subscription-support-to-replicate-into-parti.patchDownload
From d6392bd1e5e6f8a7d462604a6c012d3f164aa075 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlangote09@gmail.com>
Date: Thu, 23 Jan 2020 11:49:01 +0900
Subject: [PATCH v16 1/2] Add subscription support to replicate into
 partitioned tables

Mainly, this adds support code in logical/worker.c for applying
replicated operations whose target is a partitioned table to its
relevant partitions.
---
 src/backend/executor/execReplication.c      |  14 +-
 src/backend/replication/logical/relation.c  | 189 ++++++++++++++++
 src/backend/replication/logical/tablesync.c |   1 -
 src/backend/replication/logical/worker.c    | 319 +++++++++++++++++++++++++++-
 src/include/replication/logicalrelation.h   |   2 +
 src/test/subscription/t/013_partition.pl    |  70 ++++--
 6 files changed, 557 insertions(+), 38 deletions(-)

diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 7194bec..dc8a01a 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -594,17 +594,9 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 						 const char *relname)
 {
 	/*
-	 * We currently only support writing to regular tables.  However, give a
-	 * more specific error for partitioned and foreign tables.
+	 * Give a more specific error for foreign tables.
 	 */
-	if (relkind == RELKIND_PARTITIONED_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
-						nspname, relname),
-				 errdetail("\"%s.%s\" is a partitioned table.",
-						   nspname, relname)));
-	else if (relkind == RELKIND_FOREIGN_TABLE)
+	if (relkind == RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
@@ -612,7 +604,7 @@ CheckSubscriptionRelkind(char relkind, const char *nspname,
 				 errdetail("\"%s.%s\" is a foreign table.",
 						   nspname, relname)));
 
-	if (relkind != RELKIND_RELATION)
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot use relation \"%s.%s\" as logical replication target",
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 3d7291b..53cba08 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -35,6 +35,24 @@ static MemoryContext LogicalRepRelMapContext = NULL;
 static HTAB *LogicalRepRelMap = NULL;
 static HTAB *LogicalRepTypMap = NULL;
 
+/*
+ * Partition map (LogicalRepPartMap)
+ *
+ * When a partitioned table is used as replication target, replicated
+ * operations are actually performed on its leaf partitions, which requires
+ * the partitions to also be mapped to the remote relation.  Parent's entry
+ * (LogicalRepRelMapEntry) cannot be used as-is for all partitions, because
+ * individual partitions may have different attribute numbers, which means
+ * attribute mappings to remote relation's attributes must be maintained
+ * separately for each partition.
+ */
+static MemoryContext LogicalRepPartMapContext = NULL;
+static HTAB *LogicalRepPartMap = NULL;
+typedef struct LogicalRepPartMapEntry
+{
+	Oid		partoid;	/* LogicalRepPartMap's key */
+	LogicalRepRelMapEntry relmapentry;
+} LogicalRepPartMapEntry;
 
 /*
  * Relcache invalidation callback for our relation map cache.
@@ -472,3 +490,174 @@ logicalrep_typmap_gettypname(Oid remoteid)
 	Assert(OidIsValid(entry->remoteid));
 	return psprintf("%s.%s", entry->nspname, entry->typname);
 }
+
+/*
+ * Partition cache: look up partition LogicalRepRelMapEntry's
+ *
+ * Unlike relation map cache, this is keyed by partition OID, not remote
+ * relation OID, because we only have to use this cache in the case where
+ * partitions are not directly mapped to any remote relation, such as when
+ * replication is occurring with one of their ancestors as target.
+ */
+
+/*
+ * Relcache invalidation callback
+ */
+static void
+logicalrep_partmap_invalidate_cb(Datum arg, Oid reloid)
+{
+	LogicalRepRelMapEntry *entry;
+
+	/* Just to be sure. */
+	if (LogicalRepPartMap == NULL)
+		return;
+
+	if (reloid != InvalidOid)
+	{
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		/* TODO, use inverse lookup hashtable? */
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+		{
+			if (entry->localreloid == reloid)
+			{
+				entry->localreloid = InvalidOid;
+				hash_seq_term(&status);
+				break;
+			}
+		}
+	}
+	else
+	{
+		/* invalidate all cache entries */
+		HASH_SEQ_STATUS status;
+
+		hash_seq_init(&status, LogicalRepPartMap);
+
+		while ((entry = (LogicalRepRelMapEntry *) hash_seq_search(&status)) != NULL)
+			entry->localreloid = InvalidOid;
+	}
+}
+
+/*
+ * Initialize the partition map cache.
+ */
+static void
+logicalrep_partmap_init(void)
+{
+	HASHCTL		ctl;
+
+	if (!LogicalRepPartMapContext)
+		LogicalRepPartMapContext =
+			AllocSetContextCreate(CacheMemoryContext,
+								  "LogicalRepPartMapContext",
+								  ALLOCSET_DEFAULT_SIZES);
+
+	/* Initialize the relation hash table. */
+	MemSet(&ctl, 0, sizeof(ctl));
+	ctl.keysize = sizeof(Oid);	/* partition OID */
+	ctl.entrysize = sizeof(LogicalRepPartMapEntry);
+	ctl.hcxt = LogicalRepPartMapContext;
+
+	LogicalRepPartMap = hash_create("logicalrep partition map cache", 64, &ctl,
+								   HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+	/* Watch for invalidation events. */
+	CacheRegisterRelcacheCallback(logicalrep_partmap_invalidate_cb,
+								  (Datum) 0);
+}
+
+/*
+ * logicalrep_partition_open
+ *
+ * Returned entry reuses most of the values of the root table's entry, save
+ * the attribute map, which can be different for the partition.
+ *
+ * Note there's no logicalrep_partition_close, because the caller closes
+ * the component relation.
+ */
+LogicalRepRelMapEntry *
+logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map)
+{
+	LogicalRepRelMapEntry *entry;
+	LogicalRepPartMapEntry *part_entry;
+	LogicalRepRelation *remoterel = &root->remoterel;
+	Oid			partOid = RelationGetRelid(partrel);
+	AttrMap	   *attrmap = root->attrmap;
+	bool		found;
+	int			i;
+	MemoryContext oldctx;
+
+	if (LogicalRepPartMap == NULL)
+		logicalrep_partmap_init();
+
+	/* Search for existing entry. */
+	part_entry = (LogicalRepPartMapEntry *) hash_search(LogicalRepPartMap,
+														(void *) &partOid,
+														HASH_ENTER, &found);
+
+	if (found)
+		return &part_entry->relmapentry;
+
+	memset(part_entry, 0, sizeof(LogicalRepPartMapEntry));
+
+	/* Switch to longer-lived context. */
+	oldctx = MemoryContextSwitchTo(LogicalRepPartMapContext);
+
+	part_entry->partoid = partOid;
+
+	/* Remote relation is used as-is from the root entry. */
+	entry = &part_entry->relmapentry;
+	entry->remoterel.remoteid = remoterel->remoteid;
+	entry->remoterel.nspname = pstrdup(remoterel->nspname);
+	entry->remoterel.relname = pstrdup(remoterel->relname);
+	entry->remoterel.natts = remoterel->natts;
+	entry->remoterel.attnames = palloc(remoterel->natts * sizeof(char *));
+	entry->remoterel.atttyps = palloc(remoterel->natts * sizeof(Oid));
+	for (i = 0; i < remoterel->natts; i++)
+	{
+		entry->remoterel.attnames[i] = pstrdup(remoterel->attnames[i]);
+		entry->remoterel.atttyps[i] = remoterel->atttyps[i];
+	}
+	entry->remoterel.replident = remoterel->replident;
+	entry->remoterel.attkeys = bms_copy(remoterel->attkeys);
+
+	entry->localrel = partrel;
+	entry->localreloid = partOid;
+
+	/*
+	 * If the partition's attributes don't match the root relation's, we'll
+	 * need to make a new attrmap which maps partition attribute numbers to
+	 * remoterel's, instead of the original, which maps the root relation's
+	 * attribute numbers to remoterel's.
+	 *
+	 * Note that 'map' which comes from the tuple routing data structure
+	 * contains 1-based attribute numbers (of the parent relation).  However,
+	 * the map in 'entry', a logical replication data structure, contains
+	 * 0-based attribute numbers (of the remote relation).
+	 */
+	if (map)
+	{
+		AttrNumber	attno;
+
+		entry->attrmap = make_attrmap(map->maplen);
+		for (attno = 0; attno < entry->attrmap->maplen; attno++)
+		{
+			AttrNumber	root_attno = map->attnums[attno];
+
+			entry->attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
+		}
+	}
+	else
+		entry->attrmap = attrmap;
+
+	entry->updatable = root->updatable;
+
+	/* state and statelsn are left set to 0. */
+	MemoryContextSwitchTo(oldctx);
+
+	return entry;
+}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index a60c666..c27d970 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -762,7 +762,6 @@ copy_table(Relation rel)
 	/* Map the publisher relation to local one. */
 	relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
 	Assert(rel == relmapentry->localrel);
-	Assert(relmapentry->localrel->rd_rel->relkind == RELKIND_RELATION);
 
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 673ebd2..93bbca2 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -29,11 +29,14 @@
 #include "access/xlog_internal.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
+#include "catalog/pg_inherits.h"
 #include "catalog/pg_subscription.h"
 #include "catalog/pg_subscription_rel.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
+#include "executor/execPartition.h"
 #include "executor/nodeModifyTable.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
@@ -126,6 +129,12 @@ static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
 									LogicalRepRelation *remoterel,
 									TupleTableSlot *remoteslot,
 									TupleTableSlot **localslot);
+static void apply_handle_tuple_routing(ResultRelInfo *relinfo,
+									   EState *estate,
+									   TupleTableSlot *remoteslot,
+									   LogicalRepTupleData *newtup,
+									   LogicalRepRelMapEntry *relmapentry,
+									   CmdType operation);
 
 /*
  * Should this worker apply changes for given relation.
@@ -636,9 +645,13 @@ apply_handle_insert(StringInfo s)
 	slot_fill_defaults(rel, estate, remoteslot);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_insert_internal(estate->es_result_relation_info, estate,
-								 remoteslot);
+	/* For a partitioned table, insert the tuple into a partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_INSERT);
+	else
+		apply_handle_insert_internal(estate->es_result_relation_info, estate,
+									 remoteslot);
 
 	PopActiveSnapshot();
 
@@ -767,9 +780,13 @@ apply_handle_update(StringInfo s)
 						has_oldtup ? oldtup.values : newtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_update_internal(estate->es_result_relation_info, estate,
-								 remoteslot, &newtup, rel);
+	/* For a partitioned table, apply update to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, &newtup, rel, CMD_UPDATE);
+	else
+		apply_handle_update_internal(estate->es_result_relation_info, estate,
+									 remoteslot, &newtup, rel);
 
 	PopActiveSnapshot();
 
@@ -886,9 +903,13 @@ apply_handle_delete(StringInfo s)
 	slot_store_cstrings(remoteslot, rel, oldtup.values);
 	MemoryContextSwitchTo(oldctx);
 
-	Assert(rel->localrel->rd_rel->relkind == RELKIND_RELATION);
-	apply_handle_delete_internal(estate->es_result_relation_info, estate,
-								 remoteslot, &rel->remoterel);
+	/* For a partitioned table, apply delete to correct partition. */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		apply_handle_tuple_routing(estate->es_result_relation_info, estate,
+								   remoteslot, NULL, rel, CMD_DELETE);
+	else
+		apply_handle_delete_internal(estate->es_result_relation_info, estate,
+									 remoteslot, &rel->remoterel);
 
 	PopActiveSnapshot();
 
@@ -976,6 +997,233 @@ FindReplTupleInLocalRel(EState *estate, Relation localrel,
 }
 
 /*
+ * This handles insert, update, delete on a partitioned table.
+ */
+static void
+apply_handle_tuple_routing(ResultRelInfo *relinfo,
+						   EState *estate,
+						   TupleTableSlot *remoteslot,
+						   LogicalRepTupleData *newtup,
+						   LogicalRepRelMapEntry *relmapentry,
+						   CmdType operation)
+{
+	Relation	parentrel = relinfo->ri_RelationDesc;
+	ModifyTableState *mtstate = NULL;
+	PartitionTupleRouting *proute = NULL;
+	ResultRelInfo *partrelinfo;
+	Relation	partrel;
+	TupleTableSlot *remoteslot_part;
+	PartitionRoutingInfo *partinfo;
+	TupleConversionMap *map;
+	MemoryContext oldctx;
+
+	/* ModifyTableState is needed for ExecFindPartition(). */
+	mtstate = makeNode(ModifyTableState);
+	mtstate->ps.plan = NULL;
+	mtstate->ps.state = estate;
+	mtstate->operation = operation;
+	mtstate->resultRelInfo = relinfo;
+	proute = ExecSetupPartitionTupleRouting(estate, mtstate, parentrel);
+
+	/*
+	 * Find the partition to which the "search tuple" belongs.
+	 */
+	Assert(remoteslot != NULL);
+	oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+	partrelinfo = ExecFindPartition(mtstate, relinfo, proute,
+									remoteslot, estate);
+	Assert(partrelinfo != NULL);
+	partrel = partrelinfo->ri_RelationDesc;
+
+	/*
+	 * To perform any of the operations below, the tuple must match the
+	 * partition's rowtype. Convert if needed or just copy, using a dedicated
+	 * slot to store the tuple in any case.
+	 */
+	partinfo = partrelinfo->ri_PartitionInfo;
+	remoteslot_part = partinfo->pi_PartitionTupleSlot;
+	if (remoteslot_part == NULL)
+		remoteslot_part = table_slot_create(partrel, &estate->es_tupleTable);
+	map = partinfo->pi_RootToPartitionMap;
+	if (map != NULL)
+		remoteslot_part = execute_attr_map_slot(map->attrMap, remoteslot,
+												remoteslot_part);
+	else
+	{
+		remoteslot_part = ExecCopySlot(remoteslot_part, remoteslot);
+		slot_getallattrs(remoteslot_part);
+	}
+	MemoryContextSwitchTo(oldctx);
+
+	estate->es_result_relation_info = partrelinfo;
+	switch (operation)
+	{
+		case CMD_INSERT:
+			apply_handle_insert_internal(partrelinfo, estate,
+										 remoteslot_part);
+			break;
+
+		case CMD_DELETE:
+			apply_handle_delete_internal(partrelinfo, estate,
+										 remoteslot_part,
+										 &relmapentry->remoterel);
+			break;
+
+		case CMD_UPDATE:
+			/*
+			 * For UPDATE, depending on whether or not the updated tuple
+			 * satisfies the partition's constraint, perform a simple UPDATE
+			 * of the partition or move the updated tuple into a different
+			 * suitable partition.
+			 */
+			{
+				AttrMap	   *attrmap = map ? map->attrMap : NULL;
+				LogicalRepRelMapEntry *part_entry;
+				TupleTableSlot *localslot;
+				ResultRelInfo *partrelinfo_new;
+				bool		found;
+
+				part_entry = logicalrep_partition_open(relmapentry, partrel,
+													   attrmap);
+
+				/* Get the matching local tuple from the partition. */
+				found = FindReplTupleInLocalRel(estate, partrel,
+												&part_entry->remoterel,
+												remoteslot_part, &localslot);
+
+				oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+				if (found)
+				{
+					/* Apply the update.  */
+					slot_modify_cstrings(remoteslot_part, localslot,
+										 part_entry,
+										 newtup->values, newtup->changed);
+					MemoryContextSwitchTo(oldctx);
+				}
+				else
+				{
+					/*
+					 * The tuple to be updated could not be found.
+					 *
+					 * TODO what to do here, change the log level to LOG
+					 * perhaps?
+					 */
+					elog(DEBUG1,
+						 "logical replication did not find row for update "
+						 "in replication target relation \"%s\"",
+						 RelationGetRelationName(partrel));
+				}
+
+				/*
+				 * Does the updated tuple still satisfy the current
+				 * partition's constraint?
+				 */
+				if (partrelinfo->ri_PartitionCheck == NULL ||
+					ExecPartitionCheck(partrelinfo, remoteslot_part, estate,
+									   false))
+				{
+					/*
+					 * Yes, so simply UPDATE the partition.  We don't call
+					 * apply_handle_update_internal() here, which would
+					 * normally do the following work, to avoid repeating some
+					 * work already done above to find the local tuple in the
+					 * partition.
+					 */
+					EPQState epqstate;
+
+					EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1);
+					ExecOpenIndices(partrelinfo, false);
+
+					EvalPlanQualSetSlot(&epqstate, remoteslot_part);
+					ExecSimpleRelationUpdate(estate, &epqstate, localslot,
+											 remoteslot_part);
+					ExecCloseIndices(partrelinfo);
+					EvalPlanQualEnd(&epqstate);
+				}
+				else
+				{
+					/* Move the tuple into the new partition. */
+
+					/*
+					 * New partition will be found using tuple routing, which
+					 * can only occur via the parent table.  We might need to
+					 * convert the tuple to the parent's rowtype.  Note that
+					 * this is the tuple found in the partition, not the
+					 * original search tuple received by this function.
+					 */
+					if (map)
+					{
+						TupleConversionMap *PartitionToRootMap =
+							convert_tuples_by_name(RelationGetDescr(partrel),
+												   RelationGetDescr(parentrel));
+						remoteslot =
+							execute_attr_map_slot(PartitionToRootMap->attrMap,
+												  remoteslot_part, remoteslot);
+					}
+					else
+					{
+						remoteslot = ExecCopySlot(remoteslot, remoteslot_part);
+						slot_getallattrs(remoteslot);
+					}
+
+
+					/* Find the new partition. */
+					oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+					partrelinfo_new = ExecFindPartition(mtstate, relinfo,
+														proute, remoteslot,
+														estate);
+					MemoryContextSwitchTo(oldctx);
+					Assert(partrelinfo_new != partrelinfo);
+
+					/* DELETE old tuple found in the old partition. */
+					estate->es_result_relation_info = partrelinfo;
+					apply_handle_delete_internal(partrelinfo, estate,
+												 localslot,
+												 &relmapentry->remoterel);
+
+					/* INSERT new tuple into the new partition. */
+
+					/*
+					 * Convert the replacement tuple to match the destination
+					 * partition rowtype.
+					 */
+					oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+					partrel = partrelinfo_new->ri_RelationDesc;
+					partinfo = partrelinfo_new->ri_PartitionInfo;
+					remoteslot_part = partinfo->pi_PartitionTupleSlot;
+					if (remoteslot_part == NULL)
+						remoteslot_part = table_slot_create(partrel,
+															&estate->es_tupleTable);
+					map = partinfo->pi_RootToPartitionMap;
+					if (map != NULL)
+					{
+						remoteslot_part = execute_attr_map_slot(map->attrMap,
+															   remoteslot,
+															   remoteslot_part);
+					}
+					else
+					{
+						remoteslot_part = ExecCopySlot(remoteslot_part,
+													   remoteslot);
+						slot_getallattrs(remoteslot_part);
+					}
+					MemoryContextSwitchTo(oldctx);
+					estate->es_result_relation_info = partrelinfo_new;
+					apply_handle_insert_internal(partrelinfo_new, estate,
+												 remoteslot_part);
+				}
+			}
+			break;
+
+		default:
+			elog(ERROR, "unrecognized CmdType: %d", (int) operation);
+			break;
+	}
+
+	ExecCleanupTupleRouting(mtstate, proute);
+}
+
+/*
  * Handle TRUNCATE message.
  *
  * TODO: FDW support
@@ -988,6 +1236,7 @@ apply_handle_truncate(StringInfo s)
 	List	   *remote_relids = NIL;
 	List	   *remote_rels = NIL;
 	List	   *rels = NIL;
+	List	   *part_rels = NIL;
 	List	   *relids = NIL;
 	List	   *relids_logged = NIL;
 	ListCell   *lc;
@@ -1017,6 +1266,52 @@ apply_handle_truncate(StringInfo s)
 		relids = lappend_oid(relids, rel->localreloid);
 		if (RelationIsLogicallyLogged(rel->localrel))
 			relids_logged = lappend_oid(relids_logged, rel->localreloid);
+
+		/*
+		 * Truncate partitions if we got a message to truncate a partitioned
+		 * table.
+		 */
+		if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		{
+			ListCell   *child;
+			List	   *children = find_all_inheritors(rel->localreloid,
+													   RowExclusiveLock,
+													   NULL);
+
+			foreach(child, children)
+			{
+				Oid			childrelid = lfirst_oid(child);
+				Relation	childrel;
+
+				if (list_member_oid(relids, childrelid))
+					continue;
+
+				/* find_all_inheritors already got lock */
+				childrel = table_open(childrelid, NoLock);
+
+				/*
+				 * It is possible that the parent table has children that are
+				 * temp tables of other backends.  We cannot safely access
+				 * such tables (because of buffering issues), and the best
+				 * thing to do is to silently ignore them.  Note that this
+				 * check is the same as one of the checks done in
+				 * truncate_check_activity() called below, still it is kept
+				 * here for simplicity.
+				 */
+				if (RELATION_IS_OTHER_TEMP(childrel))
+				{
+					table_close(childrel, RowExclusiveLock);
+					continue;
+				}
+
+				rels = lappend(rels, childrel);
+				part_rels = lappend(part_rels, childrel);
+				relids = lappend_oid(relids, childrelid);
+				/* Log this relation only if needed for logical decoding */
+				if (RelationIsLogicallyLogged(childrel))
+					relids_logged = lappend_oid(relids_logged, childrelid);
+			}
+		}
 	}
 
 	/*
@@ -1032,6 +1327,12 @@ apply_handle_truncate(StringInfo s)
 
 		logicalrep_rel_close(rel, NoLock);
 	}
+	foreach(lc, part_rels)
+	{
+		Relation rel = lfirst(lc);
+
+		table_close(rel, NoLock);
+	}
 
 	CommandCounterIncrement();
 }
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 9971a80..4650b4f 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -34,6 +34,8 @@ extern void logicalrep_relmap_update(LogicalRepRelation *remoterel);
 
 extern LogicalRepRelMapEntry *logicalrep_rel_open(LogicalRepRelId remoteid,
 												  LOCKMODE lockmode);
+extern LogicalRepRelMapEntry *logicalrep_partition_open(LogicalRepRelMapEntry *root,
+						  Relation partrel, AttrMap *map);
 extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
 								 LOCKMODE lockmode);
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index ea5812c..b0308dc 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 15;
+use Test::More tests => 17;
 
 # setup
 
@@ -33,19 +33,30 @@ $node_publisher->safe_psql('postgres',
 $node_publisher->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
 $node_publisher->safe_psql('postgres',
-	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (5, 6)");
+	"CREATE TABLE tab1_2 PARTITION OF tab1 FOR VALUES IN (4, 5, 6)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab1_def PARTITION OF tab1 DEFAULT");
 $node_publisher->safe_psql('postgres',
 	"ALTER PUBLICATION pub1 ADD TABLE tab1, tab1_1");
 
 # subscriber1
 $node_subscriber1->safe_psql('postgres',
-	"CREATE TABLE tab1 (a int PRIMARY KEY, b text, c text) PARTITION BY LIST (a)");
+	"CREATE TABLE tab1 (c text, a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
+
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (4, 5, 6) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
-	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3, 4)");
+	"CREATE TABLE tab1_2_1 (c text, b text, a int NOT NULL)");
 $node_subscriber1->safe_psql('postgres',
-	"CREATE TABLE tab1_2 PARTITION OF tab1 (c DEFAULT 'sub1_tab1') FOR VALUES IN (5, 6)");
+	"ALTER TABLE tab1_2 ATTACH PARTITION tab1_2_1 FOR VALUES IN (5)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_2_2 PARTITION OF tab1_2 FOR VALUES IN (4, 6)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab1_def PARTITION OF tab1 (c DEFAULT 'sub1_tab1') DEFAULT");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1");
 
@@ -57,6 +68,8 @@ $node_subscriber2->safe_psql('postgres',
 $node_subscriber2->safe_psql('postgres',
 	"CREATE TABLE tab1_2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1_2', b text)");
 $node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_def (a int PRIMARY KEY, b text, c text DEFAULT 'sub2_tab1_def')");
+$node_subscriber2->safe_psql('postgres',
 	"CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub_all");
 
 # Wait for initial sync of all subscriptions
@@ -74,13 +87,15 @@ $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_1 (a) VALUES (3)");
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (0)");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
 
 my $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
-is($result, qq(sub1_tab1|3|1|5), 'insert into tab1_1, tab1_2 replicated');
+is($result, qq(sub1_tab1|4|0|5), 'inserts into tab1 and its partitions replicated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
@@ -90,43 +105,64 @@ $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
 is($result, qq(sub2_tab1_2|1|5|5), 'inserts into tab1_2 replicated');
 
-# update (no partition change)
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1_def GROUP BY 1");
+is($result, qq(sub2_tab1_def|1|0|0), 'inserts into tab1_def replicated');
+
+# update (replicated as update)
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 2 WHERE a = 1");
+# All of the following cause an update to be applied to a partitioned
+# table on subscriber1 -- tab1_2 is a leaf partition on the publisher,
+# whereas it's sub-partitioned on subscriber1.
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 4 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 4");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 5 WHERE a = 6");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
-is($result, qq(sub1_tab1|3|2|5), 'update of tab1_1 replicated');
+is($result, qq(sub1_tab1|4|0|5), 'update of tab1_1, tab1_2 replicated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
 is($result, qq(sub2_tab1_1|2|2|3), 'update of tab1_1 replicated');
 
-# update (partition changes)
+# update (replicated as delete+insert)
 $node_publisher->safe_psql('postgres',
-	"UPDATE tab1 SET a = 6 WHERE a = 2");
+	"UPDATE tab1 SET a = 1 WHERE a = 0");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 4 WHERE a = 1");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
-is($result, qq(sub1_tab1|3|3|6), 'update of tab1 replicated');
+is($result, qq(sub1_tab1|4|2|5), 'update of tab1 (delete from tab1_def + insert into tab1_1) replicated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_1 GROUP BY 1");
-is($result, qq(sub2_tab1_1|1|3|3), 'delete from tab1_1 replicated');
+is($result, qq(sub2_tab1_1|2|2|3), 'delete from tab1_1 replicated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, count(*), min(a), max(a) FROM tab1_2 GROUP BY 1");
-is($result, qq(sub2_tab1_2|2|5|6), 'insert into tab1_2 replicated');
+is($result, qq(sub2_tab1_2|2|4|5), 'insert into tab1_2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1_def");
+is($result, qq(0||), 'delete from tab1_def replicated');
 
 # delete
 $node_publisher->safe_psql('postgres',
-	"DELETE FROM tab1 WHERE a IN (3, 5)");
+	"DELETE FROM tab1 WHERE a IN (2, 3, 5)");
 $node_publisher->safe_psql('postgres',
 	"DELETE FROM tab1_2");
 
@@ -147,9 +183,9 @@ is($result, qq(0||), 'delete from tab1_2 replicated');
 
 # truncate
 $node_subscriber1->safe_psql('postgres',
-	"INSERT INTO tab1 VALUES (1), (2), (5)");
+	"INSERT INTO tab1 (a) VALUES (1), (2), (5)");
 $node_subscriber2->safe_psql('postgres',
-	"INSERT INTO tab1_2 VALUES (2)");
+	"INSERT INTO tab1_2 (a) VALUES (2)");
 $node_publisher->safe_psql('postgres',
 	"TRUNCATE tab1_2");
 
@@ -175,4 +211,4 @@ $result = $node_subscriber1->safe_psql('postgres',
 is($result, qq(0||), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
-is($result, qq(0||), 'truncate of tab1_1 replicated');
+is($result, qq(0||), 'truncate of tab1 replicated');
-- 
1.8.3.1

v16-0002-Publish-partitioned-table-inserts-as-its-own.patchapplication/octet-stream; name=v16-0002-Publish-partitioned-table-inserts-as-its-own.patchDownload
From 52d19a401b36742f32eba5da45e3340419503b5f Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v16 2/2] Publish partitioned table inserts as its own

To control whether partition changes are replicated using their
own identity (and schema) or an ancestor's, add a new parameter
that can be set per publication, named 'publish_via_partition_root'.
---
 doc/src/sgml/logical-replication.sgml       |  11 +-
 doc/src/sgml/ref/create_publication.sgml    |  17 ++
 src/backend/catalog/partition.c             |   9 +
 src/backend/catalog/pg_publication.c        |  63 ++++++-
 src/backend/commands/publicationcmds.c      |  95 ++++++-----
 src/backend/commands/tablecmds.c            |   2 +-
 src/backend/executor/nodeModifyTable.c      |   4 +
 src/backend/replication/pgoutput/pgoutput.c | 211 +++++++++++++++++++-----
 src/backend/utils/cache/relcache.c          |   7 +-
 src/bin/pg_dump/pg_dump.c                   |  22 ++-
 src/bin/pg_dump/pg_dump.h                   |   1 +
 src/bin/psql/describe.c                     |  17 +-
 src/include/catalog/partition.h             |   1 +
 src/include/catalog/pg_publication.h        |   7 +-
 src/test/regress/expected/publication.out   | 103 ++++++------
 src/test/regress/sql/publication.sql        |   3 +
 src/test/subscription/t/013_partition.pl    | 244 +++++++++++++++++++++++++++-
 17 files changed, 666 insertions(+), 151 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8bd7c9c..a99e90b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -402,15 +402,8 @@
 
    <listitem>
     <para>
-     Replication is only supported by tables, partitioned or not, although a
-     given table must either be partitioned on both servers or not partitioned
-     at all.  Also, when replicating between partitioned tables, the actual
-     replication occurs between leaf partitions, so partitions on the two
-     servers must match one-to-one.
-    </para>
-
-    <para>
-     Attempts to replicate other types of relations such as views, materialized
+     Replication is only supported by tables, partitioned or not.
+     Attempts to replicate other types of relations such as views, materialized
      views, or foreign tables, will result in an error.
     </para>
    </listitem>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 597cb28..f796d9b 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -123,6 +123,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_via_partition_root</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table (or on its partitions) contained in the publication will be
+          published using its own schema rather than that of the individual
+          partitions that are actually changed; the latter is the default.
+          Setting it to <literal>true</literal> allows the changes to be
+          replicated into a non-partitioned table or a partitioned table
+          consisting of a different set of partitions.  However,
+          <literal>TRUNCATE</literal> operations performed directly on
+          partitions are not replicated.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 239ac01..15b8063 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -28,6 +28,7 @@
 #include "partitioning/partbounds.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/partcache.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
@@ -126,6 +127,14 @@ get_partition_ancestors(Oid relid)
 	return result;
 }
 
+/* Is given relation a leaf partition? */
+bool
+is_leaf_partition(Oid relid)
+{
+	return	get_rel_relispartition(relid) &&
+			get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE;
+}
+
 /*
  * get_partition_ancestors_worker
  *		recursive worker for get_partition_ancestors
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 500a5ae..0c534a2 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -220,13 +220,30 @@ publication_add_relation(Oid pubid, Relation targetrel,
 /*
  * Gets list of publication oids for a relation, plus those of ancestors,
  * if any, if the relation is a partition.
+ *
+ * *published_rels, if asked for, will contain the OID of the relation for
+ * each publication returned, that is, of the relation that is actually
+ * published.  Examining this list allows the caller, for instance, to
+ * distinguish publications that it is directly part of from those that it is
+ * indirectly part of via an ancestor.
  */
 List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Oid relid, List **published_rels)
 {
 	List	   *result = NIL;
+	int			i,
+				num;
+
+	if (published_rels)
+		*published_rels = NIL;
 
 	result = get_rel_publications(relid);
+	if (published_rels)
+	{
+		num = list_length(result);
+		for (i = 0; i < num; i++)
+			*published_rels = lappend_oid(*published_rels, relid);
+	}
 	if (get_rel_relispartition(relid))
 	{
 		List	   *ancestors = get_partition_ancestors(relid);
@@ -238,6 +255,12 @@ GetRelationPublications(Oid relid)
 			List	   *ancestor_pubs = get_rel_publications(ancestor);
 
 			result = list_concat(result, ancestor_pubs);
+			if (published_rels)
+			{
+				num = list_length(ancestor_pubs);
+				for (i = 0; i < num; i++)
+					*published_rels = lappend_oid(*published_rels, ancestor);
+			}
 		}
 	}
 
@@ -373,9 +396,13 @@ GetAllTablesPublications(void)
 
 /*
  * Gets list of all relation published by FOR ALL TABLES publication(s).
+ *
+ * If the publication publishes partition changes via their respective root
+ * partitioned tables, we must exclude partitions in favor of including the
+ * root partitioned tables.
  */
 List *
-GetAllTablesPublicationRelations(void)
+GetAllTablesPublicationRelations(bool pubasroot)
 {
 	Relation	classRel;
 	ScanKeyData key[1];
@@ -397,12 +424,35 @@ GetAllTablesPublicationRelations(void)
 		Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
 		Oid			relid = relForm->oid;
 
-		if (is_publishable_class(relid, relForm))
+		if (is_publishable_class(relid, relForm) &&
+			!(relForm->relispartition && pubasroot))
 			result = lappend_oid(result, relid);
 	}
 
 	table_endscan(scan);
-	table_close(classRel, AccessShareLock);
+
+	if (pubasroot)
+	{
+		ScanKeyInit(&key[0],
+					Anum_pg_class_relkind,
+					BTEqualStrategyNumber, F_CHAREQ,
+					CharGetDatum(RELKIND_PARTITIONED_TABLE));
+
+		scan = table_beginscan_catalog(classRel, 1, key);
+
+		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+		{
+			Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
+			Oid			relid = relForm->oid;
+
+			if (is_publishable_class(relid, relForm) &&
+				!relForm->relispartition)
+				result = lappend_oid(result, relid);
+		}
+
+		table_endscan(scan);
+		table_close(classRel, AccessShareLock);
+	}
 
 	return result;
 }
@@ -433,6 +483,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->pubasroot = pubform->pubasroot;
 
 	ReleaseSysCache(tup);
 
@@ -533,9 +584,11 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		 * need those.
 		 */
 		if (publication->alltables)
-			tables = GetAllTablesPublicationRelations();
+			tables = GetAllTablesPublicationRelations(publication->pubasroot);
 		else
 			tables = GetPublicationRelations(publication->oid,
+											 publication->pubasroot ?
+											 PUBLICATION_PART_ROOT :
 											 PUBLICATION_PART_LEAF);
 		funcctx->user_fctx = (void *) tables;
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 494c0bd..fde5c4b 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -23,6 +23,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -56,20 +57,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_via_partition_root_given,
+						  bool *publish_via_partition_root)
 {
 	ListCell   *lc;
 
+	*publish_via_partition_root_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* Relation changes published as of itself by default. */
+	*publish_via_partition_root = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -91,10 +95,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -110,19 +114,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_via_partition_root") == 0)
+		{
+			if (*publish_via_partition_root_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_via_partition_root_given = true;
+			*publish_via_partition_root = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -143,10 +156,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_via_partition_root_given;
+	bool		publish_via_partition_root;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -183,9 +195,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_via_partition_root_given,
+							  &publish_via_partition_root);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -193,13 +205,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubasroot - 1] =
+		BoolGetDatum(publish_via_partition_root);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -251,17 +265,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_via_partition_root_given;
+	bool		publish_via_partition_root;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_via_partition_root_given,
+							  &publish_via_partition_root);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -270,19 +283,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_via_partition_root_given)
+	{
+		values[Anum_pg_publication_pubasroot - 1] = BoolGetDatum(publish_via_partition_root);
+		replaces[Anum_pg_publication_pubasroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index c8c88be..1e9a788 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14694,7 +14694,7 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 	 * UNLOGGED as UNLOGGED tables can't be published.
 	 */
 	if (!toLogged &&
-		list_length(GetRelationPublications(RelationGetRelid(rel))) > 0)
+		list_length(GetRelationPublications(RelationGetRelid(rel), NULL)) > 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("cannot change table \"%s\" to unlogged because it is part of a publication",
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d71c0a4..f71fd98 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2320,8 +2320,12 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		/* Only necessary to check replication identity. */
+		CheckValidResultRel(mtstate->rootResultRelInfo, operation);
+	}
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 552a70c..f48a8fb 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,8 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -20,6 +22,7 @@
 #include "replication/pgoutput.h"
 #include "utils/int8.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -49,6 +52,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +63,33 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If ancestor relid is set, its schema must also
+	 * have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * True when publication that is matched by get_rel_sync_entry for this
+	 * relation is configured as such.
+	 */
+	bool		pubasroot;
+
+	/*
+	 * OID of the ancestor whose schema will be used when replicating changes
+	 * to a partition; InvalidOid if pubasroot is false.
+	 */
+	Oid			replicate_as_relid;
+
+	/*
+	 * Map, if any, used when replicating using an ancestor's schema to
+	 * convert the tuples from the partition's type to the ancestor's; NULL if
+	 * pubasroot is false.
+	 */
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +287,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
+
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		desc = RelationGetDescr(relation);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+/*
+ * Write out the schema of a relation, preceded by any needed type info.
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+		if (att->attisdropped || att->attgenerated)
+			continue;
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +399,68 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -413,9 +506,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 
 		/*
 		 * Don't send partitioned tables, because partitions should be sent
-		 * instead.
+		 * instead, unless the user specified to send the former.
 		 */
-		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+			!relentry->pubasroot)
 			continue;
 
 		relids[nrelids++] = relid;
@@ -540,7 +634,8 @@ init_rel_sync_cache(MemoryContext cachectx)
  * This looks up publications that the given relation is directly or
  * indirectly part of (the latter if it's really the relation's ancestor that
  * is part of a publication) and fills up the found entry with the information
- * about which operations to publish.
+ * about which operations to publish and whether to use an ancestor's schema
+ * when publishing.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
@@ -562,8 +657,10 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	/* Not found means schema wasn't sent */
 	if (!found || !entry->replicate_valid)
 	{
-		List	   *pubids = GetRelationPublications(relid);
+		List	   *published_rels = NIL;
+		List	   *pubids = GetRelationPublications(relid, &published_rels);
 		ListCell   *lc;
+		Oid			ancestor = InvalidOid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -588,13 +685,42 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
+			bool		publish = false;
+
+			if (pub->alltables)
+			{
+				publish = true;
+				if (pub->pubasroot && get_rel_relispartition(relid))
+					ancestor = llast_oid(get_partition_ancestors(relid));
+			}
+
+			if (!publish)
+			{
+				ListCell *lc1,
+						 *lc2;
+
+				forboth(lc1, pubids, lc2, published_rels)
+				{
+					Oid		pubid = lfirst_oid(lc1);
+					Oid		pub_relid = lfirst_oid(lc2);
+					if (pubid == pub->oid)
+					{
+						publish = true;
+						if (pub->pubasroot && pub_relid != relid)
+							ancestor = pub_relid;
+						break;
+					}
+				}
+			}
 
-			if (pub->alltables || list_member_oid(pubids, pub->oid))
+			if (publish)
 			{
 				entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
 				entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
 				entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
-				entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				if (!OidIsValid(ancestor))
+					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				entry->pubasroot = pub->pubasroot;
 			}
 
 			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
@@ -604,6 +730,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 
 		list_free(pubids);
 
+		entry->replicate_as_relid = ancestor;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index f8e2c6e..cc99118 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -43,6 +43,7 @@
 #include "catalog/catalog.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
@@ -5142,7 +5143,7 @@ GetRelationPublicationActions(Relation relation)
 					  sizeof(PublicationActions));
 
 	/* Fetch the publication membership info. */
-	puboids = GetRelationPublications(RelationGetRelid(relation));
+	puboids = GetRelationPublications(RelationGetRelid(relation), NULL);
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
@@ -5161,7 +5162,9 @@ GetRelationPublicationActions(Relation relation)
 		pubactions->pubinsert |= pubform->pubinsert;
 		pubactions->pubupdate |= pubform->pubupdate;
 		pubactions->pubdelete |= pubform->pubdelete;
-		pubactions->pubtruncate |= pubform->pubtruncate;
+		if (!pubform->pubasroot ||
+			!is_leaf_partition(RelationGetRelid(relation)))
+			pubactions->pubtruncate |= pubform->pubtruncate;
 
 		ReleaseSysCache(tup);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 408637c..0a5f13b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3868,6 +3868,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubasroot;
 	int			i,
 				ntups;
 
@@ -3879,11 +3880,18 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubasroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false as pubasroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
@@ -3907,6 +3915,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubasroot = PQfnumber(res, "pubasroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3929,6 +3938,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubasroot =
+			(strcmp(PQgetvalue(res, i, i_pubasroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -4005,7 +4016,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubasroot)
+		appendPQExpBufferStr(query, ", publish_via_partition_root = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 3e11166..d12c28b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -602,6 +602,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubasroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 109245f..cbd6994 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5707,7 +5707,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5738,6 +5738,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubasroot AS \"%s\"",
+						  gettext_noop("Publishes Using Root Schema"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5779,6 +5783,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubasroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5791,6 +5796,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubasroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5801,6 +5807,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubasroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubasroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5850,6 +5859,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubasroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5862,6 +5873,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubasroot)
+			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5870,6 +5883,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubasroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 27873af..c6c1911 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -21,6 +21,7 @@
 
 extern Oid	get_partition_parent(Oid relid);
 extern List *get_partition_ancestors(Oid relid);
+extern bool is_leaf_partition(Oid relid);
 extern Oid	index_get_partition(Relation partition, Oid indexId);
 extern List *map_partition_varattnos(List *expr, int fromrel_varno,
 									 Relation to_rel, Relation from_rel);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index bb52e8c..a85a6c8 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubasroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,12 +76,13 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		pubasroot;
 	PublicationActions pubactions;
 } Publication;
 
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
-extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationPublications(Oid relid, List **published_rels);
 
 /*---------
  * Expected values for pub_partopt parameter of GetRelationPublications(),
@@ -99,7 +102,7 @@ typedef enum PublicationPartOpt
 
 extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
 extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(void);
+extern List *GetAllTablesPublicationRelations(bool pubasroot);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2634d2c..5b4e73d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publish_via_partition_root = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                                       Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -129,10 +131,10 @@ ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
 -- only parent is listed as being in publication, not the partition
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_parted"
 
@@ -143,6 +145,15 @@ HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
+\dRp+ testpub_forparted
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | t
+Tables:
+    "public.testpub_parted"
+
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 -- fail - view
@@ -159,10 +170,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                          Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -200,10 +211,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -247,10 +258,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -260,20 +271,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                                    List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                                       List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 219e041..d844075 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publish_via_partition_root = '0');
 
 \dRp
 
@@ -87,6 +88,8 @@ UPDATE testpub_parted1 SET a = 1;
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
+\dRp+ testpub_forparted
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index b0308dc..f3c3fcc 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 17;
+use Test::More tests => 44;
 
 # setup
 
@@ -44,7 +44,6 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1 (c text, a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
-
 $node_subscriber1->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
 $node_subscriber1->safe_psql('postgres',
@@ -80,6 +79,8 @@ $node_subscriber1->poll_query_until('postgres', $synced_query)
 $node_subscriber2->poll_query_until('postgres', $synced_query)
   or die "Timed out while waiting for subscriber to synchronize data";
 
+# Tests for replication using leaf partition identity and schema
+
 # insert
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1)");
@@ -212,3 +213,242 @@ is($result, qq(0||), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab1");
 is($result, qq(0||), 'truncate of tab1 replicated');
+
+# Tests for replication using root table identity and schema
+
+# Publisher
+$node_publisher->safe_psql('postgres',
+	"DROP PUBLICATION pub1");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES IN (0, 1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_2 PARTITION OF tab2 FOR VALUES IN (5, 6)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3_1 PARTITION OF tab3 FOR VALUES IN (0, 1, 2, 3, 5, 6)");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub_all SET (publish_via_partition_root = true)");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub_viaroot FOR TABLE tab2, tab3_1 WITH (publish_via_partition_root = true)");
+
+# Subscriber 1
+$node_subscriber1->safe_psql('postgres',
+	"DROP SUBSCRIPTION sub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub1_tab2', b text) PARTITION BY RANGE (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (c text DEFAULT 'sub1_tab2', b text, a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES FROM (0) TO (10)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (c text DEFAULT 'sub1_tab3_1', b text, a int NOT NULL PRIMARY KEY)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub_viaroot CONNECTION '$publisher_connstr' PUBLICATION pub_viaroot");
+
+# Subscriber 2
+$node_subscriber2->safe_psql('postgres',
+	"DROP TABLE tab1");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab3', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab3_1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER SUBSCRIPTION sub2 REFRESH PUBLICATION");
+
+# Wait for initial sync of all subscriptions
+$node_subscriber1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber2->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# insert
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (0)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_1 (a) VALUES (3)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (0), (3), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab3 VALUES (1), (0), (3), (5)");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|4|0|5), 'inserts into tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|4|0|5), 'inserts into tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|4|0|5), 'inserts into tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub2_tab2|4|0|5), 'inserts into tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3 GROUP BY 1");
+is($result, qq(sub2_tab3|4|0|5), 'inserts into tab3 replicated');
+
+# update (replicated as update)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 6 WHERE a = 5");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|4|0|6), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|4|0|6), 'update of tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|4|0|6), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub2_tab2|4|0|6), 'update of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3 GROUP BY 1");
+is($result, qq(sub2_tab3|4|0|6), 'update of tab3 replicated');
+
+# update (replicated as delete+insert)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 2 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 2 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 2 WHERE a = 6");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|4|0|3), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|4|0|3), 'update of tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|4|0|3), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub2_tab2|4|0|3), 'update of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3 GROUP BY 1");
+is($result, qq(sub2_tab3|4|0|3), 'update of tab3 replicated');
+
+# delete
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab2");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab3");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'delete from tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'delete from tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3");
+is($result, qq(0||), 'delete from tab3 replicated');
+
+# truncate
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (2), (5)");
+# these will NOT be replicated
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1_2, tab2_1, tab3_1");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(3|1|5), 'truncate of tab2_1 NOT replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(3|1|5), 'truncate of tab1_2 NOT replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(3|1|5), 'truncate of tab2_1 NOT replicated');
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1, tab2, tab3");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'truncate of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'truncate of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3");
+is($result, qq(0||), 'truncate of tab3 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(0||), 'truncate of tab3_1 replicated');
-- 
1.8.3.1

#57Petr Jelinek
petr@2ndquadrant.com
In reply to: Amit Langote (#56)
Re: adding partitioned tables to publications

Hi,

On 03/04/2020 16:25, Amit Langote wrote:

> On Fri, Apr 3, 2020 at 6:34 PM Amit Langote <amitlangote09@gmail.com> wrote:
>> I am checking test coverage at the moment and should have the patches
>> ready by sometime later today.
>
> Attached updated patches.
>
> I confirmed using a coverage build that all the new code in
> logical/worker.c due to 0002 is now covered. For some reason, coverage
> report for pgoutput.c doesn't say the same thing for 0003's changes,
> although I doubt that result. It seems strange to believe that *none*
> of the new code is tested. I even checked by adding debugging elog()s
> next to the lines that the coverage report says aren't exercised,
> which tell me that that's not true. Perhaps my coverage build is
> somehow getting messed up, so it would be nice if someone with
> reliable coverage builds can confirm one way or the other. I will
> continue to check what's wrong.

AFAIK gcov can't handle multiple instances of same process being started
as it just overwrites the coverage files. So for TAP test it will report
bogus info (as in some code that's executed will look as not executed).
We'd probably have to do some kind of `GCOV_PREFIX` magic in the TAP
framework and merge (gcov/lcov can do that AFAIK) the resulting files to
get accurate coverage info. But that's beyond this patch IMHO.
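
A rough sketch of that `GCOV_PREFIX` idea (node names and paths below are made up for illustration, not part of any patch): start each postgres instance under its own coverage prefix so the .gcda files cannot clobber each other, then merge the per-instance tracefiles with lcov afterwards.

```shell
# Hypothetical sketch: one private coverage directory per TAP-test node.
COVERAGE_ROOT=$(mktemp -d)

for node in publisher subscriber1 subscriber2; do
    mkdir -p "$COVERAGE_ROOT/$node"
    # The TAP framework would export these before starting each node;
    # GCOV_PREFIX_STRIP removes the absolute build-path components so the
    # counters land under the private prefix.
    echo "GCOV_PREFIX=$COVERAGE_ROOT/$node GCOV_PREFIX_STRIP=99 pg_ctl start ..."
done

# After the run, capture one tracefile per node and merge them, e.g.:
#   lcov --capture --directory "$COVERAGE_ROOT/publisher" -o publisher.info
#   lcov -a publisher.info -a subscriber1.info -a subscriber2.info -o merged.info
```
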

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/

#58Tom Lane
tgl@sss.pgh.pa.us
In reply to: Petr Jelinek (#57)
Re: adding partitioned tables to publications

Petr Jelinek <petr@2ndquadrant.com> writes:
> AFAIK gcov can't handle multiple instances of same process being started
> as it just overwrites the coverage files. So for TAP test it will report
> bogus info (as in some code that's executed will look as not executed).

Hm, really? I routinely run "make check" (ie, parallel regression
tests) under coverage, and I get results that seem sane. If I were
losing large chunks of the data, I think I'd have noticed.

regards, tom lane

#59Petr Jelinek
petr@2ndquadrant.com
In reply to: Tom Lane (#58)
Re: adding partitioned tables to publications

On 03/04/2020 16:59, Tom Lane wrote:
> Petr Jelinek <petr@2ndquadrant.com> writes:
>> AFAIK gcov can't handle multiple instances of same process being started
>> as it just overwrites the coverage files. So for TAP test it will report
>> bogus info (as in some code that's executed will look as not executed).
>
> Hm, really? I routinely run "make check" (ie, parallel regression
> tests) under coverage, and I get results that seem sane. If I were
> losing large chunks of the data, I think I'd have noticed.

Parallel regression still just starts single postgres instance no?

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/

#60Tom Lane
tgl@sss.pgh.pa.us
In reply to: Petr Jelinek (#59)
Re: adding partitioned tables to publications

Petr Jelinek <petr@2ndquadrant.com> writes:
> On 03/04/2020 16:59, Tom Lane wrote:
>> Petr Jelinek <petr@2ndquadrant.com> writes:
>>> AFAIK gcov can't handle multiple instances of same process being started
>>> as it just overwrites the coverage files. So for TAP test it will report
>>> bogus info (as in some code that's executed will look as not executed).
>>
>> Hm, really? I routinely run "make check" (ie, parallel regression
>> tests) under coverage, and I get results that seem sane. If I were
>> losing large chunks of the data, I think I'd have noticed.
>
> Parallel regression still just starts single postgres instance no?

But the forked-off children have to write the gcov files independently,
don't they?

regards, tom lane

#61Petr Jelinek
petr@2ndquadrant.com
In reply to: Tom Lane (#60)
Re: adding partitioned tables to publications

On 03/04/2020 17:51, Tom Lane wrote:
> Petr Jelinek <petr@2ndquadrant.com> writes:
>> On 03/04/2020 16:59, Tom Lane wrote:
>>> Petr Jelinek <petr@2ndquadrant.com> writes:
>>>> AFAIK gcov can't handle multiple instances of same process being started
>>>> as it just overwrites the coverage files. So for TAP test it will report
>>>> bogus info (as in some code that's executed will look as not executed).
>>>
>>> Hm, really? I routinely run "make check" (ie, parallel regression
>>> tests) under coverage, and I get results that seem sane. If I were
>>> losing large chunks of the data, I think I'd have noticed.
>>
>> Parallel regression still just starts single postgres instance no?
>
> But the forked-off children have to write the gcov files independently,
> don't they?

Hmm that's very good point. I did see these missing coverage issue when
running tests that explicitly start more instances of postgres before
though. And with some quick googling, parallel testing seems to be issue
with gcov for more people.

I wonder if the program checksum that gcov calculates when merging the
.gcda data while updating it is somehow different for separately started
instances but not for the ones forked from same parent or something. I
don't know internals of gcov well enough to say how exactly that works.

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/

#62Tom Lane
tgl@sss.pgh.pa.us
In reply to: Petr Jelinek (#61)
Re: adding partitioned tables to publications

Petr Jelinek <petr@2ndquadrant.com> writes:

On 03/04/2020 17:51, Tom Lane wrote:

But the forked-off children have to write the gcov files independently,
don't they?

Hmm, that's a very good point. I did see this missing-coverage issue
before, though, when running tests that explicitly start more instances
of postgres. And from some quick googling, parallel testing seems to be
an issue with gcov for other people as well.

I poked around and found this:

https://gcc.gnu.org/legacy-ml/gcc-help/2005-11/msg00074.html

which says

gcov instrumentation is multi-process safe, but not multi-thread
safe. The multi-processing safety relies on OS level file locking,
which is not available on some systems.

That would explain why it works for me, but then there's a question
of why it doesn't work for you ...

regards, tom lane
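To illustrate the mechanism the gcc-help message describes, here is a small
sketch (an analogy, not gcov itself; it assumes a Linux-ish system with the
util-linux `flock` utility) of the same pattern gcov relies on: concurrent
processes doing a read-modify-write of a shared counter file, serialized by
an OS-level exclusive lock so that no update is lost:

```shell
f=$(mktemp)
echo 0 > "$f"
for i in $(seq 1 50); do
  (
    flock -x 9                  # block until we hold the exclusive lock
    n=$(cat "$f")
    echo $((n + 1)) > "$f"      # read-modify-write under the lock
  ) 9>> "$f.lock" &             # each subshell opens fd 9 on the lock file
done
wait
n=$(cat "$f")
rm -f "$f" "$f.lock"
echo "final count: $n"
```

On a platform where that file locking is unavailable (as the quoted message
warns), increments race and the final count comes up short, which matches
the symptom of underreported coverage.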

#63Petr Jelinek
petr@2ndquadrant.com
In reply to: Tom Lane (#62)
Re: adding partitioned tables to publications

On 04/04/2020 07:25, Tom Lane wrote:

Petr Jelinek <petr@2ndquadrant.com> writes:

On 03/04/2020 17:51, Tom Lane wrote:

But the forked-off children have to write the gcov files independently,
don't they?

Hmm, that's a very good point. I did see this missing-coverage issue
before, though, when running tests that explicitly start more instances
of postgres. And from some quick googling, parallel testing seems to be
an issue with gcov for other people as well.

I poked around and found this:

https://gcc.gnu.org/legacy-ml/gcc-help/2005-11/msg00074.html

which says

gcov instrumentation is multi-process safe, but not multi-thread
safe. The multi-processing safety relies on OS level file locking,
which is not available on some systems.

That would explain why it works for me, but then there's a question
of why it doesn't work for you ...

Hmm, I wonder if it has something to do with docker then (I rarely run
any tests directly on the main system nowadays). But that does not
explain why it does not work for Amit either.

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/

#64Amit Langote
amitlangote09@gmail.com
In reply to: Petr Jelinek (#63)
Re: adding partitioned tables to publications

On Sat, Apr 4, 2020 at 5:56 PM Petr Jelinek <petr@2ndquadrant.com> wrote:

On 04/04/2020 07:25, Tom Lane wrote:

Petr Jelinek <petr@2ndquadrant.com> writes:

On 03/04/2020 17:51, Tom Lane wrote:

But the forked-off children have to write the gcov files independently,
don't they?

Hmm, that's a very good point. I did see this missing-coverage issue
before, though, when running tests that explicitly start more instances
of postgres. And from some quick googling, parallel testing seems to be
an issue with gcov for other people as well.

I poked around and found this:

https://gcc.gnu.org/legacy-ml/gcc-help/2005-11/msg00074.html

which says

gcov instrumentation is multi-process safe, but not multi-thread
safe. The multi-processing safety relies on OS level file locking,
which is not available on some systems.

That would explain why it works for me, but then there's a question
of why it doesn't work for you ...

Hmm, I wonder if it has something to do with docker then (I rarely run
any tests directly on the main system nowadays). But that does not
explain why it does not work for Amit either.

One thing I must clarify: coverage for most of pgoutput.c looks okay on
each run. I am concerned that the coverage for the code added by the patch
is shown as close to zero, which is a mystery to me, because I can confirm
by other means, such as adding debugging elog()s next to the new code,
that the newly added tests do cover it.

--
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

#65Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#64)
Re: adding partitioned tables to publications

Amit Langote <amitlangote09@gmail.com> writes:

One thing I must clarify: coverage for most of pgoutput.c looks okay on
each run. I am concerned that the coverage for the code added by the patch
is shown as close to zero, which is a mystery to me, because I can confirm
by other means, such as adding debugging elog()s next to the new code,
that the newly added tests do cover it.

According to

https://coverage.postgresql.org/src/backend/replication/pgoutput/index.html

the coverage is pretty good. Maybe you're doing something wrong
in enabling coverage testing locally?

regards, tom lane
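In case it helps reproduce those numbers locally, the standard coverage
workflow from the PostgreSQL documentation is roughly this (a sketch; flags
and paths may need adjusting for your tree, and stale .gcda counters from
earlier runs should be cleared before a fresh measurement):

```
# Build with gcov instrumentation and collect coverage for the tests.
./configure --enable-coverage
make
make check              # or whichever test target exercises the new code
make coverage-html      # HTML report is written under ./coverage/
make coverage-clean     # reset the counters before the next run
```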

#66Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#56)
Re: adding partitioned tables to publications

On 2020-04-03 16:25, Amit Langote wrote:

On Fri, Apr 3, 2020 at 6:34 PM Amit Langote <amitlangote09@gmail.com> wrote:

I am checking test coverage at the moment and should have the patches
ready by sometime later today.

Attached updated patches.

Committed 0001 now. I'll work on the rest tomorrow.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#67Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#66)
1 attachment(s)
Re: adding partitioned tables to publications

On Mon, Apr 6, 2020 at 10:25 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-04-03 16:25, Amit Langote wrote:

On Fri, Apr 3, 2020 at 6:34 PM Amit Langote <amitlangote09@gmail.com> wrote:

I am checking test coverage at the moment and should have the patches
ready by sometime later today.

Attached updated patches.

Committed 0001 now. I'll work on the rest tomorrow.

Thank you. I have rebased the one remaining.

--
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v17-0001-Allow-publishing-partition-changes-via-ancestors.patchapplication/octet-stream; name=v17-0001-Allow-publishing-partition-changes-via-ancestors.patchDownload
From f60442deda9ca4c57a379cc3d06a84e2b4d7d5a0 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v17] Allow publishing partition changes via ancestors

To control whether partition changes are replicated using their
own identity and schema or an ancestor's, add a new parameter
that can be set per publication named 'publish_via_partition_root'.
---
 doc/src/sgml/logical-replication.sgml       |  12 +-
 doc/src/sgml/ref/create_publication.sgml    |  17 ++
 src/backend/catalog/partition.c             |   9 +
 src/backend/catalog/pg_publication.c        |  63 ++++++-
 src/backend/commands/publicationcmds.c      |  95 ++++++-----
 src/backend/commands/tablecmds.c            |   2 +-
 src/backend/executor/nodeModifyTable.c      |   4 +
 src/backend/replication/pgoutput/pgoutput.c | 211 +++++++++++++++++++-----
 src/backend/utils/cache/relcache.c          |   7 +-
 src/bin/pg_dump/pg_dump.c                   |  22 ++-
 src/bin/pg_dump/pg_dump.h                   |   1 +
 src/bin/psql/describe.c                     |  17 +-
 src/include/catalog/partition.h             |   1 +
 src/include/catalog/pg_publication.h        |   7 +-
 src/test/regress/expected/publication.out   | 103 ++++++------
 src/test/regress/sql/publication.sql        |   3 +
 src/test/subscription/t/013_partition.pl    | 244 +++++++++++++++++++++++++++-
 17 files changed, 672 insertions(+), 146 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c513621..aef4c17 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -411,10 +411,14 @@
    <listitem>
     <para>
      When replicating between partitioned tables, the actual replication
-     originates from the leaf partitions on the publisher, so partitions on
-     the publisher must also exist on the subscriber as valid target tables.
-     (They could either be leaf partitions themselves, or they could be
-     further subpartitioned, or they could even be independent tables.)
+     originates, by default, from the leaf partitions on the publisher, so
+     partitions on the publisher must also exist on the subscriber as valid
+     target tables. (They could either be leaf partitions themselves, or they
+     could be further subpartitioned, or they could even be independent
+     tables.)  Publications can also specify to replicate changes using the
+     partitioned table's identity and schema instead of those of the
+     individual leaf partitions in which the changes actually originate.
+     (See <xref linkend="sql-createpublication"/>.)
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 597cb28..f796d9b 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -123,6 +123,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_via_partition_root</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table (or on its partitions) contained in the publication will be
+          published using its own schema rather than that of the individual
+          partitions that are actually changed; the latter is the default.
+          Setting it to <literal>true</literal> allows the changes to be
+          replicated into a non-partitioned table or a partitioned table
+          consisting of a different set of partitions.  However,
+          <literal>TRUNCATE</literal> operations performed directly on
+          partitions are not replicated.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 239ac01..15b8063 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -28,6 +28,7 @@
 #include "partitioning/partbounds.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/partcache.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
@@ -126,6 +127,14 @@ get_partition_ancestors(Oid relid)
 	return result;
 }
 
+/* Is given relation a leaf partition? */
+bool
+is_leaf_partition(Oid relid)
+{
+	return	get_rel_relispartition(relid) &&
+			get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE;
+}
+
 /*
  * get_partition_ancestors_worker
  *		recursive worker for get_partition_ancestors
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 500a5ae..0c534a2 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -220,13 +220,30 @@ publication_add_relation(Oid pubid, Relation targetrel,
 /*
  * Gets list of publication oids for a relation, plus those of ancestors,
  * if any, if the relation is a partition.
+ *
+ * *published_rels, if asked for, will contain the OID of the relation for
+ * each publication returned, that is, of the relation that is actually
+ * published.  Examining this list allows the caller, for instance, to
+ * distinguish publications that it is directly part of from those that it is
+ * indirectly part of via an ancestor.
  */
 List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Oid relid, List **published_rels)
 {
 	List	   *result = NIL;
+	int			i,
+				num;
+
+	if (published_rels)
+		*published_rels = NIL;
 
 	result = get_rel_publications(relid);
+	if (published_rels)
+	{
+		num = list_length(result);
+		for (i = 0; i < num; i++)
+			*published_rels = lappend_oid(*published_rels, relid);
+	}
 	if (get_rel_relispartition(relid))
 	{
 		List	   *ancestors = get_partition_ancestors(relid);
@@ -238,6 +255,12 @@ GetRelationPublications(Oid relid)
 			List	   *ancestor_pubs = get_rel_publications(ancestor);
 
 			result = list_concat(result, ancestor_pubs);
+			if (published_rels)
+			{
+				num = list_length(ancestor_pubs);
+				for (i = 0; i < num; i++)
+					*published_rels = lappend_oid(*published_rels, ancestor);
+			}
 		}
 	}
 
@@ -373,9 +396,13 @@ GetAllTablesPublications(void)
 
 /*
  * Gets list of all relation published by FOR ALL TABLES publication(s).
+ *
+ * If the publication publishes partition changes via their respective root
+ * partitioned tables, we must exclude partitions in favor of including the
+ * root partitioned tables.
  */
 List *
-GetAllTablesPublicationRelations(void)
+GetAllTablesPublicationRelations(bool pubasroot)
 {
 	Relation	classRel;
 	ScanKeyData key[1];
@@ -397,12 +424,35 @@ GetAllTablesPublicationRelations(void)
 		Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
 		Oid			relid = relForm->oid;
 
-		if (is_publishable_class(relid, relForm))
+		if (is_publishable_class(relid, relForm) &&
+			!(relForm->relispartition && pubasroot))
 			result = lappend_oid(result, relid);
 	}
 
 	table_endscan(scan);
-	table_close(classRel, AccessShareLock);
+
+	if (pubasroot)
+	{
+		ScanKeyInit(&key[0],
+					Anum_pg_class_relkind,
+					BTEqualStrategyNumber, F_CHAREQ,
+					CharGetDatum(RELKIND_PARTITIONED_TABLE));
+
+		scan = table_beginscan_catalog(classRel, 1, key);
+
+		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+		{
+			Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
+			Oid			relid = relForm->oid;
+
+			if (is_publishable_class(relid, relForm) &&
+				!relForm->relispartition)
+				result = lappend_oid(result, relid);
+		}
+
+		table_endscan(scan);
+		table_close(classRel, AccessShareLock);
+	}
 
 	return result;
 }
@@ -433,6 +483,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->pubasroot = pubform->pubasroot;
 
 	ReleaseSysCache(tup);
 
@@ -533,9 +584,11 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		 * need those.
 		 */
 		if (publication->alltables)
-			tables = GetAllTablesPublicationRelations();
+			tables = GetAllTablesPublicationRelations(publication->pubasroot);
 		else
 			tables = GetPublicationRelations(publication->oid,
+											 publication->pubasroot ?
+											 PUBLICATION_PART_ROOT :
 											 PUBLICATION_PART_LEAF);
 		funcctx->user_fctx = (void *) tables;
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 494c0bd..fde5c4b 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -23,6 +23,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -56,20 +57,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_via_partition_root_given,
+						  bool *publish_via_partition_root)
 {
 	ListCell   *lc;
 
+	*publish_via_partition_root_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* By default, a relation's changes are published using its own identity. */
+	*publish_via_partition_root = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -91,10 +95,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -110,19 +114,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_via_partition_root") == 0)
+		{
+			if (*publish_via_partition_root_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_via_partition_root_given = true;
+			*publish_via_partition_root = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -143,10 +156,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_via_partition_root_given;
+	bool		publish_via_partition_root;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -183,9 +195,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_via_partition_root_given,
+							  &publish_via_partition_root);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -193,13 +205,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubasroot - 1] =
+		BoolGetDatum(publish_via_partition_root);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -251,17 +265,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_via_partition_root_given;
+	bool		publish_via_partition_root;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_via_partition_root_given,
+							  &publish_via_partition_root);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -270,19 +283,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_via_partition_root_given)
+	{
+		values[Anum_pg_publication_pubasroot - 1] = BoolGetDatum(publish_via_partition_root);
+		replaces[Anum_pg_publication_pubasroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 6162fb0..c91e9a3 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14746,7 +14746,7 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 	 * UNLOGGED as UNLOGGED tables can't be published.
 	 */
 	if (!toLogged &&
-		list_length(GetRelationPublications(RelationGetRelid(rel))) > 0)
+		list_length(GetRelationPublications(RelationGetRelid(rel), NULL)) > 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("cannot change table \"%s\" to unlogged because it is part of a publication",
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d71c0a4..f71fd98 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2320,8 +2320,12 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		/* Only necessary to check replication identity. */
+		CheckValidResultRel(mtstate->rootResultRelInfo, operation);
+	}
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 552a70c..f48a8fb 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,8 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -20,6 +22,7 @@
 #include "replication/pgoutput.h"
 #include "utils/int8.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -49,6 +52,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +63,33 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If ancestor relid is set, its schema must also
+	 * have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * True if the publication matched by get_rel_sync_entry for this
+	 * relation has publish_via_partition_root set.
+	 */
+	bool		pubasroot;
+
+	/*
+	 * OID of the ancestor whose schema will be used when replicating changes
+	 * to a partition; InvalidOid if pubasroot is false.
+	 */
+	Oid			replicate_as_relid;
+
+	/*
+	 * Map, if any, used when replicating using an ancestor's schema to
+	 * convert the tuples from partition's type to the ancestor's; NULL if
+	 * pubasroot is false.
+	 */
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +287,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
+
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		desc = RelationGetDescr(relation);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+/*
+ * Send the schema of a relation, preceded by type info for its attributes.
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+		if (att->attisdropped || att->attgenerated)
+			continue;
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +399,68 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubasroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -413,9 +506,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 
 		/*
 		 * Don't send partitioned tables, because partitions should be sent
-		 * instead.
+		 * instead, unless the user specified to publish via the root.
 		 */
-		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+			!relentry->pubasroot)
 			continue;
 
 		relids[nrelids++] = relid;
@@ -540,7 +634,8 @@ init_rel_sync_cache(MemoryContext cachectx)
  * This looks up publications that the given relation is directly or
  * indirectly part of (the latter if it's really the relation's ancestor that
  * is part of a publication) and fills up the found entry with the information
- * about which operations to publish.
+ * about which operations to publish and whether to use an ancestor's schema
+ * when publishing.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
@@ -562,8 +657,10 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	/* Not found means schema wasn't sent */
 	if (!found || !entry->replicate_valid)
 	{
-		List	   *pubids = GetRelationPublications(relid);
+		List	   *published_rels = NIL;
+		List	   *pubids = GetRelationPublications(relid, &published_rels);
 		ListCell   *lc;
+		Oid			ancestor = InvalidOid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -588,13 +685,42 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
+			bool		publish = false;
+
+			if (pub->alltables)
+			{
+				publish = true;
+				if (pub->pubasroot && get_rel_relispartition(relid))
+					ancestor = llast_oid(get_partition_ancestors(relid));
+			}
+
+			if (!publish)
+			{
+				ListCell *lc1,
+						 *lc2;
+
+				forboth(lc1, pubids, lc2, published_rels)
+				{
+					Oid		pubid = lfirst_oid(lc1);
+					Oid		pub_relid = lfirst_oid(lc2);
+					if (pubid == pub->oid)
+					{
+						publish = true;
+						if (pub->pubasroot && pub_relid != relid)
+							ancestor = pub_relid;
+						break;
+					}
+				}
+			}
 
-			if (pub->alltables || list_member_oid(pubids, pub->oid))
+			if (publish)
 			{
 				entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
 				entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
 				entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
-				entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				if (!OidIsValid(ancestor))
+					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				entry->pubasroot = pub->pubasroot;
 			}
 
 			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
@@ -604,6 +730,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 
 		list_free(pubids);
 
+		entry->replicate_as_relid = ancestor;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index dfd81f1..05e0e88 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -44,6 +44,7 @@
 #include "catalog/catalog.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
@@ -5313,7 +5314,7 @@ GetRelationPublicationActions(Relation relation)
 					  sizeof(PublicationActions));
 
 	/* Fetch the publication membership info. */
-	puboids = GetRelationPublications(RelationGetRelid(relation));
+	puboids = GetRelationPublications(RelationGetRelid(relation), NULL);
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
@@ -5332,7 +5333,9 @@ GetRelationPublicationActions(Relation relation)
 		pubactions->pubinsert |= pubform->pubinsert;
 		pubactions->pubupdate |= pubform->pubupdate;
 		pubactions->pubdelete |= pubform->pubdelete;
-		pubactions->pubtruncate |= pubform->pubtruncate;
+		if (!pubform->pubasroot ||
+			!is_leaf_partition(RelationGetRelid(relation)))
+			pubactions->pubtruncate |= pubform->pubtruncate;
 
 		ReleaseSysCache(tup);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 408637c..0a5f13b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3868,6 +3868,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubasroot;
 	int			i,
 				ntups;
 
@@ -3879,11 +3880,18 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubasroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false as pubasroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
@@ -3907,6 +3915,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubasroot = PQfnumber(res, "pubasroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3929,6 +3938,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubasroot =
+			(strcmp(PQgetvalue(res, i, i_pubasroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -4005,7 +4016,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubasroot)
+		appendPQExpBufferStr(query, ", publish_via_partition_root = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 3e11166..d12c28b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -602,6 +602,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubasroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 109245f..cbd6994 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5707,7 +5707,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5738,6 +5738,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubasroot AS \"%s\"",
+						  gettext_noop("Publishes Using Root Schema"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5779,6 +5783,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubasroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5791,6 +5796,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubasroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5801,6 +5807,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubasroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubasroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5850,6 +5859,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubasroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5862,6 +5873,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubasroot)
+			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5870,6 +5883,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubasroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 27873af..c6c1911 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -21,6 +21,7 @@
 
 extern Oid	get_partition_parent(Oid relid);
 extern List *get_partition_ancestors(Oid relid);
+extern bool is_leaf_partition(Oid relid);
 extern Oid	index_get_partition(Relation partition, Oid indexId);
 extern List *map_partition_varattnos(List *expr, int fromrel_varno,
 									 Relation to_rel, Relation from_rel);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index bb52e8c..a85a6c8 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubasroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,12 +76,13 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		pubasroot;
 	PublicationActions pubactions;
 } Publication;
 
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
-extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationPublications(Oid relid, List **published_rels);
 
 /*---------
  * Expected values for pub_partopt parameter of GetRelationPublications(),
@@ -99,7 +102,7 @@ typedef enum PublicationPartOpt
 
 extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
 extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(void);
+extern List *GetAllTablesPublicationRelations(bool pubasroot);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2634d2c..5b4e73d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publish_via_partition_root = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                                       Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -129,10 +131,10 @@ ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
 -- only parent is listed as being in publication, not the partition
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_parted"
 
@@ -143,6 +145,15 @@ HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
+\dRp+ testpub_forparted
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | t
+Tables:
+    "public.testpub_parted"
+
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 -- fail - view
@@ -159,10 +170,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                          Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -200,10 +211,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -247,10 +258,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -260,20 +271,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                                    List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                                       List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 219e041..d844075 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publish_via_partition_root = '0');
 
 \dRp
 
@@ -87,6 +88,8 @@ UPDATE testpub_parted1 SET a = 1;
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
+\dRp+ testpub_forparted
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 5db1b21..c670490 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 24;
+use Test::More tests => 51;
 
 # setup
 
@@ -48,7 +48,6 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1 (c text, a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
-
 $node_subscriber1->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
 $node_subscriber1->safe_psql('postgres',
@@ -87,6 +86,8 @@ $node_subscriber1->poll_query_until('postgres', $synced_query)
 $node_subscriber2->poll_query_until('postgres', $synced_query)
   or die "Timed out while waiting for subscriber to synchronize data";
 
+# Tests for replication using leaf partition identity and schema
+
 # insert
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1)");
@@ -260,3 +261,242 @@ is($result, qq(), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT a FROM tab1 ORDER BY 1");
 is($result, qq(), 'truncate of tab1 replicated');
+
+# Tests for replication using root table identity and schema
+
+# Publisher
+$node_publisher->safe_psql('postgres',
+	"DROP PUBLICATION pub1");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES IN (0, 1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_2 PARTITION OF tab2 FOR VALUES IN (5, 6)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3_1 PARTITION OF tab3 FOR VALUES IN (0, 1, 2, 3, 5, 6)");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub_all SET (publish_via_partition_root = true)");
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub_viaroot FOR TABLE tab2, tab3_1 WITH (publish_via_partition_root = true)");
+
+# Subscriber 1
+$node_subscriber1->safe_psql('postgres',
+	"DROP SUBSCRIPTION sub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub1_tab2', b text) PARTITION BY RANGE (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (c text DEFAULT 'sub1_tab2', b text, a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES FROM (0) TO (10)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (c text DEFAULT 'sub1_tab3_1', b text, a int NOT NULL PRIMARY KEY)");
+
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub_viaroot CONNECTION '$publisher_connstr' PUBLICATION pub_viaroot");
+
+# Subscriber 2
+$node_subscriber2->safe_psql('postgres',
+	"DROP TABLE tab1");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab3', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab3_1', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER SUBSCRIPTION sub2 REFRESH PUBLICATION");
+
+# Wait for initial sync of all subscriptions
+$node_subscriber1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber2->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# insert
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (0)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_1 (a) VALUES (3)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (0), (3), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab3 VALUES (1), (0), (3), (5)");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|4|0|5), 'inserts into tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|4|0|5), 'inserts into tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|4|0|5), 'inserts into tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub2_tab2|4|0|5), 'inserts into tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3 GROUP BY 1");
+is($result, qq(sub2_tab3|4|0|5), 'inserts into tab3 replicated');
+
+# update (replicated as update)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 6 WHERE a = 5");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|4|0|6), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|4|0|6), 'update of tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|4|0|6), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub2_tab2|4|0|6), 'update of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3 GROUP BY 1");
+is($result, qq(sub2_tab3|4|0|6), 'update of tab3 replicated');
+
+# update (replicated as delete+insert)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 2 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 2 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 2 WHERE a = 6");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub1_tab2|4|0|3), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3_1 GROUP BY 1");
+is($result, qq(sub1_tab3_1|4|0|3), 'update of tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab1 GROUP BY 1");
+is($result, qq(sub2_tab1|4|0|3), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab2 GROUP BY 1");
+is($result, qq(sub2_tab2|4|0|3), 'update of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, count(*), min(a), max(a) FROM tab3 GROUP BY 1");
+is($result, qq(sub2_tab3|4|0|3), 'update of tab3 replicated');
+
+# delete
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab2");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab3");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'delete from tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'delete from tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'delete from tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3");
+is($result, qq(0||), 'delete from tab3 replicated');
+
+# truncate
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (2), (5)");
+# these will NOT be replicated
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1_2, tab2_1, tab3_1");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(3|1|5), 'truncate of tab2_1 NOT replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(3|1|5), 'truncate of tab1_2 NOT replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(3|1|5), 'truncate of tab2_1 NOT replicated');
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1, tab2, tab3");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'truncate of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab1");
+is($result, qq(0||), 'truncate of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab2");
+is($result, qq(0||), 'truncate of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3");
+is($result, qq(0||), 'truncate of tab3 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT count(*), min(a), max(a) FROM tab3_1");
+is($result, qq(0||), 'truncate of tab3_1 replicated');
-- 
1.8.3.1

#68Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#67)
1 attachment(s)
Re: adding partitioned tables to publications

On Tue, Apr 7, 2020 at 12:04 AM Amit Langote <amitlangote09@gmail.com> wrote:

On Mon, Apr 6, 2020 at 10:25 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-04-03 16:25, Amit Langote wrote:

On Fri, Apr 3, 2020 at 6:34 PM Amit Langote <amitlangote09@gmail.com> wrote:

I am checking test coverage at the moment and should have the patches
ready by sometime later today.

Attached updated patches.

Committed 0001 now. I'll work on the rest tomorrow.

Thank you. I have rebased the one remaining.

I updated the patch to make the following changes:

* Rewrote the tests to match in style with those committed yesterday
* Renamed all variables that had pubasroot in it to have pubviaroot
instead to match the publication parameter
* Updated pg_publication catalog documentation
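
For anyone trying the patch, here is a sketch of how the parameter is exercised, based on the syntax added to create_publication.sgml (the table name p is just the example partitioned table from upthread):

```sql
-- Publish changes to partitions of p using p's own identity and schema,
-- so the subscriber's table need not have matching partitions.
CREATE PUBLICATION pub_via_root FOR TABLE p
  WITH (publish_via_partition_root = true);

-- The setting can also be changed afterwards:
ALTER PUBLICATION pub_via_root SET (publish_via_partition_root = false);
```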

--
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v18-0001-Allow-publishing-partition-changes-via-ancestors.patchapplication/octet-stream; name=v18-0001-Allow-publishing-partition-changes-via-ancestors.patchDownload
From 15747fcbdec6366e6ff4d5c8d448fad227d95fc0 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v18] Allow publishing partition changes via ancestors

To control whether partition changes are replicated using their
own identity and schema or an ancestor's, add a new parameter
that can be set per publication named 'publish_via_partition_root'.
---
 doc/src/sgml/catalogs.sgml                  |  10 +
 doc/src/sgml/logical-replication.sgml       |  12 +-
 doc/src/sgml/ref/create_publication.sgml    |  17 ++
 src/backend/catalog/partition.c             |   9 +
 src/backend/catalog/pg_publication.c        |  63 +++++-
 src/backend/commands/publicationcmds.c      |  95 +++++----
 src/backend/commands/tablecmds.c            |   2 +-
 src/backend/executor/nodeModifyTable.c      |   4 +
 src/backend/replication/pgoutput/pgoutput.c | 211 ++++++++++++++++----
 src/backend/utils/cache/relcache.c          |   7 +-
 src/bin/pg_dump/pg_dump.c                   |  22 +-
 src/bin/pg_dump/pg_dump.h                   |   1 +
 src/bin/psql/describe.c                     |  17 +-
 src/include/catalog/partition.h             |   1 +
 src/include/catalog/pg_publication.h        |   7 +-
 src/test/regress/expected/publication.out   | 103 +++++-----
 src/test/regress/sql/publication.sql        |   3 +
 src/test/subscription/t/013_partition.pl    | 298 +++++++++++++++++++++++++++-
 18 files changed, 736 insertions(+), 146 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 64614b5..4e9cd7f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5437,6 +5437,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       <entry>If true, <command>TRUNCATE</command> operations are replicated for
        tables in the publication.</entry>
      </row>
+
+     <row>
+      <entry><structfield>pubviaroot</structfield></entry>
+      <entry><type>bool</type></entry>
+      <entry></entry>
+      <entry>If true, operations on a leaf partition are replicated using the
+       identity and schema of its topmost partitioned ancestor mentioned in the
+       publication instead of its own.
+      </entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c513621..7d34e7d 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -411,10 +411,14 @@
    <listitem>
     <para>
      When replicating between partitioned tables, the actual replication
-     originates from the leaf partitions on the publisher, so partitions on
-     the publisher must also exist on the subscriber as valid target tables.
-     (They could either be leaf partitions themselves, or they could be
-     further subpartitioned, or they could even be independent tables.)
+     originates, by default, from the leaf partitions on the publisher, so
+     partitions on the publisher must also exist on the subscriber as valid
+     target tables. (They could either be leaf partitions themselves, or they
+     could be further subpartitioned, or they could even be independent
+     tables.)  Publications can also specify that changes are replicated
+     using the identity and schema of the partitioned table instead of
+     those of the individual leaf partitions where the changes originate.
+     (See <xref linkend="sql-createpublication"/>.)
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 597cb28..f796d9b 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -123,6 +123,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_via_partition_root</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table (or on its partitions) contained in the publication will be
+          published using its own schema rather than that of the individual
+          partitions which are actually changed; the latter is the default.
+          Setting it to <literal>true</literal> allows the changes to be
+          replicated into a non-partitioned table or a partitioned table
+          consisting of a different set of partitions.  However,
+          <literal>TRUNCATE</literal> operations performed directly on
+          partitions are not replicated.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 239ac01..15b8063 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -28,6 +28,7 @@
 #include "partitioning/partbounds.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/partcache.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
@@ -126,6 +127,14 @@ get_partition_ancestors(Oid relid)
 	return result;
 }
 
+/* Is given relation a leaf partition? */
+bool
+is_leaf_partition(Oid relid)
+{
+	return	get_rel_relispartition(relid) &&
+			get_rel_relkind(relid) != RELKIND_PARTITIONED_TABLE;
+}
+
 /*
  * get_partition_ancestors_worker
  *		recursive worker for get_partition_ancestors
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 500a5ae..7ba873d 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -220,13 +220,30 @@ publication_add_relation(Oid pubid, Relation targetrel,
 /*
  * Gets list of publication oids for a relation, plus those of ancestors,
  * if any, if the relation is a partition.
+ *
+ * *published_rels, if asked for, will contain the OID of the relation for
+ * each publication returned, that is, of the relation that is actually
+ * published.  Examining this list allows the caller, for instance, to
+ * distinguish publications that the relation is directly part of from those
+ * that it is part of only via an ancestor.
  */
 List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Oid relid, List **published_rels)
 {
 	List	   *result = NIL;
+	int			i,
+				num;
+
+	if (published_rels)
+		*published_rels = NIL;
 
 	result = get_rel_publications(relid);
+	if (published_rels)
+	{
+		num = list_length(result);
+		for (i = 0; i < num; i++)
+			*published_rels = lappend_oid(*published_rels, relid);
+	}
 	if (get_rel_relispartition(relid))
 	{
 		List	   *ancestors = get_partition_ancestors(relid);
@@ -238,6 +255,12 @@ GetRelationPublications(Oid relid)
 			List	   *ancestor_pubs = get_rel_publications(ancestor);
 
 			result = list_concat(result, ancestor_pubs);
+			if (published_rels)
+			{
+				num = list_length(ancestor_pubs);
+				for (i = 0; i < num; i++)
+					*published_rels = lappend_oid(*published_rels, ancestor);
+			}
 		}
 	}
 
@@ -373,9 +396,13 @@ GetAllTablesPublications(void)
 
 /*
  * Gets list of all relation published by FOR ALL TABLES publication(s).
+ *
+ * If the publication publishes partition changes via their respective root
+ * partitioned tables, we must exclude partitions in favor of including the
+ * root partitioned tables.
  */
 List *
-GetAllTablesPublicationRelations(void)
+GetAllTablesPublicationRelations(bool pubviaroot)
 {
 	Relation	classRel;
 	ScanKeyData key[1];
@@ -397,12 +424,35 @@ GetAllTablesPublicationRelations(void)
 		Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
 		Oid			relid = relForm->oid;
 
-		if (is_publishable_class(relid, relForm))
+		if (is_publishable_class(relid, relForm) &&
+			!(relForm->relispartition && pubviaroot))
 			result = lappend_oid(result, relid);
 	}
 
 	table_endscan(scan);
-	table_close(classRel, AccessShareLock);
+
+	if (pubviaroot)
+	{
+		ScanKeyInit(&key[0],
+					Anum_pg_class_relkind,
+					BTEqualStrategyNumber, F_CHAREQ,
+					CharGetDatum(RELKIND_PARTITIONED_TABLE));
+
+		scan = table_beginscan_catalog(classRel, 1, key);
+
+		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+		{
+			Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
+			Oid			relid = relForm->oid;
+
+			if (is_publishable_class(relid, relForm) &&
+				!relForm->relispartition)
+				result = lappend_oid(result, relid);
+		}
+
+		table_endscan(scan);
+		table_close(classRel, AccessShareLock);
+	}
 
 	return result;
 }
@@ -433,6 +483,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->pubviaroot = pubform->pubviaroot;
 
 	ReleaseSysCache(tup);
 
@@ -533,9 +584,11 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		 * need those.
 		 */
 		if (publication->alltables)
-			tables = GetAllTablesPublicationRelations();
+			tables = GetAllTablesPublicationRelations(publication->pubviaroot);
 		else
 			tables = GetPublicationRelations(publication->oid,
+											 publication->pubviaroot ?
+											 PUBLICATION_PART_ROOT :
 											 PUBLICATION_PART_LEAF);
 		funcctx->user_fctx = (void *) tables;
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 494c0bd..ffc5ab2 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -23,6 +23,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -56,20 +57,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_via_partition_root_given,
+						  bool *publish_via_partition_root)
 {
 	ListCell   *lc;
 
+	*publish_via_partition_root_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* By default, relation changes are published using their own identity. */
+	*publish_via_partition_root = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -91,10 +95,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -110,19 +114,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_via_partition_root") == 0)
+		{
+			if (*publish_via_partition_root_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_via_partition_root_given = true;
+			*publish_via_partition_root = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -143,10 +156,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_via_partition_root_given;
+	bool		publish_via_partition_root;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -183,9 +195,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_via_partition_root_given,
+							  &publish_via_partition_root);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -193,13 +205,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubviaroot - 1] =
+		BoolGetDatum(publish_via_partition_root);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -251,17 +265,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_via_partition_root_given;
+	bool		publish_via_partition_root;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_via_partition_root_given,
+							  &publish_via_partition_root);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -270,19 +283,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_via_partition_root_given)
+	{
+		values[Anum_pg_publication_pubviaroot - 1] = BoolGetDatum(publish_via_partition_root);
+		replaces[Anum_pg_publication_pubviaroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 6162fb0..c91e9a3 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14746,7 +14746,7 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 	 * UNLOGGED as UNLOGGED tables can't be published.
 	 */
 	if (!toLogged &&
-		list_length(GetRelationPublications(RelationGetRelid(rel))) > 0)
+		list_length(GetRelationPublications(RelationGetRelid(rel), NULL)) > 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("cannot change table \"%s\" to unlogged because it is part of a publication",
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d71c0a4..f71fd98 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2320,8 +2320,12 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
+		/* Only necessary to check replication identity. */
+		CheckValidResultRel(mtstate->rootResultRelInfo, operation);
+	}
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 552a70c..c017f67 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,8 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -20,6 +22,7 @@
 #include "replication/pgoutput.h"
 #include "utils/int8.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -49,6 +52,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +63,33 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If ancestor relid is set, its schema must also
+	 * have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * True if the publication matched by get_rel_sync_entry for this relation
+	 * has publish_via_partition_root set.
+	 */
+	bool		pubviaroot;
+
+	/*
+	 * OID of the ancestor whose schema will be used when replicating changes
+	 * to a partition; InvalidOid if pubviaroot is false.
+	 */
+	Oid			replicate_as_relid;
+
+	/*
+	 * Map, if any, used when replicating using an ancestor's schema to
+	 * convert the tuples from partition's type to the ancestor's; NULL if
+	 * pubviaroot is false.
+	 */
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +287,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write the current schema of the relation and its ancestor (if any) if not
+ * done yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (OidIsValid(relentry->replicate_as_relid))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->replicate_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
+
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		desc = RelationGetDescr(relation);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+/*
+ * Send the schema of a relation, together with the types of its attributes.
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+		if (att->attisdropped || att->attgenerated)
+			continue;
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +399,68 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubviaroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubviaroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Publish as root relation change if requested. */
+				if (OidIsValid(relentry->replicate_as_relid))
+				{
+					Assert(relentry->pubviaroot);
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->replicate_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -413,9 +506,10 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 
 		/*
 		 * Don't send partitioned tables, because partitions should be sent
-		 * instead.
+		 * instead, unless the user asked to publish changes via the root.
 		 */
-		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+			!relentry->pubviaroot)
 			continue;
 
 		relids[nrelids++] = relid;
@@ -540,7 +634,8 @@ init_rel_sync_cache(MemoryContext cachectx)
  * This looks up publications that the given relation is directly or
  * indirectly part of (the latter if it's really the relation's ancestor that
  * is part of a publication) and fills up the found entry with the information
- * about which operations to publish.
+ * about which operations to publish and whether to use an ancestor's schema
+ * when publishing.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
@@ -562,8 +657,10 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	/* Not found means schema wasn't sent */
 	if (!found || !entry->replicate_valid)
 	{
-		List	   *pubids = GetRelationPublications(relid);
+		List	   *published_rels = NIL;
+		List	   *pubids = GetRelationPublications(relid, &published_rels);
 		ListCell   *lc;
+		Oid			ancestor = InvalidOid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -588,13 +685,42 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
+			bool		publish = false;
+
+			if (pub->alltables)
+			{
+				publish = true;
+				if (pub->pubviaroot && get_rel_relispartition(relid))
+					ancestor = llast_oid(get_partition_ancestors(relid));
+			}
+
+			if (!publish)
+			{
+				ListCell *lc1,
+						 *lc2;
+
+				forboth(lc1, pubids, lc2, published_rels)
+				{
+					Oid		pubid = lfirst_oid(lc1);
+					Oid		pub_relid = lfirst_oid(lc2);
+					if (pubid == pub->oid)
+					{
+						publish = true;
+						if (pub->pubviaroot && pub_relid != relid)
+							ancestor = pub_relid;
+						break;
+					}
+				}
+			}
 
-			if (pub->alltables || list_member_oid(pubids, pub->oid))
+			if (publish)
 			{
 				entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
 				entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
 				entry->pubactions.pubdelete |= pub->pubactions.pubdelete;
-				entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				if (!OidIsValid(ancestor))
+					entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
+				entry->pubviaroot = pub->pubviaroot;
 			}
 
 			if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
@@ -604,6 +730,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 
 		list_free(pubids);
 
+		entry->replicate_as_relid = ancestor;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index dfd81f1..872569f 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -44,6 +44,7 @@
 #include "catalog/catalog.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
@@ -5313,7 +5314,7 @@ GetRelationPublicationActions(Relation relation)
 					  sizeof(PublicationActions));
 
 	/* Fetch the publication membership info. */
-	puboids = GetRelationPublications(RelationGetRelid(relation));
+	puboids = GetRelationPublications(RelationGetRelid(relation), NULL);
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
@@ -5332,7 +5333,9 @@ GetRelationPublicationActions(Relation relation)
 		pubactions->pubinsert |= pubform->pubinsert;
 		pubactions->pubupdate |= pubform->pubupdate;
 		pubactions->pubdelete |= pubform->pubdelete;
-		pubactions->pubtruncate |= pubform->pubtruncate;
+		if (!pubform->pubviaroot ||
+			!is_leaf_partition(RelationGetRelid(relation)))
+			pubactions->pubtruncate |= pubform->pubtruncate;
 
 		ReleaseSysCache(tup);
 
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 408637c..4d03608 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3868,6 +3868,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubviaroot;
 	int			i,
 				ntups;
 
@@ -3879,11 +3880,18 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubviaroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false as pubviaroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
@@ -3907,6 +3915,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubviaroot = PQfnumber(res, "pubviaroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3929,6 +3938,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubviaroot =
+			(strcmp(PQgetvalue(res, i, i_pubviaroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -4005,7 +4016,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubviaroot)
+		appendPQExpBufferStr(query, ", publish_via_partition_root = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 3e11166..61c909e 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -602,6 +602,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubviaroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 109245f..c731ed6 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5707,7 +5707,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5738,6 +5738,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubviaroot AS \"%s\"",
+						  gettext_noop("Publishes Using Root Schema"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5779,6 +5783,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubviaroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5791,6 +5796,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubviaroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5801,6 +5807,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubviaroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubviaroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5850,6 +5859,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubviaroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5862,6 +5873,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubviaroot)
+			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5870,6 +5883,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubviaroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 27873af..c6c1911 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -21,6 +21,7 @@
 
 extern Oid	get_partition_parent(Oid relid);
 extern List *get_partition_ancestors(Oid relid);
+extern bool is_leaf_partition(Oid relid);
 extern Oid	index_get_partition(Relation partition, Oid indexId);
 extern List *map_partition_varattnos(List *expr, int fromrel_varno,
 									 Relation to_rel, Relation from_rel);
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index bb52e8c..afa006f 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubviaroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,12 +76,13 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		pubviaroot;
 	PublicationActions pubactions;
 } Publication;
 
 extern Publication *GetPublication(Oid pubid);
 extern Publication *GetPublicationByName(const char *pubname, bool missing_ok);
-extern List *GetRelationPublications(Oid relid);
+extern List *GetRelationPublications(Oid relid, List **published_rels);
 
 /*---------
  * Expected values for pub_partopt parameter of GetRelationPublications(),
@@ -99,7 +102,7 @@ typedef enum PublicationPartOpt
 
 extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
 extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(void);
+extern List *GetAllTablesPublicationRelations(bool pubviaroot);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2634d2c..5b4e73d 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publish_via_partition_root = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                                        List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                                       Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                             Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -129,10 +131,10 @@ ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
 -- only parent is listed as being in publication, not the partition
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_parted"
 
@@ -143,6 +145,15 @@ HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
+\dRp+ testpub_forparted
+                                         Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | t
+Tables:
+    "public.testpub_parted"
+
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 -- fail - view
@@ -159,10 +170,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                          Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -200,10 +211,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -247,10 +258,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                          Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -260,20 +271,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                                    List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                                       List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 219e041..d844075 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publish_via_partition_root = '0');
 
 \dRp
 
@@ -87,6 +88,8 @@ UPDATE testpub_parted1 SET a = 1;
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
+\dRp+ testpub_forparted
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 5db1b21..beb708b 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 24;
+use Test::More tests => 51;
 
 # setup
 
@@ -48,7 +48,6 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1 (c text, a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
-
 $node_subscriber1->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
 $node_subscriber1->safe_psql('postgres',
@@ -87,6 +86,8 @@ $node_subscriber1->poll_query_until('postgres', $synced_query)
 $node_subscriber2->poll_query_until('postgres', $synced_query)
   or die "Timed out while waiting for subscriber to synchronize data";
 
+# Tests for replication using leaf partition identity and schema
+
 # insert
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1)");
@@ -260,3 +261,296 @@ is($result, qq(), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT a FROM tab1 ORDER BY 1");
 is($result, qq(), 'truncate of tab1 replicated');
+
+# Tests for replication using root table identity and schema
+
+# Publisher
+$node_publisher->safe_psql('postgres',
+	"DROP PUBLICATION pub1");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES IN (0, 1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_2 PARTITION OF tab2 FOR VALUES IN (5, 6)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3_1 PARTITION OF tab3 FOR VALUES IN (0, 1, 2, 3, 5, 6)");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub_all SET (publish_via_partition_root = true)");
+# Note: tab3_1's parent is not in the publication, in which case its
+# changes are published using its own identity.
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub_viaroot FOR TABLE tab2, tab3_1 WITH (publish_via_partition_root = true)");
+
+# Subscriber 1
+$node_subscriber1->safe_psql('postgres',
+	"DROP SUBSCRIPTION sub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub1_tab2', b text) PARTITION BY RANGE (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (c text DEFAULT 'sub1_tab2', b text, a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES FROM (0) TO (10)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (c text DEFAULT 'sub1_tab3_1', b text, a int NOT NULL PRIMARY KEY)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub_viaroot CONNECTION '$publisher_connstr' PUBLICATION pub_viaroot");
+
+# Subscriber 2
+$node_subscriber2->safe_psql('postgres',
+	"DROP TABLE tab1");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+# Note: tab1's partitions are named tab1_1 and tab1_2 on the publisher.
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab3', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab3_1', b text)");
+# Publication that sub2 points to now publishes via root, so must update
+# subscription target relations.
+$node_subscriber2->safe_psql('postgres',
+	"ALTER SUBSCRIPTION sub2 REFRESH PUBLICATION");
+
+# Wait for initial sync of all subscriptions
+$node_subscriber1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber2->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# insert
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (0)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_1 (a) VALUES (3)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (0), (3), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab3 VALUES (1), (0), (3), (5)");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub1_tab2|0
+sub1_tab2|1
+sub1_tab2|3
+sub1_tab2|5), 'inserts into tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab3_1 ORDER BY 1, 2");
+is($result, qq(sub1_tab3_1|0
+sub1_tab3_1|1
+sub1_tab3_1|3
+sub1_tab3_1|5), 'inserts into tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab1 ORDER BY 1, 2");
+is($result, qq(sub2_tab1|0
+sub2_tab1|1
+sub2_tab1|3
+sub2_tab1|5), 'inserts into tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub2_tab2|0
+sub2_tab2|1
+sub2_tab2|3
+sub2_tab2|5), 'inserts into tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab3 ORDER BY 1, 2");
+is($result, qq(sub2_tab3|0
+sub2_tab3|1
+sub2_tab3|3
+sub2_tab3|5), 'inserts into tab3 replicated');
+
+# update (replicated as update)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 6 WHERE a = 5");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub1_tab2|0
+sub1_tab2|1
+sub1_tab2|3
+sub1_tab2|6), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab3_1 ORDER BY 1, 2");
+is($result, qq(sub1_tab3_1|0
+sub1_tab3_1|1
+sub1_tab3_1|3
+sub1_tab3_1|6), 'update of tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab1 ORDER BY 1, 2");
+is($result, qq(sub2_tab1|0
+sub2_tab1|1
+sub2_tab1|3
+sub2_tab1|6), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub2_tab2|0
+sub2_tab2|1
+sub2_tab2|3
+sub2_tab2|6), 'update of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab3 ORDER BY 1, 2");
+is($result, qq(sub2_tab3|0
+sub2_tab3|1
+sub2_tab3|3
+sub2_tab3|6), 'update of tab3 replicated');
+
+# update (replicated as delete+insert)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 2 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 2 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 2 WHERE a = 6");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub1_tab2|0
+sub1_tab2|1
+sub1_tab2|2
+sub1_tab2|3), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab3_1 ORDER BY 1, 2");
+is($result, qq(sub1_tab3_1|0
+sub1_tab3_1|1
+sub1_tab3_1|2
+sub1_tab3_1|3), 'update of tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab1 ORDER BY 1, 2");
+is($result, qq(sub2_tab1|0
+sub2_tab1|1
+sub2_tab1|2
+sub2_tab1|3), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub2_tab2|0
+sub2_tab2|1
+sub2_tab2|2
+sub2_tab2|3), 'update of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab3 ORDER BY 1, 2");
+is($result, qq(sub2_tab3|0
+sub2_tab3|1
+sub2_tab3|2
+sub2_tab3|3), 'update of tab3 replicated');
+
+# delete
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab2");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab3");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT a FROM tab2");
+is($result, qq(), 'delete from tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab1");
+is($result, qq(), 'delete from tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab2");
+is($result, qq(), 'delete from tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab3");
+is($result, qq(), 'delete from tab3 replicated');
+
+# truncate
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (2), (5)");
+# these will NOT be replicated
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1_2, tab2_1, tab3_1");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT a FROM tab2 ORDER BY 1");
+is($result, qq(1
+2
+5), 'truncate of tab2_1 NOT replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab1 ORDER BY 1");
+is($result, qq(1
+2
+5), 'truncate of tab1_2 NOT replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab2 ORDER BY 1");
+is($result, qq(1
+2
+5), 'truncate of tab2_1 NOT replicated');
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1, tab2, tab3");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT a FROM tab2");
+is($result, qq(), 'truncate of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab1");
+is($result, qq(), 'truncate of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab2");
+is($result, qq(), 'truncate of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab3");
+is($result, qq(), 'truncate of tab3 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab3_1");
+is($result, qq(), 'truncate of tab3_1 replicated');
-- 
1.8.3.1

#69 Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#68)
2 attachment(s)
Re: adding partitioned tables to publications

On 2020-04-07 08:44, Amit Langote wrote:

I updated the patch to make the following changes:

* Rewrote the tests to match in style with those committed yesterday
* Renamed all variables that had pubasroot in it to have pubviaroot
instead to match the publication parameter
* Updated pg_publication catalog documentation

Thanks. I have some further questions:

The change in nodeModifyTable.c to add CheckValidResultRel() is unclear.
It doesn't seem to do anything, and it's not clear how it's related to
this patch.

The changes in GetRelationPublications() are confusing to me:

+   if (published_rels)
+   {
+       num = list_length(result);
+       for (i = 0; i < num; i++)
+           *published_rels = lappend_oid(*published_rels, relid);
+   }

This adds relid to the output list "num" times, where num is the number
of publications found. Shouldn't "i" be used in the loop somehow?
Similarly later in the function.

The descriptions of the new fields in RelationSyncEntry don't seem to
match the code accurately, or at least they are confusing.
replicate_as_relid is always filled in with an ancestor, even if
pubviaroot is not set.

I think the pubviaroot field is actually not necessary. We only need
replicate_as_relid.

There is a markup typo in logical-replication.sgml:

<xref linkend=="sql-createpublication"/>

In pg_dump, you missed updating a branch for an older version. See
attached patch.

Also attached a patch to rephrase the psql output a bit to make it not
so long.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-fixup-Allow-publishing-partition-changes-via-ancesto.patch (text/plain, charset=UTF-8)
From 572549aa23d4e3fa2d1abc0733d33f28cb692c40 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 7 Apr 2020 10:56:11 +0200
Subject: [PATCH 1/2] fixup! Allow publishing partition changes via ancestors

---
 src/bin/pg_dump/pg_dump.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4d03608596..c579227b19 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3891,14 +3891,14 @@ getPublications(Archive *fout)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false as pubviaroot "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false AS pubviaroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS pubtruncate, false AS pubviaroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 
-- 
2.26.0

0002-fixup-Allow-publishing-partition-changes-via-ancesto.patch (text/plain, charset=UTF-8)
From 3e12e0ff53e4d1aa5b78b6b7fa181e79ca280ef0 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 7 Apr 2020 10:58:55 +0200
Subject: [PATCH 2/2] fixup! Allow publishing partition changes via ancestors

---
 src/bin/psql/describe.c                   |  4 +-
 src/test/regress/expected/publication.out | 72 +++++++++++------------
 2 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c731ed6322..f05e914b4d 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5741,7 +5741,7 @@ listPublications(const char *pattern)
 	if (pset.sversion >= 130000)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubviaroot AS \"%s\"",
-						  gettext_noop("Publishes Using Root Schema"));
+						  gettext_noop("Via root"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5874,7 +5874,7 @@ describePublications(const char *pattern)
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
 		if (has_pubviaroot)
-			printTableAddHeader(&cont, gettext_noop("Publishes Using Root Schema"), true, align);
+			printTableAddHeader(&cont, gettext_noop("Via root"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 5b4e73d91b..63d6ab7a4e 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -28,18 +28,18 @@ ERROR:  unrecognized "publish" value: "cluster"
 CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publish_via_partition_root = '0');
 ERROR:  conflicting or redundant options
 \dRp
-                                                        List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
---------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+                                              List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+----------
  testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
  testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                                        List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
---------------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+                                              List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+----------
  testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
  testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
@@ -85,9 +85,9 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                                       Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+                              Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
  regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
@@ -100,18 +100,18 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                                             Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+                                    Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
  regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                                             Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+                                    Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
  regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
@@ -131,9 +131,9 @@ ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
 -- only parent is listed as being in publication, not the partition
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                                         Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+                               Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
  regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_parted"
@@ -147,9 +147,9 @@ ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 UPDATE testpub_parted1 SET a = 1;
 ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
 \dRp+ testpub_forparted
-                                         Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+                               Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
  regress_publication_user | f          | t       | t       | t       | t         | t
 Tables:
     "public.testpub_parted"
@@ -170,9 +170,9 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                                          Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+                                 Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
  regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
@@ -211,9 +211,9 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                                          Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+                                Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
  regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
@@ -258,9 +258,9 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                                          Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+                                Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
  regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
@@ -271,9 +271,9 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                                    List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
--------------+--------------------------+------------+---------+---------+---------+-----------+-----------------------------
+                                           List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+-------------+--------------------------+------------+---------+---------+---------+-----------+----------
  testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
@@ -281,9 +281,9 @@ ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                                       List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Publishes Using Root Schema 
------------------+---------------------------+------------+---------+---------+---------+-----------+-----------------------------
+                                             List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+----------
  testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
-- 
2.26.0

#70 Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#69)
1 attachment(s)
Re: adding partitioned tables to publications

Thanks for the review.

On Tue, Apr 7, 2020 at 6:01 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-04-07 08:44, Amit Langote wrote:

I updated the patch to make the following changes:

* Rewrote the tests to match in style with those committed yesterday
* Renamed all variables that had pubasroot in it to have pubviaroot
instead to match the publication parameter
* Updated pg_publication catalog documentation

Thanks. I have some further questions:

The change in nodeModifyTable.c to add CheckValidResultRel() is unclear.
It doesn't seem to do anything, and it's not clear how it's related to
this patch.

CheckValidResultRel() checks that a replica identity is present for
replicating a given update/delete, which, I think, is better performed
on the root table itself rather than on some partition that would be
affected. The latter already occurs by way of CheckValidResultRel()
being called on the partitions to be updated. I think we get a more
helpful message if the root parent is flagged instead of a partition.

update prt1 set b = b + 1 where a = 578;
ERROR: cannot update table "prt1" because it does not have a replica
identity and publishes updates
HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.

vs.

-- checking the partition
update prt1 set b = b + 1 where a = 578;
ERROR: cannot update table "prt1_p3" because it does not have a
replica identity and publishes updates
HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.

I am okay with getting rid of the check on the root table if flagging
individual partitions seems good enough.

The changes in GetRelationPublications() are confusing to me:

+   if (published_rels)
+   {
+       num = list_length(result);
+       for (i = 0; i < num; i++)
+           *published_rels = lappend_oid(*published_rels, relid);
+   }

This adds relid to the output list "num" times, where num is the number
of publications found. Shouldn't "i" be used in the loop somehow?
Similarly later in the function.

published_rels contains an *OID* for each publication that will be in
result. Callers should iterate the two lists together; for each
publication found in result, the caller can tell which relation it is
associated with from the OID at the same position in published_rels,
scanned in parallel. If publishing through an ancestor's publication, we
need to know which ancestor, hence the whole dance.

I have thought this to be a bit ugly before, but after having to
explain it, I think it's better to use some other approach for this.
I have updated the patch so that GetRelationPublications no longer
considers a relation's ancestors. That way, it doesn't have to
second-guess what other information will be needed by the caller.

I hope that's clearer, because all the logic is in one place and that
is get_rel_sync_entry().

The descriptions of the new fields in RelationSyncEntry don't seem to
match the code accurately, or at least they are confusing.
replicate_as_relid is always filled in with an ancestor, even if
pubviaroot is not set.

Given this confusion, I have changed how replicate_as_relid works so
that it's now always set -- if different from the relation's own OID,
the code for "publishing via root" kicks in in various places.

I think the pubviaroot field is actually not necessary. We only need
replicate_as_relid.

Looking through the code, I agree. I guess I only kept it around to
go with pubupdate, etc.

I guess it might also be a good idea to call it publish_as_relid
instead of replicate_as_relid for consistency.

There is a markup typo in logical-replication.sgml:

<xref linkend=="sql-createpublication"/>

Oops, fixed.

In pg_dump, you missed updating a branch for an older version. See
attached patch.

Also attached a patch to rephrase the psql output a bit to make it not
so long.

Thank you, merged.

Attached updated patch with above changes.

--
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v19-0001-Allow-publishing-partition-changes-via-ancestors.patch (application/octet-stream)
From baa9fc0cd40f040e59070c8e479afa3a16a74409 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v19] Allow publishing partition changes via ancestors

To control whether partition changes are replicated using their
own identity and schema or an ancestor's, add a new parameter
that can be set per publication named 'publish_via_partition_root'.
---
 doc/src/sgml/catalogs.sgml                  |  10 +
 doc/src/sgml/logical-replication.sgml       |  12 +-
 doc/src/sgml/ref/create_publication.sgml    |  17 ++
 src/backend/catalog/partition.c             |   1 +
 src/backend/catalog/pg_publication.c        |  70 +++----
 src/backend/commands/publicationcmds.c      |  95 +++++----
 src/backend/executor/nodeModifyTable.c      |   9 +
 src/backend/replication/pgoutput/pgoutput.c | 226 +++++++++++++++++----
 src/backend/utils/cache/relcache.c          |  15 ++
 src/bin/pg_dump/pg_dump.c                   |  24 ++-
 src/bin/pg_dump/pg_dump.h                   |   1 +
 src/bin/psql/describe.c                     |  17 +-
 src/include/catalog/pg_publication.h        |   5 +-
 src/test/regress/expected/publication.out   | 103 +++++-----
 src/test/regress/sql/publication.sql        |   3 +
 src/test/subscription/t/013_partition.pl    | 298 +++++++++++++++++++++++++++-
 16 files changed, 734 insertions(+), 172 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 64614b5..4e9cd7f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5437,6 +5437,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       <entry>If true, <command>TRUNCATE</command> operations are replicated for
        tables in the publication.</entry>
      </row>
+
+     <row>
+      <entry><structfield>pubviaroot</structfield></entry>
+      <entry><type>bool</type></entry>
+      <entry></entry>
+      <entry>If true, operations on a leaf partition are replicated using the
+       identity and schema of its topmost partitioned ancestor mentioned in the
+       publication instead of its own.
+      </entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c513621..3c34d36 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -411,10 +411,14 @@
    <listitem>
     <para>
      When replicating between partitioned tables, the actual replication
-     originates from the leaf partitions on the publisher, so partitions on
-     the publisher must also exist on the subscriber as valid target tables.
-     (They could either be leaf partitions themselves, or they could be
-     further subpartitioned, or they could even be independent tables.)
+     originates, by default, from the leaf partitions on the publisher, so
+     partitions on the publisher must also exist on the subscriber as valid
+     target tables. (They could either be leaf partitions themselves, or they
+     could be further subpartitioned, or they could even be independent
+     tables.)  Publications can also specify changes to be replicated using
+     partitioned table identity and schema instead of that of the individual
+     leaf partitions in which the changes actually originate.
+     (See <xref linkend="sql-createpublication"/>.)
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 597cb28..f796d9b 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -123,6 +123,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_via_partition_root</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table (or on its partitions) contained in the publication will be
+          published using its own schema rather than of the individual
+          partitions which are actually changed; the latter is the default.
+          Setting it to <literal>true</literal> allows the changes to be
+          replicated into a non-partitioned table or a partitioned table
+          consisting of a different set of partitions.  However,
+          <literal>TRUNCATE</literal> operations performed directly on
+          partitions are not replicated.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 239ac01..d1fe6e7 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -28,6 +28,7 @@
 #include "partitioning/partbounds.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/partcache.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 500a5ae..68f6887 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -42,8 +42,6 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
-static List *get_rel_publications(Oid relid);
-
 /*
  * Check if relation can be in given publication and throws appropriate
  * error if not.
@@ -216,39 +214,11 @@ publication_add_relation(Oid pubid, Relation targetrel,
 	return myself;
 }
 
-
-/*
- * Gets list of publication oids for a relation, plus those of ancestors,
- * if any, if the relation is a partition.
- */
+/* Gets list of publication oids for a relation */
 List *
 GetRelationPublications(Oid relid)
 {
 	List	   *result = NIL;
-
-	result = get_rel_publications(relid);
-	if (get_rel_relispartition(relid))
-	{
-		List	   *ancestors = get_partition_ancestors(relid);
-		ListCell   *lc;
-
-		foreach(lc, ancestors)
-		{
-			Oid			ancestor = lfirst_oid(lc);
-			List	   *ancestor_pubs = get_rel_publications(ancestor);
-
-			result = list_concat(result, ancestor_pubs);
-		}
-	}
-
-	return result;
-}
-
-/* Workhorse of GetRelationPublications() */
-static List *
-get_rel_publications(Oid relid)
-{
-	List	   *result = NIL;
 	CatCList   *pubrellist;
 	int			i;
 
@@ -373,9 +343,13 @@ GetAllTablesPublications(void)
 
 /*
  * Gets list of all relation published by FOR ALL TABLES publication(s).
+ *
+ * If the publication publishes partition changes via their respective root
+ * partitioned tables, we must exclude partitions in favor of including the
+ * root partitioned tables.
  */
 List *
-GetAllTablesPublicationRelations(void)
+GetAllTablesPublicationRelations(bool pubviaroot)
 {
 	Relation	classRel;
 	ScanKeyData key[1];
@@ -397,12 +371,35 @@ GetAllTablesPublicationRelations(void)
 		Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
 		Oid			relid = relForm->oid;
 
-		if (is_publishable_class(relid, relForm))
+		if (is_publishable_class(relid, relForm) &&
+			!(relForm->relispartition && pubviaroot))
 			result = lappend_oid(result, relid);
 	}
 
 	table_endscan(scan);
-	table_close(classRel, AccessShareLock);
+
+	if (pubviaroot)
+	{
+		ScanKeyInit(&key[0],
+					Anum_pg_class_relkind,
+					BTEqualStrategyNumber, F_CHAREQ,
+					CharGetDatum(RELKIND_PARTITIONED_TABLE));
+
+		scan = table_beginscan_catalog(classRel, 1, key);
+
+		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+		{
+			Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
+			Oid			relid = relForm->oid;
+
+			if (is_publishable_class(relid, relForm) &&
+				!relForm->relispartition)
+				result = lappend_oid(result, relid);
+		}
+
+		table_endscan(scan);
+		table_close(classRel, AccessShareLock);
+	}
 
 	return result;
 }
@@ -433,6 +430,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->pubviaroot = pubform->pubviaroot;
 
 	ReleaseSysCache(tup);
 
@@ -533,9 +531,11 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		 * need those.
 		 */
 		if (publication->alltables)
-			tables = GetAllTablesPublicationRelations();
+			tables = GetAllTablesPublicationRelations(publication->pubviaroot);
 		else
 			tables = GetPublicationRelations(publication->oid,
+											 publication->pubviaroot ?
+											 PUBLICATION_PART_ROOT :
 											 PUBLICATION_PART_LEAF);
 		funcctx->user_fctx = (void *) tables;
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 494c0bd..ffc5ab2 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -23,6 +23,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -56,20 +57,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_via_partition_root_given,
+						  bool *publish_via_partition_root)
 {
 	ListCell   *lc;
 
+	*publish_via_partition_root_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* Relation changes published as of itself by default. */
+	*publish_via_partition_root = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -91,10 +95,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -110,19 +114,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_via_partition_root") == 0)
+		{
+			if (*publish_via_partition_root_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_via_partition_root_given = true;
+			*publish_via_partition_root = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -143,10 +156,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_via_partition_root_given;
+	bool		publish_via_partition_root;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -183,9 +195,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_via_partition_root_given,
+							  &publish_via_partition_root);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -193,13 +205,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubviaroot - 1] =
+		BoolGetDatum(publish_via_partition_root);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -251,17 +265,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_via_partition_root_given;
+	bool		publish_via_partition_root;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_via_partition_root_given,
+							  &publish_via_partition_root);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -270,19 +283,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_via_partition_root_given)
+	{
+		values[Anum_pg_publication_pubviaroot - 1] = BoolGetDatum(publish_via_partition_root);
+		replaces[Anum_pg_publication_pubviaroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d71c0a4..c312c7f 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2320,9 +2320,18 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
 
+		/*
+		 * Check the replica identity.  Checking the individual partitions, as
+		 * done below, would suffice, but checking the root relation first
+		 * gives a more useful error message if the required replica identity
+		 * is missing.
+		 */
+		CheckValidResultRel(mtstate->rootResultRelInfo, operation);
+	}
+
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
 
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 552a70c..ee54a17 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,8 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -20,6 +22,7 @@
 #include "replication/pgoutput.h"
 #include "utils/int8.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -49,6 +52,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +63,30 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If publish_as_relid is set to an ancestor,
+	 * the ancestor's schema must also have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * OID of the relation to publish changes as.  For a partition, this may
+	 * be set to one of its ancestors whose schema will be used when
+	 * replicating changes, if publish_via_partition_root is set for the
+	 * publication.
+	 */
+	Oid			publish_as_relid;
+
+	/*
+	 * Map used when replicating using an ancestor's schema to convert tuples
+	 * Map used to convert tuples from the partition's type to the ancestor's
+	 * when replicating using an ancestor's schema; NULL if publish_as_relid
+	 * is the same as 'relid' or if no conversion is needed because the
+	 * partition and the ancestor have identical TupleDescs.
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +284,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write out the schema of the relation, and of the ancestor it is published
+ * as (if any), unless that has already been done.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (relentry->publish_as_relid != RelationGetRelid(relation))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->publish_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
+
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		desc = RelationGetDescr(relation);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+/*
+ * Send the schema of the given relation: type info for any user-defined
+ * attribute types, followed by the relation message itself.
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+		if (att->attisdropped || att->attgenerated)
+			continue;
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +396,65 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Switch relation if publishing via root. */
+				if (relentry->publish_as_relid != RelationGetRelid(relation))
+				{
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->publish_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Switch relation if publishing via root. */
+				if (relentry->publish_as_relid != RelationGetRelid(relation))
+				{
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->publish_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						if (oldtuple)
+							oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Switch relation if publishing via root. */
+				if (relentry->publish_as_relid != RelationGetRelid(relation))
+				{
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->publish_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -412,10 +499,19 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 			continue;
 
 		/*
-		 * Don't send partitioned tables, because partitions should be sent
-		 * instead.
+		 * Don't send partitioned tables unless the publication wants changes
+		 * to be sent via the root tables, because the partitions will be sent
+		 * instead.
+		 */
+		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+			relentry->publish_as_relid != relid)
+			continue;
+
+		/*
+		 * Don't send partitions if the publication wants changes to be sent
+		 * only via the root tables.
 		 */
-		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		if (relation->rd_rel->relispartition &&
+			relentry->publish_as_relid != relid)
 			continue;
 
 		relids[nrelids++] = relid;
@@ -540,12 +636,14 @@ init_rel_sync_cache(MemoryContext cachectx)
  * This looks up publications that the given relation is directly or
  * indirectly part of (the latter if it's really the relation's ancestor that
  * is part of a publication) and fills up the found entry with the information
- * about which operations to publish.
+ * about which operations to publish and whether to use an ancestor's schema
+ * when publishing.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
 {
 	RelationSyncEntry *entry;
+	bool		am_partition = get_rel_relispartition(relid);
 	bool		found;
 	MemoryContext oldctx;
 
@@ -564,6 +662,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	{
 		List	   *pubids = GetRelationPublications(relid);
 		ListCell   *lc;
+		Oid			publish_as_relid = relid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -588,8 +687,52 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
+			bool		publish = false;
+
+			if (pub->alltables)
+			{
+				publish = true;
+				if (pub->pubviaroot)
+					publish_as_relid = am_partition ?
+								llast_oid(get_partition_ancestors(relid)) :
+								relid;
+			}
+
+			if (!publish)
+			{
+				bool	ancestor_published = false;
+
+				/*
+				 * For a partition, check if any of the ancestors are
+				 * published.  If so, note down the topmost ancestor that is
+				 * published via this publication, which will be used as the
+				 * relation via which to publish the partition's changes.
+				 */
+				if (am_partition)
+				{
+					List   *ancestors = get_partition_ancestors(relid);
+					ListCell *lc2;
+
+					/* Find the "topmost" ancestor that is in this publication. */
+					foreach(lc2, ancestors)
+					{
+						Oid		ancestor = lfirst_oid(lc2);
+
+						if (list_member_oid(GetRelationPublications(ancestor),
+											pub->oid))
+						{
+							ancestor_published = true;
+							if (pub->pubviaroot)
+								publish_as_relid = ancestor;
+						}
+					}
+				}
+
+				if (list_member_oid(pubids, pub->oid) || ancestor_published)
+					publish = true;
+			}
 
-			if (pub->alltables || list_member_oid(pubids, pub->oid))
+			if (publish)
 			{
 				entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
 				entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
@@ -604,6 +747,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 
 		list_free(pubids);
 
+		entry->publish_as_relid = publish_as_relid;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index dfd81f1..9f1f11d 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -44,6 +44,7 @@
 #include "catalog/catalog.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
@@ -5314,6 +5315,20 @@ GetRelationPublicationActions(Relation relation)
 
 	/* Fetch the publication membership info. */
 	puboids = GetRelationPublications(RelationGetRelid(relation));
+	if (relation->rd_rel->relispartition)
+	{
+		/* Add publications that the ancestors are in too. */
+		List   *ancestors = get_partition_ancestors(RelationGetRelid(relation));
+		ListCell *lc;
+
+		foreach(lc, ancestors)
+		{
+			Oid		ancestor = lfirst_oid(lc);
+
+			puboids = list_concat_unique_oid(puboids,
+											 GetRelationPublications(ancestor));
+		}
+	}
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 408637c..c579227 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3868,6 +3868,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubviaroot;
 	int			i,
 				ntups;
 
@@ -3879,18 +3880,25 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubviaroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false AS pubviaroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS pubtruncate, false AS pubviaroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 
@@ -3907,6 +3915,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubviaroot = PQfnumber(res, "pubviaroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3929,6 +3938,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubviaroot =
+			(strcmp(PQgetvalue(res, i, i_pubviaroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -4005,7 +4016,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubviaroot)
+		appendPQExpBufferStr(query, ", publish_via_partition_root = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 3e11166..61c909e 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -602,6 +602,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubviaroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 109245f..f05e914 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5707,7 +5707,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5738,6 +5738,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubviaroot AS \"%s\"",
+						  gettext_noop("Via root"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5779,6 +5783,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubviaroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5791,6 +5796,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubviaroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5801,6 +5807,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubviaroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubviaroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5850,6 +5859,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubviaroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5862,6 +5873,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubviaroot)
+			printTableAddHeader(&cont, gettext_noop("Via root"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5870,6 +5883,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubviaroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index bb52e8c..ec02f48 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubviaroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,6 +76,7 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		pubviaroot;
 	PublicationActions pubactions;
 } Publication;
 
@@ -99,7 +102,7 @@ typedef enum PublicationPartOpt
 
 extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
 extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(void);
+extern List *GetAllTablesPublicationRelations(bool pubviaroot);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2634d2c..63d6ab7 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publish_via_partition_root = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                              List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+----------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                              List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+----------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                              Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                    Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                    Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -129,10 +131,10 @@ ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
 -- only parent is listed as being in publication, not the partition
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                               Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_parted"
 
@@ -143,6 +145,15 @@ HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
+\dRp+ testpub_forparted
+                               Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | t         | t
+Tables:
+    "public.testpub_parted"
+
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 -- fail - view
@@ -159,10 +170,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                 Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -200,10 +211,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -247,10 +258,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -260,20 +271,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                           List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+-------------+--------------------------+------------+---------+---------+---------+-----------+----------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                             List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+----------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 219e041..d844075 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publish_via_partition_root = '0');
 
 \dRp
 
@@ -87,6 +88,8 @@ UPDATE testpub_parted1 SET a = 1;
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
+\dRp+ testpub_forparted
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 5db1b21..beb708b 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 24;
+use Test::More tests => 51;
 
 # setup
 
@@ -48,7 +48,6 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1 (c text, a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
-
 $node_subscriber1->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
 $node_subscriber1->safe_psql('postgres',
@@ -87,6 +86,8 @@ $node_subscriber1->poll_query_until('postgres', $synced_query)
 $node_subscriber2->poll_query_until('postgres', $synced_query)
   or die "Timed out while waiting for subscriber to synchronize data";
 
+# Tests for replication using leaf partition identity and schema
+
 # insert
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1)");
@@ -260,3 +261,296 @@ is($result, qq(), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT a FROM tab1 ORDER BY 1");
 is($result, qq(), 'truncate of tab1 replicated');
+
+# Tests for replication using root table identity and schema
+
+# Publisher
+$node_publisher->safe_psql('postgres',
+	"DROP PUBLICATION pub1");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES IN (0, 1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_2 PARTITION OF tab2 FOR VALUES IN (5, 6)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3_1 PARTITION OF tab3 FOR VALUES IN (0, 1, 2, 3, 5, 6)");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub_all SET (publish_via_partition_root = true)");
+# Note: tab3_1's parent is not in the publication, in which case its
+# changes are published using its own identity.
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub_viaroot FOR TABLE tab2, tab3_1 WITH (publish_via_partition_root = true)");
+
+# Subscriber 1
+$node_subscriber1->safe_psql('postgres',
+	"DROP SUBSCRIPTION sub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub1_tab2', b text) PARTITION BY RANGE (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (c text DEFAULT 'sub1_tab2', b text, a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES FROM (0) TO (10)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (c text DEFAULT 'sub1_tab3_1', b text, a int NOT NULL PRIMARY KEY)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub_viaroot CONNECTION '$publisher_connstr' PUBLICATION pub_viaroot");
+
+# Subscriber 2
+$node_subscriber2->safe_psql('postgres',
+	"DROP TABLE tab1");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+# Note: tab1's partitions are named tab1_1 and tab1_2 on the publisher.
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab3', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab3_1', b text)");
+# Publication that sub2 points to now publishes via root, so must update
+# subscription target relations.
+$node_subscriber2->safe_psql('postgres',
+	"ALTER SUBSCRIPTION sub2 REFRESH PUBLICATION");
+
+# Wait for initial sync of all subscriptions
+$node_subscriber1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber2->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# insert
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (0)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_1 (a) VALUES (3)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (0), (3), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab3 VALUES (1), (0), (3), (5)");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub1_tab2|0
+sub1_tab2|1
+sub1_tab2|3
+sub1_tab2|5), 'inserts into tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab3_1 ORDER BY 1, 2");
+is($result, qq(sub1_tab3_1|0
+sub1_tab3_1|1
+sub1_tab3_1|3
+sub1_tab3_1|5), 'inserts into tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab1 ORDER BY 1, 2");
+is($result, qq(sub2_tab1|0
+sub2_tab1|1
+sub2_tab1|3
+sub2_tab1|5), 'inserts into tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub2_tab2|0
+sub2_tab2|1
+sub2_tab2|3
+sub2_tab2|5), 'inserts into tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab3 ORDER BY 1, 2");
+is($result, qq(sub2_tab3|0
+sub2_tab3|1
+sub2_tab3|3
+sub2_tab3|5), 'inserts into tab3 replicated');
+
+# update (replicated as update)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 6 WHERE a = 5");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub1_tab2|0
+sub1_tab2|1
+sub1_tab2|3
+sub1_tab2|6), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab3_1 ORDER BY 1, 2");
+is($result, qq(sub1_tab3_1|0
+sub1_tab3_1|1
+sub1_tab3_1|3
+sub1_tab3_1|6), 'update of tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab1 ORDER BY 1, 2");
+is($result, qq(sub2_tab1|0
+sub2_tab1|1
+sub2_tab1|3
+sub2_tab1|6), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub2_tab2|0
+sub2_tab2|1
+sub2_tab2|3
+sub2_tab2|6), 'update of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab3 ORDER BY 1, 2");
+is($result, qq(sub2_tab3|0
+sub2_tab3|1
+sub2_tab3|3
+sub2_tab3|6), 'update of tab3 replicated');
+
+# update (replicated as delete+insert)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 2 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 2 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 2 WHERE a = 6");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub1_tab2|0
+sub1_tab2|1
+sub1_tab2|2
+sub1_tab2|3), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab3_1 ORDER BY 1, 2");
+is($result, qq(sub1_tab3_1|0
+sub1_tab3_1|1
+sub1_tab3_1|2
+sub1_tab3_1|3), 'update of tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab1 ORDER BY 1, 2");
+is($result, qq(sub2_tab1|0
+sub2_tab1|1
+sub2_tab1|2
+sub2_tab1|3), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub2_tab2|0
+sub2_tab2|1
+sub2_tab2|2
+sub2_tab2|3), 'update of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab3 ORDER BY 1, 2");
+is($result, qq(sub2_tab3|0
+sub2_tab3|1
+sub2_tab3|2
+sub2_tab3|3), 'update of tab3 replicated');
+
+# delete
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab2");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab3");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT a FROM tab2");
+is($result, qq(), 'delete from tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab1");
+is($result, qq(), 'delete from tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab2");
+is($result, qq(), 'delete from tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab3");
+is($result, qq(), 'delete from tab3 replicated');
+
+# truncate
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (2), (5)");
+# these will NOT be replicated
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1_2, tab2_1, tab3_1");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT a FROM tab2 ORDER BY 1");
+is($result, qq(1
+2
+5), 'truncate of tab2_1 NOT replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab1 ORDER BY 1");
+is($result, qq(1
+2
+5), 'truncate of tab1_2 NOT replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab2 ORDER BY 1");
+is($result, qq(1
+2
+5), 'truncate of tab2_1 NOT replicated');
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1, tab2, tab3");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT a FROM tab2");
+is($result, qq(), 'truncate of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab1");
+is($result, qq(), 'truncate of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab2");
+is($result, qq(), 'truncate of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab3");
+is($result, qq(), 'truncate of tab3 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab3_1");
+is($result, qq(), 'truncate of tab3_1 replicated');
-- 
1.8.3.1

#71Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#70)
2 attachment(s)
Re: adding partitioned tables to publications

On Wed, Apr 8, 2020 at 1:22 AM Amit Langote <amitlangote09@gmail.com> wrote:
> On Tue, Apr 7, 2020 at 6:01 PM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
> > The descriptions of the new fields in RelationSyncEntry don't seem to
> > match the code accurately, or at least it's confusing.
> > replicate_as_relid is always filled in with an ancestor, even if
> > pubviaroot is not set.
>
> Given this confusion, I have changed how replicate_as_relid works so
> that it's now always set -- if different from the relation's own OID,
> the code for "publishing via root" kicks in in various places.
>
> > I think the pubviaroot field is actually not necessary.  We only need
> > replicate_as_relid.
>
> Looking through the code, I agree.  I guess I only kept it around to
> go with pubupdate, etc.

Think I broke truncate replication with this.  Fixed in the attached
updated patch.
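
For anyone trying the patch, the new option can be exercised roughly
like this (a sketch; the table, publication, and subscription names
are made up for illustration):

```sql
-- Publisher: with publish_via_partition_root = true, changes to p's
-- partitions are replicated using p's identity and schema rather
-- than the individual partitions'.
CREATE TABLE p (a int PRIMARY KEY, b text) PARTITION BY RANGE (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES FROM (0) TO (10);
CREATE PUBLICATION pub_p FOR TABLE p
    WITH (publish_via_partition_root = true);

-- Subscriber: the target may be a plain table, or a partitioned
-- table with a different set of partitions.
CREATE TABLE p (a int PRIMARY KEY, b text);
CREATE SUBSCRIPTION sub_p CONNECTION '...' PUBLICATION pub_p;
```

Note that, per the create_publication.sgml change in the patch,
TRUNCATE run directly on a partition is not replicated in this mode.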

--

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v20-0001-Allow-publishing-partition-changes-via-ancestors.patch (application/octet-stream)
From 740955c9162ffa66f1a844cd8dbd51cbeec51e10 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 29 Nov 2019 17:40:11 +0900
Subject: [PATCH v20] Allow publishing partition changes via ancestors

To control whether partition changes are replicated using their
own identity and schema or an ancestor's, add a new parameter
that can be set per publication named 'publish_via_partition_root'.
---
 doc/src/sgml/catalogs.sgml                  |  10 +
 doc/src/sgml/logical-replication.sgml       |  12 +-
 doc/src/sgml/ref/create_publication.sgml    |  17 ++
 src/backend/catalog/pg_publication.c        |  70 +++----
 src/backend/commands/publicationcmds.c      |  95 +++++----
 src/backend/executor/nodeModifyTable.c      |   9 +
 src/backend/replication/pgoutput/pgoutput.c | 223 +++++++++++++++++----
 src/backend/utils/cache/relcache.c          |  15 ++
 src/bin/pg_dump/pg_dump.c                   |  24 ++-
 src/bin/pg_dump/pg_dump.h                   |   1 +
 src/bin/psql/describe.c                     |  17 +-
 src/include/catalog/pg_publication.h        |   5 +-
 src/test/regress/expected/publication.out   | 103 +++++-----
 src/test/regress/sql/publication.sql        |   3 +
 src/test/subscription/t/013_partition.pl    | 298 +++++++++++++++++++++++++++-
 15 files changed, 730 insertions(+), 172 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 0d61d98..386c6d7 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5437,6 +5437,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       <entry>If true, <command>TRUNCATE</command> operations are replicated for
        tables in the publication.</entry>
      </row>
+
+     <row>
+      <entry><structfield>pubviaroot</structfield></entry>
+      <entry><type>bool</type></entry>
+      <entry></entry>
+      <entry>If true, operations on a leaf partition are replicated using the
+       identity and schema of its topmost partitioned ancestor mentioned in the
+       publication instead of its own.
+      </entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c513621..3c34d36 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -411,10 +411,14 @@
    <listitem>
     <para>
      When replicating between partitioned tables, the actual replication
-     originates from the leaf partitions on the publisher, so partitions on
-     the publisher must also exist on the subscriber as valid target tables.
-     (They could either be leaf partitions themselves, or they could be
-     further subpartitioned, or they could even be independent tables.)
+     originates, by default, from the leaf partitions on the publisher, so
+     partitions on the publisher must also exist on the subscriber as valid
+     target tables. (They could either be leaf partitions themselves, or they
+     could be further subpartitioned, or they could even be independent
+     tables.)  Publications can also specify changes to be replicated using
+     partitioned table identity and schema instead of that of the individual
+     leaf partitions in which the changes actually originate.
+     (See <xref linkend="sql-createpublication"/>.)
     </para>
    </listitem>
   </itemizedlist>
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index 597cb28..f796d9b 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -123,6 +123,23 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
          </para>
         </listitem>
        </varlistentry>
+
+       <varlistentry>
+        <term><literal>publish_via_partition_root</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          This parameter determines whether DML operations on a partitioned
+          table (or on its partitions) contained in the publication will be
+          published using its own schema rather than of the individual
+          partitions which are actually changed; the latter is the default.
+          Setting it to <literal>true</literal> allows the changes to be
+          replicated into a non-partitioned table or a partitioned table
+          consisting of a different set of partitions.  However,
+          <literal>TRUNCATE</literal> operations performed directly on
+          partitions are not replicated.
+         </para>
+        </listitem>
+       </varlistentry>
       </variablelist>
 
      </para>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 500a5ae..68f6887 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -42,8 +42,6 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
-static List *get_rel_publications(Oid relid);
-
 /*
  * Check if relation can be in given publication and throws appropriate
  * error if not.
@@ -216,39 +214,11 @@ publication_add_relation(Oid pubid, Relation targetrel,
 	return myself;
 }
 
-
-/*
- * Gets list of publication oids for a relation, plus those of ancestors,
- * if any, if the relation is a partition.
- */
+/* Gets list of publication oids for a relation */
 List *
 GetRelationPublications(Oid relid)
 {
 	List	   *result = NIL;
-
-	result = get_rel_publications(relid);
-	if (get_rel_relispartition(relid))
-	{
-		List	   *ancestors = get_partition_ancestors(relid);
-		ListCell   *lc;
-
-		foreach(lc, ancestors)
-		{
-			Oid			ancestor = lfirst_oid(lc);
-			List	   *ancestor_pubs = get_rel_publications(ancestor);
-
-			result = list_concat(result, ancestor_pubs);
-		}
-	}
-
-	return result;
-}
-
-/* Workhorse of GetRelationPublications() */
-static List *
-get_rel_publications(Oid relid)
-{
-	List	   *result = NIL;
 	CatCList   *pubrellist;
 	int			i;
 
@@ -373,9 +343,13 @@ GetAllTablesPublications(void)
 
 /*
  * Gets list of all relation published by FOR ALL TABLES publication(s).
+ *
+ * If the publication publishes partition changes via their respective root
+ * partitioned tables, we must exclude partitions in favor of including the
+ * root partitioned tables.
  */
 List *
-GetAllTablesPublicationRelations(void)
+GetAllTablesPublicationRelations(bool pubviaroot)
 {
 	Relation	classRel;
 	ScanKeyData key[1];
@@ -397,12 +371,35 @@ GetAllTablesPublicationRelations(void)
 		Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
 		Oid			relid = relForm->oid;
 
-		if (is_publishable_class(relid, relForm))
+		if (is_publishable_class(relid, relForm) &&
+			!(relForm->relispartition && pubviaroot))
 			result = lappend_oid(result, relid);
 	}
 
 	table_endscan(scan);
-	table_close(classRel, AccessShareLock);
+
+	if (pubviaroot)
+	{
+		ScanKeyInit(&key[0],
+					Anum_pg_class_relkind,
+					BTEqualStrategyNumber, F_CHAREQ,
+					CharGetDatum(RELKIND_PARTITIONED_TABLE));
+
+		scan = table_beginscan_catalog(classRel, 1, key);
+
+		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+		{
+			Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
+			Oid			relid = relForm->oid;
+
+			if (is_publishable_class(relid, relForm) &&
+				!relForm->relispartition)
+				result = lappend_oid(result, relid);
+		}
+
+		table_endscan(scan);
+	}
+	table_close(classRel, AccessShareLock);
 
 	return result;
 }
@@ -433,6 +430,7 @@ GetPublication(Oid pubid)
 	pub->pubactions.pubupdate = pubform->pubupdate;
 	pub->pubactions.pubdelete = pubform->pubdelete;
 	pub->pubactions.pubtruncate = pubform->pubtruncate;
+	pub->pubviaroot = pubform->pubviaroot;
 
 	ReleaseSysCache(tup);
 
@@ -533,9 +531,11 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		 * need those.
 		 */
 		if (publication->alltables)
-			tables = GetAllTablesPublicationRelations();
+			tables = GetAllTablesPublicationRelations(publication->pubviaroot);
 		else
 			tables = GetPublicationRelations(publication->oid,
+											 publication->pubviaroot ?
+											 PUBLICATION_PART_ROOT :
 											 PUBLICATION_PART_LEAF);
 		funcctx->user_fctx = (void *) tables;
 
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 494c0bd..ffc5ab2 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -23,6 +23,7 @@
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/objectaddress.h"
+#include "catalog/partition.h"
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -56,20 +57,23 @@ static void PublicationDropTables(Oid pubid, List *rels, bool missing_ok);
 static void
 parse_publication_options(List *options,
 						  bool *publish_given,
-						  bool *publish_insert,
-						  bool *publish_update,
-						  bool *publish_delete,
-						  bool *publish_truncate)
+						  PublicationActions *pubactions,
+						  bool *publish_via_partition_root_given,
+						  bool *publish_via_partition_root)
 {
 	ListCell   *lc;
 
+	*publish_via_partition_root_given = false;
 	*publish_given = false;
 
 	/* Defaults are true */
-	*publish_insert = true;
-	*publish_update = true;
-	*publish_delete = true;
-	*publish_truncate = true;
+	pubactions->pubinsert = true;
+	pubactions->pubupdate = true;
+	pubactions->pubdelete = true;
+	pubactions->pubtruncate = true;
+
+	/* By default, a relation's changes are published using its own identity. */
+	*publish_via_partition_root = false;
 
 	/* Parse options */
 	foreach(lc, options)
@@ -91,10 +95,10 @@ parse_publication_options(List *options,
 			 * If publish option was given only the explicitly listed actions
 			 * should be published.
 			 */
-			*publish_insert = false;
-			*publish_update = false;
-			*publish_delete = false;
-			*publish_truncate = false;
+			pubactions->pubinsert = false;
+			pubactions->pubupdate = false;
+			pubactions->pubdelete = false;
+			pubactions->pubtruncate = false;
 
 			*publish_given = true;
 			publish = defGetString(defel);
@@ -110,19 +114,28 @@ parse_publication_options(List *options,
 				char	   *publish_opt = (char *) lfirst(lc);
 
 				if (strcmp(publish_opt, "insert") == 0)
-					*publish_insert = true;
+					pubactions->pubinsert = true;
 				else if (strcmp(publish_opt, "update") == 0)
-					*publish_update = true;
+					pubactions->pubupdate = true;
 				else if (strcmp(publish_opt, "delete") == 0)
-					*publish_delete = true;
+					pubactions->pubdelete = true;
 				else if (strcmp(publish_opt, "truncate") == 0)
-					*publish_truncate = true;
+					pubactions->pubtruncate = true;
 				else
 					ereport(ERROR,
 							(errcode(ERRCODE_SYNTAX_ERROR),
 							 errmsg("unrecognized \"publish\" value: \"%s\"", publish_opt)));
 			}
 		}
+		else if (strcmp(defel->defname, "publish_via_partition_root") == 0)
+		{
+			if (*publish_via_partition_root_given)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options")));
+			*publish_via_partition_root_given = true;
+			*publish_via_partition_root = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -143,10 +156,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	Datum		values[Natts_pg_publication];
 	HeapTuple	tup;
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_via_partition_root_given;
+	bool		publish_via_partition_root;
 	AclResult	aclresult;
 
 	/* must have CREATE privilege on database */
@@ -183,9 +195,9 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_pubowner - 1] = ObjectIdGetDatum(GetUserId());
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_via_partition_root_given,
+							  &publish_via_partition_root);
 
 	puboid = GetNewOidWithIndex(rel, PublicationObjectIndexId,
 								Anum_pg_publication_oid);
@@ -193,13 +205,15 @@ CreatePublication(CreatePublicationStmt *stmt)
 	values[Anum_pg_publication_puballtables - 1] =
 		BoolGetDatum(stmt->for_all_tables);
 	values[Anum_pg_publication_pubinsert - 1] =
-		BoolGetDatum(publish_insert);
+		BoolGetDatum(pubactions.pubinsert);
 	values[Anum_pg_publication_pubupdate - 1] =
-		BoolGetDatum(publish_update);
+		BoolGetDatum(pubactions.pubupdate);
 	values[Anum_pg_publication_pubdelete - 1] =
-		BoolGetDatum(publish_delete);
+		BoolGetDatum(pubactions.pubdelete);
 	values[Anum_pg_publication_pubtruncate - 1] =
-		BoolGetDatum(publish_truncate);
+		BoolGetDatum(pubactions.pubtruncate);
+	values[Anum_pg_publication_pubviaroot - 1] =
+		BoolGetDatum(publish_via_partition_root);
 
 	tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
 
@@ -251,17 +265,16 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 	bool		replaces[Natts_pg_publication];
 	Datum		values[Natts_pg_publication];
 	bool		publish_given;
-	bool		publish_insert;
-	bool		publish_update;
-	bool		publish_delete;
-	bool		publish_truncate;
+	PublicationActions pubactions;
+	bool		publish_via_partition_root_given;
+	bool		publish_via_partition_root;
 	ObjectAddress obj;
 	Form_pg_publication pubform;
 
 	parse_publication_options(stmt->options,
-							  &publish_given, &publish_insert,
-							  &publish_update, &publish_delete,
-							  &publish_truncate);
+							  &publish_given, &pubactions,
+							  &publish_via_partition_root_given,
+							  &publish_via_partition_root);
 
 	/* Everything ok, form a new tuple. */
 	memset(values, 0, sizeof(values));
@@ -270,19 +283,25 @@ AlterPublicationOptions(AlterPublicationStmt *stmt, Relation rel,
 
 	if (publish_given)
 	{
-		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(publish_insert);
+		values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactions.pubinsert);
 		replaces[Anum_pg_publication_pubinsert - 1] = true;
 
-		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(publish_update);
+		values[Anum_pg_publication_pubupdate - 1] = BoolGetDatum(pubactions.pubupdate);
 		replaces[Anum_pg_publication_pubupdate - 1] = true;
 
-		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(publish_delete);
+		values[Anum_pg_publication_pubdelete - 1] = BoolGetDatum(pubactions.pubdelete);
 		replaces[Anum_pg_publication_pubdelete - 1] = true;
 
-		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(publish_truncate);
+		values[Anum_pg_publication_pubtruncate - 1] = BoolGetDatum(pubactions.pubtruncate);
 		replaces[Anum_pg_publication_pubtruncate - 1] = true;
 	}
 
+	if (publish_via_partition_root_given)
+	{
+		values[Anum_pg_publication_pubviaroot - 1] = BoolGetDatum(publish_via_partition_root);
+		replaces[Anum_pg_publication_pubviaroot - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
 							replaces);
 
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d71c0a4..c312c7f 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2320,9 +2320,18 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	/* If modifying a partitioned table, initialize the root table info */
 	if (node->rootResultRelIndex >= 0)
+	{
 		mtstate->rootResultRelInfo = estate->es_root_result_relations +
 			node->rootResultRelIndex;
 
+		/*
+		 * Check replication identity. Checking for partitions would suffice,
+		 * as we do below, but checking for the root relation provides a more
+		 * useful error message if the required replica identity is not there.
+		 */
+		CheckValidResultRel(mtstate->rootResultRelInfo, operation);
+	}
+
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
 
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 552a70c..844b285 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,8 @@
  */
 #include "postgres.h"
 
+#include "access/tupconvert.h"
+#include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "fmgr.h"
 #include "replication/logical.h"
@@ -20,6 +22,7 @@
 #include "replication/pgoutput.h"
 #include "utils/int8.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -49,6 +52,7 @@ static bool publications_valid;
 static List *LoadPublications(List *pubnames);
 static void publication_invalidation_cb(Datum arg, int cacheid,
 										uint32 hashvalue);
+static void send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx);
 
 /*
  * Entry in the map used to remember which relation schemas we sent.
@@ -59,9 +63,30 @@ static void publication_invalidation_cb(Datum arg, int cacheid,
 typedef struct RelationSyncEntry
 {
 	Oid			relid;			/* relation oid */
-	bool		schema_sent;	/* did we send the schema? */
+
+	/*
+	 * Did we send the schema?  If publish_as_relid names an ancestor, its
+	 * schema must also have been sent for this to be true.
+	 */
+	bool		schema_sent;
 	bool		replicate_valid;
 	PublicationActions pubactions;
+
+	/*
+	 * OID of the relation to publish changes as.  For a partition, this may
+	 * be set to one of its ancestors whose schema will be used when
+	 * replicating changes, if publish_via_partition_root is set for the
+	 * publication.
+	 */
+	Oid			publish_as_relid;
+
+	/*
+	 * Map used when replicating using an ancestor's schema to convert tuples
+	 * from the partition's type to the ancestor's; NULL if publish_as_relid
+	 * is the same as 'relid', or if the partition and the ancestor have
+	 * identical TupleDescs, so no conversion is needed.
+	 */
+	TupleConversionMap *map;
 } RelationSyncEntry;
 
 /* Map used to remember which relation schemas we sent. */
@@ -259,47 +284,72 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 }
 
 /*
- * Write the relation schema if the current schema hasn't been sent yet.
+ * Write out the current schema of the relation and its ancestor (if any), if
+ * they have not been sent yet.
  */
 static void
 maybe_send_schema(LogicalDecodingContext *ctx,
 				  Relation relation, RelationSyncEntry *relentry)
 {
-	if (!relentry->schema_sent)
+	if (relentry->schema_sent)
+		return;
+
+	/* If needed, send the ancestor's schema first. */
+	if (relentry->publish_as_relid != RelationGetRelid(relation))
 	{
-		TupleDesc	desc;
-		int			i;
+		Relation	ancestor =
+		RelationIdGetRelation(relentry->publish_as_relid);
+		TupleDesc	indesc = RelationGetDescr(relation);
+		TupleDesc	outdesc = RelationGetDescr(ancestor);
+		MemoryContext oldctx;
+
+		/* Map must live as long as the session does. */
+		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		MemoryContextSwitchTo(oldctx);
+		send_relation_and_attrs(ancestor, ctx);
+		RelationClose(ancestor);
+	}
 
-		desc = RelationGetDescr(relation);
+	send_relation_and_attrs(relation, ctx);
+	relentry->schema_sent = true;
+}
 
-		/*
-		 * Write out type info if needed.  We do that only for user-created
-		 * types.  We use FirstGenbkiObjectId as the cutoff, so that we only
-		 * consider objects with hand-assigned OIDs to be "built in", not for
-		 * instance any function or type defined in the information_schema.
-		 * This is important because only hand-assigned OIDs can be expected
-		 * to remain stable across major versions.
-		 */
-		for (i = 0; i < desc->natts; i++)
-		{
-			Form_pg_attribute att = TupleDescAttr(desc, i);
+/*
+ * Send the relation's schema, preceded by type info for user-defined columns.
+ */
+static void
+send_relation_and_attrs(Relation relation, LogicalDecodingContext *ctx)
+{
+	TupleDesc	desc = RelationGetDescr(relation);
+	int			i;
 
-			if (att->attisdropped || att->attgenerated)
-				continue;
+	/*
+	 * Write out type info if needed.  We do that only for user-created types.
+	 * We use FirstGenbkiObjectId as the cutoff, so that we only consider
+	 * objects with hand-assigned OIDs to be "built in", not for instance any
+	 * function or type defined in the information_schema. This is important
+	 * because only hand-assigned OIDs can be expected to remain stable across
+	 * major versions.
+	 */
+	for (i = 0; i < desc->natts; i++)
+	{
+		Form_pg_attribute att = TupleDescAttr(desc, i);
 
-			if (att->atttypid < FirstGenbkiObjectId)
-				continue;
+		if (att->attisdropped || att->attgenerated)
+			continue;
 
-			OutputPluginPrepareWrite(ctx, false);
-			logicalrep_write_typ(ctx->out, att->atttypid);
-			OutputPluginWrite(ctx, false);
-		}
+		if (att->atttypid < FirstGenbkiObjectId)
+			continue;
 
 		OutputPluginPrepareWrite(ctx, false);
-		logicalrep_write_rel(ctx->out, relation);
+		logicalrep_write_typ(ctx->out, att->atttypid);
 		OutputPluginWrite(ctx, false);
-		relentry->schema_sent = true;
 	}
+
+	OutputPluginPrepareWrite(ctx, false);
+	logicalrep_write_rel(ctx->out, relation);
+	OutputPluginWrite(ctx, false);
 }
 
 /*
@@ -346,28 +396,65 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			OutputPluginPrepareWrite(ctx, true);
-			logicalrep_write_insert(ctx->out, relation,
-									&change->data.tp.newtuple->tuple);
-			OutputPluginWrite(ctx, true);
-			break;
+			{
+				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
+
+				/* Switch relation if publishing via root. */
+				if (relentry->publish_as_relid != RelationGetRelid(relation))
+				{
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->publish_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						tuple = execute_attr_map_tuple(tuple, relentry->map);
+				}
+
+				OutputPluginPrepareWrite(ctx, true);
+				logicalrep_write_insert(ctx->out, relation, tuple);
+				OutputPluginWrite(ctx, true);
+				break;
+			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
 			{
 				HeapTuple	oldtuple = change->data.tp.oldtuple ?
 				&change->data.tp.oldtuple->tuple : NULL;
+				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
+
+				/* Switch relation if publishing via root. */
+				if (relentry->publish_as_relid != RelationGetRelid(relation))
+				{
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->publish_as_relid);
+					/* Convert tuples if needed. */
+					if (relentry->map)
+					{
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+					}
+				}
 
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple,
-										&change->data.tp.newtuple->tuple);
+				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
 				OutputPluginWrite(ctx, true);
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_DELETE:
 			if (change->data.tp.oldtuple)
 			{
+				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
+
+				/* Switch relation if publishing via root. */
+				if (relentry->publish_as_relid != RelationGetRelid(relation))
+				{
+					Assert(relation->rd_rel->relispartition);
+					relation = RelationIdGetRelation(relentry->publish_as_relid);
+					/* Convert tuple if needed. */
+					if (relentry->map)
+						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+				}
+
 				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_delete(ctx->out, relation,
-										&change->data.tp.oldtuple->tuple);
+				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
 			}
 			else
@@ -412,10 +499,11 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 			continue;
 
 		/*
-		 * Don't send partitioned tables, because partitions should be sent
-		 * instead.
+		 * Don't send partitions if the publication publishes only their root
+		 * tables.
 		 */
-		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		if (relation->rd_rel->relispartition &&
+			relentry->publish_as_relid != relid)
 			continue;
 
 		relids[nrelids++] = relid;
@@ -540,12 +628,15 @@ init_rel_sync_cache(MemoryContext cachectx)
  * This looks up publications that the given relation is directly or
  * indirectly part of (the latter if it's really the relation's ancestor that
  * is part of a publication) and fills up the found entry with the information
- * about which operations to publish.
+ * about which operations to publish and whether to use an ancestor's schema
+ * when publishing.
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
 {
 	RelationSyncEntry *entry;
+	bool		am_partition = get_rel_relispartition(relid);
+	char		relkind = get_rel_relkind(relid);
 	bool		found;
 	MemoryContext oldctx;
 
@@ -564,6 +655,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 	{
 		List	   *pubids = GetRelationPublications(relid);
 		ListCell   *lc;
+		Oid			publish_as_relid = relid;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -588,8 +680,56 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		foreach(lc, data->publications)
 		{
 			Publication *pub = lfirst(lc);
+			bool		publish = false;
+
+			if (pub->alltables)
+			{
+				publish = true;
+				if (pub->pubviaroot && am_partition)
+					publish_as_relid = llast_oid(get_partition_ancestors(relid));
+			}
+
+			if (!publish)
+			{
+				bool	ancestor_published = false;
+
+				/*
+				 * For a partition, check if any of the ancestors are
+				 * published.  If so, note down the topmost ancestor that is
+				 * published via this publication, which will be used as the
+				 * relation via which to publish the partition's changes.
+				 */
+				if (am_partition)
+				{
+					List   *ancestors = get_partition_ancestors(relid);
+					ListCell *lc2;
+
+					/* Find the "topmost" ancestor that is in this publication. */
+					foreach(lc2, ancestors)
+					{
+						Oid		ancestor = lfirst_oid(lc2);
+
+						if (list_member_oid(GetRelationPublications(ancestor),
+											pub->oid))
+						{
+							ancestor_published = true;
+							if (pub->pubviaroot)
+								publish_as_relid = ancestor;
+						}
+					}
+				}
+
+				if (list_member_oid(pubids, pub->oid) || ancestor_published)
+					publish = true;
+			}
 
-			if (pub->alltables || list_member_oid(pubids, pub->oid))
+			/*
+			 * Don't publish changes for a partitioned table, because
+			 * publishing its partitions' changes suffices, unless the
+			 * partitions' changes are routed to the root due to pubviaroot.
+			 */
+			if (publish &&
+				(relkind != RELKIND_PARTITIONED_TABLE || pub->pubviaroot))
 			{
 				entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
 				entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
@@ -604,6 +744,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 
 		list_free(pubids);
 
+		entry->publish_as_relid = publish_as_relid;
 		entry->replicate_valid = true;
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index dfd81f1..9f1f11d 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -44,6 +44,7 @@
 #include "catalog/catalog.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
@@ -5314,6 +5315,20 @@ GetRelationPublicationActions(Relation relation)
 
 	/* Fetch the publication membership info. */
 	puboids = GetRelationPublications(RelationGetRelid(relation));
+	if (relation->rd_rel->relispartition)
+	{
+		/* Add publications that the ancestors are in too. */
+		List   *ancestors = get_partition_ancestors(RelationGetRelid(relation));
+		ListCell *lc;
+
+		foreach(lc, ancestors)
+		{
+			Oid		ancestor = lfirst_oid(lc);
+
+			puboids = list_concat_unique_oid(puboids,
+											 GetRelationPublications(ancestor));
+		}
+	}
 	puboids = list_concat_unique_oid(puboids, GetAllTablesPublications());
 
 	foreach(lc, puboids)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 408637c..c579227 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3868,6 +3868,7 @@ getPublications(Archive *fout)
 	int			i_pubupdate;
 	int			i_pubdelete;
 	int			i_pubtruncate;
+	int			i_pubviaroot;
 	int			i,
 				ntups;
 
@@ -3879,18 +3880,25 @@ getPublications(Archive *fout)
 	resetPQExpBuffer(query);
 
 	/* Get the publications. */
-	if (fout->remoteVersion >= 110000)
+	if (fout->remoteVersion >= 130000)
+		appendPQExpBuffer(query,
+						  "SELECT p.tableoid, p.oid, p.pubname, "
+						  "(%s p.pubowner) AS rolname, "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, p.pubviaroot "
+						  "FROM pg_publication p",
+						  username_subquery);
+	else if (fout->remoteVersion >= 110000)
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, p.pubtruncate, false AS pubviaroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 	else
 		appendPQExpBuffer(query,
 						  "SELECT p.tableoid, p.oid, p.pubname, "
 						  "(%s p.pubowner) AS rolname, "
-						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS pubtruncate "
+						  "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS pubtruncate, false AS pubviaroot "
 						  "FROM pg_publication p",
 						  username_subquery);
 
@@ -3907,6 +3915,7 @@ getPublications(Archive *fout)
 	i_pubupdate = PQfnumber(res, "pubupdate");
 	i_pubdelete = PQfnumber(res, "pubdelete");
 	i_pubtruncate = PQfnumber(res, "pubtruncate");
+	i_pubviaroot = PQfnumber(res, "pubviaroot");
 
 	pubinfo = pg_malloc(ntups * sizeof(PublicationInfo));
 
@@ -3929,6 +3938,8 @@ getPublications(Archive *fout)
 			(strcmp(PQgetvalue(res, i, i_pubdelete), "t") == 0);
 		pubinfo[i].pubtruncate =
 			(strcmp(PQgetvalue(res, i, i_pubtruncate), "t") == 0);
+		pubinfo[i].pubviaroot =
+			(strcmp(PQgetvalue(res, i, i_pubviaroot), "t") == 0);
 
 		if (strlen(pubinfo[i].rolname) == 0)
 			pg_log_warning("owner of publication \"%s\" appears to be invalid",
@@ -4005,7 +4016,12 @@ dumpPublication(Archive *fout, PublicationInfo *pubinfo)
 		first = false;
 	}
 
-	appendPQExpBufferStr(query, "');\n");
+	appendPQExpBufferStr(query, "'");
+
+	if (pubinfo->pubviaroot)
+		appendPQExpBufferStr(query, ", publish_via_partition_root = true");
+
+	appendPQExpBufferStr(query, ");\n");
 
 	ArchiveEntry(fout, pubinfo->dobj.catId, pubinfo->dobj.dumpId,
 				 ARCHIVE_OPTS(.tag = pubinfo->dobj.name,
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 3e11166..61c909e 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -602,6 +602,7 @@ typedef struct _PublicationInfo
 	bool		pubupdate;
 	bool		pubdelete;
 	bool		pubtruncate;
+	bool		pubviaroot;
 } PublicationInfo;
 
 /*
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 109245f..f05e914 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -5707,7 +5707,7 @@ listPublications(const char *pattern)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false, false, false};
 
 	if (pset.sversion < 100000)
 	{
@@ -5738,6 +5738,10 @@ listPublications(const char *pattern)
 		appendPQExpBuffer(&buf,
 						  ",\n  pubtruncate AS \"%s\"",
 						  gettext_noop("Truncates"));
+	if (pset.sversion >= 130000)
+		appendPQExpBuffer(&buf,
+						  ",\n  pubviaroot AS \"%s\"",
+						  gettext_noop("Via root"));
 
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
@@ -5779,6 +5783,7 @@ describePublications(const char *pattern)
 	int			i;
 	PGresult   *res;
 	bool		has_pubtruncate;
+	bool		has_pubviaroot;
 
 	if (pset.sversion < 100000)
 	{
@@ -5791,6 +5796,7 @@ describePublications(const char *pattern)
 	}
 
 	has_pubtruncate = (pset.sversion >= 110000);
+	has_pubviaroot = (pset.sversion >= 130000);
 
 	initPQExpBuffer(&buf);
 
@@ -5801,6 +5807,9 @@ describePublications(const char *pattern)
 	if (has_pubtruncate)
 		appendPQExpBufferStr(&buf,
 							 ", pubtruncate");
+	if (has_pubviaroot)
+		appendPQExpBufferStr(&buf,
+							 ", pubviaroot");
 	appendPQExpBufferStr(&buf,
 						 "\nFROM pg_catalog.pg_publication\n");
 
@@ -5850,6 +5859,8 @@ describePublications(const char *pattern)
 
 		if (has_pubtruncate)
 			ncols++;
+		if (has_pubviaroot)
+			ncols++;
 
 		initPQExpBuffer(&title);
 		printfPQExpBuffer(&title, _("Publication %s"), pubname);
@@ -5862,6 +5873,8 @@ describePublications(const char *pattern)
 		printTableAddHeader(&cont, gettext_noop("Deletes"), true, align);
 		if (has_pubtruncate)
 			printTableAddHeader(&cont, gettext_noop("Truncates"), true, align);
+		if (has_pubviaroot)
+			printTableAddHeader(&cont, gettext_noop("Via root"), true, align);
 
 		printTableAddCell(&cont, PQgetvalue(res, i, 2), false, false);
 		printTableAddCell(&cont, PQgetvalue(res, i, 3), false, false);
@@ -5870,6 +5883,8 @@ describePublications(const char *pattern)
 		printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false);
 		if (has_pubtruncate)
 			printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false);
+		if (has_pubviaroot)
+			printTableAddCell(&cont, PQgetvalue(res, i, 8), false, false);
 
 		if (!puballtables)
 		{
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index bb52e8c..ec02f48 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -52,6 +52,8 @@ CATALOG(pg_publication,6104,PublicationRelationId)
 	/* true if truncates are published */
 	bool		pubtruncate;
 
+	/* true if partition changes are published using root schema */
+	bool		pubviaroot;
 } FormData_pg_publication;
 
 /* ----------------
@@ -74,6 +76,7 @@ typedef struct Publication
 	Oid			oid;
 	char	   *name;
 	bool		alltables;
+	bool		pubviaroot;
 	PublicationActions pubactions;
 } Publication;
 
@@ -99,7 +102,7 @@ typedef enum PublicationPartOpt
 
 extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
 extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(void);
+extern List *GetAllTablesPublicationRelations(bool pubviaroot);
 
 extern bool is_publishable_relation(Relation rel);
 extern ObjectAddress publication_add_relation(Oid pubid, Relation targetrel,
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 2634d2c..63d6ab7 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -25,21 +25,23 @@ CREATE PUBLICATION testpub_xxx WITH (foo);
 ERROR:  unrecognized publication parameter: "foo"
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
 ERROR:  unrecognized "publish" value: "cluster"
+CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publish_via_partition_root = '0');
+ERROR:  conflicting or redundant options
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | f       | t       | f       | f
+                                              List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+----------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f
 (2 rows)
 
 ALTER PUBLICATION testpub_default SET (publish = 'insert, update, delete');
 \dRp
-                                         List of publications
-        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------+--------------------------+------------+---------+---------+---------+-----------
- testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f
- testpub_default    | regress_publication_user | f          | t       | t       | t       | f
+                                              List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------+--------------------------+------------+---------+---------+---------+-----------+----------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f
+ testpub_default    | regress_publication_user | f          | t       | t       | t       | f         | f
 (2 rows)
 
 --- adding tables
@@ -83,10 +85,10 @@ Publications:
     "testpub_foralltables"
 
 \dRp+ testpub_foralltables
-                        Publication testpub_foralltables
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | t          | t       | t       | f       | f
+                              Publication testpub_foralltables
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | t          | t       | t       | f       | f         | f
 (1 row)
 
 DROP TABLE testpub_tbl2;
@@ -98,19 +100,19 @@ CREATE PUBLICATION testpub3 FOR TABLE testpub_tbl3;
 CREATE PUBLICATION testpub4 FOR TABLE ONLY testpub_tbl3;
 RESET client_min_messages;
 \dRp+ testpub3
-                              Publication testpub3
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                    Publication testpub3
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
     "public.testpub_tbl3a"
 
 \dRp+ testpub4
-                              Publication testpub4
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                    Publication testpub4
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_tbl3"
 
@@ -129,10 +131,10 @@ ALTER TABLE testpub_parted ATTACH PARTITION testpub_parted1 FOR VALUES IN (1);
 -- only parent is listed as being in publication, not the partition
 ALTER PUBLICATION testpub_forparted ADD TABLE testpub_parted;
 \dRp+ testpub_forparted
-                          Publication testpub_forparted
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                               Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "public.testpub_parted"
 
@@ -143,6 +145,15 @@ HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
+\dRp+ testpub_forparted
+                               Publication testpub_forparted
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | t         | t
+Tables:
+    "public.testpub_parted"
+
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 -- fail - view
@@ -159,10 +170,10 @@ ERROR:  relation "testpub_tbl1" is already member of publication "testpub_fortbl
 CREATE PUBLICATION testpub_fortbl FOR TABLE testpub_tbl1;
 ERROR:  publication "testpub_fortbl" already exists
 \dRp+ testpub_fortbl
-                           Publication testpub_fortbl
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | t
+                                 Publication testpub_fortbl
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | t         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -200,10 +211,10 @@ Publications:
     "testpub_fortbl"
 
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 Tables:
     "pub_test.testpub_nopk"
     "public.testpub_tbl1"
@@ -247,10 +258,10 @@ DROP TABLE testpub_parted;
 DROP VIEW testpub_view;
 DROP TABLE testpub_tbl1;
 \dRp+ testpub_default
-                           Publication testpub_default
-          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
---------------------------+------------+---------+---------+---------+-----------
- regress_publication_user | f          | t       | t       | t       | f
+                                Publication testpub_default
+          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+--------------------------+------------+---------+---------+---------+-----------+----------
+ regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- fail - must be owner of publication
@@ -260,20 +271,20 @@ ERROR:  must be owner of publication testpub_default
 RESET ROLE;
 ALTER PUBLICATION testpub_default RENAME TO testpub_foo;
 \dRp testpub_foo
-                                     List of publications
-    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates 
--------------+--------------------------+------------+---------+---------+---------+-----------
- testpub_foo | regress_publication_user | f          | t       | t       | t       | f
+                                           List of publications
+    Name     |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+-------------+--------------------------+------------+---------+---------+---------+-----------+----------
+ testpub_foo | regress_publication_user | f          | t       | t       | t       | f         | f
 (1 row)
 
 -- rename back to keep the rest simple
 ALTER PUBLICATION testpub_foo RENAME TO testpub_default;
 ALTER PUBLICATION testpub_default OWNER TO regress_publication_user2;
 \dRp testpub_default
-                                        List of publications
-      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates 
------------------+---------------------------+------------+---------+---------+---------+-----------
- testpub_default | regress_publication_user2 | f          | t       | t       | t       | f
+                                             List of publications
+      Name       |           Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root 
+-----------------+---------------------------+------------+---------+---------+---------+-----------+----------
+ testpub_default | regress_publication_user2 | f          | t       | t       | t       | f         | f
 (1 row)
 
 DROP PUBLICATION testpub_default;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 219e041..d844075 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -23,6 +23,7 @@ ALTER PUBLICATION testpub_default SET (publish = update);
 -- error cases
 CREATE PUBLICATION testpub_xxx WITH (foo);
 CREATE PUBLICATION testpub_xxx WITH (publish = 'cluster, vacuum');
+CREATE PUBLICATION testpub_xxx WITH (publish_via_partition_root = 'true', publish_via_partition_root = '0');
 
 \dRp
 
@@ -87,6 +88,8 @@ UPDATE testpub_parted1 SET a = 1;
 ALTER TABLE testpub_parted DETACH PARTITION testpub_parted1;
 -- works again, because parent's publication is no longer considered
 UPDATE testpub_parted1 SET a = 1;
+ALTER PUBLICATION testpub_forparted SET (publish_via_partition_root = true);
+\dRp+ testpub_forparted
 DROP TABLE testpub_parted1;
 DROP PUBLICATION testpub_forparted, testpub_forparted1;
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 5db1b21..beb708b 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 24;
+use Test::More tests => 51;
 
 # setup
 
@@ -48,7 +48,6 @@ $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1 (c text, a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)");
-
 $node_subscriber1->safe_psql('postgres',
 	"ALTER TABLE tab1 ATTACH PARTITION tab1_1 FOR VALUES IN (1, 2, 3)");
 $node_subscriber1->safe_psql('postgres',
@@ -87,6 +86,8 @@ $node_subscriber1->poll_query_until('postgres', $synced_query)
 $node_subscriber2->poll_query_until('postgres', $synced_query)
   or die "Timed out while waiting for subscriber to synchronize data";
 
+# Tests for replication using leaf partition identity and schema
+
 # insert
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab1 VALUES (1)");
@@ -260,3 +261,296 @@ is($result, qq(), 'truncate of tab1_1 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT a FROM tab1 ORDER BY 1");
 is($result, qq(), 'truncate of tab1 replicated');
+
+# Tests for replication using root table identity and schema
+
+# Publisher
+$node_publisher->safe_psql('postgres',
+	"DROP PUBLICATION pub1");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (b text, a int NOT NULL)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES IN (0, 1, 2, 3)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab2_2 PARTITION OF tab2 FOR VALUES IN (5, 6)");
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab3_1 PARTITION OF tab3 FOR VALUES IN (0, 1, 2, 3, 5, 6)");
+$node_publisher->safe_psql('postgres',
+	"ALTER PUBLICATION pub_all SET (publish_via_partition_root = true)");
+# Note: tab3_1's parent is not in the publication, in which case its
+# changes are published using own identity.
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION pub_viaroot FOR TABLE tab2, tab3_1 WITH (publish_via_partition_root = true)");
+
+# Subscriber 1
+$node_subscriber1->safe_psql('postgres',
+	"DROP SUBSCRIPTION sub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub1_tab2', b text) PARTITION BY RANGE (a)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab2_1 (c text DEFAULT 'sub1_tab2', b text, a int NOT NULL)");
+$node_subscriber1->safe_psql('postgres',
+	"ALTER TABLE tab2 ATTACH PARTITION tab2_1 FOR VALUES FROM (0) TO (10)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (c text DEFAULT 'sub1_tab3_1', b text, a int NOT NULL PRIMARY KEY)");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub_viaroot CONNECTION '$publisher_connstr' PUBLICATION pub_viaroot");
+
+# Subscriber 2
+$node_subscriber2->safe_psql('postgres',
+	"DROP TABLE tab1");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab1', b text) PARTITION BY HASH (a)");
+# Note: tab1's partitions are named tab1_1 and tab1_2 on the publisher.
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part1 (b text, c text, a int NOT NULL)");
+$node_subscriber2->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_part1 FOR VALUES WITH (MODULUS 2, REMAINDER 0)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab1_part2 PARTITION OF tab1 FOR VALUES WITH (MODULUS 2, REMAINDER 1)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab2 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab2', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab3', b text)");
+$node_subscriber2->safe_psql('postgres',
+	"CREATE TABLE tab3_1 (a int PRIMARY KEY, c text DEFAULT 'sub2_tab3_1', b text)");
+# Publication that sub2 points to now publishes via root, so must update
+# subscription target relations.
+$node_subscriber2->safe_psql('postgres',
+	"ALTER SUBSCRIPTION sub2 REFRESH PUBLICATION");
+
+# Wait for initial sync of all subscriptions
+$node_subscriber1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber2->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
+
+# insert
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (0)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_1 (a) VALUES (3)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1_2 VALUES (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (0), (3), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab3 VALUES (1), (0), (3), (5)");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub1_tab2|0
+sub1_tab2|1
+sub1_tab2|3
+sub1_tab2|5), 'inserts into tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab3_1 ORDER BY 1, 2");
+is($result, qq(sub1_tab3_1|0
+sub1_tab3_1|1
+sub1_tab3_1|3
+sub1_tab3_1|5), 'inserts into tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab1 ORDER BY 1, 2");
+is($result, qq(sub2_tab1|0
+sub2_tab1|1
+sub2_tab1|3
+sub2_tab1|5), 'inserts into tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub2_tab2|0
+sub2_tab2|1
+sub2_tab2|3
+sub2_tab2|5), 'inserts into tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab3 ORDER BY 1, 2");
+is($result, qq(sub2_tab3|0
+sub2_tab3|1
+sub2_tab3|3
+sub2_tab3|5), 'inserts into tab3 replicated');
+
+# update (replicated as update)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 6 WHERE a = 5");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 6 WHERE a = 5");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub1_tab2|0
+sub1_tab2|1
+sub1_tab2|3
+sub1_tab2|6), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab3_1 ORDER BY 1, 2");
+is($result, qq(sub1_tab3_1|0
+sub1_tab3_1|1
+sub1_tab3_1|3
+sub1_tab3_1|6), 'update of tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab1 ORDER BY 1, 2");
+is($result, qq(sub2_tab1|0
+sub2_tab1|1
+sub2_tab1|3
+sub2_tab1|6), 'inserts into tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub2_tab2|0
+sub2_tab2|1
+sub2_tab2|3
+sub2_tab2|6), 'inserts into tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab3 ORDER BY 1, 2");
+is($result, qq(sub2_tab3|0
+sub2_tab3|1
+sub2_tab3|3
+sub2_tab3|6), 'inserts into tab3 replicated');
+
+# update (replicated as delete+insert)
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 2 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab2 SET a = 2 WHERE a = 6");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab3 SET a = 2 WHERE a = 6");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub1_tab2|0
+sub1_tab2|1
+sub1_tab2|2
+sub1_tab2|3), 'update of tab2 replicated');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT c, a FROM tab3_1 ORDER BY 1, 2");
+is($result, qq(sub1_tab3_1|0
+sub1_tab3_1|1
+sub1_tab3_1|2
+sub1_tab3_1|3), 'update of tab3_1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab1 ORDER BY 1, 2");
+is($result, qq(sub2_tab1|0
+sub2_tab1|1
+sub2_tab1|2
+sub2_tab1|3), 'update of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab2 ORDER BY 1, 2");
+is($result, qq(sub2_tab2|0
+sub2_tab2|1
+sub2_tab2|2
+sub2_tab2|3), 'update of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT c, a FROM tab3 ORDER BY 1, 2");
+is($result, qq(sub2_tab3|0
+sub2_tab3|1
+sub2_tab3|2
+sub2_tab3|3), 'update of tab3 replicated');
+
+# delete
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab1");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab2");
+$node_publisher->safe_psql('postgres',
+	"DELETE FROM tab3");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT a FROM tab2");
+is($result, qq(), 'delete tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab1");
+is($result, qq(), 'delete from tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab2");
+is($result, qq(), 'delete from tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab3");
+is($result, qq(), 'delete from tab3 replicated');
+
+# truncate
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab1 VALUES (1), (2), (5)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab2 VALUES (1), (2), (5)");
+# these will NOT be replicated
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1_2, tab2_1, tab3_1");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT a FROM tab2 ORDER BY 1");
+is($result, qq(1
+2
+5), 'truncate of tab2_1 NOT replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab1 ORDER BY 1");
+is($result, qq(1
+2
+5), 'truncate of tab1_2 NOT replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab2 ORDER BY 1");
+is($result, qq(1
+2
+5), 'truncate of tab2_1 NOT replicated');
+
+$node_publisher->safe_psql('postgres',
+	"TRUNCATE tab1, tab2, tab3");
+
+$node_publisher->wait_for_catchup('sub_viaroot');
+$node_publisher->wait_for_catchup('sub2');
+
+$result = $node_subscriber1->safe_psql('postgres',
+	"SELECT a FROM tab2");
+is($result, qq(), 'truncate of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab1");
+is($result, qq(), 'truncate of tab1 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab2");
+is($result, qq(), 'truncate of tab2 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab3");
+is($result, qq(), 'truncate of tab3 replicated');
+
+$result = $node_subscriber2->safe_psql('postgres',
+	"SELECT a FROM tab3_1");
+is($result, qq(), 'truncate of tab3_1 replicated');
-- 
1.8.3.1

v19-v20-delta.patchapplication/octet-stream; name=v19-v20-delta.patchDownload
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index ee54a17..844b285 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -499,14 +499,6 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 			continue;
 
 		/*
-		 * Don't send partitioned tables unless publication wants to send
-		 * only the root tables, because partitions will be sent instead.
-		 */
-		if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
-			relentry->publish_as_relid != relid)
-			continue;
-
-		/*
 		 * Don't send partitions if the publication wants to send only the
 		 * root tables through it.
 		 */
@@ -644,6 +636,7 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 {
 	RelationSyncEntry *entry;
 	bool		am_partition = get_rel_relispartition(relid);
+	char		relkind = get_rel_relkind(relid);
 	bool		found;
 	MemoryContext oldctx;
 
@@ -692,10 +685,8 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 			if (pub->alltables)
 			{
 				publish = true;
-				if (pub->pubviaroot)
-					publish_as_relid = am_partition ?
-								llast_oid(get_partition_ancestors(relid)) :
-								relid;
+				if (pub->pubviaroot && am_partition)
+					publish_as_relid = llast_oid(get_partition_ancestors(relid));
 			}
 
 			if (!publish)
@@ -732,7 +723,13 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 					publish = true;
 			}
 
-			if (publish)
+			/*
+			 * Don't publish changes for partitioned tables, because
+			 * publishing those of its partitions suffices, unless partition
+			 * changes won't be published due to pubviaroot being set.
+			 */
+			if (publish &&
+				(relkind != RELKIND_PARTITIONED_TABLE || pub->pubviaroot))
 			{
 				entry->pubactions.pubinsert |= pub->pubactions.pubinsert;
 				entry->pubactions.pubupdate |= pub->pubactions.pubupdate;
#72Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#71)
Re: adding partitioned tables to publications

On 2020-04-08 07:45, Amit Langote wrote:

On Wed, Apr 8, 2020 at 1:22 AM Amit Langote <amitlangote09@gmail.com> wrote:

On Tue, Apr 7, 2020 at 6:01 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

The descriptions of the new fields in RelationSyncEntry don't seem to
match the code accurately, or at least it's confusing.
replicate_as_relid is always filled in with an ancestor, even if
pubviaroot is not set.

Given this confusion, I have changed how replicate_as_relid works so
that it's now always set -- if different from the relation's own OID,
the code for "publishing via root" kicks in in various places.

I think the pubviaroot field is actually not necessary. We only need
replicate_as_relid.

Looking through the code, I agree. I guess I only kept it around to
go with pubupdate, etc.

Think I broke truncate replication with this. Fixed in the attached
updated patch.

All committed.

Thank you and everyone very much for working on this. I'm very happy
that these two features from PG10 have finally met. :)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#73Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#72)
Re: adding partitioned tables to publications

On Wed, Apr 8, 2020 at 6:26 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

All committed.

Thank you and everyone very much for working on this. I'm very happy
that these two features from PG10 have finally met. :)

Thanks a lot for reviewing and committing.

prion seems to have failed:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2020-04-08%2009%3A53%3A13

Also, still unsure why the coverage report for the pgoutput.c changes is not good:
https://coverage.postgresql.org/src/backend/replication/pgoutput/pgoutput.c.gcov.html

Will check.

--

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

#74Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#73)
Re: adding partitioned tables to publications

On 2020-04-08 13:16, Amit Langote wrote:

On Wed, Apr 8, 2020 at 6:26 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

All committed.

Thank you and everyone very much for working on this. I'm very happy
that these two features from PG10 have finally met. :)

Thanks a lot for reviewing and committing.

prion seems to have failed:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2020-04-08%2009%3A53%3A13

This comes from -DRELCACHE_FORCE_RELEASE.

Also, still unsure why the coverage report for the pgoutput.c changes is not good:
https://coverage.postgresql.org/src/backend/replication/pgoutput/pgoutput.c.gcov.html

I think this is because the END { } section in PostgresNode.pm shuts
down all running instances in immediate mode, which doesn't save
coverage properly.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#75Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#74)
1 attachment(s)
Re: adding partitioned tables to publications

On Wed, Apr 8, 2020 at 9:21 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-04-08 13:16, Amit Langote wrote:

On Wed, Apr 8, 2020 at 6:26 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

All committed.

Thank you and everyone very much for working on this. I'm very happy
that these two features from PG10 have finally met. :)

Thanks a lot for reviewing and committing.

prion seems to have failed:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2020-04-08%2009%3A53%3A13

This comes from -DRELCACHE_FORCE_RELEASE.

I'm seeing some funny stuff on such a build locally too, although
haven't been able to make sense of it yet.

Also, still unsure why the coverage report for the pgoutput.c changes is not good:
https://coverage.postgresql.org/src/backend/replication/pgoutput/pgoutput.c.gcov.html

I think this is because the END { } section in PostgresNode.pm shuts
down all running instances in immediate mode, which doesn't save
coverage properly.

Thanks for that tip. Appending the following at the end of the test
file has fixed the coverage reporting for me.

I noticed the following coverage issues:

1. The previous commit f1ac27bfd missed a command that I had included
to cover the following blocks of apply_handle_tuple_routing():

1165           :             else
1166           :             {
1167         0 :                 remoteslot = ExecCopySlot(remoteslot, remoteslot_part);
1168         0 :                 slot_getallattrs(remoteslot);
1169           :             }
...

1200         2 :             if (map != NULL)
1201           :             {
1202         0 :                 remoteslot_part = execute_attr_map_slot(map->attrMap,
1203           :                                                         remoteslot,
1204           :                                                         remoteslot_part);
1205           :             }

2. Now that I am able to see proper coverage of the
publish_via_partition_root related changes, I can see that a block in
pgoutput_change() is missing coverage.

The attached fixes these coverage issues.

--

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

logicalrep-partition-test-coverage-fixes.patchapplication/octet-stream; name=logicalrep-partition-test-coverage-fixes.patchDownload
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 208bb55..96d4780 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -141,6 +141,8 @@ $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 4 WHERE a = 6");
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 6 WHERE a = 4");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 5 WHERE a = 6");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
@@ -150,15 +152,15 @@ $result = $node_subscriber1->safe_psql('postgres',
 is($result, qq(sub1_tab1|0
 sub1_tab1|2
 sub1_tab1|3
-sub1_tab1|6), 'update of tab1_1, tab1_2 replicated');
+sub1_tab1|5), 'update of tab1_1, tab1_2 replicated');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT a FROM tab1_2_1 ORDER BY 1");
-is($result, qq(), 'updates of tab1_2 replicated into tab1_2_1 correctly');
+is($result, qq(5), 'updates of tab1_2 replicated into tab1_2_1 correctly');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT a FROM tab1_2_2 ORDER BY 1");
-is($result, qq(6), 'updates of tab1_2 replicated into tab1_2_2 correctly');
+is($result, qq(), 'updates of tab1_2 replicated into tab1_2_2 correctly');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, a FROM tab1_1 ORDER BY 1, 2");
@@ -167,7 +169,7 @@ sub2_tab1_1|3), 'update of tab1_1 replicated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, a FROM tab1_2 ORDER BY 1, 2");
-is($result, qq(sub2_tab1_2|6), 'tab1_2 updated');
+is($result, qq(sub2_tab1_2|5), 'tab1_2 updated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, a FROM tab1_def ORDER BY 1");
@@ -187,12 +189,11 @@ $result = $node_subscriber1->safe_psql('postgres',
 is($result, qq(sub1_tab1|2
 sub1_tab1|3
 sub1_tab1|4
-sub1_tab1|6), 'update of tab1 (delete from tab1_def + insert into tab1_1) replicated');
+sub1_tab1|5), 'update of tab1 (delete from tab1_def + insert into tab1_1) replicated');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT a FROM tab1_2_2 ORDER BY 1");
-is($result, qq(4
-6), 'updates of tab1 (delete + insert) replicated into tab1_2_2 correctly');
+is($result, qq(4), 'updates of tab1 (delete + insert) replicated into tab1_2_2 correctly');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, a FROM tab1_1 ORDER BY 1, 2");
@@ -202,7 +203,7 @@ sub2_tab1_1|3), 'tab1_1 unchanged');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, a FROM tab1_2 ORDER BY 1, 2");
 is($result, qq(sub2_tab1_2|4
-sub2_tab1_2|6), 'insert into tab1_2 replicated');
+sub2_tab1_2|5), 'insert into tab1_2 replicated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT a FROM tab1_def ORDER BY 1");
@@ -267,6 +268,13 @@ is($result, qq(), 'truncate of tab1 replicated');
 # publisher
 $node_publisher->safe_psql('postgres',
 	"DROP PUBLICATION pub1");
+# make tab1_2's tuple description different from its parent
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab1 DETACH PARTITION tab1_2");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab1_2 DROP b, ADD b text");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_2 FOR VALUES IN (4, 5, 6)");
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_publisher->safe_psql('postgres',
@@ -554,3 +562,7 @@ is($result, qq(), 'truncate of tab3 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT a FROM tab3_1");
 is($result, qq(), 'truncate of tab3_1 replicated');
+
+$node_publisher->stop('fast');
+$node_subscriber1->stop('fast');
+$node_subscriber2->stop('fast');
#76Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#75)
Re: adding partitioned tables to publications

On Wed, Apr 8, 2020 at 11:07 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Wed, Apr 8, 2020 at 9:21 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

I think this is because the END { } section in PostgresNode.pm shuts
down all running instances in immediate mode, which doesn't save
coverage properly.

Thanks for that tip. Appending the following at the end of the test
file has fixed the coverage reporting for me.

The patch posted in the previous email has it, but I meant this by
"the following":

+
+$node_publisher->stop('fast');
+$node_subscriber1->stop('fast');
+$node_subscriber2->stop('fast');

--

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

#77Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#75)
Re: adding partitioned tables to publications

On Wed, Apr 8, 2020 at 11:07 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Wed, Apr 8, 2020 at 9:21 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-04-08 13:16, Amit Langote wrote:

prion seems to have failed:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2020-04-08%2009%3A53%3A13

This comes from -DRELCACHE_FORCE_RELEASE.

I'm seeing some funny stuff on such a build locally too, although
haven't been able to make sense of it yet.

So, I see the following repeated in the publisher's log
(013_partition.pl) until PostgresNode.pm times out:

sub_viaroot ERROR: number of columns (2601) exceeds limit (1664)
sub_viaroot CONTEXT: slot "sub_viaroot", output plugin "pgoutput", in
the change callback, associated LSN 0/1621010

causing the tests introduced by this last commit to stall.

Just before where the above starts repeating is this:

sub_viaroot_16479_sync_16455 LOG: starting logical decoding for slot
"sub_viaroot_16479_sync_16455"
sub_viaroot_16479_sync_16455 DETAIL: Streaming transactions
committing after 0/1620A40, reading WAL from 0/1620A08.
sub_viaroot_16479_sync_16455 LOG: logical decoding found consistent
point at 0/1620A08
sub_viaroot_16479_sync_16455 DETAIL: There are no running transactions.
sub_viaroot_16479_sync_16470 LOG: statement: COPY public.tab3_1 TO STDOUT
sub_viaroot_16479_sync_16470 LOG: statement: COMMIT

Same thing for the other subscriber sub2.

--

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

#78Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#77)
Re: adding partitioned tables to publications

On 2020-04-09 05:39, Amit Langote wrote:

sub_viaroot ERROR: number of columns (2601) exceeds limit (1664)
sub_viaroot CONTEXT: slot "sub_viaroot", output plugin "pgoutput", in
the change callback, associated LSN 0/1621010

I think the problem is that in maybe_send_schema(),
RelationClose(ancestor) releases the relcache entry, but the tuple
descriptors, which are part of the relcache entry, are still pointed to
by the tuple map.

This patch makes the tests pass for me:

diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 5fbf2d4367..cf6e8629c1 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -305,7 +305,7 @@ maybe_send_schema(LogicalDecodingContext *ctx,
         /* Map must live as long as the session does. */
         oldctx = MemoryContextSwitchTo(CacheMemoryContext);
-       relentry->map = convert_tuples_by_name(indesc, outdesc);
+       relentry->map = convert_tuples_by_name(CreateTupleDescCopy(indesc), CreateTupleDescCopy(outdesc));
         MemoryContextSwitchTo(oldctx);
         send_relation_and_attrs(ancestor, ctx);
         RelationClose(ancestor);

Please check.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#79Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#78)
1 attachment(s)
Re: adding partitioned tables to publications

On Thu, Apr 9, 2020 at 4:14 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-04-09 05:39, Amit Langote wrote:

sub_viaroot ERROR: number of columns (2601) exceeds limit (1664)
sub_viaroot CONTEXT: slot "sub_viaroot", output plugin "pgoutput", in
the change callback, associated LSN 0/1621010

I think the problem is that in maybe_send_schema(),
RelationClose(ancestor) releases the relcache entry, but the tuple
descriptors, which are part of the relcache entry, are still pointed to
by the tuple map.

This patch makes the tests pass for me:

diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 5fbf2d4367..cf6e8629c1 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -305,7 +305,7 @@ maybe_send_schema(LogicalDecodingContext *ctx,
         /* Map must live as long as the session does. */
         oldctx = MemoryContextSwitchTo(CacheMemoryContext);
-       relentry->map = convert_tuples_by_name(indesc, outdesc);
+       relentry->map = convert_tuples_by_name(CreateTupleDescCopy(indesc), CreateTupleDescCopy(outdesc));
         MemoryContextSwitchTo(oldctx);
         send_relation_and_attrs(ancestor, ctx);
         RelationClose(ancestor);

Please check.

Thanks. Yes, that's what I just found out too and was about to send a
patch, which is basically the same as yours as far as the fix for this
issue is concerned.

While figuring this out, I thought the nearby code could be rearranged
a bit, especially to de-duplicate the code. Also, I think
get_rel_sync_entry() may be a better place to set the map, rather than
maybe_send_schema(). Thoughts?

--

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

logicalrep-partition-code-fixes.patchapplication/octet-stream; name=logicalrep-partition-code-fixes.patchDownload
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 5fbf2d4..c499410 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -299,14 +299,7 @@ maybe_send_schema(LogicalDecodingContext *ctx,
 	if (relentry->publish_as_relid != RelationGetRelid(relation))
 	{
 		Relation	ancestor = RelationIdGetRelation(relentry->publish_as_relid);
-		TupleDesc	indesc = RelationGetDescr(relation);
-		TupleDesc	outdesc = RelationGetDescr(ancestor);
-		MemoryContext oldctx;
-
-		/* Map must live as long as the session does. */
-		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
-		relentry->map = convert_tuples_by_name(indesc, outdesc);
-		MemoryContextSwitchTo(oldctx);
+
 		send_relation_and_attrs(ancestor, ctx);
 		RelationClose(ancestor);
 	}
@@ -362,6 +355,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	PGOutputData *data = (PGOutputData *) ctx->output_plugin_private;
 	MemoryContext old;
 	RelationSyncEntry *relentry;
+	Relation	ancestor = NULL;
+	HeapTuple	oldtuple;
+	HeapTuple	newtuple;
 
 	if (!is_publishable_relation(relation))
 		return;
@@ -392,67 +388,43 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 
 	maybe_send_schema(ctx, relation, relentry);
 
+	/* Switch relation if publishing via ancestor. */
+	if (relentry->publish_as_relid != RelationGetRelid(relation))
+	{
+		Assert(relation->rd_rel->relispartition);
+		ancestor = RelationIdGetRelation(relentry->publish_as_relid);
+		relation = ancestor;
+	}
+
+	/* Set oldtuple/newtuple and convert to ancestor rowtype if necessary. */
+	oldtuple = change->data.tp.oldtuple ?
+			&change->data.tp.oldtuple->tuple : NULL;
+	if (oldtuple && relentry->map)
+		oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+
+	newtuple = change->data.tp.newtuple ?
+			&change->data.tp.newtuple->tuple : NULL;
+	if (newtuple && relentry->map)
+		newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+
 	/* Send the data */
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			{
-				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
-
-				/* Switch relation if publishing via root. */
-				if (relentry->publish_as_relid != RelationGetRelid(relation))
-				{
-					Assert(relation->rd_rel->relispartition);
-					relation = RelationIdGetRelation(relentry->publish_as_relid);
-					/* Convert tuple if needed. */
-					if (relentry->map)
-						tuple = execute_attr_map_tuple(tuple, relentry->map);
-				}
+			OutputPluginPrepareWrite(ctx, true);
+			logicalrep_write_insert(ctx->out, relation, newtuple);
+			OutputPluginWrite(ctx, true);
+			break;
 
-				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_insert(ctx->out, relation, tuple);
-				OutputPluginWrite(ctx, true);
-				break;
-			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
-			{
-				HeapTuple	oldtuple = change->data.tp.oldtuple ?
-				&change->data.tp.oldtuple->tuple : NULL;
-				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
-
-				/* Switch relation if publishing via root. */
-				if (relentry->publish_as_relid != RelationGetRelid(relation))
-				{
-					Assert(relation->rd_rel->relispartition);
-					relation = RelationIdGetRelation(relentry->publish_as_relid);
-					/* Convert tuples if needed. */
-					if (relentry->map)
-					{
-						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
-						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
-					}
-				}
+			OutputPluginPrepareWrite(ctx, true);
+			logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
+			OutputPluginWrite(ctx, true);
+			break;
 
-				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
-				OutputPluginWrite(ctx, true);
-				break;
-			}
 		case REORDER_BUFFER_CHANGE_DELETE:
-			if (change->data.tp.oldtuple)
+			if (oldtuple)
 			{
-				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
-
-				/* Switch relation if publishing via root. */
-				if (relentry->publish_as_relid != RelationGetRelid(relation))
-				{
-					Assert(relation->rd_rel->relispartition);
-					relation = RelationIdGetRelation(relentry->publish_as_relid);
-					/* Convert tuple if needed. */
-					if (relentry->map)
-						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
-				}
-
 				OutputPluginPrepareWrite(ctx, true);
 				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
@@ -460,10 +432,14 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 			else
 				elog(DEBUG1, "didn't send DELETE change because of missing oldtuple");
 			break;
+
 		default:
 			Assert(false);
 	}
 
+	if (ancestor)
+		RelationClose(ancestor);
+
 	/* Cleanup */
 	MemoryContextSwitchTo(old);
 	MemoryContextReset(data->context);
@@ -635,8 +611,6 @@ static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
 {
 	RelationSyncEntry *entry;
-	bool		am_partition = get_rel_relispartition(relid);
-	char		relkind = get_rel_relkind(relid);
 	bool		found;
 	MemoryContext oldctx;
 
@@ -656,6 +630,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		List	   *pubids = GetRelationPublications(relid);
 		ListCell   *lc;
 		Oid			publish_as_relid = relid;
+		Relation	relation = RelationIdGetRelation(relid);
+		bool		am_partition = relation->rd_rel->relispartition;
+		char		relkind = relation->rd_rel->relkind;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -745,7 +722,27 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		list_free(pubids);
 
 		entry->publish_as_relid = publish_as_relid;
+		/* Also set map while at it. */
+		if (publish_as_relid != relid)
+		{
+			Relation	ancestor = RelationIdGetRelation(publish_as_relid);
+			TupleDesc	indesc = RelationGetDescr(relation);
+			TupleDesc	outdesc = RelationGetDescr(ancestor);
+			MemoryContext oldctx;
+
+			/*
+			 * Map must live as long as the session does. TupleDescs must be
+			 * copied before putting into the map, because they may not live
+			 * as long as we want the map to live.
+			 */
+			oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+			entry->map = convert_tuples_by_name(CreateTupleDescCopy(indesc),
+												CreateTupleDescCopy(outdesc));
+			MemoryContextSwitchTo(oldctx);
+			RelationClose(ancestor);
+		}
 		entry->replicate_valid = true;
+		RelationClose(relation);
 	}
 
 	if (!found)
#80Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Amit Langote (#79)
Re: adding partitioned tables to publications

On 2020-04-09 09:28, Amit Langote wrote:

This patch makes the tests pass for me:

diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 5fbf2d4367..cf6e8629c1 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -305,7 +305,7 @@ maybe_send_schema(LogicalDecodingContext *ctx,
 		/* Map must live as long as the session does. */
 		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
-		relentry->map = convert_tuples_by_name(indesc, outdesc);
+		relentry->map = convert_tuples_by_name(CreateTupleDescCopy(indesc),
+											   CreateTupleDescCopy(outdesc));
 		MemoryContextSwitchTo(oldctx);
 		send_relation_and_attrs(ancestor, ctx);
 		RelationClose(ancestor);

Please check.

Thanks. Yes, that's what I just found out too and was about to send a
patch, which is basically same as yours as far as the fix for this
issue is concerned.

I have committed my patch but not ...

While figuring this out, I thought the nearby code could be rearranged
a bit, especially to de-duplicate the code. Also, I think
get_rel_sync_entry() may be a better place to set the map, rather than
maybe_send_schema(). Thoughts?

because I didn't really have an opinion on that at the time, but if you
still want it considered or have any open thoughts on this thread,
please resend or explain.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#81Amit Langote
amitlangote09@gmail.com
In reply to: Peter Eisentraut (#80)
2 attachment(s)
Re: adding partitioned tables to publications

On Fri, Apr 17, 2020 at 10:23 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-04-09 09:28, Amit Langote wrote:

While figuring this out, I thought the nearby code could be rearranged
a bit, especially to de-duplicate the code. Also, I think
get_rel_sync_entry() may be a better place to set the map, rather than
maybe_send_schema(). Thoughts?

because I didn't really have an opinion on that at the time, but if you
still want it considered or have any open thoughts on this thread,
please resend or explain.

Sure, thanks for taking care of the bug.

Rebased the code rearrangement patch. Also resending the patch that fixes
the TAP tests to improve coverage, as described in:
/messages/by-id/CA+HiwqFyydvQ5g=qa54UM+Xjm77BdhX-nM4dXQkNOgH=zvDjoA@mail.gmail.com

To summarize:
1. Missing coverage for a couple of related blocks in
apply_handle_tuple_routing()
2. Missing coverage report for the code in pgoutput.c added by 83fd4532

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

0001-Rearrange-some-code-in-pgoutput.c.patchapplication/octet-stream; name=0001-Rearrange-some-code-in-pgoutput.c.patchDownload
From 742410c3be8c3222a354655fc0e3c198bd9562ad Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Fri, 17 Apr 2020 23:32:05 +0900
Subject: [PATCH 1/2] Rearrange some code in pgoutput.c

pgoutput_change() has grown some code due to recent partitioning
support commits that looks repetitive.

Change where RelationSyncEntry.map is set so that it appears to be
at a less random location.
---
 src/backend/replication/pgoutput/pgoutput.c | 122 ++++++++++++++--------------
 1 file changed, 59 insertions(+), 63 deletions(-)

diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 77b85fc..c499410 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -299,15 +299,7 @@ maybe_send_schema(LogicalDecodingContext *ctx,
 	if (relentry->publish_as_relid != RelationGetRelid(relation))
 	{
 		Relation	ancestor = RelationIdGetRelation(relentry->publish_as_relid);
-		TupleDesc	indesc = RelationGetDescr(relation);
-		TupleDesc	outdesc = RelationGetDescr(ancestor);
-		MemoryContext oldctx;
-
-		/* Map must live as long as the session does. */
-		oldctx = MemoryContextSwitchTo(CacheMemoryContext);
-		relentry->map = convert_tuples_by_name(CreateTupleDescCopy(indesc),
-											   CreateTupleDescCopy(outdesc));
-		MemoryContextSwitchTo(oldctx);
+
 		send_relation_and_attrs(ancestor, ctx);
 		RelationClose(ancestor);
 	}
@@ -363,6 +355,9 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 	PGOutputData *data = (PGOutputData *) ctx->output_plugin_private;
 	MemoryContext old;
 	RelationSyncEntry *relentry;
+	Relation	ancestor = NULL;
+	HeapTuple	oldtuple;
+	HeapTuple	newtuple;
 
 	if (!is_publishable_relation(relation))
 		return;
@@ -393,67 +388,43 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 
 	maybe_send_schema(ctx, relation, relentry);
 
+	/* Switch relation if publishing via ancestor. */
+	if (relentry->publish_as_relid != RelationGetRelid(relation))
+	{
+		Assert(relation->rd_rel->relispartition);
+		ancestor = RelationIdGetRelation(relentry->publish_as_relid);
+		relation = ancestor;
+	}
+
+	/* Set oldtuple/newtuple and convert to ancestor rowtype if necessary. */
+	oldtuple = change->data.tp.oldtuple ?
+			&change->data.tp.oldtuple->tuple : NULL;
+	if (oldtuple && relentry->map)
+		oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
+
+	newtuple = change->data.tp.newtuple ?
+			&change->data.tp.newtuple->tuple : NULL;
+	if (newtuple && relentry->map)
+		newtuple = execute_attr_map_tuple(newtuple, relentry->map);
+
 	/* Send the data */
 	switch (change->action)
 	{
 		case REORDER_BUFFER_CHANGE_INSERT:
-			{
-				HeapTuple	tuple = &change->data.tp.newtuple->tuple;
-
-				/* Switch relation if publishing via root. */
-				if (relentry->publish_as_relid != RelationGetRelid(relation))
-				{
-					Assert(relation->rd_rel->relispartition);
-					relation = RelationIdGetRelation(relentry->publish_as_relid);
-					/* Convert tuple if needed. */
-					if (relentry->map)
-						tuple = execute_attr_map_tuple(tuple, relentry->map);
-				}
+			OutputPluginPrepareWrite(ctx, true);
+			logicalrep_write_insert(ctx->out, relation, newtuple);
+			OutputPluginWrite(ctx, true);
+			break;
 
-				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_insert(ctx->out, relation, tuple);
-				OutputPluginWrite(ctx, true);
-				break;
-			}
 		case REORDER_BUFFER_CHANGE_UPDATE:
-			{
-				HeapTuple	oldtuple = change->data.tp.oldtuple ?
-				&change->data.tp.oldtuple->tuple : NULL;
-				HeapTuple	newtuple = &change->data.tp.newtuple->tuple;
-
-				/* Switch relation if publishing via root. */
-				if (relentry->publish_as_relid != RelationGetRelid(relation))
-				{
-					Assert(relation->rd_rel->relispartition);
-					relation = RelationIdGetRelation(relentry->publish_as_relid);
-					/* Convert tuples if needed. */
-					if (relentry->map)
-					{
-						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
-						newtuple = execute_attr_map_tuple(newtuple, relentry->map);
-					}
-				}
+			OutputPluginPrepareWrite(ctx, true);
+			logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
+			OutputPluginWrite(ctx, true);
+			break;
 
-				OutputPluginPrepareWrite(ctx, true);
-				logicalrep_write_update(ctx->out, relation, oldtuple, newtuple);
-				OutputPluginWrite(ctx, true);
-				break;
-			}
 		case REORDER_BUFFER_CHANGE_DELETE:
-			if (change->data.tp.oldtuple)
+			if (oldtuple)
 			{
-				HeapTuple	oldtuple = &change->data.tp.oldtuple->tuple;
-
-				/* Switch relation if publishing via root. */
-				if (relentry->publish_as_relid != RelationGetRelid(relation))
-				{
-					Assert(relation->rd_rel->relispartition);
-					relation = RelationIdGetRelation(relentry->publish_as_relid);
-					/* Convert tuple if needed. */
-					if (relentry->map)
-						oldtuple = execute_attr_map_tuple(oldtuple, relentry->map);
-				}
-
 				OutputPluginPrepareWrite(ctx, true);
 				logicalrep_write_delete(ctx->out, relation, oldtuple);
 				OutputPluginWrite(ctx, true);
@@ -461,10 +432,14 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 			else
 				elog(DEBUG1, "didn't send DELETE change because of missing oldtuple");
 			break;
+
 		default:
 			Assert(false);
 	}
 
+	if (ancestor)
+		RelationClose(ancestor);
+
 	/* Cleanup */
 	MemoryContextSwitchTo(old);
 	MemoryContextReset(data->context);
@@ -636,8 +611,6 @@ static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Oid relid)
 {
 	RelationSyncEntry *entry;
-	bool		am_partition = get_rel_relispartition(relid);
-	char		relkind = get_rel_relkind(relid);
 	bool		found;
 	MemoryContext oldctx;
 
@@ -657,6 +630,9 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		List	   *pubids = GetRelationPublications(relid);
 		ListCell   *lc;
 		Oid			publish_as_relid = relid;
+		Relation	relation = RelationIdGetRelation(relid);
+		bool		am_partition = relation->rd_rel->relispartition;
+		char		relkind = relation->rd_rel->relkind;
 
 		/* Reload publications if needed before use. */
 		if (!publications_valid)
@@ -746,7 +722,27 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
 		list_free(pubids);
 
 		entry->publish_as_relid = publish_as_relid;
+		/* Also set map while at it. */
+		if (publish_as_relid != relid)
+		{
+			Relation	ancestor = RelationIdGetRelation(publish_as_relid);
+			TupleDesc	indesc = RelationGetDescr(relation);
+			TupleDesc	outdesc = RelationGetDescr(ancestor);
+			MemoryContext oldctx;
+
+			/*
+			 * Map must live as long as the session does. TupleDescs must be
+			 * copied before putting into the map, because they may not live
+			 * as long as we want the map to live.
+			 */
+			oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+			entry->map = convert_tuples_by_name(CreateTupleDescCopy(indesc),
+												CreateTupleDescCopy(outdesc));
+			MemoryContextSwitchTo(oldctx);
+			RelationClose(ancestor);
+		}
 		entry->replicate_valid = true;
+		RelationClose(relation);
 	}
 
 	if (!found)
-- 
1.8.3.1

0002-Fix-partition-logical-replication-TAP-tests-for-bett.patchapplication/octet-stream; name=0002-Fix-partition-logical-replication-TAP-tests-for-bett.patchDownload
From c72d6f33614613127500288bd50a81248d9c8056 Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Fri, 17 Apr 2020 23:41:14 +0900
Subject: [PATCH 2/2] Fix partition logical replication TAP tests for better
 coverage

---
 src/test/subscription/t/013_partition.pl | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 208bb55..96d4780 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -141,6 +141,8 @@ $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 4 WHERE a = 6");
 $node_publisher->safe_psql('postgres',
 	"UPDATE tab1 SET a = 6 WHERE a = 4");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab1 SET a = 5 WHERE a = 6");
 
 $node_publisher->wait_for_catchup('sub1');
 $node_publisher->wait_for_catchup('sub2');
@@ -150,15 +152,15 @@ $result = $node_subscriber1->safe_psql('postgres',
 is($result, qq(sub1_tab1|0
 sub1_tab1|2
 sub1_tab1|3
-sub1_tab1|6), 'update of tab1_1, tab1_2 replicated');
+sub1_tab1|5), 'update of tab1_1, tab1_2 replicated');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT a FROM tab1_2_1 ORDER BY 1");
-is($result, qq(), 'updates of tab1_2 replicated into tab1_2_1 correctly');
+is($result, qq(5), 'updates of tab1_2 replicated into tab1_2_1 correctly');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT a FROM tab1_2_2 ORDER BY 1");
-is($result, qq(6), 'updates of tab1_2 replicated into tab1_2_2 correctly');
+is($result, qq(), 'updates of tab1_2 replicated into tab1_2_2 correctly');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, a FROM tab1_1 ORDER BY 1, 2");
@@ -167,7 +169,7 @@ sub2_tab1_1|3), 'update of tab1_1 replicated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, a FROM tab1_2 ORDER BY 1, 2");
-is($result, qq(sub2_tab1_2|6), 'tab1_2 updated');
+is($result, qq(sub2_tab1_2|5), 'tab1_2 updated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, a FROM tab1_def ORDER BY 1");
@@ -187,12 +189,11 @@ $result = $node_subscriber1->safe_psql('postgres',
 is($result, qq(sub1_tab1|2
 sub1_tab1|3
 sub1_tab1|4
-sub1_tab1|6), 'update of tab1 (delete from tab1_def + insert into tab1_1) replicated');
+sub1_tab1|5), 'update of tab1 (delete from tab1_def + insert into tab1_1) replicated');
 
 $result = $node_subscriber1->safe_psql('postgres',
 	"SELECT a FROM tab1_2_2 ORDER BY 1");
-is($result, qq(4
-6), 'updates of tab1 (delete + insert) replicated into tab1_2_2 correctly');
+is($result, qq(4), 'updates of tab1 (delete + insert) replicated into tab1_2_2 correctly');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, a FROM tab1_1 ORDER BY 1, 2");
@@ -202,7 +203,7 @@ sub2_tab1_1|3), 'tab1_1 unchanged');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT c, a FROM tab1_2 ORDER BY 1, 2");
 is($result, qq(sub2_tab1_2|4
-sub2_tab1_2|6), 'insert into tab1_2 replicated');
+sub2_tab1_2|5), 'insert into tab1_2 replicated');
 
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT a FROM tab1_def ORDER BY 1");
@@ -267,6 +268,13 @@ is($result, qq(), 'truncate of tab1 replicated');
 # publisher
 $node_publisher->safe_psql('postgres',
 	"DROP PUBLICATION pub1");
+# make tab1_2's tuple description different from its parent
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab1 DETACH PARTITION tab1_2");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab1_2 DROP b, ADD b text");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab1 ATTACH PARTITION tab1_2 FOR VALUES IN (4, 5, 6)");
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab2 (a int PRIMARY KEY, b text) PARTITION BY LIST (a)");
 $node_publisher->safe_psql('postgres',
@@ -554,3 +562,7 @@ is($result, qq(), 'truncate of tab3 replicated');
 $result = $node_subscriber2->safe_psql('postgres',
 	"SELECT a FROM tab3_1");
 is($result, qq(), 'truncate of tab3_1 replicated');
+
+$node_publisher->stop('fast');
+$node_subscriber1->stop('fast');
+$node_subscriber2->stop('fast');
-- 
1.8.3.1