Better error message for unsupported replication cases

Started by Tom Lanealmost 4 years ago3 messages
#1Tom Lane
tgl@sss.pgh.pa.us
1 attachment(s)

In [1]/messages/by-id/CANhtRiamAgYt1A-Nh4=mU3E1UhG9XPgB+X6mW1DWqa93vUXW9A@mail.gmail.com there's a complaint that if you try to logically replicate
a partitioned table from v13-or-later to v12-or-earlier, you get
"table XXX not found on publisher", which is pretty confusing
because the publisher certainly does have such a table. That
happens because fetch_remote_table_info is too aggressive about
filtering by relkind and doesn't see the relation at all.
c314c147c improved that, but it wasn't back-patched. I propose
putting the attached into v10-v12. Maybe the error message
could be bikeshedded ... is "non-table relation" terminology
that we use in user-facing messages?

regards, tom lane

[1]: /messages/by-id/CANhtRiamAgYt1A-Nh4=mU3E1UhG9XPgB+X6mW1DWqa93vUXW9A@mail.gmail.com

Attachments:

handle-unsupported-relkind-better.patchtext/x-diff; charset=us-ascii; name=handle-unsupported-relkind-better.patchDownload
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 2db88dc41a..61244e08fa 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -646,9 +646,10 @@ fetch_remote_table_info(char *nspname, char *relname,
 	WalRcvExecResult *res;
 	StringInfoData cmd;
 	TupleTableSlot *slot;
-	Oid			tableRow[2] = {OIDOID, CHAROID};
+	Oid			tableRow[3] = {OIDOID, CHAROID, CHAROID};
 	Oid			attrRow[4] = {TEXTOID, OIDOID, INT4OID, BOOLOID};
 	bool		isnull;
+	char		relkind;
 	int			natt;
 
 	lrel->nspname = nspname;
@@ -656,13 +657,12 @@ fetch_remote_table_info(char *nspname, char *relname,
 
 	/* First fetch Oid and replica identity. */
 	initStringInfo(&cmd);
-	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident"
+	appendStringInfo(&cmd, "SELECT c.oid, c.relreplident, c.relkind"
 					 "  FROM pg_catalog.pg_class c"
 					 "  INNER JOIN pg_catalog.pg_namespace n"
 					 "        ON (c.relnamespace = n.oid)"
 					 " WHERE n.nspname = %s"
-					 "   AND c.relname = %s"
-					 "   AND c.relkind = 'r'",
+					 "   AND c.relname = %s",
 					 quote_literal_cstr(nspname),
 					 quote_literal_cstr(relname));
 	res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data,
@@ -683,6 +683,19 @@ fetch_remote_table_info(char *nspname, char *relname,
 	Assert(!isnull);
 	lrel->replident = DatumGetChar(slot_getattr(slot, 2, &isnull));
 	Assert(!isnull);
+	relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
+	Assert(!isnull);
+
+	/*
+	 * Newer PG versions allow things that aren't plain tables to appear in
+	 * publications.  We don't handle that in this version, but try to provide
+	 * a useful error message.
+	 */
+	if (relkind != RELKIND_RELATION)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot sync from non-table relation \"%s.%s\"",
+						nspname, relname)));
 
 	ExecDropSingleTupleTableSlot(slot);
 	walrcv_clear_result(res);
#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#1)
Re: Better error message for unsupported replication cases

On Tue, Feb 15, 2022 at 3:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

In [1] there's a complaint that if you try to logically replicate
a partitioned table from v13-or-later to v12-or-earlier, you get
"table XXX not found on publisher", which is pretty confusing
because the publisher certainly does have such a table. That
happens because fetch_remote_table_info is too aggressive about
filtering by relkind and doesn't see the relation at all.
c314c147c improved that, but it wasn't back-patched. I propose
putting the attached into v10-v12. Maybe the error message
could be bikeshedded ... is "non-table relation" terminology
that we use in user-facing messages?

The other option could be "logical replication source relation
\"%s.%s\" is not a table". We use a similar message in
CheckSubscriptionRelkind.

--
With Regards,
Amit Kapila.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Kapila (#2)
Re: Better error message for unsupported replication cases

Amit Kapila <amit.kapila16@gmail.com> writes:

On Tue, Feb 15, 2022 at 3:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

... Maybe the error message
could be bikeshedded ... is "non-table relation" terminology
that we use in user-facing messages?

The other option could be "logical replication source relation
\"%s.%s\" is not a table". We use a similar message in
CheckSubscriptionRelkind.

Works for me, I'll do it like that if there are no objections.

regards, tom lane