Logical replication from PG v13 and below to PG v14 (devel version) is not working.

Started by Ashutosh Sharmaover 5 years ago24 messages
#1Ashutosh Sharma
ashu.coek88@gmail.com

Hi All,

Today, while exploring logical replication in PostgreSQL, I noticed
that logical replication from PG version 13 and below to PG v14
(development version) is not working. It has stopped working from the
following git commit onwards:

commit 464824323e57dc4b397e8b05854d779908b55304
Author: Amit Kapila <akapila@postgresql.org>
Date: Thu Sep 3 07:54:07 2020 +0530

Add support for streaming to built-in logical replication.

...
...

Here is the experiment that I performed to verify this:

Publisher (PG-v12/13):
==================
CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);

INSERT INTO pg2pg (whatever) SELECT 'str-' || i FROM generate_series(1, 5) i;

SELECT * FROM pg2pg;

CREATE PUBLICATION pg2pg_pub FOR TABLE pg2pg;

Subscriber (PG-v14 HEAD or commit 46482432):
=====================================
CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);

CREATE SUBSCRIPTION pg2pg_sub CONNECTION 'host=127.0.0.1 port=5433
dbname=postgres user=ashu' PUBLICATION pg2pg_pub;

SELECT * FROM pg2pg;

Above select query produces no result. When this experiment is
performed below the mentioned git commit, it works fine.

After spending some time looking into this issue, I observed that
above git commit has bumped the logical replication protocol version.
Due to this, the logical replication apply worker process is unable to
do WAL streaming which causes it to terminate. Therefore, the data
inserted in the publication table is not applied on the subscription
table (i.e. no data replication happens)

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

#2Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Ashutosh Sharma (#1)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Mon, Sep 21, 2020 at 3:26 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

commit 464824323e57dc4b397e8b05854d779908b55304
Author: Amit Kapila <akapila@postgresql.org>
Date: Thu Sep 3 07:54:07 2020 +0530

Above select query produces no result. When this experiment is
performed below the mentioned git commit, it works fine.

I encountered the same issue. We get to see below error during table
sync phase in the subscriber server logs.

2020-09-21 16:01:45.794 IST [782428] LOG: logical replication apply
worker for subscription "pg2pg_sub" has started
2020-09-21 16:01:45.799 IST [782428] ERROR: could not start WAL
streaming: ERROR: client sent proto_version=2 but we only support
protocol 1 or lower
CONTEXT: slot "pg2pg_sub", output plugin "pgoutput", in the
startup callback
2020-09-21 16:01:45.800 IST [781607] LOG: background worker "logical
replication worker" (PID 782428) exited with exit code 1

After spending some time looking into this issue, I observed that
above git commit has bumped the logical replication protocol version.
Due to this, the logical replication apply worker process is unable to
do WAL streaming which causes it to terminate. Therefore, the data
inserted in the publication table is not applied on the subscription
table (i.e. no data replication happens)

That's because the above commit changed LOGICALREP_PROTO_VERSION_NUM
to 2 from 1.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#3Dilip Kumar
dilipbalaut@gmail.com
In reply to: Ashutosh Sharma (#1)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Mon, Sep 21, 2020 at 3:26 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Hi All,

Today, while exploring logical replication in PostgreSQL, I noticed
that logical replication from PG version 13 and below to PG v14
(development version) is not working. It has stopped working from the
following git commit onwards:

commit 464824323e57dc4b397e8b05854d779908b55304
Author: Amit Kapila <akapila@postgresql.org>
Date: Thu Sep 3 07:54:07 2020 +0530

Add support for streaming to built-in logical replication.

...
...

Here is the experiment that I performed to verify this:

Publisher (PG-v12/13):
==================
CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);

INSERT INTO pg2pg (whatever) SELECT 'str-' || i FROM generate_series(1, 5) i;

SELECT * FROM pg2pg;

CREATE PUBLICATION pg2pg_pub FOR TABLE pg2pg;

Subscriber (PG-v14 HEAD or commit 46482432):
=====================================
CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);

CREATE SUBSCRIPTION pg2pg_sub CONNECTION 'host=127.0.0.1 port=5433
dbname=postgres user=ashu' PUBLICATION pg2pg_pub;

SELECT * FROM pg2pg;

Above select query produces no result. When this experiment is
performed below the mentioned git commit, it works fine.

After spending some time looking into this issue, I observed that
above git commit has bumped the logical replication protocol version.
Due to this, the logical replication apply worker process is unable to
do WAL streaming which causes it to terminate. Therefore, the data
inserted in the publication table is not applied on the subscription
table (i.e. no data replication happens)

Seems like this commit, should have only set the
LOGICALREP_PROTO_STREAM_VERSION_NUM to 2 but the
LOGICALREP_PROTO_VERSION_NUM shouln't have been changed.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#4Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#3)
1 attachment(s)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Mon, Sep 21, 2020 at 4:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Sep 21, 2020 at 3:26 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Hi All,

Today, while exploring logical replication in PostgreSQL, I noticed
that logical replication from PG version 13 and below to PG v14
(development version) is not working. It has stopped working from the
following git commit onwards:

commit 464824323e57dc4b397e8b05854d779908b55304
Author: Amit Kapila <akapila@postgresql.org>
Date: Thu Sep 3 07:54:07 2020 +0530

Add support for streaming to built-in logical replication.

...
...

Here is the experiment that I performed to verify this:

Publisher (PG-v12/13):
==================
CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);

INSERT INTO pg2pg (whatever) SELECT 'str-' || i FROM generate_series(1, 5) i;

SELECT * FROM pg2pg;

CREATE PUBLICATION pg2pg_pub FOR TABLE pg2pg;

Subscriber (PG-v14 HEAD or commit 46482432):
=====================================
CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);

CREATE SUBSCRIPTION pg2pg_sub CONNECTION 'host=127.0.0.1 port=5433
dbname=postgres user=ashu' PUBLICATION pg2pg_pub;

SELECT * FROM pg2pg;

Above select query produces no result. When this experiment is
performed below the mentioned git commit, it works fine.

After spending some time looking into this issue, I observed that
above git commit has bumped the logical replication protocol version.
Due to this, the logical replication apply worker process is unable to
do WAL streaming which causes it to terminate. Therefore, the data
inserted in the publication table is not applied on the subscription
table (i.e. no data replication happens)

Seems like this commit, should have only set the
LOGICALREP_PROTO_STREAM_VERSION_NUM to 2 but the
LOGICALREP_PROTO_VERSION_NUM shouln't have been changed.

I think if we don't increase the LOGICALREP_PROTO_VERSION_NUM, then
streaming will not work in the latest version. So what I feel is that
we can keep the LOGICALREP_PROTO_VERSION_NUM as 1 only add one more
parameter say LOGICALREP_PROTO_MAX_VERSION_NUM. So now from the
latest subscriber if streaming is enabled then we can send
LOGICALREP_PROTO_STREAM_VERSION_NUM and LOGICALREP_PROTO_VERSION_NUM
otherwise. And on publisher side we can check with the
max_protocol_version and min_protocol version. I have attached a
patch for the changes I have explained.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v1-0001-Bugfix-in-logical-protocol-version.patchapplication/octet-stream; name=v1-0001-Bugfix-in-logical-protocol-version.patchDownload
From dbbf36712314669689938e277a9b75af8de07a52 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Mon, 21 Sep 2020 17:15:01 +0530
Subject: [PATCH v1] Bugfix in logical protocol version

Revert the LOGICALREP_PROTO_VERSION_NUM and instead add a new
parameter to control the max protocol version we support.
---
 src/backend/replication/logical/worker.c    | 3 ++-
 src/backend/replication/pgoutput/pgoutput.c | 2 +-
 src/include/replication/logicalproto.h      | 8 +++++---
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 62c571e..6a16573 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3230,7 +3230,8 @@ ApplyWorkerMain(Datum main_arg)
 	options.logical = true;
 	options.startpoint = origin_startpos;
 	options.slotname = myslotname;
-	options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
+	options.proto.logical.proto_version = MySubscription->stream ?
+			LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;
 	options.proto.logical.publication_names = MySubscription->publications;
 	options.proto.logical.binary = MySubscription->binary;
 	options.proto.logical.streaming = MySubscription->stream;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 729b655..d5752de 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -282,7 +282,7 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
 								&enable_streaming);
 
 		/* Check if we support requested protocol */
-		if (data->protocol_version > LOGICALREP_PROTO_VERSION_NUM)
+		if (data->protocol_version > LOGICALREP_PROTO_MAX_VERSION_NUM)
 			ereport(ERROR,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 					 errmsg("client sent proto_version=%d but we only support protocol %d or lower",
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index fb07580..d22893d 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -19,8 +19,9 @@
 /*
  * Protocol capabilities
  *
- * LOGICALREP_PROTO_VERSION_NUM is our native protocol and the greatest version
- * we can support. LOGICALREP_PROTO_MIN_VERSION_NUM is the oldest version we
+ * LOGICALREP_PROTO_VERSION_NUM is our native protocol.
+ * LOGICALREP_PROTO_MAX_VERSION_NUM is the greatest version we can support.
+ * LOGICALREP_PROTO_MIN_VERSION_NUM is the oldest version we
  * have backwards compatibility for. The client requests protocol version at
  * connect time.
  *
@@ -29,7 +30,8 @@
  */
 #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
 #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
-#define LOGICALREP_PROTO_VERSION_NUM 2
+#define LOGICALREP_PROTO_VERSION_NUM 1
+#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
 
 /*
  * This struct stores a tuple received via logical replication.
-- 
1.8.3.1

#5Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Dilip Kumar (#4)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:

In the error message we are still referring to the native protocol
version number. Shouldn't it be replaced with the greatest protocol
version number we support now (i.e. LOGICALREP_PROTO_MAX_VERSION_NUM)?

-       if (data->protocol_version > LOGICALREP_PROTO_VERSION_NUM)
+       if (data->protocol_version > LOGICALREP_PROTO_MAX_VERSION_NUM)
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg("client sent proto_version=%d but we only
support protocol %d or lower",
    data->protocol_version, LOGICALREP_PROTO_VERSION_NUM)));

Other than this, I don't have any comments.

On Mon, Sep 21, 2020 at 5:34 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Sep 21, 2020 at 4:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Sep 21, 2020 at 3:26 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Hi All,

Today, while exploring logical replication in PostgreSQL, I noticed
that logical replication from PG version 13 and below to PG v14
(development version) is not working. It has stopped working from the
following git commit onwards:

commit 464824323e57dc4b397e8b05854d779908b55304
Author: Amit Kapila <akapila@postgresql.org>
Date: Thu Sep 3 07:54:07 2020 +0530

Add support for streaming to built-in logical replication.

...
...

Here is the experiment that I performed to verify this:

Publisher (PG-v12/13):
==================
CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);

INSERT INTO pg2pg (whatever) SELECT 'str-' || i FROM generate_series(1, 5) i;

SELECT * FROM pg2pg;

CREATE PUBLICATION pg2pg_pub FOR TABLE pg2pg;

Subscriber (PG-v14 HEAD or commit 46482432):
=====================================
CREATE TABLE pg2pg (id serial PRIMARY KEY, whatever text);

CREATE SUBSCRIPTION pg2pg_sub CONNECTION 'host=127.0.0.1 port=5433
dbname=postgres user=ashu' PUBLICATION pg2pg_pub;

SELECT * FROM pg2pg;

Above select query produces no result. When this experiment is
performed below the mentioned git commit, it works fine.

After spending some time looking into this issue, I observed that
above git commit has bumped the logical replication protocol version.
Due to this, the logical replication apply worker process is unable to
do WAL streaming which causes it to terminate. Therefore, the data
inserted in the publication table is not applied on the subscription
table (i.e. no data replication happens)

Seems like this commit, should have only set the
LOGICALREP_PROTO_STREAM_VERSION_NUM to 2 but the
LOGICALREP_PROTO_VERSION_NUM shouln't have been changed.

I think if we don't increase the LOGICALREP_PROTO_VERSION_NUM, then
streaming will not work in the latest version. So what I feel is that
we can keep the LOGICALREP_PROTO_VERSION_NUM as 1 only add one more
parameter say LOGICALREP_PROTO_MAX_VERSION_NUM. So now from the
latest subscriber if streaming is enabled then we can send
LOGICALREP_PROTO_STREAM_VERSION_NUM and LOGICALREP_PROTO_VERSION_NUM
otherwise. And on publisher side we can check with the
max_protocol_version and min_protocol version. I have attached a
patch for the changes I have explained.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

#6Dilip Kumar
dilipbalaut@gmail.com
In reply to: Ashutosh Sharma (#5)
1 attachment(s)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:

In the error message we are still referring to the native protocol
version number. Shouldn't it be replaced with the greatest protocol
version number we support now (i.e. LOGICALREP_PROTO_MAX_VERSION_NUM)?

-       if (data->protocol_version > LOGICALREP_PROTO_VERSION_NUM)
+       if (data->protocol_version > LOGICALREP_PROTO_MAX_VERSION_NUM)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("client sent proto_version=%d but we only
support protocol %d or lower",
data->protocol_version, LOGICALREP_PROTO_VERSION_NUM)));

Other than this, I don't have any comments.

Thanks for the review. I have fixed it the attached patch.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v2-0001-Bugfix-in-logical-protocol-version.patchapplication/octet-stream; name=v2-0001-Bugfix-in-logical-protocol-version.patchDownload
From cb5ccc91ec2f96dd0702a809c4e8aa36f1a56029 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Mon, 21 Sep 2020 17:15:01 +0530
Subject: [PATCH v2] Bugfix in logical protocol version

Revert the LOGICALREP_PROTO_VERSION_NUM and instead add a new
parameter to control the max protocol version we support.
---
 src/backend/replication/logical/worker.c    | 3 ++-
 src/backend/replication/pgoutput/pgoutput.c | 4 ++--
 src/include/replication/logicalproto.h      | 8 +++++---
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 62c571e..6a16573 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3230,7 +3230,8 @@ ApplyWorkerMain(Datum main_arg)
 	options.logical = true;
 	options.startpoint = origin_startpos;
 	options.slotname = myslotname;
-	options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
+	options.proto.logical.proto_version = MySubscription->stream ?
+			LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;
 	options.proto.logical.publication_names = MySubscription->publications;
 	options.proto.logical.binary = MySubscription->binary;
 	options.proto.logical.streaming = MySubscription->stream;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 729b655..174af7f 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -282,11 +282,11 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
 								&enable_streaming);
 
 		/* Check if we support requested protocol */
-		if (data->protocol_version > LOGICALREP_PROTO_VERSION_NUM)
+		if (data->protocol_version > LOGICALREP_PROTO_MAX_VERSION_NUM)
 			ereport(ERROR,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 					 errmsg("client sent proto_version=%d but we only support protocol %d or lower",
-							data->protocol_version, LOGICALREP_PROTO_VERSION_NUM)));
+							data->protocol_version, LOGICALREP_PROTO_MAX_VERSION_NUM)));
 
 		if (data->protocol_version < LOGICALREP_PROTO_MIN_VERSION_NUM)
 			ereport(ERROR,
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index fb07580..d22893d 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -19,8 +19,9 @@
 /*
  * Protocol capabilities
  *
- * LOGICALREP_PROTO_VERSION_NUM is our native protocol and the greatest version
- * we can support. LOGICALREP_PROTO_MIN_VERSION_NUM is the oldest version we
+ * LOGICALREP_PROTO_VERSION_NUM is our native protocol.
+ * LOGICALREP_PROTO_MAX_VERSION_NUM is the greatest version we can support.
+ * LOGICALREP_PROTO_MIN_VERSION_NUM is the oldest version we
  * have backwards compatibility for. The client requests protocol version at
  * connect time.
  *
@@ -29,7 +30,8 @@
  */
 #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
 #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
-#define LOGICALREP_PROTO_VERSION_NUM 2
+#define LOGICALREP_PROTO_VERSION_NUM 1
+#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
 
 /*
  * This struct stores a tuple received via logical replication.
-- 
1.8.3.1

#7Amit Kapila
amit.kapila16@gmail.com
In reply to: Ashutosh Sharma (#5)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:

Thanks Ashutosh and Dilip for working on this. I'll look into it in a
day or two.

--
With Regards,
Amit Kapila.

#8Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Amit Kapila (#7)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Mon, Sep 21, 2020 at 6:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:

Thanks Ashutosh and Dilip for working on this. I'll look into it in a
day or two.

Just a thought:

Should we change the sequence of these #define in
src/include/replication/logicalproto.h?

#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

I would have changed above to something like this:

#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2

#define LOGICALREP_PROTO_MIN_VERSION_NUM LOGICALREP_PROTO_VERSION_NUM
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

#9Amit Kapila
amit.kapila16@gmail.com
In reply to: Ashutosh Sharma (#8)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Tue, Sep 22, 2020 at 8:34 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

On Mon, Sep 21, 2020 at 6:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:

Thanks Ashutosh and Dilip for working on this. I'll look into it in a
day or two.

Just a thought:

Should we change the sequence of these #define in
src/include/replication/logicalproto.h?

#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

I would have changed above to something like this:

#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2

#define LOGICALREP_PROTO_MIN_VERSION_NUM LOGICALREP_PROTO_VERSION_NUM
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

I am not sure if this suggestion makes it better than what is purposed
by Dilip but I think we can declare them in define number order like
below:
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

Another thing is can we also test by having a publisher of v14 and
subscriber of v13 or prior version, just reverse of what Ashutosh has
tested?

--
With Regards,
Amit Kapila.

#10Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Amit Kapila (#9)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Sep 22, 2020 at 8:34 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

On Mon, Sep 21, 2020 at 6:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:

Thanks Ashutosh and Dilip for working on this. I'll look into it in a
day or two.

Just a thought:

Should we change the sequence of these #define in
src/include/replication/logicalproto.h?

#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

I would have changed above to something like this:

#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2

#define LOGICALREP_PROTO_MIN_VERSION_NUM LOGICALREP_PROTO_VERSION_NUM
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

I am not sure if this suggestion makes it better than what is purposed
by Dilip but I think we can declare them in define number order like
below:
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

The only reason I proposed that was because for the *_MAX_VERSION_NUM
we are using the latest PROTOCOL version name in its definition so why
not to do the same for defining *_MIN_VERSION_NUM as well. Other than
that, I also wanted to correct the sequence so that they are defined
in the increasing order which you have already done here.

Another thing is can we also test by having a publisher of v14 and
subscriber of v13 or prior version, just reverse of what Ashutosh has
tested?

I've tested LR from PGv12 (Publisher) to PGv14 (Subscriber) and it works fine.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

#11Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Ashutosh Sharma (#10)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Tue, Sep 22, 2020 at 12:22 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Sep 22, 2020 at 8:34 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

On Mon, Sep 21, 2020 at 6:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:

Thanks Ashutosh and Dilip for working on this. I'll look into it in a
day or two.

Just a thought:

Should we change the sequence of these #define in
src/include/replication/logicalproto.h?

#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

I would have changed above to something like this:

#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2

#define LOGICALREP_PROTO_MIN_VERSION_NUM LOGICALREP_PROTO_VERSION_NUM
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

I am not sure if this suggestion makes it better than what is purposed
by Dilip but I think we can declare them in define number order like
below:
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

The only reason I proposed that was because for the *_MAX_VERSION_NUM
we are using the latest PROTOCOL version name in its definition so why
not to do the same for defining *_MIN_VERSION_NUM as well. Other than
that, I also wanted to correct the sequence so that they are defined
in the increasing order which you have already done here.

Another thing is can we also test by having a publisher of v14 and
subscriber of v13 or prior version, just reverse of what Ashutosh has
tested?

I've tested LR from PGv12 (Publisher) to PGv14 (Subscriber) and it works fine.

I meant LR from PGv14 (Publisher) to PGv12 (Subscriber) not the other way.

Show quoted text

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

#12Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#9)
1 attachment(s)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Sep 22, 2020 at 8:34 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

On Mon, Sep 21, 2020 at 6:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Sep 21, 2020 at 6:27 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Thanks Dilip for the patch. AFAIU, the fix looks good. One small comment:

Thanks Ashutosh and Dilip for working on this. I'll look into it in a
day or two.

Just a thought:

Should we change the sequence of these #define in
src/include/replication/logicalproto.h?

#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

I would have changed above to something like this:

#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2

#define LOGICALREP_PROTO_MIN_VERSION_NUM LOGICALREP_PROTO_VERSION_NUM
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

I am not sure if this suggestion makes it better than what is purposed
by Dilip but I think we can declare them in define number order like
below:
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

Done this way.

Another thing is can we also test by having a publisher of v14 and
subscriber of v13 or prior version, just reverse of what Ashutosh has
tested?

I have also tested this and working fine.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v3-0001-Bugfix-in-logical-protocol-version.patchapplication/octet-stream; name=v3-0001-Bugfix-in-logical-protocol-version.patchDownload
From 48cc6bd273f5386620f2847fe79682d19a0b8ece Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Mon, 21 Sep 2020 17:15:01 +0530
Subject: [PATCH v3] Bugfix in logical protocol version

Revert the LOGICALREP_PROTO_VERSION_NUM and instead add a new
parameter to control the max protocol version we support.
---
 src/backend/replication/logical/worker.c    | 3 ++-
 src/backend/replication/pgoutput/pgoutput.c | 4 ++--
 src/include/replication/logicalproto.h      | 8 +++++---
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 62c571e..6a16573 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3230,7 +3230,8 @@ ApplyWorkerMain(Datum main_arg)
 	options.logical = true;
 	options.startpoint = origin_startpos;
 	options.slotname = myslotname;
-	options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
+	options.proto.logical.proto_version = MySubscription->stream ?
+			LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;
 	options.proto.logical.publication_names = MySubscription->publications;
 	options.proto.logical.binary = MySubscription->binary;
 	options.proto.logical.streaming = MySubscription->stream;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 729b655..174af7f 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -282,11 +282,11 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
 								&enable_streaming);
 
 		/* Check if we support requested protocol */
-		if (data->protocol_version > LOGICALREP_PROTO_VERSION_NUM)
+		if (data->protocol_version > LOGICALREP_PROTO_MAX_VERSION_NUM)
 			ereport(ERROR,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 					 errmsg("client sent proto_version=%d but we only support protocol %d or lower",
-							data->protocol_version, LOGICALREP_PROTO_VERSION_NUM)));
+							data->protocol_version, LOGICALREP_PROTO_MAX_VERSION_NUM)));
 
 		if (data->protocol_version < LOGICALREP_PROTO_MIN_VERSION_NUM)
 			ereport(ERROR,
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index fb07580..33d719c 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -19,8 +19,9 @@
 /*
  * Protocol capabilities
  *
- * LOGICALREP_PROTO_VERSION_NUM is our native protocol and the greatest version
- * we can support. LOGICALREP_PROTO_MIN_VERSION_NUM is the oldest version we
+ * LOGICALREP_PROTO_VERSION_NUM is our native protocol.
+ * LOGICALREP_PROTO_MAX_VERSION_NUM is the greatest version we can support.
+ * LOGICALREP_PROTO_MIN_VERSION_NUM is the oldest version we
  * have backwards compatibility for. The client requests protocol version at
  * connect time.
  *
@@ -28,8 +29,9 @@
  * support for streaming large transactions.
  */
 #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
+#define LOGICALREP_PROTO_VERSION_NUM 1
 #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
-#define LOGICALREP_PROTO_VERSION_NUM 2
+#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
 
 /*
  * This struct stores a tuple received via logical replication.
-- 
1.8.3.1

#13Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#12)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Tue, Sep 22, 2020 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I am not sure if this suggestion makes it better than what is purposed
by Dilip but I think we can declare them in define number order like
below:
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

Done this way.

- options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
+ options.proto.logical.proto_version = MySubscription->stream ?
+ LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;

Here, I think instead of using MySubscription->stream, we should use
server/walrecv version number as we used at one place in tablesync.c.
Because say if we decide to add something additional for decode of
prepared xacts in PG14 then this check needs to be extended for stream
and prepared where as server version number check won't need to be
changed.

--
With Regards,
Amit Kapila.

#14Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#13)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Thu, Sep 24, 2020 at 11:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Sep 22, 2020 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I am not sure if this suggestion makes it better than what is purposed
by Dilip but I think we can declare them in define number order like
below:
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

Done this way.

- options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
+ options.proto.logical.proto_version = MySubscription->stream ?
+ LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;

Here, I think instead of using MySubscription->stream, we should use
server/walrecv version number as we used at one place in tablesync.c.

I am not sure how can we do this? If PG version number is 14 then we
will always sent the LOGICALREP_PROTO_STREAM_VERSION_NUM? then again
we will face the same error right? I think it should be strictly
based on whether we have enabled the streaming or not. Because
logical replication protocol is still the same, only if the streaming
is enabled we expect the streaming protocol otherwise not.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#15Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#14)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Thu, Sep 24, 2020 at 4:31 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Sep 24, 2020 at 11:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Sep 22, 2020 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I am not sure if this suggestion makes it better than what is purposed
by Dilip but I think we can declare them in define number order like
below:
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

Done this way.

- options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
+ options.proto.logical.proto_version = MySubscription->stream ?
+ LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;

Here, I think instead of using MySubscription->stream, we should use
server/walrecv version number as we used at one place in tablesync.c.

I am not sure how can we do this?

Have you checked what will function walrcv_server_version() will
return? I was thinking that if we know that subscriber is connected to
Publisher version < 14 then we can send the right value.

--
With Regards,
Amit Kapila.

#16Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#15)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Thu, Sep 24, 2020 at 4:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Sep 24, 2020 at 4:31 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Sep 24, 2020 at 11:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Sep 22, 2020 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I am not sure if this suggestion makes it better than what is purposed
by Dilip but I think we can declare them in define number order like
below:
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

Done this way.

- options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
+ options.proto.logical.proto_version = MySubscription->stream ?
+ LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;

Here, I think instead of using MySubscription->stream, we should use
server/walrecv version number as we used at one place in tablesync.c.

I am not sure how can we do this?

Have you checked what will function walrcv_server_version() will
return? I was thinking that if we know that subscriber is connected to
Publisher version < 14 then we can send the right value.

But, suppose we know the publisher version is < 14 and user has set
streaming on while creating subscriber then still we will send the
right version? I think tablesync we are forming a query so we are
forming as per the publisher's version. But here we are sending which
protocol version we are expecting for the data transfer so I feel it
should be LOGICALREP_PROTO_VERSION_NUM if we expect non-streaming
transfer and LOGICALREP_PROTO_STREAM_VERSION_NUM if we expect the
streaming transfer. Now let the publisher decide whether it supports
the protocol we have asked for or not and if it doesn't support then
let it give error. Am I missing something here?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#17Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#16)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Thu, Sep 24, 2020 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Sep 24, 2020 at 4:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Have you checked what will function walrcv_server_version() will
return? I was thinking that if we know that subscriber is connected to
Publisher version < 14 then we can send the right value.

But, suppose we know the publisher version is < 14 and user has set
streaming on while creating subscriber then still we will send the
right version?

Yeah we can send the version depending on which server we are talking to?

I think tablesync we are forming a query so we are
forming as per the publisher's version. But here we are sending which
protocol version we are expecting for the data transfer so I feel it
should be LOGICALREP_PROTO_VERSION_NUM if we expect non-streaming
transfer and LOGICALREP_PROTO_STREAM_VERSION_NUM if we expect the
streaming transfer.

I am not sure if this is the right strategy. See
libpqrcv_startstreaming, even if the user asked for streaming unless
the server supports it we don't send the streaming option to the user.
Another thing is this check will become more complicated if we need
another feature like decode prepared to send different version or even
if it is the same version, we might need additional checks. Why do you
want to send a streaming protocol version when we know the server
doesn't support it, lets behave similar to what we do in
libpqrcv_startstreaming.

--
With Regards,
Amit Kapila.

#18Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Amit Kapila (#13)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

Hi Amit,

On Thu, Sep 24, 2020 at 11:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Sep 22, 2020 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Sep 22, 2020 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I am not sure if this suggestion makes it better than what is purposed
by Dilip but I think we can declare them in define number order like
below:
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM

Done this way.

- options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
+ options.proto.logical.proto_version = MySubscription->stream ?
+ LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;

Here, I think instead of using MySubscription->stream, we should use
server/walrecv version number as we used at one place in tablesync.c.

Should subscribers be setting the LR protocol value based on what is
the publisher version it is communicating with or should it be set
based on whether streaming was enabled or not while creating that
subscription? AFAIU if we set this value based on the publisher
version (which is lets say >= 14), then it's quite possible that the
subscriber will start streaming changes for the in-progress
transactions even if the streaming was disabled while creating the
subscription, won't it?

Please let me know if I am missing something here.

Thanks,

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

#19Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#17)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Thu, Sep 24, 2020 at 5:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Sep 24, 2020 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Sep 24, 2020 at 4:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Have you checked what will function walrcv_server_version() will
return? I was thinking that if we know that subscriber is connected to
Publisher version < 14 then we can send the right value.

But, suppose we know the publisher version is < 14 and user has set
streaming on while creating subscriber then still we will send the
right version?

Yeah we can send the version depending on which server we are talking to?

Ok

I think tablesync we are forming a query so we are
forming as per the publisher's version. But here we are sending which
protocol version we are expecting for the data transfer so I feel it
should be LOGICALREP_PROTO_VERSION_NUM if we expect non-streaming
transfer and LOGICALREP_PROTO_STREAM_VERSION_NUM if we expect the
streaming transfer.

I am not sure if this is the right strategy. See
libpqrcv_startstreaming, even if the user asked for streaming unless
the server supports it we don't send the streaming option to the user.
Another thing is this check will become more complicated if we need
another feature like decode prepared to send different version or even
if it is the same version, we might need additional checks. Why do you
want to send a streaming protocol version when we know the server
doesn't support it, lets behave similar to what we do in
libpqrcv_startstreaming.

Okay, so even if the streaming is enabled we are sending the streaming
on to the server versions which are >= 14 which make sense. But I
still doubt that it is right thing to send the protocol version based
on the server version. For example suppose in future PG20 we change
the streaming protocol version to 3, that mean from PG14 to PG20 we
may not support the streaming transmission but we still be able to
support the normal transamission. Now if streaming is off then
ideally we should send the LOGICALREP_PROTO_VERSION_NUM and that is
still supporetd by the publisher but if we send the
LOGICALREP_PROTO_STREAM_VERSION_NUM then it will error out because the
streaming protocol changed in the latest subscriber and the same is
not supported by the older publisher (14). So now if we want non
streaming transmission from 14 to 20 and that should be allowed but if
based on the server version we always send the
LOGICALREP_PROTO_STREAM_VERSION_NUM then that will be a problem
because our streaming version has changed.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#20Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#19)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Fri, Sep 25, 2020 at 7:21 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Sep 24, 2020 at 5:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Sep 24, 2020 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Sep 24, 2020 at 4:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Have you checked what will function walrcv_server_version() will
return? I was thinking that if we know that subscriber is connected to
Publisher version < 14 then we can send the right value.

But, suppose we know the publisher version is < 14 and user has set
streaming on while creating subscriber then still we will send the
right version?

Yeah we can send the version depending on which server we are talking to?

Ok

I think tablesync we are forming a query so we are
forming as per the publisher's version. But here we are sending which
protocol version we are expecting for the data transfer so I feel it
should be LOGICALREP_PROTO_VERSION_NUM if we expect non-streaming
transfer and LOGICALREP_PROTO_STREAM_VERSION_NUM if we expect the
streaming transfer.

I am not sure if this is the right strategy. See
libpqrcv_startstreaming, even if the user asked for streaming unless
the server supports it we don't send the streaming option to the user.
Another thing is this check will become more complicated if we need
another feature like decode prepared to send different version or even
if it is the same version, we might need additional checks. Why do you
want to send a streaming protocol version when we know the server
doesn't support it, lets behave similar to what we do in
libpqrcv_startstreaming.

Okay, so even if the streaming is enabled we are sending the streaming
on to the server versions which are >= 14 which make sense. But I
still doubt that it is right thing to send the protocol version based
on the server version. For example suppose in future PG20 we change
the streaming protocol version to 3, that mean from PG14 to PG20 we
may not support the streaming transmission but we still be able to
support the normal transamission. Now if streaming is off then
ideally we should send the LOGICALREP_PROTO_VERSION_NUM and that is
still supporetd by the publisher but if we send the
LOGICALREP_PROTO_STREAM_VERSION_NUM then it will error out because the
streaming protocol changed in the latest subscriber and the same is
not supported by the older publisher (14).

No that won't happen, we will send the version based on what the
server version supports (in this case it would be 2 when we are
talking to publisher 14). We can define a new macro or something like
that. I feel the protocol should depend on whether the server supports
it or not rather than based on some user specified option because it
will make our life easier if tomorrow we want to enable streaming by
default rather than based on option specified by the user. You can
consider it this way that how would you handle this if we wouldn't
have a user specified option (streaming) because that is generally the
way it should work.

--
With Regards,
Amit Kapila.

#21Amit Kapila
amit.kapila16@gmail.com
In reply to: Ashutosh Sharma (#18)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Thu, Sep 24, 2020 at 6:33 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Hi Amit,

Here, I think instead of using MySubscription->stream, we should use
server/walrecv version number as we used at one place in tablesync.c.

Should subscribers be setting the LR protocol value based on what is
the publisher version it is communicating with or should it be set
based on whether streaming was enabled or not while creating that
subscription? AFAIU if we set this value based on the publisher
version (which is lets say >= 14), then it's quite possible that the
subscriber will start streaming changes for the in-progress
transactions even if the streaming was disabled while creating the
subscription, won't it?

No that won't happen because we send this option to the server
(publisher in this case) only when version is >=14 and user has
specified this option. See the below check in function
libpqrcv_startstreaming()
{
..
if (options->proto.logical.streaming &&
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfo(&cmd, ", streaming 'on'");
..
}

--
With Regards,
Amit Kapila.

#22Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Amit Kapila (#21)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Fri, Sep 25, 2020 at 8:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Sep 24, 2020 at 6:33 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Hi Amit,

Here, I think instead of using MySubscription->stream, we should use
server/walrecv version number as we used at one place in tablesync.c.

Should subscribers be setting the LR protocol value based on what is
the publisher version it is communicating with or should it be set
based on whether streaming was enabled or not while creating that
subscription? AFAIU if we set this value based on the publisher
version (which is lets say >= 14), then it's quite possible that the
subscriber will start streaming changes for the in-progress
transactions even if the streaming was disabled while creating the
subscription, won't it?

No that won't happen because we send this option to the server
(publisher in this case) only when version is >=14 and user has
specified this option. See the below check in function
libpqrcv_startstreaming()
{
..
if (options->proto.logical.streaming &&
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfo(&cmd, ", streaming 'on'");
..
}

Ah, okay, understood, thanks. So, that means we can use the streaming
protocol version if the server (publisher) supports it, regardless of
whether the client (subscriber) has the streaming option enabled or
not.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

#23Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#21)
1 attachment(s)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Fri, Sep 25, 2020 at 8:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Sep 24, 2020 at 6:33 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Hi Amit,

Here, I think instead of using MySubscription->stream, we should use
server/walrecv version number as we used at one place in tablesync.c.

Should subscribers be setting the LR protocol value based on what is
the publisher version it is communicating with or should it be set
based on whether streaming was enabled or not while creating that
subscription? AFAIU if we set this value based on the publisher
version (which is lets say >= 14), then it's quite possible that the
subscriber will start streaming changes for the in-progress
transactions even if the streaming was disabled while creating the
subscription, won't it?

No that won't happen because we send this option to the server
(publisher in this case) only when version is >=14 and user has
specified this option. See the below check in function
libpqrcv_startstreaming()
{
..
if (options->proto.logical.streaming &&
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfo(&cmd, ", streaming 'on'");
..
}

Ok, I have modified as per your suggestion.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v4-0001-Bugfix-in-logical-protocol-version.patchapplication/octet-stream; name=v4-0001-Bugfix-in-logical-protocol-version.patchDownload
From 14167267e79ddb9b4de66fd6fbd4a9653616b45d Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Mon, 21 Sep 2020 17:15:01 +0530
Subject: [PATCH v4] Bugfix in logical protocol version

Revert the LOGICALREP_PROTO_VERSION_NUM and instead add a new
parameter to control the max protocol version we support.
---
 src/backend/replication/logical/worker.c    | 4 +++-
 src/backend/replication/pgoutput/pgoutput.c | 4 ++--
 src/include/replication/logicalproto.h      | 8 +++++---
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 62c571e..1961f4a 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3230,7 +3230,9 @@ ApplyWorkerMain(Datum main_arg)
 	options.logical = true;
 	options.startpoint = origin_startpos;
 	options.slotname = myslotname;
-	options.proto.logical.proto_version = LOGICALREP_PROTO_VERSION_NUM;
+	options.proto.logical.proto_version =
+		walrcv_server_version(wrconn) >= 140000 ?
+			LOGICALREP_PROTO_STREAM_VERSION_NUM : LOGICALREP_PROTO_VERSION_NUM;
 	options.proto.logical.publication_names = MySubscription->publications;
 	options.proto.logical.binary = MySubscription->binary;
 	options.proto.logical.streaming = MySubscription->stream;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 729b655..174af7f 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -282,11 +282,11 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
 								&enable_streaming);
 
 		/* Check if we support requested protocol */
-		if (data->protocol_version > LOGICALREP_PROTO_VERSION_NUM)
+		if (data->protocol_version > LOGICALREP_PROTO_MAX_VERSION_NUM)
 			ereport(ERROR,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 					 errmsg("client sent proto_version=%d but we only support protocol %d or lower",
-							data->protocol_version, LOGICALREP_PROTO_VERSION_NUM)));
+							data->protocol_version, LOGICALREP_PROTO_MAX_VERSION_NUM)));
 
 		if (data->protocol_version < LOGICALREP_PROTO_MIN_VERSION_NUM)
 			ereport(ERROR,
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index fb07580..33d719c 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -19,8 +19,9 @@
 /*
  * Protocol capabilities
  *
- * LOGICALREP_PROTO_VERSION_NUM is our native protocol and the greatest version
- * we can support. LOGICALREP_PROTO_MIN_VERSION_NUM is the oldest version we
+ * LOGICALREP_PROTO_VERSION_NUM is our native protocol.
+ * LOGICALREP_PROTO_MAX_VERSION_NUM is the greatest version we can support.
+ * LOGICALREP_PROTO_MIN_VERSION_NUM is the oldest version we
  * have backwards compatibility for. The client requests protocol version at
  * connect time.
  *
@@ -28,8 +29,9 @@
  * support for streaming large transactions.
  */
 #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
+#define LOGICALREP_PROTO_VERSION_NUM 1
 #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
-#define LOGICALREP_PROTO_VERSION_NUM 2
+#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
 
 /*
  * This struct stores a tuple received via logical replication.
-- 
1.8.3.1

#24Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#23)
Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.

On Fri, Sep 25, 2020 at 1:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Fri, Sep 25, 2020 at 8:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

No that won't happen because we send this option to the server
(publisher in this case) only when version is >=14 and user has
specified this option. See the below check in function
libpqrcv_startstreaming()
{
..
if (options->proto.logical.streaming &&
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfo(&cmd, ", streaming 'on'");
..
}

Ok, I have modified as per your suggestion.

Pushed, thanks!

--
With Regards,
Amit Kapila.